Hacker News
Model Merging in Pre-Training of Large Language Models (arxiv.org)
2 points by veryluckyxyz 402 days ago