by supermdguy 17 days ago
It's interesting because their last model series (Phi) was based around the thesis that high-quality synthetic data is better than a large pre-training corpus.