| HN Mirror



	by yalok 490 days ago
	I wonder if there’s similar research on reducing the amount of pretraining data by improving its quality.

1 comment

sebzim4500 490 days ago

Yeah, that was the idea behind the Phi series of models. They get good benchmark results, but you can still tell something is missing when you actually try to use them for anything.
