Hacker News
by
yalok
490 days ago
I wonder if there’s similar research on reducing the amount of data (by improving its quality) for pretraining.
1 comment
sebzim4500
490 days ago
Yeah, that was the idea behind the Phi series of models. They get good benchmark results, but you can still tell something is missing when you actually try to use them for anything.