Hacker News
by nuz, 701 days ago
My guess would be lots and lots of synthetic data, generated by the bigger models and used to train the smaller ones.