Hacker News
by ahmedhawas123 325 days ago
This is super helpful and I had not seen it, thanks so much for sharing! And I hear you on training being an alpha. Given the size of the model, I wonder how much of this is distillation and using o3/o4 data.