Y
Hacker News
new
|
ask
|
show
|
jobs
by
byzantinegene
50 days ago
we're already doing that, it's called distillation and how models like deepseek are trained.