Hacker News new | ask | show | jobs
by byzantinegene 50 days ago
we're already doing that, it's called distillation and how models like deepseek are trained.