Hacker News
by lossolo 323 days ago
> Could you train a small language model with a big one?
Yes, it's called distillation.
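
For anyone who hasn't seen it, here is a rough sketch of the usual soft-label form of distillation: the student is trained to match the teacher's temperature-softened output distribution. The function names, the temperature value, and the example logits are all made up for illustration, and a real setup would typically mix this term with the ordinary cross-entropy loss on the true labels.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is the common convention for keeping the
    gradient scale comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the big model
    q = softmax(student_logits, temperature)  # predictions from the small model
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits over a 3-token vocabulary, for illustration only.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 0.2]
loss = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so minimizing it pulls the small model toward the big one's behavior.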