by cactusfrog 323 days ago
This is really interesting. I think force fields in molecular dynamics have undergone a similar NN revolution. You train your NN on the output of expensive calculations to replace the expensive function with a cheap one. Could you train a small language model with a big one?
1 comment

> Could you train a small language model with a big one?

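Yes, it's called distillation (or knowledge distillation): the small "student" model is trained to match the outputs of the big "teacher" model.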
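
For a concrete picture, here is a minimal sketch of the soft-label loss commonly used for distillation: the KL divergence between the teacher's and the student's temperature-softened output distributions, scaled by T^2. The function names, the default temperature, and the example logits are made up for illustration; real setups differ in details and often mix this with the ordinary hard-label loss.

```python
import math

def log_softmax(logits, temperature=1.0):
    # Numerically stable log-softmax of logits divided by the temperature.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum for x in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on softened distributions, for one example.
    # temperature=2.0 is an arbitrary illustrative default.
    log_p = log_softmax(teacher_logits, temperature)  # teacher (target)
    log_q = log_softmax(student_logits, temperature)  # student (trained)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2

# Hypothetical logits over a 3-token vocabulary.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 1.0]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's distribution exactly and grows as the two diverge, so minimizing it over the student's parameters pushes the cheap model toward the expensive one, much like fitting an NN force field to expensive reference calculations.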