Hacker News
by danielcampos93 880 days ago
Probably as simple as training the smaller model to approximate the larger model's outputs, i.e. knowledge distillation. It's well studied and has been done in work like tinylm and MiniLM.
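For anyone who wants to see the idea concretely, here's a rough toy sketch of "train the small model to approximate the big one" using only the Python standard library. It is not taken from any particular paper or library: the names `softmax`, `distillation_loss`, and `distill_step` are made up for illustration, the "student" is just a free vector of logits rather than a real network, and the temperature-softened KL objective is one common choice that I'm assuming here, not necessarily what the work mentioned above uses.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill_step(student_logits, teacher_logits, temperature=2.0, lr=1.0):
    """One gradient-descent step on the student logits.

    The gradient of the KL loss w.r.t. student logit i is (q_i - p_i) / T.
    In a real setup this update would flow into the student's weights
    via backprop; here the logits themselves are the only parameters
    (an assumption made to keep the toy self-contained).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return [s - lr * (qi - pi) / temperature
            for s, pi, qi in zip(student_logits, p, q)]

if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical teacher logits for one input
    student = [0.0, 0.0, 0.0]    # untrained student starts uniform
    print("loss before:", distillation_loss(student, teacher))
    for _ in range(200):
        student = distill_step(student, teacher)
    print("loss after: ", distillation_loss(student, teacher))
```

In practice you'd compute the same loss over a whole dataset of teacher outputs and usually mix it with the ordinary task loss, but the core loop is about this simple.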