Hacker News
by dr_dshiv 321 days ago
And it’s a heavily distilled version of the parent model, so inference costs are likely low.