Hacker News
by vlovich123 497 days ago
Thanks for the correction! Can it be retrofitted into existing models through distillation, or do you have to train the model from scratch?