by intrasight 31 days ago
Is that true? If the distillation is not lossy and the distilled model runs much faster because it consumes fewer resources, then it may outperform the original.

One of those conditionals is a pretty huge assumption.
It's an assumption, and it can be tested.