Hacker News
by pop_mccoy 63 days ago
Explains the high performance of distilled models then (e.g. Chinese ones).