Hacker News
by rdksu 6 days ago
Have you run ablations to measure how much on-policy distillation actually contributes to the performance? Just curious, since Unsloth-based mixed quantisation methods for MoE models are widely used and have a great reputation in the community.