by adam_patarino
66 days ago
Compressing a mixture-of-experts model to fit on smaller hardware by combining a reinforcement learning approach called Self-Distillation Policy Optimization with progressive expert pruning, multi-objective knowledge distillation, speculative decoding, and custom quantization.