Hacker News
Compressing LLMs with progressive pruning and multi-objective distillation (rig.ai)
4 points by adam_patarino 66 days ago
2 comments

Compressing a mixture-of-experts (MoE) model to fit on smaller hardware, using a reinforcement learning approach called Self-Distillation Policy Optimization along with progressive expert pruning, multi-objective knowledge distillation, speculative decoding, and custom quantization.
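
For anyone unfamiliar with the "progressive expert pruning" part, here is a minimal sketch of the general idea: drop the least-used experts a few at a time instead of all at once. This is not taken from the linked post. The function name `progressive_prune`, the `usage` routing-share numbers, and the "least-used first" criterion are all assumptions made for illustration, and the actual pipeline may rank experts quite differently.

```python
from typing import Dict, List


def progressive_prune(
    usage: Dict[int, float], keep: int, per_round: int = 1
) -> List[List[int]]:
    """Return the surviving expert ids after each pruning round.

    usage:     hypothetical routing share per expert id (illustrative only)
    keep:      number of experts to retain at the end
    per_round: maximum number of experts dropped in one round
    """
    if keep < 1 or per_round < 1:
        raise ValueError("keep and per_round must both be at least 1")

    alive = sorted(usage)  # expert ids in ascending order
    schedule: List[List[int]] = []

    while len(alive) > keep:
        n_drop = min(per_round, len(alive) - keep)
        # Rank survivors by usage, breaking ties by lower id first.
        ranked = sorted(alive, key=lambda e: (usage[e], e))
        dropped = set(ranked[:n_drop])
        alive = [e for e in alive if e not in dropped]
        schedule.append(list(alive))
        # Assumption: a real pipeline would likely fine-tune or distill here
        # and re-measure usage before the next round. This sketch keeps the
        # usage numbers fixed.

    return schedule


if __name__ == "__main__":
    shares = {0: 0.40, 1: 0.05, 2: 0.30, 3: 0.10, 4: 0.15}
    for round_no, survivors in enumerate(progressive_prune(shares, keep=2), 1):
        print(f"round {round_no}: experts kept = {survivors}")
```

The point of pruning in rounds is that the model gets a chance to recover between cuts, which is presumably where the distillation step would fit in.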
Awesome!