Hacker News new | ask | show | jobs
by osanseviero 922 days ago
This blog post might be interesting - https://huggingface.co/blog/moe

MoEs are especially useful for much faster pre-training. During inference, the model will be fast but still require a very high amount of VRAM. MoEs don't do great in fine-tuning but recent work shows promising instruction-tuning results. There's also quite a bit of ongoing work around MoEs quantization.

In general, MoEs are interesting for high throughput cases with high number of machines, so this is not so so exciting for a local setup, but the recent work in quantization makes it more appealing.