Hacker News
by theLiminator 794 days ago
Probably why MoE models are so competitive now. Basically that idea within a single model.
2 comments

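I don't think MoE is the way forward. The bottleneck is memory, and MoE trades MORE memory consumption for lower inference time at a given performance level: every expert's weights have to stay resident, even though only a few experts run for any given token.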
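To make the memory-versus-compute trade-off concrete, here is a rough sketch of a toy MoE layer with top-k routing, in plain Python. Everything in it (the sizes, `make_expert`, `moe_forward`, the ReLU feed-forward experts) is an assumption for illustration, not how any particular model is implemented. It just counts how many expert weights are held in memory versus how many are actually touched for one token.

```python
import math
import random

def make_expert(d_model, d_hidden, rng):
    """One toy feed-forward expert: two weight matrices as nested lists."""
    w_in = [[rng.gauss(0, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]
    w_out = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]
    return w_in, w_out

def expert_params(expert):
    """Number of weights stored for one expert."""
    w_in, w_out = expert
    return len(w_in) * len(w_in[0]) + len(w_out) * len(w_out[0])

def run_expert(expert, x):
    """Apply x -> relu(x @ w_in) @ w_out."""
    w_in, w_out = expert
    hidden = [
        max(0.0, sum(x[i] * w_in[i][j] for i in range(len(x))))
        for j in range(len(w_in[0]))
    ]
    return [
        sum(hidden[j] * w_out[j][k] for j in range(len(hidden)))
        for k in range(len(w_out[0]))
    ]

def moe_forward(experts, router, x, top_k):
    """Route one token to its top_k experts and mix their outputs."""
    # One router logit per expert.
    logits = [sum(xi * wi for xi, wi in zip(x, w)) for w in router]
    chosen = sorted(range(len(experts)), key=lambda e: logits[e], reverse=True)[:top_k]
    # Softmax over the chosen experts only.
    peak = max(logits[e] for e in chosen)
    weights = [math.exp(logits[e] - peak) for e in chosen]
    total = sum(weights)
    out = [0.0] * len(x)
    for e, w in zip(chosen, weights):
        y = run_expert(experts[e], x)  # only the chosen experts do any work
        out = [o + (w / total) * yi for o, yi in zip(out, y)]
    return out, chosen

# Illustrative sizes only.
D_MODEL, D_HIDDEN, N_EXPERTS, TOP_K = 8, 32, 8, 2
rng = random.Random(0)
experts = [make_expert(D_MODEL, D_HIDDEN, rng) for _ in range(N_EXPERTS)]
router = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_EXPERTS)]
x = [rng.gauss(0, 1) for _ in range(D_MODEL)]

out, chosen = moe_forward(experts, router, x, TOP_K)
resident = sum(expert_params(e) for e in experts)          # must all sit in memory
active = sum(expert_params(experts[e]) for e in chosen)    # used for this token
print(f"resident expert weights: {resident}, active for this token: {active}")
```

With these toy sizes the layer keeps 4096 expert weights resident but only touches 1024 of them per token, which is the trade being described: compute per token drops by 4x, memory does not. For scale, Mistral's announcement of Mixtral 8x7B cites roughly 47B total parameters with about 13B used per token.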

Before too long we're going to see architectures where a model decomposes a prompt into a DAG of LLM calls based on expertise, fans out the sub-prompts, then reconstitutes the answer from the embeddings they return.
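A minimal sketch of what that fan-out could look like, assuming the decomposition into a DAG has already happened. `call_expert` and `fake_embedding` are hypothetical stand-ins (no real LLM or embedding API is called), and averaging the upstream vectors is just one placeholder for "reconstituting" an answer. The scheduling uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) and runs each wave of ready nodes concurrently with `asyncio.gather`.

```python
import asyncio
from graphlib import TopologicalSorter

def fake_embedding(text):
    """Hypothetical placeholder for an embedding: [char count, word count]."""
    return [float(len(text)), float(len(text.split()))]

async def call_expert(name, prompt, upstream):
    """Hypothetical stand-in for one LLM call handled by the expert `name`.

    Leaf nodes embed their own sub-prompt; nodes with dependencies combine
    the upstream embeddings by element-wise mean.
    """
    await asyncio.sleep(0)  # placeholder for network / inference latency
    if not upstream:
        return fake_embedding(prompt)
    return [sum(col) / len(upstream) for col in zip(*upstream)]

async def run_dag(nodes, deps):
    """nodes: name -> sub-prompt; deps: name -> tuple of names it depends on."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG
    results = {}
    while sorter.is_active():
        ready = sorter.get_ready()  # every node whose dependencies are done
        outputs = await asyncio.gather(*(
            call_expert(n, nodes[n], [results[d] for d in deps.get(n, ())])
            for n in ready
        ))
        for n, emb in zip(ready, outputs):
            results[n] = emb
            sorter.done(n)
    return results

# Two independent "experts" fan out, then a final node merges their results.
nodes = {
    "legal": "Summarise the contract terms",
    "finance": "Estimate the total cost",
    "merge": "Combine the findings",
}
deps = {"legal": (), "finance": (), "merge": ("legal", "finance")}

results = asyncio.run(run_dag(nodes, deps))
print(results["merge"])
```

Here "legal" and "finance" run in the same wave because neither depends on the other, and "merge" only runs once both have returned. A real system would need the decomposition step itself, plus some learned way of turning the returned embeddings back into an answer, neither of which this sketch attempts.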

Please, what is an MoE model?
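Mixture of Experts: a model built from several expert sub-networks plus a router that sends each token to only a few of them. A popular example is Mixtral.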