HN Mirror



	by theLiminator 794 days ago
	Probably why MoE models are so competitive now. Basically that idea within a single model.

2 comments

CuriouslyC 793 days ago

I don't think MoE is the way forward. The bottleneck is memory, and MoE trades MORE memory consumption for lower inference times at a given performance level.
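To put rough numbers on that trade-off, here is a minimal sketch. The function name `moe_param_counts` and all of the sizes are hypothetical and not taken from any real model; the only point is that every expert has to sit in memory while only a few are used per token.

```python
def moe_param_counts(shared, per_expert, num_experts, top_k):
    """Return (total, active) parameter counts for a toy MoE model.

    total  -> everything that must be resident in memory
    active -> what is actually used to process one token
    """
    total = shared + num_experts * per_expert
    active = shared + top_k * per_expert
    return total, active


# Hypothetical sizes, in billions of parameters (illustrative only).
total, active = moe_param_counts(shared=2, per_expert=5, num_experts=8, top_k=2)
print(f"resident in memory: {total}B, used per token: {active}B")
```

Under these assumed sizes the model computes like a much smaller dense model per token but still needs memory for the full parameter count.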

Before too long, we're going to see architectures where a model decomposes a prompt into a DAG of LLM calls based on expertise, fans out sub-prompts, then reconstitutes the answer from the embeddings they return.
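A rough sketch of what that fan-out could look like, using the standard library's `graphlib` for the ordering. Everything here is an assumption for illustration: `call_expert` is a hypothetical stand-in for a real LLM call, the node names are made up, and it passes strings around rather than embeddings to keep the example runnable.

```python
from graphlib import TopologicalSorter


def call_expert(name, prompt, inputs):
    # Hypothetical stand-in for a real LLM call: it only tags the text it receives.
    joined = " + ".join(inputs) if inputs else prompt
    return f"{name}({joined})"


def run_dag(prompt, deps):
    """Run each node after its dependencies; deps maps node -> set of prerequisite nodes."""
    results = {}
    for node in TopologicalSorter(deps).static_order():
        # Sort the prerequisites so the combined input order is deterministic.
        inputs = [results[d] for d in sorted(deps.get(node, ()))]
        results[node] = call_expert(node, prompt, inputs)
    return results


# Two independent "expert" sub-prompts feed one combining step.
deps = {"math": set(), "code": set(), "combine": {"math", "code"}}
out = run_dag("q", deps)
print(out["combine"])
```

This runs the nodes one at a time; a real system would presumably dispatch the independent nodes in parallel.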


elevaet 794 days ago

Please, what is an MoE model?


T-A 794 days ago

https://huggingface.co/blog/moe


orra 794 days ago

Mixture of Experts. A popular example is Mixtral.
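For a feel of the mechanism, here is a toy sketch of top-k routing in plain Python. The names `top_k_route` and `moe_layer` are hypothetical, the "experts" are trivial functions, and the router scores are hard-coded, whereas in a real model both the experts and the router are learned networks.

```python
import math


def top_k_route(logits, k):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - m) for i in top]
    z = sum(exps)
    return [(i, e / z) for i, e in zip(top, exps)]


def moe_layer(x, experts, router_logits, k=2):
    """Weighted sum of the outputs of the k selected experts; the rest are skipped."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Toy experts and hard-coded router scores (illustrative only).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
y = moe_layer(3.0, experts, router_logits=[0.1, 2.0, -1.0, 2.0], k=2)
print(y)
```

Only the selected experts run for a given input, which is where the compute savings come from.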
