Hacker News new | ask | show | jobs
by ntlm1686 434 days ago
"made an ambitious bet on MoEs"? No, DeepSeek is MoE, and they succeeded. Meta is not betting on MoE, it just does what other people have done.
1 comments

Llama4 seems in many ways a cut and paste of DeepSeek. Including the shared expert and the high sparsity. It's a DeepSeek that does not work well.