| HN Mirror


by germanjoey 996 days ago

SambaNova just launched something similar to what you're describing. It's a demo of their new chip running a 1T-param MoE model made up of 150 7B Llama 2s (150 x 7B ≈ 1T), each retrained to be an expert in a different topic. So one of them is a "law" expert, another is a "physics" expert, etc.

They've got a video here [1] (scroll down slightly) that compares it against a 180B Falcon model running on GPUs on Hugging Face. The MoE results are not only just as good quality-wise, but also ridiculously fast. Like, nearly instant. A big benefit is that individual experts can be swapped out and retrained on new data, which is obviously not as easy with the more monolithic 180B model.

[1] https://sambanova.ai/launch2023
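
To make the "swappable experts" idea concrete, here's a rough sketch in Python. It is not how SambaNova's system actually works: the keyword router, the `ExpertPool` class, and the stub experts are all made up for illustration (a real router would presumably be a learned model, and each expert a fine-tuned 7B checkpoint). It only shows the shape of the idea: pick one expert per prompt, and replace a single expert without touching the rest.

```python
from typing import Callable, Dict, Tuple

Expert = Callable[[str], str]

# Back-of-the-envelope size from the comment: 150 experts of 7B params each.
TOTAL_PARAMS = 150 * 7_000_000_000  # 1.05 trillion

# Hypothetical keyword lists standing in for a learned router.
TOPIC_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "law": ("contract", "statute", "liability"),
    "physics": ("quantum", "momentum", "entropy"),
}


def make_expert(topic: str, version: int = 1) -> Expert:
    """Stub for a 7B model retrained on one topic (assumption, not a real model)."""
    def expert(prompt: str) -> str:
        return f"[{topic} v{version}] answer to: {prompt}"
    return expert


class ExpertPool:
    """Holds one expert per topic; any expert can be replaced independently."""

    def __init__(self) -> None:
        self.experts: Dict[str, Expert] = {"general": make_expert("general")}

    def register(self, topic: str, expert: Expert) -> None:
        # Registering an existing topic swaps that expert out in place.
        self.experts[topic] = expert

    def route(self, prompt: str) -> str:
        """Return the topic whose keywords match the prompt most often."""
        text = prompt.lower()
        scores = {
            topic: sum(keyword in text for keyword in keywords)
            for topic, keywords in TOPIC_KEYWORDS.items()
            if topic in self.experts
        }
        if not scores or max(scores.values()) == 0:
            return "general"  # fall back when nothing matches
        return max(scores, key=scores.get)

    def answer(self, prompt: str) -> str:
        # Only the selected expert runs, not the whole pool.
        return self.experts[self.route(prompt)](prompt)


pool = ExpertPool()
pool.register("law", make_expert("law"))
pool.register("physics", make_expert("physics"))

print(pool.answer("How does quantum entropy work?"))

# "Retrain" just the physics expert; the law expert is untouched.
pool.register("physics", make_expert("physics", version=2))
print(pool.answer("How does quantum entropy work?"))
```

Since each prompt only hits one small expert, per-query compute is closer to a 7B model than a 1T one, which would fit the speed difference described above, though the actual reasons may differ.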

1 comment

tarruda 996 days ago

Really impressive, thanks for sharing
