Hacker News
by arjvik 724 days ago
(I work at Etched.) You need something as complex as CUDA only to support general-purpose programmability; Sohu is built for one thing and one thing only: transformers. So while we certainly need a software stack to harness the chip, that stack is much simpler to build, and it’s easier still to adapt existing LLM serving tools (vLLM, etc.) to use it.
1 comment

> Sohu is built for one thing and one thing only: transformers

Thanks for clarifying this. Does your chip support the transformer architecture in general, or only specific models, e.g. Llama 70B? If the latter, would the ASIC have to be reprogrammed for each model?

Transformers in general. There’s no reprogramming of the ASIC needed, just applying a different sequence of layers, and that’s exactly what our software stack is meant to support.
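
To make "a different sequence of layers" concrete, here is a minimal Python sketch of the general idea: the hardware exposes a fixed set of transformer operations, and each model is only a different ordered list of those operations with different sizes. This is not Etched's software stack or API. `SUPPORTED_OPS`, `Layer`, `build_schedule`, and `is_runnable` are hypothetical names, and the block counts and dimensions are made up for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical: the only operations a transformer-only chip would need to expose.
SUPPORTED_OPS = {"embedding", "norm", "attention", "mlp", "lm_head"}


@dataclass(frozen=True)
class Layer:
    """One step in a model's layer sequence (illustrative, not a real API)."""
    op: str
    params: dict = field(default_factory=dict)


def build_schedule(num_blocks: int, hidden: int, heads: int) -> list[Layer]:
    """Describe a decoder-only transformer as an ordered list of layers."""
    schedule = [Layer("embedding", {"hidden": hidden})]
    for _ in range(num_blocks):
        # Each transformer block: norm -> attention -> norm -> MLP.
        schedule.append(Layer("norm", {"hidden": hidden}))
        schedule.append(Layer("attention", {"hidden": hidden, "heads": heads}))
        schedule.append(Layer("norm", {"hidden": hidden}))
        schedule.append(Layer("mlp", {"hidden": hidden}))
    schedule.append(Layer("norm", {"hidden": hidden}))
    schedule.append(Layer("lm_head", {"hidden": hidden}))
    return schedule


def is_runnable(schedule: list[Layer]) -> bool:
    """A model fits the fixed hardware if every layer uses a supported op."""
    return all(layer.op in SUPPORTED_OPS for layer in schedule)


# Two "models" that differ only in depth and width, not in the ops they use.
small_model = build_schedule(num_blocks=2, hidden=64, heads=4)
large_model = build_schedule(num_blocks=4, hidden=128, heads=8)
```

Under these assumptions, switching models means handing the stack a different schedule, while anything outside the fixed op set (say, a convolution layer) is simply rejected rather than programmed in.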