by arjvik
724 days ago
(I work at Etched.) You need something as complex as CUDA only to support general-purpose programmability; Sohu is built for one thing and one thing only: transformers. So while we certainly need a software stack to harness the chip, building one is much easier, and adapting existing LLM serving tools (vLLM, etc.) to use that stack is easier still.

Thanks for clarifying this. Could you also say whether your chip supports the transformer architecture in general, or only specific models, e.g. Llama 70B? In the latter case, would your ASIC have to be reprogrammed for each model?