by Workaccount2
638 days ago
Cerebras has ridiculously large LLM ASICs that can hit crazy speeds. You can try them with Llama 8B and 70B: https://inference.cerebras.ai/ It's pretty fast, but my understanding is that it is still too expensive, even accounting for the speed-up.
And yeah, their cost is ridiculous, on the order of high six to low seven figures per wafer. The rack alone looks several times more expensive than the 8x NVIDIA pods [1].
[1] https://web.archive.org/web/20230812020202/https://www.youtu...