by gcollard-
112 days ago
Testing this hardware LLM (Llama 3.1 8B on a chip), I get ~16k tokens per second. With frontier models plateauing, I’ve been convinced AI will end up like Bitcoin mining, and that NVIDIA’s general-purpose GPUs will be replaced by model-specific chips. Glad to see someone innovating in this space.