Hacker News
by
exabrial
117 days ago
I feel like we need an entirely new type of silicon for LLMs: something completely focused on bandwidth and storage, probably at the expense of raw computation power.
garethsprice
117 days ago
Something like this? (Llama 3.1-8B etched into custom silicon, delivering 16,000 tok/s without using much PCIe bandwidth):
- https://taalas.com/the-path-to-ubiquitous-ai/
- https://chatjimmy.ai/
exabrial
117 days ago
Wowsa, that’s amazing! Exactly what I was imagining. To do that with 2,500 watts is incredible.