Hacker News new | ask | show | jobs
by filterfiber 911 days ago
The current bottleneck for most current hardware is RAM capacity than memory bandwidth and last is FLOPS/TOPS.

The coral has 8 MB of SRAM which uh, won't fit the 2GB+ that nearly any decent LLM require even after being quantized.

LLMs are mostly memory and memory bandwidth limited right now.