by filterfiber
911 days ago
The bottleneck for most current hardware is RAM capacity first, then memory bandwidth, and lastly FLOPS/TOPS. The Coral has 8 MB of SRAM, which, uh, won't fit the 2GB+ that nearly any decent LLM requires even after being quantized. LLMs are mostly limited by memory capacity and memory bandwidth right now.
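
To put rough numbers on that, here's a back-of-the-envelope sketch. The 7B parameter count is just an assumed example size (not something from a specific model), and the helper `weight_bytes` is a made-up name. It only counts the weights, so KV cache and activations would push the real footprint higher.

```python
def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Approximate bytes for the weights alone (ignores KV cache and activations)."""
    return n_params * bits_per_weight // 8


CORAL_SRAM_BYTES = 8 * 1024**2   # 8 MB of on-chip SRAM, as stated above
N_PARAMS = 7_000_000_000         # assumption: a hypothetical 7B-parameter model

for bits in (16, 8, 4):
    size = weight_bytes(N_PARAMS, bits)
    ratio = size / CORAL_SRAM_BYTES
    print(f"{bits:>2}-bit weights: {size / 1e9:.1f} GB, about {ratio:,.0f}x the Coral's SRAM")
```

Even at 4 bits per weight, the assumed model is hundreds of times larger than what fits on the chip, so the weights would have to be streamed from somewhere else on every token.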