Hacker News
by
mmoskal
409 days ago
Just for some calibration: approx. no one runs 32-bit for LLMs on any sort of iron, big or otherwise. Some models (e.g. DeepSeek V3, and derivatives like R1) are native FP8. FP8 was also common for Llama 3 405B serving.
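
To put rough numbers on that, here is a back-of-the-envelope sketch of weight memory at different precisions. It assumes a dense 405B-parameter model and counts weights only (no KV cache, activations, or quantization scale factors); `weight_gb` and `BYTES_PER_PARAM` are names made up for this illustration.

```python
# Bytes per parameter for each weight format (weights only).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_gb(n_params: float, fmt: str) -> float:
    """Approximate weight memory in GB (10^9 bytes) for a given format."""
    return n_params * BYTES_PER_PARAM[fmt] / 1e9


# Assumed parameter count: 405e9, as in Llama 3 405B.
for fmt in BYTES_PER_PARAM:
    print(f"{fmt}: {weight_gb(405e9, fmt):.0f} GB")
```

By this estimate FP8 needs about a quarter of the weight memory of 32-bit, which is a big part of why serving at 32-bit is so rare.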