Hacker News
by
hadlock
389 days ago
As mentioned, you can run this on a server board with 768+ GB of memory in CPU mode. The average Joe is going to be running quantized 30B (not 600B+) models on a $300/$400/$900 GPU with 8/12/16 GB of VRAM.
1 comment
rahimnathwani
389 days ago
I'm not sure that's enough RAM to run it at full precision (FP8).
This guy ran a 4-bit quantized version with 768GB RAM:
https://news.ycombinator.com/item?id=42897205
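For a rough sense of the numbers, here is a back-of-the-envelope sketch. It assumes weights dominate memory and takes 600B parameters as a stand-in for the "600b+" figure upthread (the exact parameter count isn't stated here). The function name `weight_memory_gb` is made up for illustration, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements will be higher.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


if __name__ == "__main__":
    # 600B is an assumed stand-in for "600b+"; 30B is the consumer-sized model.
    for params_billion in (600, 30):
        for bits in (8, 4):
            gb = weight_memory_gb(params_billion, bits)
            print(f"{params_billion}B params at {bits}-bit: ~{gb:.0f} GB for weights")
```

Under these assumptions, 600B at 8 bits needs about 600 GB for weights alone, which leaves limited headroom in 768 GB, while 4-bit needs about 300 GB. A 30B model at 4 bits comes to about 15 GB of weights, which is at the top of the 8/12/16 GB range before counting any overhead.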