by adontz 769 days ago
If anyone knows, is this about the best model one can run locally on an old consumer-grade GPU (GTX 1080 in my case)?
1 comment

Llama 3 8B is pretty much the king of its model class right now, so yeah. Meta’s instruct fine-tune is also a safe choice; really the only thing you have to play with is the quantization level. Llama 3 8B at 4-bit isn’t great, but 8-bit might be pushing it on the GTX 1080, since the weights alone come to roughly 8 GB and that is all the VRAM the card has. I’d almost consider offloading a few layers to the CPU just to avoid dealing with the 4-bit model.
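
For a rough feel of the trade-off, here is a back-of-the-envelope sketch of the memory arithmetic. It is not a measurement. The assumptions are mine: a round 8e9 parameters, exactly 4 or 8 bits per weight (real quantization formats carry some extra overhead), 32 transformer layers holding equal shares of the weights, 8 GB of VRAM, and a guessed 1.5 GB reserve for the KV cache and runtime buffers. The function names are hypothetical.

```python
def weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def layers_on_gpu(n_params: float, bits_per_weight: float, n_layers: int,
                  vram_gb: float, reserve_gb: float) -> int:
    """Estimate how many layers fit in VRAM, assuming weights are split
    evenly across layers and reserve_gb is kept free for everything else."""
    per_layer_gb = weight_gb(n_params, bits_per_weight) / n_layers
    budget_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(budget_gb // per_layer_gb))


if __name__ == "__main__":
    N_PARAMS = 8e9      # assumption: round figure for an 8B model
    N_LAYERS = 32       # assumption: layer count, weights evenly split
    VRAM_GB = 8.0       # GTX 1080
    RESERVE_GB = 1.5    # assumption: KV cache, buffers, display

    for bits in (4, 8):
        size = weight_gb(N_PARAMS, bits)
        fit = layers_on_gpu(N_PARAMS, bits, N_LAYERS, VRAM_GB, RESERVE_GB)
        print(f"{bits}-bit: ~{size:.1f} GB of weights, "
              f"~{fit}/{N_LAYERS} layers fit on the GPU")
```

Under these assumptions the 4-bit model fits entirely, while 8-bit leaves about six layers to run on the CPU. Runtimes such as llama.cpp expose a GPU layer count setting (`--n-gpu-layers`), so the estimate is only a starting point to tune from.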