Hacker News
by viraj_shah 531 days ago
Do you have a good resource for learning what kinds of hardware can run what kinds of models locally? Benchmarks, etc?

I'm also trying to tie different hardware specs to model performance, whether that's training or inference. Like how do memory, VRAM, memory bandwidth, GPU cores, etc. all play into this? Know of any good resources? Oddly enough I might be best off asking an LLM.

2 comments

I tested Ollama with a 7600 XT at work and with the 7900 XTX mentioned earlier. Both run fine within their VRAM limits, so you can just switch between different quantizations of Llama 3.1 or the vast number of other models at https://ollama.com/search
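To make the VRAM limit concrete, here is a rough back-of-the-envelope sketch, not a benchmark. It estimates the size of the weights alone as parameters times bits per weight and checks that against the card's memory. The overhead figure for KV cache and runtime buffers, the nominal bit widths, and the helper names are all my own assumptions for illustration; real quantized files use slightly more bits per weight, and actual usage depends on context length.

```python
# Rough estimate only: weights = parameters * bits per weight / 8.
# Nominal bit widths; real quantization formats store slightly more per weight.
QUANT_BITS = {"fp16": 16, "q8": 8, "q4": 4}

# VRAM in GiB for the two cards mentioned above.
CARD_VRAM_GIB = {"7600 XT": 16, "7900 XTX": 24}


def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 1.5) -> bool:
    """True if weights plus an assumed overhead fit in VRAM.

    overhead_gib is a guess covering KV cache and runtime buffers;
    it grows with context length, so treat it as a placeholder.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


def report(params_billion: float) -> list[tuple[str, str, float, bool]]:
    """One row per (card, quantization): estimated weight size and whether it fits."""
    rows = []
    for card, vram in CARD_VRAM_GIB.items():
        for name, bits in QUANT_BITS.items():
            size = round(weight_gib(params_billion, bits), 1)
            rows.append((card, name, size, fits(params_billion, bits, vram)))
    return rows


if __name__ == "__main__":
    # 8 is an approximate parameter count (in billions) for the small Llama 3.1 model.
    for row in report(8):
        print(row)
```

By this estimate an 8B model at 8-bit or 4-bit fits comfortably on either card, while a 70B model does not fit in 24 GiB even at 4-bit, which is why switching quantization is the main lever you have on a fixed card.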
To avoid custom implementations, it is recommended to get an Nvidia card, a 3080 at minimum to get some results. But if you want video, you should go for either a 4090 or a 5090. ComfyUI is a popular interface you can use for graphical stuff: images and videos. For local text models I would recommend the Misty app, which is basically a wrapper and downloader for various models. There are tons of YouTube videos on how to achieve things.