Show HN: A GPU/VRAM filter for finding LLMs that will run on your hardware


	Show HN: A GPU/VRAM filter for finding LLMs that will run on your hardware (whichllmmodel.com)
	2 points by mzubairtahir 1 day ago
	I kept seeing people ask "Which model can I run on my GPU?" and "Will model X fit on my GPU?" That's why I built a filter on whichllmmodel that lets you search models by what will actually fit on your hardware (8GB, 16GB, 24GB, etc.) at a given quantization level.

4 comments

GreyOcten 8 hours ago

handy, but the gap most of these filters have is that "fits in VRAM" doesn't mean usable. context length blows up the KV cache fast: a 7B that fits at 2k tokens can OOM at 32k. factoring context length + quant into the estimate is where it'd actually save people from getting burned.


mzubairtahir 3 hours ago

I think you did not check the app closely. It actually takes the required context window from the user, calculates the KV cache size, and counts it along with the size of the model itself. It also reserves some extra memory to avoid OOM.
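
For readers who want to see the arithmetic, the estimate described above (weights plus KV cache plus a reserve) can be sketched roughly as follows. This is not the site's actual code: the function names, the 1 GiB reserve, the flat bits-per-weight figure, and the Llama-2-7B-like shape (32 layers, 32 KV heads, head dimension 128, fp16 cache, batch size 1) are all assumptions for illustration.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size for one sequence (batch size 1)."""
    # The leading 2 accounts for one K and one V tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights, ignoring format overhead."""
    return n_params * bits_per_weight / 8


def fits(vram_gib, n_params, bits_per_weight, n_layers, n_kv_heads, head_dim,
         context_len, reserve_gib=1.0):
    """True if weights + KV cache + reserve fit in the given VRAM.

    reserve_gib is a hypothetical safety margin for activations and
    runtime overhead, not a measured value.
    """
    needed = (
        weight_bytes(n_params, bits_per_weight)
        + kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len)
        + reserve_gib * GIB
    )
    return needed <= vram_gib * GIB


# Assumed 7B shape at 4 bits per weight on an 8 GiB card.
shape = dict(n_params=7e9, bits_per_weight=4, n_layers=32, n_kv_heads=32, head_dim=128)
fits_2k = fits(8, context_len=2048, **shape)
fits_32k = fits(8, context_len=32768, **shape)
```

Under these assumptions the cache costs 0.5 MiB per token, so `fits_2k` is true while `fits_32k` is false on the same card. Models that use grouped-query attention have fewer KV heads and a proportionally smaller cache.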


necovek 1 day ago

Very broken: "live minimums" do not allow me to remove the 512-token limit and enter a bigger number easily.

No unified or shared memory scenarios (like Apple's M platform or AMD's integrated GPU platform).


mzubairtahir 19 hours ago

actually that input is broken, sorry about that. I am adding shared memory support in the next iterations.


mzubairtahir 19 hours ago

that broken input is fixed


johng 1 day ago

Was going to mention this. I'm on an M1 Max and wanted to see what the site suggested.


CRSilkworth 1 day ago

very nice idea. Would be nice if you could also keep desired context as a free parameter and let the models tell you what maximum context you could have.
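
One way to read this suggestion is to invert the usual fit check: subtract the weights and a reserve from the available memory, then divide what is left by the KV cache cost per token. The sketch below is only an illustration under assumed values. The function name, the 1 GiB reserve, the fp16 cache, and the optional `model_max_context` cap are hypothetical and do not come from the site.

```python
GIB = 1024 ** 3


def max_context_tokens(vram_gib, n_params, bits_per_weight, n_layers,
                       n_kv_heads, head_dim, bytes_per_elem=2,
                       reserve_gib=1.0, model_max_context=None):
    """Largest context length whose KV cache fits in the leftover VRAM."""
    # KV cache cost of one token: K and V for every layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    # Memory left after the quantized weights and an assumed reserve.
    free = vram_gib * GIB - n_params * bits_per_weight / 8 - reserve_gib * GIB
    tokens = max(0, int(free // per_token))
    # Never report more than the model was trained to handle, if known.
    if model_max_context is not None:
        tokens = min(tokens, model_max_context)
    return tokens


# Assumed 7B shape (32 layers, 32 KV heads, head_dim 128) at 4 bits per weight.
shape = dict(n_params=7e9, bits_per_weight=4, n_layers=32, n_kv_heads=32, head_dim=128)
on_8gib = max_context_tokens(8, **shape)
on_8gib_capped = max_context_tokens(8, model_max_context=4096, **shape)
on_4gib = max_context_tokens(4, **shape)
```

A result of 0, as with the 4 GiB case here, means the weights plus the reserve already exceed the available memory.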


mzubairtahir 19 hours ago

actually that's free by design, it is just broken. fixing it in the next sprint. And thanks a lot for your feedback!


mzubairtahir 19 hours ago

now that is fixed, please try it


xlr8_track 23 hours ago

Awesome, how do I contribute to this? A GitHub link or something?


mzubairtahir 19 hours ago

actually, it is not open source currently, but I am thinking about making it open source so that other developers can also contribute to it (especially the data layer). what do you think?


xlr8_track 13 hours ago

Yes, many people like myself are willing to contribute. It'll take the load off you and give you time to work on other features or projects.
