| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by beebaween 480 days ago
	What's the best way to run this is I prefer to use local GPUs?

2 comments

We’re adding this as we speak. Ollama support is already there, and here’s vLLM inference: https://github.com/vlm-run/vlmrun-hub/pull/120

You can try out some of our schemas with Ollama if you want: https://github.com/vlm-run/vlmrun-hub (instructions in Readme)