| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by genewitch 462 days ago

LM studio in API mode, then literally any frontend that talks openAI api.

Or, just use the LM studio front end, it's better than anything I've used for desktop use.

I get 35t/s gemma 15b Q8 - you'll need a smaller one, probably gemma 3 15b q4k_l. I have a 3090, that's why.