|
|
|
|
|
by genewitch
462 days ago
|
|
LM studio in API mode, then literally any frontend that talks openAI api. Or, just use the LM studio front end, it's better than anything I've used for desktop use. I get 35t/s gemma 15b Q8 - you'll need a smaller one, probably gemma 3 15b q4k_l. I have a 3090, that's why. |
|