by KronisLV
304 days ago
I had pretty mixed experiences with the 20B version of GPT-OSS: on specific questions it would sometimes just start looping in the thinking block, and no sampler parameters seemed to make any difference. That said, Qwen3 and Qwen3 Coder are both pretty nice. ERNIE 4.5 might be too, if the benchmarks are to be trusted, but I mostly run Ollama instead of vLLM now, so I can't test it out atm (apparently llama.cpp added support for those models recently, though). The models by Mistral might also be worth a look, and personally I thought the EuroLLM project was nice as well, but MoE models feel way more palatable on limited hardware. None of these seem able to compete directly with Sonnet 4 or Gemini 2.5 Pro; I'd need way better hardware to come close.