by embedding-shape
121 days ago
Again, you're not specifying which GPT-OSS you're talking about; there are two versions, 20b and 120b. Not to mention that if you have a consumer GPU, you're most likely running it with additional quantization too, but you're not saying which version.

> Jinja threw a bunch of errors and GPT-OSS couldn't make tool calls.

This was an issue for a week or two when GPT-OSS initially launched, as none of the inference engines had properly implemented support for it, especially around tool calling. I'm running GPT-OSS-120b MXFP4 with LM Studio and directly with llama.cpp; the recent versions handle it well and I get no errors.

However, when I've tried either 120b or 20b with additional quantization (not the "native" MXFP4 ones), I've seen them have trouble with the tool syntax too.

> Not llama

What does your original comment mean then? You said llama was "strictly" better than GPT-OSS. Which specific model variant are you talking about, or did you miswrite somehow?