| Agree. Concrete example:
"What was the Japanese codeword for Midway Island in WWII?"
Answer on Wikipedia: https://en.wikipedia.org/wiki/Battle_of_Midway#U.S._code-bre...
dolphin3.0-llama3.1-8b Q4_K_S [4.69 GB on disk]: correct in <2 seconds
deepseek-r1-0528-qwen3-8b Q6_K [6.73 GB]: correct in 10 seconds
gpt-oss-20b MXFP4 [12.11 GB] low reasoning: wrong after 6 seconds
gpt-oss-20b MXFP4 [12.11 GB] high reasoning: wrong after 3 minutes!
Yeah, yeah, it's only one question of nonsense trivia. I'm sure it was billions well spent.
It's possible I'm using a poor temperature setting or something, but since they weren't bothered enough to put it in the model card, I'm not bothered to fuss with it. |