

by szundi 58 days ago

I don’t know what people are doing, but Minimax produced 16 bug reports, of which 15 were false positives (literally mistakes).

In contrast, ChatGPT 5.3 and Opus both have a success rate of at least 90% on this same (embedded) project.

All other tests were the same. What are you doing with these models?