|
|
|
|
|
by dns_snek
245 days ago
|
|
Ah, the "you're using the wrong model" fallacy (is there a name for this?) In the eyes of the evangelists, every major model seems to go from "This model is close to flawless at this task, you MUST try this TODAY" to "It's absolutely wild that anyone would ever consider using such a no-good, worthless model for this task" over the course of a year or so. The old model has to be re-framed for the new model to look more impressive. When GPT-4 was released I was told it was basically a senior-level developer, now it's an obviously worthless model that you'd be a fool to use to write so much as a throwaway script. |
|
What I’m saying is that there should be a disclaimer: hey, we’re testing these models for the average person, that have no idea about AI. People who actually know AI would never use them in this way.
A better idea: educate people. Add “Here’s the best way to use them btw…” to the report.
All I’m saying is, it’s a tool, and yes you can use it wrong. That’s not a crazy realization. It applies to every other tool.
We knew that the hallucation rate for gpt 4o was nuts. From the start. We also know that gpt-5 has a much lower hallucination rate. So there are no surprises here, I’m not saying anything groundbreaking, and neither are they.