Hacker News
by mdp2021 562 days ago
And why would the "average human" count?!

"Support, the calculator gave a bad result for 345987*14569" // "Yes, well, also your average human would"

...That is why we do not ask "average humans"!

2 comments

"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

So the result might not necessarily be bad, it's just that the machine _can_ detect that you entered the wrong figures! By the way, the answer is 7.

The average human matters here because the OP said

> Can you please give an example of a “completely illogical statement” produced by o1 model? I suspect it would be easier to get an average human to produce an illogical statement.

> because the OP said

And the whole point is nonsensical. If you discussed whether it would be ethically acceptable to canaries it would make more sense.

"The database is losing records...!" // "Also people forget.": that is still not a good point.

Because the cost-competitive alternative to LLMs is often just ordinary humans.

Following the trail as you did originally: you do not hire "ordinary humans", you hire "good ones for the job"; going for a "cost-competitive" bargain can be suicidal in private enterprise and criminal in public ones.

Sticking instead to the core matter: the architecture is faulty, unsatisfactory by design, and must be fixed. We are playing with the partial products of research and getting some results, even some useful tools, but it must be clear that this is not the real thing, all the more since this boom, now more than two years old, has brought another horribly ugly cultural degradation ("spitting out prejudice as normal").

I interpreted the OP's argument to be that

> For simple tasks where we would alternatively hire only ordinary humans, AIs have similar error rates.

Yes, if a task requires deep expertise or great care, the AI is a bad choice. But lots of tasks don't. And in those kinds of tasks, even ordinary humans are already too expensive to be economically viable.

Sorry for the delay. If you are still there:

> But lots of tasks

Do you have good examples of tasks in which a dubious verbal output could be an acceptable outcome?

By the way, I noticed:

> AI

Do not confuse LLMs with general AI. Notably, general AI has also been implemented in systems where critical failures would be intolerable, i.e., it was made to be reliable, or made part of an ultimately reliable process.