| HN Mirror



	by cameronh90 604 days ago
	At scale, you are doing the same thing with humans too. LLMs seem to have an error rate similar to humans for the majority of simple, boring tasks, if not even a bit better since they don't get distracted and start copying and pasting their previous answers. The difference with LLMs is they simply cannot (currently) do the most complex tasks that some humans can, and when they do produce erroneous output, the errors aren't very human-like. We can all understand a cut-and-paste error, so we don't hold it against the operator, but making up sources feels like a lie and breeds distrust.


maeil 604 days ago

> At scale, you are doing the same thing with humans too. LLMs seem to have an error rate similar to humans for the majority of simple, boring tasks, if not even a bit better since they don't get distracted and start copying and pasting their previous answers.

This is the big point missed by the frequent comments on here wondering whether LLMs are a fad, or claiming that in their current state they cannot be used to replace humans in non-trivial real-world business workflows. In fact, even 1.5 years ago, at the time of GPT-3.5, the technology was already good enough.

The yardstick is the performance of humans in the real world on a specific task. Humans who are often tired, have a cold, are distracted, or are going through a divorce. Humans who, even when in great condition, make plenty of mistakes.

I guess a lot of developers struggle with understanding this because so far, when software has replaced humans, it was software that on the face of it (though often not in practice) did not make mistakes if bug-free. But that has never been necessary for software to replace humans, hence buggy software still succeeding in doing so. Of course, for cost reasons, software often replaces humans even when it's worse at a task.

LLMs are at the very least competitive with, if not better than, doctors at diagnosing illnesses [1].

[1] https://www.nytimes.com/2024/11/17/health/chatgpt-ai-doctors...


cameronh90 604 days ago

Related to that, I once had a CT scan for a potentially fatal brain concern, and the note that the radiologist sent back to my consultant was for a completely different patient, and the notes for my scan were attached to someone else's report. The only reason it was caught was because it referred to me as "she".

If we were both the same gender, I probably would have had my skull opened up for no reason, and she would have been discharged and later died.


skydhash 604 days ago

> The yardstick is the performance of humans in the real world on a specific task.

Humans make human errors, which we can anticipate, recognize, counter, and mitigate. And deterministic automation rose because it helps with the parts that are most likely to generate an error. The LLM strategy always seems like solving a problem that is orthogonal to business objectives, and mainly serves individuals instead.


tokioyoyo 604 days ago

Almost all deterministic automation also has error rates. Error rates were orders of magnitude higher in the past, but we got better at creating reliable software.

We’re judging an entirely new segment of development after only 2 years of it being in active public use. And overall, LLMs have gotten exponentially better.


goatlover 604 days ago

The bigger, more controversial claim is that LLMs will be a net loss for human jobs, when all past automation has been a net positive. That includes IT, where automation has led to a vast growth of software jobs, as more can be accomplished with higher-level languages, tools, frameworks, etc.

For example, compilers didn't put programmers out of business in the 60s; they made programming more accessible to people through higher-level languages.


namaria 604 days ago

A net positive in the long term matters little when it can mean a lifetime of unemployment for a generation of humans. It's easy to dismiss the human suffering incurred during industrialization when we can enjoy its fruits, but those who suffered are long dead.
