by groby_b 332 days ago
> how do you know how much is correct

Because it's a budget. Verifying the entries is _much_ cheaper than finding them all in a giant PDF in the first place.

> the butterfly effect of dependence on an undependable stochastic system

We've been using stochastic systems for a long time. We know just fine how to deal with them.

> Meanwhile an agent that you accept to get only 98% of things right is meeting expectations.

There are very few tasks humans complete at a 98% success rate either. If you think "build spreadsheet from PDF" comes anywhere close to that, you've never done that task. We're barely able to recognize objects in their default orientation at a 98% success rate. (And in many cases, deep networks outperform humans at object recognition.)

The task of engineering has always been to manage error rates and risk, not to achieve perfection. "Butterfly effect" is a cheap rhetorical distraction, not a criticism.

1 comments

There are in fact lots of tasks people complete immediately at 99.99% success rate at first iteration or 99.999% after self and peer checking work

Perhaps more importantly, checking is a continual process: errors are identified as they are made and corrected while still in context, instead of being identified later by someone completely devoid of context, a task humans are notably bad at.

Lastly, it's important to note the difference between an overarching task containing many subtasks and the subtasks themselves.

Something that fails 2% of the time per subtask has a miserable 18% failure rate on an overarching task comprising 10 subtasks (assuming the failures are independent). At 20 subtasks it fails 1 in 3 attempts. Worse, a failing human knows they don't know the answer; the failing AI produces not only wrong answers but convincing lies.
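
For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes each subtask fails independently with the same probability, which the comment does not state and which real workflows may not satisfy. The function name `compound_failure_rate` is made up for illustration.

```python
def compound_failure_rate(per_step_failure: float, steps: int) -> float:
    """Probability that at least one of `steps` subtasks fails.

    Assumption: subtasks fail independently, each with the same
    probability `per_step_failure`.
    """
    # The overall task succeeds only if every subtask succeeds.
    overall_success = (1.0 - per_step_failure) ** steps
    return 1.0 - overall_success


if __name__ == "__main__":
    for n in (1, 10, 20):
        rate = compound_failure_rate(0.02, n)
        print(f"{n:>2} subtasks: {rate:.1%} overall failure rate")
```

With a 2% per-subtask failure rate this gives roughly 18% for 10 subtasks and roughly 33% for 20, matching the figures in the comment. If failures are correlated, the real number could be higher or lower.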

Failure to distinguish between human failure and AI failure in nature or degree of errors is a failure of analysis.

> There are in fact lots of tasks people complete immediately at 99.99% success rate at first iteration or 99.999% after self and peer checking work

This is so absurd that I wonder if you're trolling. Humans don't even have a 99.99% success rate in breathing, let alone in any cognitive task.

> Humans don't even have a 99.99% success rate in breathing

Will you please elaborate a little on this?

Humans cough or otherwise have to clear their airways about 1 in every 1,000 breaths, which is a 99.9% success rate.

Thank you for following up.

That’s quite good given the complexity and fragility of the system and the chaotic nature of the environment.