Hacker News new | ask | show | jobs
by dan_quixote 304 days ago
As a former mechanical engineer, I visualize this phenomenon like a "tolerance stackup". Effectively meaning that for each part you add to the chain, you accumulate error. If you're not damn careful, your assembly of parts (or conclusions) will fail to measure up to expectations.
5 comments

I like this approach. Also having dipped my toes in the engineering world (professionally) I think it naturally follows that you should be constantly rechecking your designs. Those tolerances were fine to begin with, but are they now that things have changed? It also makes you think about failure modes. What can make this all come down and if it does what way will it fail? Which is really useful because you can then leverage this to design things to fail in certain ways and now you got a testable hypothesis. It won't create proof, but it at least helps in finding flaws.
The example I heard was to picture the Challenger shuttle, and the O-rings used worked 99% of the time. Well, what happens to the failure rate when you have 6 O-rings in a booster rocket, and you only need one to fail for disaster? Now you only have a 94% success rate.
IIRC the Challenger o-ring problem was much more deterministic. That the flaw was known and caused by the design not considering the actual operational temperature range. Which, I think there's a good lesson to learn there (and from several NASA failure): the little things matter. It's idiotic to ignore a $10 fix if the damage would cost billions of dollars.

But I still think your point is spot on and that's really what matters haha

Basically the same as how dead reckoning your location works worse the longer you've been traveling?
Dead reckoning is a great analogy for coming to conclusions based on reason alone. Always useful to check in with reality.
And always worth keeping an eye on the maximum possible divergence from reality you're currently at, based on how far you've reasoned from truth, and how less-than-sure each step was.

Maybe you're right. But there's a non-zero chance you're also max wrong. (Which itself can be bounded, if you don't wander too far)

My preferred argument against the AI doom hypothesis is exactly this: it has 8 or so independent prerequisites with unknown probabilities. Since you multiply the probabilities of each prerequisite to get the overall probability, you end up with a relatively low overall probability even when the probability of each prerequisite is relatively high, and if just a few of the prerequisites have small probabilities, the overall probability basically can’t be anything other than very small.

Given this structure to the problem, if you find yourself espousing a p(doom) of 80%, you’re probably not thinking about the issue properly. If in 10 years some of those prerequisites have turned out to be true, then you can start getting worried and be justified about it. But from where we are now there’s just no way.

I saw an article recently that talked about stringing likely inferences together but ending up with an unreliable outcome because enough 0.9 probabilities one after the other lead to an unlikely conclusion.

Edit: Couldn't find the article, but AI referenced Baysian "Chain of reasoning fallacy".

I think you have this oversimplified. Stringing together inferences can take us in either direction. It really depends on how things are being done and this isn't always so obvious or simple. But just to show both directions I'll give two simple examples (real world holds many more complexities)

It is all about what is being modeled and how the inferences string together. If these are being multiplied, then yes, this is going to decreases as xy < x and xy < y for every x,y < 1.

But a good counter example is the classic Bayesian Inference example[0]. Suppose you have a test that detects vampirism with 95% accuracy (Pr(+|vampire) = 0.95) and has a false positive rate of 1% (Pr(+|mortal) = 0.01). But vampirism is rare, affecting only 0.1% of the population. This ends up meaning a positive test only gives us a 8.7% likelihood of a subject being a vampire (Pr(vampire|+). The solution here is that we repeat the testing. On our second test Pr(vampire) changes from 0.001 to 0.087 and Pr(vampire|+) goes to 89% and a third getting us to about 99%.

[0] Our equation is

                  Pr(+|vampire)Pr(vampire)
  Pr(vampire|+) = ------------------------
                           Pr(+)
And the crux is Pr(+) = Pr(+|vampire)Pr(vampire) + Pr(+|mortal)(1-Pr(vampire))
Worth noting that solution only works if the false positives are totally random, which is probably not true of many real world cases and would be pretty hard to work out.
Definitely. Real world adds lots of complexities and nuances, but I was just trying to make the point that it matters how those inferences compound. That we can't just conclude that compounding inferences decreases likelihood
Well they were talking about a chain, A->B, B->C, C->D.

You're talking about multiple pieces of evidence for the same statement. Your tests don't depend on any of the previous tests also being right.

Be careful with your description there, are you sure it doesn't apply to the Bayesian example (which was... illustrative...? And not supposed to be every possible example?)? We calculated f(f(f(x))), so I wouldn't say that this "doesn't depend on the previous 'test'". Take your chain, we can represent it with h(g(f(x))) (or (f∘g∘h)(x)). That clearly fits your case for when f=g=h. Don't lose sight of the abstractions.
Can’t you improve thing if you can calibrate with a known good vampire? You’d think NIST or the CDC would have one locked in a basement somewhere.
IDK, probably? I'm just trying to say that iterative inference doesn't strictly mean decreasing likelihood.

I'm not a virologist or whoever designs these kinds of medical tests. I don't even know the right word to describe the profession lol. But the question is orthogonal to what's being discussed here. I'm only guessing "probably" because usually having a good example helps in experimental design. But then again, why wouldn't the original test that we're using have done that already? Wouldn't that be how you get that 95% accurate test?

I can't tell you the biology stuff, I can just answer math and ML stuff and even then only so much.

GPT6 would come faster but we ran out of Casandra blood.
The thought of a BIPM Reference Vampire made me chuckle.
Assuming your vampire tests are independent.
Correct. And there's a lot of other assumptions. I did make a specific note that it was a simplified and illustrative example. And yes, in the real world I'd warn about being careful when making i.i.d. assumptions, since these assumptions are made far more than people realize.
I like this analogy.

I think of a bike's shifting systems; better shifters, better housings, better derailleur, or better chainrings/cogs can each 'improve' things.

I suppose where that becomes relevant to here, is that you can have very fancy parts on various ends but if there's a piece in the middle that's wrong you're still gonna get shit results.

You only as strong as the weakest link.

Your SCSI devices are only as fast as the slowest device in the chain.

I don't need to be faster than the bear, I only have to be faster than you.

> Your SCSI devices are only as fast as the slowest device in the chain.

There are not many forums where you would see this analogy.

This is what I hate about real life electronics. Everything is nice on paper, but physics sucks.

  > Everything is nice on paper
I think the reason this is true is mostly because how people do things "on paper". We can get much more accurate with "on paper" modeling, but the amount of work increases very fast. So it tends to be much easier to just calculate things as if they are spherical chickens in a vacuum and account for error than it is to calculate including things like geometry, drag, resistance, and all that other fun jazz (which you still will also need to account for error/uncertainty though this now can be smaller).

Which I think at the end of the day the important lesson is more how simple explanations can be good approximations that get us most of the way there but the details and nuances shouldn't be so easily dismissed. With this framing we can choose how we pick our battles. Is it cheaper/easier/faster to run a very accurate sim or cheaper/easier/faster to iterate in physical space?