| HN Mirror



	by dan_quixote 352 days ago
	As a former mechanical engineer, I visualize this phenomenon like a "tolerance stackup". Effectively meaning that for each part you add to the chain, you accumulate error. If you're not damn careful, your assembly of parts (or conclusions) will fail to measure up to expectations.
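A rough sketch of how error accumulates along a chain of parts. The tolerance values are hypothetical and chosen only for illustration. The two formulas are the standard worst-case sum and the root-sum-square (RSS) estimate, which assumes the individual errors are independent:

```python
import math

def worst_case_stackup(tolerances):
    """Worst case: every part is off by its full tolerance in the same direction."""
    return sum(abs(t) for t in tolerances)

def rss_stackup(tolerances):
    """Root-sum-square estimate; assumes independent errors between parts."""
    return math.sqrt(sum(t * t for t in tolerances))

# Hypothetical tolerances in mm for a four-part chain (not from the thread).
tolerances_mm = [0.1, 0.1, 0.1, 0.1]

print(f"worst case: +/-{worst_case_stackup(tolerances_mm):.2f} mm")
print(f"RSS:        +/-{rss_stackup(tolerances_mm):.2f} mm")
```

Either way, the total grows with every part added to the chain; the two methods only differ in how pessimistic they are.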

5 comments

godelski 352 days ago

I like this approach. Also having dipped my toes in the engineering world (professionally) I think it naturally follows that you should be constantly rechecking your designs. Those tolerances were fine to begin with, but are they now that things have changed? It also makes you think about failure modes. What can make this all come down and if it does what way will it fail? Which is really useful because you can then leverage this to design things to fail in certain ways and now you got a testable hypothesis. It won't create proof, but it at least helps in finding flaws.

isleyaardvark 351 days ago

The example I heard was to picture the Challenger shuttle, and the O-rings used worked 99% of the time. Well, what happens to the failure rate when you have 6 O-rings in a booster rocket, and you only need one to fail for disaster? Now you only have a 94% success rate.
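The arithmetic can be checked with a short sketch. It assumes each of the 6 O-rings works independently with probability 0.99, which is the simplification used in the comment and not a claim about the real hardware. The function name is made up for illustration:

```python
def series_success(p_each: float, n_parts: int) -> float:
    """Probability that all n_parts work, assuming they fail independently."""
    return p_each ** n_parts

p_all_ok = series_success(0.99, 6)
print(f"success: {p_all_ok:.3f}, failure: {1 - p_all_ok:.3f}")
```

Under that assumption the success rate comes out at roughly 94%, so about one launch in seventeen has at least one failed ring.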

godelski 350 days ago

IIRC the Challenger O-ring problem was much more deterministic: the flaw was known, and it was caused by the design not considering the actual operational temperature range. I think there's a good lesson to learn there (and from several NASA failures): the little things matter. It's idiotic to ignore a $10 fix if the damage would cost billions of dollars.

But I still think your point is spot on and that's really what matters haha

ctkhn 352 days ago

Basically the same as how dead reckoning your location works worse the longer you've been traveling?

toasterlovin 351 days ago

Dead reckoning is a great analogy for coming to conclusions based on reason alone. Always useful to check in with reality.

ethbr1 351 days ago

And always worth keeping an eye on the maximum possible divergence from reality you're currently at, based on how far you've reasoned from truth, and how less-than-sure each step was.

Maybe you're right. But there's a non-zero chance you're also max wrong. (Which itself can be bounded, if you don't wander too far)

toasterlovin 351 days ago

My preferred argument against the AI doom hypothesis is exactly this: it has 8 or so independent prerequisites with unknown probabilities. Since you multiply the probabilities of each prerequisite to get the overall probability, you end up with a relatively low overall probability even when the probability of each prerequisite is relatively high, and if just a few of the prerequisites have small probabilities, the overall probability basically can’t be anything other than very small.

Given this structure to the problem, if you find yourself espousing a p(doom) of 80%, you’re probably not thinking about the issue properly. If in 10 years some of those prerequisites have turned out to be true, then you can start getting worried and be justified about it. But from where we are now there’s just no way.
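A minimal sketch of the multiplication being described, assuming the prerequisites really are independent. The probability values are hypothetical placeholders, not estimates of anything:

```python
import math

def joint_probability(probs):
    """Probability that every prerequisite holds, assuming independence."""
    return math.prod(probs)

# Hypothetical inputs: 8 prerequisites, chosen only to show the shape of the argument.
all_fairly_likely = [0.9] * 8
two_unlikely = [0.9] * 6 + [0.1, 0.1]

print(f"{joint_probability(all_fairly_likely):.3f}")  # eight steps at 0.9 each
print(f"{joint_probability(two_unlikely):.4f}")       # two of the eight at 0.1
```

With eight steps at 0.9 the product is already below one half, and two small factors pull it under 1%. If the prerequisites are correlated, the simple product no longer applies.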

robocat 352 days ago

I saw an article recently that talked about stringing likely inferences together but ending up with an unreliable outcome because enough 0.9 probabilities one after the other lead to an unlikely conclusion.

Edit: Couldn't find the article, but AI referenced the Bayesian "chain of reasoning fallacy".
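To put a number on "enough 0.9 probabilities", here is a small sketch that counts how many chained steps it takes before the overall confidence drops below a threshold. It assumes each step is independent and equally reliable; the function name and the 50% threshold are illustrative choices:

```python
def steps_until_below(p_step: float, threshold: float) -> int:
    """Count chained inferences until the product of step probabilities < threshold."""
    if not 0 < p_step < 1 or threshold <= 0:
        raise ValueError("need 0 < p_step < 1 and threshold > 0")
    steps = 0
    confidence = 1.0
    while confidence >= threshold:
        confidence *= p_step  # each extra step multiplies in another factor
        steps += 1
    return steps

print(steps_until_below(0.9, 0.5))
```

With 0.9 per step, the seventh inference is the one that takes the conclusion below a coin flip.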

godelski 352 days ago

I think you have this oversimplified. Stringing together inferences can take us in either direction. It really depends on how things are being done and this isn't always so obvious or simple. But just to show both directions I'll give two simple examples (real world holds many more complexities)

It is all about what is being modeled and how the inferences string together. If these are being multiplied, then yes, this is going to decrease, as xy < x and xy < y for every 0 < x,y < 1.

But a good counterexample is the classic Bayesian inference example[0]. Suppose you have a test that detects vampirism with 95% accuracy (Pr(+|vampire) = 0.95) and has a false positive rate of 1% (Pr(+|mortal) = 0.01). But vampirism is rare, affecting only 0.1% of the population. This ends up meaning a positive test only gives us an 8.7% likelihood of a subject being a vampire (Pr(vampire|+)). The solution here is that we repeat the testing. On our second test Pr(vampire) changes from 0.001 to 0.087 and Pr(vampire|+) goes to about 90%, with a third positive getting us to about 99.9%.

[0] Our equation is

                  Pr(+|vampire)Pr(vampire)
  Pr(vampire|+) = ------------------------
                           Pr(+)

And the crux is Pr(+) = Pr(+|vampire)Pr(vampire) + Pr(+|mortal)(1-Pr(vampire))
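A short sketch of the repeated update, using the numbers from the example. It assumes the tests are independent given the subject's true status and that every test comes back positive; the function and variable names are made up for illustration:

```python
def posterior(prior: float,
              p_pos_given_vampire: float = 0.95,
              p_pos_given_mortal: float = 0.01) -> float:
    """Pr(vampire|+) from Bayes' rule, with Pr(+) expanded as in the formula above."""
    p_pos = p_pos_given_vampire * prior + p_pos_given_mortal * (1 - prior)
    return p_pos_given_vampire * prior / p_pos

belief = 0.001  # base rate: 0.1% of the population
for test_number in range(1, 4):
    belief = posterior(belief)  # yesterday's posterior is today's prior
    print(f"after positive test {test_number}: {belief:.4f}")
```

With these inputs the belief goes to roughly 8.7%, then 90.0%, then 99.9%. Each pass feeds the previous result back in as the new prior, which is why the belief climbs instead of shrinking.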

p1necone 352 days ago

Worth noting that solution only works if the false positives are totally random, which is probably not true of many real world cases and would be pretty hard to work out.

godelski 352 days ago

Definitely. Real world adds lots of complexities and nuances, but I was just trying to make the point that it matters how those inferences compound. We can't just conclude that compounding inferences decreases likelihood.

Dylan16807 351 days ago

Well they were talking about a chain, A->B, B->C, C->D.

You're talking about multiple pieces of evidence for the same statement. Your tests don't depend on any of the previous tests also being right.

godelski 351 days ago

Be careful with your description there, are you sure it doesn't apply to the Bayesian example (which was... illustrative...? And not supposed to be every possible example?)? We calculated f(f(f(x))), so I wouldn't say that this "doesn't depend on the previous 'test'". Take your chain, we can represent it with h(g(f(x))) (or (h∘g∘f)(x)). That clearly fits your case for when f=g=h. Don't lose sight of the abstractions.

wombatpm 351 days ago

Can’t you improve things if you can calibrate with a known good vampire? You’d think NIST or the CDC would have one locked in a basement somewhere.

godelski 351 days ago

IDK, probably? I'm just trying to say that iterative inference doesn't strictly mean decreasing likelihood.

I'm not a virologist or whoever designs these kinds of medical tests. I don't even know the right word to describe the profession lol. But the question is orthogonal to what's being discussed here. I'm only guessing "probably" because usually having a good example helps in experimental design. But then again, why wouldn't the original test that we're using have done that already? Wouldn't that be how you get that 95% accurate test?

I can't tell you the biology stuff, I can just answer math and ML stuff and even then only so much.

weard_beard 351 days ago

GPT6 would come faster but we ran out of Cassandra blood.

ethbr1 351 days ago

The thought of a BIPM Reference Vampire made me chuckle.

tintor 351 days ago

Assuming your vampire tests are independent.

godelski 351 days ago

Correct. And there's a lot of other assumptions. I did make a specific note that it was a simplified and illustrative example. And yes, in the real world I'd warn about being careful when making i.i.d. assumptions, since these assumptions are made far more than people realize.

to11mtm 352 days ago

I like this analogy.

I think of a bike's shifting systems; better shifters, better housings, better derailleur, or better chainrings/cogs can each 'improve' things.

I suppose where that becomes relevant to here, is that you can have very fancy parts on various ends but if there's a piece in the middle that's wrong you're still gonna get shit results.

dylan604 352 days ago

You're only as strong as the weakest link.

Your SCSI devices are only as fast as the slowest device in the chain.

I don't need to be faster than the bear, I only have to be faster than you.

jandrese 351 days ago

> Your SCSI devices are only as fast as the slowest device in the chain.

There are not many forums where you would see this analogy.

guerrilla 352 days ago

This is what I hate about real life electronics. Everything is nice on paper, but physics sucks.

godelski 352 days ago

  > Everything is nice on paper

I think the reason this is true is mostly because of how people do things "on paper". We can get much more accurate with "on paper" modeling, but the amount of work increases very fast. So it tends to be much easier to just calculate things as if they are spherical chickens in a vacuum and account for error than it is to calculate including things like geometry, drag, resistance, and all that other fun jazz (where you will still need to account for error/uncertainty, though it can now be smaller).

I think at the end of the day the important lesson is that simple explanations can be good approximations that get us most of the way there, but the details and nuances shouldn't be so easily dismissed. With this framing we can choose how we pick our battles. Is it cheaper/easier/faster to run a very accurate sim or cheaper/easier/faster to iterate in physical space?