Hacker News new | ask | show | jobs
by krinchan 3582 days ago
Except Big-O was never intended to describe wall clock time. As soon as you try to use Big-O to describe a number of concrete units you'll fail. Big-O is purely a comparative measure, this algorithm vs. that algorithm. As it stands, it is fairly meaningless outside of purely academic exercises.

EDIT: I mean, consider that the entire basis for Big-O notation assumes that every "operation" are always equal. Adding is always constant. Big-O is very hand wavey and claiming the constants matter is to invent an entirely new notation.

1 comments

Big-O isn't handwavey, it was absolutely intended to describe (among other things!) wall-clock time, and the article isn't just talking about constants. Also, many operations (like adding) are constant-time on modern computers, outside of memory hierarchy effects, ALU stalls, etc., since they're bound to finish within exactly one clock-cycle.
Big-O is mathematically rigorously defined, but has nothing to do with wall-clock time. It's about the asymptotic behaviour of functions. How you interpret the values of these functions is up to you - of course you can model wall-clock time, but then you'd probably be interested in all the constant factors and lower-order terms that asymptotics hide from you, so Big-O analysis is the wrong tool.
If wall-clock time is being affected by non-constant factors (as the article convincingly asserts), why on earth wouldn't you want to model those factors with big-O notation? Moreover, why do you think that people uninterested in wall-clock time wouldn't care about constant factors, but people interested in wall-clock time unilaterally would?
If your program does two steps, one of which takes time O(n) and the other O(√n), then your program takes time O(n) because that dominates. That's an example of lower order terms. If you measure wall-clock time, then you'd be interested to know about that second step. But if you're looking at asymptotic analysis, then it's completely irrelevant because the linear term dominates it.

Your other questions are all best answered by: Big-O notation is not the right tool if you want to make predictions about wall-clock time. I don't think I need to justify why someone interested in wall-clock time would be very interested in the constant factors involved - the difference between n and 20n is a factor of twenty, and whether a program takes half a second or ten does make quite the difference.

But asymptotically, they're the same. Someone who is interested in the asymptotic behaviour wants to know how fast the function grows - is it linear? linearithmic? quadratic? cubic? exponential? (with which base - 1.1^n, 2^n, 10^n?). For them, a constant factor isn't interesting.

Of course, neither extreme is particularly helpful when trying to compare algorithms in practice. Modelling the hardware extensively and including all the little constant factors is exceedingly tedious, even if you can actually get all the information required to do so. Pure asymptotics also don't help much if they hide huge constant factors. That's why Algorithm Engineering is a thing (and a very good thing!). Basically, it tries to bridge the gap between algorithms that theorists like, and those that practitioners use. To that end, design, analysis, implementation, and experimentation must form a circle and influence each other. Wikipedia probably does a better job of explaining it, though.

The article points out that accessing n words of RAM takes O(nsqrt(n)), which is not dominated by O(n), which is my entire point here. And in the cache-oblivious model (for example), big-O notation is still alive and well; there are just additional variables introduced to account for the various constants.