Hacker News new | ask | show | jobs
by tmtvl 1121 days ago
It's weird that there are so many claims in here that the data structures and algorithms are perfectly performant yet there isn't even one look at generated assembly or any acknowledgement of the underlying system that is supposed to run the code. Proving things are Big O performant is neat, but at some point the code has to hit hardware.
6 comments

> It's weird

It's not weird, this is standard in academic computer science. It would be weird to do otherwise. In a theoretical dissertation/paper like this you can't just randomly bring up compiled assembly, it's completely and utterly off topic, it's not any more on topic than bringing up if the code was ran by an interpreter, or JVM, or transpiled to Haskell, or ran on GPU etc...

"Computer science is no more about computers than astronomy is about telescopes."

-- Edsger W. Dijkstra

However, I do believe that astronomers put references to the actual instruments they used in their publications.

So do most algorithms papers that have benchmarks in them. It's not always useful though, because information about how this algorithm compares to some other one on a PDP-10 doesn't necessarily translate to modern machines.
I can't speak for other fields, but in statistics and machine learning, it's not uncommon to see at least some practical experiments or simulations alongside theoretical results, or at least some discussion of practical considerations.

There are definitely plenty of pure theory papers as well, and pure theory is still a contribution. However one would at least hope to see subsequent work that does focus on the practical aspect.

> at some point the code has to hit hardware

Yes, but that's not a concern of a computer scientist. Implementation and execution of the algorithm are up to the reader.

It's like complaining that engineers don't do enough novel computer science research; of course they don't! It's not their job.

Which profession is expected to make progress on actual software performance?
Computer scientists (and of course engineers) both care about real-world performance, but some computer scientists just care about the theory.
That's a very naive view of computer science. That is the attitude of people who have given up and decided that computers are too big for science now.
You're right, there are plenty of papers focusing on real-world performance. I chose not to capture the nuance because I wasn't sure how to express it succinctly.
How I see it: computer science as an academic field is roughly split between theory and systems. The theory folks tend to evaluate progress through proofs. The systems folks tend to evaluate progress through experiments.
I’ve vouched for this, because it brings up an interesting point (albeit, one that proven provocative and been expressed before). Also, many of your comments are dead.

Big O/algorithmic complexity is interesting in that it tends to abstract away the architecture of the underlying processor. A copy is a copy is a copy. Arithmetic is arithmetic is arithmetic. Regardless of what instructions/processing units the underlying processor has. Regardless of data access. Regardless of parallelization. Regardless of memory complexity. All we use are brutish units of “work” — despite not all “workers” being equal.

It reminds me a bit of scientific management: treating your “workers” as interchangeable units that carry out what has been decided to be the most efficient movements and processes — completely disregarding the individual characteristics of the worker. For a humanist example, consider the individual quirks of the physiology (not only in size, length, and composition of the skeletal system via the muscles, bones, tendons and so on that would necessitate there being specific movements and workloads that best suit them — but also in psychology; the brain as its own work-producing part that is uniquely suited and most efficient for specific workloads; rather than an all-encompassing abstract average “best workstyle” that makes no note of these characteristics, but simply decides to use external metrics like “time to pick up a box using X, Y, or Z form.”)

The same parallel can be drawn to computers: different processors and collections of hardware (what is essentially the “body and brain”) have different quirks and different workloads they perform best at. For example, GPUs are much more useful in the case of vectorized/parallel workloads where the operations are relatively simple (f.e matrix arithmetic). While you can run an iterative matrix multiplication algorithm on a CPU in n^3 time — your data access costs will be significant. Instead, running a parallel algorithm on a GPU, with RAM very close by, you can achieve log^2 n.

This is where CS research really shines: not running away from the realities of physical systems, but embracing them.

Other commenters have addressed the key point - it's not weird. CLRS doesn't look at generated assembly either.

One other thing I want to add, though, is that this book was published in 1996. The CPU landscape at the time was far more diverse, both in terms of ISA (which assembly?) and also in terms of capabilities. Even on the x86 line, plenty of people were still using 486s, which weren't even superscalar, and thus had pretty strikingly different performance characteristics than, say, a Pentium Pro (a "modern", OoO CPU). Tying your algorithmic analysis to a particular piece of hardware would be pretty useless; providing a satisfactory look at the performance on so many different (micro-)architectures could risk overwhelming the content.

Moreover, at the time, performance was, maybe, a little more straightforward, making your concerns perhaps not quite as important. Memory was slower than the CPU but not the way it is now. Caches were much smaller and less useful. Branch prediction was more primitive. CPUs had relatively weak reordering capabilities, if any at all, and simpler pipelines. There were few dedicated SIMD instructions. This was a world where linked lists still often made sense, because the hardware penalties you paid for them were smaller. In other words: in 1996 the on-paper performance wasn't at quite as big a risk of diverging quite so wildly from the real-world performance, so keeping things simple and focusing on the on-paper performance was less objectionable.

Yeah, Big-O notation is only part of the picture and yes, "galactic algorithms" that are theoretically efficient but only for astronomically big inputs do exist, but in many cases (especially if your algorithm isn't doing anything particularly deep or clever) linear is faster than quadratic which is faster than exponential on real world input, too.

Mathematics is the science of modeling and abstraction and that abstraction exists in order to make the process of knowledge gathering easier. It works very well for the sciences, so I would say the method has been proven. Of course, if the models turn out to be completely inappropriate, that would be different, but so far, the simple machine models used (implicitly or explicitly) in the analysis of algorithms seem to be rather reasonable.

The alternative that you suggest, trying to benchmark concrete implementations of algorithms has so many confounders that it would be very hard to execute properly. I'm sure people are doing this too, but the formal Big O proof has a different quality to it because it does not depend on those particulars.

I'd even say that maths is what appears when you stop willing to pay for concretion.
Also, while I haven't read any further yet, based on the table on page 3, their big O performance is pretty bad.
How helpful would it be to see 30 year old generated assembly and benchmarks for an i486 with a tiny cache, no prefetching, and relatively tiny memory vs instruction latency compared to today’s CPUs?