Hacker News new | ask | show | jobs
by exo762 2538 days ago
What is premature and what is not? Is choosing fit for a job data structures a premature optimization? I don't think so. But I've seen people arguing against it.
1 comments

First you make it work, then you benchmark it. Then you see if that particular part is a bottleneck and whether there is a business case for optimisation.

I know it's fun and exciting to optimise a function to perform at maximum efficiency, but people tend to forget that someone has to read that piece of code in the future and understand it.

All the fancy tricks might've given a 2% increase in performance, but made it 200% less understandable by anyone except the codegolfing optimizer trying to be clever. =)

TL;DR:

  Spectrum of performance:
  LO |---*-------*--------*------------*-------| HI
         ^       ^        ^            ^
         |       |        |            |_root of all evil if premature
         |       |        |_you should be here
         |       |_you can be here if you don't do stupid things
         |_you are here
--

> All the fancy tricks might've given a 2% increase in performance, but made it 200% less understandable by anyone except the codegolfing optimizer trying to be clever.

This applies to hairy, last-ditch effort optimizations. The kind of your average programmer isn't even capable of doing. It's nothing like the optimizations most real-world code needs.

It's why I consider the "premature optimization" adage to be actively harmful, as it legitimizes lack of care and good craftsmanship.

From what I've seen, a lot of code can be trivially optimized with no net loss to readability (and sometimes a gain!), by simply removing dumb things, mostly around data structures. Fixes involve using vectors instead of lists or hash tables, depending on size and access and add/delete patterns. Using reference equality checks instead of string comparisons. Not recalculating the same value all over again inside a loop.

The kind of things above are ones that bleed performance all over your application, for no good reasons. I consider it a difference between a newbie and a decent programmer - whether or not they internalized how to code without stupid performance mistakes, so that the code they write is by default both readable and reasonably performant.

Then you go to actual optimizations, the kind that benefit from a benchmark - not because doing them elsewhere is wrong in principle, but because they take time and noticeably alter code structure. Using better algorithm, and/or using a better data structure, both come here. They don't have to impact readability, as long as you isolate them from the rest of the system behind a simple interface.

(Like, e.g. one day I achieved 100x boost of performance of an application component by replacing a school-level Dijkstra implementation with a two-step A* -based algorithm and data structure specifically designed for the problem being solved, and easily managed to wrap it in an even simpler interface than original. Since the component was user-facing, it pretty much single-handedly changed the perception of application from sluggish to snappy. The speedup itself probably saved many people-hours for users who were a captive audience anyway (this was an internal tool).)

Only then you get to the "premature optimization is a root of all evil" part, which is hairy tricks and extreme levels of micromanagement. Making sure you don't cons anything, or more than absolutely necessary. Counting cycles, exploiting cache-friendly data layouts, etc. This can have such a big impact on a system and surrounding code that it does really benefit from not being done until absolutely needed (except if you know you'll need it from the start - e.g. in some video games).

>changed the perception of application from sluggish to snappy

.. so, you measured the performence (sluggish), saw the need for improvement and improved it (snappy). It is not premature optimization. It would be premature optimization if it happend without mesaurement and without need.

I agree with your examples above. If you choose the right data strcuture/algorithms/patterns without sacrafising readability or development speed. By all means. But don't spend hours to improve something which dosn't need improvement.

That's true. What I personally advocate is: first, learn enough about programming and language to not do stupid things - so your code is already somewhat performant by default, at zero cost to readability. Second, when you're designing, think a little bit about performance and, given two designs of similar complexity but different performance characteristics, pick the more performant one. Three, when refactoring, if you see something stupid performance-wise, fix it too. All these things cost you little time and make your application overall snappier and less likely to develop performance problems in the future.

Beyond that, measure before you optimize, as such interventions will require larger amount of effort, so it makes sense to do them in the order of highest-impact first.

(Also note that "performance", while usually synonymous to "execution speed", is really about overall resource management. It's worth keeping memory in mind too, in particular, and power usage if your application could be used on portable devices. Which is really most webapps nowadays.)