by sokoloff 4149 days ago
I wish I knew whose quote this was: "Use your intuition to ASK questions, not to answer them."

Specific to this topic:

Use a profiler to determine where you're spending your time. Don't guess; measure.

When you find a candidate area to work on, determine whether caching or a fundamental algorithm change could work. Those will usually get you far greater gains than optimizing code as-is.
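
As a rough illustration of the caching point, here is a minimal sketch in C. It assumes a hypothetical workload that repeatedly sums ranges of an array that does not change between queries; the function names are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>

/* Naive version: re-adds the elements on every query, O(hi - lo) per call.
 * Sums the half-open range a[lo] .. a[hi - 1]. */
long range_sum_naive(const int *a, size_t lo, size_t hi)
{
    long s = 0;
    for (size_t i = lo; i < hi; i++)
        s += a[i];
    return s;
}

/* One-time O(n) pass that caches running totals.
 * prefix must have room for n + 1 entries; prefix[i] = a[0] + ... + a[i - 1]. */
void build_prefix(const int *a, size_t n, long *prefix)
{
    prefix[0] = 0;
    for (size_t i = 0; i < n; i++)
        prefix[i + 1] = prefix[i] + a[i];
}

/* Cached version: each query is a single subtraction, O(1) per call. */
long range_sum_cached(const long *prefix, size_t lo, size_t hi)
{
    assert(lo <= hi);
    return prefix[hi] - prefix[lo];
}
```

Whether this pays off depends on what the profiler shows: the cached version only wins if range queries are actually a hotspot and the array is queried many times between updates.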


I generally agree but there are a few caveats:

(1) A profiler won't tell you that an O(n) algorithm exists when the one you're using is O(n^2). Only reasoning about your problem will do that.

(2) Most profilers won't tell you that much about cache effects, and those can be huge in some cases. This is more of an issue with tight-loop math-heavy code like codecs and crypto and primitive data structures and the like.

(3) Some code can be "uniformly slow" or "structurally slow everywhere." An example would be C++ code that over-uses inheritance patterns and defines every single method as virtual. In that case you are losing a large number of cycles to indirect fetch/call instructions, and a profiler will not tell you this. It will only show you the relative hotspots, not the absolute overall slowness of this type of code. Another example would be over-engineered code. A profiler will help you optimize the problem areas in a pile of over-engineered spaghetti, but it will not help you make the overall code base more elegant to realize large overall gains. This is often a special case of point #1.
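
To illustrate point (2), here is a minimal sketch in C, assuming a dense matrix stored in row-major order in one contiguous array. Both functions (hypothetical names) compute the same sum and do the same number of additions; only the memory access pattern differs.

```c
#include <assert.h>
#include <stddef.h>

/* Walks the array in storage order: consecutive iterations touch
 * adjacent doubles, so each cache line that is fetched is fully used. */
double sum_row_major(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double s = 0.0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            s += m[i * cols + j];
    return s;
}

/* Walks the same row-major array column by column: consecutive
 * iterations are cols doubles apart, so for large matrices most
 * accesses are likely to land on a different cache line. */
double sum_col_major(const double *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    double s = 0.0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            s += m[i * cols + j];
    return s;
}
```

On matrices much larger than the cache, the second version is typically noticeably slower, although how much depends on the hardware and the compiler. A time-based profiler would report both loops as the hotspot without saying why one is slower.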

1. Yes, but if you use a representatively sized dataset and the profiler doesn't show that code as a hotspot, there's no point in changing algorithms.

2. Agreed, but remember the value of the profiler is telling you, "You're spending 97% of your time in this small subset of your code" and your job is to conclude, "Hmm, I should look there for improvements and not at the whole rest of the program." It can't tell you what changes to make, but it keeps you from spending time optimizing what's in the 3% of runtime.

3. Agreed, with the caveat that most profilers that I've used did tell me the absolute overall slowness in addition to the relative slowness, but that wasn't the major thrust of your comment so I'll just observe that small difference and otherwise agree.
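
The 97%/3% reasoning in point 2 can be put into numbers with Amdahl's law. The sketch below is only an illustration; the function name is made up, and it assumes the rest of the program's runtime is unaffected by the change.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction p of the runtime
 * (0 <= p <= 1) is made s times faster and the rest is unchanged. */
double overall_speedup(double p, double s)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}
```

With these assumptions, overall_speedup(0.97, 2.0) is about 1.94, while overall_speedup(0.03, 1000.0) is only about 1.03: doubling the speed of the hotspot nearly doubles the whole program, and even a thousandfold improvement in the other 3% is barely visible.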

It would really help to know what kind of code echo272 wants to run faster: which language and which environment?
Thank you, vardump. It's numerical linear algebra kernels in C, in a multi-core environment.