Hacker News new | ask | show | jobs
by strstr 2401 days ago
Most engineers don't write code with hard performance constraints. Game devs probably need to be fighting to get every frame.

For the bulk of the eng I work with the concept of StoreLoad reordering on x86 would be an academic distraction.

5 comments

> Most engineers don't write code with hard performance constraints.

Only because most software "engineers" don't give two shits about the actual user experience of their glacially slow over-engineered garbage.

99.999% of performance issues in development are related to the abstract model, and not the underlying implementation details. Things such as searching in a big unordered list, repeating the same work, stupid SQL queries ect.
The way I usually follow it is 1) Is this OK? For example It is only called once in the code in an error path? 2) Did I do something silly? For example, I left in some extra debug code. 3) is the code doing something silly? For example extra work, dragging in extra data that is not needed? 4) is the code written in a way that causes extra work, re-init on inner loop, loop hoisting, etc 5) Is there a better algorithm for this? Binary search vs linear? 6) is the code doing too many things at once. De-serialize it re-serialize it 7) Is there a better abstraction for this? Monolith code vs microservice? 8) Is there any compiler switches that are 'easy' wins? Packing, -O2, etc? Usually not. 9) What sort of memory arch does this machine have? It varies between machines even for the x86 world. For example if I rearrange my memory structures do I use less cache and emit less code. The cache line on this box is x bytes and on that box is y bytes. Some languages do not lend themselves well to this one due to the way their VM/ISA is written. 10) is there some asm that would help? usually not.

Usually 1-7 are all you need. If you get all the way to 10 you are in the deep end for most things.

Big O is good for many things. But in reality big O is O(N)+C where the C can get you. That is where the later steps help. But usually you can totally ignore it. Most of the big wins I get are just from flipping out a bad search for a O(log(n)) search, or removing 'extra' code.

On the BigO notation, you have to remember is that it's an upper bound on the runtime. There is no guarantee other than the growth of the function.
Oh I agree. It is just that +C bit. What happens when your O(nLog(n)) basically just trashes the cache because of your data structures? Yet maybe something 'worse' would run better because of the way cache works? That +c bit can sometimes turn into its own big O. It is a nasty little gotcha when it does happen.

It is a good framework to get you in the ballpark of the correct thing. Even usually 99% of the time it is right. But sometimes the arch bites back due to your data.

The worst piece of code I had to optimize was running a database search with an O(n^3) algorithm.

Fixing that did not require knowledge of cache lines.

No, because most software development houses prioritize getting something that works over performance wankery.
Performance is a feature.
But not a feature a substantial number of paying customers care about.
We'll never know if that's true, because they were never given the option.
The problem is you never know when you’ll go from “dev who doesn’t care” to “dev who does”.

While I agree that the details of StoreLoad are likely a distraction the big picture concepts of cache coherence presented in this article are table stakes for performant systems.

The problem is you never know when you’ll go from “dev who doesn’t care” to “dev who does”.

This is true for literally every hard problem in dev though, and the implication that you need to grasp everything just in case you need it is silly. The problem space in compsci is too big to know everything. We have to choose.

Part of the reason this kind of blogs exist is to give you more information to choose properly.
While that might be true in practice, I do think not knowing when you transition between those two indicates a failing by your coworkers/mentors.

I encourage people I work with to read many of the books in this book series. I particularly encourage them to read “Hardware and Software Support for Virtualization”, since it’s basically a book on their job.

Just to be clear, the book series you are referring to is "Synthesis Lectures on Computer Architecture" published by Morgan-Claypool, right?

If you had to rank them in importance for the average engineer, how would you rank them?

Money is another way to measure performance. People that understand these low level principles can write a backend service in C/C++/Java/Erlang/Whatever that can do the work of hundreds of instances of a service written in higher level languages that ignore these concepts. That's not to say that it's never wise to use higher level languages, but these principles can and do save mature established businesses real money.
True, but if my bad python implementation can do the job on one old single core computer with plenty of CPU left over what do I care - in fact because python is generally more productive for small programs it is a lot cheaper. Most programs don't actually handle so much data that they need every bit of performance.

Of course the minority that do need performance are where the real engineers are needed.

I'm writing a little C++/Qt application to visualise a fairly small amount of data.

The lag is already noticeable. If I don't pay attention to performance, I know the end result will be slow and unpleasant to use.

That is true, but you are likely better off using a library than focusing yourself on cache optimization.
Using a library (specifically, taking advantage of Qt's QVariant and view/model framework), is what triggered most of the slowdown.
I was just in a room with a guy who claims he "takes pride in his work" so everything has to be as fast as technically possible., use the least RAM down to KBs, never write to a disk if possible, etc. The problem is the schools. Academics have no concept of who is paying for everything around them its like asking a welfare recipient to teach you about budgets.