by aidenn0 1557 days ago

The proper argument was always that optimizing compilers generate better assembly than 90% of the people using them could generate, and in a fraction of the time.

However, these things often turn into stronger (or different) arguments as they pass from mouth to ear.

Sometimes they change completely, as in "the plural of anecdote is data".

blippage 1556 days ago

I wanted to write a memcpy() routine for a microcontroller. I wrote a naive version that copied from src to dst one byte at a time. More efficient algorithms exist; they typically copy 32-bit words at a time.

The interesting part came when I turned on compiler optimisations. When I examined the generated assembly (even though my knowledge of assembly is poor), I discovered that the compiler had made the optimisations you would find in a more complex C implementation. It evidently thought to itself "I see what you're doing here" and put in a better version.

So the moral of the story is: your compiler is likely to be able to figure out a lot.
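
For reference, the naive version described above might look something like this. It is a minimal sketch, not the actual code from the comment, and the name `naive_memcpy` is made up for illustration.

```c
#include <stddef.h>

/* Hypothetical name; a byte-at-a-time copy like the one described above. */
void *naive_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* One byte per iteration. An optimising compiler may widen or
       vectorise this loop, or replace it with a library call. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return dst;
}
```

Compiling this with and without optimisations and comparing the assembly is an easy way to see how much the compiler changes.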

mhh__ 1556 days ago

Even ignoring the usual optimizations, like using SIMD and loop unrolling to find parallelism in the copy, the compiler has techniques for spotting certain loop idioms, so it can replace the whole loop with a memcpy library call if it deems that profitable (e.g. tell it that N > bigNumber is likely and it'll go for the library call).
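
As a rough illustration of the kind of loops this applies to, here are two candidates. The function names are hypothetical, and whether a given loop is actually replaced depends on the compiler, flags and target. In LLVM the relevant pass is loop idiom recognition; in GCC it is controlled by `-ftree-loop-distribute-patterns`.

```c
#include <stddef.h>

/* Hypothetical helper: a plain zero-fill loop. Compilers with loop-idiom
   recognition may turn the whole loop into a memset call. */
void clear_bytes(unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = 0;
}

/* Hypothetical helper: an element-wise copy. With restrict ruling out
   overlap, this loop is a candidate for replacement by a memcpy call. */
void copy_ints(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```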

cbm-vic-20 1556 days ago

There are additional optimizations too: if you call C's printf without any extra arguments (just a plain string ending in a newline), the compiler will replace it with a call to puts, which doesn't have the formatting code. You can see this in Compiler Explorer.

https://godbolt.org/z/dvdzE4M6T
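
A small sketch to try locally, in case the link goes stale. The function names are made up, and the exact output depends on the compiler and flags; GCC and Clang typically emit a call to puts for the first function and keep printf for the second.

```c
#include <stdio.h>

/* Constant string, no conversions, trailing newline: a candidate for
   being compiled as puts("hello, world"). */
void say_hello(void)
{
    printf("hello, world\n");
}

/* A real conversion (%d) means the formatting code is still needed,
   so this one is expected to stay a printf call. */
void say_count(int n)
{
    printf("count: %d\n", n);
}
```

Note that the return value of printf is deliberately unused here; puts returns something different, so using the result would generally block the substitution.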

ghusbands 1556 days ago

Quite often, that doesn't end up very efficient: without "restrict", the result has to be identical to what a byte-by-byte copy would produce, for every possible overlap of the two buffers.
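
To make the overlap point concrete, here is a minimal sketch with two otherwise identical loops. Both function names are hypothetical.

```c
#include <stddef.h>

/* No restrict: dst and src may overlap, so the result must match a strict
   byte-by-byte forward copy. */
void copy_may_overlap(unsigned char *dst, const unsigned char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* restrict: the caller promises no overlap, so wider loads and stores are
   allowed. Calling this with overlapping buffers is undefined behaviour. */
void copy_no_overlap(unsigned char *restrict dst,
                     const unsigned char *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

With `unsigned char buf[] = "abcdef"`, calling `copy_may_overlap(buf + 1, buf, 4)` has to leave `aaaaaf` in the buffer, because each byte written is read back on the next iteration. A word-at-a-time copy would give a different answer there, so the compiler must either prove there is no overlap or check for it at run time.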

Sprite_tm 1556 days ago

Lots of memcpy() implementations are still more efficient than a dumb byte-by-byte copy. They'll copy the (unaligned) head and the tail in bytes, but copy the bulk of the data using whatever data type and method is fastest.
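
A simplified sketch of that head/bulk/tail structure, assuming 32-bit words are the fastest unit. This is not any particular library's implementation; `copy_words` and `word32` are made-up names, and `may_alias` is a GCC/Clang extension used here to keep the word accesses legal under strict aliasing.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: a 32-bit type that is allowed to alias other types. */
typedef uint32_t __attribute__((may_alias)) word32;

/* Hypothetical name; assumes the buffers do not overlap. */
void *copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Word copies only help if both pointers reach alignment together. */
    if ((uintptr_t)d % sizeof(word32) == (uintptr_t)s % sizeof(word32)) {
        /* Head: single bytes until the pointers are word-aligned. */
        while (n > 0 && (uintptr_t)d % sizeof(word32) != 0) {
            *d++ = *s++;
            n--;
        }
        /* Bulk: one aligned 32-bit word per iteration. */
        while (n >= sizeof(word32)) {
            *(word32 *)d = *(const word32 *)s;
            d += sizeof(word32);
            s += sizeof(word32);
            n -= sizeof(word32);
        }
    }

    /* Tail: leftover bytes, or everything if the alignments differ. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Real implementations usually go further, for example by handling mismatched alignments with shifts or unaligned loads rather than falling back to bytes.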

samus 1557 days ago

> "the plural of anecdote is data"

I love that line!
