by BubRoss 2197 days ago
What are you basing this on exactly?

Despite years of propaganda, C is not well matched to the CPUs currently in use (quite the opposite in places), and the typical optimizations don't necessarily work when dealing with the external I/O that you need to do in a switch.

Essentially, even if you write in C, reaching higher speeds will involve using "assembly masquerading as C" rather than depending on compiler optimizations.

Also, Snabb uses LuaJIT, which already generates quite tight code, so the performance gap that I suspect some imagine just isn't that wide.

>C is not well matched to CPUs + >depending on compiler optimizations + >Snabb uses LuaJIT .. quite tight code

==

You can write great C-based systems and avoid assembly if you a) know, always, what your compiler is doing and b) know, always, what your CPU is doing...

My point was that the same optimizations that made C fast break down when you need to take into account the often intricate dance between CPU caches, memory, the I/O bus, etc. - so unless you go into CPU-model-specific assembly tweaks, just using C might not bring you as much benefit.

Is it possible to do better? Yes. Would it count as "normal C"? I would say not really (if we say yes, then CL code on SBCL with huge custom VOPs counts too).

I'd be interested to learn about significantly complex, nontrivial systems of say 100K to 1M LOC scale that require reasoning about every single instruction from the perspective of every other instruction, in order for the system to work.
You do not do that. Instead you optimize the short, critical part.

Of course this does not apply to everything; you need a few hotspots. But it is quite common: audio/video codecs, scientific computation, games, crypto... And even networking.

And with Lua (and C, and C++) it's pretty easy to manage the complexity. Just put things where they belong.
I think the answer here is that LuaJIT is fast, and that well-written native programs would still be faster, not that C "isn't well matched to CPUs". Modern optimizations are more about memory access patterns than anything else, with SIMD and concurrency beyond that. Focusing on assembly is really not the apex it used to be. For starters, CPUs have multiple integer and floating point units, and they get scheduled in an out-of-order CPU. Out-of-order execution is as much about keeping the various units busy as it is about doing loads as soon as possible to avoid stalling.

I think if you are going to claim that C or C derivatives aren't actually fast, and that the idea that they are is due to "propaganda", then you should back that up with something concrete, because it goes against a lot of established expertise.
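
To make the "memory access patterns" point above concrete, here is a minimal sketch in plain C, with no assembly involved. The function names `sum_row_major` and `sum_col_major` are made up for illustration and don't come from Snabb or any other project mentioned in the thread. Both functions compute the same sum over a matrix stored in row-major order; the only difference is the order in which memory is visited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum a rows x cols matrix stored in row-major
 * order, visiting elements in the order they sit in memory, so
 * consecutive iterations touch adjacent addresses. */
uint64_t sum_row_major(const uint32_t *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    uint64_t total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += m[r * cols + c];
    return total;
}

/* Hypothetical helper: same result, but the inner loop walks down a
 * column, so each iteration jumps ahead by `cols` elements. */
uint64_t sum_col_major(const uint32_t *m, size_t rows, size_t cols)
{
    assert(m != NULL);
    uint64_t total = 0;
    for (size_t c = 0; c < cols; c++)
        for (size_t r = 0; r < rows; r++)
            total += m[r * cols + c];
    return total;
}
```

On a matrix much larger than the cache, I'd expect the strided version to be noticeably slower on most hardware, but I'm not quoting numbers: it depends on the CPU, the matrix shape, and whether the compiler decides to interchange the loops for you. Time both on your own machine before drawing conclusions.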