by flohofwoe
1104 days ago
You should still at least care about 'CPU friendly' data layout in memory and data access patterns in your code to make the CPU's life easier; compiler magic won't help all that much there. This can often trivially give you a 10x, and sometimes a 100x, performance difference for real-world single-threaded code, especially if it needs to work on big data sets. Your fancy high level compiler won't magically reshuffle your data in memory to help with prefetching (at least I'm not aware of a language that does). Most high level languages popular today are "rooted" in the '90s, when the latency gap between CPU and memory practically didn't exist, and thus don't care about this specific aspect. Explicit control over memory layout is probably also one of the main reasons why C stood the test of time so well.
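To make the layout point concrete, here is a minimal sketch in C (not from the original comment) contrasting an "array of structs" with a "struct of arrays". The names `ParticleAoS`, `ParticlesSoA`, `sum_x_aos` and `sum_x_soa` are made up for illustration. The loop only needs `x`, but in the first layout every `x` drags its whole record through the cache along with it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, for illustration only.
   With 4-byte floats this struct is 64 bytes, so on a machine with
   64-byte cache lines each element occupies about one full line. */
typedef struct {
    float x, y, z;     /* position */
    float vx, vy, vz;  /* velocity */
    char  name[40];    /* "cold" data the loops below never read */
} ParticleAoS;

/* Array of structs: consecutive x values are sizeof(ParticleAoS) bytes
   apart, so most of every cache line fetched is data we never use. */
float sum_x_aos(const ParticleAoS *p, size_t n) {
    assert(p != NULL || n == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += p[i].x;
    }
    return sum;
}

/* Struct of arrays: each field lives in its own contiguous array. */
typedef struct {
    float *x;
    float *y;
    float *z;
} ParticlesSoA;

/* Consecutive x values are adjacent in memory, so every byte of each
   fetched cache line is useful and the access pattern is a plain
   linear scan, which hardware prefetchers generally handle well. */
float sum_x_soa(const ParticlesSoA *p, size_t n) {
    assert(p != NULL && (p->x != NULL || n == 0));
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        sum += p->x[i];
    }
    return sum;
}
```

Both functions compute the same result; only the memory traffic differs. How big the gap is depends on the record size, the data set size relative to the caches, and the hardware, so treat this as a shape to measure rather than a guaranteed speedup.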
That's one hell of a speedup. Can you explain how you managed that? I can certainly believe it, but not when you put "trivially" in front of it; I could only conceive of that in very special cases and with very careful coding.