by haberman
1605 days ago
The visualization tools presented look really nice, but they seem to present program execution as sequential and linear, which is a model that seems like it will really break down at these time scales (10s of cycles).

Modern processors will look hundreds of instructions into the future and try to start executing them as soon as possible. Branches are predicted far in advance of when they can actually be evaluated. Many instructions can be executing simultaneously. A clean tidy flame graph showing 1-3ns slices (~5 cycles) cannot help but be a vast simplification of what the CPU is really doing.

The linked page about Processor Trace says this:

> instruction data (control flow) is perfectly accurate but timing information is less accurate

The article mentions using magic-trace to detect changes in inlining decisions made by the compiler. This is a case where it will shine, since PT can perfectly capture the control flow, and it doesn't necessarily rely on having perfect timestamps for everything.
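For a rough sense of scale on the "1-3ns slices (~5 cycles)" figure, here is a minimal sketch of the conversion. The 3 GHz clock is an assumption chosen for illustration (the article does not state a frequency), and `ns_to_cycles` is a hypothetical helper, not part of magic-trace.

```python
def ns_to_cycles(duration_ns: float, clock_ghz: float) -> float:
    """Convert a duration in nanoseconds to clock cycles.

    One GHz is one cycle per nanosecond, so the conversion is a plain product.
    """
    return duration_ns * clock_ghz


# Assumed clock frequency for illustration only; real parts vary and
# change frequency dynamically.
ASSUMED_CLOCK_GHZ = 3.0

for ns in (1.0, 2.0, 3.0):
    cycles = ns_to_cycles(ns, ASSUMED_CLOCK_GHZ)
    print(f"{ns:.0f} ns is about {cycles:.0f} cycles at {ASSUMED_CLOCK_GHZ} GHz")
```

Under that assumption a 1-3ns slice spans single-digit cycle counts, which is far shorter than the window of in-flight instructions described above, so the slice boundaries are necessarily approximate.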
|
Anyway, I wanted to say how much I appreciate your comment of 10 years ago. I'm also a parser nerd, and a performance nerd, and I feel strongly that programmers have a professional responsibility to write code in a way that expresses our intent with a logical minimum of instructions/work. I strongly suspect that this will become important again in the future, not because the ratio of software-efficiency to hardware-power decreases again, but because climate concerns will drive us to measure our code in performance-per-watt rather than performance-per-dollar (depending on what action is taken on carbon pricing, it may be a distinction without a difference).
I look forward to the day when grossly inefficient software is rightly considered to be as unacceptable as grossly inefficient SUVs, and people in our profession are forced to take responsibility for the damage that their obscenely inefficient crap is doing. I hope Python 4 comes with a snorkel.