by rrss
1605 days ago
It seems like this is basically unavoidable on existing hardware, though, right? If we imagine there existed some visualization that could more accurately represent the complexity of a core, I don’t know how it would be possible to get the data, because AFAIK there are no methods to trace processor execution for modern processors at higher fidelity than this. Even sampling profilers have similar issues with being limited to the model of sequential instruction streams, since each sample gives a single program counter, not the full view of everything the core has in flight.
I also agree that sampling profilers have the same issue: instruction-level views of sampling profiles should be taken with a grain of salt.
My concern is that flame graphs with 1-3ns of resolution are presented as a selling point of the tool, without any mention of the caveats around how this model really breaks down at this time scale. I would like to know more details of how the PT data actually relates to the out-of-order execution. Does a branch's timestamp correspond to when that branch was retired? Do we actually know what the timestamp corresponds to, or is it not well-specified? Are there cases where the timestamp is known to be misleading about the true bottleneck?
I don't know the answers to these questions, but when I see a tool like this, I really want more information about the strengths and limitations of the data.