by woadwarrior01 2417 days ago
Also worth mentioning here is perf[1], which is great for low-overhead profiling. perf profiles can also be converted with autofdo[2] into profiles compatible with GCC and LLVM PGO, so you can build optimized binaries based on production runs. In my use case, the overhead of instrumentation-based profiling was too high to use on production workloads.

[1]: https://en.m.wikipedia.org/wiki/Perf_%28Linux%29

[2]: https://github.com/google/autofdo
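
For anyone who wants to try this, here is a rough sketch of the LLVM flavor of that workflow. It only builds the command lines and does not run anything. The binary name, source file, and output paths are made up for illustration, and I am assuming a CPU and kernel where perf's branch-stack sampling (`-b`) works; check the autofdo README for the exact event it recommends on your hardware.

```python
import shlex


def autofdo_commands(binary, source, perf_data="perf.data", profile="code.prof"):
    """Return the three command lines of a sampled-profile build (not executed here).

    All file names are hypothetical placeholders.
    """
    return [
        # 1. Sample the running workload, recording branch stacks.
        ["perf", "record", "-b", "-o", perf_data, "--", binary],
        # 2. Convert the perf recording into an LLVM sample profile with autofdo.
        ["create_llvm_prof", f"--binary={binary}",
         f"--profile={perf_data}", f"--out={profile}"],
        # 3. Rebuild with the profile; line tables let samples map back to source.
        ["clang", "-O2", "-gline-tables-only",
         f"-fprofile-sample-use={profile}", source, "-o", binary + ".opt"],
    ]


if __name__ == "__main__":
    for cmd in autofdo_commands("./server", "server.c"):
        print(shlex.join(cmd))
```

For GCC the shape is the same, except that the conversion step uses autofdo's create_gcov tool and the rebuild uses `-fauto-profile=`.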

4 comments

Don't forget Hotspot as a way to visualize perf results, an area where vanilla perf is unfortunately a bit lacking: https://github.com/KDAB/hotspot
Thank you, I have been looking for something that could display perf recordings as a swim lane view for threads.
For anyone looking to use perf, Brendan Gregg's page on it is the best resource I know of: http://www.brendangregg.com/perf.html
Fun fact: the pprof tool that comes in gperftools can read perf data files, so you can use them together if you prefer pprof’s reporting.
perf and its ilk are obviously useful, but you need to be aware of several cans of worms, with sampling hardware counters in particular. These include the timing mechanism used for sampling, the documentation and intrinsic usefulness of particular counters, and the issues that arise when you multiplex more counters than the hardware can count simultaneously. For multiplexing see, for instance, https://www.research.manchester.ac.uk/portal/files/59933625/...
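
To make the multiplexing point concrete, here is a minimal sketch of the usual scaling estimate. When an event only gets part of the time on a hardware counter, the kernel reports how long the event was enabled and how long it was actually running, and the raw count is extrapolated by that ratio. The function name and the numbers are mine, purely for illustration.

```python
def scale_multiplexed_count(raw_count, time_enabled, time_running):
    """Extrapolate a multiplexed counter to the full enabled interval.

    Assumes the event rate was uniform over the interval, which is exactly
    the assumption that can go wrong for bursty workloads.
    """
    if time_running == 0:
        return None  # the event never got onto a counter; nothing to scale
    return raw_count * time_enabled / time_running


if __name__ == "__main__":
    # The event was on a counter for a quarter of the enabled time.
    print(scale_multiplexed_count(1000, 100, 25))  # 4000.0
```

The estimate is only as good as that uniformity assumption: if the interesting phase of the program happened while the event was switched out, the scaled number can be far off.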