Cool project, lightweight and easy to drop into a codebase. I have a few ideas.

1. Zone name storage – right now you store a raw `char *`. If it points to a stack/local string, it can blow up once that string goes out of scope. Better to use `const char *` (if names are always literals) or copy the name with `strdup`.
2. Thread-safety – the global arrays (`prof_zones`, `prof_zones_stack`) aren't protected, so concurrent calls from several threads will race and corrupt them. You could make them `thread_local` (`_Thread_local` in C11).
3. Cycles vs time – `__rdtsc()` isn't always a stable time source (TurboBoost, NUMA, CPU scaling). `clock_gettime()` is more portable across architectures on POSIX systems. Might be nice to let the user pick via a macro.
4. Output – the `printf("[%16s]...")` is old-school. Sorting zones by `total_secs` or `total_cycles` would make hot spots pop out immediately.
5. Limits – `PROF_MAX_NUM_ZONES = 256` is small for bigger projects. Could `malloc` and grow dynamically.
6. Overhead – the profiler doesn't subtract its own overhead (push/pop, clock reads), so absolute numbers are inflated. Relative numbers are still valid, though.

Bonus idea: add a macro like

    #define PROF_SCOPE(name) for (int _i = (prof_begin(name), 0); !_i; prof_end(), _i++)

so you can just write:

    void foo() {
        PROF_SCOPE("foo") {
            // code...
        }
    }

The block attached to the macro is what gets timed, so the braces matter: with a bare `PROF_SCOPE("foo");` the loop body is an empty statement and nothing inside `foo` is measured.
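To make the bonus idea concrete, here is a minimal sketch of how the macro behaves. The `prof_begin`/`prof_end` below are hypothetical stand-ins that only track nesting (the real ones would read a clock and update zone totals), and `sum_to` is just a made-up function to profile.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the library's prof_begin()/prof_end().
   They only count nesting so the macro's behavior can be observed. */
static int prof_depth = 0;        /* zones currently open */
static int prof_zones_closed = 0; /* zones closed so far */

static void prof_begin(const char *name) {
    (void)name; /* a real profiler would look up or create the zone here */
    prof_depth++;
}

static void prof_end(void) {
    assert(prof_depth > 0); /* prof_end() without a matching prof_begin() */
    prof_depth--;
    prof_zones_closed++;
}

/* The for-header calls prof_begin() once, lets the attached block run
   exactly once, then calls prof_end() and terminates the loop. */
#define PROF_SCOPE(name) \
    for (int _i = (prof_begin(name), 0); !_i; prof_end(), _i++)

/* Made-up workload: sums 1..n inside a profiled block. */
static int sum_to(int n) {
    int total = 0;
    PROF_SCOPE("sum_to") {
        for (int i = 1; i <= n; i++)
            total += i;
    }
    return total; /* returned after the block, so prof_end() has already run */
}
```

One caveat with this trick: leaving the block early with `return`, `break`, or `goto` skips `prof_end()`, so the zone stays open. Keeping the `return` after the block, as in `sum_to`, avoids that.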
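For point 3, picking the clock via a macro could look roughly like this. It is only a sketch: `PROF_USE_RDTSC`, `prof_tick_t`, `prof_now`, `prof_elapsed` and `PROF_TICK_UNIT` are names I made up, not anything from the library, and the fallback path assumes a POSIX system with `CLOCK_MONOTONIC`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime() under strict -std=c11 */
#endif

#include <assert.h>
#include <stdint.h>

typedef uint64_t prof_tick_t; /* hypothetical tick type: cycles or nanoseconds */

#if defined(PROF_USE_RDTSC)
/* Cycle counter path (x86 with GCC/Clang; MSVC would use <intrin.h>). */
#include <x86intrin.h>
#define PROF_TICK_UNIT "cycles"
static inline prof_tick_t prof_now(void) {
    return __rdtsc();
}
#else
/* Wall-clock path: monotonic time in nanoseconds. */
#include <time.h>
#define PROF_TICK_UNIT "ns"
static inline prof_tick_t prof_now(void) {
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0); /* only fails if the clock id is unsupported */
    (void)rc;
    return (prof_tick_t)ts.tv_sec * UINT64_C(1000000000) + (prof_tick_t)ts.tv_nsec;
}
#endif

/* Ticks between two prof_now() readings, in PROF_TICK_UNIT. */
static inline prof_tick_t prof_elapsed(prof_tick_t start, prof_tick_t end) {
    return end - start;
}
```

Building with `-DPROF_USE_RDTSC` would select the cycle counter and everything else would get the monotonic clock. The rest of the profiler only needs `prof_now()` and the unit string for printing.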
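And for point 4, sorting before printing is a few lines with `qsort`. Again a sketch under assumptions: the `prof_zone_t` struct is hypothetical and only borrows the `total_secs`/`total_cycles` field names mentioned above, and `prof_print_sorted` is a made-up helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical zone record; the real struct in the library may differ. */
typedef struct {
    const char *name;
    double total_secs;
    unsigned long long total_cycles;
} prof_zone_t;

/* qsort comparator: largest total_secs first. */
static int prof_cmp_by_secs_desc(const void *a, const void *b) {
    const prof_zone_t *za = a;
    const prof_zone_t *zb = b;
    if (za->total_secs < zb->total_secs) return 1;
    if (za->total_secs > zb->total_secs) return -1;
    return 0;
}

/* Sorts the zones in place (hottest first) and prints one line per zone. */
static void prof_print_sorted(prof_zone_t *zones, size_t count) {
    assert(zones != NULL || count == 0);
    qsort(zones, count, sizeof zones[0], prof_cmp_by_secs_desc);
    for (size_t i = 0; i < count; i++)
        printf("[%16s] %12.6f s %14llu cycles\n",
               zones[i].name, zones[i].total_secs, zones[i].total_cycles);
}
```

Note that this reorders the array in place. If anything else holds indices into it (a zone stack, for example), it would be safer to sort a copy or to sort only once at report time.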