by galangalalgol 1427 days ago
Our pipelines have ASan (and Cppcheck, clang-tidy, Coverity and coverage stuff) but no Valgrind. Is there something it is good at that we are missing?
4 comments

ASAN on its own doesn't detect uninitialized memory. MSAN can, though. Valgrind is also more than just the memcheck sub-tool - there are others, like Cachegrind, which is a cache and branch-prediction profiler.

https://github.com/google/sanitizers/wiki/AddressSanitizerCo... https://github.com/google/sanitizers/wiki/MemorySanitizer https://valgrind.org/docs/manual/manual.html

Yeah, Valgrind (via Cachegrind) can report L1/L2 cache misses and the percentage of branch mispredictions. It also reports the exact number of instructions executed, and how many of those instructions missed the cache. It's great for improving small code that needs to be performant.

If memory leaks are all you're after, I'd use ASan over Valgrind. It's faster.

If you only want memory leaks, LSan will do that for you.

In general, I tend to use ASan for nearly everything I used Valgrind for back in the day; it's faster and usually more precise (Valgrind cannot reliably detect small overflows between stack variables). I reach for Valgrind if I cannot recompile, or if ASan doesn't find the issue. Callgrind and Cachegrind never; perf does a much better job, much faster. DHAT never; Heaptrack gives me what I want.

Valgrind was and is a fantastic tool; it became part of my standard toolkit together with the editor, compiler, debugger and build system. But technology has moved on for me.

Amen. Between the various sanitizers and perf, I stopped needing valgrind a few years ago.

But when it was the only option it was fantastically useful.

If I understand correctly, Valgrind (Cachegrind) reports L1/L2 cache misses based on a simulated CPU/cache model.

On Linux, you can easily instrument real cache events using the very powerful perf suite. There is an overwhelming number of events you can instrument (use perf-list(1) to show them), but a simple example could look like this:

  $ perf stat -d -- sh -c 'find ~ -type f -print | wc -l'
  ^Csh: Interrupt
   Performance counter stats for 'sh -c find ~ -type f -print | wc -l':
  
               47,91 msec task-clock                #    0,020 CPUs utilized
                 599      context-switches          #   12,502 K/sec
                  81      cpu-migrations            #    1,691 K/sec
                 569      page-faults               #   11,876 K/sec
         185.814.947      cycles                    #    3,878 GHz                      (28,71%)
         105.650.405      instructions              #    0,57  insn per cycle           (46,15%)
          22.991.322      branches                  #  479,863 M/sec                    (46,72%)
             643.767      branch-misses             #    2,80% of all branches          (46,14%)
          26.010.223      L1-dcache-loads           #  542,871 M/sec                    (36,80%)
           2.449.173      L1-dcache-load-misses     #    9,42% of all L1-dcache accesses  (29,62%)
             517.052      LLC-loads                 #   10,792 M/sec                    (22,53%)
             133.152      LLC-load-misses           #   25,75% of all LL-cache accesses  (16,02%)
  
         2,403975646 seconds time elapsed
  
         0,005972000 seconds user
         0,046268000 seconds sys

Ignore the command; it's just a placeholder to get meaningful values. The -d flag adds basic cache events; adding another -d also gets you load and load-miss events for the dTLB, iTLB and L1i cache.

But as mentioned, you can instrument any event supported by your system, including very obscure events such as uops_executed.cycles_ge_2_uops_exec (Cycles where at least 2 uops were executed per-thread) or frontend_retired.latency_ge_2_bubbles_ge_2 (Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall).

You can also record data using perf-record(1) and inspect it using perf-report(1) or, my personal favorite, the Hotspot tool (https://github.com/KDAB/hotspot).

Sorry for hijacking the discussion a little, but I think perf is an awesome little tool and not as widely known as it should be. IMO, when using it as a profiler (perf-record), it is vastly superior to any language-specific built-in profiler. Unfortunately some languages (such as Python or Haskell) are not a good fit for profiling using perf instrumentation as their stack frame model does not quite map to the C model.

If your tests can take the performance hit, Valgrind would tell you about uninitialized memory reads, which aren't covered by the tools you mentioned. If, however, you are able to add MSAN to the pipeline (i.e. able to rebuild the entire product, including dependencies, with -fsanitize=memory), then you would have the same coverage as Valgrind.
The main reason for Valgrind would be if you're working with a binary that you can't recompile to add the ASAN instrumentation.