HPCToolkit and TAU are good options for profiling C++ applications. Both come from the HPC world, so they are designed for parallel and highly concurrent applications.
ThreadSpotter (paratools.com) and MAQAO (maqao.org) might also be of interest, at least on x86_64 GNU/Linux, but I don't know about their C++ specifics.
[TAU is doubtless a good bet, but for general interest, the other common systems for HPC are openspeedshop.org, Cube/Scalasca (scalasca.org), and Extrae/Paraver (bsc.es). A good comparison of them all would be useful, but I've not found one.]