32-bit multiplication: 3 pJ [1]
The energy savings come from not transporting data.
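To make the arithmetic concrete, here is a rough sketch of how the totals add up. Only the 3 pJ multiplication figure comes from [1]; the function name `energy_pj`, its parameters, and especially `PLACEHOLDER_MOVE_PJ` are my own assumptions for illustration. The per-word transport cost is a made-up placeholder, not a measured number, so plug in a real figure for your memory system.

```python
# Energy of one 32-bit multiplication, taken from the figure quoted above [1].
MULT_PJ = 3.0


def energy_pj(n_mults: int, words_moved: int, pj_per_word_moved: float) -> float:
    """Total energy in picojoules: compute cost plus data-transport cost."""
    return n_mults * MULT_PJ + words_moved * pj_per_word_moved


# ASSUMPTION: illustrative placeholder only, NOT a measurement from [1].
PLACEHOLDER_MOVE_PJ = 100.0

n = 1_000_000
# Case 1: every operand has to be transported before it is multiplied.
far = energy_pj(n, words_moved=n, pj_per_word_moved=PLACEHOLDER_MOVE_PJ)
# Case 2: the data is already where the multiplier is, so nothing moves.
near = energy_pj(n, words_moved=0, pj_per_word_moved=PLACEHOLDER_MOVE_PJ)

print(f"with transport:    {far / 1e6:.1f} uJ")
print(f"without transport: {near / 1e6:.1f} uJ")
```

Whatever transport cost you substitute, the multiplication term stays the same in both cases; the only difference between them is the transport term, which is the point being made above.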
[1] http://www.sigmod2014.org/damon/slides/picojoule.kozyrakis.p...