| Got me wondering how this compares with neural efficiency, realizing ofc that there's nothing really apples-to-apples here. Training one of these big models takes 100 kWh for 1e19 flops, so that's 100k Wh, 360M Ws, or 360 MJ, i.e. 3.6e8 J. 3.6e8 J / 1e19 flops = 3.6e-11 J/flop, so on the order of 1e-11 J/flop. Neurons take ~1e-8 J/spike.[1] Math check appreciated :) It does seem plausible to think of a single neuron spike (Hodgkin-Huxley cable model) being modeled with ~1k flops, which would put a simulated spike at ~3.6e-8 J, within a factor of a few of the real thing. Though I'm firmly of the opinion that nobody really knows how the brain works... the neural spike activity could be pure epiphenomenon... who knows! [1] “Finally, the energy supply to a neuron by ATP is 8.31 × 10⁻⁹ J. Meanwhile, integrating the total power with respect to time we will get the consumed electric power, which is 8.75 × 10⁻⁹ J. This is more energy than the ATP supplied. The energy efficiency is 105.3%. This is an anomaly…” - 2017 Feb 16 Wang, Xu, Institute for Cognitive Neurodynamics, East China University of Science and Technology
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5337805/ |
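
For anyone who wants to redo the math check, here is a minimal Python sketch of the back-of-envelope arithmetic above. It only uses the figures quoted in the comment (100 kWh, 1e19 flops, the 8.31e-9 J ATP figure from [1]); the ~1k flops per simulated spike is the comment's own guess, and all variable names are made up for illustration.

```python
# Figures as quoted in the comment; none of them are independently verified here.
TRAINING_ENERGY_KWH = 100.0         # claimed energy for the training run
TRAINING_FLOPS = 1e19               # claimed floating-point operations for that run
ATP_ENERGY_PER_SPIKE_J = 8.31e-9    # ATP energy figure quoted from [1]
FLOPS_PER_SPIKE = 1_000             # ASSUMPTION: the comment's rough guess

JOULES_PER_KWH = 3.6e6              # 1 kWh = 1000 W * 3600 s

# Convert the training energy to joules, then divide by the flop count.
training_energy_j = TRAINING_ENERGY_KWH * JOULES_PER_KWH
joules_per_flop = training_energy_j / TRAINING_FLOPS

# Energy to simulate one spike under the ~1k flops assumption.
simulated_spike_j = joules_per_flop * FLOPS_PER_SPIKE

# How many times more energy the simulated spike costs than the quoted ATP figure.
ratio = simulated_spike_j / ATP_ENERGY_PER_SPIKE_J

print(f"training energy:        {training_energy_j:.2e} J")
print(f"energy per flop:        {joules_per_flop:.2e} J/flop")
print(f"simulated spike energy: {simulated_spike_j:.2e} J")
print(f"simulated / biological: {ratio:.1f}x")
```

Under these assumptions the simulated spike comes out at a few times the quoted ATP energy per spike, so the two are in the same ballpark. The result scales linearly with the flops-per-spike guess, which is the least certain input.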