by ben_w 820 days ago
> They're not guesses. We know they use A100s and we know how fast an A100 goes.

And we don't know how many GPT-4 instances run on any single A100 or, if it's the other way around, how many A100s are needed to run a single GPT-4 instance. We also don't know how many tokens/second any given instance produces, so multiple users may be (my guess is they are) queued on any given instance. We have a rough idea how many machines they have, but not how intensively they're being used.

> You can cut a brain open and see how many neurons it has and how often they fire. Kurzweil's 10 petaflops for the brain (100e9 neurons * 1000 connections * 200 calculations) is a bit high for me honestly. I don't think connections count as flops. If a neuron only fires 5-50 times a second then that'd put the human brain at .5 to 5 teraflops it seems to me.

You're double-counting. "If a neuron only fires 5-50 times a second" = maximum synapse firing rate * fraction of cells active at any given moment, and the 200 is what you get from assuming it could go at 1000/second (they can) but only 20% are active at any given moment (a bit on the high side, but not by much).

Total = neurons * synapses/neuron * maximum synapse firing rate * fraction of cells active at any given moment * operations per synapse firing

1e11 * 1e3 * 1e3 Hz * 10% (the fraction of your brain in use at any given moment, which is where the similarly phrased misconception comes from) * 1 floating point operation = 1e16 FLOP/s = 10 PFLOP/s
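
To make the arithmetic easy to poke at, here is a rough sketch of the same product in Python. The function name and parameter names are mine, made up for illustration. The defaults are just the round numbers used above, not measurements, and the whole thing assumes the factors simply multiply.

```python
def brain_flops(
    neurons=1e11,              # ~100 billion neurons (round number from above)
    synapses_per_neuron=1e3,   # ~1000 connections each
    max_rate_hz=1e3,           # peak synapse firing rate, ~1000/second
    active_fraction=0.10,      # share of cells active at any given moment
    ops_per_firing=1.0,        # assumed floating point ops per synapse firing
):
    """Back-of-envelope estimate of brain compute in FLOP/s (illustrative only)."""
    return (
        neurons
        * synapses_per_neuron
        * max_rate_hz
        * active_fraction
        * ops_per_firing
    )


if __name__ == "__main__":
    estimate = brain_flops()
    print(f"{estimate:.1e} FLOP/s = {estimate / 1e15:.0f} PFLOP/s")

    # The "200 calculations" figure corresponds to 1000 Hz * 20% active.
    higher = brain_flops(active_fraction=0.20)
    print(f"{higher:.1e} FLOP/s with 20% of cells active")
```

Raising `ops_per_firing` above 1 scales the result linearly, which is the knob that matters if simulating a synapse firing costs more than a single operation.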

It currently looks like we need more than 1 floating point operation to simulate a synapse firing.

> The other estimates like 1e28 are measuring different things.

Things which may turn out to be important for e.g. Hebbian learning. We don't know what we don't know. Our brains are much more sample-efficient than our ANNs.