by tomjohnneill 843 days ago
I can definitely imagine they're not covering the amortised cost of the training with the cost per individual inference request. It seems less likely to me that they're making a significant loss on each subsequent request, but again no source from me on that either.

Looking a bit more into this, I found this paper: https://arxiv.org/pdf/2311.16863.pdf. It references a table saying that text generation uses 0.047 kWh per 1000 inferences (roughly 0.00005 kWh per inference), which is 1-2 orders of magnitude lower than my estimate. That figure is for GPT2, though, so it possibly tracks to something around 0.001 kWh per inference for GPT3.5.
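For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. The only input taken from the paper's table is the 0.047 kWh per 1000 inferences figure; the 10x and 100x multipliers are my own assumption about how far GPT3.5 might sit above GPT2, not measured values, and the function names are just for illustration.

```python
# Figure quoted above: 0.047 kWh per 1000 text-generation inferences.
KWH_PER_1000_INFERENCES = 0.047


def kwh_per_inference(kwh_per_1000: float) -> float:
    """Convert an energy figure per 1000 inferences to a per-inference figure."""
    return kwh_per_1000 / 1000


def scaled_estimates(base_kwh: float, orders_of_magnitude=(1, 2)) -> dict:
    """Scale a per-inference figure up by powers of ten.

    The scaling factors are an assumption (1-2 orders of magnitude),
    not something measured for any particular model.
    """
    return {n: base_kwh * 10**n for n in orders_of_magnitude}


base = kwh_per_inference(KWH_PER_1000_INFERENCES)
estimates = scaled_estimates(base)

print(f"baseline: {base:.6f} kWh per inference")
for n, kwh in estimates.items():
    print(f"+{n} order(s) of magnitude: {kwh:.6f} kWh per inference")
```

The ~0.001 kWh guess falls between the two scaled values, so it is consistent with a 1-2 order of magnitude bump, but it is still only a guess.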

1 comment

Well, doesn’t the compute time for transformers scale roughly quadratically with model size?

Would it make sense for power consumption to also scale roughly quadratically?

I'm not sure. The figures I've seen suggest that GPT3 required 10x more energy to train than GPT2 (e.g. https://www.nnlabs.org/power-requirements-of-large-language-....), so I think an increase of roughly 1-2 orders of magnitude in energy usage from GPT2 to GPT3.5 makes sense.