by hugmynutus
409 days ago
I find this unconvincing. The actual discussion of LLM generation is very lacking. The original link [1] cites a discussion of the cost per query of GPT-4o at 0.3whr [2]. When you read the document [2] itself, you see that 0.3whr is a lower bound & 40whr is the upper bound.

The paper [2] is actually pretty solid; I recommend it. It uses the public metrics from other LLM APIs to derive a likely distribution of the context size of the average query for GPT-4o, which is a reasonable approach given that the data isn't public. It then factors in GPU power per FLOP, average utilization, and cloud/renting overhead. It admits this likely has non-trivial error bars, concluding the average is between 1-4whr per query.

This is disappointing to me, as the original link [1] attempts to bring in this source [2] to disprove the 3whr "myth" created by another paper [3], yet this 3whr figure lies directly within the error bars their new source [2] arrives at.

Links:

1. https://simonwillison.net/2025/Apr/29/chatgpt-is-not-bad-for...
2. https://epoch.ai/gradient-updates/how-much-energy-does-chatg...
3. https://www.sciencedirect.com/science/article/pii/S254243512...

Edit: whr not w/hr
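For anyone who wants to play with the arithmetic, here is a rough sketch of the kind of bottom-up estimate described above (compute per query, GPU power, utilization, overhead). The function names and every input value are placeholders I made up for illustration; they are not the figures used in [2], and the real analysis is more involved.

```python
def energy_per_query_wh(flops_per_query, peak_flops_per_s, utilization,
                        gpu_power_w, overhead):
    """Bottom-up estimate of energy per query in watt-hours.

    All parameters are hypothetical inputs, not values taken from [2].
    """
    # GPU time needed for one query at the given effective throughput.
    gpu_seconds = flops_per_query / (peak_flops_per_s * utilization)
    # Energy in joules, scaled by a datacenter/cloud overhead multiplier.
    joules = gpu_seconds * gpu_power_w * overhead
    # 1 Wh = 3600 J.
    return joules / 3600.0


def lies_within(value, low, high):
    """True if value falls inside the closed interval [low, high]."""
    return low <= value <= high


# Placeholder inputs chosen only so the numbers are easy to follow.
example = energy_per_query_wh(
    flops_per_query=1e14,
    peak_flops_per_s=1e15,
    utilization=0.1,
    gpu_power_w=1000.0,
    overhead=1.2,
)

# The point made above: a 3whr figure sits inside a 1-4whr range.
three_in_range = lies_within(3.0, 1.0, 4.0)
```

The output swings a lot depending on the utilization and per-query compute you plug in, which is why the error bars matter more than any single headline number.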
|
Thus the results inherently fail to analyze the underlying question.
A more realistic estimate is to take their total spending and assume X% of their expenses go to electricity, directly or indirectly, because the environmental impact adds up across all of it. Even that ignores the energy costs on 3rd-party servers when they download their training data.
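A minimal sketch of that top-down approach, assuming you can guess total spending, the electricity share X, an electricity price, and a query count. None of these numbers are real; the function name and all the inputs are hypothetical, and it still leaves out the 3rd-party server costs mentioned above.

```python
def spend_based_wh_per_query(total_spend_usd, electricity_share,
                             usd_per_kwh, queries):
    """Top-down estimate: convert a share of spending into Wh per query.

    All inputs are hypothetical guesses, not reported figures.
    """
    if not 0.0 <= electricity_share <= 1.0:
        raise ValueError("electricity_share must be between 0 and 1")
    if usd_per_kwh <= 0 or queries <= 0:
        raise ValueError("usd_per_kwh and queries must be positive")
    # Dollars attributed to electricity, converted to kWh at the given price.
    kwh = total_spend_usd * electricity_share / usd_per_kwh
    # Convert kWh to Wh and spread across all queries.
    return kwh * 1000.0 / queries


# Made-up inputs: $1M spend, X = 10%, $0.10 per kWh, 1 billion queries.
estimate = spend_based_wh_per_query(
    total_spend_usd=1_000_000,
    electricity_share=0.10,
    usd_per_kwh=0.10,
    queries=1_000_000_000,
)
```

The result is only as good as the guess for X, but it at least counts training and idle capacity, which a per-query inference estimate leaves out.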