by penteract
302 days ago
From the report:

> To calculate the energy consumption for the median Gemini Apps text prompt on a given day, we first determine the average energy/prompt for each model, and then rank these models by their energy/prompt values. We then construct a cumulative distribution of text prompts along this energy-ranked list to identify the model that serves the 50-th percentile prompt.

They are measuring more than one model. I assume this statement describes how they chose which model to report the LM arena score for, and it's a ridiculous way to do so - the LM arena score calculated this way could change dramatically day-to-day.
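To make the day-to-day concern concrete, here is a rough Python sketch of the selection procedure as I read the quoted passage. The model names, prompt counts and energy figures are invented for illustration, and the tie-breaking rule (first model whose cumulative share reaches 50%) is my assumption, not something the report states.

```python
def median_prompt_model(stats):
    """Pick the model that serves the 50th-percentile prompt.

    stats maps a model name to (prompt_count, avg_energy_per_prompt).
    Models are ranked by average energy/prompt, then prompts are
    accumulated along that ranking until half of them are covered.
    """
    ranked = sorted(stats.items(), key=lambda item: item[1][1])
    total = sum(count for count, _energy in stats.values())
    cumulative = 0
    for name, (count, _energy) in ranked:
        cumulative += count
        # Assumed rule: first model whose cumulative share reaches 50%.
        if cumulative * 2 >= total:
            return name
    raise ValueError("no models given")


# Hypothetical traffic on two consecutive days (made-up numbers).
day1 = {"small": (51, 0.1), "large": (49, 1.0)}
day2 = {"small": (49, 0.1), "large": (51, 1.0)}

print(median_prompt_model(day1))
print(median_prompt_model(day2))
```

With these made-up numbers, a two-point shift in traffic share moves the selected model from "small" to "large", so any per-model score attached to the "median prompt" would jump with it.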