
by AftHurrahWinch 296 days ago

> "We aim to inform debate with clear, sourced numbers while avoiding sensationalism."

This is a great aspiration, but it seems to be contradicted by the rest of the page, which provides unclear numbers from unsourced categories.

> CO₂: $AI_{CO_2e} \approx (AI_{electricity} \times grid_{emission\_factor})$

How are you accounting for Power Purchase Agreements (PPAs) and Renewable Energy Credits (RECs)?

> Water: $AI_{water} \approx (DC_{water\_per\_kWh} \times AI_{electricity}) + (PowerGen_{water\_intensity} \times AI_{electricity})$

Where do the values for $DC_{water\_per\_kWh}$ (the Water Usage Effectiveness, or WUE) and $PowerGen_{water\_intensity}$ come from? These vary wildly by cooling system (evaporative vs. closed-loop) and energy source (hydro vs. nuclear vs. gas).

> Electricity: $AI_{electricity} \approx (IT_{load} \times utilization \times hours) \times PUE$

How do you estimate $IT_{load}$? Is this based on TDP of GPUs? A specific list of GPUs? Market share estimates?

What is the assumed $utilization$ for inference vs. training?

Which $PUE$ is used? A global average? A regional one? A company-specific one?
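For anyone following along, the three quoted formulas chain together roughly as below. This is only a sketch: the function names and all input values are hypothetical placeholders chosen for illustration, not figures from the site.

```python
def ai_electricity_kwh(it_load_kw, utilization, hours, pue):
    """AI_electricity ~= (IT_load * utilization * hours) * PUE."""
    return it_load_kw * utilization * hours * pue


def ai_co2e_kg(electricity_kwh, grid_emission_factor):
    """AI_CO2e ~= AI_electricity * grid_emission_factor (kg CO2e per kWh)."""
    return electricity_kwh * grid_emission_factor


def ai_water_l(electricity_kwh, dc_water_per_kwh, powergen_water_intensity):
    """AI_water ~= on-site cooling water + water consumed generating the power."""
    return (dc_water_per_kwh * electricity_kwh
            + powergen_water_intensity * electricity_kwh)


# Hypothetical inputs: 1 MW IT load, 50% utilization, one day, PUE 1.2.
electricity = ai_electricity_kwh(it_load_kw=1000, utilization=0.5, hours=24, pue=1.2)
co2e = ai_co2e_kg(electricity, grid_emission_factor=0.4)        # placeholder factor
water = ai_water_l(electricity, dc_water_per_kwh=1.8,            # placeholder WUE
                   powergen_water_intensity=2.0)                 # placeholder intensity
print(electricity, co2e, water)
```

Every output scales linearly with $AI_{electricity}$, which is why the questions about $IT_{load}$, $utilization$, and $PUE$ matter most.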


rboug 295 days ago

> On “sources”

Each of the above pulls from operator sustainability reports, industry surveys/benchmarks, grid datasets (national/regional emission factors), and academic studies for water/energy intensities and inference energy per token. Where multiple ranges exist, we pick a conservative central value and call out the range.

I’ll add a compact table of constants + ranges + citations in the Methodology page so it’s easy to audit and nitpick. If you have a favorite dataset for WUE by cooling type or per-region grid water intensity, I’d love pointers—this is exactly the kind of feedback that improves the baseline.

rboug 295 days ago

Thank you for your msg. Short answers below; happy to go deeper.

> CO2 (PPAs/RECs)

- We currently use location-based grid factors (national/regional) and do not net out PPAs/RECs. That is the conservative choice for a public baseline.

- If a workload is known to be contract-matched (hourly/locational), we can apply a market-based view; I plan to expose a toggle (location- vs market-based) so both views are visible.
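A location- vs market-based toggle could look something like the sketch below. The function name, parameters, and numbers are hypothetical. It also makes one simplifying assumption: the unmatched share of electricity falls back to the location-based grid factor rather than a separate residual-mix factor.

```python
def co2e_kg(electricity_kwh, location_factor, market_based=False,
            matched_fraction=0.0, contracted_factor=0.0):
    """Return kg CO2e under a location-based or market-based view.

    location_factor:   grid emission factor (kg CO2e/kWh) for the region.
    matched_fraction:  share of electricity covered by PPAs/RECs (0..1).
    contracted_factor: emission factor of the contracted supply.
    """
    if not 0.0 <= matched_fraction <= 1.0:
        raise ValueError("matched_fraction must be between 0 and 1")
    if not market_based:
        # Location-based: ignore contracts entirely.
        return electricity_kwh * location_factor
    matched_kwh = electricity_kwh * matched_fraction
    residual_kwh = electricity_kwh - matched_kwh
    # Assumption: the unmatched remainder uses the location-based factor.
    return matched_kwh * contracted_factor + residual_kwh * location_factor


# Hypothetical workload: 1000 kWh on a 0.4 kg/kWh grid, 75% contract-matched.
print(co2e_kg(1000, 0.4))
print(co2e_kg(1000, 0.4, market_based=True, matched_fraction=0.75))
```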

> Water (WUE & power-generation water)

- DC_water_per_kWh (WUE): when operators publish site/region values we use them. Otherwise we assign a cooling class (evaporative / closed-loop / seawater / air-only) and take a central value from published ranges. That gives order-of-magnitude accuracy without claiming site precision.

- PowerGen_water_intensity: technology-specific consumption factors (not withdrawals) by fuel/tech (gas, coal, nuclear, hydro, etc.), weighted by the grid mix of the region when it’s known; otherwise a conservative aggregate. Hydropower is treated as low consumption, high withdrawal.
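The grid-mix weighting for PowerGen_water_intensity can be sketched as a share-weighted average. The function name and the per-technology factors below are hypothetical placeholders, not cited values.

```python
def grid_water_intensity(grid_mix, consumption_l_per_kwh):
    """Share-weighted water consumption intensity (L/kWh) for a grid.

    grid_mix:              {technology: share of generation}
    consumption_l_per_kwh: {technology: consumption factor, not withdrawal}
    """
    total_share = sum(grid_mix.values())
    if total_share <= 0:
        raise ValueError("grid mix shares must sum to a positive number")
    weighted = sum(share * consumption_l_per_kwh[tech]
                   for tech, share in grid_mix.items())
    # Normalize so that shares given as percentages or fractions both work.
    return weighted / total_share


# Placeholder factors for illustration only.
factors = {"gas": 1.0, "nuclear": 3.0}
print(grid_water_intensity({"gas": 0.5, "nuclear": 0.5}, factors))
```

A technology missing from the factor table raises a `KeyError` here, which is one way to avoid silently falling back to an aggregate.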

> Electricity (IT load, utilization, PUE)

- IT_load / Training: bottom-up from reported compute for frontier runs + known fleet sizes; extrapolated to mid-scale using public training reports.

- IT_load / Inference: top-down from usage volumes (requests/tokens/images) × energy per unit by model class, calibrated from published perf/W measurements and vendor/benchmark data. We don’t simply sum GPU TDP; we use perf/W + utilization.

- utilization: ranges by workload class; we take a conservative central value (higher for sustained training, lower/peaky for inference). These are sensitivity levers and shown in the methodology.

- PUE: operator/region-specific when disclosed; otherwise we apply a conservative default for hyperscale vs. generic DCs (kept distinct). PUE is another sensitivity knob we surface.
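Treating utilization and PUE as sensitivity levers could be sketched as a sweep over (low, central, high) values. The function name and all ranges below are hypothetical; they only show how the spread propagates into the electricity estimate.

```python
from itertools import product


def electricity_sensitivity(it_load_kw, hours, utilization, pue):
    """Return (low, central, high) kWh from (low, central, high) inputs.

    utilization and pue are 3-tuples of (low, central, high) values.
    """
    # Evaluate every combination of the two levers, then take the extremes.
    combos = [it_load_kw * u * hours * p
              for u, p in product(utilization, pue)]
    central = it_load_kw * utilization[1] * hours * pue[1]
    return min(combos), central, max(combos)


# Hypothetical ranges: 100 kW IT load over 10 hours.
low, central, high = electricity_sensitivity(
    it_load_kw=100, hours=10,
    utilization=(0.4, 0.5, 0.6),
    pue=(1.1, 1.2, 1.5),
)
print(low, central, high)
```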

AftHurrahWinch 295 days ago

> That gives order-of-magnitude accuracy without claiming site precision.

> I’ll add a compact table of constants + ranges + citations in the Methodology page

This is a worthwhile project. HN and 'the discourse' need a reliable, citable source for these metrics. Adding a table of citations is a crucial step towards that.

Your confidence intervals are probably more precise than an order of magnitude (LeBron James is the same size as a six-story building, within one order of magnitude), and I'm excited to see the ranges as your site evolves.

rboug 295 days ago

Thank you! Really appreciate this. I’ll prioritize the citations/ranges table in /methodology to make this properly citable and easier to audit.
