Hacker News new | ask | show | jobs
by QuadmasterXLII 636 days ago
Llama 403b takes OOM a kilowatt minute to respond on our local gpu server, or about 10 grams of C02 per email. Last I checked, add another 20 grams of amortized manufacturing emissions. A typical commute is OOM 5-10 kg of CO2.

this article is alarmist bullshit. (for entirely unrelated reasons openai delenda est)

1 comments

So you can double your commute‘s environmental impact by using llama 1000x per day?

That sounds pretty bad still, no?

A thousand times? I’d have a hard time typing out that many queries in 8 hours. Even 100 seems like a stretch for someone who uses it within something like cursor.
More and more environments offer LLM aid without having you explicitly typing in a query. E.g. trigger inference whenever static analysis fails (e.g. on a compile error). Or trigger an LLM aided auto-complete with Ctrl-Space. I don't think it'll be particularly unusual to reach 1000 queries in a working day that way.
Coding models these days use an inference every time you stop typing. Let’s say it’s 0.1 inference oer keystroke. If you keep VSCode open all day, I could believe it’s a significant energy draw.

Google now uses several inferences per Google search.

The average user’s #inferences-per-day is going to skyrocket.

My point is that it’s understandable to consider AI a significant contributor to the average professional’s energy budget. It’s not an insult to point this out.