	by marginalia_nu 634 days ago
	If I run some simple inference locally on a 4090 (a 450 W TDP card), it takes on the order of seconds with that sucker going full blast, so you're looking at on the order of 1 kJ, which is significantly higher than what is quoted in the article. The article's numbers line up better with CPU inference for ~1 s.

Panzer04 634 days ago

1 kJ is nothing. That's about 0.3 Wh, or 0.0003 kWh.
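
For anyone who wants to check the conversion, here is a minimal sketch. The helper names are made up for illustration; the only fact used is that 1 Wh is 3600 J.

```python
# 1 watt sustained for one hour (3600 s) is 3600 joules.
JOULES_PER_WATT_HOUR = 3600.0


def joules_to_wh(joules: float) -> float:
    """Convert joules to watt-hours."""
    return joules / JOULES_PER_WATT_HOUR


def joules_to_kwh(joules: float) -> float:
    """Convert joules to kilowatt-hours."""
    return joules_to_wh(joules) / 1000.0


if __name__ == "__main__":
    energy_j = 1_000.0  # the ~1 kJ estimate from the parent comment
    print(f"{joules_to_wh(energy_j):.3f} Wh")    # 0.278 Wh
    print(f"{joules_to_kwh(energy_j):.6f} kWh")  # 0.000278 kWh
```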

marginalia_nu 634 days ago

That's for a single inference though. You can do about 3600 of them in an hour.
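
A rough sketch of what that adds up to, assuming every inference really costs about 1 kJ and you really fit 3600 of them into an hour (both are ballpark figures from this thread, not measurements):

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def hourly_energy_kwh(energy_per_inference_j: float, inferences_per_hour: int) -> float:
    """Total energy for one hour of back-to-back inferences, in kWh."""
    total_j = energy_per_inference_j * inferences_per_hour
    return total_j / JOULES_PER_KWH


if __name__ == "__main__":
    # Assumed inputs: ~1 kJ per inference, ~3600 inferences per hour.
    print(f"{hourly_energy_kwh(1_000.0, 3600):.2f} kWh per hour")  # 1.00 kWh per hour
```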

aubanel 633 days ago

Yes, but the article's setting is precisely about 1 email, so 1 inference, and their number is 0.14 kWh, which is way off.
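
To put a number on "way off", here is a quick sketch comparing the 0.14 kWh figure quoted above with the ~1 kJ local estimate from upthread. Both inputs are taken from the comments as-is and have not been checked against the article.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def kwh_to_joules(kwh: float) -> float:
    """Convert kilowatt-hours to joules."""
    return kwh * JOULES_PER_KWH


quoted_kwh = 0.14   # per-email figure attributed to the article in this thread
local_j = 1_000.0   # rough single-inference estimate on a 4090 from upthread

ratio = kwh_to_joules(quoted_kwh) / local_j

if __name__ == "__main__":
    print(f"{kwh_to_joules(quoted_kwh) / 1000:.0f} kJ vs {local_j / 1000:.0f} kJ")  # 504 kJ vs 1 kJ
    print(f"about {ratio:.0f}x apart")  # about 504x apart
```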

gcr 634 days ago

I’m still kind of skeptical. M-series Apple hardware doesn’t even get warm during inference with some local models.

Edit: Nah I’m convinced, look at table 1. Inference costs are around 20mL in a datacenter environment.

marginalia_nu 634 days ago

For reference, 1 kJ is enough to heat 1 L (about 34 oz) of water by ~0.24 °C (~0.4 °F). The machine will probably heat up a few degrees if you run inference once, but since it's essentially one big heatsink, the heat will dissipate throughout the body and into the air. The problem begins when you run it continuously, as you would in a datacenter.
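
The water figure follows from ΔT = E / (m · c). A minimal sketch, assuming 1 L of water weighs about 1 kg and using the textbook specific heat of roughly 4186 J/(kg·K):

```python
SPECIFIC_HEAT_WATER = 4186.0  # J/(kg*K), approximate value near room temperature


def temperature_rise_c(energy_j: float, mass_kg: float) -> float:
    """Temperature rise in degrees Celsius for a given energy and mass of water."""
    return energy_j / (mass_kg * SPECIFIC_HEAT_WATER)


delta_c = temperature_rise_c(1_000.0, 1.0)  # 1 kJ into ~1 kg (1 L) of water
delta_f = delta_c * 9.0 / 5.0               # a temperature difference, so no +32 offset

if __name__ == "__main__":
    print(f"{delta_c:.2f} C, {delta_f:.2f} F")  # 0.24 C, 0.43 F
```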

MSFT_Edging 634 days ago

Datacenters aren't running M-series chips.

guitarlimeo 634 days ago

Well, not M-series chips specifically, but chips optimized for this kind of workload (like the Neural Engine in M-series chips is).

dartos 634 days ago

IIRC the M-series chip isn’t specifically optimized for ML workloads; the biggest gain it has is unified video and CPU memory, as transferring layers between the two is a big bottleneck on non-Apple systems.

Real ML hardware (like the Nvidia H100s) that can handle the kind of inference traffic you see in production gets hot and uses quite a bit of energy, especially when it runs at full blast 24/7.

gcr 634 days ago

Google’s TPU energy usage is a well-kept secret / competitive advantage. If energy efficiency isn’t a major concern for them, I bet it will be in a couple years.