| HN Mirror



	by nomel 23 days ago
	You didn't touch on the most important aspect for cost: die area! How much die space ($) will that circuitry add? It's probably a statistically near zero chance for your main customers' workload (who has model weights of 0 or 1!?). And, if you can stomach the cost, what else could you put there instead?

3 comments

Thorondor 23 days ago

Weights should not be 0 (at least not frequently) but in a ReLU-based neural network, activations are 0 pretty often. You're absolutely right about die area though.
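To make the point about ReLU activations concrete, here is a minimal sketch. It assumes pre-activations drawn from a zero-mean Gaussian, which is an illustrative assumption and not a measurement from any real model; the names `relu` and `zero_fraction` are made up for this example.

```python
import random


def relu(x):
    """ReLU: negative inputs become exactly 0.0."""
    return x if x > 0.0 else 0.0


def zero_fraction(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


# Assumption: pre-activations are roughly symmetric around zero.
rng = random.Random(0)
pre_activations = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
activations = [relu(x) for x in pre_activations]

relu_zero_fraction = zero_fraction(activations)
print(f"zeros before ReLU: {zero_fraction(pre_activations):.3f}")
print(f"zeros after ReLU:  {relu_zero_fraction:.3f}")
```

Under that assumption about half of the activations come out as exact zeros, while the inputs themselves are almost never exactly zero. Real networks can be more or less sparse than this.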


nomel 23 days ago

> near zero chance for your main customers' workload

What percent of this hardware is running inference for ReLU models? ;)


imtringued 22 days ago

Nvidia has added structured sparsity to their GPUs, and every time they pull out a FLOPS or TOPS number, they assume you will use structured sparsity.

The die area argument here makes no sense. Supporting structured sparsity can be done either by duplicating the multipliers with and without the support or you have a single general purpose multiplier that does both, in which case you can have twice as many of them.

Also, in ReLU^2 networks, 90%+ of the parameters are zero.
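For anyone unfamiliar with the sparsity scheme mentioned above, here is a rough software sketch of the 2:4 pattern (keep 2 of every 4 weights, plus a small index per kept weight). It only illustrates the data layout and the halved multiply count; it is not how the hardware is implemented, and `prune_2_of_4` and `sparse_dot` are hypothetical names for this example.

```python
def prune_2_of_4(weights):
    """Keep the 2 largest-magnitude weights in every group of 4.

    Returns the kept values and, for each one, its position (0-3)
    inside its group. A position fits in 2 bits.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    values, indices = [], []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        by_magnitude = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)
        for i in sorted(by_magnitude[:2]):  # keep original order within the group
            values.append(group[i])
            indices.append(i)
    return values, indices


def sparse_dot(values, indices, activations):
    """Dot product that multiplies only the kept weights."""
    total = 0.0
    for k, (value, index) in enumerate(zip(values, indices)):
        group_start = (k // 2) * 4  # two kept weights per group of four
        total += value * activations[group_start + index]
    return total


weights = [0.1, -2.0, 0.0, 3.0, 0.5, 0.4, -0.3, 0.2]
activations = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
values, indices = prune_2_of_4(weights)
result = sparse_dot(values, indices, activations)
print(values, indices, result)
```

The sparse path does 4 multiplies here instead of 8. The index lookup that picks the right activation is the extra selection logic the sparse path needs on top of the multipliers.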


nomel 21 days ago

> The die area argument here makes no sense.

Any logic you add to the GPU is physical silicon and metal that take up physical space.

> duplicating the multipliers with and without the support or you have a single general purpose multiplier that does both

That would be extra physical logic, which would be extra physical space on the die. "Can be done" isn't my point; my point is that doing it requires surface area.


rowanG077 22 days ago

I expect the degraded critical path will most likely be worse than a bit of die area. On modern processes you have A LOT of transistors to play with.
