

by Nokinside 3036 days ago

Specialization brings speedups.

TPUv2 is specially optimized for deep learning.

Nvidia's Volta microarchitecture is a graphics processor with additional tensor units. It's a general-purpose GPU (GPGPU) chip designed with graphics and other scientific computing tasks in mind. Nvidia has enjoyed monopoly power in the market, and a single microarchitecture has been enough for every high-performance category.

The next logical step for Nvidia is to develop a specialized deep learning TPU to compete with TPUv2 and others.

2 comments

twtw 3036 days ago

> The next logical step for Nvidia is to develop a specialized deep learning TPU to compete with TPUv2 and others.

I don't know, this benchmark seems to show the V100 doing pretty well against a specialized ASIC. It may well be that all NVIDIA has to do is cut costs on the V100 to make two V100s about as expensive as the cloud TPUv2. With increased batch size, it looks like two V100s would have performance comparable to TPUv2.


deepnotderp 3036 days ago

Volta V100 already has "tensor cores" which are basically little matrix multiplication ASICs.
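To make the "little matrix multiplication ASIC" idea concrete, here is a rough sketch of the operation a Volta tensor core is documented to perform: a fused multiply-accumulate D = A x B + C on 4x4 tiles, with FP16 inputs and FP32 accumulation. The function names (`to_fp16`, `to_fp32`, `tensor_core_mma`) are made up for illustration, and the per-step rounding order is an assumption; real hardware may accumulate differently.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def tensor_core_mma(a, b, c):
    """Illustrative D = A x B + C on 4x4 tiles (hypothetical helper).

    Inputs A and B are rounded to FP16, the accumulator is kept in FP32.
    Rounding after every add is an assumption, not a hardware guarantee.
    """
    n = 4
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = to_fp32(c[i][j])  # start from the C tile
            for k in range(n):
                prod = to_fp16(a[i][k]) * to_fp16(b[k][j])
                acc = to_fp32(acc + prod)
            d[i][j] = acc
    return d

# Usage: multiplying by the identity tile returns B plus the C tile.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
b_tile = [[float(4 * i + j) for j in range(4)] for i in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
result = tensor_core_mma(identity, b_tile, zeros)
```

The point of the sketch is only that the unit does one fixed thing (a small matrix FMA) very fast, while the rest of the GPU stays general purpose.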


Nokinside 3036 days ago

That's what I said.

The microarchitecture has many unnecessary things, and it's not optimized as a whole for deep learning.


deepnotderp 3036 days ago

I believe it was either the last MICRO* or the one before that when Dally addressed this point. The specialized hardware for graphics ends up comprising such a small portion of the overall chip that it wasn't worth it to remove it. The "GPUs were made for graphics thus aren't good for DL" argument really doesn't hold a lot of water IMO.

* It might've been a different conference now that I think of it.
