by howlgarnish 2069 days ago

Coral is powered by an Edge TPU (Tensor Processing Unit), which wipes the floor with GPU boards like the Jetson Nano when it comes to running TensorFlow:

https://blog.usejournal.com/google-coral-edge-tpu-vs-nvidia-...

...and Google is pretty invested in TPUs, since it uses lots of them in house.

https://en.wikipedia.org/wiki/Tensor_Processing_Unit


michaelt 2069 days ago

They might be great for inference with TensorFlow, but from what I can tell from Google's documentation, Coral doesn't support training at all.

I'm sure an ML accelerator that doesn't support training will be great for applications like mass-produced self-driving cars. But for hobbyists - the kind of people who care about the difference between a $170 dev board and a $100 dev board - being unable to train is a pretty glaring omission.

MichaelBurge 2069 days ago

You wouldn't want to use it for training: this chip can do 4 INT8 TOPS with 2 watts. A Tesla T4 can do 130 INT8 TOPS with 70 watts, and 8.1 FP32 TFLOPS.

Assuming the T4's FP32 throughput per watt holds (8.1 TFLOPS at 70 W, scaled down to 2 W), you'd maybe get 231 GFLOPS for training. The Nvidia 9800 GTX that I bought in 2008 gets 432 GFLOPS according to a quick Google search.

Hobbyists don't care about power efficiency for training, so buy any GPU made in the last 12 years instead, train on your desktop, and transfer the trained model to the board.
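The back-of-the-envelope estimate above can be reproduced in a few lines. This is only a sketch that uses the throughput and power figures quoted in the comment (they are not verified here); the function names are made up for illustration. The 231 GFLOPS figure matches scaling the T4's FP32 throughput by power draw, while scaling by INT8 throughput instead gives a slightly higher number.

```python
# Figures as quoted in the comment above; not independently verified.
EDGE_TPU_INT8_TOPS = 4.0
EDGE_TPU_WATTS = 2.0
T4_INT8_TOPS = 130.0
T4_FP32_TFLOPS = 8.1
T4_WATTS = 70.0


def scale_by_power():
    """FP32 GFLOPS if the Edge TPU matched the T4's FP32 throughput per watt."""
    return T4_FP32_TFLOPS / T4_WATTS * EDGE_TPU_WATTS * 1000.0


def scale_by_int8():
    """FP32 GFLOPS if the Edge TPU kept the T4's FP32-to-INT8 throughput ratio."""
    return T4_FP32_TFLOPS / T4_INT8_TOPS * EDGE_TPU_INT8_TOPS * 1000.0


if __name__ == "__main__":
    print(f"scaled by power draw:      {scale_by_power():.0f} GFLOPS")
    print(f"scaled by INT8 throughput: {scale_by_int8():.0f} GFLOPS")
```

Either way the estimate lands well below the 432 GFLOPS quoted for the 2008-era card, which is the point the comment is making.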

rewq4321 2069 days ago

On the other hand, it would be useful for people experimenting with low-compute online learning. Also, those types of projects tend to have novel architectures that benefit from the generality of a GPU.

ianai 2069 days ago

Last I heard, COVID was making GPUs about as hard to find as the other things it's jacked up the prices on.

gridlockd 2069 days ago

You can get pretty much any GPU at pre-COVID prices right now, except for the newest generation NVIDIA GPUs that just came out to higher-than-expected demand.

omgwtfbyobbq 2069 days ago

As a hobbyist in a state with relatively high electricity prices, I do care about the power efficiency of training.

jnwatson 2069 days ago

Training is what the cloud is for.

wongarsu 2069 days ago

That makes a $170 board that can also do training look dirt cheap in comparison.

lawrenceyan 2069 days ago

Good luck training anything in any reasonable time on it.

R0b0t1 2068 days ago

Useful for adapting existing models. Not everything needs millions of hours of input.

tachyonbeam 2069 days ago

If you want to train yet-another-convnet sure, but there could be applications where you want to train directly on a robot with live data, as in interactive learning.

See this paper for an example of interactive RL: https://arxiv.org/abs/1807.00412

suyash 2069 days ago

Or a highly rigged machine. This looks more geared toward fast, real-time ML inference on the edge.

debbiedowner 2068 days ago

You can adapt the final layer of weights on the Edge TPU.

Training on a dev board should be a last resort.

Even hobbyists can afford to rent GPUs for training on vast.ai or emrys.
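To illustrate what adapting only the final layer means, here is a minimal pure-Python sketch. It is not the Coral API: it assumes a frozen backbone has already turned each input into an embedding vector, and it fits just a softmax layer on top with plain SGD. The toy embeddings, labels, and function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, bias, x):
    """Dense layer: one dot product plus bias per class."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, bias)]


def train_last_layer(embeddings, labels, num_classes, lr=0.5, epochs=200):
    """Fit only the final dense layer on fixed embeddings (cross-entropy, SGD)."""
    dim = len(embeddings[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    bias = [0.0] * num_classes
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            probs = softmax(logits_for(weights, bias, x))
            for c in range(num_classes):
                # Gradient of cross-entropy w.r.t. logit c is (p_c - 1[c == y]).
                grad = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    weights[c][j] -= lr * grad * x[j]
                bias[c] -= lr * grad
    return weights, bias


def predict(weights, bias, x):
    """Index of the class with the largest logit."""
    logits = logits_for(weights, bias, x)
    return max(range(len(logits)), key=logits.__getitem__)


if __name__ == "__main__":
    # Hypothetical embeddings standing in for a frozen backbone's output.
    embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels = [0, 0, 1, 1]
    w, b = train_last_layer(embeddings, labels, num_classes=2)
    print([predict(w, b, x) for x in embeddings])
```

Because only one small layer is updated, this kind of adaptation needs far less compute than training the whole network, which is why it is plausible on a low-power board.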

pinewurst 2069 days ago

Google is pretty invested in TPUs for their own workloads but I fail to see any durable encouragement of them as an external product. At best they're there to encourage standalone development of applications/frameworks to be deployed on Google Cloud (IMHO of course).

tachyonbeam 2069 days ago

AFAIK, apart from toy dev boards like this, you can't buy a TPU, you can only rent access to them in the cloud. I wouldn't want my company to rely on that. What if Google decides to lock you out? If you've adapted your workload to rely on TPUs, you'd be fucked.

akiselev 2068 days ago

What's the difference between Coral's production line of Edge TPU modules and chips [1] and Google's cloud TPU offering?

Note: I haven't tried sourcing these in production (100k+) quantities so I have no idea what guarantees that product line gives customers.

[1] https://coral.ai/products/#production-products

usmannk 2068 days ago

They're nothing alike at all. It's similar to how a low-end laptop GPU differs from a top-of-the-line NVIDIA datacenter offering. Google's Cloud TPU offering is the strongest ML training hardware that exists; the edge devices simply support the same API.

debbiedowner 2068 days ago

The Edge TPU is 2 TFLOPS at half precision; the Cloud TPU starts at 140 TFLOPS single precision and scales further.

Also, the Edge TPU draws 2-5 W. Supposedly Cloud TPUs are more power efficient than GPUs; for example, the 14 TFLOPS 2080 regularly ran at 300 W.
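The throughput-per-watt comparison implied above works out as follows. This is a rough sketch using only the numbers quoted in the comment, which are at different precisions, so it is not a like-for-like benchmark; the helper name is made up for illustration.

```python
def tflops_per_watt(tflops, watts):
    """Throughput per watt for one device."""
    return tflops / watts


# Figures as quoted in the comment above; not independently verified.
edge_tpu_worst = tflops_per_watt(2.0, 5.0)   # Edge TPU at the 5 W end
edge_tpu_best = tflops_per_watt(2.0, 2.0)    # Edge TPU at the 2 W end
rtx_2080 = tflops_per_watt(14.0, 300.0)      # 2080 at 300 W

if __name__ == "__main__":
    print(f"Edge TPU: {edge_tpu_worst:.2f} to {edge_tpu_best:.2f} TFLOPS/W")
    print(f"2080:     {rtx_2080:.3f} TFLOPS/W")
    print(f"ratio:    {edge_tpu_worst / rtx_2080:.1f}x to "
          f"{edge_tpu_best / rtx_2080:.1f}x")
```

On those quoted figures the edge part comes out roughly 9x to 21x ahead per watt, even though its absolute throughput is a small fraction of the GPU's.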

popinman322 2068 days ago

Coral can only run inference, and is optimized for models using 8-bit integers (via quantization).

A full TPU v2/v3 can train models and use 16/32-bit floats. They also have a Google-specific (?) 16-bit floating-point type with reduced precision.
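The reduced-precision 16-bit type mentioned above is presumably bfloat16, which keeps float32's sign bit and 8 exponent bits but only 7 mantissa bits. The sketch below emulates it by keeping the top 16 bits of a float32. Truncation is an assumption made for simplicity (real hardware typically rounds), and the function names are hypothetical.

```python
import struct


def float32_bits(x):
    """Raw 32-bit pattern of x stored as an IEEE 754 float32."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def to_bfloat16_bits(x):
    """Keep the top 16 bits of the float32 pattern (simple truncation)."""
    return float32_bits(x) >> 16


def from_bfloat16_bits(bits):
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    return struct.unpack(">f", struct.pack(">I", (bits & 0xFFFF) << 16))[0]


def bfloat16_round_trip(x):
    """Value of x after passing through the truncated 16-bit format."""
    return from_bfloat16_bits(to_bfloat16_bits(x))


if __name__ == "__main__":
    for value in (1.0, 3.14159, 1e38):
        print(value, "->", bfloat16_round_trip(value))
```

The format keeps float32's full exponent range, so very large or very small values survive, but only two or three significant decimal digits are preserved.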

kordlessagain 2068 days ago

And don't forget, TPUs are horrible at floating point math! The errors!

debbiedowner 2068 days ago

Yeah, I've been wondering whether the charts I've seen comparing TPU-trained and GPU-trained model quality, like the one here [1], could be explained by error correction. At the same time, training on gaming GPUs like the 1080 Ti or 2080 Ti is widely popular, though they lack the ECC memory of the "professional" Quadro cards or the V100. I did think conventional DL wisdom said "precision doesn't matter" and "small errors don't matter", though.

I've noticed this difference in model quality in my own experiments, TPU vs. gaming GPU, but don't know for sure what the cause is. I never did notice a difference between gaming-GPU-trained models and Quadro-trained models. Have more info/links?

1: https://github.com/tensorflow/gan/tree/master/tensorflow_gan...

kanwisher 2069 days ago

Once you want to use PyTorch or another non-TensorFlow framework, the support goes down dramatically. The Jetson Nano supports more frameworks out of the box quite well, and it ends up being the same CUDA code you run on your big Nvidia cloud servers.

panpanna 2069 days ago

Not only that, Nvidia cares deeply about PyTorch. Visit the PyTorch forums and look at the most upvoted answers: all by Nvidia field engineers.

sorenbouma 2068 days ago

That benchmark appears to compare full-precision FP32 inference on the Nano with uint8 inference on the Coral, so that floor wiping comes with a lot of caveats.
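To make the FP32 vs. uint8 caveat concrete, here is a minimal sketch of affine 8-bit quantization, the general idea behind running a model with 8-bit integers. It is a simplified illustration under assumed conventions, not the exact scheme used by any particular toolchain, and the function names are hypothetical.

```python
def quantization_params(min_val, max_val, num_levels=256):
    """Scale and zero point mapping [min_val, max_val] onto 0..num_levels-1."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    scale = (max_val - min_val) / (num_levels - 1)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range
    zero_point = round(-min_val / scale)
    return scale, zero_point


def quantize(x, scale, zero_point):
    """Map a float to an integer in 0..255, clamping out-of-range values."""
    q = round(x / scale) + zero_point
    return max(0, min(255, q))


def dequantize(q, scale, zero_point):
    """Map an 8-bit integer back to an approximate float."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    scale, zero_point = quantization_params(-2.0, 6.0)
    for value in (-2.0, -0.5, 0.0, 1.234, 6.0):
        q = quantize(value, scale, zero_point)
        print(value, "->", q, "->", dequantize(q, scale, zero_point))
```

Each value is stored in one byte instead of four, at the cost of a round-trip error on the order of one quantization step, so a uint8 benchmark is measuring a cheaper and slightly less precise computation than an FP32 one.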

m463 2069 days ago

There seems to be more than one Jetson board.