

by joshvm 3298 days ago

Ignoring the price tag, this is about half the performance of the Jetson TX2, which can manage around 1.5 TFLOPS at 7.5 W.
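To put those numbers side by side, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above (1.5 TFLOPS, 7.5 W, "about half"); the constant and function names are made up for illustration, and the stick's throughput is just the commenter's estimate, not a measured value.

```python
# Figures as quoted in the comment; "half" is a rough estimate, not a benchmark.
TX2_TFLOPS = 1.5
TX2_WATTS = 7.5
STICK_FRACTION_OF_TX2 = 0.5  # assumption taken from "about half the performance"


def gflops_per_watt(tflops, watts):
    """Convert a TFLOPS figure and a power draw into GFLOPS per watt."""
    return tflops * 1000.0 / watts


tx2_efficiency = gflops_per_watt(TX2_TFLOPS, TX2_WATTS)
stick_tflops = TX2_TFLOPS * STICK_FRACTION_OF_TX2

print(f"TX2: {tx2_efficiency:.0f} GFLOPS/W")
print(f"Stick (estimated): {stick_tflops:.2f} TFLOPS")
```

The stick's own power draw isn't given here, so no efficiency figure is computed for it.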

Interesting that you could use this to accelerate systems like the Raspberry Pi. The Jetson is a pain in the backside to deploy (at a production level) because you need to make your own breakout board, or buy an overpriced carrier.

EDIT: I use the Pi as an example because it's readily available and cheap. There are lots of other embedded platforms, but the Pi wins on ecosystem.

2 comments

MacsHeadroom 3297 days ago

1.5 TFLOPS would have made the TOP500 supercomputer list 12 years ago. That's amazing.


Dylan16807 3297 days ago

Keep in mind that supercomputers are a lot less specialized than circuits for running neural nets.

12 years ago you could have gotten a stack of 5-8 7800 GTX cards and had 1.5 TFLOPS of single precision. 11 years ago you could have had a stack of 5 cards with unified shaders. It's not fair to compare against the significantly more complicated route of getting 100 CPU cores working together with only 1-4 cores per chip.
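For what that stack works out to per card, a quick sketch of the arithmetic. It only divides the quoted 1.5 TFLOPS by the quoted card counts; these are implied averages, not spec-sheet numbers for any particular GPU, and the helper name is made up.

```python
TARGET_GFLOPS = 1500.0  # the 1.5 TFLOPS single-precision figure from the comment


def per_card_gflops(total_gflops, cards):
    """Average throughput each card must deliver to reach the total."""
    return total_gflops / cards


for cards in (5, 8):
    print(f"{cards} cards -> {per_card_gflops(TARGET_GFLOPS, cards):.1f} GFLOPS each")
```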


amelius 3297 days ago

But can't you configure the device to do e.g. fast matrix-vector multiplications instead of inference? I could be wrong, but I suspect that's mostly what people do on supercomputers anyway.


p1esk 3296 days ago

That 1.5 TFLOPS for the TX2 is FP16, while TOP500 is FP64.
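To give a feel for why the two aren't comparable, here is a small sketch of how much coarser FP16 is than FP64. It only uses Python's standard `struct` module (the `e` format is IEEE 754 half precision, `d` is double); the helper name and test values are picked for illustration and have nothing to do with how either device actually computes.

```python
import struct


def round_trip(value, fmt):
    """Store a Python float in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


HALF = "<e"    # IEEE 754 binary16 (FP16)
DOUBLE = "<d"  # IEEE 754 binary64 (FP64)

# A step of 2**-12 above 1.0 is below FP16's resolution but trivial for FP64.
tiny_step = 1.0 + 2.0 ** -12

half_result = round_trip(tiny_step, HALF)
double_result = round_trip(tiny_step, DOUBLE)

print("FP16 keeps the step:", half_result != 1.0)
print("FP64 keeps the step:", double_result != 1.0)
```

FP16 has about 11 bits of significand against 53 for FP64, so the small increment is rounded away in half precision.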


RBerenguel 3297 days ago

But you can do training on a Jetson, whereas the stick only does inference on pre-trained networks.


nl 3297 days ago

You can't really do any reasonable training on a Jetson.


RBerenguel 3292 days ago

Thanks, worth knowing (was thinking of getting one in a few months)
