by nickfromseattle
295 days ago
What are the variables that favor local GPUs over cloud inference? Is connectivity the dividing line, or are there other variables that influence the choice? Anduril submersibles probably need local processing, but does my laundry/dishes robot need local processing? Or machines in factories? Or delivery drones?
Imagine you were tracking items on video at a self-service checkout. Sure, you could compress the video down to 15 Mbps or so and send it to the cloud. But now a store with 20 self-checkouts needs 300 Mbps of upload bandwidth. That's one more problem making it harder for Walmart to buy and roll out your product.
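To make the bandwidth arithmetic concrete, here is a rough sketch. The function name is made up for illustration, and it assumes every checkout streams continuously at the same bitrate with no protocol overhead, which is a simplification.

```python
def required_upload_mbps(streams: int, mbps_per_stream: float) -> float:
    """Aggregate upload bandwidth if every stream runs at once.

    Assumption: constant bitrate per stream, no protocol overhead.
    """
    return streams * mbps_per_stream


# Figures from the comment: 20 self-checkouts at roughly 15 Mbps each.
store_upload_mbps = required_upload_mbps(streams=20, mbps_per_stream=15)
print(f"{store_upload_mbps:.0f} Mbps of sustained upload")
```

The number scales linearly with the lane count, so a bigger store or a higher-quality stream pushes the requirement up in direct proportion.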
Also, if you know you need an NVIDIA L4 dedicated to you 24/7 for a year, a g6.xlarge will cost about $7,000/year on-demand or $4,300/year reserved [1], while you can buy the card for $2,500.
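A quick break-even sketch using only the prices quoted above. It is deliberately crude: it assumes the card is the only cost on the local side (no host machine, power, or maintenance), whereas the cloud price includes the rest of the instance, so the real break-even point would come later than this.

```python
CARD_PRICE_USD = 2_500            # NVIDIA L4 purchase price quoted above
ON_DEMAND_USD_PER_YEAR = 7_000    # g6.xlarge on-demand, as quoted above
RESERVED_USD_PER_YEAR = 4_300     # g6.xlarge reserved, as quoted above


def breakeven_months(card_price: float, cloud_usd_per_year: float) -> float:
    """Months of 24/7 cloud rental that cost as much as buying the card.

    Assumption: ignores host hardware, power, and ops on the local side.
    """
    return card_price / (cloud_usd_per_year / 12)


print(f"vs on-demand: {breakeven_months(CARD_PRICE_USD, ON_DEMAND_USD_PER_YEAR):.1f} months")
print(f"vs reserved:  {breakeven_months(CARD_PRICE_USD, RESERVED_USD_PER_YEAR):.1f} months")
```

Under those assumptions the card pays for itself in roughly four months against on-demand pricing and roughly seven against reserved.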
Of course, for many other use cases the cloud is a fine choice: if you only need a fraction of a GPU, or you only need a monster GPU a tiny fraction of the time, or you need an enormous LLM that demands water cooling and easily tolerates latency.
[1] https://instances.vantage.sh/aws/ec2/g6.xlarge?currency=USD&...