Hacker News
by robecommerce 2171 days ago
Another data point:

"For example, we recently internally benchmarked an Inferentia instance (inf1.2xlarge) against a GPU instance with an almost identical spot price (g4dn.xlarge) and found that, when serving the same ResNet50 model on Cortex, the Inferentia instance offered a more than 4x speedup."

https://towardsdatascience.com/why-every-company-will-have-m...

1 comment

That data point is about inference though, and nobody's disputing that deployment and inference have improved significantly over the past few years.

I'm referring to training and fine-tuning, not inference, which - let's be honest - can be done on a phone these days.