Hacker News
by barbolo 3132 days ago
Would that be a viable option to deploy TensorFlow models on serverless environments (Lambda, Functions)?
2 comments

You can deploy TensorFlow model binaries as serverless APIs on Google Cloud ML Engine [1]. But I would also be interested in seeing a TensorFlow Lite implementation.

[1] https://cloud.google.com/ml-engine/docs/deploying-models

Disclaimer: I work for Google Cloud.

Thanks, @rasmi. I have some feedback for you guys. The pricing for prediction inference on GCP is not very fair. If I deploy a small model (like SqueezeNet or MobileNet), I pay almost the same price as someone deploying a large model (like ResNet or VGG). That’s why I’m deploying my models on serverless environments and paying about 5 dollars for 1 million inferences.

GCP's pricing is $0.10 per thousand predictions, plus $0.40 per hour. The per-prediction fee alone comes to 100 dollars for 1 million inferences, so the total with the hourly charge is more than 100 dollars.
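For anyone who wants to check the arithmetic, here is a rough sketch of the comparison. The two prices and the 5-dollar serverless figure are the ones quoted in this thread; the function name and the 10 node-hours in the usage lines are made up for illustration, since the actual hourly usage depends on the workload.

```python
# Prices quoted in this thread (USD); not checked against a current price list.
PRICE_PER_THOUSAND_PREDICTIONS = 0.10
PRICE_PER_NODE_HOUR = 0.40


def managed_prediction_cost(predictions, node_hours=0.0):
    """Estimated cost: per-prediction fee plus the hourly charge."""
    per_prediction_fee = predictions / 1000 * PRICE_PER_THOUSAND_PREDICTIONS
    hourly_fee = node_hours * PRICE_PER_NODE_HOUR
    return per_prediction_fee + hourly_fee


# Per-prediction fee only, for 1 million inferences.
fee_only = managed_prediction_cost(1_000_000)

# Same volume with a hypothetical 10 node-hours of compute (assumed value).
with_hours = managed_prediction_cost(1_000_000, node_hours=10)

# Serverless figure reported above: about 5 dollars per million inferences.
serverless = 5.0

print(f"fee only:   ${fee_only:.2f}")
print(f"with hours: ${with_hours:.2f}")
print(f"ratio vs serverless (fee only): {fee_only / serverless:.0f}x")
```

Even before counting any node-hours, the per-prediction fee is about 20 times the serverless figure mentioned above.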

I see what you mean. To some companies, ML Engine's cost as a managed service may be worth it. To others, spinning up a VM with TensorFlow Serving on it is worth the cost savings. If you've taken other approaches to serving TensorFlow models to get around ML Engine's per-prediction cost, I'm curious to hear about them.

The main TensorFlow interpreter provides a lot of functionality for larger machines like servers (e.g. desktop GPU support and distributed support). Of course, TensorFlow Lite does run on standard PCs and servers, so using it on non-mobile/small devices is possible. If you wanted to create a very small microservice, TensorFlow Lite would likely work, and we’d love to hear about your experiences if you try this.

Thanks for the answer. Currently I’m using AWS Lambda to deploy my TensorFlow models, but it’s pretty hard and hacky. I need to remove a considerable portion of the code base that is not needed for inference-only routines. I do that so the code loads faster and fits within the deployment package size limit. If TensorFlow Lite is already a compact code base, then it may be much easier to deploy it to a serverless environment. I’ll be trying it in my next deployments.
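To make the idea concrete, here is a minimal sketch of what a Lambda-style handler around TensorFlow Lite could look like. It is not the setup described above, only an illustration under a few assumptions: the model file name `model.tflite`, the event shape (a JSON body with an `inputs` field), and the extra `loader` parameter are all hypothetical. It also assumes a TensorFlow build that exposes `tf.lite.Interpreter`. The heavy imports are deferred and the model is cached in a module-level variable so that only the first (cold) invocation pays the loading cost.

```python
import json

_MODEL = None  # reused across warm invocations of the same container


def load_tflite_model(model_path="model.tflite"):
    """Return a predict(values) callable backed by a TensorFlow Lite interpreter."""
    # Deferred imports keep the module import itself cheap.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]

    def predict(values):
        # Coerce the request payload to the dtype and shape the model expects.
        batch = np.asarray(values, dtype=input_info["dtype"])
        batch = batch.reshape(input_info["shape"])
        interpreter.set_tensor(input_info["index"], batch)
        interpreter.invoke()
        return interpreter.get_tensor(output_info["index"]).tolist()

    return predict


def handler(event, context, loader=load_tflite_model):
    """Lambda-style entry point; `loader` is injectable so it can be stubbed."""
    global _MODEL
    if _MODEL is None:
        _MODEL = loader()  # runs only on a cold start
    values = json.loads(event["body"])["inputs"]
    outputs = _MODEL(values)
    return {"statusCode": 200, "body": json.dumps({"outputs": outputs})}
```

Passing a stub as `loader` lets the handler be exercised locally without TensorFlow installed, which is handy when trimming the deployment package.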

Sounds really interesting. We're excited to hear how that goes.