by dodata 2257 days ago
One of the things that has deterred me from SageMaker is how expensive it can be for a side project. Real-time endpoints start at $40-$50 per month, which is a bit too much for a low-budget side project. I love the idea of using an open-source alternative, but I noticed that all of the systems combined for Cortex would be a bit more expensive. Do you have any tips on how to keep a model deployed cheaply for a side project using Cortex? I'd be fine with a little bit of latency on the first request, similar to how Heroku's free dynos work.
1 comment

In general, Cortex will be significantly cheaper because you're only paying AWS for EC2 (the bulk of the bill) and the other AWS services used (a much smaller portion of the bill). With SageMaker, you're paying the EC2 bill plus a ~40% premium.

To keep the AWS bill as low as possible, Cortex supports inference on spot instances, which are unused instances that AWS sells at a steep discount (up to 90%). The drawback is that AWS can reclaim the instance when it needs the capacity, but with ML inference failover isn't as big of a deal, since you typically don't need to preserve state.

If you use spot instances, choose the cheapest instance type possible, and set your autoscaler's minimum replicas to 1 (so it doesn't keep extra replicas idling), you should be able to deploy the model pretty cheaply. Significantly cheaper than with SageMaker, at the very least.
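
For a rough sense of the arithmetic, here's a back-of-the-envelope sketch in Python. The hourly rate and the 70% spot discount are made-up placeholders, not real AWS prices (actual spot discounts vary by instance type and region), and the function name is just for illustration. It also only covers the EC2 portion of the bill, so the other AWS services the cluster uses would come on top of these numbers.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_cost(hourly_rate, replicas=1, premium=0.0, discount=0.0):
    """Estimate the monthly instance cost in dollars for always-on replicas.

    premium:  fractional markup on top of the base rate (0.40 means +40%)
    discount: fractional discount off the base rate (0.70 means 70% off)
    """
    effective_rate = hourly_rate * (1 + premium) * (1 - discount)
    return round(effective_rate * replicas * HOURS_PER_MONTH, 2)


# Hypothetical on-demand price for a small instance, in dollars per hour.
# Look up the real price for your instance type and region.
ON_DEMAND_RATE = 0.05

ec2_on_demand = monthly_cost(ON_DEMAND_RATE)                 # plain EC2, 1 replica
with_premium = monthly_cost(ON_DEMAND_RATE, premium=0.40)    # EC2 plus a ~40% premium
ec2_spot = monthly_cost(ON_DEMAND_RATE, discount=0.70)       # assumed 70% spot discount

print(f"on-demand:    ${ec2_on_demand:.2f}/month")
print(f"with premium: ${with_premium:.2f}/month")
print(f"spot:         ${ec2_spot:.2f}/month")
```

Swapping in the real hourly price for whatever instance type you pick should give you a ballpark for a single always-on replica.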

There's some more info here: https://www.cortex.dev/cluster-management/spot-instances