by stratified 2164 days ago
[DISCLAIMER] I work at AWS, not speaking for my employer.

We really need some more details on your infrastructure, but I assume it's EC2 instance cost that skyrocketed?

A couple of pointers:

- Experiment with different GPU instance types.

- Try Inferentia [1], a dedicated ML chip. Most popular ML frameworks are supported by the Neuron compiler.

Assuming you manage your instances in an auto scaling group (ASG):

- Enable a target tracking scaling policy [2] to reactively scale your fleet. The best scaling metric depends on your inference workload.

- If your workload is predictable (e.g. high traffic during the daytime, low traffic during nighttime), enable predictive scaling. [3]
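
To make the target tracking bullet concrete, here is a minimal sketch using boto3's `put_scaling_policy`. The ASG name `inference-fleet`, the policy name, and the 50% target are all made up for illustration, and average CPU is only a placeholder metric: for GPU inference you would likely want a custom metric that reflects your actual bottleneck.

```python
def target_tracking_policy(asg_name, target_cpu_percent=50.0):
    """Build the keyword arguments for a target tracking scaling policy.

    The policy name and the choice of average CPU utilization are
    illustrative assumptions, not a recommendation for every workload.
    """
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Predefined metric: average CPU across the whole group.
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            # The ASG adds or removes instances to hold the metric near this value.
            "TargetValue": float(target_cpu_percent),
        },
    }


def apply_policy(policy_kwargs):
    """Send the policy to EC2 Auto Scaling (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so building the dict needs no AWS SDK

    client = boto3.client("autoscaling")
    return client.put_scaling_policy(**policy_kwargs)
```

Calling `apply_policy(target_tracking_policy("inference-fleet"))` would attach the policy to that group. Keeping the dict builder separate from the API call lets you inspect or unit test the configuration without touching AWS.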

[1] https://aws.amazon.com/machine-learning/inferentia/

[2] https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-sca...

[3] https://docs.aws.amazon.com/autoscaling/plans/userguide/how-...

1 comment

It might also be worth taking a look at SageMaker. IIRC it's cheaper.