by stratified 2212 days ago

[DISCLAIMER] I work at AWS, not speaking for my employer.

We really need some more details on your infrastructure, but I assume it's EC2 instance cost that skyrocketed?

A couple of pointers:

- Experiment with different GPU instance types.

- Try Inferentia [1], a dedicated ML chip. Most popular ML frameworks are supported by the Neuron compiler.
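One way to make the instance-type experiment concrete is to compare cost per million inferences rather than raw hourly price. A minimal sketch follows; the candidate names, prices, and throughput numbers are placeholders I made up for illustration, not real AWS pricing or benchmarks, so plug in your own measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One instance type you benchmarked (all values are your own measurements)."""
    name: str
    hourly_price_usd: float    # on-demand (or effective) price per hour
    throughput_per_sec: float  # sustained inferences per second on this instance


def cost_per_million(c: Candidate) -> float:
    """USD needed to serve one million inferences at full utilization."""
    if c.throughput_per_sec <= 0:
        raise ValueError("throughput must be positive")
    inferences_per_hour = c.throughput_per_sec * 3600
    return c.hourly_price_usd / inferences_per_hour * 1_000_000


def rank(candidates):
    """Cheapest per inference first."""
    return sorted(candidates, key=cost_per_million)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    fleet = [
        Candidate("gpu-large", hourly_price_usd=3.60, throughput_per_sec=1000),
        Candidate("gpu-small", hourly_price_usd=0.90, throughput_per_sec=200),
        Candidate("ml-chip", hourly_price_usd=1.80, throughput_per_sec=800),
    ]
    for c in rank(fleet):
        print(f"{c.name}: ${cost_per_million(c):.2f} per 1M inferences")
```

This assumes the instance is kept busy; at low utilization the per-inference cost rises accordingly, which is where the scaling advice comes in.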

Assuming you manage your instances in an auto scaling group (ASG):

- Enable a target tracking scaling policy to reactively scale your fleet. The best scaling metric depends on your inference workload.

- If your workload is predictable (e.g. high traffic during the daytime, low traffic during nighttime), enable predictive scaling. [3]
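For reference, here is a rough sketch of what the two policies could look like with boto3's `put_scaling_policy`, as the EC2 Auto Scaling API looks today. The ASG name, policy names, and the 50% CPU target are assumptions for illustration. CPU utilization is only a stand-in here: for GPU inference a custom metric (e.g. requests per instance) may fit better. The predictive policy starts in forecast-only mode so you can check the forecast before letting it scale.

```python
def target_tracking_policy(asg_name: str, target_cpu_percent: float = 50.0) -> dict:
    """Keyword arguments for put_scaling_policy: reactive target tracking."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-target-tracking",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": target_cpu_percent,
        },
    }


def predictive_policy(asg_name: str, target_cpu_percent: float = 50.0,
                      forecast_only: bool = True) -> dict:
    """Keyword arguments for put_scaling_policy: predictive scaling."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-predictive",  # hypothetical naming scheme
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [
                {
                    "TargetValue": target_cpu_percent,
                    "PredefinedMetricPairSpecification": {
                        "PredefinedMetricType": "ASGCPUUtilization",
                    },
                }
            ],
            # ForecastOnly produces forecasts without acting on them.
            "Mode": "ForecastOnly" if forecast_only else "ForecastAndScale",
        },
    }


def apply_policies(asg_name: str) -> None:
    """Attach both policies to the group (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without it

    client = boto3.client("autoscaling")
    for kwargs in (target_tracking_policy(asg_name), predictive_policy(asg_name)):
        client.put_scaling_policy(**kwargs)
```

Calling `apply_policies("my-inference-asg")` (a made-up group name) would attach both policies; the two builder functions are kept separate so you can inspect or tweak the request before sending it.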

1 comment

It could also be worth having a look at SageMaker. IIRC it's cheaper.