Hacker News
by cdowns 882 days ago
I recently ran into an overload issue that turned out to be one of the non-obvious "hard limits" mentioned. Everything would run smoothly for a while, then my throughput would halve after I walked away, backing up the queues indefinitely and paging me again.

I had moved a single-broker RabbitMQ from GCP to AWS, and the instance type I chose had bandwidth "up to" 10Gbps. Being less familiar with AWS, I did not realize that "up to" means "burstable to": they actively throttle based on credits, regardless of available capacity. My messages are pretty large, and I was running out of credits after about an hour.
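
For anyone who wants to estimate when this would bite, here is a rough sketch of the credit math, treating the throttle as a simple token bucket. The function name and every number in it are made up for illustration; they are not AWS's actual figures for any instance type, and the real accounting may well differ.

```python
def seconds_until_throttled(credit_gbit, baseline_gbps, send_gbps):
    """Estimate how long a burst-credit bucket lasts under a steady send rate.

    Assumes a plain token bucket: credits drain at (send rate - baseline rate)
    and are not refilled while sending above baseline. Returns None if the
    send rate never exceeds the baseline.
    """
    drain_gbps = send_gbps - baseline_gbps
    if drain_gbps <= 0:
        return None  # at or below baseline, credits never run out
    return credit_gbit / drain_gbps


# Hypothetical inputs: a 9000 Gbit credit balance, a 2.5 Gbps baseline,
# and a sustained 5 Gbps of traffic.
print(seconds_until_throttled(credit_gbit=9000.0, baseline_gbps=2.5, send_gbps=5.0))  # 3600.0
```

With those invented inputs the bucket empties after an hour and throughput drops to the baseline, which is half the send rate. Plugging in measured traffic and the provider's published baseline would give a real estimate.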

Bandwidth was the last thing I considered since I hadn't had the issue on GCP. Switching to a larger instance with guaranteed bandwidth was a band-aid; clustering to spread the load across multiple instances will be my longer-term fix. Lesson learned. Hopefully this helps someone someday.