by wfarr
5019 days ago
At the time of the outage, the status site was seeing upwards of 30,000 requests per minute. As we scaled up dynos, we would see temporary performance improvements until the status site stopped responding again. In the short term, this led us to increase dynos as quickly as we could, since at the time it appeared that CPU burn was a significant cause of the slowness. This was in part caused by all the dynos repeatedly crashing. That's how we ended up going from 8 dynos to 90. Once the database problem for the status site was identified and resolved, we began scaling dynos back down to a smaller number.