HN Mirror



	by jtmarmon 1325 days ago
	I think this status page is inaccurate - hosting is affected. My app is _unfixably_ broken right now. I have an app on fly whose VM appears to have died due to these issues, and because deploys and restarts are broken, I have literally no way of fixing it. https://community.fly.io/t/ewr-app-is-completely-inaccessibl...

2 comments

jteppinette 1325 days ago

This is what is worrying me about moving over to Fly. I am surprised that it has been so heavily pushed here on HN. Perhaps this is just a relatively isolated event; we will see how it is handled moving forward.

mrkurt 1325 days ago

This didn't actually kill VMs, but it _did_ prevent them from being rescheduled for upwards of an hour. The vast majority of apps running on the platform had 100% uptime throughout the incident. The ones that didn't have 100% uptime rely on our rescheduling infrastructure to recover from app errors.

jtmarmon 1325 days ago

Except my app isn't down due to an app error but a failed host in EWR which I couldn't escape from (due to the concurrent scheduling issues) https://status.flyio.net/incidents/v2dshzvy1mcl

EDIT: I recognize that these may be poorly timed but unrelated incidents, but it has been frustrating to be trapped on a broken box for 12 hours and have the status page telling me it's just new deploys that are borked :)

mrkurt 1325 days ago

I don't want to belabor this, because we need to do a much better job making it obvious, but single-node development Postgres databases are going to have downtime in our infrastructure. We'll get that host back for you, but you should _definitely_ add a replica if you care about availability.
