by fyzix 1205 days ago
I get up to 20 seconds of downtime when redeploying on Railway... I was using the sample golang repo from Fly: https://github.com/fly-apps/go-example

That's just unacceptable imo.

2 comments

Hey there fyzix! Angelo here, Support Engineer from Railway. I was here procrastinating from work when I saw this comment.

The downtime you see is our proxy taking time to cut over. However, we have healthchecks (https://docs.railway.app/deploy/healthchecks) that will only cut over once we get a 200 from your API. This way we can keep your old deploy live and serving requests until the new one is confirmed live.

There's more we can do to make it magical, but this should help in the meantime.

Thanks for the reply. Issue resolved :)

This could be done automatically by pinging the root and checking for a header set by Railway's default page. If the header doesn't exist, then the service is live.

We used to do that by default, but it led to issues with load on our proxy. (Imagine n * m pings against your host for a shit-tonne (official measurement) of builds + deploys a day.)

We are hiring Network Engineers for exactly this reason :')

More context: The UI shows that the service is available, but when I visit the live URL, I get a 404.

What I think is happening is that the UI shows the status of the underlying container and there is some fault with the reverse proxy they are using to expose the container to the internet.

Fly.io doesn't have this issue but I am leery of all the complaints here.