Hacker News
by lewisl9029 1668 days ago
Yes, their network stacks are definitely complex and cost a lot to maintain, I'm sure. But that doesn't necessarily make it a good deal if the customer isn't able to derive enough additional value from all that complexity. In fact, it makes the offering less attractive if the complexity isn't sufficiently abstracted away and distracts from product work, or if their abstractions are leaky.

Recently I've been working with https://fly.io/ for a new app and it's a breath of fresh air compared to working with the big cloud providers. They offer simple but robust networking primitives built on top of IPv6 and WireGuard, and provide a ton of value-add on top, like global distribution & load balancing, service discovery, and TLS termination, all of which just work exactly like I'd expect them to, out of the box, without any configuration on my side.

EDIT: Almost forgot to mention that their egress costs are also much more reasonable: https://fly.io/docs/about/pricing/#outbound-data-transfer

2 comments

I'm watching fly.io with interest. I want to see how they handle the first major incidents (response time, lessons learnt, transparency) before I trust them with a production site, though. Most SRE skills related to your own operations are learnt on the battlefield and not via some cliché must-read book from Google engineers, afaic.

If it's Linode style, meaning delayed status page updates (sometimes by as much as 15 minutes), zero-detail post-mortems ("this problem has been fixed by our engineers, thank you", yada yada), and the same issues repeating six months down the line, then I will be understandably disappointed.

I've only been with them through one major incident so far, and I recall them handling it reasonably well.

You can see them responding to customers and providing updates in real time here: https://community.fly.io/t/there-seems-to-be-an-outage-with-...

And a detailed postmortem here: https://community.fly.io/t/major-outage-portmortem-2021-10-1...

They also update their status page pretty diligently whenever something goes wrong, even for things that don't necessarily impact all customers (from what I can remember, the only recent item on there that affected my app directly was the Oct 13 one): https://status.flyio.net/history

> But that doesn't necessarily make it a good deal if the customer isn't able to derive enough additional value from all that complexity.

It’s simply obvious that it’s not a good deal if you’re not their target customer with a use case they cater to. However, it could be a good deal if you have a relevant use case. Unless it’s being suggested that AWS caters to everyone in all cases, pointing this out adds nothing to the conversation.