Hacker News
by bot41 2052 days ago
Cloud computing seems to be a winner-take-all scenario. For example, if you use AWS and need a message broker service then you'll use this. If you use Azure, you'll use their version. Development seems like just hooking up these components. I can't tell if this is a good thing or a bad thing.
4 comments

It’s a bad thing - whenever you cannot make a choice you become a slave to tyranny. Infra lock-in means you’re a slave to their whims (financial, legal, competitiveness, whatever).

This is justified by “less maintenance, and easier deployment” but the reality of the situation is, it’s not worth giving your freedom up for, and to a lesser degree - if your platform becomes popular, you end up spending the same amount of time tweaking and optimising to match the idiosyncrasies of their implementation anyway.

But the most important part is vendor lock-in: it’s bad.

How is using AWS or Azure's managed RabbitMQ service lock-in? You can easily switch to someone else's managed service, or roll your own, since it's still just good old RabbitMQ.
I was responding to the parent's more general point about cloud services being "winner take all"
In how many cases are you truly not locked into anything? Even if you host your own RabbitMQ you're still coupled to the software. I'm not sure being coupled to cloud service X is any worse than OSS product Y. For the latter, you tend to need to have more expertise to run it yourself.

At the end of the day, you can still rewrite your code and switch in both cases. You can also end up in a tough spot if the OSS community loses interest in the software you've already bet your complicated app on.

AWS has sqs & kinesis which are much better queueing options in that scenario where you’re in AWS doing new dev and can pick a technology. This is more likely to do with opening doors for large and complex applications that can’t be rewritten to come into the AWS cloud. Unsexy stuff but there’s some fun engineering to be done in that realm, if you find puzzles and shitshows fun, anyway :-)
How are those much better? If you use SQS and you write your code for it, then you are stuck on a proprietary platform. Also, SQS is super-basic and actually requires a bunch of code to do anything beyond the trivial - although yes, it seems reliable and well-supported, at least from my experience. I was actually really waiting for AWS to support Rabbit since it seems to hit the right combo of features, usability and platform independence for me, and it looked friendlier than ActiveMQ.
If you're using a decent framework there's a good chance it already does most of the work for you. With Ruby on Rails, there's the ActiveJob abstraction which you can hook up to different backends like SQS or Redis with a few lines of code. In addition, AWS has a lot of out-of-the-box integration with SNS and SQS for other services like CloudWatch and Lambda (in fact, with Lambda you don't need any special code).

If you have a Lambda function processing SQS messages, they just get dumped in your handler method, and if your function runs successfully they get automatically removed from the queue. If your Lambda fails, the message reappears after the visibility timeout, subject to your redrive policy.
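
To make the Lambda behaviour above concrete, here is a minimal Python sketch. It assumes the usual SQS event shape that Lambda hands to a handler (a `Records` list whose entries carry a `body` string). The `process_order` function and the JSON payload are hypothetical stand-ins, not something from this thread.

```python
import json


def process_order(order):
    """Hypothetical business logic; raises on input it cannot handle."""
    if "id" not in order:
        raise ValueError("order has no id")
    return order["id"]


def handler(event, context):
    """Entry point for a Lambda function with an SQS event source.

    Returning normally signals that the batch succeeded, so Lambda
    deletes the messages from the queue. Raising leaves them in the
    queue: they become visible again after the visibility timeout and
    follow the queue's redrive policy if one is configured.
    """
    processed = []
    for record in event["Records"]:
        order = json.loads(record["body"])
        processed.append(process_order(order))
    return {"processed": processed}
```

There is no explicit delete call anywhere in the handler, which is the point being made: success or failure of the function is the acknowledgement.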

I'm referring to things like topics, multiple consumers, routing, etc etc that are not even possible with SQS and once you grow into a need for those, SQS stops being adequate no matter what library you use for it.
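
For readers who have not used topic routing, here is a rough Python model of what is meant. It is a simplified sketch of AMQP-style topic matching (`*` stands for exactly one dot-separated word, `#` for zero or more), not RabbitMQ's actual code, and the queue names and bindings are made up for illustration.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key matches an AMQP-style topic pattern."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i, j):
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more words of the routing key
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)


def route(bindings, routing_key):
    """Return every queue whose binding pattern matches the routing key."""
    return [queue for queue, pattern in bindings if topic_matches(pattern, routing_key)]


# Hypothetical bindings: several consumers can receive the same message.
BINDINGS = [
    ("billing", "orders.*.created"),
    ("audit", "orders.#"),
    ("eu-reports", "orders.eu.*"),
]
```

With these bindings, `route(BINDINGS, "orders.eu.created")` selects all three queues, while `route(BINDINGS, "orders.us.cancelled")` selects only `audit`.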
"sqs & kinesis"? These are two vastly different queuing systems. It's like saying "SSDs and Tape" are a better storage system than X.
Neither SQS nor Kinesis supports the same functionality as RabbitMQ.
> AWS has sqs & kinesis which are much better queueing options in that scenario where you’re in AWS doing new dev and can pick a technology.

Why are they better?

They aren't. They're technically more correct but not always the practical best choice.

RabbitMQ is a smart play as Rabbit is very easy to use, understand, and troubleshoot at the low end (which is where I suspect the vast majority of queue systems live).

It also has a feature which is actually really hard to do (and sqs doesn't do). Guaranteed delivery of a message once.

That was THE reason we never migrated to SQS, there are scenarios where SQS can double deliver. Our codebase was built up from nothing over time and couldn't gracefully handle double delivery of messages in all scenarios. We could have refactored, but it wasn't worth the work when we were already doing a half billion in revenue without getting even close to the limitations of rabbit AND were close to selling (which we ultimately did).
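
Handling double delivery gracefully usually means making the consumer idempotent. A minimal Python sketch of that idea follows, assuming each message carries a unique id. The `IdempotentConsumer` class is hypothetical, and the in-memory set stands in for whatever durable store a real system would need.

```python
class IdempotentConsumer:
    """Wraps a handler so a redelivered message is processed only once."""

    def __init__(self, handle):
        self._handle = handle
        self._seen = set()  # assumption: a real system would persist this

    def on_message(self, message_id, body):
        """Return True if the message was handled, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._handle(body)
        # Record the id only after the handler succeeds, so a failed
        # attempt can still be retried on redelivery.
        self._seen.add(message_id)
        return True
```

Retrofitting this onto every code path in an existing application is the kind of refactoring work described above.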

AWS is great at selling multiple slight variations of the same product. If you look you can usually find ONE variation that works for you. The real test will be if the billing isn't garbage (garbage billing is why we didn't use their other AMQP service and part of the reason why we don't use things like EKS or Managed SFTP despite having the need).

> Guaranteed delivery of a message once

That flies in the face of my distributed systems knowledge. It's not possible in some failure cases.

If your acknowledgement of a message gets lost (because either server involved or the pipes in-between fail) you've processed the message already but the queue server will think you haven't. It either has to resend it (duplicate delivery) or it ignores acknowledgements altogether (drops messages that it sent you, but you didn't process - maybe because your server failed.) So the choice when there is a failure in the system is between at least once or at most once - exactly once cannot be guaranteed.

I'm not aware of any way around that predicament.
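
The trade-off can be shown with a toy Python simulation. This is a sketch under simplifying assumptions, not a model of any real broker: each failure happens only on the first delivery attempt, and a redelivered message always succeeds. The function and parameter names are invented for the example.

```python
def simulate(messages, lost_acks, crashes, redeliver):
    """Return how many times each message ends up being processed.

    lost_acks: ids the consumer processed but whose ack never arrived.
    crashes:   ids the consumer received but failed to process.
    redeliver: True  -> broker resends anything unacked (at-least-once)
               False -> broker sends each message a single time (at-most-once)
    """
    counts = {}
    for m in messages:
        count = 0
        crashed = m in crashes
        if not crashed:
            count += 1  # first attempt was processed
        acked = not crashed and m not in lost_acks
        if not acked and redeliver:
            count += 1  # broker cannot tell the two failures apart, so it resends
        counts[m] = count
    return counts


at_least_once = simulate(["a", "b", "c"], lost_acks={"b"}, crashes={"c"}, redeliver=True)
at_most_once = simulate(["a", "b", "c"], lost_acks={"b"}, crashes={"c"}, redeliver=False)
```

With redelivery, message `b` (lost ack) is processed twice. Without it, message `c` (consumer crash) is never processed. From the broker's side both cases look the same: no ack arrived.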

You are correct, a better description is that their path to 'deliver exactly once to the best of your ability' is clearer.

If I remember correctly SQS is hard limited to a fairly short timeout to requeue messages delivered but not acked. In rabbit it's much more configurable.

Also regular rabbit hosts support the kludge pattern of, 'just run one host and accept if it goes poof you can lose messages,' which is useful if you don't want to bother with the complexity of clustering or are on a shoestring budget.

Lastly you get a nice user interface with the management plugin and you can stand it up locally with docker compose (without depending on AWS for dev or any of the 'aws but on your laptop' solutions).

Yeah, those are nice features to have. Plus you don't get the platform lock-in.
SQS also supports FIFO queues, which have once-only delivery and ordering. Any reason those didn't work for you?
Aren't they expensive with performance limitations?

Yes we could do that, but we had already been using rabbit in a bunch of places. It made no sense to change it.

They’ve been around for quite some time so they have a much wider customer base & bigger teams supporting them. AWS services can be a bit choppy in the beginning so imo, especially with queues, I’d wait for it to bake.
I agree. SQS has done nothing but improve over time.
Nope. If only. There are four considerations to look at which are rarely if ever mentioned between the mountains of well organised hype:

The first is application complexity. A lot of real workloads are quite complex but do not need to scale out a lot. The cloud and all the hype is organised around simple workloads that need to scale out easily which is an easy win. So for example Netflix or a SaaS application with a few tens of endpoints and a React front end or something. The wide sprawling real businesses are a terrible fit and tend to get rather expensive rather quickly when you start putting their workloads into the cloud. There is marginal aggregate cost benefit over actually buying hardware ($4m a year SQL server clusters are a reality in the cloud), the real benefit being only agility.

The second is simply "hooking up components" sounds really easy. But it's not. I think perhaps 50% of my time is working out why X won't talk to Y or why Z is broken and finding some opaque abstraction which doesn't allow me to get to the bottom of the problem. It's very very easy to turn your deployment into a complete tangle of chaos and circular dependencies which are very hard to rationalise and automate even with state of the art automation tools (which I will say tend to melt in your hands). This is an extra layer on top of the same concerns you had before, rather than a replacement for them.

Thirdly we have to work out the difference between mature products and hype. Nearly all solutions are described in little blog snippets that make things look really easy for a specific and narrow use case but realistically things are really fucking complicated and in some cases absolutely awfully described in documentation. In a lot of cases, including AWS, it's actually hard to find someone at the cloud vendor who knows how something works when you break it. And sometimes there are solutions which are just absolutely dire. Again pointing the finger here at Amazon's managed ElasticSearch.

Fourthly, you end up being a perpetual bean counter afraid of the Rube Goldberg machine waking in the middle of the night due to some event you didn't anticipate and drinking the contents of your credit card in a few minutes. Some of the cost management and spot instance management software automates this rather nicely into a whole cluster of new failure modes as well, just as if the complexity wasn't enough already. A trite version of this is "saving money costs money and sometimes the benefits are less than the costs".

So what you end up doing is trading your original problems for a set of new and shiny ones which are possibly even more complicated.

But at least you only have one vendor to shout at, which is a net win if you've ever tried to get HPE and Cisco to work out what fucked up mess is going on between their two lumps of iron.

I digress but be careful with assumptions about it being magical unicorns. They poop and you have to shovel it.

good - my job is really easy

bad - my job is really boring

I beg to differ on your second point - at my company we've fully embraced AWS and putting vendor lock-in issues aside, the end result is focusing more on the application and less on the minutiae of operational issues which is a big win. This makes things much more interesting since you can get there faster and consequently take on more impactful projects in the same timeframe. This in general is a boon for developers in my experience.
Agreed. The flip side isn't vendor lock-in though, it is building complex systems under the guise of scale.
Somehow I don't think physical laborers ever complain when they get new and more powerful tools to make their job easier.

It's only software engineers that bemoan their lives getting easier, so they can spend more time working on other problems higher up the abstraction chain.

The 'problems higher up the abstraction chain' are the ones that are closer to labour, or factory work. Mundane and repetitive, relatively speaking easy - requiring less thought and being to some extent trainable as working within a pattern/template.
How about when the new powerful tools help get rid of some of them due to productivity gain?
And you can spend that time thinking about how not to use a message broker... And maybe just use lower level MANAGED services, like SQS or SNS.
> good - my job is really easy

Long term, isn't this bad? For example, if you can do it with 5 years' experience, then at 15 years' experience you'll likely be too expensive for companies to want to hire you. They'll just hire more junior people.