by slt2021 1004 days ago
you don't need "systems" for 10-20 messages a day. it could all be replaced with S3 buckets and aws-cli, with better durability, delivery latency, and error handling than anything you would be able to engineer yourself

I am completely dumbfounded by this reply.

You're suggesting we engineer something on S3 and aws-cli, while complaining about engineering something ourselves when AWS offers a perfectly good queue service that requires no engineering?

Uff. I'm going to buy a hut in the woods and live in it.

I used S3 just as an example of a service with a very good track record of availability at a very low cost. a perfectly reliable, available service can be created with bash scripts, aws-cli, and a free-tier AWS account.

I'm perfectly fine with using SQS; it will just have worse availability guarantees than S3. people should understand the tradeoffs
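
For illustration, here is a minimal sketch of what "a queue on top of an object store" could look like. It is written in Python rather than bash and aws-cli, and `InMemoryBucket` is a hypothetical in-memory stand-in, not a real S3 client. The method names (`put_object`, `get_object`, `list_keys`, `delete_object`), the `inbox/` prefix, and the key format are all assumptions made for this example.

```python
import itertools
import time


class InMemoryBucket:
    """Hypothetical stand-in for an object store bucket (not a real S3 client)."""

    def __init__(self):
        self._objects = {}

    def put_object(self, key, body):
        self._objects[key] = body

    def get_object(self, key):
        return self._objects[key]

    def list_keys(self, prefix):
        # Return matching keys in lexicographic order.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def delete_object(self, key):
        self._objects.pop(key, None)


_counter = itertools.count()


def send(bucket, body, prefix="inbox/"):
    # Zero-padded timestamp plus a counter keeps keys unique and roughly sortable.
    key = f"{prefix}{time.time_ns():020d}-{next(_counter):06d}"
    bucket.put_object(key, body)
    return key


def receive(bucket, handler, prefix="inbox/"):
    """Handle every pending message, deleting each only after the handler returns."""
    processed = 0
    for key in bucket.list_keys(prefix):
        handler(bucket.get_object(key))
        bucket.delete_object(key)  # skipped if the handler raised, so the message stays
        processed += 1
    return processed
```

Because a message is deleted only after its handler returns, a crash mid-handler leaves the object in place for the next run. That gives at-least-once behaviour, so handlers would need to tolerate duplicates.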

I don't know but I think they might be suggesting that the answer to "don't need a huge complex system" is not "use someone else's huge complex system".
my understanding is that you should not roll your own expensive, complex, huge system; just use AWS S3, which provides eleven nines of durability at $22/TB/mo.

It is really hard to beat the cost/benefit ratio of S3.

a lot of mediocre engineers cannot swallow the pill that all their expensive work, with hundreds of hours of overengineering, could be replaced by a couple of AWS managed serverless services stitched together with a few mouse clicks or a single .yml file

Why does AWS even enter the conversation?
same reason any vendor enters the conversation. the argument that you can get more value out of the vendor than you would have out of a full time employee, and often for less money.

sometimes it really does make sense just to pay someone else to solve the problem. not always, but not never.

Then you have dependency on a specific proprietary API / technology available from a single company. Doesn't look like a good trade off.
S3 API has become lingua franca and is supported by open source (minio), storage vendors (QNAP), as well as plugins that translate S3 API calls to APIs for competing cloud providers (s3proxy).

all this is done because S3 provides unmatched durability and reliability at a dirt cheap cost of $22/TB/month of storage (with the first 50TB/mo free!).

Try to beat those reliability guarantees with whatever you hand-rolled. I bet you will never beat the cost of S3, or even match its durability, reliability, and availability guarantees at any reasonable cost at all

from https://aws.amazon.com/s3/storage-classes/:

  Key Features:
    Low latency and high throughput performance
    Designed for durability of 99.999999999% of objects across multiple Availability Zones
have you ever built anything with 11 nines? (as in eleven nines)
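
To put that figure in perspective, here is a back-of-the-envelope sketch. It assumes the quoted 99.999999999% can be read as a per-object, per-year durability and that losses are independent. Both are simplifying assumptions for this example, and the function names are made up.

```python
# 1 - 0.99999999999, i.e. eleven nines (assumed to be per object, per year)
ANNUAL_LOSS_PROBABILITY = 1e-11


def expected_objects_lost_per_year(object_count):
    # Expected value under the independence assumption.
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count):
    # How many years, on average, until one object is expected to be lost.
    return 1.0 / expected_objects_lost_per_year(object_count)
```

Under these assumptions, storing ten million objects works out to roughly one expected object loss every 10,000 years.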
S3 is fine until you want your data to leave AWS.

Then it costs $92 / TB to get it out again.

Also, S3 has durability guarantees, but it's very difficult to do a durable transactional write to S3. Try it a few million times and see. The API is a de facto shitty standard.

These two facts become rather interesting when you have to do a restore from your supposed backup, or when you wonder why the consistency guarantees between external metadata services (DB) and what is in S3 don't always line up.

and why would you ever take raw data out of AWS ?

if it is for migration: it is one time cost that anyone can swallow easily if they decided to leave AWS for something else.

If your data is worth less than $94/TB, it is really not worth pulling it out of AWS. Just let it sit there.

or just use CloudFront to download your data ($8.5/TB)?
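
A rough sketch of the arithmetic being argued over, using only the per-TB figures quoted in this thread ($22 storage per month, $92 egress, $8.5 via CloudFront). These are the commenters' numbers, not verified current prices, and the constant and function names are made up for the example.

```python
# Figures as quoted in the thread, in USD per TB; treat them as assumptions.
STORAGE_PER_TB_MONTH = 22.0
EGRESS_PER_TB = 92.0
CLOUDFRONT_PER_TB = 8.5


def egress_cost(terabytes, rate_per_tb=EGRESS_PER_TB):
    # One-time cost of pulling the data out at the given per-TB rate.
    return terabytes * rate_per_tb


def months_of_storage_equivalent(rate_per_tb=EGRESS_PER_TB):
    # How many months of storage cost the same as moving the data out once.
    return rate_per_tb / STORAGE_PER_TB_MONTH
```

With these numbers, a one-time exit at the direct egress rate costs about as much as four months of storing the same data, and less than half a month at the quoted CloudFront rate.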

On top of that, if it's a big enough deal, most salespeople at other cloud providers (GCP, Oracle etc.) will gladly pay you to migrate. They'd probably even throw one of their Solutions Engineers at the problem and do it for you for free.
Selling your soul to a different crack dealer. Hmm.
> S3 API has become lingua franca

S3 API support sounds great until your customer builds a system with an "S3 compatible object storage" product. Soon you discover that many "S3 compatible" solutions aren't actually that compatible when pushed.

get-object and put-object are really all you need. everything else is nice to have
Your point is good, but that stack wouldn't win any latency awards. Many of the people I know using kafka need latencies in the millisecond range.
but kafka isn't fast. Most things are backed by real files, so when you hit limits or something is ejected from cache, it gets slow real fast.

Kafka isn't the right choice for most things.

SQS, MQTT, NATS, rabbit if you're wanting a lot of admin are all better (plus the crap that azure and google make)

kafka is not for latency, it is used for high throughput. By design kafka shines in high throughput workload due to consumer and producer concurrency (consumer group), broker concurrency (multiple nodes and partitions).
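
A toy sketch of the partition and consumer-group idea described above: messages with the same key land on the same partition, so their relative order is kept, and each partition is read by exactly one consumer in the group. `zlib.crc32` stands in for whatever hash a real client uses, the round-robin assignment is a simplification, and all function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    # Same key always maps to the same partition (crc32 is a stand-in hash).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Append (key, value) pairs to their partition, preserving arrival order."""
    partitions = defaultdict(list)
    for key, value in messages:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def assign(num_partitions, consumers):
    """Round-robin: each partition goes to exactly one consumer in the group."""
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return dict(assignment)
```

Throughput scales by adding partitions and consumers, while ordering is only preserved within a partition, not across the whole topic.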

for latency-sensitive workloads you will probably need redis pub/sub or something in-memory

In all the times I've been forced to use Kafka, I have never seen single digit millisecond latency.

If you need fast response Kafka is a bad choice.

If you are okay going to multiple digits of milliseconds then there are simpler solutions.

The only reason to use Kafka is the ability to guarantee order. For everything else it's second place at best.

what if you want to have at least once processing and durability?
Like AWS SQS? Others provide at least once processing as well.

Kafka has been the slowest out of them that I've used and definitely more complex to use.
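
For readers unfamiliar with the term, here is a minimal in-memory sketch of at-least-once delivery using a visibility timeout: a received message is hidden for a while and reappears unless it is acknowledged. The class and its methods are hypothetical and do not correspond to any real service's API. Time is passed in explicitly to keep the example deterministic.

```python
class AtLeastOnceQueue:
    """Toy in-memory queue; a message is redelivered until it is acknowledged."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._bodies = {}        # message id -> body
        self._hidden_until = {}  # message id -> time it becomes visible again
        self._next_id = 0

    def send(self, body):
        message_id = self._next_id
        self._next_id += 1
        self._bodies[message_id] = body
        self._hidden_until[message_id] = 0.0
        return message_id

    def receive(self, now):
        # Return the oldest visible message and hide it for the timeout window.
        for message_id, body in self._bodies.items():
            if self._hidden_until[message_id] <= now:
                self._hidden_until[message_id] = now + self.visibility_timeout
                return message_id, body
        return None

    def ack(self, message_id):
        # Only an explicit acknowledgement removes the message for good.
        self._bodies.pop(message_id, None)
        self._hidden_until.pop(message_id, None)
```

If a consumer crashes before calling `ack`, the message becomes visible again after the timeout and is delivered a second time, which is why consumers in such a scheme need to be idempotent.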

SQS has durability for up to 14 days and you can only have one consumer group.

It’s also proprietary.

Then use Rabbit.

It's still harder to get wrong than Kafka.