by manigandham 3705 days ago
This will be interesting to try out. I've used all the major cloud event/logging systems (Kinesis, Azure Event Hubs, etc.) and so far Google Cloud Pub/Sub is the best in features and performance.

The only downside of Google Cloud Pub/Sub can be latency (which I'm working on fixing by building a gRPC driver), but Kafka has proven too complicated to maintain in-house. If Heroku can provide the speed without the ops overhead, it'll be good competition for Google's option.

Also worth noting that Jay Kreps, who helped build Kafka at LinkedIn, is now behind Confluent (http://www.confluent.io/), which is like a better/enterprise version of Kafka.

2 comments

Not sure why you are comparing Google Cloud Pub/Sub to Kinesis - the former is a message queue (MQ) system, not a distributed commit log.

When creating a Kinesis consumer, I can specify where it starts reading a stream: a) TRIM_HORIZON, the oldest records in the stream that haven't yet expired (aka been "trimmed"); b) LATEST, only records arriving from now on, which is the one capability Cloud Pub/Sub offers; c) AT_SEQUENCE_NUMBER {x}, starting at the record in the stream with the given sequence number (offset ID); d) AFTER_SEQUENCE_NUMBER {x}, starting at the record immediately after the one in c); e) AT_TIMESTAMP, starting at an arbitrary point in time.
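To make the five starting positions concrete, here is a minimal sketch that builds the keyword arguments for boto3's `get_shard_iterator` call. The helper name `shard_iterator_args` and the stream and shard names in the usage note are made up for illustration; only the parameter names (`StreamName`, `ShardId`, `ShardIteratorType`, `StartingSequenceNumber`, `Timestamp`) come from the Kinesis API. The sketch does not call AWS itself.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Iterator types that require a sequence number to be supplied.
NEEDS_SEQUENCE = {"AT_SEQUENCE_NUMBER", "AFTER_SEQUENCE_NUMBER"}
ITERATOR_TYPES = NEEDS_SEQUENCE | {"TRIM_HORIZON", "LATEST", "AT_TIMESTAMP"}


def shard_iterator_args(
    stream: str,
    shard_id: str,
    iterator_type: str,
    sequence_number: Optional[str] = None,
    timestamp: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build kwargs for kinesis.get_shard_iterator()."""
    if iterator_type not in ITERATOR_TYPES:
        raise ValueError(f"unknown iterator type: {iterator_type}")

    args: Dict[str, Any] = {
        "StreamName": stream,
        "ShardId": shard_id,
        "ShardIteratorType": iterator_type,
    }
    if iterator_type in NEEDS_SEQUENCE:
        if sequence_number is None:
            raise ValueError(f"{iterator_type} requires a sequence number")
        args["StartingSequenceNumber"] = sequence_number
    elif iterator_type == "AT_TIMESTAMP":
        if timestamp is None:
            raise ValueError("AT_TIMESTAMP requires a timestamp")
        args["Timestamp"] = timestamp
    # TRIM_HORIZON and LATEST need no extra position argument.
    return args
```

Assuming boto3 is installed and credentials are configured, the result would be passed along as `boto3.client("kinesis").get_shard_iterator(**shard_iterator_args("events", "shardId-000000000000", "TRIM_HORIZON"))`.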

A Kinesis stream (like a Kafka topic) is a very special form of database - it exists independently of any consumers. By contrast, with Google Cloud Pub/Sub [1]:

> When you create a subscription, the system establishes a sync point. That is, your subscriber is guaranteed to receive any message published after this point.

[1] https://cloud.google.com/pubsub/subscriber

So the stream is not a first-class entity in Cloud Pub/Sub - it's just a consumer-tied message queue.
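A toy model of the distinction, in case it helps. This is not how either service is implemented; `CommitLog` and `SubscriptionTopic` are made-up in-memory classes that only mimic the behaviour described above: a log that keeps records regardless of consumers, versus a topic that delivers only to subscriptions that already exist at publish time.

```python
from typing import Dict, List


class CommitLog:
    """Toy log: records persist independently of any consumer."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def append(self, record: str) -> int:
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset: int = 0) -> List[str]:
        # Any consumer, however late, can start at any retained offset.
        return list(self._records[offset:])


class SubscriptionTopic:
    """Toy topic: a message is stored only in already-existing subscriptions."""

    def __init__(self) -> None:
        self._queues: Dict[str, List[str]] = {}

    def subscribe(self, name: str) -> None:
        # The "sync point": this queue starts out empty.
        self._queues[name] = []

    def publish(self, message: str) -> None:
        for queue in self._queues.values():
            queue.append(message)

    def pull(self, name: str) -> List[str]:
        messages, self._queues[name] = self._queues[name], []
        return messages


log, topic = CommitLog(), SubscriptionTopic()
for event in ("a", "b"):
    log.append(event)
    topic.publish(event)  # no subscriptions yet, so nothing is retained

topic.subscribe("late-consumer")
log.append("c")
topic.publish("c")

replayed = log.read_from(0)            # late reader still sees a, b, c
delivered = topic.pull("late-consumer")  # late subscriber sees only c
```

The late reader of the log can replay everything from offset 0, while the late subscriber only gets what was published after its subscription was created.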

Is there something like Kinesis' AT_TIMESTAMP in Kafka?

I think the only way to replay events in Google Cloud Pub/Sub is to create multiple subscriptions in advance, right after topic creation. But then I think you need to pay for the storage and the event traversal requests.

What are your thoughts on Kafka vs Flume?