by barrkel
3110 days ago
I had an entertaining few days working with Confluent's Kafka Connect stuff. I was trying to connect a MySQL table to Kafka and then on out to Hadoop. Amusingly, Kafka Connect wanted to use a topic with the same name as my table (MySQL or Hive / Hadoop, I don't recall which end); but of course since Kafka doesn't have namespaces, I had better hope that my table name is unique across the whole cluster! It was around about then that I figured out that Confluent was a bunch of kids playing at building stuff. I have zero doubt that it's a good base if you have an enormous firehose of data, but look for features beyond raw performance and basic correctness, and it's underdeveloped. Basic stuff like back-pressure: don't expect it. Either overallocate your storage or make sure you always have faster consumers than producers.
|
It'll be the MySQL end if it's a Connect source as opposed to sink.
Two options - in your Connect config, you can specify a topic prefix, or if you use a custom query, the topic prefix will be used as the entire topic name.
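A rough sketch of those two options, written as the config dicts you might submit to Connect. The property names (`topic.prefix`, `table.whitelist`, `query`, `mode`) are the JDBC source connector's; the connection URL, table, and prefixes are made up for illustration, and `expected_topics` is a hypothetical helper that only restates the naming rule above, not anything Connect provides.

```python
import json

# Hypothetical connection details, for illustration only.
BASE = {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://db.example.internal:3306/shop",
    "mode": "incrementing",
    "incrementing.column.name": "id",
}

# Option 1: table mode. The topic is the prefix followed by the table name.
table_mode = {**BASE, "table.whitelist": "orders", "topic.prefix": "shop."}

# Option 2: custom query. The prefix is used as the entire topic name.
query_mode = {
    **BASE,
    "query": "SELECT id, status FROM orders",
    "topic.prefix": "shop.orders_slim",
}


def expected_topics(config):
    """Topic names implied by the rule described above (not a Connect API)."""
    prefix = config.get("topic.prefix", "")
    if "query" in config:
        return [prefix]
    tables = [t.strip() for t in config["table.whitelist"].split(",")]
    return [prefix + t for t in tables]


if __name__ == "__main__":
    print(expected_topics(table_mode))
    print(expected_topics(query_mode))
    # Shape of the JSON body for creating a connector over Connect's REST API.
    print(json.dumps({"name": "orders-source", "config": table_mode}, indent=2))
```

Either way, the prefix is what gives you a poor man's namespace, so pick one that is unique per source database.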
> It was around about then that I figured out that Confluent was a bunch of kids playing at building stuff.
Kafka Connect saved me writing a load of boilerplate to monitor a PG database to propagate model updates in a medium suitable for streaming jobs. Kafka Connect + Kafka Streams' global KTables is a nice fit, even if the Connect JDBC end is somewhat beta at this point (KTables rely on the Kafka message key for identity, the JDBC source doesn't populate it by default, so you have to use Single Message Transforms (SMTs) to set it).
I'd say beta, not kids.
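For the key problem mentioned above, this is roughly the SMT chain I mean. `ValueToKey` and `ExtractField$Key` are the stock Kafka Connect transforms; the aliases `createKey` / `extractId` and the `id` column are assumptions for the example. The two small functions just mimic what the transforms do to a record, so you can see the effect without a cluster; they are not Connect code.

```python
# Extra properties added to a JDBC source connector config.
# Aliases and the "id" column are hypothetical.
smt_settings = {
    "transforms": "createKey,extractId",
    "transforms.createKey.type": "org.apache.kafka.connect.transforms.ValueToKey",
    "transforms.createKey.fields": "id",
    "transforms.extractId.type": "org.apache.kafka.connect.transforms.ExtractField$Key",
    "transforms.extractId.field": "id",
}


def value_to_key(record, fields):
    """Mimics ValueToKey: build a struct-like key from fields of the value."""
    key = {f: record["value"][f] for f in fields}
    return {**record, "key": key}


def extract_field_key(record, field):
    """Mimics ExtractField$Key: replace the struct key with one of its fields."""
    return {**record, "key": record["key"][field]}


def apply_chain(record):
    """Apply the two simulated transforms in the configured order."""
    keyed = value_to_key(record, smt_settings["transforms.createKey.fields"].split(","))
    return extract_field_key(keyed, smt_settings["transforms.extractId.field"])


if __name__ == "__main__":
    # A row as the JDBC source would emit it by default: no key.
    row = {"key": None, "value": {"id": 42, "status": "paid"}}
    print(apply_chain(row))
```

With the primitive `id` as the message key, a KTable or GlobalKTable can treat each new record for that id as an update to the same row.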