
by rektide 1497 days ago

Lovely read. Condensing some: there are three node types in the system: writers, compactors, and readers.

> Writers read from Kafka, (briefly) buffer events in memory, upload events to blob storage in our custom file format, and then commit the presence of these new files to our metadata store.... Compactors scan the metadata store for small files generated by the Writers and previous compactions, and compact them into larger files.... The Reader (leaf) nodes run queries over individual files in blob storage and return partial aggregates, which are re-aggregated by the distributed query engine.
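To make the writer/reader half of that concrete, here's a toy sketch of the flow in plain Python (compaction left aside). Everything in it is an assumption for illustration: the dicts standing in for blob storage and the metadata store, the function names, and the event fields are all made up, and none of it is Husky's actual code or file format.

```python
from collections import defaultdict

blob_store = {}     # file name -> list of events (stand-in for blob storage)
visible_files = []  # stand-in for the metadata store's list of visible files


def writer_flush(file_name, buffered_events):
    """Upload a batch of buffered events, then commit the file to metadata."""
    blob_store[file_name] = list(buffered_events)  # upload first
    visible_files.append(file_name)                # then make it visible


def reader_partial(file_name):
    """Leaf reader: per-service [count, total latency] for a single file."""
    partial = defaultdict(lambda: [0, 0])
    for event in blob_store[file_name]:
        acc = partial[event["service"]]
        acc[0] += 1
        acc[1] += event["latency_ms"]
    return partial


def query_avg_latency():
    """Query engine: re-aggregate the partials from every visible file."""
    merged = defaultdict(lambda: [0, 0])
    for file_name in visible_files:
        for service, (count, total) in reader_partial(file_name).items():
            merged[service][0] += count
            merged[service][1] += total
    return {service: total / count for service, (count, total) in merged.items()}


writer_flush("f1", [{"service": "api", "latency_ms": 10},
                    {"service": "db", "latency_ms": 30}])
writer_flush("f2", [{"service": "api", "latency_ms": 20}])
result = query_avg_latency()
print(result)
```

The readers return counts and sums rather than averages because averages from different files can't be merged directly, while (count, sum) pairs can simply be added.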

And then the metadata supporting the system:

> Husky's metadata store has multiple responsibilities, but its most important one is to serve as the strongly consistent source of truth for the set of files currently visible to each customer. We’ll delve into the details of our metadata store more in future blog posts, but it is a thin abstraction around FoundationDB, which we selected because it was one of the few open source OLTP database systems that met our requirements

There are some nice scalability/isolation benefits in all this. Having reader nodes read from network storage creates a lot of flexibility & the ability to shift work around on demand.

Keeping all the metadata in FoundationDB is exciting, & sounds like a great use case for its safe transactional updates!
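A rough sketch of why the transactional part matters for compaction, assuming the compactor has to swap several small files for one big file in a single step. This is a toy in-memory model with a lock playing the role of a transaction; `MetadataStore`, `swap`, and `compact_small_files` are hypothetical names, and this is not the FoundationDB API or Husky's implementation.

```python
import threading


class MetadataStore:
    """Toy stand-in for a transactional metadata store (not FoundationDB)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._visible = {}  # file name -> event count

    def commit_file(self, name, event_count):
        with self._lock:
            self._visible[name] = event_count

    def snapshot(self):
        with self._lock:
            return dict(self._visible)

    def swap(self, old_names, new_name, new_count):
        """Atomically replace old files with one compacted file.

        Returns False and changes nothing if any input is no longer visible,
        e.g. because another compactor already replaced it.
        """
        with self._lock:
            if any(name not in self._visible for name in old_names):
                return False
            for name in old_names:
                del self._visible[name]
            self._visible[new_name] = new_count
            return True


def compact_small_files(store, max_small=100):
    """Merge all files below max_small events into one file, if there are 2+."""
    files = store.snapshot()
    small = sorted(n for n, c in files.items() if c < max_small)
    if len(small) < 2:
        return None
    merged_name = "compacted:" + "+".join(small)
    merged_count = sum(files[n] for n in small)
    return merged_name if store.swap(small, merged_name, merged_count) else None


store = MetadataStore()
store.commit_file("a", 10)
store.commit_file("b", 20)
store.commit_file("c", 500)
new_file = compact_small_files(store)
print(new_file, store.snapshot())
```

Because the removal of the small files and the addition of the compacted file happen in one step, a reader taking a snapshot sees either the old set or the new set, never a mix that would double-count or drop events.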

1 comment

seedless-sensat 1497 days ago

Also, using external compactors gives another independent scaling dimension. Nice.
