Show HN: An open-source library for telemetry streaming started at Formula 1 (github.com)
11 points by cha0sengine 1211 days ago
Hello HN, we’re open sourcing a new library called Quix Streams (https://github.com/quixio/quix-streams), a high-performance Python and C# library that processes time-series telemetry data with a message broker such as Apache Kafka instead of a database.

We’re a team of engineers who all worked together at the Formula 1 McLaren pit wall. We struggled to get what we wanted out of existing stream processing tech, so we built our own platform called Quix. Although our original requirements were pretty niche (process over 60k data points per second), our solution is designed to make life easier for anyone who needs to process high volumes of time-series data.

We also wanted it to be as accessible as possible for data scientists and ML engineers (e.g. those who run a lot of physics models on sensor data). This means it’s got a lot of built-in convenience functions, such as native DataFrame support and easy-to-use buffers for rolling-window calculations.
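
To make the rolling-window idea concrete, here is a minimal sketch in plain Python. It is not the Quix Streams buffer API; the `RollingWindow` class and its `add` method are hypothetical names used only for illustration.

```python
from collections import deque
from statistics import fmean


class RollingWindow:
    """Hypothetical fixed-size buffer keeping the most recent `size` samples."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        # deque(maxlen=...) silently drops the oldest sample once full.
        self._samples = deque(maxlen=size)

    def add(self, value: float) -> float:
        """Append a sample and return the mean of the current window."""
        self._samples.append(value)
        return fmean(self._samples)


window = RollingWindow(size=3)
# Feed four samples; the last mean covers only the three most recent values.
means = [window.add(v) for v in [10.0, 20.0, 30.0, 40.0]]
```

A library buffer would typically also allow time-based windows; this sketch only shows the count-based case.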

The main differentiator, though, is making it as performant as possible: we’ve added features that make it easier to take full advantage of Kafka’s parallel processing capabilities while keeping relevant data together and ordered correctly.
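
As a rough illustration of how keyed partitioning keeps related data together and in order, the sketch below routes messages by a stream ID. It is a simulation in plain Python, not Kafka or Quix Streams code: `partition_for` and `route` are hypothetical helpers, and CRC32 stands in for whatever hash a real Kafka client applies to the message key.

```python
import zlib


def partition_for(stream_id: str, num_partitions: int) -> int:
    """Map a stream ID to a partition; the same ID always gets the same partition."""
    # CRC32 is an illustrative stand-in for a real client's key hash.
    return zlib.crc32(stream_id.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Append each (stream_id, payload) pair to its partition, preserving arrival order."""
    partitions = {p: [] for p in range(num_partitions)}
    for stream_id, payload in messages:
        partitions[partition_for(stream_id, num_partitions)].append((stream_id, payload))
    return partitions


messages = [("car-1", 1), ("car-2", 1), ("car-1", 2), ("car-2", 2), ("car-1", 3)]
routed = route(messages, num_partitions=4)
```

Because every message for `car-1` lands in one partition, a single consumer sees that stream in order, while different streams can be processed in parallel on other partitions.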

So even if you’re not doing real-time ML on Formula 1 telemetry, we hope that its simplicity and performance will make you more productive and we’re excited to see what other use cases you find for it.

More info:

https://github.com/quixio/quix-streams

https://quix.io/blog/open-sourcing-quix-streams

https://docs.quix.io/sdk-intro.html#library-features

* Full disclosure: the library started as a companion to our commercial platform http://quix.io but you can use it with any Kafka installation you like.

3 comments

Interesting, I skimmed the readme, but can you clarify how this is different from other similar libraries like kafka-python or Faust?
Well, we go into a bit more detail in the blog post (2nd link above). But essentially there are a lot of custom features that focus on time, specifically time-series data. Also, the stream context feature is quite different, allowing you to send different types of data (time series, events, binary, etc.) in one ordered stream, bound to one partition. Regarding Faust specifically, the big difference is in the approach to stateful processing. Faust basically copies the architecture of Kafka's native Kafka Streams library, where state is stored in topics. Our library, on the other hand, avoids this by using a Kubernetes persistent volume (shared in a StatefulSet) to persist state and manage state checkpointing.
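
For readers unfamiliar with the idea of checkpointing state to a volume instead of a topic, here is a minimal sketch, assuming the volume is mounted as an ordinary directory. It is not how Quix Streams implements it; `save_checkpoint` and `load_checkpoint` are hypothetical names, and JSON is an arbitrary choice of format.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict, offset: int) -> None:
    """Atomically write the state together with the offset it corresponds to."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash never leaves a half-written checkpoint behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"offset": offset, "state": state}, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def load_checkpoint(path: str):
    """Return (state, offset); an empty state and offset -1 if no checkpoint exists."""
    if not os.path.exists(path):
        return {}, -1
    with open(path) as f:
        data = json.load(f)
    return data["state"], data["offset"]
```

On restart, a consumer would load the checkpoint and resume reading from the offset after the stored one.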
Would qx be difficult to adapt to a different message broker with a Python client?
Quix Streams was envisioned to support multiple brokers from day one. We've built an abstraction layer on top of Kafka, so it would be possible to integrate with different message brokers. Which broker do you have in mind?
This looks super interesting. Are there maybe some test data streams? Not necessarily real data from real F1 races :)
Hi, I recorded a quick Loom here -> https://www.loom.com/share/876f1475113a481b8d2591d0ce2f302c

But basically, if you log in to the Quix Portal and head to the library, you can see a bunch of data sources.

Hmm... this requires signing up with your platform, which I'd prefer not to do. But streaming from a CSV is good enough for me. I noticed you have a nice kafka-python/vanilla Kafka tutorial on your blog (the "...event-driven system to get ML predictions" one), which shows how to do that. I assume the same process can be followed with this new library? Do you have any docs that show how to stream from a CSV?
All the samples and connectors are also open source in this repo https://github.com/quixio/quix-library

This project has a stream from CSV example https://gist.github.com/merlin-quix/e42594da530432cbd050bd83...
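
For anyone who just wants the general shape of replaying a CSV as a stream, here is a minimal sketch in plain Python. It is not the code from the gist and does not use the Quix Streams API; `stream_csv`, the `timestamp` column name, and the `replay_speed` parameter are assumptions made for illustration.

```python
import csv
import io
import time


def stream_csv(lines, time_column="timestamp", replay_speed=0.0):
    """Yield CSV rows one at a time, optionally pacing them by their timestamps.

    replay_speed=0 disables pacing; 1.0 replays in real time; 2.0 at double speed.
    """
    previous = None
    for row in csv.DictReader(lines):
        ts = float(row[time_column])
        if previous is not None and replay_speed > 0:
            # Sleep for the gap between consecutive rows, scaled by the speed.
            time.sleep(max(0.0, (ts - previous) / replay_speed))
        previous = ts
        yield row


# An in-memory file keeps the example self-contained; a real file handle works too.
sample = io.StringIO("timestamp,rpm\n0.0,9000\n0.5,9400\n1.0,9900\n")
sent = []
for row in stream_csv(sample):
    sent.append(row)  # stand-in for publishing the row to a broker
```

Replacing `sent.append(row)` with a producer call is where a Kafka client or a streaming library would plug in.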

Yeah for sure. What are you looking for?