Hacker News
by tobilg 2675 days ago
We're using AWS Kinesis delivery streams to batch incoming JSON messages from IoT devices into Parquet files in S3. Those can be read directly by different AWS services like Redshift, EMR or Athena...
2 comments

We use Athena for all our robotics data, which we ETL into JSON. It's fantastic for simple time-slice queries, which most of ours are because sensor data is inherently time-series. When more complicated joins are necessary, the performance holds up across terabytes, and the cost is very low: $5 per terabyte scanned (storage costs are separate).
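
To make the pricing concrete, here is a rough sketch of the per-query arithmetic at the $5 per terabyte scanned quoted above. The function name and the use of a decimal terabyte (10^12 bytes) are assumptions for illustration, not something stated in the comment.

```python
# Price per terabyte scanned, as quoted in the comment above.
PRICE_PER_TB_USD = 5.0
# Assumption: a decimal terabyte; the billing unit may be defined differently.
BYTES_PER_TB = 10**12


def scan_cost_usd(bytes_scanned: int) -> float:
    """Estimate the cost of one query from the number of bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    # A query that scans 2 TB costs twice the per-terabyte price.
    print(scan_cost_usd(2 * 10**12))
```

Since the charge depends on bytes scanned rather than bytes stored, a narrow time-slice query over a small part of the data costs correspondingly less than a join that reads everything.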
What bothers me about Kinesis is that it is prohibitively expensive at scale if you don't compress your data before putting it into Kinesis.

But if you want to use the nice features like Parquet conversion, your data can't be compressed.

If it could handle compressed data at the same price I would use a lot more of it.
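
For readers unfamiliar with the trade-off, a minimal sketch of what "compress before putting" might look like on the producer side, using only the Python standard library. The record fields (`device_id`, `ts`, `temp_c`) and helper names are hypothetical, and the actual call that sends the payload to Kinesis is deliberately left out.

```python
import gzip
import json


def pack_records(records):
    """Serialize records as newline-delimited JSON and gzip the result."""
    ndjson = "\n".join(json.dumps(r, separators=(",", ":")) for r in records)
    return gzip.compress(ndjson.encode("utf-8"))


def unpack_records(blob):
    """Reverse pack_records: gunzip, then parse one JSON object per line."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]


# Hypothetical sensor readings, for illustration only.
readings = [
    {"device_id": "robot-1", "ts": 1_500_000_000 + i, "temp_c": 21.5}
    for i in range(1000)
]

blob = pack_records(readings)
raw_size = len("\n".join(json.dumps(r) for r in readings).encode("utf-8"))

if __name__ == "__main__":
    print(f"raw: {raw_size} bytes, gzipped: {len(blob)} bytes")
```

The catch described above follows directly: whatever consumes the payload has to run the equivalent of `unpack_records` first, so a service that expects plain JSON records cannot convert the gzipped blob as-is.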