by aksakalli 1693 days ago
I used Apache Drill to query JSON Lines log files stored in Azure Blob storage. It is very easy to configure and run. I used it in embedded mode for smaller queries and for some visualisation in Apache Superset, and it worked really well. I created some views in Parquet to speed things up.

Be aware that there is no such thing as schemaless: Drill is schema-on-read, and if your files contain changing schemas, it is painful to work around all the errors you run into. JSON is too ambiguous when it comes to types.
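
To make the type ambiguity concrete, here is a rough Python sketch that scans JSON Lines records and reports fields whose JSON type changes from one line to the next. It is not part of Drill and not what I ran; the function names and the sample records are made up for illustration, and it assumes every line is a JSON object and only looks at top-level fields.

```python
import json
from collections import defaultdict


def field_types(lines):
    """Map each top-level field name to the set of Python type names seen for it."""
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)  # assumption: each line is a JSON object
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


def conflicting_fields(lines):
    """Return only the fields that appear with more than one type."""
    return {k: v for k, v in field_types(lines).items() if len(v) > 1}


# Hypothetical log records: "latency" is sometimes an integer and sometimes
# a float, "id" switches from number to string, and "user" can be null.
sample = [
    '{"id": 1, "latency": 12, "user": "a"}',
    '{"id": 2, "latency": 12.5, "user": null}',
    '{"id": "3", "latency": 7}',
]

if __name__ == "__main__":
    for name, types in sorted(conflicting_fields(sample).items()):
        print(name, sorted(types))
```

Running a check like this over a sample of the files before pointing a schema-on-read engine at them gives an early list of the fields that are likely to cause trouble.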