
by v0g0n 3477 days ago

With Qubole you can offload data engineering to their platform. Cluster management is super simple. Hand-rolled solutions are, in my experience, a pain, and elastic cloud features take time to build. Qubole's offering provides an out-of-the-box experience for most big data engines out there. Presto, Spark, Hive, Pig, what have you: all work with your data living in S3 (or any other object storage). I believe they have offerings in other clouds too.

Qubole's engineering team has also done some S3 listing optimisation: https://www.qubole.com/blog/product/optimizing-s3-bulk-listi...

They also have features that auto-provision additional capacity in your compute clusters as your query processing times increase.

1 comment

ktamura 3477 days ago

When Amazon Athena actually matures, wouldn't it solve at least the interactive query needs, probably at a much lower/elastic price point than Qubole?


v0g0n 3477 days ago

True, I've tried Athena and it's great on cost, performance and ease of use. However, most data engineering teams need lots of custom tweaks and a certain level of control to add jars, applications, and UDFs to their queries. I don't see this available through Athena today.
