Hacker News new | ask | show | jobs
by threeseed 2311 days ago
Not sure what you are talking about.

If you have S3 you can use Athena, Redshift Spectrum or Spark as query layer. It's not 22 technologies.

You don't need ElasticSearch, Ranger, Knox, Zookeeper etc as they have nothing to do with querying.

1 comments

But then it's far from basically free. Even overpriced Oracle databases can end up cheaper than locking into AWS in these cases (my experience).
I think the presumption that's differing here is query workload.

An OLAP database is, in the default case, an always-online instance or cluster, costing fixed monthly OpEx.

Whereas, if your goal in having that database is to do one query once a month based on a huge amount of data, then it will certainly be cheaper to have an analytical pipeline that is "offline" except when that query is running, with only the OLTP stage (something ingesting into S3; maybe even customers writing directly to your S3 bucket at their own Requester-Pays expense) online.