by threeseed 845 days ago
Given how small the dataset is, there is no query that justifies a $14k charge.

AWS charges $27/hour for a server with 3TB of memory. Enough to run the queries in memory.


BQ charges you based on the volume of data being scanned. I think this is a case of someone scanning the whole dataset again and again without fully understanding how the billing works. I've worked with much larger datasets on BQ (petabyte scale) and managed to not spend more than $1000 in an hour. Also, BQ tells you how much data will be processed BEFORE you run the query, which makes it easier to understand the cost implications.
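For anyone who hasn't used the "before you run it" estimate programmatically, here is a minimal sketch using a dry run from the `google-cloud-bigquery` Python client. The $5/TB on-demand rate is an assumption (check current pricing for your region), and the helper names `estimate_cost_usd` and `dry_run_bytes` are made up for illustration. A dry run only reports the bytes a query would scan, so it should not itself be billed.

```python
ASSUMED_USD_PER_TB = 5.0  # assumption: on-demand rate, verify against current pricing


def estimate_cost_usd(total_bytes, usd_per_tb=ASSUMED_USD_PER_TB):
    """Convert a scanned-bytes figure into an approximate on-demand cost."""
    return total_bytes / 1e12 * usd_per_tb


def dry_run_bytes(sql):
    """Ask BigQuery how many bytes a query would scan, without running it.

    Needs the google-cloud-bigquery package and working credentials.
    """
    from google.cloud import bigquery  # imported lazily so the helper above works without it

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


# Usage (hypothetical query, left unexecuted here):
#   scanned = dry_run_bytes("SELECT ... FROM ... WHERE ...")
#   print(f"~${estimate_cost_usd(scanned):.2f}")
```

Checking that number before every ad hoc query is cheap insurance against a surprise bill.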

Again, you could fit the whole dataset in memory on an EC2 instance and do your thing.

It's easy to make an enormous query by joining to other data (or to the same data), or reading a lot of data.

A regex query on response_bodies would churn through 2.5TB of data every time it's run.
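To put rough numbers on that, here is a back-of-the-envelope sketch. The 2.5TB scan size, the $14k bill, and the $27/hour EC2 price come from this thread; the $5/TB on-demand rate is an assumption, so treat the results as order-of-magnitude only.

```python
import math

ASSUMED_USD_PER_TB = 5.0  # assumption: on-demand scan price, not stated in the thread
SCAN_TB = 2.5             # from the thread: one regex pass over response_bodies
BILL_USD = 14_000         # from the thread: the charge under discussion
EC2_USD_PER_HOUR = 27     # from the thread: server with 3TB of memory

# Cost of a single full scan at the assumed rate.
cost_per_scan = SCAN_TB * ASSUMED_USD_PER_TB

# Number of full scans needed to reach the bill.
scans_to_reach_bill = math.ceil(BILL_USD / cost_per_scan)

# Hours of the large EC2 instance the same money would buy.
ec2_hours = BILL_USD / EC2_USD_PER_HOUR

print(f"cost per scan: ${cost_per_scan:.2f}")
print(f"scans to reach ${BILL_USD}: {scans_to_reach_bill}")
print(f"equivalent EC2 time: {ec2_hours:.0f} hours")
```

Under those assumptions a single scan is cheap, so the bill implies on the order of a thousand full scans, or a smaller number of queries made much larger by joins.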