by mulmen 845 days ago
I don’t disagree but how does that work exactly? When you hit the quota the query gets cancelled? That’s definitely already a feature of Redshift Spectrum with WLM. Does BigQuery offer something similar?

My first choice would be something like "this query will cost $13953, which exceeds your default cap of $100; please click the confirm button if you really want to run it". (The dollars could be CPU-minutes or whatever if you want to use resource based limits, which might play nicer with a free tier)

Edit: rereading, I think this is actually for non-interactive scripts, in which case yes it should just cancel the query

Edit 2: https://news.ycombinator.com/item?id=39447499 was kind enough to point out that the resource-based version of this might actually exist, which is nice
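A rough sketch of the gate described above, in case it helps make the idea concrete. It assumes the byte estimate is already known (in BigQuery a dry run can report this), and the price per TiB, the `gate_query` name, and the `Decision` enum are all made up for illustration, not anything the service provides:

```python
from enum import Enum

TIB = 1024 ** 4
# Assumption: illustrative on-demand price per TiB scanned; check current pricing.
ASSUMED_USD_PER_TIB = 6.25


class Decision(Enum):
    RUN = "run"
    NEEDS_CONFIRMATION = "needs_confirmation"
    CANCEL = "cancel"


def estimate_cost_usd(bytes_scanned: int,
                      usd_per_tib: float = ASSUMED_USD_PER_TIB) -> float:
    """Convert an estimated scan size into an estimated dollar cost."""
    return bytes_scanned / TIB * usd_per_tib


def gate_query(bytes_scanned: int,
               cap_usd: float = 100.0,
               interactive: bool = True,
               usd_per_tib: float = ASSUMED_USD_PER_TIB) -> Decision:
    """Decide what to do with a query before it runs.

    Under the cap: run it. Over the cap: ask a human if one is present,
    otherwise (non-interactive script) cancel outright.
    """
    cost = estimate_cost_usd(bytes_scanned, usd_per_tib)
    if cost <= cap_usd:
        return Decision.RUN
    return Decision.NEEDS_CONFIRMATION if interactive else Decision.CANCEL
```

With the assumed price, `gate_query(100 * TIB)` asks for confirmation, while `gate_query(100 * TIB, interactive=False)` cancels. Swapping dollars for bytes or CPU-minutes only changes `estimate_cost_usd`.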

https://cloud.google.com/bigquery/docs/best-practices-costs#...

You can set the size limit for individual queries. Plus the custom quotas and everything.

Part of the problem is that the OP wrote a script with a loop. So say you set the limit to 50 GiB per query, but then write a script that runs a 49 GiB query 1000 times...

That type of batch process should be designed much more carefully to consider costs.
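To illustrate the gap: a per-query limit alone lets 1000 queries of 49 GiB through, for 49,000 GiB in total. A minimal sketch of a cumulative guard that a batch script could wrap around its own loop follows. `ScanBudget`, `BudgetExceeded`, and `run_batch` are hypothetical names, and the sizes are assumed to come from some estimate made before each query runs:

```python
GIB = 1024 ** 3


class BudgetExceeded(RuntimeError):
    """Raised when a query would break the per-query or the cumulative limit."""


class ScanBudget:
    def __init__(self, per_query_limit_bytes: int, total_limit_bytes: int):
        self.per_query_limit_bytes = per_query_limit_bytes
        self.total_limit_bytes = total_limit_bytes
        self.used_bytes = 0

    def charge(self, estimated_bytes: int) -> None:
        """Record a query's estimated scan size, or refuse it."""
        if estimated_bytes > self.per_query_limit_bytes:
            raise BudgetExceeded("single query exceeds the per-query limit")
        if self.used_bytes + estimated_bytes > self.total_limit_bytes:
            raise BudgetExceeded("cumulative limit reached")
        self.used_bytes += estimated_bytes


def run_batch(budget: ScanBudget, query_sizes: list[int]) -> int:
    """Return how many queries were admitted before the budget stopped the loop."""
    admitted = 0
    for size in query_sizes:
        try:
            budget.charge(size)
        except BudgetExceeded:
            break  # stop the whole batch instead of continuing to spend
        admitted += 1
    return admitted
```

With a 50 GiB per-query limit and a 500 GiB total, `run_batch(ScanBudget(50 * GIB, 500 * GIB), [49 * GIB] * 1000)` admits 10 queries and stops, rather than running all 1000.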

> ... the OP wrote a script with a loop.

Are you sure?

The article doesn't say anything about a loop, and the estimated usage by the Google responder makes it seem like the cost is from a single "SELECT *".

According to https://news.ycombinator.com/item?id=39447465:

> I was doing historical evaluation for a few sites, so I was running a query for each month going back to 2016 for each site. I've done this before with no real issues, and if I knew the charges were rapidly exploding I'd have halted the script immediately - but instead it ran for 2 hours and the first notice I got was the CC charge.

So it looks like a loop of ((6 * 12) + 2) * #sites iterations, with a full table scan every time.
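For a sense of scale, the same arithmetic as a tiny sketch. The number of sites and the table size are not given anywhere in the thread, so both are placeholders here, and the function names are made up:

```python
GIB = 1024 ** 3

# Months per site, using the figure from the comment above.
MONTHS_PER_SITE = (6 * 12) + 2


def total_queries(n_sites: int, months_per_site: int = MONTHS_PER_SITE) -> int:
    """One query per month per site."""
    return months_per_site * n_sites


def total_bytes_scanned(n_sites: int, table_bytes: int,
                        months_per_site: int = MONTHS_PER_SITE) -> int:
    """Every query is assumed to scan the full table."""
    return total_queries(n_sites, months_per_site) * table_bytes
```

For example, with a hypothetical 5 sites, `total_queries(5)` is 370 full table scans, so the total scanned is 370 times the table size regardless of how little each query returns.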

Thanks, that does add further detail after all. :)