by yjftsjthsd-h 844 days ago
Or, GCP could implement cost/resource/use limits, which would allow them to give away whatever they wanted for free without any concern about people overusing it, while also allowing people to avoid shooting their own feet off.

I don’t disagree but how does that work exactly? When you hit the quota the query gets cancelled? That’s definitely already a feature of Redshift Spectrum with WLM. Does BigQuery offer something similar?
My first choice would be something like "this query will cost $13953, which exceeds your default cap of $100; please click the confirm button if you really want to run it". (The dollars could be CPU-minutes or whatever if you want to use resource-based limits, which might play nicer with a free tier.)

Edit: rereading, I think this is actually for non-interactive scripts, in which case yes it should just cancel the query

Edit 2: https://news.ycombinator.com/item?id=39447499 was kind enough to point out that the resource-based version of this might actually exist, which is nice
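To make the confirm-or-cancel idea above concrete, here is a minimal sketch in plain Python. Nothing in it is a real GCP or BigQuery API: `gate_query`, its parameters, and the `confirm` callback are hypothetical names, and the cost estimate is assumed to come from somewhere else (for example a dry run).

```python
from typing import Callable


def gate_query(
    estimated_cost: float,
    cap: float,
    interactive: bool,
    confirm: Callable[[str], bool] = lambda prompt: False,
) -> bool:
    """Return True if the query may run, False if it should be cancelled.

    Hypothetical helper: estimated_cost and cap share one unit
    (dollars, CPU-minutes, bytes scanned, ...).
    """
    if estimated_cost <= cap:
        return True  # under the cap, run without asking
    if not interactive:
        return False  # scripts cannot click a button, so cancel
    prompt = (
        f"This query will cost ${estimated_cost:,.0f}, which exceeds "
        f"your cap of ${cap:,.0f}. Run it anyway?"
    )
    return confirm(prompt)  # interactive: let the user decide
```

The default `confirm` declines, so a caller that forgets to wire up a prompt fails closed rather than open.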

https://cloud.google.com/bigquery/docs/best-practices-costs#...

You can set the size limit for individual queries. Plus the custom quotas and everything.

Part of the problem is that the OP wrote a script with a loop. So say you set the limit to 50 GiB per query, but then write a script that runs a 49 GiB query 1000 times...

That type of batch process should be designed much more carefully to consider costs.
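A rough sketch of why a per-query limit alone doesn't cover the loop case, and what a cumulative budget on top of it could look like. This is plain Python with hypothetical names (`ScanBudget`, `run_batch`); it does not call BigQuery, and the byte estimates are assumed to be known before each query runs.

```python
GIB = 1024 ** 3


class BudgetExceeded(RuntimeError):
    """Raised when a query would break the per-query or total limit."""


class ScanBudget:
    """Hypothetical guard tracking bytes scanned across a whole batch."""

    def __init__(self, per_query_limit: int, total_limit: int) -> None:
        self.per_query_limit = per_query_limit
        self.total_limit = total_limit
        self.used = 0

    def charge(self, estimated_bytes: int) -> None:
        if estimated_bytes > self.per_query_limit:
            raise BudgetExceeded("single query exceeds per-query limit")
        if self.used + estimated_bytes > self.total_limit:
            raise BudgetExceeded("batch would exceed total limit")
        self.used += estimated_bytes


def run_batch(budget: ScanBudget, estimates: list[int]) -> int:
    """Return how many queries were admitted before the budget stopped the loop."""
    admitted = 0
    for estimate in estimates:
        try:
            budget.charge(estimate)
        except BudgetExceeded:
            break  # halt the whole script instead of carrying on
        admitted += 1
    return admitted
```

With a 50 GiB per-query limit, every 49 GiB query passes individually; only the total limit stops the batch. For example, `run_batch(ScanBudget(50 * GIB, 500 * GIB), [49 * GIB] * 1000)` admits 10 queries and then halts.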

> ... the OP wrote a script with a loop.

Are you sure?

The article doesn't say anything about a loop, and the estimated usage by the Google responder makes it seem like the cost is from a single "SELECT *".

According to https://news.ycombinator.com/item?id=39447465:

> I was doing historical evaluation for a few sites, so I was running a query for each month going back to 2016 for each site. I've done this before with no real issues, and if I knew the charges were rapidly exploding I'd have halted the script immediately - but instead it ran for 2 hours and the first notice I got was the CC charge.

So it looks like a loop of ((6 * 12) + 2) * #sites iterations, with a full table scan every time.
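Spelling that estimate out as a quick sketch: the 74 queries per site comes from the expression above, while the number of sites and the bytes scanned per query are not stated anywhere in the thread, so both are left as parameters with hypothetical names.

```python
# Months per site, taken from the estimate above: (6 * 12) + 2.
QUERIES_PER_SITE = (6 * 12) + 2


def total_queries(num_sites: int) -> int:
    """One query per month per site."""
    return QUERIES_PER_SITE * num_sites


def total_scanned_bytes(num_sites: int, bytes_per_scan: int) -> int:
    """Total bytes scanned if every query is a full table scan of bytes_per_scan."""
    return total_queries(num_sites) * bytes_per_scan
```

The point is just that the bill scales with sites times months times table size, so even a modest number of sites multiplies a single full scan many times over.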

Thanks, that does add further detail after all. :)