by anon84873628 846 days ago
https://cloud.google.com/bigquery/docs/best-practices-costs#...

You can set a size limit for individual queries, and there are custom quotas as well.

Part of the problem is that the OP wrote a script with a loop. So say you set the limit to 50 GiB per query, but then write a script that runs a 49 GiB query 1000 times...

That type of batch process should be designed much more carefully to consider costs.
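
A rough sketch of what "designed more carefully" could mean here: track a cumulative budget across the whole loop, not just a per-query cap. Everything in it is hypothetical (`run_batch`, `estimate_bytes`, `run_query`, and the 500 GiB budget are made-up names and numbers, not anything from the article or the BigQuery API). In practice `estimate_bytes` might be backed by a dry run of the query, but that part is left as an assumption.

```python
GIB = 1024 ** 3


class BudgetExceeded(RuntimeError):
    """Raised when a query would break the per-query or cumulative limit."""


def run_batch(queries, estimate_bytes, run_query, per_query_limit, total_budget):
    """Run queries in order and stop before the cumulative estimate exceeds total_budget.

    estimate_bytes and run_query are caller-supplied callables (hypothetical
    hooks, e.g. a dry-run estimate and the real query execution).
    Returns the total estimated bytes for the queries that were run.
    """
    spent = 0
    for sql in queries:
        estimate = estimate_bytes(sql)
        # The per-query cap alone would let every 49 GiB query through.
        if estimate > per_query_limit:
            raise BudgetExceeded(f"query estimate {estimate} exceeds per-query limit")
        # The cumulative check is what actually stops a runaway loop.
        if spent + estimate > total_budget:
            raise BudgetExceeded(f"batch would exceed total budget after {spent} bytes")
        run_query(sql)
        spent += estimate
    return spent


if __name__ == "__main__":
    executed = []
    try:
        # 1000 queries of 49 GiB each, 50 GiB per-query cap, 500 GiB overall budget.
        run_batch(["SELECT ..."] * 1000, lambda sql: 49 * GIB, executed.append,
                  per_query_limit=50 * GIB, total_budget=500 * GIB)
    except BudgetExceeded as exc:
        print(f"stopped after {len(executed)} queries: {exc}")
```

With those made-up numbers the per-query check never fires, and the loop halts only because of the cumulative check.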


> ... the OP wrote a script with a loop.

Are you sure?

The article doesn't say anything about a loop, and the estimated usage by the Google responder makes it seem like the cost is from a single "SELECT *".

According to https://news.ycombinator.com/item?id=39447465:

> I was doing historical evaluation for a few sites, so I was running a query for each month going back to 2016 for each site. I've done this before with no real issues, and if I knew the charges were rapidly exploding I'd have halted the script immediately - but instead it ran for 2 hours and the first notice I got was the CC charge.

So it looks like a loop of ((6 * 12) + 2) * #sites iterations, with a full table scan every time.

Thanks, that does add further detail after all. :)