Hacker News
by m0zg 2627 days ago

  > SELECT * FROM AAA
I can't think of a realistic use case where such a query would be appropriate, even from a performance standpoint alone. In fact, for a very long time the internal counterpart of BigQuery didn't even support "SELECT star", and nobody complained too badly. If you'd like to give Google a gift, however, sure, "SELECT star" all you want. :-)

  > "SELECT field1, field2 FROM AAA" you'll pay only for the total size of field1 and field2 rows.
Moreover, if you also use a WHERE clause, you'll pay even less.
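
To make the quoted point about column-based billing concrete, here is a minimal sketch of a toy cost model. It is not BigQuery's actual pricing logic; the column names and byte sizes are made up for illustration, and `bytes_scanned` is a hypothetical helper. The only idea it shows is that a query is charged for the full size of each column it references.

```python
# Toy model of column-oriented billing: cost depends on which columns a
# query references. All sizes below are invented numbers, not real data.
COLUMN_BYTES = {
    "field1": 4_000_000,
    "field2": 8_000_000,
    "field3": 120_000_000,
}


def bytes_scanned(selected_columns, table=COLUMN_BYTES):
    """Sum the full size of every referenced column ('*' means all columns)."""
    if selected_columns == ["*"]:
        selected_columns = list(table)
    return sum(table[name] for name in selected_columns)


select_star = bytes_scanned(["*"])                # every column is read
two_fields = bytes_scanned(["field1", "field2"])  # only two columns are read
print(select_star, two_fields)
```

In this model, naming two small columns reads a fraction of what "SELECT star" reads, because the wide third column is never touched.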

This can happen in data cleaning/loading, where you load unclean data into a staging table and then transform it into a table that is ready for analysis. I regularly load data through staging tables, and there may be multiple stages.

Another example is materialized view creation. It's common for these to scan large quantities of data to compute aggregates.

That's not the recommended way of loading data into BigQuery, though.

https://cloud.google.com/bigquery/docs/loading-data

That is not true.

A WHERE clause still scans the entire column, unless it filters on the partition column.
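
A rough sketch of that distinction, as a toy model only: the table layout, column names, byte counts, and the `bytes_scanned` helper are all assumptions made up for illustration, not BigQuery internals. It assumes a filter on an ordinary column forces that column to be read in full, while a filter on the partition column lets whole partitions be skipped.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    day: str            # value of the (hypothetical) partition column
    column_bytes: dict  # column name -> bytes stored in this partition


# Invented sizes for a table split into three daily partitions.
TABLE = [
    Partition("2019-01-01", {"user_id": 100, "amount": 200}),
    Partition("2019-01-02", {"user_id": 150, "amount": 300}),
    Partition("2019-01-03", {"user_id": 50, "amount": 100}),
]


def bytes_scanned(select, where_column=None, partition_day=None):
    """Bytes read by a query in this toy model.

    select: columns in the SELECT list.
    where_column: a non-partition column used in WHERE, if any.
    partition_day: if set, WHERE filters on the partition column.
    """
    needed = set(select)
    if where_column is not None:
        needed.add(where_column)  # the filter column itself must be read
    # Only a partition-column filter lets us skip whole partitions.
    partitions = [p for p in TABLE if partition_day in (None, p.day)]
    return sum(p.column_bytes[c] for p in partitions for c in needed)


no_filter = bytes_scanned(["amount"])
plain_where = bytes_scanned(["amount"], where_column="user_id")
pruned = bytes_scanned(["amount"], partition_day="2019-01-02")
print(no_filter, plain_where, pruned)
```

Under these assumptions the plain WHERE does not reduce the bytes read (it adds the filter column), and only the partition filter actually shrinks the scan.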

Not if you filter on clustered columns, like you should if you care about performance.

True, but now we have moved pretty far from just saying "use a where clause".

"Partition your table and create clustered columns and filter on those"