| HN Mirror

Testing strategy? What’s that? I kid, but just a bit. Our use case is a data warehouse. We use DBT to build everything. Each commit is built in CI to a CI target project. Each commit gets its own hash prefixed in front of dataset names. Each developer also has their own prefix for local development. The dev and ci datasets expire and are deleted after like a week. We use data tests on the actual data for “foreign keys”, checking for duplicates and allowed values. But that’s pretty much it. It’s very difficult to do TDD for a data warehouse in sql.

My current headache is what to do with an actually big table, 25 billion rows of json, for development. It’s going to be some DBT hacks I think.

God help you if you want to unit test application code that relies on bigquery. I’m sure there are ways but I doubt they don’t hurt a lot.