Hacker News new | ask | show | jobs
by physicles 485 days ago
I never understood the rationale behind TimescaleDB — if you’re building a time series database using row-oriented storage, you’ve already got one hand tied behind your back.

What does your testing strategy look like with bigquery? We use snowflake, but the only way to iterate and develop is using snowflake itself, which is so painful as to impact the set of features that we have the stomach to build.

1 comments

Testing strategy? What’s that? I kid, but just a bit. Our use case is a data warehouse. We use DBT to build everything. Each commit is built in CI to a CI target project. Each commit gets its own hash prefixed in front of dataset names. Each developer also has their own prefix for local development. The dev and ci datasets expire and are deleted after like a week. We use data tests on the actual data for “foreign keys”, checking for duplicates and allowed values. But that’s pretty much it. It’s very difficult to do TDD for a data warehouse in sql.

My current headache is what to do with an actually big table, 25 billion rows of json, for development. It’s going to be some DBT hacks I think.

God help you if you want to unit test application code that relies on bigquery. I’m sure there are ways but I doubt they don’t hurt a lot.

Interesting strategy with appending the commit hash to the dataset name. If one of those commits is known to be good and you want to “ship” it, do you then rename it?

What are you doing with that JSON? What’s the reason why you can’t get a representative sample of rows onto your dev machine and hack on that?