Hacker News new | ask | show | jobs
by rjbwork 1103 days ago
I actually have this exact problem right now with SQL Server on AWS RDS. Unless I want to pay for standard+ editions in my dev/stage/qa/etc. environments, I can't use the baked in CDC features. And because of the minimum instance sizes for Standard+ edition, it costs ~1700 bucks per month per database. This is fine for production, because I need features like High Availability, but paying a significant premium over web/express in those environments seems like lighting money on fire.

We're already tracking changes for the purposes of time travel queries and other auditing purposes using Temporal Tables (SQL:2011 feature). I'm thinking a cron job triggering a lambda every minute should be sufficient to read from the history tables and publish out change data events over a bus.

Anyone see any problems with this approach?

1 comments

I'm not sure how this interacts with temporal tables, but I know that I've run into issues before just trying to track something with a high watermark, in that long running transactions might add records "in the past" behind your watermark that you track, so you need some mechanism to make sure you're capturing those (i.e., you can't just track an auto-inc primary key). Unfortunately, I think CDC is really the best way to handle this.
I agree that it is probably the best choice, I'm just not willing to >quadruple my server costs for the feature, unfortunately. I appreciate you taking the time to reply, and I did some further research.

It looks like as long as transactions are not taking excessive time to complete, that temporal tables will be sufficient, since the history tables get marked with the transaction begin times. I'll use a sliding window approach and dedupe.

EDIT: I also found this https://learn.microsoft.com/en-us/sql/relational-databases/s... which seems like it's supported on Express and Web versions and should be a fairly robust solution.

We made mssql-changefeed for this purpose:

https://github.com/vippsas/mssql-changefeed

V1 requires to run a sweeper procedure in the background, but an upcoming v2 version does without the sweep loop. Unfortunately too fresh for readme to be updated, but relevant lines in tests to show it off:

https://github.com/vippsas/mssql-changefeed/blob/v1-lazy/go/...

https://github.com/vippsas/mssql-changefeed/blob/v1-lazy/go/...

https://github.com/vippsas/mssql-changefeed/blob/v1-lazy/go/...

Library itself:

https://github.com/vippsas/mssql-changefeed/blob/v1-lazy/mig...

The "in the past" problem can be solved in a number of ways (misc locking patterns). Have played with a number of such schemes in mssql-changefeed (see sibling post).

CDC has 20-40 second delay for mssql which makes it unusable for many purposes.