Hacker News new | ask | show | jobs
by jacksondeane 5281 days ago
If I am reading this correctly, you want to maintain multiple versions of the same blog post, correct? This isn't really 'version control' (git, svn, etc...) in the traditional sense .

At a high level, you could just keep a record of every version of every blog post, then separately maintain which entry is most current, or active, or whatever metric you want to display/query. This will let you maintain all versions of the blog post and you can run diffs on different versions, revert to older versions and branch from any version.

1 comments

I'm curious - has anyone here had experience doing this with a few million rows of data? Even a few revisions each means the table would grow rapidly. You'd also (probably) need three operations for each insert:

1. Insert the new row

2. Find the previous row

3. Set previous row to `active`=0

Are there performance tricks to keep this reasonable? I'd also wonder how to modify the table itself (dropping/adding columns) when there would potentially be selects + inserts + updates running 100% of the time - but I realize that's borderline out of scope of the OP's question.

Not sure about the other questions but in regards to changing the actual table structure. If you're dropping columns you'd first make the changes to your code base so that it no longer relies on those columns. If you have long running queries you'd just have to wait till the old ones using those columns are finished and then you could change the DB. And you'd do it in just the opposite order when you wanted to add a column.