Hacker News
by laserDinosaur 709 days ago
>Call me old fashion but I really like integer autoincrement primary keys.

Just hope you never have to merge tables from two databases together.


This is exceptionally rare in most projects.

I know of only one person in my entire career who had to do this. And they managed it just fine despite working with auto-incrementing big ints.

Yet some folks advocate that all projects should pay for expensive insurance against this elusive event of two databases being merged.

>And they managed it just fine despite working with auto-incrementing big ints.

I wonder how. I've had to do several big merges in my career, and it was always a nightmare because of all the external systems which were already referencing and storing those pre-existing ints. Sure, merging the databases is easy if you don't mind regenerating all the IDs, but it's not usually that simple.

Simplest way is to keep the identifiers from DB A and increment all the identifiers from DB B by an offset. Third parties complicate things of course, but internally it can be pretty simple, so maybe they just didn't have too many third parties using the IDs.
That was it if I recall.

They wrote a small script with the logic involved in the merging. PKs and FKs of only one database had to be incremented by an offset of max(table.pk) + safe margin.

They did this for each table.
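For anyone curious, the offset logic can be sketched roughly like this. It is not their actual script: it is a minimal illustration using SQLite in memory, with a made-up two-table schema (`users`, `orders`), a hypothetical `merge_with_offset` helper, and an arbitrary margin of 1000. It also assumes the offset is computed from the max PK of the database being kept (DB A) and that DB B's IDs are positive.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY,
                     user_id INTEGER REFERENCES users(id),
                     item TEXT);
"""


def make_db(users, orders):
    """Build an in-memory database with the toy schema and some rows."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO users VALUES (?, ?)", users)
    db.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    db.commit()
    return db


def merge_with_offset(db_a, db_b, margin=1000):
    """Copy DB B's rows into DB A, shifting B's PKs and FKs by a per-table offset.

    DB A keeps its identifiers untouched. Each table gets its own offset of
    max(pk in A) + margin, so shifted B ids cannot collide with A's ids.
    """
    offsets = {}
    for table in ("users", "orders"):
        (max_id,) = db_a.execute(
            f"SELECT COALESCE(MAX(id), 0) FROM {table}"
        ).fetchone()
        offsets[table] = max_id + margin

    # Parent table first: shift the primary key only.
    for uid, name in db_b.execute("SELECT id, name FROM users"):
        db_a.execute(
            "INSERT INTO users (id, name) VALUES (?, ?)",
            (uid + offsets["users"], name),
        )

    # Child table: shift its own PK by the orders offset and its FK by the
    # users offset, so each order still points at the same (moved) user.
    for oid, user_id, item in db_b.execute("SELECT id, user_id, item FROM orders"):
        db_a.execute(
            "INSERT INTO orders (id, user_id, item) VALUES (?, ?, ?)",
            (oid + offsets["orders"], user_id + offsets["users"], item),
        )

    db_a.commit()
    return offsets
```

The returned offsets are worth keeping around: anything external that stored a DB B identifier can be translated by adding the same per-table number.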

Once this script was tested multiple times with subsets of each database, they stopped production and ran the script against it (with backup fallbacks). A small downtime window on a Sunday.

And that was it. The databases never had to pay the UUID tax, before or after.

>they stopped production

Oh I see, we're talking about two entirely different worlds here, lol.

Not being able to stop a production database for a very short window once in a lifetime is another exceptionally rare business case.

I've seen architecture astronauts make their business pay unreasonable tech insurance premiums by adding complexity just to avoid pausing production for a few minutes, when pausing would have been much cheaper.

And from my understanding, in the case I mentioned, they chose to stop production to simplify the process. But they didn't have to.

A mixture of replication plus code changes to write in two databases could also have solved the issue.

Most businesses die because they can't move fast enough. Not because their production database stopped for a few minutes.

Stopping production on db B isn't really a requirement, just makes it easier.
I swear, it really reads like "oh you like SOME TECHNOLOGY? we'll see how you like it when FARCICALLY RARE EVENT happens"
I know right?

"If your architecture can't withstand life threatening solar flares, third world war, sabotaging of undersea cables and 1 billion concurrent users can you even call yourself an engineer?"