Hacker News new | ask | show | jobs
by jiggawatts 370 days ago
Not the original commenter, but I've read through half a dozen post-mortems about this kind of thing. The answer is: yes. There's challenges and sometimes downtime and/or breaking changes are inevitable.

For one, if your IDs are approaching the 2^31 signed integer limit, then by definition, you have nearly two billion rows, which is a very big DB table! There are only a handful of systems that can handle any kind of change to that volume of data quickly. Everything you do to it will either need hours of downtime or careful orchestration of incremental/rolling changes. This issue tends to manifest first on the "biggest" and hence most important table in the business such as "sales entries" or "user comments". It's never some peripheral thing that nobody cares about.

Second, if you're using small integer IDs, that decision was probably motivated in part because you're using those integers as foreign keys and for making your secondary indexes more efficient. GUIDs are "simpler" in some ways but need 4x the data storage (assuming you're using a clustered database like MySQL or SQL Server). Even just the change from 32-bits to 64-bits doubles the size of the storage in a lot of places. For 2 billion rows, this is 8 GB more data minimum, but is almost certainly north of 100 GB across all tables and indexes.

Third, many database engines will refuse to establish foreign key constraints if the types don't match. This can force big-bang changes or very complex duplication of data during the migration phase.

Fourth, this is a breaking change to all of your APIs, both internal and external. Every ORM, REST endpoint, etc... will have to be updated with a new major version. There's a chance that all of your analytics, ETL jobs, etc... will also need to be touched.

Fun times.

1 comments

> For one, if your IDs are approaching the 2^31 signed integer limit, then by definition, you have nearly two billion rows

Just wanted to nitpick this; this is not actually definitively true. A failed insert in some systems will increment the counter and deleting rows usually does not allow the deleted ID to be re-used (new inserts use the current counter). Of course, that is beside the point: the typical case of a table approaching this limit is a very large table.

It's actually fairly common to see this problem crop up in systems that are using a database table as a queue (which is a bad idea for many reasons, but people still do it) in which case the number of live rows in the table can be fairly small.
If a SQLServer instance is killed unceremoniously it adds 1000 to the pk increment.
I'm trying not to imagine the poor SQL Server that has crashed one or two million times and hence pushed the ID values into the billions!
haha—somewhere out there it’s crashing right now. Keep it in your thoughts.