Hacker News new | ask | show | jobs
by jeff-davis 1123 days ago
"There are no real world natural keys — only someone else’s surrogate key... SSN, VIN, email address, UPC, tax numbers, country codes, language codes, URIs… they’re all just someone else’s made up identifier."

Every time this topic comes up, there are major definitional problems, and it seems like either surrogate keys or natural keys get defined out of existence.

The author is drawing the line such that natural keys are surrogate keys that are out of your control, and surrogate keys are in your control. That doesn't seem to match the definition from wikipedia: https://en.wikipedia.org/wiki/Surrogate_key

I agree with the author that control over the ID is an important distinction, but if you're going to make up a new ID, why not just say so? "Surrogate key" just introduces confusion -- it makes it sound like something only programmers and database admins care about, which is definitely not true. If customers are seeing it, everyone in the organization should care that it's in a recognizable pattern (so you don't mix up different IDs from different places) and memorable and easy to understand when spoken.

1 comments

VIN is a good example of how this stuff gets pretty crazy. The standards define it so as to be unique, but only unique within defined n year ranges of model years (VIN identifies vehicles), where n depends, however, on the vehicle type, which can be determined from some specific sub-fields of the VIN. So, multiple vehicles may be on the road with the same VIN if at least one has survived more than n model years (which map flexibly and only approximately to ordinary calendar time). In addition, sometimes a manufacturing error produces non-identical vehicles with the same VIN because of blunder by human, machine, or both in concert. These scenarios are resolved by IT means similar to the method used by Mark Twain and his parents to avoid confusion between him and his mythical twin brother.