| HN Mirror

	by EddieJLSH 1054 days ago
	Realistically which production DB tables don't have a unique id? Genuine question, never used one in my life.

6 comments

dspillett 1054 days ago

Log analytics or warehouse tables often have no simple useful key for this sort of comparison.

Also, in a more general case you might be comparing tables that may contain the same data but have been constructed from different sources. Or perhaps a distributed dataset became disconnected and may have seen updates in both partitions, and you have brought them together to compare, to try to decide which to keep or whether it is worth trying to merge. In those and other circumstances there may be a key, but if it is a surrogate key it will be meaningless for comparing data from two sets of updates, so you would have to disregard it and compare on the other data (which might not include useful candidate keys).
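
As a rough illustration of comparing on "the other data", here is a minimal sketch that treats each table as a multiset of rows and ignores the surrogate key. The column names (`id`, `sensor`, `reading`) and the helper `diff_ignoring_key` are made up for the example, and it assumes every value is hashable.

```python
from collections import Counter

def diff_ignoring_key(rows_a, rows_b, surrogate="id"):
    """Compare two row sets as multisets, ignoring the surrogate key column."""
    def strip(row):
        # Drop the surrogate key and turn the row into a hashable tuple.
        return tuple(sorted((k, v) for k, v in row.items() if k != surrogate))

    count_a = Counter(strip(r) for r in rows_a)
    count_b = Counter(strip(r) for r in rows_b)
    # Counter subtraction keeps only positive counts, so duplicates are respected.
    return count_a - count_b, count_b - count_a

# Hypothetical data: same readings, different surrogate ids on each side.
side_a = [
    {"id": 1, "sensor": "t1", "reading": 20.5},
    {"id": 2, "sensor": "t2", "reading": 19.0},
]
side_b = [
    {"id": 101, "sensor": "t2", "reading": 19.0},
    {"id": 102, "sensor": "t1", "reading": 20.5},
    {"id": 103, "sensor": "t3", "reading": 7.25},
]

only_a, only_b = diff_ignoring_key(side_a, side_b)
print("only in A:", dict(only_a))
print("only in B:", dict(only_b))
```

Using a multiset rather than a plain set matters here: without a usable key, two identical rows are legitimately two rows, and a set would silently merge them.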

theodpHN 1053 days ago

Also, database tables where unique key constraints aren't enforced. Programming and operational mistakes happen. :-)

https://stackoverflow.com/questions/62735776/what-is-the-poi...
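
For the unenforced-constraint case, a quick way to see whether a supposedly unique column really is unique is a GROUP BY / HAVING check. This is only a sketch using an in-memory SQLite database; the `customers` table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: customer_id is meant to be unique, but nothing enforces it.
conn.execute("CREATE TABLE customers (customer_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (2, "b2@example.com")],
)

# Any customer_id that appears more than once violates the intended uniqueness.
duplicates = conn.execute(
    "SELECT customer_id, COUNT(*) FROM customers "
    "GROUP BY customer_id HAVING COUNT(*) > 1"
).fetchall()
print(duplicates)
```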

erinnh 1054 days ago

It happens. I’m currently working on a project where the CRM tool I need to pull data from does not actually have a unique id in its db. I have no idea yet whether I will be able to complete the project successfully.

justinclift 1054 days ago

Is there any chance that the rows actually do have a unique id, but it's not being displayed without some magic incantation?

Asking because I've seen that before in some software, where it tries to "keep things simple" by default. But that behaviour can be toggled off so it shows the full schema (and data) for those with the need. :)

erinnh 1053 days ago

Sadly, no.

The manufacturer is just really incompetent.

I was told their reason when asked was "it was easier (for us)".

justinclift 1053 days ago

> it was easier (for us)

That's not all that unusual when something gets implemented, as people tend to take the easy approach for things that meet the desired goal.

It just sounds like the spec they were writing to wasn't very clear, or it was just a checkbox list of features provided to them by marketing. So "let's get this list done then ship it". ;)

Dylan16807 1053 days ago

The question is whether it was actually easier.

Even a couple of minutes of extra debugging takes longer than learning how to add a synthetic primary key.
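
To give a sense of how little work that is, here is a minimal sketch in SQLite, where the simplest route is to copy the rows into a table that declares an INTEGER PRIMARY KEY. The `contacts` table is hypothetical and says nothing about how the CRM in question is built; other databases have their own syntax for this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical keyless table, including two rows that are exact duplicates.
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("Ann", "555-0100"), ("Ann", "555-0100"), ("Bob", "555-0199")],
)

# New table with a synthetic key; SQLite assigns contact_id automatically
# when the column is left out of the INSERT.
conn.execute(
    "CREATE TABLE contacts_keyed ("
    "contact_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
)
conn.execute(
    "INSERT INTO contacts_keyed (name, phone) SELECT name, phone FROM contacts"
)

rows = conn.execute(
    "SELECT contact_id, name, phone FROM contacts_keyed ORDER BY contact_id"
).fetchall()
print(rows)
```

After this, the two identical "Ann" rows can finally be told apart, updated, or deleted individually.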

valenterry 1054 days ago

For example, tables that store huge amounts of logs or sensor data, where IDs are not very useful and just increase space usage and decrease performance.

hobs 1053 days ago

PostTags in the published Stack Overflow schema - https://data.stackexchange.com/stackoverflow/query/edit/1772...

It happens a lot when people are implementing something quick and often happens in linking tables.
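
For linking tables, one common alternative to a separate id column is a composite primary key over the two foreign keys. The sketch below uses an invented `post_tags` table in SQLite; it is not a copy of the Stack Overflow schema, just an illustration of the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical linking table: the pair (post_id, tag_id) is the key.
conn.execute(
    """
    CREATE TABLE post_tags (
        post_id INTEGER NOT NULL,
        tag_id  INTEGER NOT NULL,
        PRIMARY KEY (post_id, tag_id)
    )
    """
)
conn.execute("INSERT INTO post_tags VALUES (1, 10)")
conn.execute("INSERT INTO post_tags VALUES (1, 11)")

# Linking the same post to the same tag a second time is rejected.
try:
    conn.execute("INSERT INTO post_tags VALUES (1, 10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("duplicate link rejected:", rejected)
```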

wodenokoto 1054 days ago

Don’t things like BigQuery always allow duplicates?

EddieJLSH 1054 days ago

Good point, have not used it before but looks like you have to add a unique ID if you want one
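
One way to "add a unique ID if you want one" is to stamp each row with a generated identifier before it is loaded. This is a generic sketch, not tied to any particular warehouse API; the `row_id` field and the `with_row_ids` helper are made up for the example.

```python
import uuid

def with_row_ids(rows):
    """Return copies of the rows with a generated 'row_id' field added."""
    return [{**row, "row_id": str(uuid.uuid4())} for row in rows]

# Hypothetical event rows; the second is an exact duplicate of the first.
events = [
    {"user": "ann", "action": "login"},
    {"user": "ann", "action": "login"},
]

tagged = with_row_ids(events)
for row in tagged:
    print(row)
```

Note that a random id only makes rows addressable; it does nothing to stop the same event from being loaded twice.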
