by redis_mlc 2275 days ago
Banks use an ancient but powerful architecture called Source of Truth (SoT), or system of record. It's one of the first techniques developed to manage heterogeneous distributed databases.

One master is picked as the final authority, then changes flow to other databases, replicas, etc. either using database tools or applications.
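
A minimal sketch of that flow, assuming an in-memory master with an ordered change log and a replica that pulls from it. The class names, the `sync` method, and the mortgage-rate values are hypothetical and only for illustration; real setups would use database replication or an application-level pipeline.

```python
# Hypothetical sketch: one authoritative store, changes replayed to downstream copies.
class SystemOfRecord:
    def __init__(self):
        self.data = {}
        self.log = []  # ordered change log of (seq, key, value)

    def write(self, key, value):
        # Every change lands here first and gets a sequence number.
        seq = len(self.log) + 1
        self.data[key] = value
        self.log.append((seq, key, value))
        return seq


class Replica:
    def __init__(self):
        self.data = {}
        self.applied_seq = 0  # last change applied from the master's log

    def sync(self, master):
        # Apply only the changes this replica has not seen yet, in order.
        for seq, key, value in master.log[self.applied_seq:]:
            self.data[key] = value
            self.applied_seq = seq


master = SystemOfRecord()
reporting = Replica()
master.write("mortgage_rate", 3.25)
master.write("mortgage_rate", 3.10)
reporting.sync(master)
print(reporting.data)
```

The replica never accepts writes of its own; it only replays what the master has already decided.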

It helps to know the deadlines, i.e. how much latency each downstream pipeline can tolerate before it needs the updates.

A practical example that solves a simple case is the recent release of a tool by Netflix.

Pro tip: when you interview people for a project like this, they should already know the above. :)

Source: DBA.

https://en.wikipedia.org/wiki/Single_source_of_truth

https://en.wikipedia.org/wiki/System_of_record

4 comments

I get the feeling OP is the one being interviewed.

Mortgage rates change frequently.

The bank likely deals with this exact scenario on a regular basis and should have an established procedure in place.

Source of Truth should (in my opinion) be used in many sectors but even finance doesn't use it everywhere. Having each data point originate in only one system helps a lot with data quality but is complex to introduce in existing systems.
No, no interview. He's a friend of mine but I kept his problem in my head to possibly provide some solution.
You underestimate how unknowledgeable a small financial institution can be.
I was in a client meeting a while back where one of their consultants corrected me with 'system of record' when I said source of truth. Your comment made me wonder where this distinction comes from. Do you know the history behind this?
Some good answers in this thread but I can flesh this out more.

The distinction and purpose are clearer when you think about aggregating data, as the various coronavirus trackers are doing. This is a modern example but the scenario applies as far back as humans have been recording data. (And will apply as far forward as well, because physics)

Right now every hospital or testing center is recording the tests they give and their results. Plus they record how many inpatients, how many ICU beds, how many ventilators, etc.

Each hospital has at least one database where that data is recorded.

Separately various news orgs and political offices are keeping track of all of the counts for their region, etc. So some number of times a day they call each of the hospitals to get their counts. (Ok, they don't actually call, but you get the idea). And they aggregate those counts.

Depending on the time that the organizations call the hospitals, a given hospital will give them a different count. The count As Of a particular time.

So different aggregation organizations may all have slightly different counts throughout the day.
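
A rough sketch of that "as of" effect, with made-up hospital names, hours, and counts (none of these come from a real tracker). Each hospital keeps its own running record, and an aggregator's total depends on when it asks.

```python
from bisect import bisect_right

# Hypothetical data: each hospital's record as (hour, cumulative_count), sorted by hour.
hospital_records = {
    "hospital_a": [(8, 10), (12, 14), (16, 19)],
    "hospital_b": [(9, 5), (13, 7)],
}


def count_as_of(history, hour):
    """Latest recorded count at or before `hour`, or 0 if nothing is recorded yet."""
    hours = [h for h, _ in history]
    i = bisect_right(hours, hour)
    return history[i - 1][1] if i else 0


def aggregate_as_of(records, hour):
    """Total across all hospitals as seen by an aggregator polling at `hour`."""
    return sum(count_as_of(history, hour) for history in records.values())


morning_total = aggregate_as_of(hospital_records, 10)
afternoon_total = aggregate_as_of(hospital_records, 14)
print(morning_total, afternoon_total)
```

Two aggregators polling the same hospitals at hour 10 and hour 14 both report correct totals, yet the totals differ.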

You, in following the progression of the cases, will have picked some particular source that you trust to give you a reliable count.

So the hospital, in keeping records, is a System of Record.

The news orgs, as sources of trustworthy information for consumers, are Sources of Truth.

The two seem synonymous at first blush, and colloquially when people treat them interchangeably they are really referring to Source of Truth, but on reflection it should make sense that they serve quite different purposes and have quite different requirements. And once you see the distinction you see it everywhere. Hope that helps.

It totally depends on the instance:

Was it a piece of atomic data at the system of generation? Likely SoR

Was it aggregated, mastered, or otherwise derived data? Likely Source of Truth

SoR = Where it came from

SoT = Where you get it from

FWIW I've heard both used interchangeably (in banking, adtech, and other tech verticals).

Edit: the linked Wikipedia articles distinguish the terms by defining SSOT as a single place to store and edit every data element, while SOR allows for replicas, but in the case of data discrepancies, the SOR's state wins.
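
To make the "SOR's state wins" rule concrete, here is a small sketch assuming both copies are plain dictionaries. The `reconcile` function and the account keys are hypothetical, not taken from either article.

```python
_MISSING = object()  # sentinel so a missing key differs from a stored None


def reconcile(system_of_record, replica):
    """Return the corrected replica contents and the keys that disagreed.

    On any discrepancy the system of record's state wins, including keys
    that exist only in the replica (they are dropped).
    """
    all_keys = system_of_record.keys() | replica.keys()
    discrepancies = sorted(
        key
        for key in all_keys
        if system_of_record.get(key, _MISSING) != replica.get(key, _MISSING)
    )
    return dict(system_of_record), discrepancies


sor = {"acct_1": 100, "acct_2": 250}
replica = {"acct_1": 100, "acct_2": 240, "acct_3": 5}
fixed, diffs = reconcile(sor, replica)
print(fixed, diffs)
```

Under a strict SSOT there would be no second copy to reconcile at all; this step only exists because the SOR model tolerates replicas.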

It's probably just what they call it.
That's been my experience as well. People call it different things. I have met some who are downright religious about the name and chide you for not using their term, but in the end it is the same thing.

I have had people try to explain to me how they are different, but when you remove all their BS, it is the same thing.

I have no idea about this particular case, but for a lot of older concepts the root of divergent terminology is IBM. IBM invented its own lingo for lots of things, and persisted in using it even when the rest of the world settled on a different term. (This seems to be a habit companies pick up when they become huge and dominant; see also Micro-speak and Google-speak.)

The thing everybody else called a "hard disk," for instance, was long referred to by IBMers as either a "Winchester disk" (see https://en.wikipedia.org/wiki/History_of_IBM_magnetic_disk_d... for the etymology on that one) or a "fixed disk."

Cf Emacs' "windows", "frames", "kill", "yank", etc...
> A practical example that solves a simple case is the recent release of a tool by Netflix.

Which tool is this?

What do non-banks use?
The same. The concept of a system of record is fairly universal in the Enterprise IT world.
Thanks - that's what I thought but OP's phrasing confused me.