Hacker News
by eordano 2197 days ago
It seems that SHARD is a term used already in 1988, meaning "System for Highly Available Replicated Data"

Source: https://scholar.google.com/scholar?cluster=14914487445955020...

2 comments

As I pointed out a while back (https://news.ycombinator.com/item?id=22974882), the "SHARD" system described in that paper didn't actually have anything to do with "sharding" as the term is currently used. It was designed to replicate data, but it didn't do any kind of partitioning; each replica stored a copy of the entire dataset.

For that reason (in addition to the low number of citations), I think it's very likely that the name is a total coincidence. Pretty much any word you can think of has been used by somebody as an acronym for some project.

I'm the person you replied to in that thread, and in support of your point: after that discussion, I spent some time crawling through the proceedings of Very Large Databases (VLDB) and the ACM Digital Library, and I could find no instances of "shard" used to mean the partitioning of a database prior to 2001. (The 2001 paper is "Minerva: An automated resource provisioning tool for large-scale storage systems" in Transactions on Computer Systems, free-to-read at https://dl.acm.org/doi/abs/10.1145/502912.502915.)

On the other hand, I found many papers citing the SHARD paper - more than the official count. That's a difficulty with citation counts of old papers: a lot of the papers citing it are also old papers, and we're not consistent at tracking the citations of old papers. Personally, I don't have a conclusion. The SHARD paper is decently cited, and its usage is close to the modern one. On the other hand, I can't find any smoking gun pre-1997 usage of "shard" in the modern meaning.

Interesting, thanks for putting a lot more effort into answering this question than I did!
While interesting, I don’t think this old paper led to the popularization of the term. It only has 12 citations!
Looking at a single academic paper's citation count won't reveal much about a term's historical currency.

For example, there are also papers in Google Scholar (findable via query [database shard], through 1987) mentioning this same SHARD system in 1986 and 1987, with 33 and 97 citations respectively. And further, there's a 1986 MIT technical note that mentions a commercially-in-development version of this SHARD system, but refers back to a 1985 paper, "System architecture for partition-tolerant distributed databases", as an authoritative source about SHARD – though that 1985 paper doesn't declare the name SHARD.

That suggests SHARD was adopted as a catchy name for that particular work around 1985-1986, and then became more widespread in the 1986-1988 timeframe.

But perhaps more interesting: that original 1985 paper mentions in its acknowledgements Hector Garcia-Molina – a definite 'hub' person in databases/indexing/networked-information for decades, among many other things Google cofounder Sergey Brin's advisor at Stanford from 1993-1997. (See: https://en.wikipedia.org/wiki/H%C3%A9ctor_Garc%C3%ADa-Molina...)

So it's likely safe to assume that from the late 80s into the 90s, top CS students/researchers around the world discussing partitioned distributed databases would often have had this particular sense of SHARD mentioned to them, or have come across it in their readings.

Notably, that 1985 SHARD involved a system where each replica contained the entire database – so it did not capture the modern connotations of 'shard', as horizontal partitions. But that vivid & apropos analogy was "in the air" around partitioned/distributed databases.

Thus I'd strongly suspect uses in the modern, non-overlapping sense in that same era, likely predating Ultima Online's 1996ish use. (I'd especially look around precursor work to the 1997 'Consistent Hashing' paper, & other caching-centric work – because there the idea of partition-by-key was central.)

So UO might have devised, but I'd guess more likely popularized, our modern sense of 'sharding'.

It's interesting to learn more about the history here. When UO came out and used the shards concept I just assumed it was a callback to 1988's Ultima 5 and the shards of Mondain's gem. Linguistic history sure is blurry and organic!
It was exactly that, a callback. In fact, the game opening cinematic recounted the story.