by drewda 1688 days ago
Yes, which is why it's amusing in hindsight that for a decade everyone* outside Google was forcing all* their distributed data tasks into the MapReduce paradigm, without considering alternative approaches like the one used by Spanner.

* slight exaggerations, I know


I'm not sure how you think a distributed data processing technology would "fake-out" other companies when building/choosing database technology. They are totally different problem sets.

MapReduce does not have a fixed data source or sink; it can read from and write to multiple systems, such as Bigtable and Spanner, so they are complementary technologies.

I think the parent commenter might be referring to systems like Hive or HBase, which are built on top of Hadoop and do overlap a lot with a large-scale database system.
Exactly, thanks. "MapReduce paradigm" wasn't precise, and I can see why that makes everyone want to give a distributed systems 101 lecture in this comment thread :)
HBase isn’t really related to MapReduce though, more akin to BigTable.
It’s not even related. No one was running OLTP workloads as MapReduce jobs at any point.
Spanner didn’t exist in 2012.
Maybe, but this 2010 presentation mentions it.

https://cloud.google.com/files/storage_architecture_and_chal...

At that time, only aristocrats could use Spanner.
Yes it did. Google published a paper about it in 2012, and claimed that at that point it had been in development for 5 years and in production for more than 1.
Mmm, I worked at Google in 2012 and F1 was in active development.

I never heard of Spanner internally. Maybe it was in development, but it was not in use.

Edit: went and read more. Looks like Spanner existed but didn’t have SQL, so it wasn’t what it is today. And looks like I don’t remember things any more.

At least by 2013 it had some basic SQL (although a much more limited subset than the one usable today); if you needed more, you would be using F1. IIRC I used Spanner (without F1) in production around 2013.