by throwbsidbdk 3493 days ago
It sounds like Uber has a database cluster for roughly every employee?!! Uber has a single product that largely centers around 1 app. How can this be necessary?

I have a feeling... that they're dumping realtime GPS data into a bunch of these when they should be using something like Cassandra...


As said in a previous blog post by Uber, they have thousands of microservices and over 8000 Git repos.

It's safe to assume that their infra is a giant clusterfuck :D

I would assume that they scaled by adding more and more engineers, who each end up working in their own corner on different problems, without any basic shared tooling, practices, or design. Things get out of hand quickly.

"Single product" which consists of more than X thousands of micro-services.
>The entire trip store, which receives millions of trips every day, now runs on Dockerized MySQL databases together with other stores

It really does sound like this is part of the issue. Millions of trips is billions of db rows per day. A document store is much more amenable to that kind of workload than MySQL.

Actually it is sort of the opposite... They needed better OLTP to deal with synchronizing billing... and they wanted schema flexibility. Document/column stores are traditionally not ideal for this because A+P (CAP theorem) are preferred over "C"onsistency (very broadly speaking).

They could have done it with a transactional message queue and a decent RDBMS (there are far bigger companies that use an RDBMS for far more transactions than Uber does), but they clearly did not have the in-house expertise for that (and they also wanted rapid schema changes).

Part of the problem is that Postgres was a little behind on scaling in previous years, but that has changed. IMO they could have stuck with Postgres by writing an add-on for it instead, but they found extending MySQL easier.

Uber essentially built their own custom document store on top of MySQL. They explain their design and reasons (and why they didn't use Cassandra etc) in this post: https://eng.uber.com/schemaless-part-one/
Okay thanks, that explains the why but doesn't make it sound less terrible. They essentially built a NoSQL database on top of MySQL. Forgivable years ago but this was in 2014...
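
For anyone wondering what "a document store on top of a relational database" can look like, here is a rough sketch of the general idea: JSON blobs stored as append-only, versioned cells in a single table. It is not Uber's code or schema. The table name `cells`, the column names, and the helpers `open_store`, `put_cell`, and `get_latest` are all made up for illustration, and it uses SQLite from the Python standard library instead of MySQL so it runs without a server.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database and create the (hypothetical) cells table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS cells (
            added_id    INTEGER PRIMARY KEY AUTOINCREMENT,
            row_key     TEXT    NOT NULL,  -- e.g. a trip identifier
            column_name TEXT    NOT NULL,  -- logical "column" within the row
            ref_key     INTEGER NOT NULL,  -- version number of the cell
            body        TEXT    NOT NULL,  -- schema-free JSON document
            UNIQUE (row_key, column_name, ref_key)
        )
        """
    )
    return conn


def put_cell(conn, row_key, column_name, ref_key, body):
    """Append a new cell version. Existing versions are never updated."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO cells (row_key, column_name, ref_key, body) "
            "VALUES (?, ?, ?, ?)",
            (row_key, column_name, ref_key, json.dumps(body)),
        )


def get_latest(conn, row_key, column_name):
    """Return the highest-versioned document for a cell, or None."""
    row = conn.execute(
        "SELECT body FROM cells "
        "WHERE row_key = ? AND column_name = ? "
        "ORDER BY ref_key DESC LIMIT 1",
        (row_key, column_name),
    ).fetchone()
    return json.loads(row[0]) if row else None


if __name__ == "__main__":
    store = open_store()
    put_cell(store, "trip-1", "BASE", 1, {"status": "requested"})
    put_cell(store, "trip-1", "BASE", 2, {"status": "completed", "fare": 12.5})
    print(get_latest(store, "trip-1", "BASE"))
```

The relational engine still provides transactions and the uniqueness constraint, while the document shape lives entirely in the JSON body, which is roughly the trade-off being argued about in this thread.
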
Which likely explains why they use their MySQL Schemaless engine[1].

1. https://eng.uber.com/schemaless-part-one/