by kiyoto
4402 days ago
>Need a beefy RDBMS for 15mm rows? Maybe if you want to store the whole denormalized table in memory, but if you're just indexing a small field (or even partial-indexing a larger field) you should have no problem.

Good point. Honestly, I don't have much experience using row-based RDBMSs for analytics (my background is mostly in finance, where folks use expensive proprietary columnar databases, and in Hadoop). Any good resources on testing the limits of MySQL/PostgreSQL for analytics?
That said, I agree that distributed columnar stores end up being much more useful for large-scale analytics, and the power of high computation parallelism seals the deal. We've mostly moved on from those snapshot MySQL databases to Impala running on top of our Hadoop cluster, so you're preaching to the choir :)
That said, a hell of a lot of analytics can be done in a properly structured SQL database, and schema changes aren't a big deal as long as you don't need to do them online in a production system.
More info: http://stackoverflow.com/questions/14733462/can-mysql-handle...
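For anyone who wants to poke at the "just index a small field" point, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL/PostgreSQL, so it says nothing about their actual limits; the `events` table, its columns, and the helper names are all made up for illustration. It only shows the basic idea: index a small integer column, then confirm from the query plan that a filtered aggregate uses that index rather than scanning the table.

```python
import sqlite3


def build_demo(n_rows=100_000):
    """Build an in-memory table with an index on a small field.

    The schema is hypothetical: `user_id` is the small indexed field,
    `payload` stands in for the wide, unindexed data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)"
    )
    conn.executemany(
        "INSERT INTO events (user_id, payload) VALUES (?, ?)",
        ((i % 1000, "x" * 50) for i in range(n_rows)),
    )
    # Index only the small field, not the wide payload column.
    conn.execute("CREATE INDEX idx_events_user ON events (user_id)")
    conn.commit()
    return conn


def count_for_user(conn, user_id):
    """Simple analytics-style aggregate filtered on the indexed field."""
    row = conn.execute(
        "SELECT COUNT(*) FROM events WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0]


def query_plan(conn, user_id):
    """Return SQLite's plan text for the aggregate above."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT COUNT(*) FROM events WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(str(r[-1]) for r in rows)


if __name__ == "__main__":
    conn = build_demo()
    print(count_for_user(conn, 42))
    print(query_plan(conn, 42))
```

To try the same thing against MySQL or PostgreSQL, you would swap in their own drivers and use their `EXPLAIN` output; the numbers and plan text will differ.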