|
|
|
|
|
by sitkack
817 days ago
|
|
In the second scenario, they can't do math. They could have bought themselves 6-18 months by getting the most powerful machine available using probably at most 1-2 salaries worth of those 40 people. Less a single digit percentage of workloads needs massive, hard to use horizontal scale out (for things that can solved on a single machine, or a single database). MR is useful as an adhoc scheduler over data. Need to OCR 10k files, MR it. Hadoop was the worst possible implementation of MR, wasted so much of everything. That was its primary strength. |
|
“Why on earth would you spend the cost of a brand new sedan on a drive like this?” I asked. Answer: to put the Oracle or DB2 WAL data on so you could vertically scale your database just that much higher while you tried to solve the throughput problems you were having another way. It was either the bargaining phase of loss or a Hail Mary you could throw in to help a behind-schedule rearchitecture. Last resort vertical scaling.