It's odd because there seem to be two conflicting trends. On the one hand, you have people embracing (say) JavaScript as a server platform because it's easy to get something done, and simultaneously you have people designing for outlandish scale. In general, the 'get it done' mentality is the one that makes economic sense, because once you've added the pile of software that doesn't need to scale to the other piles where this long view doesn't matter, you have almost everything built. The other piles, for the record, include software that:

- is designed wrong, so it needs to be rewritten
- is obsoleted by changes in business direction (a project canceled, for example)
- gets replaced by something off-the-shelf or open-source
- is built for a startup that won't survive, or that gets acqui-hired, or that pivots to a wildly different thing

On the other hand, I sometimes see the opposite thing in heavily analytical work, where data science work is done in Python because it's "easy", and then a team of engineers builds a crazily complex pipeline to make the Python perform in some reasonable time frame. (Hi, Spark!) In my workplace, one example allocates bits of a job to roughly 100 machines, moving data to each, in a cloud environment where the data movement overhead is constantly fighting the benefits of distribution.
Having seen at least a couple of similar setups, I still suspect that this is, at its core, just a problem of ignorance of how "big" a single server one can make or get before even paying a premium.
However, even for the "largest" commodity servers, last I looked, the premium at the highest end (over linear price:performance) was only something like 4x.
There was some relevant discussion of single server versus distributed in subthreads of https://news.ycombinator.com/item?id=17492234 a few days ago.
> In my workplace, one example allocates bits of a job to roughly 100 machines, moving data to each, in a cloud environment where the data movement overhead is constantly fighting the benefits of distribution.
I'm confident that cloud environments contribute to hardware ignorance, since cloud providers offer a very limited choice of options, and I have yet to see anything high end.
This is especially frustrating for me with networking options: high bandwidth is either nonexistent (nothing beyond 10Gb/s on AWS until recently, and still only 40Gb/s max, AFAIK) or expensive, and low-latency options like InfiniBand don't seem to exist either, even at the now low/obsolete bandwidths of 16 or 32Gb/s.
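For a rough sense of how data movement can eat the benefit of distribution, here is a back-of-envelope sketch in Python. Everything in it is an assumption for illustration, not a measurement of any real setup: the function names, the per-core processing rate, the core counts, and the link speeds are all hypothetical, and the model ignores scheduling, serialization, and stragglers, all of which would make the distributed case look worse.

```python
# Back-of-envelope model: one big server vs. many small nodes.
# All names and numbers here are hypothetical, chosen only to illustrate
# the trade-off; they are not measurements of any real system.

def single_server_seconds(data_gb, gb_per_core_second, cores):
    """Time to process data already local to one server."""
    return data_gb / (gb_per_core_second * cores)


def distributed_seconds(data_gb, gb_per_core_second, cores_per_node,
                        nodes, node_link_gbit, source_egress_gbit):
    """Time to ship the data out to the nodes, then process it in parallel.

    Assumes the transfer is limited by whichever is slower: the combined
    ingress of all nodes or the egress of the data source.
    """
    effective_gbit = min(nodes * node_link_gbit, source_egress_gbit)
    transfer = data_gb * 8 / effective_gbit  # GB -> Gbit
    share_gb = data_gb / nodes               # each node's slice
    compute = share_gb / (gb_per_core_second * cores_per_node)
    return transfer + compute


if __name__ == "__main__":
    # Hypothetical workload: 1 TB at 1/16 GB per core-second.
    single = single_server_seconds(1000, 0.0625, cores=64)
    cluster = distributed_seconds(1000, 0.0625, cores_per_node=4, nodes=100,
                                  node_link_gbit=10, source_egress_gbit=40)
    print(f"single 64-core server: {single:.0f} s")
    print(f"100 x 4-core nodes:    {cluster:.0f} s")
```

With these made-up inputs, the cluster has more than six times the cores of the single server but finishes only slightly sooner, because most of its time goes to moving the data. Changing `source_egress_gbit` is the quickest way to see how much the network bound dominates.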