by 3pt14159 3670 days ago
I've been doing data analysis and machine learning client work for quite some time now, for clients ranging from a 3-person startup to a department of the Canadian government.

Almost always NumPy matrix math plus Cython, C, or Java on a single machine is enough. Not always-always, but it usually is if you can relax requirements slightly: say by accepting a 45-minute lag before new data impacts the total model, or by caching the results of the top 10k most likely queries, or by putting more effort into stripping out the garbage parts of the data, or, sometimes, by just throwing a $10k-a-month server or a mathematician at the problem (which sure is cheaper than a bunch of cheap servers plus a larger infrastructure team).
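To make the "cache the top 10k most likely queries" idea concrete, here is a minimal sketch. It assumes you have a log of past queries and some expensive function that answers one; `slow_model`, `build_hot_cache`, and `answer` are hypothetical names for illustration, not anything from a real library.

```python
from collections import Counter


def build_hot_cache(query_log, compute, top_n=10_000):
    """Precompute results for the top_n most frequent queries in the log."""
    counts = Counter(query_log)
    return {query: compute(query) for query, _ in counts.most_common(top_n)}


def answer(query, hot_cache, compute):
    """Serve from the precomputed cache when possible, else compute on demand."""
    if query in hot_cache:
        return hot_cache[query]
    return compute(query)


def slow_model(query):
    # Hypothetical stand-in for an expensive model evaluation.
    return sum(ord(ch) for ch in query)


# Tiny example log; a real one would come from production traffic.
log = ["a", "b", "a", "c", "a", "b"]
cache = build_hot_cache(log, slow_model, top_n=2)
result = answer("c", cache, slow_model)  # not cached, so computed on demand
```

The cache can be rebuilt on a schedule (for instance every 45 minutes, matching the lag mentioned above), so the expensive work happens off the request path for the queries that matter most.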

The times you need real scalability, you know you need it. You'd laugh at how silly someone would be for trying to put it onto one machine. You're solving the travelling salesman problem for UPS (I can think of some hacks here, though I probably couldn't get it down to a single machine), or you're detecting logos in every YouTube video ever made, or you're working for the NSA.

Even if you know for sure you're going to need scalability, I don't think it hurts to just do it on a single box at first. Iterating quickly on the product is more important, and once you have something proven you can get money from the market or from VCs to distribute it.


This is kind of the same argument as microservices.

We could write 30 microservices deployed on 30 Docker images with load balancing, fault tolerance, and all that magic for a basic webapp...

Or we could just write a pretty fast webserver and do it with 1 server. (Or if it is stateless, do it with a few for still a lot less work than a giant microservice cluster).

I think in the last year or so microservices have become a little less cool, and people are more along the lines of "code cleanly so we can microservice if we need to down the road, but don't deploy it like that for 1.0"... seems similar for this.

People forget that the two methods of scaling are HORIZONTAL and VERTICAL. They think: "I can just put some microservices behind HAProxy and boom, more capacity!"

And then they forget that if they had just modified that one query and tweaked that one for-loop, they could've had that same capacity without launching six new servers, with all kinds of potential for the wiring to go down and cause downtime, plus the dev time to build the services.

I usually think of scaling in 3 methods: horizontal (add more machines), vertical (get a bigger machine) and inward (improve your algorithms).

The largest performance gains I've seen are most often of the inward type -- like multi-thousands of percent, just by rethinking the code or the approach.

Bonus: improving inward scaling multiplies the return on any investment in horizontal and vertical scaling.
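As an illustration of the "inward" kind of gain, here is a small sketch of the classic case: the same result computed with a quadratic scan and with a set lookup. The function names and the ID-matching scenario are made up for the example.

```python
def common_ids_quadratic(a, b):
    """O(len(a) * len(b)): scans the list b once for every element of a."""
    return [x for x in a if x in b]


def common_ids_linear(a, b):
    """O(len(a) + len(b)): builds a set once, then does constant-time lookups."""
    b_set = set(b)
    return [x for x in a if x in b_set]


orders = [3, 7, 11, 42, 99]
refunds = [42, 5, 3, 1000]
slow = common_ids_quadratic(orders, refunds)
fast = common_ids_linear(orders, refunds)
```

Both return the same list, but with two inputs of 100k elements each the first version can do up to roughly 10^10 comparisons while the second does on the order of 2 x 10^5 operations. No extra server buys that kind of factor.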

Vertical scaling requires hiring good engineers instead of mediocre ones (additional cost of $100,000s per year across the team). Horizontal scaling in comparison is much, much cheaper for your average CRUD app.

Maybe, maybe not.

For example choosing Java over Ruby would give you 2-10x better perf per server... and I am not sure that Java devs cost any more or less than Ruby devs.

Now we can get into an argument about developer productivity and all that... but from a purely "I want to run 10x more users per server" standpoint, something like Java over Ruby gets you a long way.

I talked to a CTO once who said he brought his RoR fleet down from 60 servers to 6-8 by switching to Scala.

When he said "switching to Scala" I reckon he probably meant "rewriting our platform."

It's very difficult to do a comparison like this in practice, because switching languages inherently involves a rewrite of a platform.

I was once involved in a large-scale government project that rewrote a Java app to RoR. They went from 50 servers to 10.

It has probably got more to do with the rewrite and the new architecture than whatever language it was written in.

> I am not sure that Java devs cost any more or less than Ruby devs

I don't know about that, how many bootcamps are pumping out Java programmers instead of Python/Ruby programmers?

Well in Chicago some of the colleges are pumping out Java Devs...

I guess if you are comparing a 4 year college Java dev to a 6 week bootcamp Ruby dev you have something, but that seems weird...

If you haven't seen Stack Overflow's server architecture posts, you'd probably enjoy them: https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...

I still think containers are great, and I really like the abstractions Kubernetes provides, but if I ever had enough traffic to worry about scaling, I envision running a small cluster of very powerful computers rather than 100s of weak ones.

So basically, through careful engineering, they can run stackoverflow.com, one of the top-ten sites in the world, on a single web server and a single database server. Beefy machines for sure, but it kind of takes the air out of the web-scale hype balloon.

Yeah, I love this as a counterpoint to too much scaling.

Stack Overflow has pretty good uptime and pretty good performance; can't fault the end result.

> one of the top-ten sites in the world

49th actually: http://www.alexa.com/siteinfo/stackoverflow.com

Physical servers they host themselves, mind you.

Huh? Why are they running Windows?

I don't know about the rest of the (initial) team, but Joel was lead on Excel as I recall. If you have a group of devs that know the NT kernel, SQL Server, and IIS very well, why would you not use the MS stack?

Don't get me wrong, I prefer Linux and Free software myself, but that doesn't mean the thousands of man-years Sybase and MS have spent on their stack is wasted effort.

Because they thought it was the better choice for them. You cannot argue with the result.

https://www.quora.com/Why-does-StackExchange-StackOverflow-u...

See also https://circleci.com/blog/its-the-future/, which is almost too true to be funny.

My solution was more about higher availability and staggered deployments: each of about 8 services (including web) would be deployed to 3 servers with identical config via Dokku, and a load-balancing nginx instance would point to the app.foo on each of the three servers. A deployment would update/cycle nginx, deploy to the first server, roll over, then deploy to the other two.

That was the plan. (I won't go into the political bs at a company I no longer work with).
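For anyone curious, the staggered rollout could be planned roughly like this. It is only a sketch of the ordering, under my own reading of "roll over" as "send traffic to the freshly deployed server while the others are updated"; the step names and server names are hypothetical, and nothing here talks to nginx or Dokku.

```python
def rolling_deploy_plan(servers):
    """Return ordered (action, targets) steps for a staggered deployment.

    Assumed ordering: take the first server out of rotation, deploy to it,
    route traffic to it alone, deploy to the remaining servers, then put
    every server back behind the load balancer.
    """
    first, rest = servers[0], list(servers[1:])
    return [
        ("drain", [first]),            # load balancer stops sending traffic here
        ("deploy", [first]),           # new version goes to the first server
        ("route_to", [first]),         # roll over: only the updated server serves
        ("deploy", rest),              # update the remaining servers
        ("route_to", [first] + rest),  # all servers back in rotation
    ]


plan = rolling_deploy_plan(["s1", "s2", "s3"])
for action, targets in plan:
    print(action, ", ".join(targets))
```

The point of writing the plan as data is that the same ordering can be reused for each of the ~8 services, with the actual nginx and deploy commands plugged in per step.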

> throwing a $10k a month server or mathematician at the problem

Do you have any examples of problems at which you have thrown a mathematician or could imagine doing so?