by lsh123 4738 days ago
Disclaimer: I used to write Web apps in C++ in the late 90s/early 00s for the largest web company of the time.

1) The DB is often the bottleneck for web apps. Not always. But often. Very often. Optimizing page rendering will improve performance for pages that already render pretty fast. It will do nothing for pages that load slowly because of DB access. A better/smarter cache will give you much bigger bang for the buck.
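To make the caching point concrete, here is a minimal sketch of a read-through cache with a time-to-live. Everything in it is an illustrative assumption: `ReadThroughCache`, `slow_db_lookup`, and the 30-second TTL are made-up names and values, and a real setup would more likely use a shared cache such as memcached than an in-process dict.

```python
import time


class ReadThroughCache:
    """Tiny in-process cache with a time-to-live, keyed by lookup argument."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader      # function that actually hits the DB
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: go to the DB and remember the answer.
        self.misses += 1
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


db_calls = []


def slow_db_lookup(user_id):
    # Hypothetical stand-in for an expensive query; it only records the call.
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = ReadThroughCache(slow_db_lookup, ttl_seconds=30.0)
for user_id in [1, 2, 1, 1, 2, 3]:
    cache.get(user_id)

print(f"requests=6 db_calls={len(db_calls)} hits={cache.hits}")
```

In this toy run, six requests reach the DB only three times. How much a cache helps in practice depends entirely on the hit rate and on how stale the data is allowed to get.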

2) For the annual salary of a good C++ programmer (i.e. one who can write safe and readable code with STL and Boost), I can run 50 large AWS instances for a year.

3) Writing safe C++ code is hard. Even with STL and Boost you still have to understand the little details of object ownership (see boost::enable_shared_from_this<> as an example).

Overall, I would invest in a distributed system with smart and efficient caching instead of trying to optimize single-server performance. In the end, you will outgrow the single-box solution (if you are successful). Going from 1 server to 2 servers is really hard. Going from 10 to 100 is not too bad.

1 comments

Disclaimer: I’m not a C++ guy; I generally write in Ruby and PHP.

I agree with your overall point — C++ may not be a good fit for web development, and there’s a lot to be said for being able to scale across multiple servers (and choice of platform will determine how quickly you have to solve that problem).

That said, my experience is that for most projects the performance bottleneck is the framework, not the DB. On big PHP projects, loading lots of code on every page load takes a huge amount of time, and many frameworks do things like parse multiple XML files on every request.

Similarly, Rails is slow… just take a look at how much time is spent outside of the database for a typical request. It’s crazy how long it takes, considering that most requests boil down to a DB query and then some string concatenation.

Another angle on this is the common pattern of having a bunch of app servers and just a few DB servers. That implies that the DB is not the main bottleneck.

Again, I agree with your overall thesis, I just have to disagree with the common wisdom that is point 1.

In my statement I kind of assumed that all the "cheap" optimizations in the framework are already done (e.g. PHP APC cache is enabled, lazy classes/configs loading is implemented, etc.) so we compare "apples-to-apples": a highly optimized C++ framework to a highly optimized, say, PHP framework (can't comment about Ruby - didn't have experience building/running large-scale apps on it). And of course, it all depends on the application itself. Fetching single "narrow" rows by primary key is obviously cheap and nobody cares. However, having to dig through a table larger than 500G with multiple indexes is not so cheap.

And the reason why you don't see a lot of DB servers is that it is HARD to scale a DB by just adding servers (I am ignoring for a second non-SQL servers w/o joins as well as high-end Oracle and new MySQL-Galera solutions). An ACID-compliant SQL DB is a single-server affair unless you invest heavily in the DB itself, since it is really hard to run a DB cluster even from a purely operational standpoint.

> In my statement I kind of assumed that all the "cheap" optimizations in the framework are already done (e.g. PHP APC cache is enabled, lazy classes/configs loading is implemented, etc.) so we compare "apples-to-apples": a highly optimized C++ framework to a highly optimized, say, PHP framework

I'm kind of skeptical, to be honest. I suspect a simple, unoptimized C++ application wouldn't have a lot of problems keeping up with a highly optimized PHP framework. Interpreters do so much extra work.

Not that I would use C++ for web stuff; that's 99% string munging, and string munging in C++ is how you wind up with your desperate last words mockingly quoted on seclists.org.

Assume your DB query takes 10 secs. It's irrelevant if C++ code takes 0.01 sec when your PHP code takes 0.1 sec.
Now why would I assume that? There are times when you need to generate a big report, and just the SQL can take that long or more. But typical web requests for typical applications are not like that. DB queries shouldn't take more than a couple hundred milliseconds for "typical" application requests.

Even allowing for that, the comparison is not necessarily correct. In some cases, perhaps the C++ code would have taken 0.1s where the Ruby code takes 5s. In fact, 50x is about the slowdown you used to see for Ruby in the programming languages shootout, and it is still pretty bad. [1]

[1] http://benchmarksgame.alioth.debian.org/u32/benchmark.php?te...
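The numbers in this exchange can be put side by side with a little arithmetic. This is only a sketch: the 10 s, 0.1 s, 0.01 s and 5 s figures come from the comments above, while 0.2 s is an assumed stand-in for "a couple hundred milliseconds", and the model simply adds DB time and app time with no overlap.

```python
def total_seconds(db_seconds, app_seconds):
    # Simplest possible model: a request is DB time plus app/framework time.
    return db_seconds + app_seconds


def speedup(db_seconds, slow_app, fast_app):
    # End-to-end speedup from swapping the slow app code for the fast one.
    return total_seconds(db_seconds, slow_app) / total_seconds(db_seconds, fast_app)


scenarios = [
    ("10 s query, PHP 0.1 s vs C++ 0.01 s", 10.0, 0.1, 0.01),
    ("0.2 s query, PHP 0.1 s vs C++ 0.01 s", 0.2, 0.1, 0.01),
    ("0.2 s query, Ruby 5 s vs C++ 0.1 s", 0.2, 5.0, 0.1),
]

for label, db, slow, fast in scenarios:
    print(f"{label}: {speedup(db, slow, fast):.2f}x faster end to end")
```

Under these assumptions the three cases come out at roughly 1.01x, 1.43x and 17.3x, so whether app-side speed matters depends almost entirely on which DB time you think is typical.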

Sure, I was considering cheap optimizations — APC is helpful, but not that helpful.

Most queries in most projects are cheap.

On the other hand, frameworks that have ORMs and other DB abstraction layers often trade a little bit of DB time (and developer time) for a lot of framework time.

And as you say, scaling the DB is a lot harder than scaling the app servers (typically) — the implication being that trading framework performance for developer time makes a lot of sense.

Well, all applications are different, so I can't comment on your experience. For me, APC gives at least a 2-3x improvement in the max number of non-DB requests per second. Personally I think that is a lot.

As with any technology, ORM is just a tool. Used wisely, it works great. Use it w/o understanding what you are doing and you get a recipe for disaster.
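One common way an ORM goes wrong when used blindly is the "N+1 queries" pattern, where lazy loading issues one query per parent row. The sketch below is an assumption about the kind of misuse meant here, not something stated in the thread. It uses plain `sqlite3` rather than a real ORM, and the `authors`/`posts` schema and the `run` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1'), (4, 3, 'c1');
""")

counter = {"queries": 0}


def run(sql, params=()):
    # Hypothetical helper: count every statement sent to the DB.
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()


def titles_lazy():
    # What naive lazy loading tends to do: 1 query for authors + 1 per author.
    result = {}
    for author_id, name in run("SELECT id, name FROM authors ORDER BY id"):
        rows = run(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_joined():
    # Same data in a single round trip using a join.
    result = {}
    rows = run(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


counter["queries"] = 0
lazy = titles_lazy()
lazy_queries = counter["queries"]

counter["queries"] = 0
joined = titles_joined()
joined_queries = counter["queries"]

print(f"lazy: {lazy_queries} queries, joined: {joined_queries} query")
```

Both functions return the same data, but the lazy version's query count grows with the number of authors. With three rows nobody notices; with thousands it can dominate the request.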

"Most queries in most projects are cheap" is a true assumption for small projects. Once you cross a certain threshold, none of the non-cached queries are cheap (by non-cached I mean not cached in any layer, including the DB itself).

And I strongly disagree with "trading framework performance for developer time makes a lot of sense". In 99% of cases you DON'T care about performance. Yet a "bad" framework adds a penalty in 100% of cases. I strongly prefer to have good-enough performance from the framework and then focus on the 1% of cases where it matters (and in the majority of those cases, it's the DB, not the app code).