by charlesju 6331 days ago
I hate it when people tell me that Rails doesn't scale, because I have scaled Rails and the bottleneck has nothing to do with the framework.

The reason websites can't scale is ALWAYS the DB. The fix is always the same sequence: master/slave replication, memcached, then sharding. For EVERY language.
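
For the sharding step, a minimal sketch of routing a key to a shard, assuming plain hash-and-modulo routing. The shard names and the `shard_for` helper are made up for illustration; they are not from the article or from any Rails plugin.

```ruby
require "zlib"

# Hypothetical shard names; a real setup would map these to DB connections.
SHARDS = %w[users_db_0 users_db_1 users_db_2 users_db_3].freeze

# Hash the key and take it modulo the shard count, so the same key
# always routes to the same shard.
def shard_for(key, shards = SHARDS)
  shards[Zlib.crc32(key.to_s) % shards.size]
end

puts shard_for("user:42")
```

Plain modulo routing remaps most keys when the number of shards changes, so treat this as a starting point rather than a finished scheme.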

Good article, sums up a lot of the tools I used to scale my stuff.

I'd like to add a little more.

1. Turn on slow-query logging in your DB and tail -f the slow query log. Find the slow queries and eliminate them in your code, either by adding indices or by replacing one long query with several fast ones.

2. Cache most reads, using both application-level caching and memcached.

3. Turn off associations; they don't play well with caching.

4. If you're using memcached, don't use the plugins.

5. If you're a serious startup, use Engine Yard; they're life savers.
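
Point 2 can be sketched as a read-through cache. This is only an illustration: a Hash stands in for memcached, and `ReadThroughCache`, its `fetch`/`delete` methods, and the TTL handling are hypothetical names, not the API of any memcached client.

```ruby
# Read-through cache. ASSUMPTION: an in-memory Hash stands in for memcached.
class ReadThroughCache
  attr_reader :hits, :misses

  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @store = {}
    @hits = 0
    @misses = 0
  end

  # Return the cached value for key if it has not expired; otherwise run
  # the block (the slow DB read), cache its result, and return it.
  def fetch(key, now = Time.now)
    entry = @store[key]
    if entry && entry[:expires_at] > now
      @hits += 1
      return entry[:value]
    end
    @misses += 1
    value = yield
    @store[key] = { value: value, expires_at: now + @ttl }
    value
  end

  # Drop a key after a write so the next read reloads it from the DB.
  def delete(key)
    @store.delete(key)
  end
end

cache = ReadThroughCache.new(60)
db_reads = 0
2.times do
  cache.fetch("user:1") do
    db_reads += 1 # stands in for the real query
    { id: 1, name: "example" }
  end
end
puts "db reads: #{db_reads}, hits: #{cache.hits}, misses: #{cache.misses}"
```

The second read is served from the cache, so the block that stands in for the DB query runs only once.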

3 comments

Rails running on vanilla Ruby does have particular scaling issues due to the limitations of the Ruby garbage collector. You hit a GC cycle after every 8MB of allocation, and each cycle typically takes around 150ms to run, so GC can dominate the runtime of your requests. We've seen 80% of request runtime spent in GC, even when you include database time.

This isn't necessarily a killer, but it means there's sometimes more work than you might like in scaling. There are patches to tune the garbage collector, and JRuby etc. may help, but ultimately you need to be much more aware of memory allocation than you might think.
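
One rough way to see how much of a piece of work goes to GC is to count GC runs and GC time around it. A sketch, assuming a modern Ruby (2.1 or later, so newer than the vanilla Ruby discussed here) where `GC.count`, `GC::Profiler`, and `Process.clock_gettime` are available; `gc_cost` is a made-up helper name.

```ruby
# Run a block and report how many GC runs it triggered, how long the
# profiler says GC took, and the total wall-clock time.
def gc_cost
  GC::Profiler.enable
  GC::Profiler.clear
  runs_before = GC.count
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  {
    gc_runs: GC.count - runs_before,
    gc_seconds: GC::Profiler.total_time, # seconds spent in GC while profiling
    elapsed_seconds: elapsed
  }
ensure
  GC::Profiler.disable
end

# Allocate a lot of short strings, roughly what building a big response does.
report = gc_cost { 200_000.times.map { |i| "row-#{i}" } }
puts report
```

Comparing `gc_seconds` with `elapsed_seconds` gives a rough GC share for that block. The numbers depend heavily on the Ruby version and GC settings.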

That's an application bottleneck, which is fixed by adding more application servers. It has no effect on scaling whatsoever.

And by scaling here, we're talking about 1 M to 10 M, not 10,000 to 100,000.

"5. If you're a serious startup, use Engine Yard, they're life savers."

- If you're a serious startup that doesn't monetize off of display ads, use Engine Yard. Otherwise, you can't afford them.

If you're a serious startup, you're not monetizing off display ads.

:)

Actually, this is good advice regardless of what language you're using. The database is almost always at fault. My general rule is that a query should take less than 0.001 seconds to execute. If you can get it down to that through proper indexing and whatnot, MySQL will execute the queries fast enough that they don't get backed up and eventually kill the server. Don't start caching until you can write a proper query. Your site probably doesn't need caching if you are writing your queries correctly.
... and be less afraid of de-normalizing your data!