Hacker News
by joshmn 2045 days ago
Rails makes it really easy to do something 10 different ways and get the same result. Unfortunately, most of those ways aren't the most performant. In my 10 years of building Rails apps of all different sizes, and seeing some very mature apps in production, this is the most common culprit I've seen.

I currently work on a rather large Rails app for a site that most of us here use. A common pattern in our performance pitfalls is something like this:

  Tag.all.map(&:name)
versus

  Tag.all.pluck(:name)
Using `#map` will instantiate a `Tag` object for every row and do all the slow(er) metaprogramming to get you the syntactic sugar that makes interacting with an ActiveRecord object a treat. It does all this, and then you only hit `#name`, completely wasting that effort.

`#pluck` will change the SQL from `select * from tags` to `select tags.name from tags` and never instantiate a `Tag` object, instead short-circuiting and directly fetching the resulting SQL rows, which come back as an array. It's something along the lines of:

  ActiveRecord::Base.connection.exec_query(Tag.select(:name).to_sql)
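
To make the cost difference concrete without needing a database, here is a minimal plain-Ruby sketch. `ToyTag` is a hypothetical stand-in, not ActiveRecord: its `all` and `pluck` only mimic the shape of the real methods, and the counter exists purely to show how many row objects each path builds.

```ruby
# Toy stand-in for a model class; NOT ActiveRecord. It only counts how many
# row objects get built, which is the wasted work described above.
class ToyTag
  ROWS = [
    { id: 1, name: "ruby" },
    { id: 2, name: "rails" },
    { id: 3, name: "sql" }
  ].freeze

  class << self
    attr_accessor :instantiated

    # Shaped like `Tag.all`: wraps every row in a full object.
    def all
      ROWS.map { |row| new(row) }
    end

    # Shaped like `Tag.pluck(:name)`: reads one column, builds no objects.
    def pluck(column)
      ROWS.map { |row| row.fetch(column) }
    end
  end
  self.instantiated = 0

  attr_reader :id, :name

  def initialize(row)
    self.class.instantiated += 1
    @id = row[:id]
    @name = row[:name]
  end
end

via_map = ToyTag.all.map(&:name)
built_by_map = ToyTag.instantiated   # one object per row

ToyTag.instantiated = 0
via_pluck = ToyTag.pluck(:name)
built_by_pluck = ToyTag.instantiated # no objects at all
```

Both paths return the same names, but only the `map` path pays for an object per row. In a real app the gap is wider than this toy suggests, since each ActiveRecord object also carries attribute typecasting and change tracking.
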
Another one I see:

  ProgrammingLanguage.where(tag_id: @user.tags.select(&:language_tag?).map(&:id))
versus

  ProgrammingLanguage.where(tag_id: @user.tags.select(:id).where(type: 'LanguageTag'))
The first example loads `@user.tags` if they're not already `#loaded?`, loops over them in Ruby to select the ones that are `type == 'LanguageTag'`, and instantiates every tag only to grab its `#id`.

The second example never loads the tags itself. Passing a relation to `#where` embeds it as a subquery (roughly, it calls `#to_sql` on the inner relation), building one query from two queries.

Are there times when the first example would be preferred? Yeah, plenty! If your result is already `#loaded?`, then you probably don't need to hit the database again. But for this example and the ones I'm stumbling across while getting our company up-to-speed on "good ways", these are the commonalities.

Until very recently, the company I work for hasn't put emphasis on real-world Ruby/Rails skills, instead going with "if you can code at a high level in any language, we think you can make reasonable contributions to our codebase." This has led to hiring skilled developers who just don't know that there's a preferred way of doing things in Rails for different contexts.

Double-edged sword, really.

4 comments

I would refactor your second example into an EXISTS using Arel, because at best the IN will result in the same performance. At worst it will be significantly slower. There are also particular issues with NOT IN and NULL. This is at least true in PG.
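
The NOT IN and NULL issue can be sketched in plain Ruby by modelling SQL's three-valued logic, with `nil` standing in for NULL/UNKNOWN. The helper names `sql_not_in` and `where_not_in` are hypothetical and written only for this illustration; they are not part of Arel or ActiveRecord.

```ruby
# Models `value NOT IN (list)` under SQL three-valued logic.
# Returns true, false, or nil (nil stands in for UNKNOWN).
def sql_not_in(value, list)
  return true if list.empty?            # NOT IN over an empty set is TRUE
  return nil if value.nil?              # NULL compared with anything is UNKNOWN
  return false if list.include?(value)  # a definite match makes IN TRUE
  return nil if list.any?(&:nil?)       # no match, but a NULL might have matched
  true
end

# WHERE keeps a row only when the predicate is TRUE, so UNKNOWN rows are dropped.
def where_not_in(rows, list)
  rows.select { |value| sql_not_in(value, list) == true }
end

without_null = where_not_in([1, 2, 3], [2])      # => [1, 3]
with_null    = where_not_in([1, 2, 3], [2, nil]) # => []
```

A single NULL in the list (or in the subquery's result) makes every non-matching comparison UNKNOWN, so the filter silently returns nothing. NOT EXISTS does not have this behaviour because it only asks whether the subquery produced any row.
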

I also deal with a lot of scale, and the issues people have here don’t match my reality. I think people have issues and rather than looking at what is fundamentally happening with their call patterns, they jump to calling out Rails itself.

Rails does have some specific issues, but you’d have to go pretty deep to see them and boot times are terrible.

The general problem you describe - n + 1 queries caused by needless iterating over / instantiating Ruby objects when a simple SQL query would do - is certainly a very common newbie mistake, but it's just that: a newbie mistake.

Confusion here simply shouldn't be a problem for even a moderately seasoned developer, and if they do make such a mistake (because hey, we all make mistakes...) in performance-sensitive code they could quickly recognize it for what it is - a bug - and fix it.

If you're hiring junior developers, on the other hand? Sure! But you should know what you're getting, and your code review / mentoring process should get them straight.

I'm not sure I really understand how this is Ruby's or Rails' fault, unless your premise is "ORMs considered harmful" - in which case, ActiveRecord is far from alone here, and that's a different sort of conversation.

Your examples aren't really doing the same thing in two different ways. Map (and the block form of select, collect, each, and some others) iterates over an enumerable. Pluck and select with column names are Active Record methods that generate SQL.

It's a good example of choosing magic/brevity over expressiveness. You don't know that Tag.all.map calls SQL because it's not something you explicitly tell it to do. That's the real issue with Ruby & Rails. The magic lets you do some powerful stuff but sometimes it's hard to tell what exactly is happening.

That's why I said they produce the same result instead of saying they do the same thing. If you read further in my example, I mention that they perform very differently under the hood.

That's a consequence of using ORMs. ORMs are a horrible performance mess. Just straight up write SQL. I'm sure Ruby has enough metaprogramming magic to not have to deal with cursors manually.