Does the 400-500% figure hold for more expensive operations inside the block? It seems like he's just comparing the cost of a proc call to the cost of 1+1. I don't think the generalization has been established here.
As with all things, it depends on context. If you're trying to write a graphics or audio system in Ruby, this kind of thing can really matter. If you're writing a Rails app, it's a rounding error next to the time spent waiting on I/O.
Profiling is essential when figuring out how to improve your code. What we're doing here is explaining one weird benchmark result.
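If anyone wants to check the "does it generalize" question on their own machine, here's a rough sketch using the stdlib `Benchmark` module. It is not the benchmark from the article: `CHEAP`, `HEAVY`, `N`, and `ratio` are names I made up, and the heavier body (summing 1..100) is just an arbitrary stand-in for "more expensive work". It times each body written directly in the loop block versus going through an extra `Proc#call`.

```ruby
require "benchmark"

# Hypothetical workloads (not from the original benchmark):
CHEAP = proc { 1 + 1 }                 # trivial body, like the 1+1 case
HEAVY = proc { (1..100).reduce(:+) }   # arbitrary stand-in for heavier work

N = 200_000 # iteration count, chosen only so the run finishes quickly

# Each pair runs the same body N times: once written directly in the
# loop block, once through an extra Proc#call.
RESULTS = {
  cheap_inline: Benchmark.realtime { N.times { 1 + 1 } },
  cheap_proc:   Benchmark.realtime { N.times { CHEAP.call } },
  heavy_inline: Benchmark.realtime { N.times { (1..100).reduce(:+) } },
  heavy_proc:   Benchmark.realtime { N.times { HEAVY.call } }
}

# How many times slower the proc version was than the inline version.
def ratio(inline_seconds, proc_seconds)
  proc_seconds / inline_seconds
end

puts format("cheap: inline %.4fs  proc %.4fs  ratio %.2fx",
            RESULTS[:cheap_inline], RESULTS[:cheap_proc],
            ratio(RESULTS[:cheap_inline], RESULTS[:cheap_proc]))
puts format("heavy: inline %.4fs  proc %.4fs  ratio %.2fx",
            RESULTS[:heavy_inline], RESULTS[:heavy_proc],
            ratio(RESULTS[:heavy_inline], RESULTS[:heavy_proc]))
```

The thing to compare is the two ratios, not the absolute times. Numbers will vary by Ruby version and machine, and a single `Benchmark.realtime` pass is noisy, so run it a few times before drawing conclusions.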