One of the points in the article is that this kind of behaviour modification is not done that often. Yeah, it happens, but not as much as you'd think.
I don't think you have that right. I read it as saying that this kind of behavior modification isn't done that often after load time. During load time, all sorts of shenanigans go on (monkey patching, etc.).
Well, you're right about that distinction, but from your comment "only on Tuesdays" I understood that you were talking about behaviour modification some time after "load time". Also keep in mind that the first paper cited in the article (profile-guided inference) actually points out that in most cases, what happens at "load time" can be inferred nearly statically: you would not need to run a test suite, you would only need to "load" the base code. This is a feasible task in most cases.