I suppose the thing to do is analyse your app for the average string length, and just recompile your Ruby with that. Would be even better if it were a command-line parameter.
This isn't quite right. Even if your average string length is 1k+, you shouldn't change the embedded string size to 1k+. I think these objects sit on the C stack internally, which doesn't handle large objects like this well.
Also, I would guess the performance gains (from skipping malloc) would wash out the longer your average string gets -- even if the huge stack use doesn't kill your performance for some other reason (blowing the d-cache?).
I don't think these strings ever sit on the C stack, except maybe if some C code/extension is being really clever. The standard representation for variables is a tagged pointer as far as I know, so I would assume that is all that goes on the stack. This optimization probably just saves another level of indirection.
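To make the layout argument concrete, here is a minimal sketch in C of what an embedded-string optimization generally looks like. It is not MRI's actual struct: the names SketchString, sketch_new, sketch_cstr and sketch_free are made up for illustration, and EMBED_CAPACITY is an assumed value, not Ruby's real limit. The point is only that the object lives on the heap either way, the caller holds a single pointer, and embedding saves the second allocation and the second pointer hop.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified layout; NOT MRI's real string struct. */
#define EMBED_CAPACITY 24 /* assumed inline buffer size in bytes, including the NUL */

typedef struct {
    int embedded;  /* 1 if the bytes live inline in the object itself */
    size_t len;    /* string length, excluding the terminating NUL */
    union {
        char inline_buf[EMBED_CAPACITY]; /* short strings: no second malloc */
        char *heap_ptr;                  /* long strings: separate buffer */
    } as;
} SketchString;

/* The object is heap-allocated; the caller only ever holds a pointer to it. */
SketchString *sketch_new(const char *src) {
    size_t len = strlen(src);
    SketchString *s = malloc(sizeof *s);
    if (!s) return NULL;
    s->len = len;
    if (len < EMBED_CAPACITY) {
        /* Fits inline: one allocation, one level of indirection. */
        s->embedded = 1;
        memcpy(s->as.inline_buf, src, len + 1);
    } else {
        /* Too long: second allocation, second level of indirection. */
        s->embedded = 0;
        s->as.heap_ptr = malloc(len + 1);
        if (!s->as.heap_ptr) {
            free(s);
            return NULL;
        }
        memcpy(s->as.heap_ptr, src, len + 1);
    }
    return s;
}

/* Return the bytes regardless of which representation is in use. */
const char *sketch_cstr(const SketchString *s) {
    return s->embedded ? s->as.inline_buf : s->as.heap_ptr;
}

void sketch_free(SketchString *s) {
    if (!s) return;
    if (!s->embedded) free(s->as.heap_ptr);
    free(s);
}
```

In this sketch, raising EMBED_CAPACITY to 1k+ would not put anything extra on the C stack, but it would make every string object that large, including the short ones, since the inline buffer is part of the struct.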