What do you expect? People building software around models other than the ones they develop themselves? Or people training models for software that isn't their own?
It's like saying official car repair shops should repair any type of car, not just their brand. That's just not how the real world works.
Or they just don't even do it? Michelin tyres aren't best fitted by a Michelin shop; those don't exist. You go to KwikFit or whatever you think's best, and get Michelin or Continental or Pirelli or whatever you think's best fitted.
Terrible analogy, because the issues involved in fitting car tires of different brands are in no way comparable to the differences between LLM behavior across models.