Other companies were allegedly distilling the models by training on the reasoning output. Hiding the reasoning tokens makes this harder. You can still try to distill the models, but you can't distill the reasoning itself as well.
This could also all be optics, an attempt to give the appearance of a defensible moat. E.g. they can claim to investors that they are able to protect a significant chunk of their intellectual property this way. I'm not sure if anyone has studied how much summarizing the reasoning actually hinders distillation.
Not revealing the actual thinking traces prevents model distillation on the actual output (thinking traces are a key part of the output), which makes it harder for competitors to catch up (a moat).
Being currently in the lead in a category is not a moat; a moat is whatever creates a barrier to competitors catching up when you are in the lead. Merely being in the lead is not a moat except in a market with strong network externalities.