Is there a difference here between a secret and an unknown? It may well be that some researcher or computer engineer had an idea, tried it out, realized it was incredibly powerful, implemented it for real, and then published the findings once they were sure of it.
I'm more of a mechanical-engineering-adjacent professional than a programmer, and I only follow AI developments loosely.
We also have no indication that sparse models outperform their dense counterparts, so it's scale either way.