That's only because Kimi 2.5 was trained using data stolen from Claude. It wouldn't exist without riding Claude's coattails. None of the so-called 'open source' models would.
That's not true. Some open-weight models didn't distill Claude or other then-frontier models. E.g. Llama. Yet it achieved comparable performance (back then, in Llama's case).
If distillation weren't a thing, they would certainly still exist; they would have been trained from scratch or from a decent base model to remain economically viable.
What's for sure is that Claude wouldn't exist if it weren't for data stolen from millions of creators, which they admittedly found themselves guilty of.
> E.g. Llama. Yet it achieved comparable performance (back then, in Llama's case).
At no point in history did Llama ever achieve real-world performance comparable to frontier models. I was around. At best, they were earlier at benchmaxxing than the others.
Boo hoo. Claude was trained using data stolen from the collective works of all of humanity. If someone does it faster and cheaper by skimming the cream off the top of Claude, then surely that’s just market efficiency in the thieves' business?