by hirako2000 90 days ago
That's not true: some open-weight models didn't distill Claude or other then-frontier models. E.g Llama. Yet achieved comparable performance (back then in llama's case).

If distillation weren't a thing, they would certainly still exist; they would have been trained from scratch or from a decent base model to remain economically viable.

What's for sure is that Claude wouldn't exist if it weren't for data stolen from millions of creators, something they admittedly found themselves guilty of.

1 comment

> E.g Llama. Yet achieved comparable performance (back then in llama's case).

At no point in history did Llama ever achieve real-world performance comparable to frontier models. I was around. At best, they got into benchmaxxing earlier than the others.