| HN Mirror



	by ghshephard 111 days ago
	Would any of the open-weight models from smaller labs exist if they couldn't distill from the SoTA models, whose developers are throwing billions of dollars of compute into pretraining?

2 comments

daniel_iversen 111 days ago

I’ve been wondering the same. And I think pretty much all the impressive small-lab models were guilty of it, right? At least there are still larger players like DeepSeek and Mistral to provide a bit of diversity in the market.


username223 111 days ago

Does it matter? The frontier models stole the whole internet, then the second-level models stole from them… It’s all theft.


ghshephard 108 days ago

Oh - I 100% could not care less regarding the morality/legality/whatever... Everyone trains on everything.

I'm just wondering if the smaller labs would see the same velocity of advances without SOTA models to generate terabytes of training data?

Hard agree.

The question is - if the SOTA models disappear - do these follow-on models have the ability to improve themselves without distillation?


andsoitis 110 days ago

> The frontier models stole the whole internet

What does that even mean?
