| HN Mirror



	by Eisenstein 327 days ago
	They don't release such huge open weights models because people who run open weights don't have the capability to run them effectively. Instead, they concentrate on models like Gemma 3, which ranges from 1B to 27B parameters and, when quantized, fits into the VRAM you can get on a consumer GPU.
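
A rough sketch of the arithmetic behind the "fits in VRAM when quantized" point, assuming the weights dominate memory use. The 24 GB card, the 2 GB allowance for KV cache and activations, and the function names are illustrative assumptions, not figures from the post.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    # params_billions * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billions * bits_per_weight / 8


def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if the weights plus an assumed overhead fit in the given VRAM."""
    needed = weight_memory_gb(params_billions, bits_per_weight) + overhead_gb
    return needed <= vram_gb


if __name__ == "__main__":
    # A 27B model at 16-bit vs. 4-bit on a hypothetical 24 GB consumer card
    for bits in (16, 4):
        size = weight_memory_gb(27, bits)
        print(f"27B @ {bits}-bit: {size} GB, fits in 24 GB: {fits_in_vram(27, bits, 24)}")
```

At 4 bits per weight a 27B model needs about 13.5 GB for weights, while the unquantized 16-bit version needs about 54 GB, which is why quantization is what makes the consumer-GPU case work.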

2 comments

lossolo 327 days ago

> They don't release such huge open weights models because people who run open weights don't have the capability to run them effectively

This is a naive take. There are multiple firms that can host these models for you, or you can host them yourself by renting GPUs. Thousands of firms could also host open-source models independently. They don’t release them because they fear competition and losing their competitive advantage. If it weren’t for Chinese companies open-sourcing their models, we’d be limited to using closed-source, proprietary models from the U.S., especially considering the recent LLaMA fiasco.


Eisenstein 327 days ago

Given the assumption that Google has Google's own interests at heart, the question isn't 'why doesn't Google release models that allow other companies to compete with them' but 'what is the reasoning behind the models they release' and that reasoning is 'for research and for people to use personally on their own hardware'.

We should be asking why Meta released the large Llama models and why the Chinese are releasing large models. I can't figure out a reason for it except prestige.


regularfry 327 days ago

That shouldn't be the case here. Yes, it's memory-bandwidth-limited, but this is an MoE with 22B active parameters. As long as the whole thing fits in RAM, it should be tolerable. It's right at the limit, though.
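
A back-of-the-envelope sketch of why the active parameter count matters for a bandwidth-limited MoE: each generated token has to read roughly the active weights from memory once, so memory bandwidth divided by that size gives an upper bound on decode speed. The 100 GB/s bandwidth and 4-bit quantization are assumed example values, and the function names are hypothetical.

```python
def bytes_read_per_token_gb(active_params_billions: float,
                            bits_per_weight: float) -> float:
    """GB of weights touched per generated token (active parameters only)."""
    return active_params_billions * bits_per_weight / 8


def max_decode_tokens_per_second(active_params_billions: float,
                                 bits_per_weight: float,
                                 bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed when memory bandwidth is the bottleneck.

    Ignores compute, KV-cache reads, and routing overhead, so real
    throughput will be lower than this estimate.
    """
    per_token = bytes_read_per_token_gb(active_params_billions, bits_per_weight)
    return bandwidth_gb_s / per_token


if __name__ == "__main__":
    # 22B active parameters at an assumed 4 bits per weight: 11 GB read per token
    per_token = bytes_read_per_token_gb(22, 4)
    # Assumed 100 GB/s system RAM bandwidth (hypothetical figure)
    rate = max_decode_tokens_per_second(22, 4, 100)
    print(f"{per_token} GB per token, at most {rate:.1f} tokens/s")
```

The total parameter count only determines whether the model fits in RAM at all; once it fits, this bound depends on the active parameters, which is the sense in which the speed should be tolerable.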
