by brucethemoose2 847 days ago
Counterpoints:

- Local models are pretty easy to de-censor, if that's what you mean.

- ...Yeah, it should not be labeled as a 7B. It's sort of 7B class.

- The repo mentions they use the llama-cpp-python server.

- 1M context brute forced across TPUs is insanely expensive, I can see why Google reined it in.
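
To put a rough number on the 1M context point, here is a back-of-the-envelope KV cache estimate. The dimensions (28 layers, 16 KV heads, head dim 256, fp16) are my own illustrative assumptions for a 7B-class model, not figures from the repo or from Google, and `kv_cache_bytes` is just a helper made up for this sketch. It also ignores attention compute, which grows even faster with context length.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elem):
    """Approximate KV cache size for one sequence.

    Each token stores one key and one value vector (hence the factor of 2)
    per layer, per KV head.
    """
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem
    return tokens * per_token


# Assumed, illustrative 7B-class dimensions (not taken from the thread).
LAYERS, KV_HEADS, HEAD_DIM, FP16_BYTES = 28, 16, 256, 2

for context in (8_192, 131_072, 1_000_000):
    size = kv_cache_bytes(context, LAYERS, KV_HEADS, HEAD_DIM, FP16_BYTES)
    print(f"{context:>9} tokens: {size / 1e9:.1f} GB of KV cache")
```

Under those assumptions a single 1M-token sequence needs several hundred GB of cache on its own, so it has to be sharded across many accelerators before you serve even one user.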

But overall your message is not wrong. Google is hyping Gemma a ton when it's... well, not very remarkable. And they certainly could have made something niche and interesting, like a long context 8.5B model, a specialized model, or a vastly more multilingual model, something to differentiate it from Mistral 7B v0.2.