
by mythz 648 days ago

It's a 236B MoE model with only 21B active parameters. ollama reports 258k downloads [1] (for the 16B and 236B variants combined), while Hugging Face says it was downloaded 37k times last month [2]. It can run at 25 tok/s on a single M2 Ultra [3].

At $0.14/$0.28 per million input/output tokens, it's a no-brainer to use their APIs. I understand some people have privacy concerns and want to avoid their APIs, but I personally spend all my time contributing to publicly available OSS code bases, so I'm happy for any OSS LLM to use any of our code bases to improve their LLM and, hopefully, also improve the generated code for anyone using our libraries.
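
To put those prices in perspective, here is a rough back-of-the-envelope sketch. It assumes the quoted figures mean $0.14 per million input tokens and $0.28 per million output tokens; the function name and the monthly workload are made up for illustration.

```python
# Assumed prices in USD per one million tokens (input / output).
INPUT_PRICE_PER_M = 0.14
OUTPUT_PRICE_PER_M = 0.28


def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API bill in USD for a given number of tokens."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical monthly workload: 50M tokens in, 10M tokens out.
monthly = api_cost(50_000_000, 10_000_000)
print(f"${monthly:.2f}")  # prints $9.80
```

Under those assumptions, even a fairly heavy month of usage comes out to under ten dollars.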

Since many LLM orgs are looking to build proprietary moats around their LLMs to maintain their artificially high prices, I'll personally make an effort to use the best OSS LLMs available first (i.e. from DeepSeek, Meta, Qwen or Mistral AI) since they're bringing down the cost of LLMs and aiming to render the technology a commodity.

[1] https://ollama.com/library/deepseek-coder-v2

[2] https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-In...

[3] https://x.com/awnihannun/status/1814045712512090281