by cxie 473 days ago

Interesting to see AMD entering the small LLM space where practical compute constraints actually matter. These 3B models represent the pragmatic side of AI - not everything needs to be a 100B+ parameter behemoth burning through datacenter power.

The real test will be inference latency and throughput on consumer hardware, not just the cherry-picked benchmark graphs they've shared. Has anyone run comparative evals against Llama 3.2 3B or Gemma-2 on identical hardware yet?
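For anyone who wants to check this on their own machine, here's a rough sketch of a timing harness for time-to-first-token and decode throughput. It assumes your runtime can stream tokens as a Python iterable. `measure_stream`, `generate`, and `fake_model` are names I made up for illustration, not part of any model's or library's API.

```python
import time
from typing import Callable, Iterable


def measure_stream(
    generate: Callable[[str], Iterable[str]],
    prompt: str,
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Time one streamed generation.

    `generate` is a hypothetical adapter around whatever runtime you use;
    it must yield tokens one at a time as they are produced.
    """
    start = clock()
    first = None  # timestamp of the first token
    last = None   # timestamp of the most recent token
    count = 0
    for _ in generate(prompt):
        now = clock()
        if first is None:
            first = now
        last = now
        count += 1

    if first is None:
        # The model produced nothing, so there is nothing to time.
        return {"tokens": 0, "ttft_s": None, "tokens_per_s": 0.0}

    # Throughput covers the decode phase only: tokens after the first,
    # divided by the time elapsed since the first token arrived.
    decode_time = last - first
    tokens_per_s = (count - 1) / decode_time if decode_time > 0 else 0.0
    return {
        "tokens": count,
        "ttft_s": first - start,
        "tokens_per_s": tokens_per_s,
    }


if __name__ == "__main__":
    def fake_model(prompt: str):
        # Stand-in for a real model: emits 20 tokens with a fixed delay.
        for i in range(20):
            time.sleep(0.01)
            yield f"tok{i}"

    print(measure_stream(fake_model, "Hello"))
```

To compare models fairly you'd wrap each one in the same kind of adapter, use the same prompts and generation length, and average over several runs after a warm-up pass, since the first call often includes load and cache costs.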

The fully open approach (weights, hyperparams, training code) is refreshing compared to the "open weights only" trend we've been seeing. This is how you actually build a community around your tech stack.

Edge deployment is where this gets interesting - having truly open small models running locally on laptops, phones, and embedded devices without phoning home feels like the computing paradigm we should have been pushing for all along, instead of the current API-gated centralization.