Hacker News
by solaire_oa 58 days ago
Not all prompts require the same compute, and Gemma-4B runs on our phones with on-par output for ordinary 1-5 sentence queries. The common use case of Google-style queries is already solved locally; saying we're miles off is ridiculous.