by SwellJoe 20 days ago
They already provide E2B and E4B, which run on (much) smaller devices, including tablets and phones. This fills the gap in the middle. The bigger Gemma 4 models are excellent for their size, but at 8-bit quantization they need about 64GB of VRAM or unified memory, or about 48GB at 6-bit. Quantize any lower than that and they start to get notably dumber. So a 12B is interesting for that middle ground.
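
For a rough sense of where a 12B lands, here is a back-of-the-envelope sketch. It assumes weight memory scales linearly with bits per weight, which matches the 64GB to 48GB drop above (64 × 6/8 = 48). The function name is made up for illustration, and the numbers cover weights only: KV cache, activations, and runtime overhead come on top and depend on context length and the inference stack.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width, with no
    per-block scales or other quantization metadata.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Weights-only estimate for a 12B model at a few common bit widths.
for bits in (8, 6, 4):
    print(f"12B at {bits}-bit: ~{weight_memory_gb(12, bits):.1f} GB for weights")
```

So a 12B at 8-bit needs roughly 12GB for the weights, and about 9GB at 6-bit, before any cache or overhead.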