by Glemllksdf 69 days ago
I tried Gemma 4 A4B and was surprised how hard it is to use for agentic stuff on an RTX 4090 with 24 GB of VRAM.

Balancing the KV cache against context length is tricky: between them they eat VRAM super fast.
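
For a rough sense of the scale, here is a back-of-the-envelope sketch of how KV cache size grows with context. The layer count, KV head count, and head dimension below are hypothetical placeholders, not the actual Gemma configuration, and the formula assumes plain full attention at every layer with an fp16 cache (sliding-window layers or a quantized cache would need less).

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Estimate KV cache size for one sequence.

    Assumes full attention in every layer; bytes_per_elem=2 means fp16/bf16.
    """
    # Factor 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical model dimensions, chosen only for illustration.
for ctx in (8_192, 32_768, 131_072):
    size = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, context_len=ctx)
    print(f"{ctx:>7} tokens -> {size / 2**30:.1f} GiB")
```

With these made-up numbers the cache costs 128 KiB per token, so it goes from 1 GiB at 8k context to 16 GiB at 128k, and that is on top of the model weights, which have to fit in the same 24 GB.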