by netdur 13 days ago
Had a good run with Gemma 4 E2B (Unsloth 4-bit quant): https://youtube.com/shorts/XLsAnz5aAAI

The E4B model doesn't fit on my phone's TPU, so it swaps to RAM. The QAT version means better accuracy, which is good!

2 comments

How were you getting anything useful out of that? We found the (unquantized!) E2B model to be completely useless at even the simplest real-world classification tasks.
How do you know it swaps to RAM rather than staying on the TPU?

Would be interested in testing this on my Pixel.

Because the TPU has 2 GB, and the weights plus context need more than that.
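
For anyone who wants to sanity-check that kind of claim, here is a rough back-of-envelope sketch in Python. It only adds quantized weight size to a standard KV-cache estimate and compares the total to a memory budget. Every number in the example call (parameter count, layer count, KV heads, head dimension, context length) is a made-up placeholder, not the real Gemma config, and the 2 GB budget is read as 2 GiB. It also ignores activations and runtime overhead, so real usage would be higher.

```python
GIB = 1024 ** 3

def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV-cache size: keys and values for every layer and token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

def fits(budget_bytes, n_params, bits_per_weight, **kv_config):
    """Return (fits_in_budget, total_bytes) for weights plus KV cache."""
    total = weight_bytes(n_params, bits_per_weight) + kv_cache_bytes(**kv_config)
    return total <= budget_bytes, total

if __name__ == "__main__":
    # Placeholder values for illustration only, NOT the actual model config.
    ok, total = fits(
        budget_bytes=2 * GIB,   # assumed accelerator memory budget
        n_params=4_000_000_000, # hypothetical parameter count
        bits_per_weight=4,      # 4-bit quantization
        n_layers=30,
        n_kv_heads=2,
        head_dim=256,
        context_len=8192,
    )
    print(f"total = {total / GIB:.2f} GiB, fits = {ok}")
```

Plug in the real config values from the model card to get a meaningful answer. If the total exceeds the budget, some of it has to live somewhere else, which matches the RAM explanation above.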