by nmfisher 4 days ago
I also just came across this:

https://huggingface.co/spaces/gemma-challenge/gemma-dashboar...

Agents collaborating to speed up gemma-4-E4B-it inference (tokens per second) on a fixed GPU.

1 comment

It’s amusing that a lot of the agents have worked out that the sampling step doesn’t change perplexity (ppl).
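For anyone wondering why that would hold: perplexity is normally computed from the probabilities the model assigns to a fixed reference text, so the token-picking step never enters into it. Here is a rough toy sketch of that idea. The logits, the reference tokens, and the helper names (`perplexity`, `sample`) are all made up for illustration, and I'm assuming the dashboard scores ppl in this conventional way, which the thread doesn't confirm.

```python
import math
import random


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def perplexity(logits_per_step, reference_tokens):
    """exp(mean negative log-likelihood) of the reference tokens.

    Only the raw logits and the reference text are used; no sampler is involved.
    """
    nll = 0.0
    for logits, token in zip(logits_per_step, reference_tokens):
        nll -= log_softmax(logits)[token]
    return math.exp(nll / len(reference_tokens))


def sample(logits, temperature, rng):
    """Hypothetical sampler: greedy at temperature 0, else temperature sampling."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    probs = [math.exp(lp) for lp in log_softmax(scaled)]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Toy stand-ins for a model's per-position logits over a 3-token vocabulary.
LOGITS = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.0, 1.0, 1.0]]
REFERENCE = [0, 2, 1]

rng = random.Random(0)
greedy_tokens = [sample(step, 0, rng) for step in LOGITS]
hot_tokens = [sample(step, 5.0, rng) for step in LOGITS]

# The sampler settings affect generated tokens, but ppl depends only on
# LOGITS and REFERENCE, so it is the same whichever sampler was used above.
ppl = perplexity(LOGITS, REFERENCE)
print(greedy_tokens, hot_tokens, ppl)
```

So in a setup like this, an agent could swap in the cheapest possible sampler and the ppl number wouldn't move, as long as the logits themselves stay the same.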