by dudefeliciano 468 days ago
What hardware are you running those on? Is it still prohibitively expensive to self-host a model that gives decent output? (Sorry, my last experience with Llama a while back was underwhelming.)

I'm tinkering with Gemma 3 27B on a last-gen 12-core Ryzen. I get 5 tokens/sec.
I have an AMD 6700 XT card with 12 GB of VRAM and a 24-core CPU with 48 GB of RAM. This is the bare minimum,