by polotics 545 days ago
Well yes, locally, if you assume that someone has about $300,000 of hardware at hand... right? Since you are not paying for Gemini, may I ask why? Did you try it and find it inferior?
3 comments

I bought two (relatively) old datacenter GPUs with 48 GB of VRAM total for €200, which gets me 7 tokens/s on a 70B model.
Which GPUs?
Not the GP, but I bought a few P40s over the summer for $150 each. Last I checked they're more expensive now, but it's still cheap VRAM and fast enough at inference for me.
Nvidia M40 and P40.
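For anyone wondering how a 70B model fits in 48 GB of VRAM, here is a rough back-of-the-envelope sketch. It counts weight memory only and ignores the KV cache and runtime overhead. The bit widths are my assumption, since the comment above does not say which quantization was used, and the function names are made up for illustration.

```python
# Rough estimate of weight memory for a dense model.
# Assumption: memory ~= parameter count * bits per weight / 8.
# KV cache, activations and framework overhead are NOT included.

VRAM_GB = 48  # total VRAM quoted in the thread


def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 10^9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         vram_gb: float = VRAM_GB) -> bool:
    """True if the weights alone fit in the given amount of VRAM."""
    return weight_memory_gb(params_billions, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    # Hypothetical bit widths; the original comment does not state one.
    for bits in (16, 8, 4):
        size = weight_memory_gb(70, bits)
        verdict = "fits" if fits(70, bits) else "does not fit"
        print(f"70B at {bits}-bit: ~{size:.0f} GB of weights, {verdict} in {VRAM_GB} GB")
```

Under these assumptions only something around 4-bit leaves headroom on 48 GB, so the setup above is presumably running a quantized model.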
You actually can't pay for the latest models; they're only available for free, with limits.
Gemini for coding does not work for me. It gets so many things wrong.
You should try again. Gemini rates highest on coding at LMArena.
Which Gemini AI model did you use?