by PaulRobinson 639 days ago
Depends on the specific model and your perf requirements, but lots of them will run on a single box with a mid-range GPU. If your invocation rate is low, a hosted service like AWS Bedrock or another hosted API might be cheaper.
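To make the "low invocation rate" point concrete, here is a rough break-even sketch. Everything in it is an assumption for illustration: the function names are made up, the prices are placeholders rather than real AWS Bedrock or GPU rental rates, and it ignores ops time, idle scaling, and throughput limits. Plug in your own numbers.

```python
# Rough break-even estimate: always-on self-hosted box vs. pay-per-token hosted API.
# All names and figures here are hypothetical placeholders, not real pricing.

HOURS_PER_MONTH = 730.0  # average hours in a month


def monthly_self_hosted_cost(box_cost_per_hour: float,
                             hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Cost of keeping one GPU box running all month, regardless of traffic."""
    return box_cost_per_hour * hours_per_month


def monthly_hosted_cost(requests_per_month: float,
                        tokens_per_request: float,
                        price_per_1k_tokens: float) -> float:
    """Cost of a hosted API billed per 1,000 tokens."""
    return requests_per_month * tokens_per_request / 1000.0 * price_per_1k_tokens


def break_even_requests(box_cost_per_hour: float,
                        tokens_per_request: float,
                        price_per_1k_tokens: float,
                        hours_per_month: float = HOURS_PER_MONTH) -> float:
    """Requests per month at which both options cost the same."""
    cost_per_request = tokens_per_request / 1000.0 * price_per_1k_tokens
    return monthly_self_hosted_cost(box_cost_per_hour, hours_per_month) / cost_per_request


if __name__ == "__main__":
    # Placeholder inputs: replace with your actual quotes.
    threshold = break_even_requests(box_cost_per_hour=1.0,
                                    tokens_per_request=1000,
                                    price_per_1k_tokens=0.01)
    print(f"Hosted is cheaper below roughly {threshold:,.0f} requests/month")
```

Below the threshold the hosted API wins on raw cost; above it the always-on box starts to pay for itself, assuming it can actually keep up with the load.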