Hacker News
by pshc 754 days ago
8B models quantized to 4 or 5 bits, with a short-to-medium context, might be shippable. Still, it’s going to require a nice GPU to hold all of that in VRAM. Plus you would have to support AMD. I would experiment with llama.cpp, as it runs on many architectures.

Hope your game doesn’t have a big texture budget.
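
For a rough sense of the numbers, here is a back-of-the-envelope sketch in Python. The architecture values (32 layers, 8 KV heads, head dimension 128, fp16 cache) are assumptions modeled on a Llama-3-8B-style model, not figures from any particular shipped build, and the function names are made up for illustration. It ignores quantization overhead (per-block scales push the effective bits per weight a bit above the nominal value) and runtime scratch buffers, so treat the result as a lower bound.

```python
GIB = 1024 ** 3

def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone: parameters * bits / 8."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_value: int = 2) -> int:
    """Memory for the KV cache: one K and one V vector per layer per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value

def estimate_gib(n_params: float, bits_per_weight: float, context_len: int,
                 n_layers: int = 32, n_kv_heads: int = 8,
                 head_dim: int = 128) -> float:
    """Weights plus KV cache in GiB (assumed Llama-3-8B-like defaults)."""
    total = weight_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len)
    return total / GIB

if __name__ == "__main__":
    for bits in (4, 5):
        for ctx in (2048, 4096, 8192):
            print(f"{bits}-bit, {ctx} ctx: ~{estimate_gib(8e9, bits, ctx):.2f} GiB")
```

Under these assumptions the weights dominate: about 4 to 5 GB for the model, plus roughly half a GiB of cache at a 4096-token context. Whatever that comes to is VRAM the game's textures no longer get.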