by HashedViking 465 days ago
Good take on that. I still think a q8 32B model with a 200k context would fit into the 48 GB of VRAM on one of those modded RTX 4090s.
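For anyone who wants to sanity-check that kind of estimate, here is a rough back-of-envelope sketch. It only counts the weights and the KV cache and ignores activations and runtime overhead. The architecture numbers (64 layers, 8 KV heads, head dimension 128) are my assumption for a typical GQA-style 32B model, not something taken from a specific model card, so swap in the real config values before trusting the result.

```python
# Back-of-envelope VRAM estimate: quantized weights + KV cache only.
# All architecture values below are ASSUMPTIONS for illustration.

GB = 1e9  # decimal gigabytes


def weights_bytes(n_params: float, bits_per_weight: float) -> float:
    """Memory for the model weights at a given quantization width."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float) -> float:
    """Memory for the KV cache: one key and one value vector
    per layer, per KV head, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def fits(total_bytes: float, vram_gb: float) -> bool:
    """True if the estimate is within the card's VRAM (no overhead margin)."""
    return total_bytes <= vram_gb * GB


# Hypothetical 32B model config (assumed, not from a model card).
N_PARAMS = 32e9
N_LAYERS = 64
N_KV_HEADS = 8
HEAD_DIM = 128
CONTEXT = 200_000
VRAM_GB = 48

w = weights_bytes(N_PARAMS, bits_per_weight=8)  # q8 weights
for label, elem_bytes in [("fp16 KV", 2), ("q8 KV", 1), ("q4 KV", 0.5)]:
    kv = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, elem_bytes)
    total = w + kv
    print(f"{label}: weights {w / GB:.1f} GB + cache {kv / GB:.1f} GB "
          f"= {total / GB:.1f} GB, fits in {VRAM_GB} GB: {fits(total, VRAM_GB)}")
```

The answer is very sensitive to the KV-cache precision and to how many KV heads the model actually has, so the same check with a different config can land on either side of 48 GB.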