Hacker News
by teilo 897 days ago
No quantization (8_0). The full 48GB model. As for token count, I haven't tested it on more than 200 or so.
1 comment

Isn’t 8_0 8-bit quantization?
Sorry. That was a major brain fart. Yes. 8-bit quantization, and using 49G of RAM.
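For anyone wondering how the 8_0 label lines up with the sizes quoted above, here is a rough back-of-the-envelope sketch. It assumes the ggml Q8_0 block layout (32 int8 weights plus one fp16 scale per block, so 34 bytes per 32 weights), assumes every tensor in the file is stored as Q8_0 (real files usually keep a few tensors at higher precision), and reads "48GB" as 48e9 bytes. The function names are made up for illustration, and the thread does not say which model this is, so the parameter count is only an estimate.

```python
# Rough size arithmetic for 8-bit (Q8_0) quantization.
# Assumption: ggml-style Q8_0 blocks, 32 int8 weights + one fp16 scale.
BLOCK_WEIGHTS = 32
BLOCK_BYTES = BLOCK_WEIGHTS * 1 + 2  # 32 one-byte weights + 2-byte scale


def q8_0_bits_per_weight() -> float:
    """Effective storage cost per weight, including the per-block scale."""
    return BLOCK_BYTES * 8 / BLOCK_WEIGHTS


def approx_params_from_size(size_bytes: float) -> float:
    """Estimate the parameter count if the whole file were Q8_0 blocks."""
    return size_bytes / BLOCK_BYTES * BLOCK_WEIGHTS


def fp16_size_bytes(n_params: float) -> float:
    """Size of the same weights stored unquantized at 16 bits each."""
    return n_params * 2


if __name__ == "__main__":
    size = 48e9  # "48GB" from the comment, read as 48e9 bytes (assumption)
    params = approx_params_from_size(size)
    print(f"bits per weight: {q8_0_bits_per_weight()}")
    print(f"estimated parameters: {params / 1e9:.1f}B")
    print(f"same weights at fp16: {fp16_size_bytes(params) / 1e9:.1f} GB")
```

Under those assumptions the unquantized fp16 weights would be nearly twice the quoted size, which is consistent with 8_0 being 8-bit quantization rather than the full-precision model.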