Hacker News new | ask | show | jobs
by jimlongton 1127 days ago
(Possibly naive question) This is marketed as open source. Does that mean I can download the model and run it locally? If so, what kind of GPU would I need?
2 comments

A 3090 (or any GPU with >=20GB VRAM) can run StarCoder with int8 quantization at about 12 tokens per second, 33 with assisted generation -- which will come out for StarCoder in the coming days.

When 4-bit quantization comes out, I would expect a GPU with 12GB VRAM to be able to run it.

Disclaimer: I work at Hugging Face