Hacker News new | ask | show | jobs
by espadrine 1293 days ago
All spaces do[0], but please don’t abuse it: it is just for demo purposes. If you hammer it, it will be down for everyone, and they might not bring it back up.

It can be run locally with ~16GB VRAM GPU; you might be able to configure it at a lower precision to run it with GPUs with half the RAM.

[0]: https://huggingface.co/spaces/togethercomputer/GPT-JT/blob/m...

3 comments

There are also a number of commercial services that offer GPT-J APIs (and surely in a couple days GPT-JT APIs) on a pay-per-token or pay-per-compute-second basis. For light use cases those can be extremely affordable.
Cannot one do inference using a CPU?
thank you!