Hacker News
by
q1w2
1200 days ago
Great, now how do I run it? Do I need a GPU with over 65GB RAM?
3 comments
version_five
1200 days ago
Try this, it's for running LLMs that won't fit in the GPU:
https://github.com/FMInference/FlexGen
gpm
1200 days ago
Currently that looks like it only supports Facebook's OPT and Galactica models, though they do appear to plan to add support for more models.
rnosov
1200 days ago
Generally, you'll need to multiply the model size (in billions of parameters) by two to get the required amount of video RAM in GB. There are 4 sizes, so you might get away with an even smaller GPU for, say, the 13B model.
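A rough sketch of that rule of thumb, assuming the factor of two means 2 bytes per parameter (16-bit weights) and counting 1 GB as 10^9 bytes. The function name and the 65B example size are my own picks for illustration, and this only counts the weights, so real usage will be somewhat higher:

```python
def estimate_weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Estimate the memory needed just to hold the model weights, in GB.

    params_billion: model size in billions of parameters (e.g. 13 for a 13B model).
    bytes_per_param: assumed storage per weight; 2.0 corresponds to 16-bit weights.
    Activations, caches, and framework overhead are not included.
    """
    return params_billion * bytes_per_param


if __name__ == "__main__":
    # Illustrative sizes only: 13B comes from the comment above, 65B is an assumption.
    for size in (13, 65):
        print(f"{size}B model: ~{estimate_weight_memory_gb(size):.0f} GB for weights")
```

Passing a smaller `bytes_per_param` (for example 1.0 for 8-bit weights) shows how quantization would shrink the estimate, if the model and tooling support it.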
bioemerl
1200 days ago
Nope, more like 111 GB.