Hacker News
by dougSF70, 321 days ago
With Ollama I got the 20B model running on 8 Titan X cards (2015). Ollama distributed the model so that the 15 GB of VRAM required was split evenly across the 8 cards. The tok/s were faster than reading speed.
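For a rough sense of what that split means per card, here is a minimal sketch of the arithmetic. The 15 GB total and the 8 cards come from the comment above. The 12 GB per-card capacity is an assumption based on the 2015 (Maxwell) Titan X, and the helper name `per_card_share` is made up for illustration. It is not anything Ollama exposes.

```python
# Figures from the comment: about 15 GB of VRAM spread evenly over 8 cards.
MODEL_VRAM_GB = 15.0
NUM_CARDS = 8

# Assumption (not stated in the comment): a 2015 Titan X has 12 GB of VRAM.
CARD_VRAM_GB = 12.0


def per_card_share(total_gb: float, num_cards: int) -> float:
    """Return the VRAM each card holds under an even split (hypothetical helper)."""
    return total_gb / num_cards


share = per_card_share(MODEL_VRAM_GB, NUM_CARDS)
utilization = share / CARD_VRAM_GB

print(f"{share:.3f} GB per card")
print(f"{utilization:.1%} of each card's memory")
```

Under those assumptions each card holds a little under 2 GB of the model, so most of its memory sits idle, while the full 15 GB would not fit on any single 12 GB card.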
1 comment
Aurornis, 321 days ago
For the price of 8 decade-old Titan X cards, someone could pick up a single modern GPU with 16 GB or more of RAM.