Hacker News
by Tostino, 1070 days ago
ExLlama is significantly faster if you can fit the whole model in VRAM.