Hacker News
by
om8
314 days ago
To do GPU inference, you need a GPU. I have a demo that runs an 8B Llama model on any computer with 4 GB of RAM:
https://galqiwi.github.io/aqlm-rs/about.html
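For a rough sense of why 4 GB can be enough, here is a back-of-the-envelope sketch of weight memory versus bits per weight. The bit widths below are illustrative assumptions, not figures from the demo, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    n_params = 8e9  # "8B" model, taken as exactly 8 billion parameters
    # Hypothetical precisions, chosen only for illustration.
    for bits in (16, 8, 4, 2):
        print(f"{bits:>2} bits/weight: {weight_memory_gib(n_params, bits):.2f} GiB")
```

Under these assumptions, only the lower bit widths leave headroom inside a 4 GB budget.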
adastra22
314 days ago
Any computer with a display has a GPU.
om8
314 days ago
Sure, but integrated graphics usually lack the VRAM needed for LLM inference.
adastra22
314 days ago
Which means inference would run at approximately the same speed as the suggested CPU inference engine, just with the compute offloaded.
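One way to make the "same speed" argument concrete is a minimal sketch that assumes token generation is memory-bandwidth bound and that an integrated GPU reads weights from the same system RAM as the CPU. The bandwidth and model size below are hypothetical placeholders, not measurements.

```python
def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if every token must read all weights once."""
    return bandwidth_bytes_per_s / model_bytes


if __name__ == "__main__":
    model_bytes = 2e9        # hypothetical: about 2 GB of quantized weights
    ram_bandwidth = 50e9     # hypothetical: 50 GB/s of shared system RAM bandwidth

    # CPU and integrated GPU share the same RAM, so the bound is the same for both.
    cpu_bound = max_tokens_per_second(model_bytes, ram_bandwidth)
    igpu_bound = max_tokens_per_second(model_bytes, ram_bandwidth)
    print(f"CPU bound:  {cpu_bound:.1f} tokens/s")
    print(f"iGPU bound: {igpu_bound:.1f} tokens/s")
```

If the bandwidth assumption holds, moving the compute to the integrated GPU frees up the CPU but does not raise this ceiling.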