Hacker News
Don't have a $5k MacBook to run LLAMA65B? MiniLLM runs LLMs on GPUs in <500 LOC
(github.com)
3 points by volodia 1194 days ago | 1 comments
tempaccount420 1194 days ago
Doesn't this use as much VRAM as llama.cpp (with int4 models) uses RAM? RAM is a lot cheaper than VRAM.
volodia 1193 days ago
It won't run as fast on your CPU as it will run on a GPU. Also, it might clog most of your RAM; it's better to offload to a cheap GPU.
link
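For a rough sense of the numbers in this exchange, here is a back-of-the-envelope sketch of how much memory the weights alone would need at a given precision. The function name `weight_memory_gib` and the 65e9 parameter count (read off the "65B" in the title) are assumptions for illustration; the estimate ignores activations, the KV cache, and per-block quantization metadata, so real usage in either llama.cpp or MiniLLM would likely be somewhat higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GiB.

    Hypothetical helper: ignores activations, KV cache, and
    quantization scales/zero-points.
    """
    total_bytes = n_params * bits_per_weight / 8  # bits -> bytes
    return total_bytes / 2**30                    # bytes -> GiB


if __name__ == "__main__":
    n_params = 65e9  # assumed from "65B" in the title
    for bits in (16, 8, 4):
        gib = weight_memory_gib(n_params, bits)
        print(f"{bits:>2}-bit weights: ~{gib:.1f} GiB")
```

The weight footprint at a given bit width is the same whether it lives in RAM or VRAM, which is the point of the question above; the reply's argument is about speed and about leaving system RAM free, not about a smaller footprint.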