Hacker News
by
bigyabai
61 days ago
> The 35B Trick (Your SSD Is the New GPU Memory)
Wave "bye bye" to your write cycles.
RobMurray
61 days ago
why? it's mostly reads. the weights are static.
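To make the "mostly reads" point concrete, here is a minimal sketch of a read-only, file-backed mapping. It assumes the weights are memory-mapped rather than copied into anonymous memory; whether a given runtime does this depends on its settings. The names `map_weights_readonly` and `demo` and the dummy 16 KiB file are hypothetical stand-ins, not anything from llama.cpp.

```python
import mmap
import os
import tempfile


def map_weights_readonly(path):
    """Map a file read-only; pages are loaded lazily on first access."""
    with open(path, "rb") as f:
        # ACCESS_READ: nothing can be written through this mapping,
        # so its pages never become dirty.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def demo():
    # Hypothetical stand-in for a model file: 16 KiB of dummy bytes.
    fd, path = tempfile.mkstemp(suffix=".bin")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(bytes(range(256)) * 64)
        weights = map_weights_readonly(path)
        try:
            first = weights[:4]  # touching the mapping faults the page in from disk
            try:
                weights[0] = 0xFF  # attempt to modify the "weights"
                writable = True
            except TypeError:
                writable = False  # read-only maps reject writes
            return first, len(weights), writable
        finally:
            weights.close()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Pages in a mapping like this are clean copies of what is already on disk, so under memory pressure the kernel can drop them and re-read them later instead of writing them to swap. That only covers the mapped weights themselves, not other memory on the system.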
bigyabai
61 days ago
llama.cpp's own process is mostly reads, but macOS itself will swap hard when 10-14 GB of memory is paged for LLM inference. Dense models especially would thrash zram.