by downvotetruth 1202 days ago
Follow up:
https://github.com/facebookresearch/llama/issues/79#issuecom...
claims 65B was able to fit in 128 GB by unsharding & merging the weights into a single file instead of the multiple .pth files, with a max swap file usage of 172 GB, & it appears to stream to the GPU.
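
For rough context, here is a back-of-the-envelope sketch of how big the raw weights would be. The assumptions are mine, not from the linked issue: a nominal 65e9 parameters, weights only (no activations or other overhead), and decimal gigabytes. The helper name `weights_size_gb` is made up for illustration.

```python
def weights_size_gb(n_params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Assumption: nominal "65B" parameter count, not the exact model size.
N_PARAMS = 65e9

if __name__ == "__main__":
    # 2 bytes per parameter for fp16, 4 bytes for fp32.
    for name, width in (("fp16", 2), ("fp32", 4)):
        print(f"{name}: {weights_size_gb(N_PARAMS, width):.0f} GB")
```

Under those assumptions fp16 weights alone come to about 130 GB, so a 128 GB machine leaning on a swap file would not be surprising.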