by q1w2
1197 days ago
> people are just going to be throwing pytorch code at the wall

The PyTorch 2.0 nightly has a number of performance enhancements, as well as ways to reduce the memory footprint.

But also, looking at the README, it appears that the model alone needs memory equal to 2x the parameter count, e.g. 65B needs 130GB of VRAM, PLUS the decoding cache, which stores 2 * 2 * n_layers * max_batch_size * max_seq_len * n_heads * head_dim bytes = 17GB for the 7B model (not sure if it needs to increase for the 65B model), so maybe 147GB of VRAM in total for the 65B model. That should fit on 4 Nvidia A40s. Did you get memory errors, or have you not tried yet?
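
For anyone who wants to check the arithmetic, here is a rough sketch of the cache formula quoted from the README. The function names are made up for illustration, and the 7B settings plugged in (32 layers, 32 heads, head_dim 128, batch 32, sequence length 1024) are my assumptions about the defaults, not something stated in this thread. I am also assuming one factor of 2 covers keys plus values and the other is 2 bytes per fp16 value.

```python
def kv_cache_bytes(n_layers, max_batch_size, max_seq_len, n_heads, head_dim,
                   bytes_per_value=2):
    """Decoding cache size in bytes, following the README formula.

    Assumed reading of the leading "2 * 2": one factor for the two cached
    tensors per layer (keys and values), one for bytes per fp16 value.
    """
    return (2 * bytes_per_value * n_layers * max_batch_size
            * max_seq_len * n_heads * head_dim)


def weight_bytes(n_params, bytes_per_param=2):
    """Memory for the weights alone at 2 bytes per parameter."""
    return n_params * bytes_per_param


# Assumed 7B settings (hypothetical defaults, see note above).
cache_7b = kv_cache_bytes(n_layers=32, max_batch_size=32, max_seq_len=1024,
                          n_heads=32, head_dim=128)

# 65B weights at 2 bytes per parameter.
weights_65b = weight_bytes(65_000_000_000)

print(f"7B cache:    {cache_7b / 1e9:.1f} GB")
print(f"65B weights: {weights_65b / 1e9:.1f} GB")
print(f"sum:         {(cache_7b + weights_65b) / 1e9:.1f} GB")
```

The cache term scales linearly with max_batch_size and max_seq_len, so lowering either one is the cheapest way to shrink it if memory is tight.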