by borzunov
1265 days ago
The theoretical best case is 5.5 sec/token for RAM offloading and 22 sec/token for SSD offloading. The implementations we've tested are no faster than 10 sec/token, though. See details in our paper: https://arxiv.org/pdf/2209.01188.pdf
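
For readers wondering where best-case numbers of this kind can come from: with offloading, every generated token requires streaming all of the model weights to the GPU once, so a lower bound is model size divided by link bandwidth. Below is a minimal sketch of that arithmetic, assuming about 176 GB of weights (a 176B-parameter model at 8 bits per parameter) and bandwidths of 32 GB/s for RAM over PCIe and 8 GB/s for SSD. These inputs and the function name are illustrative assumptions, not values quoted in the comment; the paper describes the actual setup.

```python
def best_case_seconds_per_token(model_size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Lower bound on per-token latency when all weights are streamed once per token.

    Ignores compute time, caching, and any overlap of transfer with computation.
    """
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return model_size_gb / bandwidth_gb_per_s


# Assumed inputs (not stated in the comment above):
MODEL_SIZE_GB = 176.0   # 176B parameters at 1 byte each
RAM_LINK_GB_S = 32.0    # assumed host-to-GPU bandwidth
SSD_LINK_GB_S = 8.0     # assumed SSD read bandwidth

ram_latency = best_case_seconds_per_token(MODEL_SIZE_GB, RAM_LINK_GB_S)
ssd_latency = best_case_seconds_per_token(MODEL_SIZE_GB, SSD_LINK_GB_S)

print(f"RAM offloading: {ram_latency} sec/token")
print(f"SSD offloading: {ssd_latency} sec/token")
```

Under these assumptions the bound works out to 5.5 sec/token for RAM and 22 sec/token for SSD. Real implementations add overhead on top of the transfer time, which is consistent with measured speeds being slower than the bound.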