by fathyb
1115 days ago
I think GP meant zero-copy communication with the GPU, e.g. through `newBufferWithBytesNoCopy` [0], which is only possible on unified memory architectures such as integrated GPUs. The mmap change was just about mapping the model files into memory instead of copying them, which has less overhead.

[0]: https://developer.apple.com/documentation/metal/mtldevice/14...
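To illustrate the mmap half of that distinction, here is a rough sketch in Python rather than the project's actual code. The file contents and the helper names `load_by_copy` and `load_by_mmap` are made up for the example; only the standard `mmap` module is used. It says nothing about the GPU side, which is a separate question.

```python
import mmap
import os
import tempfile


def load_by_copy(path):
    # Reads the whole file into a new bytes object: one full copy up front.
    with open(path, "rb") as f:
        return f.read()


def load_by_mmap(path):
    # Maps the file into the address space; pages are pulled in lazily
    # by the OS when they are first touched, with no up-front copy.
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


# Hypothetical stand-in for a model file: some padding plus a small payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096 + b"weights")
    model_path = tmp.name

copied = load_by_copy(model_path)
mapped = load_by_mmap(model_path)

# Slicing the mapping only touches the pages that back the slice.
tail = mapped[4096:]

mapped.close()
os.remove(model_path)
```

Both paths end up seeing the same bytes; the difference is that the mapped version defers the I/O to page faults and can share pages with the OS page cache.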
|
It can still be beneficial on discrete GPUs because it avoids a copy inside the driver.