by mrcslws
972 days ago
Aha, I was hoping to learn about something like this, thanks for sharing. I'll try this some time. PyTorch does use different threads for the forward and backward passes, so as you suggest, setting that flag might only improve the forward pass.

cuDevicePrimaryCtxSetFlags: https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__PR...
dlsym(3): https://man.archlinux.org/man/dlsym.3.en
ld.so(8): https://man.archlinux.org/man/ld.so.8.en