This is a super common occurrence in training loops for ML models because PyTorch uses multiprocessing for its dataloader workers. If you want to read more, see the discussion in this issue: https://github.com/pytorch/pytorch/issues/13246#issuecomment...
As you’ve pointed out fork() isn’t ideal for a number of reasons and in general it’s preferred to use torch tensors directly instead of numpy arrays so that you are not forced into using fork()