Interesting. I'll have to check to be sure, but I think GPU access may get set up automagically if you have reasonably up-to-date NVIDIA drivers on the host OS, because I was able to run the EmotiVoice TTS Docker image (which requires an NVIDIA GPU) from WSL2.
https://github.com/netease-youdao/EmotiVoice
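
If anyone wants to check this on their own setup, here is a minimal sketch of a quick test to run from inside the WSL2 shell. It assumes `nvidia-smi` ends up on the PATH when the host driver exposes the GPU, and `gpu_visible` is just a name I made up for illustration. It only tells you whether the GPU is visible to WSL2; it says nothing about whether Docker can pass it into a container.

```python
import shutil
import subprocess


def gpu_visible(which=shutil.which, run=subprocess.run):
    """Return True if `nvidia-smi` is on PATH and exits successfully.

    `which` and `run` are injectable only so the function can be
    exercised without a real GPU; the defaults are the stdlib calls.
    """
    exe = which("nvidia-smi")
    if exe is None:
        # Tool not found: the driver is probably not exposed in this environment.
        return False
    try:
        # `nvidia-smi -L` lists the GPUs the driver can see.
        result = run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("NVIDIA GPU visible:", gpu_visible())
```

If this reports `False` but the host drivers are current, the problem is more likely on the WSL2 side than in the container.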