https://nvidia.github.io/cuda-python/cuda-core/latest/
https://developer.nvidia.com/nvmath-python
https://developer.nvidia.com/how-to-cuda-python
https://cupy.dev/
And the talk "Zero to Hero: Programming Nvidia Hopper Tensor Core with MLIR's NVGPU Dialect" from EuroLLVM 2024:
https://www.youtube.com/watch?v=V3Q9IjsgXvA