Also checked and apparently Nvidia Cutlass now supports generic convolutions: https://github.com/NVIDIA/cutlass