I think these CUDA vs ROCm comparisons mostly compare the C++ dialects. And it's not really the language implementation where ROCm is weak, but rather the whole toolchain. Am I mistaken?
Well, the C++ part is important, and from what I gather, ROCm is failing at that. AMD would be a lot better off if the C++ part worked, even if the other parts didn't.