It was 6 years ago. I'm sure there'd be no contest now if OpenAI dedicated resources to it, which it won't because it's busy solving the entirety of human language before others eat its lunch.
"AI tends to be brittle and optimized for specific tasks, so we made a new specific task and then someone optimized for it" isn't some kind of gotcha. Once ARC puzzles became a benchmark, they ceased to be meaningful WRT "AGI".
So if DOTA became a benchmark the same way Chess or Go did earlier, it would be promptly beaten. It just didn't stick before people moved on to more useful "games".