I've been tasking LLMs to write a traditional AI for a full vibe-coded RTS. I remove the human players and let them battle. I don't know why but I enjoy watching AI players battle so much :)
In the repo, I even have a tournament script that calculates ELOs. So far, codex was unmatched. I'll try with Opus 4.8 too.