“Some other models tested that just didn't work: gpt-4o, gpt-o1, qwen qwq.”
Notably, gpt-4o was used in the post linked here.