How do they know GPT-4 will be enough to let it pass? Is there even a big enough difference in the training data for it to improve in the areas it was struggling with?
Rumours are that GPT-4 is a significant improvement over GPT-3.5. Given how big an improvement GPT-3.5 is over GPT-3, I am inclined to believe them. We will probably find out for sure in a few months.