WizardCoder-34B-Python surpasses GPT-4 on HumanEval

Except it doesn't. They even mention in the same Tweet that their own test showed 82 percent for GPT-4.

But if CodeLlama can make that claim, then I guess it's fair for WizardCoder to make it too.

Wherever the old number is coming from, it should be updated.