It's probably on par with, or better than, what humans get unaided. Hell, I'd bet that because of transcription errors it's better than what humans get in a lot of settings, even when aided by a calculator.
I guarantee you professionals using math at work, for example in finance, do not have a 1% error quota. They use tools. We have tools. Nobody in any serious role (money, etc.) works unaided.
Math inference is a parlor trick, as is the whole “world model” bullshit: physics doesn’t work with 99% accuracy.
It’s the same reason agents are bullshit right now: error compounding at 95% reliability per step murders them, and currently there is no path to triple nines.
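To put rough numbers on the compounding point, here is a quick sketch. It assumes every step succeeds independently with the same probability and that there is no retry or error recovery, which is a simplification; `chain_success` is just a name made up for illustration. Under those assumptions, 95% per step leaves you at roughly 36% after 20 steps.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that all `steps` steps succeed.

    Assumes independent steps with identical reliability and no retries
    (a simplifying assumption, not a model of any particular agent).
    """
    return per_step ** steps


if __name__ == "__main__":
    # Compare 95%, 99%, and "triple nines" over chains of different lengths.
    for p in (0.95, 0.99, 0.999):
        row = ", ".join(
            f"{n} steps: {chain_success(p, n):.1%}" for n in (10, 20, 50)
        )
        print(f"per-step {p}: {row}")
```

At 99.9% per step the same 50-step chain still finishes about 95% of the time, which is why the gap between 95% and triple nines matters so much here.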