by physicsgraph 1113 days ago
The training data used [0] is written with "leaps of logic": each solution gives context for how it was derived, but not enough detail to formally check every step with a computer algebra system [1]. Whether any given output from the OpenAI software is actually correct is therefore left as an exercise for the user. Answering "Was the hallucination correct?" means manually checking each step.

This advancement is good, but it's limited by the precision of the training data.
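
To make "checking each step" concrete, here is a rough sketch of the idea, not a real CAS. It assumes the derivation has already been written out as a list of explicit expressions, which is exactly the detail the training data lacks. The function name steps_agree and the toy derivation are made up for illustration, and comparing exact rational values at sample points is a stand-in for a symbolic equivalence check: agreement is evidence, not proof.

```python
from fractions import Fraction

def steps_agree(steps, samples):
    """Return the index of the first step that disagrees with the step
    before it at some sample point, or None if every step agrees."""
    for i in range(1, len(steps)):
        for x in samples:
            if steps[i - 1](x) != steps[i](x):
                return i
    return None

# Hypothetical derivation: simplify (x + 1)^2 - (x - 1)^2.
good = [
    lambda x: (x + 1) ** 2 - (x - 1) ** 2,
    lambda x: (x * x + 2 * x + 1) - (x * x - 2 * x + 1),
    lambda x: 4 * x,
]

# Same derivation with a sign error introduced in the second line.
bad = [
    lambda x: (x + 1) ** 2 - (x - 1) ** 2,
    lambda x: (x * x + 2 * x + 1) - (x * x + 2 * x + 1),
    lambda x: 0 * x,
]

# Exact rational sample points, so there is no floating-point noise.
samples = [Fraction(n, 7) for n in range(-5, 6)]

print(steps_agree(good, samples))  # None
print(steps_agree(bad, samples))   # 1
```

In the bad derivation the last line follows correctly from the broken one, so only a per-step check locates the error. A solution that jumps straight from the first line to the answer gives nothing to check in between.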

[0] https://artofproblemsolving.com/wiki/index.php/2015_AIME_II_...

[1] https://en.wikipedia.org/wiki/Computer_algebra_system