Thank you for this. Technically it's not GPT-3 but GPT-NeoX-20B, although the two are based on a similar architecture. The poor performance is most likely due to the training data not containing a large corpus of math problems to draw from. GitHub, for example, is part of the dataset used to train both GPT-3 and GPT-Neo variants, which is partly why they can (sometimes) generate meaningful code. I wonder how a model fine-tuned for math would perform.
(Submitted title was 'GPT-3's answers to arithmetic questions')