by iamflimflam1
801 days ago
Well, the datasets used in the paper are all available (Appendix B), so recreating the experiment seems possible. What we are currently seeing in the comments is people trying random things and then saying “it doesn’t work”.
For example, for Friedman #1, GPT-4 predicts 12.89 while the true value is 11.69 (https://chat.openai.com/share/177571ad-3845-46a1-952f-963647...)
For Original #1, GPT-4 predicts 83.63 while the true value is 80.39 (https://chat.openai.com/share/808da995-99e6-444a-94da-fc7cd5...)
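
To put those two numbers on a common scale, here is a minimal sketch that computes the absolute and relative error for each prediction. The helper name `errors` and the choice of relative error as the metric are my own assumptions for illustration, not the paper's evaluation method.

```python
def errors(predicted, actual):
    """Return (absolute error, relative error) for a single prediction.

    Hypothetical helper for illustration; not taken from the paper.
    """
    abs_err = abs(predicted - actual)
    rel_err = abs_err / abs(actual)
    return abs_err, rel_err


# (predicted, true) pairs quoted in the comment above
examples = {
    "Friedman #1": (12.89, 11.69),
    "Original #1": (83.63, 80.39),
}

for name, (pred, true) in examples.items():
    abs_err, rel_err = errors(pred, true)
    print(f"{name}: absolute error {abs_err:.2f}, relative error {rel_err:.1%}")
```

That works out to roughly 10% off for Friedman #1 and roughly 4% off for Original #1. Two single predictions say little on their own; a real replication would score many test points per dataset.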