by anonylizard 799 days ago
This. Harry Potter is a terrible example; even weak models know the rough story of Harry Potter unprompted.

If you want a real test, try it on some Japanese light novel, or some Harry Potter fanfiction, and see if the model actually understands the plot details.

For reference, Opus/GPT-4 know the rough story of moderately popular light novels/manga without any context given. However, they do not precisely understand the fine-grained details of the story, like which character will win in a fight.

1 comment

Japanese light novels were almost certainly in the training set, either in the original Japanese, in an English translation in that Books2 pile, or in a fan translation that happened to get scraped.
That's expected, and it's why the model can reproduce the basic details.

But those Japanese light novels don't have millions of forum discussions and essays written about them. So testing on them shows how well the model can recall sparse data in its training dataset, rather than data that basically shows up 100,000 times in different forms.