by kkjjkgjjgg 1641 days ago
Sounds as if they stored all the correct answers in a database and called it "better". How do they even evaluate these models? If they already have a billion prepared correct answers in the database, how do they come up with new questions for the evaluation?
3 comments

It's the equivalent of taking a test where you can use the internet. Sure, you know the information needed to answer the question exists, but it can be difficult to extract the answer and word it into an English sentence.
Instead of storing the correct answers in an encoded/embedded form in the weights of the neural net (certain neurons very loosely corresponding to certain "answers"), the correct answers are stored elsewhere. That way we can scale down the model to the necessary "thinking" parts and we don't need to use excess neurons for the "memory" part. Kind of hand-wavy, but hopefully that explains the general idea.
You mean otherwise the whole words would be encoded in the net, and now you only need to encode the index in the database?
> all the correct answers

That is clearly not possible, so it can't be what they are doing.

Rather than diffusely encoding that knowledge in a massive number of self-organized layers of weights, it is explicitly encoded. The remaining network can "focus" on mapping input to retrieve the relevant information stored in that database, and extracting/interpolating/extrapolating that information based on the current context to generate useful output.
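
To make the "map input to relevant stored information" step concrete, here is a toy sketch in Python. It is not how the system under discussion works internally: the passages, the bag-of-words `embed` function, and the `retrieve` helper are all made up for illustration, and a real system would use a learned encoder and a far larger store. The point is only that the facts live in `PASSAGES`, outside any model weights, and the code just has to find the nearest one.

```python
import math
from collections import Counter

# Hypothetical external "memory": plain text passages, not model weights.
PASSAGES = [
    "The Eiffel Tower is in Paris.",
    "Mount Everest is the highest mountain on Earth.",
    "The Pacific is the largest ocean.",
]

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a learned encoder)."""
    cleaned = text.lower().replace(".", "").replace("?", "")
    return Counter(cleaned.split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, passages, k=1):
    """Return the k stored passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

print(retrieve("Where is the Eiffel Tower?", PASSAGES))
```

In a retrieval-augmented model the retrieved passage would then be handed to the (smaller) network as extra context, which still has to do the extracting and wording itself.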