
by nl 2128 days ago

Thanks for the response.

> [BART] is essentially acting as a QA-pair memorizer over the training data, and at test time, it just matches the question onto those seen at training time.

Not super-surprising.

> The T5-11B+SSM model we evaluate in the paper uses a special "Salient span masking" pretraining objective that does this to some extent (only mask words at pretraining time that are likely to be "answers" to factual questions), so in essence the pretraining task becomes pretty standard cloze-question answering

This seems an obvious approach for cloze-type questions, but it seems non-obvious how to extend beyond this.

Are you aware of any work probing the differences in the representation using this style of masking vs a more normal language model objective? It would seem to me that this is the key to significant progress here (and of course one would speculate that a representation that works well for this would also work well for all kinds of KB-related tasks).

Thinking about this for a few minutes, things like masking names, colors and numbers (the things that neural representations often confuse) and then asking questions based on them might be interesting. I wonder if bAbI could be extended for this?
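
A rough sketch of the kind of masking I have in mind, assuming nothing about how the paper actually implements salient span masking: the regex, the tiny `COLORS` list, the `<mask>` token and the `cloze_pairs` helper are all made up for illustration, and a real setup would presumably use a proper entity tagger rather than capitalization as a proxy for names.

```python
import re

# Hypothetical, deliberately tiny color list (an assumption for illustration).
COLORS = ["black", "blue", "green", "red", "white", "yellow"]

# Spans treated as "salient": numbers, colors, and capitalized words.
# Capitalization is a crude stand-in for names; it will also catch
# ordinary sentence-initial words such as "The".
SPAN_RE = re.compile(
    r"\b(?:"
    r"\d+(?:\.\d+)?"                      # integers and decimals
    r"|(?:" + "|".join(COLORS) + r")"     # lowercase color words
    r"|[A-Z][a-z]+(?:\s[A-Z][a-z]+)*"     # one or more capitalized words
    r")\b"
)


def cloze_pairs(sentence, mask="<mask>"):
    """Return one (masked_sentence, answer) pair per salient span found."""
    pairs = []
    for m in SPAN_RE.finditer(sentence):
        masked = sentence[:m.start()] + mask + sentence[m.end():]
        pairs.append((masked, m.group()))
    return pairs


if __name__ == "__main__":
    for masked, answer in cloze_pairs("Mary bought 3 red apples in Paris."):
        print(f"{masked}  ->  {answer}")
```

Each pair is effectively a cloze question with its answer, so the same output could in principle be used either as a masking objective or as a probe of what a model has retained about names, colors and numbers.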