| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by craffel 1747 days ago

(different author, not Stella)

To your first question: Unpublished experiments done by the BigScience architecture and scaling WG suggest that training on book corpus yields a boost of 10-15% accuracy on LAMBADA.

To your second question: LAMBADA specifically is an interesting task, but it's a bit unsatisfying to work on since there are so many conflating factors in prior work on the dataset. We are planning quite a few follow-up projects along this general line of work (prompted multi-task training), though.