

by VictorSh 1747 days ago

(author here) That's an interesting take (which I agree with).

Providing a quick way to stress test the model is definitely a double-edged sword. On one hand, it increases engagement (people can play with it) and facilitates reproducibility and verification of results (which is a good thing from a scientific perspective). On the other hand, it quickly grounds expectations in something more realistic and tones down the hype.

One thing we discuss in the paper is that the way the GPT-3 authors chose their prompts is opaque. Our small-scale experiments suggest that prompts might have been cherry-picked: we tested 10 prompts, including one from GPT-3, and the latter was the only one that performed better than random.

Such cases definitely don't help put results and claims in perspective.

1 comment

6gvONxR4sf7o 1746 days ago

> Providing a quick way to stress test the model is definitely a double-edged sword.

I hope you don’t second-guess or regret the choice to make the announcement so accessible. It’s a really good thing for scientific communication to be accurate and accessible, especially when those two things go together.
