by nostalgebraist 2210 days ago
I wasn't referring to one specific graph.

The GPT-3 paper (https://arxiv.org/pdf/2005.14165.pdf) has a very large number of graphs showing parameter count on the horizontal axis and some kind of prediction quality metric on the vertical axis. Most of the interesting ones are in Appendix H.

"The point isn’t the performance at 175B, but the shape of the curve as it passes from 117M to 175B" was a general point about how to interpret any or all of those graphs, not a claim about a particular one.