Hacker News
by wizzwizz4 9 days ago
I've repeated the argument over and over since the GPT-2 days, when I derived it theoretically by inspecting the architecture of the model. I am now fatigued, and enough other people have taken up similar arguments – some developed half-way to a mathematical proof – that I no longer feel the obligation to keep repeating myself.
1 comment

You could post a link.