by wizzwizz4
9 days ago
I've repeated the argument over and over since the GPT-2 days, when I derived it theoretically by inspecting the architecture of the model. I am now fatigued, and enough other people have taken up similar arguments – some developed half-way to a mathematical proof – that I no longer feel the obligation to keep repeating myself.