by londons_explore
1251 days ago
GPT-3 DaVinci has a context window of 2048 tokens. ChatGPT seems to have a context window of 8192 tokens (judging by how far back it can remember in testing). To me, that suggests the model is probably at least 4x larger, possibly 16x (a bunch of layers scale with the square of the window size). Obviously, other parts of the model design may have changed to reduce the parameter count.
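
For anyone who wants the arithmetic behind the 4x and 16x figures, here is a rough sketch. The 8192 number is only the estimate from testing, not a confirmed spec, and `window_scaling` is a made-up helper name. It computes the ratios and nothing more; whether model size actually tracks either ratio is the guess being made here. In a standard transformer, the squared term describes the attention score matrix (compute and memory), so it may not carry over to weight count.

```python
def window_scaling(old_window: int, new_window: int) -> tuple[float, float]:
    """Return (linear, quadratic) growth factors between two context windows."""
    ratio = new_window / old_window
    # Linear factor: anything proportional to sequence length.
    # Quadratic factor: anything proportional to sequence length squared,
    # such as the n-by-n attention score matrix.
    return ratio, ratio ** 2


# 2048 is GPT-3 DaVinci's window; 8192 is the assumed ChatGPT window.
linear, quadratic = window_scaling(2048, 8192)
print(f"linear: {linear:g}x, quadratic: {quadratic:g}x")
# prints "linear: 4x, quadratic: 16x"
```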