by HarHarVeryFunny
1119 days ago
Sure - but it's still the interesting part! I'm sure some of the key players know at least a little, but they don't seem inclined to share. In his Lex Fridman interview Sam Altman said something along the lines of "a LOT of knowledge went into designing GPT-4", and there's a time gap between GPT-3 (2020) and GPT-4 (2022) where it seems they probably spent a lot of time trying to understand it, among other things. It seems the way values are looked up via query/key and then added must constrain the representations quite a bit, and comparing internal activations for closely related types of input might be one way to start to understand what's going on. A high-level understanding of what the model has learnt may be the last thing to fall, but understanding the internal representations would go a long way towards that.
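To make the query/key point concrete, here's a rough toy sketch (plain Python, single head, no learned weights) of what "look up values via query/key and add them" means, plus a cosine comparison of the resulting activations for related inputs. All the names here (`attend`, `residual_update`, `cosine`) and the toy vectors are made up for illustration; this is not how any particular model is implemented, just the general scaled dot-product idea.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weight each value by softmax(q . k / sqrt(d)) and mix them."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, mixed


def residual_update(hidden, query, keys, values):
    """Add the looked-up mixture onto the existing hidden vector."""
    _, mixed = attend(query, keys, values)
    return [h + m for h, m in zip(hidden, mixed)]


def cosine(a, b):
    """Cosine similarity, a crude way to compare two activation vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


if __name__ == "__main__":
    # Toy data (hypothetical): two stored key/value pairs.
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[1.0, 0.0], [0.0, 1.0]]
    hidden = [0.5, 0.5]

    # Two closely related queries and one unrelated query.
    out_a = residual_update(hidden, [4.0, 0.5], keys, values)
    out_b = residual_update(hidden, [4.0, 1.0], keys, values)
    out_c = residual_update(hidden, [0.5, 4.0], keys, values)

    print("related inputs:  ", cosine(out_a, out_b))
    print("unrelated inputs:", cosine(out_a, out_c))
```

The constraint I'm gesturing at shows up in `attend`: the output is always a convex combination of the value vectors, added onto whatever was already there, so whatever the model represents has to survive that kind of mix-and-add. Real interpretability work would compare activations pulled from an actual model rather than toy vectors like these.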