|
|
|
|
|
by fraboniface
205 days ago
|
|
> Developments to the model architecture contribute to the significantly improved performance from previous model families. I wonder how significant this is. DeepMind was always more research-oriented that OpenAI, which mostly scaled things up. They may have come up with a significantly better architecture (Transformer MoE still leaves a lot of room). |
|