by leereeves
895 days ago
> The complexity comes from the number of steps and the number of parameters.

Yes, it seems like a transformer model simple enough for us to understand isn't able to do anything interesting, and a transformer complex enough to do something interesting is too complex for us to understand. I would love to study something in the middle, a model that is both simple enough to understand and complex enough to do something interesting.