Hacker News
by
gwern
170 days ago
So if it's not using attention, and it encodes the entire input into a single embedding that it then processes in one go, I guess this is neither a Transformer nor an RNN but just an MLP?
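
To make concrete what "just an MLP" would mean here, a minimal sketch, not the actual model under discussion: one fixed-size embedding goes through a stack of dense layers in a single pass, with no attention over tokens and no recurrent state carried between steps. The layer sizes, the ReLU, and the names `mlp_forward` and `random_layer` are all made up for illustration.

```python
import math
import random


def mlp_forward(embedding, layers):
    """Feed one fixed-size vector through dense layers: no attention, no recurrence."""
    h = embedding
    for i, (weights, biases) in enumerate(layers):
        # Dense layer: each output unit is a weighted sum of all inputs plus a bias.
        h = [sum(w * x for w, x in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            # ReLU on hidden layers only; the final layer stays linear.
            h = [max(0.0, v) for v in h]
    return h


def random_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights scaled by 1/sqrt(n_in), zero biases."""
    weights = [[rng.uniform(-1, 1) / math.sqrt(n_in) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
layers = [random_layer(8, 16, rng), random_layer(16, 4, rng)]

# Stand-in for the embedding of the *whole* input (assumed size 8).
embedding = [rng.uniform(-1, 1) for _ in range(8)]
output = mlp_forward(embedding, layers)
print(len(output))
```

The point of the contrast: a Transformer would mix information across token positions with attention, and an RNN would loop over positions while updating a hidden state. Here the input has already been collapsed into one vector, so there is nothing left to attend over or iterate through.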