by VMG
1155 days ago
I have to agree. The article summary says:

> Transformer block: Guesses the next word. It is formed by an attention block and a feedforward block.

But the diagram shows transformer blocks chained in sequence. If each block really guessed the next word, the next block in the chain would receive only a single word as its input. That does not make sense.
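To make the shape mismatch concrete, here is a minimal sketch, assuming the usual decoder-only layout in which every block maps a sequence of vectors to a sequence of vectors of the same shape, and only a final output layer after the last block produces next-word scores. The names `toy_block` and `output_head`, the sizes, and the arithmetic are all hypothetical placeholders, not the article's code or any library's API.

```python
import random

random.seed(0)

D_MODEL = 4  # hypothetical embedding width
VOCAB = 6    # hypothetical vocabulary size


def toy_block(xs):
    """Stand-in for one transformer block: sequence of vectors in, same shape out.

    Real blocks use learned attention and feedforward layers. Here the
    "attention" is a causal running mean and the "feedforward" is a ReLU.
    """
    out = []
    for i, x in enumerate(xs):
        prefix = xs[: i + 1]
        # Causal mixing step: average over positions 0..i
        mixed = [sum(v[d] for v in prefix) / len(prefix) for d in range(D_MODEL)]
        # Position-wise step with a residual connection
        out.append([x[d] + max(0.0, mixed[d]) for d in range(D_MODEL)])
    return out


def output_head(vec, weights):
    """Final projection: one hidden vector -> one score per vocabulary entry."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


# Three input "word" vectors and a random projection matrix (toy data)
tokens = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(3)]
head_weights = [[random.uniform(-1, 1) for _ in range(D_MODEL)] for _ in range(VOCAB)]

hidden = tokens
for _ in range(2):  # two chained blocks; each one keeps all three positions
    hidden = toy_block(hidden)

# Only after the last block is a single position turned into word scores
logits = output_head(hidden[-1], head_weights)
next_word_id = max(range(VOCAB), key=lambda i: logits[i])
```

Under these assumptions the chained blocks pass along one vector per input position, so no block in the middle ever sees "a single word"; the guess happens once, at the end of the stack.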