Hacker News
by
ashirviskas
54 days ago
What? Training is not inference. Reading books is not the same as writing.
1 comment
cookiengineer
54 days ago
Maybe read up on how transformers, their encoders and decoders, and the attention matrix work?
https://arxiv.org/abs/1706.03762
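For anyone skimming: the "attention matrix" in that paper is softmax(QK^T / sqrt(d_k)), and the layer output is that matrix times V. A rough pure-Python sketch of just that formula follows. The function names and toy inputs are made up for illustration, and it leaves out the learned projections, multiple heads, and masking the paper also describes.

```python
import math

def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    # Q, K, V are lists of row vectors (hypothetical toy inputs).
    d_k = len(K[0])
    # Scaled dot products between every query and every key.
    scores = [
        [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        for q in Q
    ]
    # Row-wise softmax gives the attention matrix.
    weights = [softmax(row) for row in scores]
    # Each output row is a weighted average of the value vectors.
    out = [
        [sum(w * v[j] for w, v in zip(w_row, V)) for j in range(len(V[0]))]
        for w_row in weights
    ]
    return out, weights

if __name__ == "__main__":
    Q = [[1.0, 0.0], [0.0, 1.0]]
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0]]
    out, weights = attention(Q, K, V)
    print(weights)
    print(out)
```

Each row of `weights` sums to 1, so every output row is a convex combination of the rows of `V`.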