Hacker News
by jeremysalwen 2359 days ago
Not true. A transformer can be used in models without any lookahead, which is how it is used in GPT-2, for example. The real difference is the complexity of the model and the large increase in computational cost.
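To make the "no lookahead" point concrete, here is a rough sketch of single-head attention with a causal mask, where position i only attends to positions 0..i. This is an illustration only, not GPT-2's actual code: the function name `causal_attention` is made up, there are no learned projections, and it uses plain Python lists instead of a tensor library.

```python
import math

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of float vectors, one per position.
    Position i attends only to positions 0..i, so there is no lookahead.
    (Hypothetical helper for illustration; not taken from any library.)
    """
    d = len(q[0])
    out = []
    for i in range(len(q)):
        # Score position i against positions 0..i only; later ones are masked out.
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the visible value vectors.
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out

x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [[1.0, 0.0], [0.0, 1.0], [5.0, -3.0]]  # only the last token differs
a = causal_attention(x, x, x)
b = causal_attention(y, y, y)
print(a[:2] == b[:2])  # earlier positions do not see the changed future token
```

Changing the last token leaves the outputs at the earlier positions untouched, which is all "no lookahead" means here. Note that the loop still compares every position with every earlier one, so the cost grows quadratically with sequence length.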