by maxwells-daemon 1771 days ago
Autoregressive transformers are slow to generate text because the whole model has to run once for every token (roughly, every word) in the output.
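To make the cost concrete, here is a minimal sketch of a greedy decoding loop. It is not a real transformer: `make_counting_model`, `toy_model`, and `generate` are hypothetical names invented for illustration, and the "model" is a stand-in that always predicts the previous token plus one. The only thing it is meant to show is that the number of forward passes equals the number of tokens generated.

```python
from typing import Callable, Dict, List, Tuple


def make_counting_model(vocab_size: int) -> Tuple[Callable[[List[int]], List[float]], Dict[str, int]]:
    """Return a toy next-token scorer and a counter of how many times it ran."""
    calls = {"n": 0}

    def toy_model(tokens: List[int]) -> List[float]:
        # Stand-in for one full forward pass over the current prefix.
        calls["n"] += 1
        # Assumption for illustration: always predict (last token + 1).
        predicted = (tokens[-1] + 1) % vocab_size
        return [1.0 if i == predicted else 0.0 for i in range(vocab_size)]

    return toy_model, calls


def generate(model: Callable[[List[int]], List[float]],
             prompt: List[int],
             num_new_tokens: int) -> List[int]:
    """Greedy autoregressive decoding: one model call per new token."""
    tokens = list(prompt)
    for _ in range(num_new_tokens):
        scores = model(tokens)  # full pass needed before the next token is known
        next_token = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_token)  # the new token becomes input to the next step
    return tokens


model, calls = make_counting_model(vocab_size=10)
output = generate(model, prompt=[3], num_new_tokens=5)
print(output)      # [3, 4, 5, 6, 7, 8]
print(calls["n"])  # 5
```

Each step depends on the token chosen in the previous step, so the passes cannot simply be run in parallel; that sequential dependency is where the latency comes from.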