Hacker News
by fc417fc802 476 days ago
> only if you do a gradient step with data sampled from the exact same weights is it an online step.

A bit pedantic, but an amusing thought: wouldn't that imply that asynchronous actor-critic is an offline training methodology?

1 comment

Yes, pedantically, it is! But as I said, everything's on a spectrum. Online-ish data can still work just fine.
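
To make the "spectrum" concrete, here is a minimal sketch that counts how many gradient updates old the sampling weights are at each step. It is toy bookkeeping, not an actor-critic implementation: `Batch`, `policy_lag`, and the two example schedules are hypothetical names and data made up for illustration, and it assumes one gradient step per consumed batch.

```python
from dataclasses import dataclass


@dataclass
class Batch:
    # Learner weight version the actor held when it sampled this batch
    # (hypothetical bookkeeping field, for illustration only).
    sampled_at_version: int


def policy_lag(batches, start_version=0):
    """For each gradient step, return how many updates old the sampling weights were."""
    version = start_version
    lags = []
    for batch in batches:
        lags.append(version - batch.sampled_at_version)
        version += 1  # assumption: each consumed batch triggers exactly one update
    return lags


# Synchronous loop: sample with the current weights, take a step, repeat.
sync_batches = [Batch(v) for v in range(4)]

# Two asynchronous actors that both start from version 0 and refresh their
# weights only after their own batch has been applied by the learner.
async_batches = [Batch(0), Batch(0), Batch(1), Batch(2)]

print(policy_lag(sync_batches))   # [0, 0, 0, 0]
print(policy_lag(async_batches))  # [0, 1, 1, 1]
```

A lag of 0 is the strictly online case from the quote. Under these assumptions the asynchronous schedule sits one update behind for most steps, which is the "online-ish" end of the spectrum rather than fully offline data.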