by tunesmith
934 days ago
Yeah, I just saw the video from that researcher (later an OpenAI researcher?) who talked about it back in 2016... not that I understood much, but it definitely seemed that Q* was a generalization of the Q algorithm described on the previous slide. The optimum something across all somethings.