by wongarsu
362 days ago
Maybe we'll call it "continuous RLHF" or something like that. But you might be right that the dynamic part is the biggest architectural shift needed. You can simulate a lot with in-context memory or clever retrieval, but memory alone doesn't let the model get better at chess the way a human does.