by anana_
13 hours ago
It looks like the purpose of this model is to either (i) generate environment simulation data for doing RL on other models, or (ii) act as a foundation model (they trained it to select actions as well as predict the next state in the same loop?). Either way, neither use is intended for end consumers.