by sandGorgon
126 days ago
This is actually a very valid technique. We do the same (as an RL environments provider), except we bundle it with a custom browser renderer that generates rewards from DOM diffs rather than from screenshots. The browser renderer is open source: https://github.com/wootzapp/wootz-browser
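
To make "rewards from DOM diffs" concrete, here is a rough sketch of the general idea using only the Python standard library. It is not taken from wootz-browser and makes no claim about how that renderer works; `flatten`, `dom_similarity`, and `dom_diff_reward` are hypothetical names, and scoring against a known target DOM is an assumption made for illustration.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _Flattener(HTMLParser):
    """Flatten an HTML document into a list of hashable, comparable tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Sort attributes so that attribute order alone is not a "change".
        normalized = tuple(sorted((k, v or "") for k, v in attrs))
        self.tokens.append(("open", tag, normalized))

    def handle_endtag(self, tag):
        self.tokens.append(("close", tag))

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only text nodes
            self.tokens.append(("text", text))


def flatten(html):
    """Return the token sequence for an HTML string."""
    parser = _Flattener()
    parser.feed(html)
    parser.close()
    return parser.tokens


def dom_similarity(a, b):
    """Similarity in [0, 1] between two DOM snapshots, based on a token diff."""
    matcher = SequenceMatcher(None, flatten(a), flatten(b), autojunk=False)
    return matcher.ratio()


def dom_diff_reward(before, after, target):
    """Hypothetical reward: how much closer the action moved the DOM to the target.

    Positive if the DOM moved toward the target, zero for a no-op,
    negative if it moved away.
    """
    return dom_similarity(after, target) - dom_similarity(before, target)


if __name__ == "__main__":
    before = '<ul id="cart"></ul>'
    after = '<ul id="cart"><li>Widget</li></ul>'
    target = '<ul id="cart"><li>Widget</li></ul>'
    print(dom_diff_reward(before, after, target))   # positive: item was added
    print(dom_diff_reward(before, before, target))  # no-op
```

The appeal over screenshots is that the signal is structural: a changed pixel from an animation or font rendering does not register, while an added list item or a changed attribute does.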