Hacker News
by mewpmewp2, 2345 days ago
Not a single reward function, but multiple. For each decision or action, though, you could in theory calculate a single value that represents the weight toward making that decision. That value would combine signals from multiple systems.
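To make that concrete, here is a rough sketch of how several reward signals might collapse into one weight per action. Everything in it is made up for illustration: the system names (`hunger`, `fatigue`, `curiosity`), the `importance` factors, and the weighted sum itself, which is just one simple way of combining them, not a claim about how any real system does it.

```python
from typing import Callable, Dict

# Hypothetical reward systems: each one scores an action on its own terms.
RewardFn = Callable[[str], float]


def combined_weight(
    action: str,
    systems: Dict[str, RewardFn],
    importance: Dict[str, float],
) -> float:
    """Collapse several reward signals into one scalar weight for an action.

    Assumption: a plain weighted sum. Other combination rules are possible.
    """
    return sum(importance[name] * fn(action) for name, fn in systems.items())


# Illustrative systems and importance factors (all invented for this sketch).
systems: Dict[str, RewardFn] = {
    "hunger": lambda a: 1.0 if a == "eat" else 0.0,
    "fatigue": lambda a: 1.0 if a == "sleep" else 0.0,
    "curiosity": lambda a: 0.5 if a == "explore" else 0.0,
}
importance = {"hunger": 2.0, "fatigue": 1.0, "curiosity": 1.0}

actions = ["eat", "sleep", "explore"]
weights = {a: combined_weight(a, systems, importance) for a in actions}

# The action with the largest combined weight wins.
best = max(weights, key=weights.get)
print(weights, best)
```

The point is only that the single number per action is derived: change any one system's output or its importance and the ranking of actions can change, even though the decision rule still compares one value per action.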