Hacker News
by mewpmewp2 2345 days ago
Not a single reward function, but multiple. For each decision/action you could, in theory, still calculate a single value that represents the weight toward making that decision, but that value would be derived from multiple systems rather than from one function.
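
To make that concrete, here is a rough sketch of several separate reward systems being collapsed into one weight per action. Everything in it is made up for illustration: the system names (hunger, fatigue, social), the scores, the mixing weights, and the choice of a plain weighted sum are all assumptions, not a claim about how any real brain or agent does it.

```python
from typing import Callable, Dict, Tuple

# A "reward system" here is just a function from an action name to a score.
RewardSystem = Callable[[str], float]


def hunger(action: str) -> float:
    # Hypothetical scores: eating is strongly rewarded by this system.
    return {"eat": 1.0, "sleep": -0.25, "work": -0.5}.get(action, 0.0)


def fatigue(action: str) -> float:
    # Hypothetical scores: sleeping is strongly rewarded by this system.
    return {"eat": 0.0, "sleep": 1.0, "work": -0.75}.get(action, 0.0)


def social(action: str) -> float:
    # Hypothetical scores: working is mildly rewarded by this system.
    return {"eat": 0.25, "sleep": -0.5, "work": 0.5}.get(action, 0.0)


# Each system is paired with an assumed mixing weight (how loud it currently is).
SYSTEMS: Dict[str, Tuple[RewardSystem, float]] = {
    "hunger": (hunger, 0.5),
    "fatigue": (fatigue, 0.25),
    "social": (social, 0.25),
}


def decision_weight(action: str) -> float:
    """Collapse all systems into one scalar weight for a single action."""
    return sum(weight * system(action) for system, weight in SYSTEMS.values())


def best_action(actions):
    """Pick the action with the highest combined weight."""
    return max(actions, key=decision_weight)


if __name__ == "__main__":
    for name in ("eat", "sleep", "work"):
        print(name, decision_weight(name))
    print("chosen:", best_action(["eat", "sleep", "work"]))
```

The single number only exists at decision time. Change the mixing weights (say, fatigue gets louder) and the same three systems produce a different winner, which is why it arguably makes more sense to talk about multiple reward functions than one.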