by MakeAJiraTicket 258 days ago
LLM actions are divorced from that reward function; it's not something they consult or consider. A reward function doesn't make sense in that context.