Hacker News
by
mirekrusin
1137 days ago
We need RLHF -> RLCF/RLIF/RLEF (Reinforcement Learning from Compiler/Interpreter/Execution Feedback).
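To make the idea concrete, here is a minimal sketch of what an execution-feedback reward could look like, with the signal coming from the compiler and the interpreter rather than from a human rater. Everything here is an assumption for illustration: the name `execution_reward`, the reward values (-1.0, 0.0, 1.0), and the use of plain Python asserts as the test harness are hypothetical and not taken from any particular RL library or paper.

```python
import os
import subprocess
import sys
import tempfile


def execution_reward(candidate: str, test_snippet: str, timeout: float = 5.0) -> float:
    """Hypothetical reward: score model-generated Python by compiling and running it.

    Reward values are arbitrary choices for this sketch:
      -1.0  the candidate does not compile (compiler feedback)
       0.0  it compiles but the tests fail, crash, or time out (execution feedback)
       1.0  it compiles and every assertion in test_snippet passes
    """
    # Compiler feedback: reject code that is not even syntactically valid.
    try:
        compile(candidate, "<candidate>", "exec")
    except SyntaxError:
        return -1.0

    # Execution feedback: run candidate plus tests in a separate interpreter
    # process so a crash or infinite loop cannot take down the caller.
    # NOTE: a subprocess is not a security sandbox; real use needs isolation.
    program = candidate + "\n\n" + test_snippet + "\n"
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(program)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path], capture_output=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return 0.0
    finally:
        os.unlink(path)

    # A failed assert or any uncaught exception gives a nonzero exit code.
    return 1.0 if result.returncode == 0 else 0.0


if __name__ == "__main__":
    tests = "assert add(2, 3) == 5"
    print(execution_reward("def add(a, b):\n    return a + b", tests))
    print(execution_reward("def add(a, b):\n    return a - b", tests))
    print(execution_reward("def add(a, b) return a + b", tests))
```

In an actual training loop this scalar would stand in for the human preference score; how it gets shaped, normalized, or combined with other signals is left open here.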