Hacker News
by
half-kh-hacker
86 days ago
How does post-training via reinforcement learning factor in? Does every evaluated judgement count as 'the training data'?
2 comments
abcde666777
86 days ago
I guess I'd place both within a broader umbrella: human-generated input. So it still holds that they're regurgitating the decisions made by humans.
internet_points
86 days ago
yes