by Hfuffzehn
18 days ago
I agree.
But notice that you're assuming there is a metric against which improvement can be measured.
That's fine if you are measuring against your personal taste. But the optimization target itself might have a ceiling. If you're training toward human approval ratings from a broad population, you converge toward whatever the median preference selects for. The plateau is baked into what you're measuring against.
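
To make the median point concrete, here is a toy sketch, not a model of any real training setup. It assumes (my assumptions, purely for illustration) that each rater has an ideal point on a single "style" axis and that approval falls off linearly with distance from it. The names `rater_ideals`, `mean_approval` and `candidates` are made up for the example.

```python
import statistics

# Hypothetical raters: each has an ideal point on a 1-D "style" axis.
# Four of them cluster together, one has very different taste.
rater_ideals = [0.1, 0.2, 0.25, 0.3, 0.9]


def mean_approval(output: float) -> float:
    """Average approval across raters; 0.0 would mean every rater is fully satisfied."""
    return -sum(abs(output - ideal) for ideal in rater_ideals) / len(rater_ideals)


# Stand-in for training: pick the candidate output with the highest mean approval.
candidates = [i / 100 for i in range(101)]
best = max(candidates, key=mean_approval)

print("best output:      ", best)
print("median rater:     ", statistics.median(rater_ideals))
print("approval at best: ", mean_approval(best))
```

Under these assumptions the winning output sits exactly on the median rater, and its score still stays below zero: no amount of further optimization against this metric gets you past that, because the ceiling comes from the disagreement among raters rather than from the optimizer.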