| HN Mirror



	by maxbond 25 days ago
	It means having taste. People say Picasso was a great painter, but that cannot be verified (at least, not in the sense of a verified reward).


dominotw 25 days ago

"People say Picasso was a great painter" is definitely not hard to verify. lol.


maxbond 25 days ago

I don't know if you're being facetious or not, but that was not what I meant. Picasso being a great painter is an example of "having taste"; "create an artistic image generation model with Picasso-level performance" is a valid problem statement we could attack with RLHF, but not with RLVR, because "taste" is not amenable to modeling with a verifiable reward function.

"Write this code in a way that is readable and maintainable" is another example.


dominotw 25 days ago

https://futurism.com/artificial-intelligence/real-monet-ai-c...


logifail 25 days ago

The first paragraph ends with "[...] unleashing a flood of ill-informed reactions and muddled discourse. So, you know, it was just another day online."

It's almost as though it's not about the Monet.
