HN Mirror

by andy_xor_andrew 1165 days ago

I wonder how much longer this "using LLMs to evaluate the quality of other LLMs" approach can last. It has certainly proven valuable and useful up until now, especially since ChatGPT is a pretty high bar to evaluate against.

But it also seems like a strange, incestuous, closed-system approach.

Like, unless you are introducing something new into the system, you just have the system churning against itself, probably until it reaches an equilibrium (or else becomes incoherent).

anothernewdude 1165 days ago

I wonder how long this "using humans to rate the quality of other humans" thing can last. Surely academia has only so long before it collapses.

plagiarist 1165 days ago

You're asserting that current LLMs are as capable of evaluating each other as humans with advanced degrees are?

anothernewdude 1160 days ago

Yes. They're both awful.

jacooper 1165 days ago

Also, humans aren't exact clones.
