by andy_xor_andrew
1118 days ago
I wonder how much longer this "using LLMs to evaluate the quality of other LLMs" approach can last. Certainly it has proven valuable and useful up until now, especially since ChatGPT is a pretty high bar to evaluate against. But it also seems like a strange, incestuous, closed-system approach. Unless you are introducing something new into the system, you just have the system churning against itself, probably until it reaches an equilibrium (or else becomes incoherent).