by hashemalsaket
1049 days ago
For now, the design is basic:

User to LLM: "Rate this response to the following prompt on a scale of 1-10, where 1 is a poor response and 10 is a great response: [response]"

Each LLM rates the responses of all the other LLMs. Then we take the average score of each response. The LLMs that produced the top 50% of responses respond again, and this repeats until only the single highest-scoring response remains.
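A rough sketch of that loop in Python, in case it helps make the idea concrete. Everything named here is hypothetical: `run_tournament`, `average_scores`, and the `models` / `rate` callables are stand-ins for whatever actually calls each LLM, not the real implementation. It also assumes that "top 50%" rounds up for odd counts and that ties fall back to the original model order.

```python
import math
from statistics import mean
from typing import Callable, Dict, Tuple

# Hypothetical interfaces (assumptions, not a real API):
#   Generate: prompt -> response text
#   Rate:     (rater_name, prompt, response) -> score from 1 to 10
Generate = Callable[[str], str]
Rate = Callable[[str, str, str], float]


def average_scores(responses: Dict[str, str], prompt: str, rate: Rate) -> Dict[str, float]:
    """Average the scores each response receives from every *other* model."""
    scores = {}
    for author, response in responses.items():
        peers = [name for name in responses if name != author]
        scores[author] = mean(rate(rater, prompt, response) for rater in peers)
    return scores


def run_tournament(models: Dict[str, Generate], prompt: str, rate: Rate) -> Tuple[str, str]:
    """Repeat respond -> peer-rate -> keep the top half until one response remains."""
    if len(models) < 2:
        raise ValueError("peer rating needs at least two models")
    active = dict(models)
    while True:
        # Every model still in the running answers the prompt (again).
        responses = {name: generate(prompt) for name, generate in active.items()}
        scores = average_scores(responses, prompt, rate)
        # Highest average first; sorted() is stable, so ties keep model order.
        ranked = sorted(scores, key=scores.get, reverse=True)
        # Assumption: "top 50%" rounds up, so 3 models -> 2 survivors.
        keep = ranked[: math.ceil(len(ranked) / 2)]
        if len(keep) == 1:
            winner = keep[0]
            return winner, responses[winner]
        active = {name: active[name] for name in keep}
```

With N models, each round costs N responses plus N*(N-1) rating calls, and the field halves each round, so the total number of rounds is about log2(N).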