|
|
|
|
|
by niwtsol
473 days ago
|
|
If you have a judge system, and Mistral performs well on other tests, wouldn't you want to include it so if it scores the highest by your judges ranking it would select the most accurate result? Or are you saying that mistral's image markdown would score higher on your judge score? |
|
In our current setup Gemini wins most often. We enter multiple generations from each model into the 'tournament', sometimes one generation from gemini could be at the top while another in the bottom, for the same tournament.