by mythz
526 days ago
Yeah, we evaluated several models for grading about a year ago and concluded Mixtral was the best choice for us: it yielded the best results of the models we could self-host while distributing the load of grading 1.2M+ answers across several GPU servers. We would have liked to pick a neutral model like Gemini, which was fast, reliable and low cost, but unfortunately it gave too many poor answers good grades [1]. If we had to pick a new grading model now, hopefully the much improved Gemini Flash 2.0 would yield better results.

[1] https://pvq.app/posts/individual-voting-comparison#gemini-pr...