|
|
|
|
|
by throwaway287391
510 days ago
|
|
Yes, I am assuming they evaluated the models in good faith, understand how to design a basic user study, and therefore when they ran a study intended to compare the response quality between two different models, they showed the raters both fully-formed responses at the same time, regardless of the actual latency of each model. |
|