by gertlabs
41 days ago
The more filters you apply (a single model and a single language, especially if you also filter by pipeline, such as agentic vs. one-shot), the fewer samples remain, so variance goes up. That is a known limitation, and it is inevitable with any finite budget. It is also why we are selective about adding languages: each new one dilutes the number of samples we can run per language per model. The aggregated statistics hold up well, though, and have been very consistent in our testing.
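
As a rough illustration of the dilution effect, here is a minimal sketch. Every number in it is hypothetical (the budget, the model count, and the pipeline count are made up for the example), and it assumes each sample is scored pass/fail so that the binomial standard error applies; the actual scoring may differ.

```python
import math


def samples_per_cell(total_budget: int, n_models: int,
                     n_languages: int, n_pipelines: int) -> int:
    """Samples available for one (model, language, pipeline) cell,
    assuming the budget is split evenly (an assumption)."""
    return total_budget // (n_models * n_languages * n_pipelines)


def pass_rate_standard_error(pass_rate: float, n_samples: int) -> float:
    """Binomial standard error of an observed pass rate."""
    return math.sqrt(pass_rate * (1.0 - pass_rate) / n_samples)


if __name__ == "__main__":
    budget = 120_000  # hypothetical total number of runs
    for n_languages in (5, 10, 20):
        n = samples_per_cell(budget, n_models=30,
                             n_languages=n_languages, n_pipelines=2)
        se = pass_rate_standard_error(0.5, n)
        print(f"{n_languages:>2} languages: {n} samples per cell, "
              f"standard error ~{se:.3f}")
```

Under these assumptions, quadrupling the number of languages cuts the samples per cell to a quarter and doubles the standard error of a fully filtered pass rate, while the aggregate over all cells still draws on the full budget.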