Hacker News new | ask | show | jobs
by gertlabs 41 days ago
The more filters you apply (single model and single language, especially if you also filter by pipeline like agentic vs one-shot), the fewer samples, so there is variance. Known limitation that is inevitable with any finite budget. This is why we are selective about adding more languages because it will dilute the amount of samples we can run per language per model. But the aggregated statistics hold up well and are very consistent in our testing.
1 comments

I just applied a single filter, programming language.