Hacker News
by zozbot234 11 hours ago
True, but OP says there is a meaningful "knee" at b = n/k (about 43 for DeepSeek V4 Flash), and I'm not sure that's all that relevant. If anything, it might be a bit more meaningful to highlight the point where, on average, half the experts are covered, which is coincidentally around 43 for Pro and 30 for Flash, since that ought to be approximately where the variance of that coverage is maximized.
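
For what it's worth, here is a rough sketch of the arithmetic behind those numbers. It assumes each token picks its k experts uniformly at random and independently of every other token, which real routers only approximate. The function names are made up for illustration, the only input is the ratio n/k (43 is the Flash figure quoted above), and the variance shown is the per-expert coverage indicator's p(1 - p), not the full variance of the covered count.

```python
import math


def expected_coverage(b: float, ratio: float) -> float:
    """Expected fraction of experts hit by at least one of b tokens.

    ratio is n/k. Assumption: uniform, independent routing, so a given
    expert is missed by one token with probability 1 - k/n = 1 - 1/ratio.
    """
    return 1.0 - (1.0 - 1.0 / ratio) ** b


def half_coverage_batch(ratio: float) -> float:
    """Batch size b at which expected coverage equals one half."""
    return math.log(0.5) / math.log(1.0 - 1.0 / ratio)


def indicator_variance(b: float, ratio: float) -> float:
    """Variance p(1 - p) of a single expert's 'covered' indicator."""
    p = expected_coverage(b, ratio)
    return p * (1.0 - p)


if __name__ == "__main__":
    ratio = 43.0  # n/k for Flash, per the comment above
    b_half = half_coverage_batch(ratio)
    print(f"half coverage at b = {b_half:.1f}")
    print(f"coverage at the knee b = n/k: {expected_coverage(ratio, ratio):.3f}")
    print(f"indicator variance at b_half: {indicator_variance(b_half, ratio):.3f}")
    print(f"indicator variance at the knee: {indicator_variance(ratio, ratio):.3f}")
```

Under that model the half-coverage point is roughly 0.69 * n/k, so it lands near 30 when n/k is 43, while at the knee itself expected coverage is already close to 1 - 1/e.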