by GregorStocks
127 days ago
Yeah, the intention here is not to answer "which deck is best" - the standard of play is nowhere near high enough for that. It's meant as more of a non-saturated benchmark for different LLMs, so you can say things like "Grok plays as well as a 7-year-old, whereas Opus is a true frontier model and plays as well as a 9-year-old". I'm optimistic that with continued improvements to the harness and new model releases we can get to at least "official Pro Tour stream commentator" skill levels within the next few years.

And re: ages, it's worth noting that the youngest player to make Day 2 of a Grand Prix was 8 years old, and the youngest Pro Tour winner was 15 years old. I don't think it's realistic to get an LLM anywhere close to either of those players in skill level, though it's absolutely possible with a specialized model.