by benbayard
127 days ago
I was working on a similar project. I wanted a way to goldfish my decks against many kinds of decks in a pod. It would never be perfect, but it would be enough to get an idea of:
1. How many turns did it take on average to hit 2, 3, 4, 5, and 6 mana?
2. How many threats did I remove?
3. How often did I not have enough card draw to keep my hand full?

I don't think there's a perfect way to do this, but I think playing 100 games with a deck and collecting basic info like this would be super valuable.
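For the first metric in the list above (average turn to hit each mana count), a rough Python sketch might look like the following. Everything in it is an assumption made for illustration: the deck is just a list of "land" and "spell" strings, the 37/62 split is a made-up example, mana comes only from one land drop per turn (no ramp, no mana rocks), there are no mulligans, and a card is drawn every turn including the first. The names `simulate_game` and `average_turn_to_mana` are hypothetical and do not come from any project in this thread.

```python
import random
from statistics import mean


def simulate_game(deck, rng, max_turns=15, hand_size=7):
    """Goldfish one game; return {land_count: turn it was first reached}."""
    library = deck[:]          # copy so the caller's deck is not mutated
    rng.shuffle(library)
    hand = [library.pop() for _ in range(hand_size)]
    lands_in_play = 0
    reached = {}
    for turn in range(1, max_turns + 1):
        if library:
            hand.append(library.pop())   # assumed: draw on every turn
        if "land" in hand:               # assumed: one land drop per turn
            hand.remove("land")
            lands_in_play += 1
            reached.setdefault(lands_in_play, turn)
    return reached


def average_turn_to_mana(deck, targets=(2, 3, 4, 5, 6), games=100, seed=0):
    """Average the turn each target was reached, over games that reached it."""
    rng = random.Random(seed)            # seeded so runs are repeatable
    results = [simulate_game(deck, rng) for _ in range(games)]
    averages = {}
    for n in targets:
        turns = [r[n] for r in results if n in r]
        averages[n] = mean(turns) if turns else None
    return averages


# Hypothetical 99-card deck: 37 lands, 62 nonland cards.
deck = ["land"] * 37 + ["spell"] * 62
print(average_turn_to_mana(deck))
```

Games that never reach a given land count within `max_turns` are left out of that count's average, so the result for high counts is optimistic. Tracking removal and card draw would need a real card model, which this sketch does not attempt.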
|
https://github.com/spullara/mtg-reanimator
I have also tried evaluating LLMs at playing the game and have found them to be really terrible at it, even the SoTA ones. They would probably do a lot better inside an environment where the rules are strictly enforced, like MTG Arena, rather than having to understand the rules and play correctly on their own. The third LLM acting as judge helps, but even it is wrong a lot of the time.
https://github.com/spullara/mtgeval