Hacker News
by jschomay 45 days ago
OP here: thank you, I appreciate the thoughtful questions. To answer: 1) I used a text representation because it made sense for my game and let me "render" certain details in a more AI-friendly way, like the compact map. You could use something like agent-browser and it would probably work just fine, but I figured it added an extra layer of indirection that I didn't need, plus it would be a lot of screenshots! Having a turn-based loop really helped make this work.
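
To illustrate the idea, here is a minimal sketch of a text-rendered, turn-based loop. It is not the game's actual code: `GameState`, `render`, `step`, and the map symbols are all hypothetical names chosen for the example, and a real game would carry far more state.

```python
from dataclasses import dataclass

# Hypothetical one-letter commands an agent could send each turn.
MOVES = {"n": (0, -1), "s": (0, 1), "e": (1, 0), "w": (-1, 0)}


@dataclass(frozen=True)
class GameState:
    """Illustrative game state; a real game would track much more."""
    width: int
    height: int
    player: tuple[int, int]
    walls: frozenset[tuple[int, int]]
    turn: int = 0


def render(state: GameState) -> str:
    """Render the state as a compact text map: '@' player, '#' wall, '.' floor."""
    rows = []
    for y in range(state.height):
        row = ""
        for x in range(state.width):
            if (x, y) == state.player:
                row += "@"
            elif (x, y) in state.walls:
                row += "#"
            else:
                row += "."
        rows.append(row)
    return f"turn {state.turn}\n" + "\n".join(rows)


def step(state: GameState, command: str) -> GameState:
    """Apply one command and advance one turn; blocked or unknown moves leave the player in place."""
    dx, dy = MOVES.get(command, (0, 0))
    x, y = state.player[0] + dx, state.player[1] + dy
    in_bounds = 0 <= x < state.width and 0 <= y < state.height
    blocked = not in_bounds or (x, y) in state.walls
    player = state.player if blocked else (x, y)
    return GameState(state.width, state.height, player, state.walls, state.turn + 1)
```

Each turn, the agent reads the output of `render`, picks a command, and gets the next text frame back from `step`, so no screenshots are involved.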

2) I had a skill covering just how to use the playtest server. I also gave it context on what the game is and how to play it. From there, it probably depends on your use case. I wasn't that impressed with its natural ability to playtest for bug discovery, so I would consider making a skill describing what a playtester would normally do. Focused playtester instances are a good idea. Ultimately, what I found most helpful was to point it at a feature or bug that I was aware of and have it validate it. Not only was that fairly successful, it was also the part that saved me the most time.

3) I think I only burned about 300K tokens on my longest playtest session, and that includes a bunch of code tweaks too. Running it after every feature as a validation step is pretty cheap. Running it overnight in "open" playtesting could add up.
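
For a rough sense of scale, the arithmetic can be sketched as below. The only number taken from this thread is the roughly 300K tokens per session; the price per million tokens and the number of overnight sessions are placeholder assumptions, so substitute your own provider's rate and usage.

```python
def session_cost(tokens: int, price_per_million: float) -> float:
    """Estimated cost of one session at a flat per-million-token price."""
    return tokens * price_per_million / 1_000_000


def overnight_cost(tokens_per_session: int, sessions: int, price_per_million: float) -> float:
    """Estimated cost of running several back-to-back sessions."""
    return sessions * session_cost(tokens_per_session, price_per_million)


# Placeholder values, NOT real pricing: 3.0 per million tokens, 20 sessions overnight.
one_session = session_cost(300_000, price_per_million=3.0)
one_night = overnight_cost(300_000, sessions=20, price_per_million=3.0)
```

A flat rate ignores any difference between input and output token pricing, so treat the result as an order-of-magnitude estimate only.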

Good luck, and please let me know how it goes if you get somewhere helpful!