That seems a good start. But maybe that's game-able too. If you're dealing with an AI (or proto-AI, at least) that understands you better than you do yourself, you're pwned. So then you need your own AI, to filter input and flag exploits. Rather like anti-malware, for your mind.
The idea is to create a human programming language to encode activities for learning/practicing principles found in the book Nonviolent Communication and publish that before publishing the language as a way to protect oneself.