I think the issue is deeper than prompts, agents.md, smart flows, etc. I think the problem is that LLMs are searchers, trained to prefer some results over others. So if the dumb solution is there to be found and the smart one is not, they won't spit out the smart one.
To elaborate: That advice isn’t as objective as you think.
What one developer calls clean, another calls messy.
My advice is to use it, then document the issues when it gets messy. It takes some time, but no more than recruiting, training, and paying another engineer.