I think most of this trial-and-error "You are an experienced engineer" stuff probably hurts model performance. No one ever does comprehensive testing, so eh, yolo.
There are papers showing that models follow instructions less reliably the more instructions they are given. Now think about how many instructions are embedded in that MD + the system prompt + likely a local AGENTS.md, and in the end there is probably very little here that matters.
How much context is eaten up by skills that rehash what a SOTA model should already know?
Maybe token-wise, it's a wash: Elixir/OTP does a lot without third-party libs, where the npm ecosystem would require massive dependencies to achieve the same thing.