I thought about this for a while and came to a conclusion that while "code is free", tokens are not. If tokens were free and instant, it would generate machine code directly. Therefore, it needs abstractions like a compiled or interpreted language in order to address the token bottleneck.
Reduce entropy, increase probability of the correct outcome.
LLMs are surfing higher dimensional vector spaces, reduce the vector space, get better results.