| Really depends on the repo you’re working in. If it’s very large, especially if the tool needs to refer to documentation for a lot of custom frameworks and APIs, you often end up needing very large context windows that burn through tokens faster. If it’s smaller or sticks with common frameworks that the model was trained on, it’s able to do a lot more with smaller context windows and token usage is way lower. |
I don't use LLMs to write code (other than simple refactors and throwaway stuff), but I do use them heavily to crawl through big codebases and identify which files and functions I need to understand.
Some of the codebases I explore burn through tokens at a rapid rate because there is so much complex code to get through. On the $20 Claude plan with Opus, I can sometimes go through my entire 5-hour allocation on a single prompt exploring the codebase, and it's justified.
Other times I'm working on simple topics, even in a large codebase, and the model just sips tokens because it only needs to walk a couple of files to find what it needs to answer my questions.