| HN Mirror


by jeremychone 139 days ago

Interesting, I’ve never needed 1M, or even 250k+ context. I’m usually under 100k per request.

About 80% of my code is AI-generated, with a controlled workflow using dev-chat.md and spec.md. I use Flash for code maps and auto-context, and GPT 5.4 or Opus for coding, all via API with a custom tool.

Gemini Pro and Flash have had 1M context for a long time, but even though I use Flash 3 a lot, and it’s awesome, I’ve never needed more than 200k.

For production coding, I use

- a code map strategy on a big repo. Per file: summary, when_to_use, public_types, public_functions. This is done per file and saved until the file changes. With a concurrency of 32, I can usually code-map a huge repo in minutes. (Typically Flash, cheap, fast, and with very good results)

- Then, auto context, but based on code lensing. Meaning auto context takes some globs that narrow the visibility of what the AI can see, and it uses the code map intersection to ask the AI for the proper files to put in context. (Typically Flash, cheap, relatively fast, and very good)

- Then, use a bigger model, GPT 5.4 or Opus 4.6, to do the work. At this point, context is typically between 30k and 80k max.
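As a rough illustration of the per-file record described above, here is a minimal sketch in Python. The four field names come from the post; the types, the sample file path, and the sample values are assumptions made only for this example, not the tool's actual schema.

```python
from dataclasses import asdict, dataclass, field
import json


@dataclass
class CodeMapEntry:
    # Field names follow the post; the types are assumptions.
    path: str
    summary: str
    when_to_use: str
    public_types: list[str] = field(default_factory=list)
    public_functions: list[str] = field(default_factory=list)


entry = CodeMapEntry(
    path="src/model/task.rs",  # hypothetical file
    summary="Task entity and its storage helpers.",
    when_to_use="Changing how tasks are created, loaded, or saved.",
    public_types=["Task", "TaskForCreate"],
    public_functions=["create_task", "list_tasks"],
)

# One small record per file is easy to cache and to regenerate
# only when that file changes.
print(json.dumps(asdict(entry), indent=2))
```

Keeping the record this small is what lets a whole repo's map fit in a cheap model's context for the file-selection step.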

What I’ve found is that this process is surprisingly effective at getting a high-quality response in one shot. It keeps everything focused on what’s needed for the job.

Higher precision on the input typically leads to higher precision on the output. That’s still true with AI.

For context, 75% of my code is Rust, and the other 25% is TS/CSS for web UI.

Anyway, it’s always interesting to learn about different approaches. I’d love to understand the use case where 1M context is really useful.

15 comments

daemonk 139 days ago

Yeah this is the simpler and also effective strategy. A lot of people are building sophisticated AST RAG models. But you really just need to ask Claude to generally build a semantic index for each large-ish piece of code and re-use it when getting context.

You have to make sure the semantic summary takes up significantly fewer tokens than just reading the code, or it's just a waste of tokens/time.

Then have a skill that uses git version logs to lazily refresh the summary cache when needed.

smusamashah 139 days ago

It seems like a very good use of LLMs. You should write a blog post detailing your process, with examples, for people who are not as into all the AI tools. I only use the Web UI. A lot of what you are saying is beyond me, but it does sound like a clever strategy.

tontinton 139 days ago

Yeah, we all converge on the same workflow. In the AI coding agent I'm working on now, I've added an "index" tool that uses tree-sitter to compress a code file and show the AI a skeleton of it.

Here's the implementation for the interested: https://github.com/tontinton/maki/blob/main/maki-code-index%...
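To make the "skeleton" idea concrete, here is a small sketch that keeps only class and function signatures and drops the bodies. It uses Python's built-in `ast` module as a stand-in for tree-sitter, so it only handles Python source; it is not the linked implementation, and the `skeleton` function and `SAMPLE` input are made up for illustration.

```python
import ast


def skeleton(source: str) -> str:
    """Return class and function signatures only, dropping bodies."""
    lines: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        indent = "    " * depth
        for child in ast.iter_child_nodes(node):
            if isinstance(child, ast.ClassDef):
                lines.append(f"{indent}class {child.name}:")
                visit(child, depth + 1)  # descend to list the methods
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kw = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                args = ", ".join(a.arg for a in child.args.args)
                lines.append(f"{indent}{kw} {child.name}({args}): ...")

    visit(ast.parse(source), 0)
    return "\n".join(lines)


SAMPLE = '''
import os

class Repo:
    def __init__(self, root):
        self.root = root

    def files(self, suffix):
        return [p for p in os.listdir(self.root) if p.endswith(suffix)]

def main():
    print(Repo(".").files(".rs"))
'''

print(skeleton(SAMPLE))
```

The output is a handful of signature lines instead of the full file, which is the compression the comment is describing.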

gck1 125 days ago

I'm curious, what does your workflow look like? I saw a plan prompt there, but no specs. When you want to change something, implement a new feature etc, do you just prompt requirements, have it write the plan and then have it work on it?

jeremychone 139 days ago

Oh, that's great.

I've always wanted to explore how to fit tree-sitter into this workflow. It's great to know that this works well too.

Thanks for sharing the code.

(Here is the AIPack runtime I built, MIT: https://github.com/aipack-ai/aipack), and here is the code for pro@coder (https://github.com/aipack-ai/packs-pro/tree/main/pro/coder) (AIPack is in Rust, and AI Packs are in md / lua)

firemelt 139 days ago

Whenever I see a post like this, I say: well yeah, but it's too sophisticated to be practical.

jeremychone 139 days ago

Fair point, but because I spent a year building and refining my custom tool, this is now the reality for all of my AI requests.

I prompt, press run, and then I get this flow:

- dev setup (dev-chat or plan)

- code-map (0s when incremental, 2m for the initial run)

- auto-context (~20s to 40s)

- final AI query (~30s to 2m)

For example, just now, in my Rust code (about 60k LOC), I wanted to change the data model and brainstorm with the AI to find the right design, and here is the auto-context it gave me:

- Reducing 381 context files (1.62 MB)

- Now 5 context files (27.90 KB)

- Reducing 11 knowledge files (30.16 KB)

- Now 3 knowledge files (5.62 KB)

The knowledge files are my "rust10x" best practices, and the context files are the source files.

(edited to fix formatting)

tjoff 138 days ago

How do you re-evaluate your approach? I'm asking because the landscape, at least through my lens, was completely different a year ago. So I fear that as the foundation shifts, whatever learnings, approaches, and mental models I have risk becoming obsolete and starting to work against me.

The problem of evaluating is hard enough as it is without layers of indirection built on top of it.

adammarples 139 days ago

It's not sophisticated at all, he just uses a model to make some documentation before asking another model to work using the documentation

lukeundtrug 139 days ago

I built myself an AST-based solution for that over roughly the last 6 months. I always wondered whether grep and agent-based discovery would be the end of it, and thought it just has to be better with a more deterministic approach.

In the end it's hard to measure but personally I feel that my agent rarely misses any context for a given task, so I'm pretty happy with it.

I used a different approach than tree-sitter because I thought I had found a nice way to get around having to write language-specific code. I use VSCode as a language backend and wrote some logic to rebuild the AST from VSCode's symbol data and other APIs.

That allows me to just install the correct language extension and thus enable support for that specific language. The extension has to provide symbol information which most do through LSP.

In the end it was way more effort than just using tree-sitter, however, and I'm thinking of doing a slow migration to that approach sooner or later.

Anyways, I created an extension that spins up an mcp server and provides several tools that basically replace the vanilla discovery tools in my workflow.

The approach is similar to yours: I have an overview tool which runs different centrality ranking metrics over the whole codebase to get the most important symbols and presents that as an architectural overview to the LLM.

Then I have a "get-symbol-context" tool which allows the AI to get all the information that the AST holds about a single symbol, including a parameter to include source code which completely replaces grepping and file reading for me.

The tool also specifies which other symbols call the one in question and which others it calls, respectively.
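A toy version of those two tools might look like the sketch below. It is not the extension's code: the call edges are hand-written rather than derived from an AST, the function names `overview` and `symbol_context` are invented, and plain in-degree (number of distinct callers) stands in for whatever centrality metrics the real tool uses.

```python
from collections import defaultdict

# Hypothetical (caller, callee) edges; a real tool would extract them from symbol data.
CALLS = [
    ("main", "load_config"),
    ("main", "run"),
    ("run", "load_config"),
    ("run", "save"),
]

callees: dict[str, set[str]] = defaultdict(set)
callers: dict[str, set[str]] = defaultdict(set)
for caller, callee in CALLS:
    callees[caller].add(callee)
    callers[callee].add(caller)


def overview(top_n: int = 3) -> list[str]:
    """Rank symbols by in-degree, breaking ties by name."""
    symbols = set(callers) | set(callees)
    return sorted(symbols, key=lambda s: (-len(callers.get(s, ())), s))[:top_n]


def symbol_context(name: str) -> dict:
    """Everything the graph knows about one symbol: who calls it and what it calls."""
    return {
        "symbol": name,
        "called_by": sorted(callers.get(name, ())),
        "calls": sorted(callees.get(name, ())),
    }


print(overview())
print(symbol_context("run"))
```

A single `symbol_context` lookup answers the questions an agent would otherwise spend several grep and file-read calls on.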

But yeah, sorry for this being already a quite long comment, if you want to give it a try, I published it on the VSCode marketplace a couple of days ago, and it's basically free right now, although I have to admit that I still want to try to earn a little bit of money with it at some point.

Right now, the usage limit is 2000 tool calls per day, which should be enough for anybody.

Would love to hear what you think :)

<https://marketplace.visualstudio.com/items?itemName=LuGoSoft...>

jeremychone 138 days ago

I looked at your solution and extension README, and it's very interesting and well thought out.

The fact that you've been using it for six months and that it performs well says a lot. At the end of the day, that's what counts.

I like your idea of piggybacking on top of the LSP services, and I can imagine that this was quite a bit of work. Doing it as an MCP server makes it usable across different tools.

I also really like the name "Context Master."

In my case, it's much more niche since it's for the tool I built. Though it's open source, the key difference is that the "indexing" is only agentic at this point.

I can see value in mixing the two. LSP integration scares me because of the amount of work involved, and tree-sitter seems like a good path.

In that case, in the code map, for each item, there could be both the LLM response info and some deterministic info, for example, from tree-sitter.

That being said, the current approach works so well that I think I am going to keep using and fine-tuning it for a while, and bring in deterministic context only when or if I need it.

Anyway, what you built looks great. If it works, that's great.

lukeundtrug 137 days ago

Thanks for taking the time to check it out and for the kind words! I really appreciate it.

I totally get sticking with your current approach. Your workflow sounds very intriguing as well. A combination of both approaches might really be very interesting :) Adding an LLM interpretation layer on top of my graph is also something I'm actively considering.

Thanks for the great discussion, and best of luck with your tool and workflow!

cloverich 139 days ago

This is really interesting; I've done very high-level code maps, but the entire project seems wild. It works?

So, small model figures out which files to use based on the code map, and then enriches with snippets, so big model ideally gets preloaded with relevant context / snippets up front?

Where does code map live? Is it one big file?

jeremychone 139 days ago

So, I have a pro@coder/.cache/code-map/context-code-map.json.

I also have a `.tmpl-code-map.jsonl` in the same folder so all of my tasks can add to it, and then it gets merged into context-code-map.json.

I keep mtime, but I also compute a blake3 hash, so if mtime does not match, but it is just a "git restore," I do not redo the code map for that file. So it is very incremental.
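The mtime-plus-hash check can be sketched roughly as follows. This is an assumption-laden illustration, not the tool's code: the cached record shape (`mtime_ns`, `digest`) and the function names are invented, and `hashlib.blake2b` from the standard library stands in for blake3, which would need a third-party package.

```python
import hashlib
import os
import tempfile
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> str:
    # blake2b is a stand-in here; the post uses blake3.
    return hashlib.blake2b(path.read_bytes()).hexdigest()


def needs_remap(path: Path, cached: Optional[dict]) -> bool:
    """cached is a hypothetical record: {"mtime_ns": int, "digest": str}."""
    if cached is None:
        return True
    if path.stat().st_mtime_ns == cached["mtime_ns"]:
        return False  # cheap path: mtime unchanged, no hashing needed
    # mtime changed: only redo the code map if the content really differs
    return file_digest(path) != cached["digest"]


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "lib.rs"
    src.write_text("pub fn add(a: i32, b: i32) -> i32 { a + b }\n")
    t0 = src.stat().st_mtime_ns
    record = {"mtime_ns": t0, "digest": file_digest(src)}

    # Same bytes, newer mtime (what a `git restore` can leave behind).
    os.utime(src, ns=(t0 + 10 * 10**9, t0 + 10 * 10**9))
    restored_needs_remap = needs_remap(src, record)

    # Different bytes and a newer mtime: a real edit.
    src.write_text("pub fn add(a: i64, b: i64) -> i64 { a + b }\n")
    os.utime(src, ns=(t0 + 20 * 10**9, t0 + 20 * 10**9))
    edited_needs_remap = needs_remap(src, record)

print(restored_needs_remap, edited_needs_remap)
```

Checking mtime first keeps the common case free of hashing; the hash only runs for files whose timestamp moved.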

Then the trick is, when sending the code map to AI, I serialize it in a nice, simple markdown format.

- path/to/file.rs
  - summary: ...
  - when to use: ...
  - public types: .., .., ..
  - public functions: .., .., ..

- ...

So the AI does not have to interpret JSON, just clean, structured markdown.
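A minimal sketch of that JSON-to-markdown step, assuming each code-map entry is a dict with the keys named earlier in the thread. The exact layout (nesting, indentation, labels) is a guess based on the outline above, and `render_code_map` is a made-up name.

```python
def render_code_map(entries: list[dict]) -> str:
    """Render code-map entries as a nested markdown list, one top-level item per file."""
    lines: list[str] = []
    for e in entries:
        lines.append(f"- {e['path']}")
        lines.append(f"  - summary: {e['summary']}")
        lines.append(f"  - when to use: {e['when_to_use']}")
        lines.append(f"  - public types: {', '.join(e['public_types'])}")
        lines.append(f"  - public functions: {', '.join(e['public_functions'])}")
    return "\n".join(lines)


# Hypothetical entry, as it might be loaded from the cached JSON file.
entries = [
    {
        "path": "src/model/task.rs",
        "summary": "Task entity and storage helpers.",
        "when_to_use": "Changing how tasks are created or saved.",
        "public_types": ["Task", "TaskForCreate"],
        "public_functions": ["create_task", "list_tasks"],
    }
]

rendered = render_code_map(entries)
print(rendered)
```

The rendered text drops the quotes, braces, and escaping that JSON would add, so the model reads a short outline per file.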

Funny, I worked on this addition to my tool for a week, planning everything, but even today, I am surprised by how well it works.

I have zero sed/grep in my workflow. Just this.

My prompt is pro@coder/coder-prompt.md, the first part is YAML for the globs, and the second part is my prompt.

There is a TUI, but all input and output are files, and the TUI is just there to run it and see the status.

CuriouslyC 139 days ago

1M context is super useful with Gemini, not so much for coding, but for data analysis.

jeremychone 139 days ago

Even there, I use AI to augment rows and build the code to put data into JSON or Polars and create a quick UI to query the data.

speakbits 139 days ago

I think you've hit on the more important point here, which is that keeping the work scoped to a sufficiently focused area gives better results, and that doesn't necessarily need more context.

exceptione 139 days ago

  > - a code map strategy on a big repo. Per file: summary, when_to_use, public_types, public_functions. This is done per file and saved until the file changes. With a concurrency of 32, I can usually code-map a huge repo in minutes. (Typically Flash, cheap, fast, and with very good results)

Thanks, but why use any AI to generate this? I would say: you document your functions in code, types are provided by the compiler service, so it should all be deterministically available in seconds instead of minutes, without burning tokens. Am I missing something?

jeremychone 139 days ago

Very good point. I had two options:

1) Deterministic

  - Using a tree-sitter/AST-like approach, I could extract types, functions, and perhaps comments, and put them into an index map.

  - Cons:

    - The tricky part of this approach is that what I extract can be pretty large per file, for example, comments.

    - Then, I would probably need an agentic synthesis step for those comments anyway.

2) Agentic

  - Since Flash is dirt cheap, I wanted to experiment and skip #1, and go directly to #2.

  - Because my tool is built for concurrency, when set to 32, it's super fast.

  - The price is relatively low, perhaps $1 or $2 for 50k LOC, and it takes 60 to 90 seconds of wall-clock time, which is about 30 to 45 minutes of cumulative AI work.

  - What I get back is relatively consistent by file, size-wise, and it's just one trip per file.
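The concurrency part can be sketched with a semaphore that caps in-flight requests at 32. With 32 requests running at once, 60 to 90 seconds of wall-clock time is at most 32 × 60 s = 32 min to 32 × 90 s = 48 min of summed request time, which is in line with the figures quoted. In the sketch below, `map_file` is a placeholder that sleeps instead of calling a model, and all names are hypothetical.

```python
import asyncio


async def map_file(path: str) -> dict:
    # Placeholder for one model call per file; a real version would send the file to an LLM.
    await asyncio.sleep(0.01)
    return {"path": path, "summary": f"summary of {path}"}


async def map_repo(paths: list[str], concurrency: int = 32) -> list[dict]:
    sem = asyncio.Semaphore(concurrency)

    async def bounded(path: str) -> dict:
        async with sem:  # at most `concurrency` files are in flight at once
            return await map_file(path)

    # gather returns results in the same order as the input paths
    return await asyncio.gather(*(bounded(p) for p in paths))


paths = [f"src/file_{i}.rs" for i in range(100)]
results = asyncio.run(map_repo(paths))
print(len(results), results[0]["path"])
```

Because each file is one independent request, the job parallelizes cleanly and the wall-clock time is driven by the slowest batch rather than the total amount of work.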

So, this is why I started with #2.

And then, the results in real coding scenarios have been astonishing.

Way above what I expected.

The way those indexes get combined with the user prompt gets the right files 95% of the time, and with surprisingly high quality.

So, I might add deterministic aspects to it, but since I think I will need the agentic step anyway, I have deprioritized it.

rafael-lua 139 days ago

Well, out of all the workflows I have seen, this one is rather nice, might give it a try.

I imagine that if the context were committed and kept up to date with CI, it would work for others to use as well.

However, I'm a little confused about the auto-context/globs narrowing part. Do you, the developer, provide them? Or do you feed the full code map plus your prompt to Flash so it returns the globs based on your prompt?

Also, in general, is your map of a file relatively smaller than the file itself, even for very small files?

jeremychone 139 days ago

- The ..-code-map.json files are per "developer folder," which would create too many conflicts if they were kept in Git.

- I have two main globs, which are lists of globs: knowledge_globs and context_globs. Knowledge can be absolute and should be relatively static. context_globs have to be relative to the workspace, since they are the working files.

- As a dev, you provide them in the top YAML section of the coder-prompt.md.

- The auto-context sub-agent calls the code-map sub-agent. Sub-agents can add to or narrow the given globs, and that is the goal of the auto-context agent.
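Here is a rough sketch of the narrowing direction only: developer-provided globs limit what is visible, and the model's file choice is intersected with that visible set. The file list, glob values, and `model_choice` are invented, the model call is replaced by a hard-coded list, and `fnmatch` is used for matching even though its `*` also crosses `/`, which differs from typical workspace glob semantics.

```python
from fnmatch import fnmatch

ALL_FILES = [
    "src/main.rs",
    "src/model/task.rs",
    "src/model/store.rs",
    "src/web/routes.rs",
    "docs/rust-best-practices.md",
]


def visible(paths: list[str], globs: list[str]) -> list[str]:
    """Keep paths matching at least one glob (note: fnmatch's `*` also matches `/`)."""
    return [p for p in paths if any(fnmatch(p, g) for g in globs)]


# Hypothetical globs a developer might put in the YAML section of the prompt file.
context_globs = ["src/model/*.rs", "src/main.rs"]
lens = visible(ALL_FILES, context_globs)

# Stand-in for the auto-context step: the model would return this list after
# reading the code map for the visible files.
model_choice = ["src/model/task.rs", "src/web/routes.rs"]

# Anything the model names outside the lens is dropped.
final_context = [p for p in lens if p in model_choice]

print(lens)
print(final_context)
```

The sketch leaves out the case where a sub-agent adds to the globs; it only shows how the lens bounds what ends up in the final context.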

It looks complicated, but it actually works like a charm.

Hopefully, I answered some of your questions.

I need to make a video about it.

But regardless, I really think it's not about the tools, it's about the techniques. This is where the true value is.

exceptione 139 days ago

  > I need to make a video about it.

My 2ct, I think writing and reading an article is easier.

jeremychone 139 days ago

point taken.

akrauss 139 days ago

Looking forward to some article/video

LuxBennu 139 days ago

Your code map compresses signal on the context side. Same principle applies on the prompt side: prompts that front-load specifics (file, error, expected behavior) resolve in 1-2 turns. Vague ones spiral into 5-6. 1M context doesn't change that — it just gives you more room for the spiral.

Myrmornis 139 days ago

This is interesting but don't you worry that you're competing with entire companies (e.g. Anthropic) and thus it's a losing battle? Since you're re-implementing a bunch of stuff they either do in their harness or have decided it was better not to do?

mh- 139 days ago

I think it's worth remembering that for any offering like that, it necessarily needs to be ~one-size-fits-all, while what you come up with.. doesn't.

They're solving a different problem than you. So I think it's very plausible that you could come up with something that, for your use case, performs considerably better than their "defaults".

sphilipakis 137 days ago

Personally I don't see aipack's pro@coder and other approaches (claude code, cursor, copilot, etc...) as competitors anymore. I use both approaches to solve different problems. I keep using the agentic solutions (claude code style) for more operational tasks, a bit like "smart interfaces to terminal", and pro@coder for coding / engineering tasks where I need a much tighter control over long running work sessions.

ra7 139 days ago

This is fascinating. I feel like this is converging into the concept of a traditional "IDE". So much of your setup reminds me of IDEs indexing, doing static analysis, building ASTs, etc. before a developer starts writing code.

jeremychone 138 days ago

Yes, there is a parallel here. Now, some of those "indexing" steps can be performed by an LLM.

And that does not prevent mixing and matching the two, as some comments in this thread suggest.

Anyway, it's a great time for production coding.

Weryj 139 days ago

My approach has been using static analysis to produce a Mermaid diagram of all Classes:Methods and their caller/callees.
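As a small illustration of the last step, turning caller/callee pairs into Mermaid flowchart text could look like the sketch below. The static analysis that produces the pairs is not shown, the sample edges are invented, and the `Class:Method` labels are kept as display text while the node ids are sanitized.

```python
import re


def node_id(symbol: str) -> str:
    """Mermaid node ids should be plain identifiers, so replace other characters."""
    return re.sub(r"\W", "_", symbol)


def to_mermaid(calls: list[tuple[str, str]]) -> str:
    """Emit one flowchart edge per (caller, callee) pair."""
    lines = ["graph TD"]
    for caller, callee in calls:
        lines.append(
            f'    {node_id(caller)}["{caller}"] --> {node_id(callee)}["{callee}"]'
        )
    return "\n".join(lines)


# Hypothetical output of a static-analysis pass.
calls = [
    ("Repo:load", "Cache:get"),
    ("Repo:load", "Parser:parse"),
]

diagram = to_mermaid(calls)
print(diagram)
```

The resulting text can be pasted into any Mermaid renderer or handed to a model as a compact call graph.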

make_it_sure 139 days ago

Very interested in this approach, and many other people are too, for sure. Please do a blog post.