Hacker News new | ask | show | jobs
by bcyn 392 days ago
Very interested to see what the next steps are to evolve the "retrieval" model - I strongly believe that this is where we'll see the next stepwise improvement in coding models.

Just thinking about how a human engineer approaches a problem. You don't just ingest entire relevant source files into your head's "context" -- well, maybe if your code is broken into very granular files, but often files contain a lot of irrelevant context.

Between architecture diagrams, class relationship diagrams, ASTs, and tracing codepaths through a codebase, there should intuitively be some model of "all relevant context needed to make a code change" - exciting that you all are searching for it.

2 comments

I have a different pov on retrieval. It's a hard problem to solve in a generalizable format with embeddings. I believe this can be solved at a model level where its used to fix an issue. With the model providers (oai, anthropic) going full stack, there is a possibility they solve it at reinforcement learning level. Eg: when you teach a model to solve issues in a codebase, the first step is literally getting the right files. Here basic search (with grep) would work very well as with enough training, you want the model to have an instinct about what to search given a problem. similar to how an experienced dev has that instinct about a given issue. (This might be what the tools like cursor are also looking at). (nothing against anyone, just sharing a pov, i might be wrong)

However, the fast apply model is a thing of beauty. Aider uses it and it's just super accurate and very fast.

Definitely agree with you that it's a problem that will be hard to generalize a solution for, and that the eventual solution is likely not embeddings (at least not alone).
Relevant interview extract from the Claude Code team: https://x.com/pashmerepat/status/1926717705660375463

> Boris from the Claude Code team explains why they ditched RAG for agentic discovery. > "It outperformed everything. By a lot"

This is very cool. They explained the solution better than I did. If I knew, I would have just linked this :)
Adding extra structural information about the codebase is an avenue we're actively exploring. Agentic exploration is a structure-aware system where you're using a frontier model (Claude 4 Sonnet or equivalent) that gives you an implicit binary relevance score based on whatever you're putting into context -- filenames, graph structures, etc.

If a file is "relevant" the agent looks at it and decides if it should keep it in context or not. This process repeats until there's satisfactory context to make changes to the codebase.

The question is whether we actually need a 200b+ parameter model to do this or if we can distill the functionality onto a much smaller, more economical model. A lot of people are already choosing to do it with Gemeni (due to the 1m context window), and they write the code with Claude 4 Sonnet.

Ideally, we want to be able to run this process cheaply in parallel to get really fast generations. That's the ultimate goal we're aiming towards