| HN Mirror

by anotherpaulg 705 days ago

I agree that many AI coding tools have rushed to adopt naive RAG on code.

Have you done any quantitative evaluation of your wiki-style code summaries? My first impression is that they might be too wordy and not deliver valuable context in a token-efficient way.

Aider uses a repository map [0] to deliver code context. Relevant code is identified using a graph optimization on the repository's AST & call graph, not vector similarity as is typical with RAG. The repo map shows the selected code within its AST context.
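To make the graph idea concrete, here is a minimal sketch of ranking files by how heavily the rest of the repo references them. It assumes a PageRank-style score over a file-level reference graph; the comment above only says "graph optimization", so the algorithm choice, the `rank_files` name, and the edge format are all illustrative assumptions, not Aider's actual code.

```python
from collections import defaultdict


def rank_files(references, damping=0.85, iterations=50):
    """Rank files with a plain PageRank power iteration.

    `references` is a list of (source_file, target_file) pairs, meaning
    source_file uses a symbol defined in target_file. This edge format is
    a hypothetical stand-in for whatever an AST/call-graph pass would emit.
    """
    nodes = sorted({name for edge in references for name in edge})
    out_links = defaultdict(list)
    for src, dst in references:
        out_links[src].append(dst)

    # Start from a uniform distribution over all files.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                # Split this file's score across the files it references.
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # A file that references nothing spreads its score evenly.
                share = damping * rank[src] / len(nodes)
                for n in nodes:
                    new_rank[n] += share
        rank = new_rank
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)


edges = [
    ("app.py", "utils.py"),
    ("cli.py", "utils.py"),
    ("app.py", "models.py"),
]
ranked = rank_files(edges)
```

In this toy graph `utils.py` comes out on top because two files depend on it. Unlike vector similarity, the score depends only on the structure of the references, not on how the text of a file reads.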

Aider currently holds the 2nd highest score on the main SWE Bench [1], without doing any code RAG. So there is some evidence that the repo map is effective at helping the LLM understand large code bases.

[0] https://aider.chat/docs/repomap.html

[1] https://aider.chat/2024/06/02/main-swe-bench.html

3 comments

lemming 705 days ago

I've been thinking about this a lot recently. So in Aider, it looks like "importance" is based on just the number of references to a particular file, is that right?

It seems like in a large repo, you'd want to have a summary of, say, each module, and what its main functions are, and allow the LLM to request repo maps of parts of the repo based on those summaries. e.g. in my website project, I have a documentation module, a client side module, a server side module, and a deployment module. It seems like it would be good for the AI to be able to determine that a particular request requires changes to the client and server parts, and just request those.

anotherpaulg 705 days ago

The repo map is computed dynamically, based on the current contents of the coding chat. So "importance" is relative to that, and will pull out the parts of each file which are most relevant to the task at hand.
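One way to picture "importance relative to the chat" is a personalized ranking, where the score is seeded from the files currently under discussion instead of spread uniformly. The sketch below assumes a personalized-PageRank-style iteration; `rank_for_chat`, its parameters, and the edge format are hypothetical and not taken from Aider's source.

```python
def rank_for_chat(references, chat_files, damping=0.85, iterations=50):
    """Rank files relative to the files mentioned in the chat.

    `references` is a list of (source_file, target_file) pairs (a
    hypothetical format). `chat_files` are the files in the current chat;
    all of the restart probability is placed on them.
    """
    chat = set(chat_files)
    nodes = sorted({name for edge in references for name in edge} | chat)
    if not nodes:
        return []

    # Fall back to a uniform restart if the chat mentions no files.
    seeds = [n for n in nodes if n in chat] or nodes
    seed_set = set(seeds)
    teleport = {n: (1.0 / len(seeds) if n in seed_set else 0.0) for n in nodes}

    out_links = {n: [] for n in nodes}
    for src, dst in references:
        out_links[src].append(dst)

    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
            else:
                # Files with no outgoing references send their score
                # back to the chat files.
                for n in nodes:
                    new_rank[n] += damping * rank[src] * teleport[n]
        rank = new_rank
    return sorted(rank.items(), key=lambda kv: kv[1], reverse=True)


edges = [("client.py", "http.py"), ("server.py", "db.py")]
client_view = dict(rank_for_chat(edges, ["client.py"]))
server_view = dict(rank_for_chat(edges, ["server.py"]))
```

The same graph yields different rankings for the two chats: `http.py` outranks `db.py` when `client.py` is being discussed, and the order flips when the chat is about `server.py`.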

lemming 705 days ago

Interesting, how does Aider decide what’s relevant to the chat?

emporas 704 days ago

I had forgotten that Aider uses tree-sitter for syntactic analysis. Happy to find that you already have the tree-sitter queries ready for retrieving code information from source. I was researching how to write the queries myself, for exactly the same purpose as Aider.

manishsharan 704 days ago

I tried using Aider, but my codebase is a mix of Clojure, ClojureScript, and Java. I gave up trying to make it work, as it created more issues for me than it solved. What I really hated about Aider was that it made code changes without my approval.

danenania 704 days ago

You might be interested in my project Plandex[1]. It’s similar to aider in some ways, but one major difference is that proposed changes are accumulated in a version-controlled sandbox rather than being directly applied to project files.

1 - https://github.com/plandex-ai/plandex
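As a rough illustration of the difference between accumulating proposed changes and applying them directly, here is a minimal sketch of a staging area. It is not Plandex's implementation and it leaves out the version-control part entirely; the `PendingChanges` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class PendingChanges:
    """Holds proposed file contents in memory until the user approves them."""

    root: Path
    staged: dict = field(default_factory=dict)

    def propose(self, rel_path: str, new_text: str) -> None:
        # Record the proposal only; nothing on disk is touched here.
        self.staged[rel_path] = new_text

    def review(self) -> list:
        # List the files that would change if the proposals were applied.
        return sorted(self.staged)

    def reject(self, rel_path: str) -> None:
        # Drop one proposal without affecting the others.
        self.staged.pop(rel_path, None)

    def apply(self) -> list:
        # Write the approved proposals into the project and clear the stage.
        written = []
        for rel_path, text in sorted(self.staged.items()):
            target = self.root / rel_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(text)
            written.append(rel_path)
        self.staged.clear()
        return written
```

Project files change only when `apply()` is called, so a user can `review()` and `reject()` proposals first.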

paradite 704 days ago

You can give 16x Prompt a try. It's a GUI desktop app designed for AI coding workflows. It also doesn't automatically make code changes.

https://prompt.16x.engineer/

anotherpaulg 704 days ago

The recommended workflow is to just use /undo to revert any edits that you don’t like.
