Hacker News
by AmiteK 147 days ago
I think the disagreement is about where inference belongs, not whether LLMs are capable.

Git diffs + LLM inference work well for understanding changes once. What I’m targeting is reducing the need to re-infer semantic surface changes every run, especially across large refactors or long-running workflows.

Today, LogicStamp derives deterministic semantic contracts and hashes, and watch mode surfaces explicit change events. The direction this enables is treating those derived facts as a semantic baseline (e.g. drift detection / CI assertions) instead of relying on repeated inference from raw diffs.
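To make the "semantic baseline" idea concrete, here is a minimal sketch of hash-based drift detection. It is not LogicStamp's actual implementation or format: the contract shape, `contract_hash`, and `detect_drift` are all hypothetical names chosen for illustration. The only assumption is that each component's contract is a JSON-serializable dict.

```python
from __future__ import annotations

import hashlib
import json


def contract_hash(contract: dict) -> str:
    """Hash a contract via canonical JSON so key order cannot change the result."""
    canonical = json.dumps(contract, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_drift(baseline: dict[str, str], current: dict[str, dict]) -> dict[str, list[str]]:
    """Compare stored baseline hashes against freshly derived contracts.

    `baseline` maps component name -> stored hash (hypothetical layout).
    `current` maps component name -> contract derived from the current repo state.
    """
    current_hashes = {name: contract_hash(c) for name, c in current.items()}
    shared = current_hashes.keys() & baseline.keys()
    return {
        "added": sorted(current_hashes.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current_hashes.keys()),
        "changed": sorted(n for n in shared if current_hashes[n] != baseline[n]),
    }
```

A CI assertion would then be something like failing the build when `detect_drift(...)` reports any `changed` or `removed` entries that were not explicitly approved. No inference is involved at that step, only a comparison of derived facts.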

By “repeatability” I mean the artifacts, not agent behavior: same repo state + config ⇒ same semantic model. I don’t yet have end-to-end agent performance evals versus AGENTS.md + LSP.


> By “repeatability” I mean the artifacts

> Inference works well per session ... doesn't give artifact ... explicitness and repeatability across runs.

When you write this, it sounds like you are talking about repeatability between inference sessions, and that this artifact enables that. It does not read as though you are applying the repeatability to the artifact itself, which one would assume anyway since it is autogenerated from code via AST walking.

I agree - that’s on me for the wording. I’m not claiming repeatability of agent inference or LLM sessions.

By “repeatability” I mean the extraction itself: given the same repo state + config, the derived semantic artifact is identical every time. That gives CI and agents a stable reference point, but it doesn’t make agent behavior deterministic.
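As a rough analogy for what "the extraction itself is repeatable" means, here is a small sketch using Python's standard `ast` module. This is an assumption-laden stand-in, not how LogicStamp works: `extract_signatures` is a hypothetical helper, and the "artifact" here is just function names and positional parameter names.

```python
import ast


def extract_signatures(source: str) -> dict:
    """Walk the AST and keep only the semantic surface: function names and parameters.

    Formatting, comments, and function bodies are ignored, so the result
    depends only on the declared interface of the code.
    """
    tree = ast.parse(source)
    signatures = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            signatures[node.name] = [arg.arg for arg in node.args.args]
    return signatures


original = "def greet(name, punctuation):\n    return name + punctuation\n"
reformatted = (
    "# same interface, different body and layout\n"
    "def greet(name,   punctuation):\n"
    "    text = name + punctuation\n"
    "    return text\n"
)
changed = "def greet(name):\n    return name\n"

# Same input always yields the same artifact; only interface changes alter it.
same_when_rerun = extract_signatures(original) == extract_signatures(original)
same_when_reformatted = extract_signatures(original) == extract_signatures(reformatted)
differs_when_changed = extract_signatures(original) != extract_signatures(changed)
```

The point of the sketch is only that the artifact is a pure function of the source. What an agent then does with it is a separate, non-deterministic step.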

The value is in not having to re-infer structure from raw source each run - not in making inference runs repeatable.

> The value is in not having to re-infer structure

Except this is not the case. The model has to re-infer structure on every new session; you are providing an index to supposedly speed that up. But the model still has to infer something from your index as it processes the tokenized version; it's not automagically injected into its understanding.

I agree having something like this helps a lot. I don't agree that auto-generating it from the code and providing that comprehensive list to the model is helpful. I tried across this whole spectrum, from no index at all to one as detailed as yours. There is a point where these indexes become more noise than help. Two things follow from that: (1) I keep harping on evals, because mine show a different conclusion than the one you are drawing, and (2) having a curated version of this in the agents.md files was more than sufficient to noticeably improve performance, and the format doesn't matter much.

The other drawback I've experienced from doing this is that the model tends to go look things up based on the index even when it doesn't need them for the task at hand. It ends up making more tool calls and spending more tokens in the long run.