| HN Mirror

	by mehmetkose 176 days ago
	Wasn't CLAUDE.md already doing this?

1 comments

MirrorEthic 175 days ago

CLAUDE.md is static: the same content every session (with a soft 40k character limit).

This is dynamic attention routing. Files get scored based on what you're actively discussing. Mention "auth" and auth-related docs go HOT (full injection), related files go WARM (headers only), and unrelated files stay COLD (evicted).

Scores decay over turns. If you stop talking about auth, those files fade back to COLD automatically.
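For anyone who wants to picture the HOT/WARM/COLD scoring with decay, here is a minimal sketch. It is not claude-cognitive's actual code: the class name, thresholds, decay rate, keyword map, and related-file map are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical tuning values; the thread does not state the real ones.
HOT_THRESHOLD = 0.8
WARM_THRESHOLD = 0.3
DECAY = 0.7        # multiplier applied to every score on each turn
WARM_BOOST = 0.5   # score given to files related to a mentioned file


@dataclass
class AttentionRouter:
    # File path -> keywords that activate it (assumed configuration shape).
    keywords: dict
    # File path -> related files that should warm up alongside it.
    related: dict = field(default_factory=dict)
    scores: dict = field(default_factory=dict)

    def update(self, message: str) -> None:
        """Decay all scores, then boost files matched by this turn's message."""
        text = message.lower()
        for path in self.scores:
            self.scores[path] *= DECAY
        for path, words in self.keywords.items():
            if any(word in text for word in words):
                self.scores[path] = 1.0
                for neighbor in self.related.get(path, []):
                    current = self.scores.get(neighbor, 0.0)
                    self.scores[neighbor] = max(current, WARM_BOOST)

    def tier(self, path: str) -> str:
        """Map a file's current score to HOT, WARM, or COLD."""
        score = self.scores.get(path, 0.0)
        if score >= HOT_THRESHOLD:
            return "HOT"
        if score >= WARM_THRESHOLD:
            return "WARM"
        return "COLD"


router = AttentionRouter(
    keywords={"auth.md": ["auth", "login"], "billing.md": ["invoice"]},
    related={"auth.md": ["sessions.md"]},
)
router.update("Let's fix the auth flow")
print(router.tier("auth.md"), router.tier("sessions.md"), router.tier("billing.md"))
```

With these made-up numbers, a file that was HOT drops to WARM on the next turn that doesn't mention it and reaches COLD after four such turns.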

Plus multi-instance coordination: concurrent Claude sessions share completions and blockers so they don't duplicate work.

There's a 25k character limit on injection, so you compact less and stay focused where it's needed. I've also seen it help a lot with the context wobble that occurs after compacting.

mehmetkose 175 days ago

Thanks, now I have a clear understanding. The thing is, what's the token/result ratio with this extension? Is there any way to benchmark it?

MirrorEthic 171 days ago

Good question. I don't have a formal benchmark yet, but here's what I've measured in practice:

Token reduction: 64-95%, depending on codebase size and work pattern. The variance comes from how many files are in your .claude/ directory and how focused your session is.

How to measure yourself:

1. Check `~/.claude/attention_history.jsonl` after a session
2. Run `python3 ~/.claude/scripts/history.py --since 2h` to see what got injected vs evicted
3. Compare your token counts before/after in Claude Code's usage stats
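The before/after comparison in the last step comes down to simple arithmetic. A minimal sketch follows; the function name and the token counts in the example are made up for illustration and are not measurements from claude-cognitive.

```python
def token_reduction(baseline_tokens: int, routed_tokens: int) -> float:
    """Return the fraction of tokens saved relative to the baseline session."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - routed_tokens / baseline_tokens


# Illustrative numbers only: a 120k-token baseline vs. an 18k-token routed session.
saved = token_reduction(baseline_tokens=120_000, routed_tokens=18_000)
print(f"{saved:.0%} fewer tokens")
```

To make the comparison meaningful, both sessions should run the same task against the same codebase.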

The 25k character injection cap is the key constraint (adjustable). It forces the router to prioritize ruthlessly instead of dumping everything in.
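To show what prioritizing under a character cap could look like, here is a rough sketch. It is an assumption about the general approach, not the project's real implementation: `build_injection`, its parameters, and the rule that WARM "headers" are Markdown lines starting with `#` are all hypothetical.

```python
def build_injection(docs, scores, tiers, limit=25_000):
    """Assemble injected context, highest score first, within a character cap.

    docs:   file path -> full text
    scores: file path -> attention score (higher means more relevant)
    tiers:  file path -> "HOT", "WARM", or "COLD"
    """
    parts = []
    used = 0
    for path in sorted(docs, key=lambda p: scores.get(p, 0.0), reverse=True):
        tier = tiers.get(path, "COLD")
        if tier == "HOT":
            chunk = docs[path]  # full injection
        elif tier == "WARM":
            # Headers only; assumes the docs are Markdown.
            lines = docs[path].splitlines()
            chunk = "\n".join(line for line in lines if line.startswith("#"))
        else:
            continue  # COLD files are evicted
        # Count the blank-line separator so the joined text respects the cap.
        cost = len(chunk) + (2 if parts else 0)
        if used + cost > limit:
            continue  # skip anything that would exceed the budget
        parts.append(chunk)
        used += cost
    return "\n\n".join(parts)


docs = {
    "auth.md": "# Auth\nToken handling notes.",
    "api.md": "# API\nbody text\n## Routes\nmore text",
    "big.md": "y" * 30_000,
}
scores = {"auth.md": 1.0, "big.md": 0.9, "api.md": 0.5}
tiers = {"auth.md": "HOT", "big.md": "HOT", "api.md": "WARM"}
context = build_injection(docs, scores, tiers)
print(len(context))
```

Skipping an oversized file rather than truncating it is one possible design choice; a real router might truncate or demote the file to WARM instead.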

A proper benchmark comparing baseline Claude Code vs claude-cognitive on identical tasks would be useful. If someone wants to build that, I'd happily collaborate.

MirrorEthic 171 days ago

Sorry about the late reply!
