
Good question. I don't have a formal benchmark yet, but here's what I've measured in practice:

Token reduction: 64-95%, depending on codebase size and work pattern. The spread comes from how many files are in your .claude/ directory and how focused your session is.

How to measure it yourself:

1. Check `~/.claude/attention_history.jsonl` after a session.
2. Run `python3 ~/.claude/scripts/history.py --since 2h` to see what got injected vs. evicted.
3. Compare your token counts before/after in Claude Code's usage stats.
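For the last step, the comparison is just a percentage of the baseline. A minimal sketch, assuming you copy the two token totals by hand from the usage stats; the function name and the example numbers are made up for illustration and are not part of claude-cognitive:

```python
def token_reduction_pct(baseline_tokens: int, routed_tokens: int) -> float:
    """Percent fewer tokens used, relative to the baseline session.

    baseline_tokens: total from a session without the router (hypothetical input)
    routed_tokens:   total from a comparable session with the router enabled
    """
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 100.0 * (baseline_tokens - routed_tokens) / baseline_tokens


# Illustrative numbers only, not measurements.
print(f"{token_reduction_pct(100_000, 36_000):.1f}% reduction")
```

This only means something if the two sessions do roughly the same work, which is why a controlled benchmark would be better than eyeballing it.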

The 25k-character injection cap (adjustable) is the key constraint: it forces the router to prioritize ruthlessly instead of dumping everything in.
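To make the effect of a cap concrete, here is a rough sketch of budgeted selection. It is not the actual router code: the `Candidate` type, the `score` field, and the greedy highest-score-first strategy are all assumptions for illustration. Only the 25k-character default comes from the description above.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    path: str     # hypothetical: file the text came from
    score: float  # hypothetical: relevance/attention score, higher is better
    text: str     # content that would be injected


def select_for_injection(candidates, cap_chars=25_000):
    """Greedily pick the highest-scoring candidates that fit under the cap."""
    chosen, used = [], 0
    for c in sorted(candidates, key=lambda c: c.score, reverse=True):
        if used + len(c.text) <= cap_chars:
            chosen.append(c)
            used += len(c.text)
        # Anything that does not fit is skipped, i.e. left out of the context.
    return chosen
```

With a hard budget, a large low-scoring file simply never makes it in, which is the "prioritize ruthlessly" behavior. Raising the cap trades that focus for more context.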

A proper benchmark comparing baseline Claude Code vs claude-cognitive on identical tasks would be useful. If someone wants to build that, I'd happily collaborate.