Hacker News new | ask | show | jobs
by d4rkp4ttern 73 days ago
For me one of the most interesting aspects is how compaction works. It turns out compaction still preserves the full original pre-compaction conversation in the session jsonl file, and those are marked as "not to be sent to the API". Which means, even after compaction, if you think something was lost, you can tell CC to "look in the session log files to find details about what we did with XYZ". I knew this before the leak since it can be seen from the session logs. Some more details:

  The full conversation is preserved in the JSONL file, and messages
  are filtered before being sent to the API.

  Key mechanisms:

  1. JSONL is append-only — old pre-compaction messages are never deleted. New messages (boundary
  marker, summary, attachments) are appended after compaction.
  2. Messages have flags controlling API visibility:
    - isCompactSummary: true — marks the AI-generated summary message
    - isVisibleInTranscriptOnly: true — prevents a message from being sent to the API
    - isMeta — another filter for non-API messages
    - getMessagesAfterCompactBoundary() returns only post-compaction messages for API calls
  3. After compaction, the API sees only:
    - The compact boundary marker
    - The summary message
    - Attachments (file refs, plan, skills)
    - Any new messages after compaction
  4. Three compaction types exist:
    - Full compaction — API summarizes all old messages
    - Session memory compaction — uses extracted session memory as summary (cheaper)
    - Microcompaction — clears old tool result content when cache is cold (>1h idle)
1 comments

What is microcompaction? I didn’t realize there was any thing time based in CC, when I go eat dinner and come back it compacted while I was gone?
I dug into this more. It's disabled by default, and it's a cost/token-usage optimization.

  The logic is:

  1. Anthropic's API has a server-side prompt cache with a 1-hour TTL
  2. When you're actively using a session, each API call reuses the cached prefix — you only pay
  for new tokens
  3. After 1 hour idle, that cache is guaranteed expired
  4. Your next message will re-send and re-process the entire conversation from scratch — every
  token, full price
  5. So if you have 150K tokens of old Grep/Read/Bash outputs sitting in the conversation, you're
  paying to re-ingest all of that even though it's stale context the model probably doesn't need

  The microcompact says: "since we're paying full price anyway, let's shrink the bill by clearing
  the bulky stuff."

  What's preserved vs lost:
  - The tool_use blocks (what tool was called, with what arguments) — kept
  - The tool_result content (the actual output) — replaced with [Old tool result content cleared]
  - The most recent 5 tool results — kept

  So Claude can still see "I ran Grep for foo in src/" but not the 500-line grep output from 2
  hours ago.

  Does it affect quality? Yes, somewhat — but the tradeoff is that without it, you're paying
  potentially tens of thousands of tokens to re-ingest stale tool outputs that the model already
  acted on. And remember, if the conversation is long enough, full compaction would have summarized
   those messages anyway.

  And critically: this is disabled by default (enabled: false in timeBasedMCConfig.ts:31). It's
  behind a GrowthBook feature flag that Anthropic controls server-side. So unless they've flipped
  it on for your account, it's not happening to you.