
I had Claude Code pull the OTEL trace and calculate the cost from the token counts in the responses. I'll double-check later today, though, if I remember.

Edit: I do see the first request shows 0 cache read, 7k cache write tokens. The next request shows 7k cache read, 900 cache write tokens. The agent run summary is:

  usage {
    cache_read_input_tokens 244586
    cache_write_input_tokens 38399
    completion_tokens 8131
    input_tokens 1172
    output_tokens 8131
    prompt_tokens 1172
    total_tokens 292288
  }
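For anyone who wants to redo the math by hand, here is a rough Python sketch of the kind of calculation I mean. The token counts are the ones from the summary above. The per-million-token rates are placeholders I made up for illustration, not the rates for any particular model, and the helper names are hypothetical, so swap in your own pricing.

```python
# Token counts copied from the agent run summary.
usage = {
    "cache_read_input_tokens": 244586,
    "cache_write_input_tokens": 38399,
    "input_tokens": 1172,
    "output_tokens": 8131,
    "total_tokens": 292288,
}

# ASSUMPTION: illustrative USD rates per million tokens.
# These are placeholders, not taken from the trace or the bill.
RATES_PER_MTOK = {
    "cache_read_input_tokens": 0.30,
    "cache_write_input_tokens": 3.75,
    "input_tokens": 3.00,
    "output_tokens": 15.00,
}


def summed_tokens(usage):
    """Add up the four billable buckets to compare against total_tokens."""
    return sum(usage[key] for key in RATES_PER_MTOK)


def estimate_cost(usage, rates):
    """Return the estimated USD cost per bucket for the given rates."""
    return {key: usage[key] * rate / 1_000_000 for key, rate in rates.items()}


def cache_read_share(usage):
    """Fraction of all input-side tokens that were served from the cache."""
    input_side = (
        usage["cache_read_input_tokens"]
        + usage["cache_write_input_tokens"]
        + usage["input_tokens"]
    )
    return usage["cache_read_input_tokens"] / input_side


if __name__ == "__main__":
    # The four buckets should add up to the reported total.
    print("buckets match total:", summed_tokens(usage) == usage["total_tokens"])
    print(f"cache read share of input: {cache_read_share(usage):.1%}")
    costs = estimate_cost(usage, RATES_PER_MTOK)
    for bucket, cost in costs.items():
        print(f"{bucket}: ${cost:.4f}")
    print(f"estimated total: ${sum(costs.values()):.4f}")
```

The four buckets do sum to the reported total_tokens, which is a decent sanity check that cache reads and writes are counted separately from input_tokens in this summary.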

I do see a recent issue in the Strands Agent issue tracker about the 1hr TTL getting ignored and defaulting to a 5m TTL. I haven't validated the cache TTL, but these agent runs take ~2-3m, so a 5m TTL is sufficient.

I also checked the AWS bill and see separate usage SKUs:

  USE1-MP:USE1_CacheWriteInputTokenCount-Units $0.34
  USE1-MP:USE1_OutputTokenCount-Units $0.27
  USE1-MP:USE1_CacheReadInputTokenCount-Units $0.16
  USE1-MP:USE1_InputTokenCount-Units $0.01
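To see how the bill splits across those SKUs, a quick sketch like this works. The amounts are the ones from the bill lines above, stored as whole cents to avoid float rounding. The shortened SKU labels and the function name are my own shorthand.

```python
# Billed amounts per SKU in cents, copied from the bill lines.
bill_cents = {
    "CacheWriteInputTokenCount": 34,
    "OutputTokenCount": 27,
    "CacheReadInputTokenCount": 16,
    "InputTokenCount": 1,
}


def bill_shares(bill):
    """Return each SKU's fraction of the total billed amount."""
    total = sum(bill.values())
    return {sku: cents / total for sku, cents in bill.items()}


if __name__ == "__main__":
    total_cents = sum(bill_cents.values())
    print(f"total: ${total_cents / 100:.2f}")
    for sku, share in bill_shares(bill_cents).items():
        print(f"{sku}: {share:.1%}")
```

On these numbers, cache writes are the largest single line item, ahead of output tokens, and uncached input is close to negligible.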