| I had Claude Code pull the OTEL trace and calculate the cost from the token counts in the responses. I'll double-check later today if I remember.
Edit:
I do see the first request shows 0 cache read, 7k cache write tokens. The next request shows 7k cache read, 900 cache write tokens. The agent run summary is:

usage {
  cache_read_input_tokens: 244586
  cache_write_input_tokens: 38399
  completion_tokens: 8131
  input_tokens: 1172
  output_tokens: 8131
  prompt_tokens: 1172
  total_tokens: 292288
}

I do see a recent issue in the Strands Agent issue tracker about 1hr TTL getting ignored and defaulting to 5m TTL. I haven't validated cache TTL but these agent runs take ~2-3m so a 5m TTL is sufficient.

I also checked the AWS bill and see separate Usage SKUs:

USE1-MP:USE1_CacheWriteInputTokenCount-Units: $0.34
USE1-MP:USE1_OutputTokenCount-Units: $0.27
USE1-MP:USE1_CacheReadInputTokenCount-Units: $0.16
USE1-MP:USE1_InputTokenCount-Units: $0.01 |
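
For anyone who wants to redo the arithmetic by hand, here is a rough sketch of the per-category cost calculation from the run summary. The token counts are the ones quoted above; the per-million-token rates are placeholders I made up for illustration (they are not taken from the bill or from any price list in this thread), and `cost_breakdown` is a hypothetical helper, so swap in the real rates for your model and region.

```python
# Token counts copied from the agent run summary above.
USAGE = {
    "input_tokens": 1172,
    "output_tokens": 8131,
    "cache_read_input_tokens": 244586,
    "cache_write_input_tokens": 38399,
    "total_tokens": 292288,
}

# PLACEHOLDER rates in USD per million tokens. These are assumptions for
# illustration only; replace them with the actual rates you are billed at.
PLACEHOLDER_RATES = {
    "input_tokens": 3.00,
    "output_tokens": 15.00,
    "cache_read_input_tokens": 0.30,
    "cache_write_input_tokens": 3.75,
}


def cost_breakdown(usage, rates_per_mtok):
    """Return the USD cost per token category for one usage record."""
    return {
        category: usage[category] * rate / 1_000_000
        for category, rate in rates_per_mtok.items()
    }


# Sanity check: the four billed categories should add up to total_tokens.
billed_tokens = sum(USAGE[category] for category in PLACEHOLDER_RATES)
assert billed_tokens == USAGE["total_tokens"]

breakdown = cost_breakdown(USAGE, PLACEHOLDER_RATES)
total_cost = sum(breakdown.values())

for category, cost in breakdown.items():
    print(f"{category}: ${cost:.4f}")
print(f"total: ${total_cost:.4f}")
```

The sum check is the useful part regardless of rates: it confirms that `total_tokens` already includes cache reads and cache writes, so the cached tokens should not be added on top of it again when estimating cost.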