| > 1. Gamified Savings vs. Your Actual API Bill Tool use output represents a large amount of my output. I'll take 3.7M tokens saved on 3.9M tokens of input. Tokens saved are tokens saved. > 3. Where Are the Accuracy Benchmarks? As a user of RTK, it would be nice to see accuracy benchmarks. However, I've seen no evidence of the model missing anything critical as a result of the compression. As part of their design philosophy they are very strict about preserving correctness to the point that if a filter fails they fall back to raw output. For my most frequently used commands I've inspected the source, was happy with what I saw, they've earned my trust thus far. > The day git, cargo, npm, or grep updates its terminal formatting by a few spaces or changes an error layout, RTK's regex and parsing filters will break. And returning to the silent failure trap, it won't throw an explicit error; it will fail quietly, feeding corrupted or partial text to your agent. Again, any filter that fails simply falls back to the raw output. One of their core pillars is avoiding this exact scenario you described. RTK should never feed corrupted or partial text to an agent. Your concerns are fair but I'd like to see your criticism backed up with evidence. Have you used RTK? Have you found evidence that they are failing to preserve correctness? |