by anupamchugh
130 days ago
Following up: I built a tool, "wobble" [1], to measure this. It parses the ~/.claude/projects/*.jsonl session transcripts, extracts skill invocations plus the commands actually executed, and calculates Bias/Variance per the paper's formula. I ran it on my sessions.

Result: none of the skills scored STABLE. The structural predictors of high variance:
- Numbered steps without a clear default
- Options without a (default) marker
- Content >4k chars (overthinking zone)
- Missing constraint language

[1] https://github.com/anupamchugh/shadowbook (bd wobble)
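For anyone who wants to check their own skill files against those four predictors, here is a rough sketch in Python. It is not how wobble does it. The function name `structural_flags`, the regexes, the constraint word list, and reading ">4k" as more than 4000 characters are all my assumptions for illustration.

```python
import re

# Assumption: a small, hypothetical list of words that count as "constraint language".
CONSTRAINT_WORDS = ("must", "never", "always", "only", "do not")

# Assumption: ">4k chars" is read as more than 4000 characters.
MAX_CHARS = 4000


def structural_flags(skill_text: str) -> list[str]:
    """Return which of the four structural predictors a skill text trips."""
    flags = []
    lower = skill_text.lower()

    # Lines that start like "1. " or "2) " are treated as numbered steps.
    has_numbered = re.search(r"^\s*\d+[.)]\s", skill_text, re.MULTILINE) is not None
    # Lines that start with "Option" (optionally bulleted) are treated as options.
    has_options = re.search(
        r"^\s*(?:[-*]\s*)?option\b", skill_text, re.MULTILINE | re.IGNORECASE
    ) is not None

    if has_numbered and "default" not in lower:
        flags.append("numbered steps without clear default")
    if has_options and "(default)" not in lower:
        flags.append("options without (default) marker")
    if len(skill_text) > MAX_CHARS:
        flags.append("content >4k chars")
    if not any(re.search(rf"\b{re.escape(w)}\b", lower) for w in CONSTRAINT_WORDS):
        flags.append("missing constraint language")
    return flags


if __name__ == "__main__":
    sample = "1. Run tests\n2. Deploy\nOption A: fast\nOption B: safe\n"
    for flag in structural_flags(sample):
        print(flag)
```

These are crude text heuristics, so treat a flag as a prompt to reread the skill, not as a variance measurement.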