| I hate how seriously people take the output of an LLM or how reliable they think it is. Have Claude produce that spec 10 times, using the same prompt and same context. Identical requests, but you'll get 10 unique answers that will contradict each other, with each response seeming extremely confident. It's scary how confident you people are in these outputs. |
There are real decisions to be made when going from a vague prompt to a spec. It's not surprising that an LLM would produce different specs for the same work on different runs. If the prompt already contained answers to all the decision points that come up when writing the spec, then the prompt would already be the spec itself.