by musebox35
51 days ago
They say they did test, but the coverage was not broad enough to catch it, at least for the prompt change: “After multiple weeks of internal testing and no regressions in the set of evaluations we ran, we felt confident about the change and shipped it alongside Opus 4.7 on April 16. As part of this investigation, we ran more ablations (removing lines from the system prompt to understand the impact of each line) using a broader set of evaluations. One of these evaluations showed a 3% drop for both Opus 4.6 and 4.7. We immediately reverted the prompt as part of the April 20 release.” Considering the number and scope of users they serve, I can sympathize with the difficulty. However, they should reimburse affected users, at least partially, instead of just announcing “our bad, sorry”. That would reduce the frustration.