by raylad
59 days ago
That was more than one task. It was three. I also had Opus 4.7 and Opus 4.6 audit a very long document using identical prompts, then had Codex 5.4 compare the two audits. Codex found that 4.6 did a far better job, and that 4.7 had missed things and added spurious information. I then asked a new session of Opus 4.7 whether it agreed or disagreed with the Codex comparison, and it agreed. So did I.