Hacker News new | ask | show | jobs
by cinntaile 18 days ago
You have to keep an eye on them, but they don't just make tests pass.
1 comments

Claude sonnet 4 (this time last year) did do this. It once made simulation if a test script passing. Literally a script that just echoed test names and then said pass.
Change happens fast, a year old model is pretty outdated.

I'm sure it can happen, hence why I said to keep an eye out. Its main mode of operation is not to cook the tests however.

Happened to me, 3 days ago - deleted some tests and flipped assertions after outlining that it wasn't to change any assertions.

Our team was doing a similar task to move between test frameworks, and I had to do a git diff of hundreds of thousands of lines to try and work out where a test had disappeared to.

> 3 days ago

Your fault. You should have used a model from 0.000005 seconds ago!

Reading is difficult.
> Change happens fast, a year old model is pretty outdated.

What change? That you should not fake the results of a test because that defeats the whole purpose of a test has been known before there were computers.

I don't know, the weather?