Hacker News new | ask | show | jobs
by WhatsName 79 days ago
I hate the tendency to make things up and I dont mean hallucinations.

I once had claude code write a python script that emulated the output of my training script, including pretending that loss is decreasing. Why? Because it was unable to install a python dependency.

Everytime I use a coding agent, I need to double check that it's not cutting corners, hard coding things that shouldn't be or straight up rewriting failing test cases.

What I need more is honesty.

2 comments

Their "this isn't working as expected" failure mode looks very familiar, from thousands of hours of dealing with outdated (and often just bad) internet recommendations over decades.

Last night the chatbot was unable to make a thing work, so it ended up spiraling and doing things like ripping all authentication out of a service. Then it just waved it off as, "it's just a dev service it'll be fine." I didn't tell it to rip all the authentication out, it just got frustrated. Yeah, I'm anthropomorphizing, but that's essentially how it went.

When the bot says, "I'm ready to throw something. Just rip it out and I'll deal with it another day." I can't help but sympathize.

That's an issue I find too

Its like the agent must succeed at all costs, even if it means doing some insane solution

It needs to just straight up fail sometimes but its like the models are not trained to allow that