Hacker News
by SunshineTheCat 66 days ago
Ok, I thought I was going insane. On the last two larger coding tasks I gave Claude Code, it left about 35% of my request completely undone or done sloppily.

Because of this, on the next larger task I gave it, I ran its work through Codex, which identified 7 glaring unfinished parts of the task.

The trend was starting a part of the task but then leaving a "skeleton" of what I had requested without any of the actual working parts.

The way I would describe it is a kid cramming his 3-month project into a Sunday evening for Monday's due date.

2 comments

Today Claude asked if I "wanted to leave this until tomorrow" as it was a "big rework", then stopped, requiring me to tell it to continue multiple times. That seemed kinda weird to me: it doesn't have any context about the time of day or my working hours (I'd only just started, for one).

I have no idea what led it to ask that, whether something in its training data or its prompts, but it's very much "not a useful result".

I don't remember seeing anything similar, but have only been using Claude on and off for 6 months or so.

Mother Anthropic needs more compute for their Mythos Model, so it phones home to tell her millions of Claude harnesses to manipulate their human users into not wasting more precious compute and instead calling it a day for now.

This has been the problem with every new model coming out, in my experience. You can almost predict that they are testing a new model by how dumb the current one suddenly becomes.

I created an account today to ask "Why?" -- Why are you using this tool? It's consistently producing subpar work, to the point that you're using _another_ (probably equally inferior) tool to check the previous output?

This is something I see all the time with AI consumers and I am continuously baffled. If anything else (autocomplete, IntelliSense, etc.) produced this much garbage, it would be immediately abandoned. Why is there such a high tolerance for the chatbot equivalent?