| I’m literally fed up with half-witted techbros throwing around such overhyped writing. They’re absolutely detached from reality and talk about things that don't pass any fact check. After reading this piece, I was really curious: "have we really got a breakthrough?" So I subscribed to Claude to check the state of things personally. As my personal test, I use a task that should be relatively simple for an experienced programmer: write an SSH agent that manages keys. The protocol is very simple, and such agents have existed for decades. I had already tried this with Claude a year ago, and I was curious how it would do this time. I installed the Claude Code CLI tool and created a CLAUDE.md with instructions along the lines of "this project is to build a fully functional SSH agent application that listens on a UNIX socket and implements the SSH agent protocol according to RFC specification <link> and all functionality must be covered with tests, create a plan, track progress, etc. The ultimate goal is to have a fully functional application with all SSH agent features." There were more details, of course, but you get the idea. The Opus 4.6 model worked autonomously for 30 minutes or so. Along the way it kept producing non-compilable code, which it then had to fix. Then it stopped at a half-baked project that didn't come even close to working as an SSH agent and had zero tests, while pretending the job was done. A truly intelligent technology should finish the project. Why did it stop? How is it intelligent if it makes mistakes that it then needs to fix? Why can’t it finish the project if it understands the goal? How hard must I push it to finish a project? How can someone claim that "this" is going to replace us? I can't buy anyone's claim that “we are one step away from replacing everyone’s jobs" when the facts say that nothing has changed in the last year and LLMs are literally incapable of finishing things or concentrating on small but important details.
They always miss something important. What are we talking about here? Snake oil. This is not intelligence; the technology is a blurred snapshot of the world compressed into a blob of a couple of gigabytes, and it acts like a dumb parrot. |