


	by hansmayer 15 days ago
	Their CEO claims a lot of wild shit. He claimed in January this year that in about 2-3 weeks from now, i.e. "in 6 months", AI would be doing all SWE work. Let's hold these people accountable for a change!

3 comments

aspenmartin 15 days ago

> "in 6 months" that AI will be doing all of SWE work

I assume this is the quote you're referring to from Davos?

"I have engineers within Anthropic who say I don’t write any code anymore. I just let the model write the code, I edit it. I do the things around it… we might be six to twelve months away from when the model is doing most, maybe all of what SWEs do end to end."

That was in Jan; he said "might" and he said 6-12 months. Yes! Let's hold him accountable for saying reasonable things!

hansmayer 15 days ago

Reasonable things? He has said the same shit over and over for the last several years. Yes, let's hold him accountable - you don't make such "oopsies" accidentally, several times in a row.

aspenmartin 15 days ago

Seems pretty reasonable to me. Timescales are hard for anyone to predict. He is forced to do these predictions to know how much compute to buy in advance. Surprisingly, he underbought compute and now has to scramble to secure it from xAI or wherever he can. So he was overly conservative...

hansmayer 15 days ago

> Timescales are hard for anyone to predict

Indeed. That's why serious people are very careful, even if they are not running a company supposedly worth 1T USD

> He is forced to do these predictions to know how much compute to buy in advance

Ah well, that explains it. For my company's next quarter, I'll just pull some random numbers out of my ass so we can make plans with material impact on company business based on that.

aspenmartin 15 days ago

> That's why serious people are very careful, even if they are not running a company supposedly worth 1T USD

10x revenue growth per year, even more this year... his predictions about when agents will take over SWE e2e work are his speculations, relevant because people care about what he thinks, as he is closer than anyone to the leading edge of the technology. It's also important for him to be as accurate as he can about this because he has to put his money where his mouth is. He has to sign for the right amount of compute, otherwise he screws himself. He got it wrong in the opposite direction from the one you're implying, so at this point it sounds like you are more interested in your axe to grind than the truth on the ground.

You think enterprises are adopting CC because they think "oh this will replace my SWEs I can fire them"? That's not happening at major companies. They buy CC because it's useful and the writing is so clearly on the wall in so many data points that to suggest otherwise is a bit silly at this point.

> For my company's next quarter, I'll just pull some random numbers out of my ass so we can make plans with material impact on company business based on that.

You, as a leader of a company, don't have to make predictions? Don't have to make bets about what the best thing for you to do next year is? That must be incredibly nice.

Amodei and everyone else need to plan compute and plan their products and roadmap. You want him to....not do that?

hansmayer 15 days ago

> 10x revenue growth per year

To the stunning tune of 5B over its lifetime.

> You think enterprises are adopting CC because they think "oh this will replace my SWEs I can fire them"?

Yeah, that's actually Dario's main talking point.

> They buy CC because it's useful and the writing is so clearly on the wall in so many data points that to suggest otherwise is a bit silly at this point

Right, really sound arguments - writing is "clearly on the wall" and there are "so many data points". I'd be keen to use those immediately, but I am kind of missing the key of the "many data points" - namely, what did you build with it and how much ARR is it generating?

> You, as a leader of a company, don't have to make predictions

I have to make predictions, but not confabulations, lies and idiocies.

> Amodei and everyone else need to plan compute

FOR WHAT? Again, what was built with their shitty product in various companies and how much ARR did it generate? Uber seems to get no value out of it.

supern0va 15 days ago

I work in big tech and probably 90% of code over the last month has been written by AI. And I suspect it's probably higher within Anthropic, which is probably what he's basing his opinion on.

So, he's closer to correct than not.

That said, your recollection is also flawed. It was in mid-March, and here are the relevant quotes:

>I think we’ll be there in three to six months—where AI is writing 90 percent of the code. And then in twelve months, we may be in a world where AI is writing essentially all of the code.

[...]

>But the programmer still needs to specify, you know, what are—what are the conditions of what you’re doing, what—you know, what is the overall app you’re trying to make, what’s the overall design decision? How do we collaborate with other code that’s been written? You know, how do we have some common sense on whether this is a secure design or an insecure design?

[...]

>So as long as there are these small pieces that a programmer, a human programmer, needs to do, the AI isn’t good at, I think human productivity will actually be enhanced. But on the other hand, I think that eventually all those little islands will get picked off by AI systems.

With another 3-4 months left on the clock, his prediction seems remarkably on point for at least certain organizations and domains.

I welcome you to also hold yourself accountable in the coming months if this trend continues. ;)

pier25 15 days ago

> And I suspect it's probably higher within Anthropic

That probably explains why their uptime and reliability are so bad.

m1coti 15 days ago

Written, but was it reviewed? Do you need to edit code written by an LLM?

I agree that most things are written by AI, but writing code was never the bottleneck in big tech.

supern0va 15 days ago

Yep! We have a review process where we have a few agents, each tuned to a particular domain of expertise (security, code quality, etc.), which iterate until the feedback meets a certain threshold, at which point it goes over to humans for (hopefully) final review.

That said, I generally agree that you're correct: writing code in many ways has not been the biggest bottleneck. However, by removing much of that writing, it frees up engineers to work on the uniquely human things that are larger bottlenecks.

I had a few comments in a thread here touching on where I think most of the value has come from for us (which is largely search/understanding of our dependencies and making away team work far more viable, which aids with cutting through bureaucracy and the tendency for teams to push back on work): https://news.ycombinator.com/item?id=48298731

hansmayer 15 days ago

Haven't you heard - these days they just throw slop generated by LLM agents over to other LLM agents which cosplay as internal QA. They know it works because they write really strict .MD files where they instruct agents in English to 'never do this' and 'always do that'.

aspenmartin 15 days ago

This is really what you think happens at large tech companies? You don't think it's possible this is maybe even slightly oversimplifying what the relevant processes are?

hansmayer 15 days ago

Read the other comment in the thread. Your buddy literally confirmed exactly what I wrote.

aspenmartin 15 days ago

Your comment does indicate you don’t really seek to know how things work here, and you seem unable to imagine that the Occam’s razor explanation is: agents are more useful than you think they are.

supern0va 15 days ago

>Read the other comment in the thread. Your buddy literally confirmed exactly what I wrote.

Please engage in good faith. I commented that humans are the final step of the review process.

hansmayer 15 days ago

> I welcome you to also hold yourself accountable in the coming months if this trend continues. ;)

My company did not swallow hundreds of billions in shady investment deals and is not publicly traded. We work with real money, and the revenue on our books is the revenue that is actually booked, not fake revenue we hope might materialize in 2 years' time. So no, I am not going to hold myself accountable. But people who work with other people's money should absolutely be held accountable when their wild imaginations don't come true, repeatedly, quarter after quarter, year after year!

aspenmartin 15 days ago

I think he means hold yourself accountable when it turns out your predictions and pessimism don't age well.

hansmayer 15 days ago

Mate, for 5 years I've been hearing that crap. I am not predicting anything; on the contrary, the AI-boosting bunch is. When are your predictions coming true?

supern0va 15 days ago

AFAIK, most predictions from several years ago were for...approximately now to within the next few years. Can you be more specific?

You criticized a very specific (and fake/misquoted) prediction, ignored the correction, and are now criticizing vague hand-wavy "predictions" that you have left unspecified.

Can you please stop with the angry/ranty replies and actually have a real conversation grounded in actual facts?

Now, having said all of the above...I'll also point out that these are predictions, not promises/guarantees. These people are being asked to forecast and are doing so. I hardly think they should be held responsible for not being literal oracles, but even so--please, at least quote them correctly/at all.

In short: be better than the hallucinations you seem to call out in the models.

aspenmartin 15 days ago

What predictions, sorry?

supern0va 15 days ago

I will note that you have essentially not responded to anything specific in my comment, nor at least acknowledged that you misstated Dario Amodei's actual prediction.

sampli 15 days ago

Elon playbook