by oudlys 16 hours ago
>Respectfully, your link is not very convincing.

I'd love to understand why. This would be valuable feedback for me as I try to make my writing and exposition better. Also, if you have other data, that also would be valuable for me to know.

>if you believe what you believe, you should also acknowledge that AI doesn’t need regulations in the context Dario is proposing since obviously AI can’t do anything he predicts. Do you agree?

I think you misunderstand my beliefs. On net I think how we're using LLMs destroys value. That doesn't mean no one ever gets value from LLM use.

My particular point about the trillion dollars is this: the main place Anthropic, OpenAI, and (hilariously) SpaceX think they will drive value creation is in enterprise applications. In that domain I think the evidence is very convincingly negative. I'm certainly not the only person who thinks this. It's pretty well accepted in economics right now that there is no observed organizational-level productivity improvement. Opinion splits on whether it will show up eventually or whether we will wait forever.

My belief about LLM value is that it's most useful for individuals and small teams. Places where coordination and trust are easily established and feedback loops to value creation are tight. They are "short range" as it were.

Their value starts to erode as soon as a user becomes disconnected from the point of direct value creation. Which is pretty much everyone who works inside of a large organization. It becomes negative at pretty small scale, IMO. I do think there are patterns of use that could drive value at these scales. I talk about that in my post.

On bioweapons in particular, I could see small teams of people working to build something very dangerous. Having spent my formative academic years in a biochemistry and microbiology lab, though, I do think the danger is overstated. Papers are not know-how or equipment. There's a lot of tacit knowledge that can't be written down and is super hard to acquire.

But, I'd be happy for us to regulate AI for dangerous applications.

My question would be - why would Anthropic build something they so clearly think is dangerous? If they were really building something deserving of the valuation they have, why build applications like this?

To my eyes - it's super weird that a company would build something they think is dangerous and turn around and beg the governments of the world to stop them. That's really strange behavior from my perspective.

2 comments

You have engaged in good faith discourse, thanks! I'll reply in a bit.
I went through your post on Substack (I think that's what you were referring to).

> I'd love to understand why. This would be valuable feedback for me as I try to make my writing and exposition better. Also, if you have other data, that also would be valuable for me to know.

I think it comes down to a few things:

- you took a single report that agreed with your statistics; for the sake of argument, let's say I buy it completely

- you suggest that net value is lost simply because there are more incidents. this is a big jump

- you say that historically different technological improvements may have had similar patterns but this specific one is different because AI is stochastic

So it all really rests on you finding one distinction with AI and then disagreeing with the past trends.

I agree AI is stochastic and I'll put it this way: it is a high variance bet but it pays off. This is a bit hard for people to understand: it's a tool that sometimes works really nicely and fails other times. Overall you are better off using it, but you need to use it enough to reduce variance.

Let me ask this: if you are so sure this won't lead to enterprise level productivity, how do you think this will show in macro trends? Surely you must believe that the valuations must drop wouldn't you? Can you come up with a concrete future scenario that would vindicate your opinion that AI doesn't make enterprises more productive?

> My question would be - why would Anthropic build something they so clearly think is dangerous? If they were really building something deserving of the valuation they have, why build applications like this?

I think this is a fair and interesting question. Here is what I think they think: If they don't build it, someone else might do it. And they think they are more moral than others. If they have a head start, they can set the political and regulatory landscape.

>you took a single report that agreed with your statistics

These are not my statistics. I'm not affiliated with Faros at all. I built an analysis on top of their reporting.

And, it's also not one report. DORA has tracked statistics with respect to throughput and quality as well. Those indicators are flat for throughput and negative for quality. The throughput flatness is also supported by the shovelware data.

I discuss both of those lines in How I'm thinking: https://unessays.substack.com/p/how-im-thinking-about-the-va...

>you suggest that net value is lost simply because there are more incidents. this is a big jump

I don't think it's a big jump at all. Incidents and bugs drive rework. Rework has to be subtracted from throughput. Product throughput is the only thing people pay for.

This type of analysis is done all the time in manufacturing and devops. Here's a link for you: https://reworkcost.com/benchmarks. I'm not bringing novel intellectual ideas to the table here.

Faros reports a 16% throughput improvement on PRs. They also report an 860% code churn increase. If you assign only 9% of that increase to wasteful rework, then the absolute throughput improvement disappears. This is a very simple, straightforward analysis of the operations data reported by Faros.
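
One way to make that arithmetic explicit is the sketch below. It is a rough model under assumptions that are not from the Faros report: rework is subtracted from throughput in the same units, and `baseline_churn_ratio` (churn as a fraction of throughput before LLM adoption) is a hypothetical parameter. Rather than guess that ratio, the sketch solves for the value at which the 16% gain nets out to zero.

```python
# Hedged sketch: all function and parameter names are illustrative,
# and the subtraction model itself is an assumption, not Faros's method.

THROUGHPUT_GAIN = 0.16   # 16% PR throughput improvement (figure quoted above)
CHURN_INCREASE = 8.60    # 860% code churn increase (figure quoted above)
WASTEFUL_SHARE = 0.09    # share of the churn increase treated as wasteful rework


def net_throughput_change(throughput_gain, churn_increase, wasteful_share,
                          baseline_churn_ratio):
    """Net change in throughput as a fraction of baseline throughput.

    baseline_churn_ratio is hypothetical: pre-adoption churn divided by
    pre-adoption throughput, both measured in the same units.
    """
    rework = churn_increase * wasteful_share * baseline_churn_ratio
    return throughput_gain - rework


def break_even_baseline_churn(throughput_gain, churn_increase, wasteful_share):
    """Baseline churn ratio at which the throughput gain is fully cancelled."""
    return throughput_gain / (churn_increase * wasteful_share)


if __name__ == "__main__":
    ratio = break_even_baseline_churn(THROUGHPUT_GAIN, CHURN_INCREASE,
                                      WASTEFUL_SHARE)
    print(f"break-even baseline churn ratio: {ratio:.3f}")
```

Under this model, an organization whose pre-adoption churn ratio sits above the break-even value ends up with a negative net change, and one below it keeps some of the gain.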

> - you say that historically different technological improvements may have had similar patterns but this specific one is different because AI is stochastic

I'm saying LLMs are unreliable. I think we agree on that front, you say:

>I agree AI is stochastic and I'll put it this way: it is a high variance bet but it pays off.

What I'm disputing is the "pays off" statement. That statement is amenable to validation with data. In my view, the data says it doesn't pay off, and says so very clearly across distinct lines of evidence.

>if you are so sure this won't lead to enterprise level productivity, how do you think this will show in macro trends? Surely you must believe that the valuations must drop wouldn't you? Can you come up with a concrete future scenario that would vindicate your opinion that AI doesn't make enterprises more productive?

I think LLMs can deliver value in the enterprise. I think the way to do that is to use them as quality checks and not as primary authors of intellectual work - like writing code.

Unfortunately, this use case would not support the expected 2-10x productivity increases that current valuations depend on. I do expect a major market correction in the near future. It would not surprise me if OpenAI or Anthropic are acquired. I think we're at risk of that happening within the next 1-7 months.

What would invalidate my beliefs? 1. Actual micro or macroeconomic data indicating economic productivity is increasing. 2. A Faros-like observational study demonstrating sustained throughput improvement with significantly less rework and fewer quality impacts.

I think I could be swayed against the market correction if the financials of OpenAI or Anthropic are strong. I'm anticipating they will be quite bad. I think Mythos was very expensive to train and I think the improvements in capability are sublinear. The inference costs are incredibly high.

I also have ideas about how Anthropic and OpenAI are trying to change their business models into enterprise transformation plays. Similar to Palantir. But this comment is already long.

>If they don't build it, someone else might do it.

No players in the market other than US tech companies have the capital or the technology to train models with the power of Fable. The Chinese model builders build their models by distilling from US models. So Anthropic, by building Mythos with all this bio data, has created the possibility that other actors can distill their models and do harm with them. (That is not to say the Chinese are seeking to build weapons, but actors with their models might.)