| HN Mirror

by symfrog 751 days ago

We work with LLMs on a daily basis to solve business use cases. From our work, LLMs seem to be nowhere close to being able to independently solve end-to-end business processes; in every use case they need excessive hand-holding (output validation, manual review, etc.). I often find myself thinking that a use case would be solved faster and cheaper using other ML approaches.

Using LLMs to replace work in its entirety seems to be a stretch of the imagination at this point, unless an academic breakthrough that goes beyond the current approach is discovered, and such breakthroughs typically have an unknown timeline.

I just don't see how companies like Anthropic/OpenAI are drawing these conclusions given the current state.

6 comments

ben_w 751 days ago

The developers may well be Clever Hans-ing themselves, seeing capabilities that the models don't really have.

But the… ah, this is ironic, the anthropic principle applies here:

> From our work, LLMs seem to be nowhere close to being able to independently solve end-to-end business processes

If there were an AI that could do that, your job would no longer exist, just as happened with other professions before yours (weavers, potters, computers: https://en.wikipedia.org/wiki/Computer_(occupation)), and there are people complaining that even current LLMs and diffusion models have forced them to change career.

> I just don't see how companies like Anthropic/OpenAI are drawing these conclusions given the current state.

If you look at the current public models, you are correct. They're not looking at the current public models.

Look at what people say on this very site — complaining that models have been "lobotomised" (I dislike this analogy, but whatever) "in the name of safety" — and ask yourself: what could these models do before public release?

Look at how long the gap was between the initial GPT-4 training and the completion of the red-teaming and other safety work, and ask yourself what new thing they know about that isn't public knowledge yet.

But also take what you know now about publicly available AI in June 2024, and ask yourself how far back in time you'd have to go for this to seem like unachievable SciFi nonsense — 3 years sounds about right…

… but also, there's no guarantee that we get any particular schedule for improvements, even setting aside that most of the top AI researchers have signed open letters saying "we want to agree to slow down capabilities research and focus on safety". The AI that can take your job, that can "independently solve end-to-end business processes", may be 20 years away, or it may already exist and be kept under NDA because the creators can't separate good businesses from evil ones any more than cryptographers can separate good secrets from evil ones.

tivert 751 days ago

> If you look at the current public models, you are correct. They're not looking at the current public models.

> Look at what people say on this very site — complaining that models have been "lobotomised" (I dislike this analogy, but whatever) "in the name of safety" — and ask yourself: what could these models do before public release?

Give politically incorrect answers and cause other kinds of PR problems?

I don't think it's reasonable to take "lobotomised" to mean the models had more general capability before their "lobotomization," which you seem to be implying.

ben_w 751 days ago

> Give politically incorrect answers and cause other kinds of PR problems?

If by that you mean "will explain in detail how to make chemical weapons, commit fraud, automate the production of material intended to incite genocide" etc.

You might want to argue they're not good enough to pose a risk yet — and perhaps they still wouldn't be dangerously competent even without these restrictions — but even if so, consider that Facebook, with a much simpler AI behind its feed, was blamed for not being able to prevent its systems being used for the organisation of the (still ongoing) genocide in Myanmar: tools, all tools including AI, make it easier to get stuff done.

> I don't think it's reasonable to take "lobotomised" to mean the models had more general capability before their "lobotomization," which you seem to be implying.

I don't like the use of the word, precisely because of that — it's either wildly overstating what happens to the AI, or understating what happens to humans.

And yes, when calling them out on this, I have seen that at least some people using this metaphor seem to genuinely believe that what I would call "simply continuing the same training regime that got it this far in the first place" is something they are unable to distinguish from what happened to Rosemary Kennedy (and yes, I did use her as the example when that conversation happened).

detourdog 751 days ago

I think you are using LLMs exactly right. The corporations need LLMs to be extra special to support valuations.

camillomiller 751 days ago

Oh I can see how. It’s called hype marketing and it’s needed to justify the bubble they are inflating.

politelemon 751 days ago

It could simply be that the work environments they're in are echo chambers, which is probably a necessity of working there. They likely talk to each other about happy paths, and everything else becomes noise.

joak 751 days ago

Maybe we need a different approach or maybe more is different.

More training, more data, more parameters, more compute power... and voilà.

Hard to say... but we've been surprised more than once in machine learning history.

JasonBee 751 days ago

I think it says more about their self-perception of their abilities in realms where they have no special expertise. So many Silicon Valley leaders weigh in on matters of civilizational impact. It seems making a few right choices suddenly turns people into experts who need to weigh in on everything else.

I don’t think I’m being hyperbolic to say this is a really dangerous trend.

Science and expertise carried these people to their current positions, and then they throw it all away for a cult of personality as if their personal whims manifested everything their engineers built.
