by resident423 14 days ago
"At the moment" is the significant part. How long do we expect this to last now that AI is capable of generating novel ideas, like solutions to Erdős conjectures?
3 comments

The most difficult and time-consuming part of my job as a software developer is talking to customers and figuring out what they need, coupled with getting them to understand what is realistic and what is not. This is a process which takes months if not years, and requires aligning internal goals and tasks with external ones. LLMs can definitely help here, but I feel that the current mode of use, where the users have to explicitly manage the tiny (relative to a human brain) context window, is an obstacle. I don't know if what we need is a new architecture or just more clever context engineering, but I don't see an LLM actually taking over this type of work as things are now.
LLMs enable a new kind of text search. It looks like reasoning and intelligence, but it is not.

For example, if you were not aware of the Internet, you would consider a traditional internet search that comes up with a Stack Overflow answer to be a machine generating "novel ideas" and answers.

It is a clever marketing trick (touting Erdős solutions) employed by AI companies.

That's one way to look at it. Kinda like a way to look at cargo planes is in terms of boxes and sticks.
Whatever LLMs are or are not, they've completely changed what I do for work. 9 months ago I was coding, today I prompt. Every line of code I commit is generated by LLMs. If you want to call it text search, be my guest. Doesn't change what it's done for the industry.
Sure, as I said, a better search can bring pretty dramatic change in how you work.

> Every line of code I commit is generated by LLMs...

Imagine if someone told you "Hey, this stackoverflow site is great. Everything I commit in my work is copy pasted from it!". What would you think about their work? Is that something worth bragging about?

Those are 2 different things so yeah I naturally would have a different reaction.
i'm imagining this, and don't see how it has any relation to the discussion
No, no, you don't get it. See, it's just like a text search, if a text search could return text that never existed before, and solve original math problems, and answer your emails for you, ... and ...
But most of the stuff it returns existed before, and the LLM's parent company stole all that info during training; legal or not, who cares, right? The rest is combined along a sort of path of least resistance, which can produce impressive results, but it's not what you wrote. Many people don't actually care much about morality in their lives except when it's convenient for them, and this is a prime example of such tunnel vision.

Start with a clean LLM, no external previous ideas of humans inserted into it, and let it generate some wisdom on its own, and then let's talk. (BTW, that's how I would expect we could get closer to AGI with these statistical models, but that's just my opinion.)

> But most of the stuff it returns existed before

No, it did not.

> Start with a clean LLM, no external previous ideas of humans inserted into it, and let it generate some wisdom on its own, and then let's talk.

An LLM is about as likely to do that as you are. The ability to generate "wisdom" ab initio cannot possibly be a criterion for intelligent reasoning. The ability to arrive at novel mathematical proofs, on the other hand, is good enough for me.

Intelligence means making the most of the resources and information available, not the ability to speedrun the Big Bang. LLMs are certainly smarter than humans who dismiss them as "text searchers."

>No, it did not.

It did, in the form of a pattern that humans are incapable of recognizing. LLMs identify and repeat the pattern. That is all.

For example, if A -> B and B -> C, logic dictates A -> C. But an LLM will be able to state that A -> C without actually using logic, if there are sufficient statements in its training data that say A -> B and B -> C and A -> C. So now if you say P -> Q and Q -> R, it will say that P -> R, even though there is no explicit P -> R in the training data and it is NOT using logic. To you, it looks like a new discovery inferred using logic, when it is not. But that is how it happens.

It is just pattern recognition masquerading as logic, x, y, or z.

Ever heard of "prompt injection" attacks?

This "super intelligent" and "capable" thing cannot even understand that your ssh keys are private and should not be sent to randos. It can solve complex math, but does not understand basic security/privacy.

What does that say to you?

> This "super intelligent" and "capable" thing cannot even understand that your ssh keys are private and should not be sent to randos.

When somebody posts their private keys to GitHub, it's usually a human. Enough said.

(And if you had ever used Claude Code, you'd know that it nags you endlessly about key hygiene.)

Ever heard of social engineering? Also, models nowadays are way sharper than they were even a year ago. They’re not going to make stupid mistakes like that unless you basically ask them to. GPT-5.x for example would bend over backwards to avoid even reading your passwords into context.
> Ever heard of social engineering?

Oh wait, I thought these things were super smart. I didn't expect "social engineering" to work on them.

> models nowadays are way sharper than they were even a year ago.

You are missing the point. If the thing can solve complex math problems and at the same time be so dumb as to fall for "social engineering", then that means that it is not "smartness" or "reasoning" that is helping it to solve those problems. It is just some form of advanced, yet dumb, search algorithm.

Reading comments like this is like watching an impaired pedestrian about to be run over by an approaching bus. You yell, you wave your arms, but they aren't paying attention. There's no way to warn them, so all you can do is... watch.
> There's no way to warn them, so all you can do is... watch

…and then wake up from the nightmare wishing the stress from the job were lower.

I can only laugh that some people truly believe that developers, one of the most ardent groups when it comes to automating the tedious parts of their jobs, would refuse to use an effective tool. You only need to look at the open source world to see people literally scratching their own itch everywhere.

What if I don't want to automate away the part of my job that I actually like doing? What if, in my job as a programmer, I actually want to do programming?
That’s fair. I do think, however, that the software industry may become a bit like the clothing industry: there will still be an artisanal market for people who want human-made software, but to be honest I wouldn’t expect it to remain the mainstream option.
Well, I can say for sure I'd rather wear long-lasting, tested (second-hand), human-made clothing than Shein slop.
People who demand programmers start using LLMs in their work don't understand that it is essentially like asking programmers to start doing accounting or HR. Something fundamentally different from what they love to do.
They're asking programmers to become managers, asking another entity to do the work for them and checking in every now and again to see how it's going.
>would refuse to use an effective tool.

No one said I don't use LLMs. I use LLMs daily...for search. That is its best use. That is my own judgement.

Reading comments like this is like watching someone who is absolutely convinced that they have a crystal ball in their lap when at best they have a foggy piece of plastic. You could be right, you could be wrong, but don’t act like you have such certainty.
The foggy piece of plastic writes better code and better text than I do. I don't know about you, but that makes me sit up and stop waving my hands dismissively.

As for "certainty," that's a luxury nobody has.

I really don't want to sound like an asshole, but I refuse the notion that an agent writes better code / prose than I do and I am concerned for anyone who does think that.

Is it _faster_? sure! That is NOT better.

Before you ask, I write code w/ agents daily, I find it useful, but it's not better than I am purely on quality.

What I've been seeing lately with Opus 4.7 under Claude Code is that it finds more bugs in my code than I find in its code. That, to me, makes it hard to argue that I am "better" than it is.

Certainly Claude/Codex's knowledge of algorithms and data structures is leagues ahead of any human programmer alive. Only its capacity for creating new ones on demand is weaker. Recent results from the mathematics field suggest that's a temporary state of affairs.

As far as prose goes, my best writing is indeed better than the best I've seen from LLMs. But that's a matter of opinion (mine.) On average the clanker wins, especially if conditioned to avoid LLM-isms.

The truth is that the models are getting better in both areas, while I'm not. Which IMHO is freaking awesome, not a reason to burn it all down.

Wow that sure sounds smart!

The problem, of course, is that generating a one-time solution to a problem is a much easier problem space than a many-input task with human product concerns.

Synthesizing a ton of inputs to help clarify a decision or set of options is exactly one of the easiest and most powerful use cases for AI agents right now.
I didn’t say it wasn’t :)

All I said is that one type of task is easier than another.

I don't think that part is true, either. The average human could be trained to use an agent to synthesize information in their job to help make product decisions. The average human could not be trained to evaluate whether a reasoning model produced a correct proof in research-level mathematics. To be sure: reviewing a candidate proof at this level written by AI is significantly easier and faster than creating it from scratch. But it's still something hardly any humans could credibly do.
I didn’t say any of that either. All I said was that, for AI, one type of task is easier than another.

I said nothing about a human’s capabilities, which are most certainly different than an AI’s

This is the second time you’ve said a bunch of stuff I didn’t say, and then tried to argue with me over it.