by lqet 47 days ago
> The first is when novices in a field are able to produce work that resembles what their seniors produce [...].

> The second is when people generate artifacts in disciplines they were never trained in.

There is a third shape. Experts who have become so reliant on, or accustomed to, AI that it dilutes their previously sharp judgment and, importantly, taste. I am seeing more and more work produced by experts which seems strangely out of character. A needlessly verbose text written by someone who was previously allergic to verbosity. An over-engineered solution (complete with CLI, storage backend, documentation, unit tests) for a trivial problem which that person would've solved by an elegant bash one-liner only 3 years ago. The work itself is always completely immune to any rational criticism, as it checks all the boxes: extensive documentation, scalable, high test coverage, perfect code style, and for texts perfect grammar, non-offensive, seemingly objective. But, for lack of a better word, it simply lacks taste.

13 comments

>> The first is when novices in a field are able to produce work that resembles what their seniors produce [...].

>> The second is when people generate artifacts in disciplines they were never trained in.

This phrasing made me think of Baudrillard: https://en.wikipedia.org/wiki/Simulacra_and_Simulation , in particular "Simulacra are copies that depict things that either had no original, or that no longer have an original".

The AI produces something that is statistically similar to what it was asked for. A copy, through the weights, of some text selected from all the text it was trained on. A simulacrum of good work.

I bought this book a year ago but have not read it. It's one of those books that requires effort from the reader, which these days seems in low supply for me. :) I'll have to give it another try.

Recently I commented that: Artificial intelligence produces artificial results.

I liked the double-artificial but I wasn't happy with the meaning. Perhaps "simulacrum" is more accurate? I will see :)

Results are results, though. By "artificial results" do you mean those that merely appear to be results, or do you mean results achieved via nonhuman means?

If I write the exact same code as the AI, our results will be indistinguishable.

Late reply so not sure if you'll see this.

But I mean: they appear to be results. They look real, but are not. There are subtle errors hidden that the casual observer will not, or even cannot detect.

Obviously there are exceptions to this. Many exceptions. AI right now can probably write most if not all algorithms on par with or better than I could. It can put together working prototypes for many ideas I've had and not had time to implement.

But the danger is in assuming that because it can do A, it can do B, because a human that does A can do B.

You might enjoy Speech Central. It decently renders any text, PDF, or EPUB into an audiobook that you can still read along with or speed read or whatever.

Kinda takes the effort out, so you just gotta veg while reading/listening and following along.

> An over-engineered solution (complete with CLI, storage backend, documentation, unit tests) for a trivial problem which that person would've solved by an elegant bash one-liner only 3 years ago.

Importantly, I think AI companies are motivated towards the overengineered solutions as they increase the buyer's token spend. I'm not sure how we can create incentives that optimize for finding the 'right' solution, which may be the cheapest (the bash one-liner). Perhaps a widely recognized but not overly optimized for benchmark for this class of problems?

> Importantly, I think AI companies are motivated towards the overengineered solutions as they increase the buyer's token spend.

Yes that, and also, the more complicated the solution, the more likely no one reads or reviews it too carefully, and will instead depend on an LLM to ‘read’ and ‘review it’

Even ignoring token costs, there’s a high incentive for LLMs to generate complex solutions, because those solutions generate demand for further LLM use. (You don’t really want to review that 30,000 line pull request by hand, do you?)

This reminds me of this famous quote by Tony Hoare:

    "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies."
> Yes that, and also, the more complicated the solution, the more likely no one reads or reviews it too carefully, and will instead depend on an LLM to ‘read’ and ‘review it’

Exactly right. It's the other end of the bikeshed continuum[1]. If you send out a two-page design doc or a hundred-line pull request, the recipient will actually review it. Let AI inflate that to ten pages or a thousand lines of code and they feel like they don't have enough mental capacity to tackle it, so they let it slide.

[1]: https://bikeshed.com/

> Perhaps a widely recognized but not overly optimized for benchmark for this class of problems?

I don't see how this could be achieved.

Any widely-recognized benchmark is going to be gamed by the genAI companies.

They have a strong financial incentive to do so, and their products' nature shows that they are not influenced by ethical or societal-good incentives.

I dunno, on a subscription one would assume that minimizing token spend would actually be in their interest. Even for API calls I'm not entirely convinced they're profitable.
I think the model space is too competitive. People will switch if another model is significantly better.
There are only a few frontier models, and aren’t they all operating under the same incentives?
Open source models maybe not necessarily as they can (in theory) be self hosted.

I think right now the incentive of open-source Chinese model developers is to provide good (comparable to SotA) and cheap models so the space is not captured by a few private American companies, because they've seen how hard it is to compete in the space when that happens.

I agree completely with your sentiment. The word I use to describe it is “quality”! Most people don’t produce quality work or take pride in it, even beyond the tech industry. I believe the tool of AI is exacerbating an underlying problem.
I posted a comment very similar in spirit - we’ve adopted “perfect is the enemy of good” as the operative maxim instead of maximizing accountability and now we may need to flip as AI does that first part quickly enough.
I find LLMs make me doubt my own ability to produce quality now. I used to jump in and just write code so eagerly, letting it get messy but discovering the shape of the problem in the process.

But AI can produce beautiful, complete, syntactically perfect code on the first pass that makes my code look juvenile.

I mean, it might be wrong for other reasons, but it makes me feel like I'm programming with crayons next to it.

I find that I have this problem too. I hate to admit it, but I feel as though it’s the natural progression of the “tab pause” problem, where after writing a stem you wait to see what the autocomplete will say, even though you know exactly what you want to type. It’s like even using AI for confirmation rewires how you think anyway.
> An over-engineered solution (complete with CLI, storage backend, documentation, unit tests) for a trivial problem which that person would've solved by an elegant bash one-liner only 3 years ago.

“There is more Unix-nature in one line of shell script than there is in ten thousand lines of C.”

https://www.catb.org/~esr/writings/unix-koans/ten-thousand.h...

Reminded me of The Tao of Programming [1].

Fantastic little read that brightened my morning (during a boring meeting).

[1]: https://www.mit.edu/~xela/tao.html

Love this!

> The Master was explaining the nature of Tao to one of his novices.
>
> "The Tao is embodied in all software -- regardless of how insignificant," said the Master.
>
> "Is the Tao in a hand-held calculator?" asked the novice.
>
> "It is," came the reply.
>
> "Is the Tao in a video game?" asked the novice.
>
> "It is even in a video game," said the Master.
>
> "Is the Tao in the DOS for a personal computer?" asked the novice.
>
> The Master coughed and shifted his position slightly. "The lesson is over for today," he said.

haha, yes, that was a good laugh :)
> The work itself is always completely immune to any rational criticism, as it checks all the boxes: extensive documentation, scalable, high test coverage, perfect code style, and for texts perfect grammar, non-offensive, seemingly objective. But, for lack of a better word, it simply lacks taste.

"It is overly engineered" is a rational criticism. Likewise, "it is overly verbose, it could have been shorter" and "this could have been a one-liner" are rational criticisms.

I’m watching out for that in my own work. I’m a pragmatic person but I have sweated over details that Claude will just blast out a solution to, and the temptation to say “tests pass, move on” is strong.

It’s a little like riding a horse that knows the route.

Almost accurate It's not "the route" But "a route"
Almost accurate, but it's »Almost accurate. It's not "the route", but "a route".«.
The untucked period really increases the meta.
Developers have been lacking taste for decades anyway, like all of those Kubernetes clusters built out for companies that could run on a 50 euro a month dedicated server at Hetzner.
Sometimes entire businesses collapse to a Python dict and a backup UPS power supply.
Software-wise maybe, but if their customers could replace them with a Python dict they would. Such businesses usually have value due to their social network consisting of suppliers, knowledgeable employees, and other customers.
I have thought about it.

The present iteration of LLMs, despite what normies would believe, isn't optimised to provide correct solutions. They are optimized to __sound smart__.

This may be just an undesirable artifact of the RLHF process. But the end result is the same. They try (?) too hard to sound smart.

Last-generation LLM writing was too obvious in its soulless journalistic nature. But the current-generation LLMs do all of the following things to appear smart, from the lowest level to the highest:

- use clever writing styles and punchlines. Not X, it's a Y'ed Z. (Though it's not funny and makes no sense).

- Overstuff the technical terms, most often using a +. "Add a shim + iptables rule + signal handler".

- Over engineer the low level design. (E.g. they would rather write a function to do some complex parsing when a way exists to avoid it altogether, or write a tricky bash script and parse its output for what could be achieved with the stdlib in a few more lines.)

- over engineer the code flow: this is rather because they're clueless and can't step back. But I have fun seeing the LLM come up with 4 or 5 levels of branching and then extract it into a function, whereas a human would step back and try to avoid the branching.

- over engineer the high level design: well, your mistake is letting the word soup machine lead the design. It will add everything and the kitchen sink, with neat bullet points and + marks. Only a pleb not sufficiently educated in the matters of computer science will be impressed with such Markdown kitchen sink designs. It's fine to rely on an LLM for brainstorming and discovering how to do A, B and C. But if you outsource the job of design, its instincts (!) to sound maximally smart using bullet lists and + marks will kick in.
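To make the code-flow bullet above concrete, here is a small illustrative sketch in Python. Everything in it is hypothetical: `ship_nested`, `ship_flat`, and the order fields are invented for the example and do not come from any real codebase. Both functions return the same result for the same input; the second avoids the nested branching by returning early.

```python
# Hypothetical example: function names and order fields are made up.

def ship_nested(order):
    # Nested version: every extra condition adds a level of indentation.
    if order is not None:
        if order.get("paid"):
            if order.get("items"):
                if order.get("address"):
                    return "shipped"
                else:
                    return "missing address"
            else:
                return "empty order"
        else:
            return "unpaid"
    else:
        return "no order"


def ship_flat(order):
    # Same logic with guard clauses: reject each bad case and return early.
    if order is None:
        return "no order"
    if not order.get("paid"):
        return "unpaid"
    if not order.get("items"):
        return "empty order"
    if not order.get("address"):
        return "missing address"
    return "shipped"
```

Extracting the nested version into a helper function would hide the branching without removing it, which is roughly the difference the bullet describes.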

That's honestly a fear of mine, that I might lose the taste for simplicity.
Like healthy food, simplicity doesn't taste good. At least not on the surface.

It is an acquired taste and is easily lost. When your own instinctual heuristics are being weaponized against you for profit, you have to continually fight to maintain a discipline of nourishment. The sugar high is too addictive.

AI is a fast food of the creative mind.

Taste.

I believe we (software engineers) have tried hard to eliminate taste in programming: linters, git message styles, you name it. And I think that's a good thing. Taste is not transferable. Consistent code is.

Perhaps the experts have decided that, for this specific instance, the thing we need to do is ad-hoc and throwaway, and is simply not worth paying the extra cost to make it tasteful.
How can a bash one-liner be more expensive to build than a full-blown CLI tool with the maintenance burden that comes with it?
Same way an excel spreadsheet can be more expensive to maintain than a web app.
Bash is one of the most complicated languages in common use and is horribly error-prone. It's almost never useful alone, but as an interface to call other CLI tools. I don't think this is a particularly useful comparison.
What expert would build a cli tool to avoid writing a bash one-liner?
I cannot judge without more context. Depending on the one liner, the problem at hand, and overall situation that can be justified. A CLI is straightforward to create.

A bash one-liner can be a chain of 5+ programs, each buffering its stdin/stdout. What if the CLI is doing the same operation via streams instead? Just a random example, but that can easily be worth it.
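As a rough sketch of what "the same operation via streams" could look like, assume the one-liner in question is something like `cut -d' ' -f1 access.log | sort | uniq -c | sort -rn | head -n 5` (the pipeline and the file name are assumptions made up for this illustration, not something from the thread). A single-process version in Python can read the input once and keep only the counts in memory. The function name `top_fields` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_fields(lines: Iterable[str], n: int = 5) -> List[Tuple[str, int]]:
    """Count the first space-separated field of each line in one pass.

    Hypothetical helper: roughly what the assumed cut/sort/uniq/sort/head
    pipeline computes, except that ties are ordered by first appearance
    rather than by the sort order of the shell tools.
    """
    counts = Counter()
    for line in lines:
        # Take the text before the first space, like `cut -d' ' -f1`.
        first = line.rstrip("\n").split(" ", 1)[0]
        if first:  # skip blank lines
            counts[first] += 1
    # most_common(n) returns the n highest counts, largest first.
    return counts.most_common(n)
```

Calling `top_fields(sys.stdin)` (after `import sys`) would consume standard input line by line. Whether this is actually worth it over the one-liner depends on the input size and on who has to maintain it, which is the judgment call the thread is arguing about.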

Sure, then there's a point where the extra taste isn't worth its cost.

I just don't like the fictional straw man where an expert has somehow been brainwashed by AI into forgetting everything they ever knew.

Wait... Did you just fictional strawman a fictional strawman?

Because I don't think I've ever heard anyone say an expert gets brainwashed by AI into forgetting everything they ever knew. Maybe losing some skills that require regular exercise. Or get lazy about implementations when spawning dozens of super useful agents. But come on. Don't build your strawman out of a fictional strawman.

Perhaps the ones that were experts before letting their brains be rotted by AI.
I'm sorry but "extensive documentation, scalable, high test coverage, perfect code style" seems to me to be the opposite of throwaway.

It sounds like the kind of thing people will think surely must be very important and in use, because why go through all those hoops instead of doing a quick hack?

But I guess we can just throw AI at the maintenance burden anyways..

I agree, so you should ask yourself "why would the expert do this?"

I decided to go for the charitable interpretation of "the alternatives are close enough in functionality that writing by hand is not worth it", instead of the uncharitable interpretation of "these examples are completely made up".

Because the expert has forgotten. Skills that we don't use are forgotten, and there's nothing new in that. Except for the proverbial bicycle.
Ok, if you think the expert has forgotten that a problem can be solved by a bash one-liner and instead think they need a whole extensive CLI with documentation, our viewpoints are too far apart for fruitful discussion.
The bash one-liner might be hyperbolic but with the advent of AI everything is artificially longer, stuffier, more complex and convoluted for no reason other than because the AI allows this increase in volume with little to no extra effort.

It used to be the proverbial one-liner with zero documentation because that was the best ratio of effort to results. Now the effort is on the AI and the results look more impressive. Today that will still impress a lot of people, bosses, colleagues. Very soon everyone will see through it and anything overly stuffy will have the opposite effect of looking low-effort.

Or maybe they just didn't have the time (left it to the last minute and ran out of it), and went with the first thing that AI proposed which was said CLI with documentation.
And here I am reading an interaction and thinking you two are saying exactly the same thing. Language being what it is, I guess: open to interpretation.
> Skills that we don't use are forgotten

I think, through these tools as accelerants, we’re finally getting to see the chasm between academic rote memorization of tech-work,

and deep, actual understanding,

that some of our colleagues have.

Folks have been noting a trend of mental abstraction away from the stack, & long-term thinking - that hasn’t changed.

It just has Turbo now.

All shell, no ghost.
One of the top 4 ICs in my (major, public) company just posted 100 lines of AI slop (which I and others read, and found to be meaningless) in a conversation about a major pain point where they are supposed to be the expert. It's like people have totally turned off their brains.
I wonder if the issue is that the use of AI has generated so much work, a substantial amount of it non-essential and incorrect as per the article, that to cope with the volume you have to use AI. The irony. More pull requests to check, more pages of documentation to review, more new apps or features to get acquainted with, all at a rapid pace.
I've seen pure LLM output used in performance reviews. It's wild.