HN Mirror



	by simonw 55 days ago
	> Why use someone's project when you can just have the robot write your own? I've been thinking about this a bunch recently, and I've realized that the thing I value most in software now isn't robust tests or thorough documentation - an LLM can spit those out in a few minutes. It's usage. I want to use software which other people have used before me. I want them to have encountered the bugs and sharp edges and sanded them down.

8 comments

earleybird 55 days ago

Depth of use over the lifetime of an app is a quality all its own that is often not appreciated. A recurring pattern at $dayjob is that a new manager or director will join a business unit and declare an existing app as the worst terrible, no good, horrible app they've seen and they're going to fix that. A year and a half later the new app is finally delivered with 80% of the original functionality and a fresh set of bugs. The new dev team sees the surface functionality but misses a lot of the hard-earned nuance the old system accrued over time. This is a pattern that existed long before LLMs.

mormegil 55 days ago

Yes, see e.g. a quarter-century-old (!!) https://www.joelonsoftware.com/2000/04/06/things-you-should-...

ZeelRajodiya 55 days ago

Good read!

tovej 55 days ago

An LLM most definitely cannot spit out robust tests or thorough documentation. It can spit out some tests or some documentation, but they will not cover the user perspective or edge cases unless those are already documented somewhere. That's verified by both experience and just thinking about it for two seconds.

The sanding down you refer to is what generates those tests and documentation.

mexicocitinluez 54 days ago

> but they will not cover the user perspective or edge cases unless those are already documented somewhere

Are you suggesting that LLMs can't test for people who use screen readers? Keyboard-only users? Slow network requests?

You're acting like the issues an app faces are so bespoke to the actual app itself (and have absolutely no relation to existing problems in this space) that an LLM couldn't possibly cover it. And it's just patently wrong.

tovej 54 days ago

I'm not talking about keyboards or screen readers or any sort of input testing, I'm talking about how the software is used in practice.

If you disagree with that, I think the onus is on you to show me that an LLM could simulate the full context in which a user interfaces with software. That's a ridiculous claim.

Feel free to show literally any evidence for this claim.

mexicocitinluez 54 days ago

I'm disagreeing with the claim that it's impossible across the board; I'm not saying it's universally possible.

lol And you made the claim, not me. The proof is on YOU.

tovej 54 days ago

No, that's not how burden of proof works.

The status quo is that this capability does not exist. Whoever makes a claim contradicting the status quo has the burden of proof. I can't prove a negative.

And even with your logic, I did not make the original claim, it was made by simon.

Your statement now also makes little sense. For any nontrivial software project, the usage patterns and interactions with other systems are complex enough that the code itself does not contain enough context to understand how it is used, or what the invariants are.

There may be very simple codebases where an LLM can actually give you "thorough documentation" or "robust tests", but those are rare.

mexicocitinluez 54 days ago

> There may be very simple codebases where an LLM can actually give you "thorough documentation" or "robust tests", but those are rare.

It's not rare. I've built 2 dozen line-of-business apps in the last handful of years that were glorified CRUD apps. Every environment I've been in has had a mix of the 2.

And even then, that's at odds with your absolute above. On top of being in a field that changes daily.

thunderfork 54 days ago

> Are you suggesting that LLMs can't test for people who use screen readers? Keyboard-only users? Slow network requests?

I don't think it's feasible to fully simulate the full depth of actual usage, given that (especially in the case of screen readers and the like) there's a great deal of combinatorial depth and context to the problem. Which screen readers, on which operating systems, and which users thereof?

fibonacci_man 54 days ago

I can’t tell if you’re being sarcastic or not

mexicocitinluez 54 days ago

You're saying that every app on this planet has bespoke usages that can't be derived from the app itself? That's your claim or am I getting this wrong?

watwut 54 days ago

> the thing I value most in software now isn't robust tests or thorough documentation - an LLM can spit those out in a few minutes.

Can it if we stop defining "robust tests" as "a lot of test code lines" and "good documentation" as "lengthy documentation"?

simonw 54 days ago

I chose my words carefully. "Robust tests" are tests that provide high coverage and aren't flaky. "Thorough documentation" likewise is documentation that describes as much of the code as possible.

I didn't use the word good.

porridgeraisin 55 days ago

Yep. I realised the same. No one reads docs, or goes through tests. Either way, it's easy to write useless tests. And easy to write useless docs. I don't think most even read the code. Now the difference is that it has become possible to write useless code.

So it's just the fact that others have already gone through the motions before I did. That's it really. I suppose in commercial settings, this is even more true and perhaps extends to compliance.

matkoniecz 55 days ago

> No one reads docs, or goes through tests.

I regularly do both when trying to use a library, especially one unfamiliar to me.

porridgeraisin 54 days ago

Dare I say you're in the minority

tovej 54 days ago

I hope not. How else are you learning to use the library? The only other option is to read the source, which is also a good idea eventually, if something is unclear, but why would you _start_ there?

matkoniecz 54 days ago

Ask an LLM.

tovej 54 days ago

Bad idea.

But even in that case, you're reading the documentation. Just through a nondeterministic, hallucinating search engine.

matkoniecz 54 days ago

Maybe, but still a counterexample.

jbxntuehineoh 54 days ago

> No one reads docs

sooo uhh how do _you_ learn how to use a new library? just throw random shit at the wall until something sticks?

anp 55 days ago

I feel similarly but IIUC I think that doesn’t strictly require an open source development model. I’ve benefited a huge amount from consuming and contributing to open source projects and I’m a bit worried that the “unit economics” changing might break some of the social dynamics upon which the ecosystem is built.

einpoklum 54 days ago

> an LLM can spit those out in a few minutes.

It may be able to spit out text that purports to be that, in a few minutes. But for most software, an LLM will not be able to spit out robust tests - let alone useful documentation. (And documentation which just replicates the parameter names and types is thorough...ly useless.)

simonw 54 days ago

That's why I said "thorough" and not "good".

johanyc 54 days ago

So battle tested

jart 55 days ago

I value software that reveals knowledge. The frontier LLMs were trained on all the code that institutions had been keeping to themselves. So they're revealing programming know-how on a scale that just wasn't possible with open source. LLMs are the ultimate Prometheus. Information is more accessible and useful now than it's ever been.

wiseowise 55 days ago

> The frontier LLMs were trained on all the code that institutions had been keeping to themselves.

Lolz! I haven’t encountered “code that institutions had been keeping to themselves” that got even remotely close to OSS in quality.

cmrdporcupine 54 days ago

The code inside Google's Google3 repository is more consistently high quality than most of what I see in the Exterior World.

But there's no way that Google is releasing a model trained on it. Way too high of a risk of IP leakage.

Antibabelic 55 days ago

I promise you, "the code that institutions had been keeping to themselves" is not nearly as special or good as you are implying here.

adrian_b 55 days ago

True.

I have worked for several decades at many companies, located in many countries on a few continents, from startups to some of the biggest companies in their fields. Therefore I have seen many proprietary programs.

On average, proprietary programs are not better than open-source programs, but usually worse, because they are reviewed by fewer people and because frequently the programmers who write them may be stressed by having to meet unrealistic timelines for the projects.

The proprietary programs have greater quantity, not quality, by being written by a greater number of programmers working full-time on them, while much work on open-source projects is done in spare time by people occupied with something else.

Many proprietary programs can do things which cannot be done by open-source programs, but only because of access to documentation that is kept secret in the hope of preventing competition.

While lawyers, and other people who do not understand how research and development is really done, put a lot of weight on the so-called "intellectual property" of a company, which they believe to be embodied in things like the source code of proprietary programs or the design files for some hardware, the reality is that I have never seen anything of substantial value in this so-called IP. Everywhere, what was really valuable in the know-how of the company was not the final implementation that could be read in some source code, but the knowledge about the many other solutions that had been tried before and had worked worse or not at all. This knowledge was too frequently not written down in any documentation. Knowing which are the dead ends is a great productivity boost for an experienced team, because any recent graduate could list many alternative ways of solving a problem, but most of them would not be the right choice in certain specific circumstances.

jech 54 days ago

> On average, proprietary programs are not better than open-source programs, but usually worse, because they are reviewed by fewer people and because frequently the programmers who write them may be stressed by having to meet unrealistic timelines for the projects.

There's also the fact that when you write open-source code, you're writing for a friendly audience. I've often found myself writing the code, letting it rest for a few hours, then rewriting it so that it is easier to read. Sometimes, the code gets substantially rewritten before I push.

There's no cooling period when you write code during your 9-5 job: it works, it has the required test coverage, ship it and move on to the next task.

applfanboysbgon 55 days ago

The claim is also just categorically untrue. The largest source of training data by far is publicly available code on e.g. GitHub, so it mostly just gives you a way to recycle already-available code, without crediting the author, while allowing you to pretend you own it.

jart 55 days ago

So you're both saying all the alpha in Claude comes from open source devs like me? Even when I'm wrong I'm right.