by benve 532 days ago

I think this is true, because I have told myself: "It is useless for me to create a library or abstraction for the developers on my project; much better to write everything verbosely using the most popular libraries on the web." Until yesterday, an abstraction (or a better library/framework) could be very convenient for saving the time spent writing a lot of code. Today, if the code is mostly generated, there is no need to create an abstraction. AI understands 1000 lines of Python pandas code much better than 10 lines of code using my library (which rationalises the use of pandas).

The result will not only be a disincentive to use new technologies, but a disincentive to build products with an efficient architecture in terms of lines of code, and in particular a disincentive to abstraction.

Maybe some product will become a hell with millions of lines of code that no one knows how to evolve and manage.

4 comments

hobs 532 days ago

This is completely wrong and assumes that an LLM is just much better at its job than it is - an LLM doesn't do better with a chaotic code base, nobody does - a deeply nonsensical system that sort of works is by far the hardest to reason about if you want to fix or change anything, especially for a thing that has subhuman intelligence.

baq 532 days ago

LLMs work best matching patterns. If 1k loc matches patterns and the 10 loc doesn’t, it’s a problem.

The only thing the OP is missing, which combines the best of both worlds, is to always put the source of and/or docs for his abstractions into the context window of the LLM.

xmprt 532 days ago

If your abstractions match common design patterns then you've solved your problem. It's ridiculous to assume that an LLM will understand 1k LOC of standard library code better than 10 lines of a custom abstraction which uses a common design pattern.

It's more prone to hallucinating things if your custom abstraction is not super standard, but at least you'd be able to check its mistakes (you're checking the code generated by your LLMs, right?). If it makes a mistake with the 1k LOC, then you're probably not going to find that error.

baq 532 days ago

LLMs are not human, they see the whole context window at once. On the contrary it’s ridiculous to assume otherwise.

I’ll reiterate what I said before: put the whole source of the new library in the context window and tell the LLM to use it. It will, at least if it’s Claude.
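
A minimal sketch of what this suggestion could look like, assuming the library is importable from Python. The `build_prompt` helper and the prompt wording are hypothetical, and the standard-library `textwrap` module stands in for a custom library; actually sending the prompt to a model is left out.

```python
import inspect
import textwrap  # stand-in for a small in-house library (assumption)


def build_prompt(library, task: str) -> str:
    """Prepend the full source of `library` to a task description."""
    # inspect.getsource reads the module's source file from disk.
    source = inspect.getsource(library)
    return (
        f"Here is the full source of the `{library.__name__}` library:\n\n"
        f"{source}\n\n"
        "Use only this library to solve the task below.\n\n"
        f"Task: {task}\n"
    )


prompt = build_prompt(textwrap, "Wrap a paragraph to 40 columns.")
```

The same idea works with documentation instead of source: anything that shows the model the abstraction's names and conventions.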

xmprt 532 days ago

Attention works better on smaller contexts since there are fewer confounding tokens, so even if the LLM can see the entire context, it's better to keep the amount of confounding context lower. And at some point the source code will exceed the size of the context window; even the newer ones with millions of tokens of context can't hold the entirety of many large codebases.
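
To make the size argument concrete, here is a rough sketch that estimates whether a set of source files fits in a context window. The 4-characters-per-token ratio is only a rule of thumb, not a real tokenizer, and the function names and the 1,000,000-token limit are assumptions chosen for illustration.

```python
CHARS_PER_TOKEN = 4  # crude rule of thumb, not a real tokenizer (assumption)


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(files: dict[str, str], context_limit: int, reserve: int = 0) -> bool:
    """True if all files, plus `reserve` tokens for the reply, fit in the limit."""
    total = sum(estimate_tokens(body) for body in files.values())
    return total + reserve <= context_limit


small = {"lib.py": "x = 1\n" * 1_000}      # about 1k lines of code
large = {"app.py": "x = 1\n" * 2_000_000}  # about 2M lines of code

small_fits = fits_in_context(small, context_limit=1_000_000)
large_fits = fits_in_context(large, context_limit=1_000_000)
```

Under these assumptions the 1k-line library fits easily and the 2M-line codebase does not, which is consistent with both sides of the exchange above.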

baq 532 days ago

Of course, but OP’s 1kloc is nowhere near any contemporary limit. Not using the tool for what it’s designed for because it isn’t designed for a harder problem is… unwise.

freehorse 532 days ago

I have experienced quite a few mistakes by Claude as documentation grows larger (and not necessarily too large compared to certain standards). E.g., some time ago I fed the whole JS documentation for some sensors into the context window and asked it to generate code. The documentation mentioned specifically that the platform does not fully support ES6, and also explicitly that it does not support const. Claude did not bother and used const. And many times I have seen Claude make mistakes in a language much less common than JS or Python, using syntax that would make sense in some other language maybe, but not that one. I have inserted instructions in system prompts not to make the specific mistakes, and told it to make sure the syntax is valid for X language, but once in a while Claude keeps making the same mistakes. Negative prompts are hard, especially when they are probably going against a huge chunk of the training set.
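
One way to cope with this kind of failure, offered as a sketch rather than something from the comment above, is to check generated code mechanically instead of trusting the negative prompt. The `FORBIDDEN` table and `find_violations` helper are hypothetical; only the `const` restriction comes from the example described.

```python
import re

# Constructs the target runtime rejects. Only `const` comes from the comment
# above; extend the table for your own platform (assumption).
FORBIDDEN = {"const": re.compile(r"\bconst\b")}


def find_violations(generated_js: str) -> list[tuple[int, str]]:
    """Return (line number, keyword) pairs for forbidden constructs.

    This is a crude text match: it also flags the word inside strings
    or comments, so treat hits as prompts for review, not as proof.
    """
    hits = []
    for lineno, line in enumerate(generated_js.splitlines(), start=1):
        for name, pattern in FORBIDDEN.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


sample = "var a = 1;\nconst b = 2;\n"
violations = find_violations(sample)
```

A non-empty result can be fed back to the model as a concrete error, which tends to be easier to act on than a general "do not use X" instruction.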

esafak 532 days ago

> Maybe some product will become a hell with millions of lines of code that no one knows how to evolve and manage.

That is exactly what will happen, so why would you do that?

benve 530 days ago

I think I might be forced to do this by the metrics that measure me at work: "things have to work right away and have to scale quickly to other low-skilled people".

baq 532 days ago

On the other hand you should ask yourself why do you care? If you assume no human will ever read the code except in very extraordinary circumstances, why wouldn’t you do that?

gavmor 532 days ago

Wow, and this posture doesn't apply to junior developers, i.e. a good abstraction is needed to avoid overwhelming the human "context window."

But it is a shame--and possibly an existential risk--that we then begin to write code that can only be understood via LLM.

CerebralCerb 532 days ago

Only in one sense. As code is now cheaper, abstractions meant to decrease code quantity have decreased in value. But abstractions meant to organize logic to make it easier to comprehend retain their value.

hhhAndrew 532 days ago

I like this take.

Previously there was a tension between easy-to-write (helper functions to group together oft-repeated lines of code, etc.) and easy-to-read (where modest repetition is often fine and is clearer). I felt this tension a lot in tests, where the future reader is very happy with explicit lines of code setting things up, whereas the test author is bored and writes layers of helper functions to speed up their work.

But for LLMs, it seems readability of code pretty much equals its writability?

To make code more authorable by LLM, we approximately just need to make it more readable in the traditional sense (code comments, actual abstractions not just code-saving helper functions, etc).
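
A small sketch of the tension described above, with entirely hypothetical names (`total_price`, `make_cart`): the first test is quicker to write, the second spells out its setup for the reader.

```python
def total_price(cart):
    """Sum of price * quantity over all items in the cart."""
    return sum(item["price"] * item["qty"] for item in cart)


# Easy to write: a helper hides the shape of the test data.
def make_cart(*pairs):
    return [{"price": price, "qty": qty} for price, qty in pairs]


def test_total_with_helper():
    assert total_price(make_cart((2.0, 3), (1.5, 2))) == 9.0


# Easy to read: the setup is explicit, at the cost of some repetition.
def test_total_explicit():
    cart = [
        {"price": 2.0, "qty": 3},
        {"price": 1.5, "qty": 2},
    ]
    assert total_price(cart) == 9.0
```

With one helper the difference is small; it grows as helpers are layered on top of each other and the reader has to unwind them to see what a test actually sets up.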

benve 530 days ago

I hope so, but it adds an extra difficulty. "Easy to understand" is not always an absolute metric: a project with many lines of code can be easy to understand for a team with a certain experience and difficult to understand for another team with different experience (not less, but different). Now I will have to think about "easy to understand" for AI.
