Hacker News
by sottol 775 days ago
The main thing that makes me skeptical is still what happens to a codebase when you do this long-term, and not just to the codebase but also to the company once nobody understands the code any longer. But maybe neither is a problem.

A couple questions:

* Will the codebase turn into a mess over time by having the AI apply changes over changes over changes? Do we even care? Or do we want a human to still be able to follow what is going on?

* Will you just be able to ask the AI to refactor it all and clean it up? Then it wouldn't be a problem, I presume.

* Are product-based tech companies/startups still defensible if anyone can basically recreate the product with some English?

* I don't know Copilot Workspace - are the prompts that generate and change the code kept somewhere? IMO they're part of the codebase now.

6 comments

My sense, after a year of working at a company with an enterprise Copilot subscription:

If your idea of high-quality code is "follows all the standard clean coding practices, uses design patterns, doesn't do anything Sonarqube would complain about, etc.", then it does a great job.

In terms of more abstract, design-level aspects of code quality, though, I have been less impressed. So, things like limiting statefulness and avoiding unnecessary temporal coupling, good high-level abstractions that obey regular and predictable - ideally algebraic - rules, preservation of well-defined bounded contexts, things like that. Left unchecked, Copilot will happily help you turn a large monolithic codebase into architectural spaghetti.

But then, most humans will do that, too.

Is it verbose? Yes.

Will it work? Also yes.

This is usually enough for most cases. Despite HN skewing to the fancier side of programming, the vast majority of day-to-day programming is just slapping together API glue.

For those cases LLMs like Copilot are excellent. It's a lot faster to ask Copilot about some specific C# thing than start searching through Microsoft's documentation for it. In most cases it can just insert whatever you want at the cursor.

Like just today I pasted a SQL CREATE statement into Copilot and asked it to create a FooModel class from it. Took me 3 seconds of typing, about 5-10 seconds of waiting and clicking "insert at cursor", and I had a 15-property model class.

Repeat a few more times and I've cut down stupid tedious writing by at least 30 minutes and I can go do the more fun bits of attaching some actual logic to those models.
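
To make the SQL-to-model step above concrete, here is a rough sketch of the kind of result such a prompt might produce. It is written in Python rather than the C# mentioned above, and the `foo` table, its columns, and the field types are all made up for illustration; only the `FooModel` name comes from the comment.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical input pasted into the assistant (not from the thread):
#
#   CREATE TABLE foo (
#       id          INTEGER PRIMARY KEY,
#       name        VARCHAR(100) NOT NULL,
#       email       VARCHAR(255),
#       created_at  TIMESTAMP NOT NULL
#   );


@dataclass
class FooModel:
    """One row of the hypothetical `foo` table."""

    id: int
    name: str
    created_at: datetime
    # `email` has no NOT NULL constraint, so it is optional here.
    email: Optional[str] = None


# Example usage: build a model instance by hand.
row = FooModel(id=1, name="Ada", created_at=datetime(2024, 1, 1))
```

The mapping is mechanical (one column, one typed field), which is why it is tedious to type by hand and easy to review once generated.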

Limiting state is the big problem here in my opinion. I’ve also noticed the tendency of AI tools to just add more variables to make things work, which is fine for write-only code, but makes it harder for humans and AI tools to maintain it in the future.

However I think this is also the hard bit for humans to do. It’s one of the most frequent stumbling blocks I see for more junior engineers, and one of the things I notice most when working with code from people who are really good programmers.

My experience so far with LLM generated code is that it tends to be pretty easy to maintain in the future, because it uses obvious code patterns and includes genuinely relevant comments.

The trick is to know how to program already, and avoid checking in LLM-generated code unless you completely understand every line.

If you don't do that you'll run into the same problems as you would if you hire a contractor to build your codebase without understanding what they did for you.

> because it uses obvious code patterns and includes genuinely relevant comments.

I often (simplistically) explain LLMs to people by saying that they essentially compute a statistical average of language. Next-token prediction (generally) aims to predict the least surprising next word in a sequence. It aims to "make sense" and be unsurprising.

If you want creative writing and innovative research papers and novel ideas, this isn't going to get you very far.

But if the things you want are "unsurprising" or "predictable" (great attributes of good, maintainable source code), then using this to write code feels like a pretty darn good fit.
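
A toy sketch of the "least surprising next word" idea, assuming a simple bigram count model: real LLMs use neural networks over subword tokens, not word counts, so this only illustrates the intuition. The corpus string and the function names `train_bigrams` and `most_likely_next` are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical, code-flavored training text.
corpus = "for i in range ( n ) : for j in range ( m ) : for k in items :"
model = train_bigrams(corpus)

# "in" is followed by "range" twice and "items" once,
# so "range" is the least surprising continuation.
prediction = most_likely_next(model, "in")
```

The model always picks the most common continuation it has seen, which is the same pull towards predictable, conventional patterns described above.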

> If you don't do that you'll run into the same problems as you would if you hire a contractor to build your codebase without understanding what they did for you.

I guess the difference now is that the contractor is cheap or free (because it’s an LLM), whereas in the old days you’d either hire a person to do the work and not understand it, or pick up a book and figure it out yourself (or go to school, or whatever). Figuring it out yourself was often cheaper, and then you would understand it.

(Not that humans can be replaced by LLM devs yet, or that LLM generated code is necessarily unreadable. It’s usually fine as you say.)

I have a feeling that if there were not a healthy amount of competition in the space, the prices would start to trend towards the cost of human work.

And the self-hosted options are better than nothing: I'm currently getting code autocompletions via StarCoder + TabbyML on an M1 MacBook Pro.

> If you don't do that you'll run into the same problems as you would if you hire a contractor to build your codebase without understanding what they did for you.

I really like this way of thinking about using LLMs, I think that's a great analogy in many ways.

> Are product-based tech companies/startups still defensible if anyone can basically recreate the product with some English?

The code is not the asset. It never has been. Deeply understanding your customer, their problem, and how to solve it is the asset. The code is just the current manifestation of that understanding.

Problem is that for many companies the code is also the only manifestation of that understanding.

Maybe put another way: if I can get an LLM/AI to build exactly the product that I need, is a company that serves many customers simultaneously, but probably serves each of them worse, still necessary?

I think it'll be hard enough to reason about what you really want that most customers won't care enough to roll their own. And personally, I'd happily pay someone to keep the product maintained. A product is usually not one and done.

My experience at companies is that the vast majority of the code is not understood by anybody working there, and nobody even attempts to understand it. It sits in third-party libraries that nobody audits.

That's a very bad thing, but this sounds like just more of that. Which most developers seem totally fine with.

I think there is a natural trend for implementations to drift in complexity toward the edge of what can be understood (and quite often beyond that edge). I would expect the same to happen for AI-authored code with respect to what the AI can understand. Maybe refactoring and reducing tech debt will have to be a more explicit part of development and maintenance in the future?

> Will you just be able to ask the AI to refactor it all and clean it up? Then it wouldn't be a problem I presume.

For smaller contexts, LLMs tend to be really good at reviewing, suggesting changes, and refactoring. I haven't seen this applied successfully at larger contexts, though.