by justsocrateasin 264 days ago

Okay how about this situation that one of my junior devs hit recently:

Coding in an object-oriented language in an enormous code base (big tech). A junior dev is making a new class and starts it off with LLM generation. The LLM adds three separate abstract classes to the inheritance structure, for a total of seven inherited classes. Each of these inherited classes ultimately comes with several required classes that are trivial to add but end up requiring another hundred lines of code, mostly boilerplate.

Tell me how you, without knowing the code base, get the LLM to not add these classes? Our language model is already trained on our code base, and it just so happens that these are the most common classes a new class tends to inherit. Junior dev doesn't know that the classes should only be used in specific instances.

Sure, you could go line by line and say "what does this inherited class do, do I need it?" and actually, the dev did that. That cut the inherited abstract classes down from three to two, but the dev missed the other two because they didn't understand, on the product side, why those weren't needed.

Fast forward a year, these abstract classes are still inherited, no one knows why or how because there's no comprehension but we want to refactor the model.
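The "what does this inherited class do, do I need it?" pass can be made a little more mechanical by listing what each abstract base obliges a new class to implement. This is a minimal sketch, assuming a Python-style `abc` hierarchy. `Serializable`, `Auditable`, `Invoice`, and `required_by_base` are hypothetical names, not anything from the code base described above. It only surfaces the boilerplate cost of each base; it cannot say whether the product actually needs it.

```python
from abc import ABC, abstractmethod
import inspect


class Serializable(ABC):  # hypothetical base class
    @abstractmethod
    def to_dict(self) -> dict: ...


class Auditable(ABC):  # hypothetical base class
    @abstractmethod
    def audit_tag(self) -> str: ...

    @abstractmethod
    def audit_owner(self) -> str: ...


class Invoice(Serializable, Auditable):  # hypothetical generated class
    def __init__(self, amount):
        self.amount = amount

    def to_dict(self):
        return {"amount": self.amount}

    def audit_tag(self):
        return "invoice"

    def audit_owner(self):
        return "billing"


def required_by_base(cls):
    """Map each abstract base of cls to the methods it leaves abstract."""
    report = {}
    for base in inspect.getmro(cls)[1:]:  # skip cls itself
        if base in (ABC, object):
            continue
        names = sorted(getattr(base, "__abstractmethods__", frozenset()))
        if names:
            report[base.__name__] = names
    return report


print(required_by_base(Invoice))
```

A reviewer can then ask, base by base, whether the listed methods are worth carrying.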

4 comments

jofla_net 264 days ago

True True, I remember another example, with Linus Torvalds, who at a conference used a trivial example of simplifying functions to show why he's good at what he does, or what makes a good lead developer in general. It went something along the lines of:

"Well, we have this starting function which clearly can solve the task at hand. It's something 99 developers would be happy with, but I can't help but see that if we just reformulate it into a do-while instead, we can now omit the checks here and here, almost cutting it in half."

Now obviously it doesn't suffice as a real-world example but, when scaled up, it's a great view of how much waste can accumulate at the macro level. I would say the ability to do this is tied to a survival instinct, one which will undoubtedly be touted as something that'll be put in the 'next iteration' of the model. I don't think it's strictly something a model can be trained to achieve, though, the way pattern matching can, and it's clearly not achievable yet, as your example above shows.

acuozzo 264 days ago

> Tell me how you, without knowing the code base, get the LLM to not add these classes?

Stop talking to it like a chatbot.

Draft, in your editor, the best contract-of-work you can as if you were writing one on behalf of NASA to ensure the lowest bidder makes the minimum viable product without cutting corners.

---

  Goal: Do X.

  Sub-goal 1: Do Y.

  Sub-goal 2: Do Z.

  Requirements:

    1. Solve the problem at hand in a direct manner with a concrete implementation instead of an architectural one.

    2. Do not emit abstract classes.

    3. Stop work and explain if the aforementioned requirements cannot be met.

---

For the record: Yes, I'm serious. Outsourcing work is neither easy nor fun.
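If a contract like the one above gets reused across tasks, it can be kept as structured data and rendered on demand. This is a minimal sketch. The `WorkContract` class and its field names are hypothetical, and how the rendered text is handed to a model is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class WorkContract:
    """Hypothetical container for a goal, its sub-goals, and hard requirements."""
    goal: str
    sub_goals: list[str] = field(default_factory=list)
    requirements: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the contract as plain text in the layout shown above."""
        lines = [f"Goal: {self.goal}", ""]
        for i, sub in enumerate(self.sub_goals, start=1):
            lines += [f"Sub-goal {i}: {sub}", ""]
        lines.append("Requirements:")
        for i, req in enumerate(self.requirements, start=1):
            lines.append(f"  {i}. {req}")
        return "\n".join(lines)


contract = WorkContract(
    goal="Do X.",
    sub_goals=["Do Y.", "Do Z."],
    requirements=[
        "Solve the problem at hand in a direct manner with a concrete "
        "implementation instead of an architectural one.",
        "Do not emit abstract classes.",
        "Stop work and explain if the aforementioned requirements cannot be met.",
    ],
)
print(contract.render())
```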

metalliqaz 264 days ago

Every time I see something like this, I wonder what kind of programmers actually do this. For the kinds of code that I write (specific to my domain and generates real value), describing "X", "Y", and "Z" is a very non-trivial task.

If doing those is easy, then I would assume that the software isn't that novel in the first place. Maybe get something COTS.

I've been coding for 25 years. It is easier for me to describe what I need in code than it is to do so in English. May as well just write it.

acuozzo 262 days ago

> I've been coding for 25 years.

20 here, mostly in C; mixture of systems programming and embedded work.

My only experience with vibe-coding is when working under a time-crunch very far outside of my domain of expertise, e.g., building non-transformer-based LLMs in Python.

wtetzner 264 days ago

I mean, unless you just don't know how to program, I struggle to see what value the LLM is providing. By the time you've broken it down enough for the LLM, you might as well just write the code yourself.

acuozzo 262 days ago

I've been writing code for over 20 years, mostly in C.

My only experience with vibe-coding is when working under a time-crunch very far outside of my domain of expertise.

No amount of "knowing how to program" is going to give me >10 years of highly-specialized PhD-level Mathematics experience in under three months.

wtetzner 262 days ago

Then how do you know it got it right?

acuozzo 261 days ago

I was provided with a battery of externally-produced tests, benchmark scripts, etc. I was told to assume that the tests were comprehensive.

Independent of this, I used competing models produced by different organizations (e.g. OpenAI vs. Google) to test & verify each other's work.

I also could, somewhat, follow along with the math itself.
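Having independently produced implementations check each other is, in spirit, differential testing: run both on the same inputs and flag any disagreement. The sketch below is not the workflow described above. It assumes the two models' outputs have already been saved as plain functions, and every name in it (`differential_check`, the variance functions) is hypothetical.

```python
def differential_check(impl_a, impl_b, inputs, tol=1e-9):
    """Return (input, a, b) for every input where the two results differ by more than tol."""
    disagreements = []
    for x in inputs:
        a, b = impl_a(x), impl_b(x)
        if abs(a - b) > tol:
            disagreements.append((x, a, b))
    return disagreements


def variance_two_pass(xs):
    """Population variance: compute the mean first, then the squared deviations."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def variance_one_pass(xs):
    """Population variance via E[x^2] - E[x]^2."""
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2


def variance_sample(xs):
    """Sample variance (divides by n - 1), standing in for a subtly different answer."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


cases = [[1, 2, 3, 4], [0.5, 0.5]]
print(differential_check(variance_two_pass, variance_one_pass, cases))
print(differential_check(variance_two_pass, variance_sample, cases))
```

Agreement only shows the implementations are consistent with each other; both can share the same mistake, which is why the external test battery still matters.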

baq 263 days ago

Yeah, but the LLM is simply faster, especially in this case where you know exactly what you need and it's just a lot of typing.

stillworks 264 days ago

Curious about the mechanics here — when you say the model was ‘trained on our code base’, was that an actual fine-tune of the weights (e.g. LoRA/adapter or full SFT), or more of a retrieval/indexing setup where the model sees code snippets at inference? Always interested in how teams distinguish between the two.

jf22 264 days ago

What would you tell a junior dev that did this?

You tell them not to create extra abstract classes and put that in your onboarding docs.

You literally do the same thing with LLMs. Instead of onboarding docs that spell out code standards, you make rules files or whatever the LLM needs.
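A written rule can also be backed by an automated check on whatever the model emits. This is a minimal sketch, assuming the generated code is Python and that "abstract class" means a class that inherits from `ABC` or declares an `@abstractmethod`. `find_abstract_classes` is a hypothetical helper, not part of any existing tool.

```python
import ast


def find_abstract_classes(source: str) -> list[str]:
    """Return names of classes that inherit from ABC or declare an abstract method."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        base_names = {ast.unparse(base) for base in node.bases}
        inherits_abc = bool(base_names & {"ABC", "abc.ABC"})
        has_abstract_method = any(
            isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            and any(
                ast.unparse(dec).endswith("abstractmethod")
                for dec in item.decorator_list
            )
            for item in node.body
        )
        if inherits_abc or has_abstract_method:
            flagged.append(node.name)
    return flagged


generated = '''
from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def run(self): ...

class Job:
    def run(self):
        return 1
'''
print(find_abstract_classes(generated))
```

A check like this could run in review or CI and bounce the change back with the rule attached, the same way a reviewer would for a junior dev.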
