HN Mirror

by raw_anon_1111 142 days ago

When I first started coding, I knew how my code worked down to assembly language, because that was the only way I could get anything to run at a sufficient speed on a 1 MHz computer. I then graduated to C and C++, with some VB, and then C#, JavaScript, and Python.

Back in 2000 I knew every server and network switch in our office and, eventually, our self-hosted server room with a SAN and a whopping 3TB of RAM before I left. Now I just submit a YAML file to AWS.

Code is becoming no different. I treat Claude/Codex as junior developers: I specify my architecture carefully, verify it after it’s written, and test the code that AI writes for functionality and scalability against the requirements. But I haven’t looked at the actual code for the project I’m working on.

I’ve also had code that I wrote myself a year ago, forgot the details of, and just asked Codex questions about.

mikaelaast 142 days ago

How do you verify the code without actually looking at it?

adamzwasserman 142 days ago

Although I write very little code myself anymore, I don't trust AI code at all. My default assumption: every line is the most mid possible implementation, every important architecture constraint violated wantonly. Your typical junior programmer.

So I run specialized compliance agents regularly. I watch the AI code and interrupt frequently to put it back on track. I occasionally write snippets as few-shot examples. Verification without reading every line, but not "vibe checking" either.

mikaelaast 142 days ago

I like this. The few-shot example snippet method is something I’d like to incorporate in my workflow, to better align generated code with my preferences.

adamzwasserman 142 days ago

I have written a research paper on another interesting prompting technique that I call axiomatic prompting. On objectively measurable tasks, when an AI scores below 70%, including clear axioms in the prompt systematically increases success.

In coding this would translate to: when trying to impose a pattern or architecture that is different enough from the "mid" programming approach that the AI is compelled to use, including axioms about the approach (in an IF this THEN that style, as opposed to few-shot examples) will improve success.

The key is the 70% threshold: if the model already has enough training data, axioms hurt. If the model is underperforming because the training set did not have enough examples (for example, hyperscript), axioms help.
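
As a rough illustration of the idea above, here is a minimal Python sketch of a prompt builder that prepends IF/THEN axioms only when a measured baseline success rate falls below the 70% threshold. Everything here is an assumption made for illustration: the names `Axiom`, `build_prompt`, and `AXIOM_THRESHOLD`, the prompt wording, and the two made-up hyperscript-flavored axioms. It is not the method from the paper.

```python
from dataclasses import dataclass

# Assumed cutoff, taken from the comment above: axioms are said to help only
# when the model's baseline success rate is below roughly 70%.
AXIOM_THRESHOLD = 0.70


@dataclass(frozen=True)
class Axiom:
    """One IF/THEN rule describing the target pattern (illustrative type)."""
    condition: str
    consequence: str

    def render(self) -> str:
        return f"IF {self.condition} THEN {self.consequence}."


def build_prompt(task: str, axioms: list[Axiom], baseline_success: float) -> str:
    """Prepend numbered axioms only when the baseline is under the threshold."""
    if baseline_success >= AXIOM_THRESHOLD or not axioms:
        return task  # model already does well enough; leave the prompt alone
    rules = "\n".join(f"{i}. {a.render()}" for i, a in enumerate(axioms, start=1))
    return f"Follow these axioms:\n{rules}\n\nTask: {task}"


# Made-up example axioms for a pattern the model rarely saw in training.
axioms = [
    Axiom("an element needs click behavior", "attach it with a `_` attribute, not a <script> tag"),
    Axiom("state must change on the page", "toggle a CSS class instead of writing inline styles"),
]

low = build_prompt("Add a collapsible sidebar.", axioms, baseline_success=0.40)
high = build_prompt("Add a collapsible sidebar.", axioms, baseline_success=0.90)
```

With a 40% baseline, `low` carries the numbered axioms ahead of the task; with a 90% baseline, `high` is just the bare task string.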

moomoo11 141 days ago

"Let's check that we can do X, Y, Z"

"Create documentation and then write tests"

a few moments later...

"There's a bug where we cannot do Y. Investigate the code and then let's discuss the best fix"

"Update the documentation and tests"

raw_anon_1111 142 days ago

How do you verify the compiler without looking at the assembled code? How do you verify code that links against binary libraries?

You run it and check for your desired behavior.

giantg2 142 days ago

Compilers have a finite set of inputs and outputs that should generate reproducible results. There's a larger number of possible outputs for the same question with AI and very little reproducibility.

raw_anon_1111 142 days ago

Yes, but once the code is written it’s not going to magically change. I am going to test the code just like I would test something I wrote, again like I’ve been doing for 40 years when writing my code by hand.

giantg2 141 days ago

But your thought process during coding influences your testing. At least for most of us, we find edge cases or points of concern during coding that we place extra focus on in testing.

This is different than what you've done for the past 40 years because you're not testing your own code. This would be analogous to you testing someone else's code. The vast majority of people and places have not followed that paradigm until AI showed up.

raw_anon_1111 141 days ago

My thought process during my architecture influences my testing.

Since AI has been a thing, I’ve been in a customer facing cloud consulting role - working full time at consulting departments (AWS ProServe) and now a third party company - specializing in app dev.

Before my hands actually write a line of code or infrastructure as code, I’ve already spoken to sales to get a high level idea of what the customer wants, read over the contract (SoW) to see what questions I have, done discovery sessions/requirements analysis, created architecture diagrams, done a design review, created detailed stories/workstreams (epics), thought about all the ways things can go wrong, etc.

I very much keep my hands on the wheel and treat AI as a junior coder that might not follow my instructions. I can answer any question about architectural decisions, repo structure, what any Lambda does, the naming conventions, etc.

I’ve also intuited “these are the things that I need to think about and test for from my 30 years of professional experience as a developer and 8 years of experience across literally dozens of AWS implementations”.

In the before times, if I were doing this without AI, I would have to have two or three more junior people doing the work just because I couldn’t physically do it in 40 hours a week. Even then I would be focused on how it works and look for corner cases.

I don’t have to think about what I need to test for. I did specifically call out concurrency because there are subtle bugs.

Ironically, what I am working on now had a subtle concurrent locking bug that Codex wrote. I threw the code into ChatGPT thinking mode and it found it immediately and suggested better alternatives. I also have Claude and Codex cross-check each other.
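
To make the concurrency point concrete, here is a minimal Python sketch of the kind of stress check that can go after locking bugs without reading the implementation line by line. `Counter` and `hammer` are hypothetical names, and the counter is a toy stand-in, not the code from the project described above. A passing run does not prove there is no race; it only means this particular run did not hit one.

```python
import threading


class Counter:
    """Toy shared state; the lock makes increment() atomic."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:  # without this, the read-modify-write can interleave
            self._value += 1

    @property
    def value(self) -> int:
        return self._value


def hammer(counter: Counter, threads: int = 8, per_thread: int = 10_000) -> int:
    """Call increment() from many threads at once and return the final count."""
    start = threading.Barrier(threads)  # release all workers together

    def worker() -> None:
        start.wait()
        for _ in range(per_thread):
            counter.increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


expected = 8 * 10_000
result = hammer(Counter())
```

The invariant checked is simply `result == expected`: every increment must be counted exactly once, no matter how the threads interleave.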

mikaelaast 142 days ago

(Those are hardly analogous comparisons to LLM generated code, are they?)

So you do a vibe check?

raw_anon_1111 142 days ago

What’s “vibe checking”?

I input x and I expect y behavior and check for corner cases - just like I have checked for correctness for 40 years. Why do I care how the code was generated as long as it has the correct behavior?

Of course multithreaded code is the exception unless the LLM is putting a bunch of rnd() calls in the code to make it behave differently.
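
The "input x, expect y, check corner cases" approach can be sketched as a small table-driven check in Python. The function `chunk` is a hypothetical stand-in for generated code under test, and the cases are invented for illustration; the point is that the checks look only at behavior, not at how the code was produced.

```python
def chunk(items: list, size: int) -> list[list]:
    """Stand-in for generated code under test: split items into pages."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Each case is (input items, page size, expected pages), corner cases included.
CASES = [
    ([1, 2, 3, 4], 2, [[1, 2], [3, 4]]),  # even split
    ([1, 2, 3], 2, [[1, 2], [3]]),        # ragged last page
    ([], 3, []),                          # empty input
    ([1], 5, [[1]]),                      # page larger than input
]


def run_cases() -> list[str]:
    """Return a description of every failing case; empty means all passed."""
    failures = []
    for items, size, expected in CASES:
        actual = chunk(items, size)
        if actual != expected:
            failures.append(
                f"chunk({items!r}, {size}) -> {actual!r}, expected {expected!r}"
            )
    return failures


def rejects_bad_size() -> bool:
    """Corner case: a non-positive size should raise, not return junk."""
    try:
        chunk([1, 2], 0)
    except ValueError:
        return True
    return False


failures = run_cases()
```

An empty `failures` list plus a `True` from `rejects_bad_size()` is the pass condition here; anything else names the exact input that misbehaved.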
