Hacker News
by Philpax 84 days ago
Apologies for the obligatory question, but what did you try to do, and which AI did you try to do it with?

Well, following advice from folk on here earlier, I thought I'd start small and try to get it to write some code in Go that would listen on a network socket, wait for a packet with a bunch of messages (in a known format) to come in, and split those messages out from the packet.

I ended up having to type hundreds of lines of description to get thousands of lines of code that doesn't actually work, when the one I wrote myself is about two dozen lines of code and works perfectly.

It just seems such a slow and inefficient way to work.
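
The commenter's own code is not shown in the thread. As a rough idea of the size of the task, here is a minimal sketch of the "split messages out of a packet" step. The framing (a 2-byte big-endian length before each message) and the name `splitMessages` are assumptions made for illustration, not the format the commenter was working with.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// splitMessages splits one packet into its individual messages.
// ASSUMPTION: every message is preceded by a 2-byte big-endian length.
// This framing is hypothetical and chosen only to keep the sketch short.
func splitMessages(packet []byte) ([][]byte, error) {
	var msgs [][]byte
	for len(packet) > 0 {
		if len(packet) < 2 {
			return nil, errors.New("truncated length prefix")
		}
		n := int(binary.BigEndian.Uint16(packet))
		packet = packet[2:]
		if n > len(packet) {
			return nil, errors.New("truncated message body")
		}
		msgs = append(msgs, packet[:n])
		packet = packet[n:]
	}
	return msgs, nil
}

func main() {
	// Two messages, "hi" and "foo", packed into one packet.
	packet := []byte{0, 2, 'h', 'i', 0, 3, 'f', 'o', 'o'}
	msgs, err := splitMessages(packet)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, m := range msgs {
		fmt.Printf("%q\n", m)
	}
}
```

The returned slices share memory with the input packet, so a caller that reuses its read buffer would need to copy them first.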

Hate to pull the skill issue card here, but that is a trivial problem that can be one shotted with almost any model with
Okay, tell you what then. Help me learn.

The problem is that I want something that listens on a TCP connection for GD92 packets, and when they arrive send appropriate handshaking to the other end and parse them into Go structs that can be stuffed into a channel to be dealt with elsewhere.

And, of course, something to encode them and send them again.

How would I do that with whatever AI you choose?

I'm pretty certain you can't solve this with AI because there is literally no published example of code to do it that it can copy from.
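
For readers who want a concrete picture of the shape of the task described above (read frames from a connection, acknowledge them, decode them into structs, push them onto a channel, and encode them again), here is a minimal sketch. It does not implement GD92: the `Message` fields, the 3-byte header and the one-byte acknowledgement are all placeholders invented for illustration, and a real implementation would take its framing and handshake from the spec.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"net"
)

// Message is a hypothetical decoded packet; its fields are NOT taken from GD92.
type Message struct {
	Type    byte
	Payload []byte
}

// ASSUMPTION: a single ACK byte stands in for the real handshake.
const ackByte = 0x06

// Encode writes the assumed wire format: 1 byte type, 2-byte big-endian
// payload length, then the payload. Payloads over 65535 bytes are not handled.
func (m Message) Encode() []byte {
	buf := make([]byte, 3+len(m.Payload))
	buf[0] = m.Type
	binary.BigEndian.PutUint16(buf[1:3], uint16(len(m.Payload)))
	copy(buf[3:], m.Payload)
	return buf
}

// readMessage reads exactly one frame in the assumed format from r.
func readMessage(r io.Reader) (Message, error) {
	var hdr [3]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return Message{}, err
	}
	payload := make([]byte, binary.BigEndian.Uint16(hdr[1:3]))
	if _, err := io.ReadFull(r, payload); err != nil {
		return Message{}, err
	}
	return Message{Type: hdr[0], Payload: payload}, nil
}

// handle decodes frames from conn, acknowledges each one, and forwards it on out.
func handle(conn net.Conn, out chan<- Message) {
	defer conn.Close()
	for {
		msg, err := readMessage(conn)
		if err != nil {
			return // io.EOF on a clean close, another error on a broken stream
		}
		if _, err := conn.Write([]byte{ackByte}); err != nil {
			return
		}
		out <- msg
	}
}

// serve accepts TCP connections on ln and starts one handler per connection.
func serve(ln net.Listener, out chan<- Message) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		go handle(conn, out)
	}
}

// roundTrip exercises handle over an in-memory connection (net.Pipe):
// it sends one encoded message, waits for the ACK, and returns what was decoded.
func roundTrip(m Message) (Message, error) {
	client, server := net.Pipe()
	defer client.Close()
	out := make(chan Message, 1)
	go handle(server, out)

	if _, err := client.Write(m.Encode()); err != nil {
		return Message{}, err
	}
	ack := make([]byte, 1)
	if _, err := io.ReadFull(client, ack); err != nil {
		return Message{}, err
	}
	if ack[0] != ackByte {
		return Message{}, fmt.Errorf("unexpected handshake byte %#x", ack[0])
	}
	return <-out, nil
}

func main() {
	got, err := roundTrip(Message{Type: 1, Payload: []byte("hello")})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("type=%d payload=%q\n", got.Type, got.Payload)
}
```

In a real program the listener would come from `net.Listen("tcp", addr)` and be passed to `serve` along with the channel the rest of the application reads from; `roundTrip` is only there so the sketch can be run without a network.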

GD92 packets?

No idea what you’re talking about but if it has a spec then it doesn’t matter if it’s trained on it. Break the problem down into small enough chunks. Give it examples of expected input and output then any llm can reason about it. Use a planning mode and keep the context small and focused on each segment of the process.

You’re describing a basic tcp exchange, learn more about the domain and how packets are structured and the problem will become easier by itself. LLMs struggle with large code bases, which pollute the context, not with straightforward apps like this.
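
One way the "examples of expected input and output" advice could look in practice is a small table of input bytes paired with the result you expect, which can be pasted into a prompt and also run as a check. This is only a sketch: the `Frame` layout (one type byte, one length byte, then the body) and every name here are hypothetical, not taken from any spec mentioned in the thread.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Frame is a hypothetical decoded frame; the layout is an assumption.
type Frame struct {
	Type   byte
	Length int
	Body   string
}

// example pairs raw input (as hex) with the expected decode result.
type example struct {
	name    string
	inHex   string
	want    Frame
	wantErr bool
}

// The table doubles as documentation for a human and as context for an LLM.
var examples = []example{
	{"empty body", "0100", Frame{Type: 1, Length: 0, Body: ""}, false},
	{"short text", "02024f4b", Frame{Type: 2, Length: 2, Body: "OK"}, false},
	{"truncated body", "020541", Frame{}, true},
	{"missing header", "02", Frame{}, true},
}

// decodeFrame parses the assumed layout: byte 0 is the type,
// byte 1 is the body length, and the body follows.
func decodeFrame(b []byte) (Frame, error) {
	if len(b) < 2 {
		return Frame{}, fmt.Errorf("need 2 header bytes, got %d", len(b))
	}
	n := int(b[1])
	if len(b)-2 < n {
		return Frame{}, fmt.Errorf("body truncated: want %d bytes, got %d", n, len(b)-2)
	}
	return Frame{Type: b[0], Length: n, Body: string(b[2 : 2+n])}, nil
}

// checkExamples runs every example and returns the names of those that fail.
func checkExamples() []string {
	var failed []string
	for _, ex := range examples {
		in, err := hex.DecodeString(ex.inHex)
		if err != nil {
			failed = append(failed, ex.name+": bad hex")
			continue
		}
		got, err := decodeFrame(in)
		if (err != nil) != ex.wantErr || (err == nil && got != ex.want) {
			failed = append(failed, ex.name)
		}
	}
	return failed
}

func main() {
	if failed := checkExamples(); len(failed) > 0 {
		fmt.Println("failing examples:", failed)
		return
	}
	fmt.Println("all examples pass")
}
```

The same table can be handed to the model first and the decoder asked for second, so that generated code is judged against the examples rather than by reading it.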

One other thing, it might be worthwhile having the spec fresh in the LLM's context by downloading it and pointing the agent at it. I've heard that that's a fruitful way to get it to refresh its memory.
Yep, you can even extract the relevant parts and put them into local files the LLM can scan.
> GD92 packets? No idea what you’re talking about but if it has a spec then it doesn’t matter if it’s trained on it.

Okay, so you're running into the same problem that LLMs are.

> Break the problem down into small enough chunks. Give it examples of expected input and output then any llm can reason about it.

So I have to do lots of grunt work?

> You’re describing a basic tcp exchange, learn more about the domain and how packets are structured and the problem will become easier by itself

I've written dozens of things that deal with TCP. I already have a fully-working example of what I want. The idea was to test if I could recreate it using LLMs.

How is it supposed to work? How does it put in the code I already know I want?

>Okay, so you're running into the same problem that LLMs are.

I can't tell if you are a troll or not, but you can't complain that nobody understands your intentionally vague and obtuse way to describe the problem at hand to pretend you're superior.

https://www.publiccontractsscotland.gov.uk/NoticeDownload/Do...

You have to rename the file ending to PDF. It's probably the wrong spec, because I'm basing this research on literally four letters that could mean anything since there is zero context given here. I've also found some German documents about chemistry.

If your argument is that LLMs and humans are stupid because they don't know what a "GD92" is, then yeah maybe it's a you problem.

Go and throw the spec into OpenAI Codex inside limactl (get it from GitHub) and use Zed (the editor) and an SSH remote project to get inside the VM. Don't forget to enable KVM for performance. The free tier for OpenAI is fine, but make sure to use codex 5.2.

First ask questions about what the binary encoding is based on. It's probably X.400. Then, once you've asked enough questions, tell it to implement it. You probably won't have to read the spec at all yourself.

tbh that's not a helpful thing to say. I think a more productive thing would be to ask "What model are you using?" "Are you using it in chat mode or as a dedicated agent?" "Do you have an AGENTS.md or CLAUDE.md?"

I've also been underwhelmed with its ability to iterate, as it tends to pile on hacks. So another useful question is "did you try having it write again with what you/it learned?"

> I think a more productive thing would be to ask "What model are you using?" "Are you using it in chat mode or as a dedicated agent?" "Do you have an AGENTS.md or CLAUDE.md?"

In my case I'd have to say "Don't know, whatever VS Code's bot uses", and "no idea what those are or why I have to care".

> Don't know, whatever VS Code's bot uses

The reason I ask which model is that I initially dismissed AI-generated code because I was not impressed with the models I was trying. I decided that if I was going to evaluate it fairly, though, I would need to try a paid product. I ended up using Claude Sonnet 4.5, which is much better than the quick-n-cheap models. I still don't use Claude for large stuff, but it's pretty good at one-off scripts and providing advice. Chances are VS Code is using a crappy model by default.

> no idea what those are or why I have to care

For the difference between chat mode and agent mode, chat mode is the online interface where you can ask it questions, but you have to copy the code back and forth. Agent mode is where it's running an interface layer on your computer, so the LLM can view files, run commands, save files, etc. I use Claude in agent mode via Claude Code, though I still check and approve every command it runs. It also won't change any files without your permission by default.

AGENTS.md and CLAUDE.md are pretty much files that the LLM agent reads every time it starts up. That's where you put your style guide, and also suggestions to correct things it consistently messes up on. It's not as important at the beginning, but it's helpful for me to have it be consistent about its style (well, as consistent as I can get it). Here's an example from a project I'm currently working on: https://github.com/smj-edison/zicl/blob/main/CLAUDE.md

I know there are lots of other things you can do, like create custom tools, things to run every time, subagents, plan mode, etc. I haven't ever really tried using them, because chances are a lot of them will be obsolete/not useful, and I'd rather get stuff done.

I'm still not convinced they speed up most tasks, but it's been really useful to have it track down memory leaks and silly bugs.

>I decided if I was going to evaluate it fairly though, I would need to try a paid product.

Okay. Get me a job and I'll pay for any model of your choosing. Until then, finances are very slim.

Agreed, that was a bit rough. Yes, they are not great at iterating and keeping long contexts, but look at what he’s describing and you have to agree that’s exactly the type of problem LLMs excel at.

Shouldn’t have to baby step through the basics when the author is clearly not interested in learning himself

> Shouldn’t have to baby step through the basics when the author is clearly not interested in learning himself

I'd rather assume good faith, because when I first started using LLMs I was incredibly confused about what was going on, and all the tutorials were grating on me because the people making them were clearly overhyping it.

It was precisely the measured and detailed HN comments that I read that convinced me to finally try out Claude, so I do my best to pay it forward :)

I totally agree, and have gone through that cycle myself.

But the guy is being adversarial and antagonistic. It's a two-way street. Sometimes you have to call people out on their BS, because I'm not seeing someone argue in good faith, but rather someone pretending to have superior knowledge because he's working on an esoteric protocol, as if people here don't know how packet headers work.

>Shouldn’t have to baby step through the basics when the author is clearly not interested in learning himself

Okay. Whip up your favorite model and report back to us with your prompts. I'm pretty anti-AI, but you're going to attract more bees with honey than smoke.

There is a big performance difference between models.

Trying to trace back the quality of the model to the "skills" of the person sounds extremely manipulative.