by ldelossa 811 days ago
Show me one of these things doing something more complex than a front-end intern project.
2 comments

I agree, these things seem to do okish on trivial web projects. I've never seen them do anything more than that.

I still use ChatGPT for some coding tasks, e.g. I asked it to write C code to do some annoying fork/execve stuff (can't remember the details) and it did a decentish job, but it's like 90% right. Great for figuring out a rough shape and what functions to search for, but you definitely can't just take the code and expect it to work.
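For readers who want the "rough shape" of that kind of task, here is a minimal sketch of the usual fork/execve/waitpid pattern. It is not the code from the comment (those details were not given); the function name `run_and_wait` and its return convention are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run the program at argv[0] (a full path, since
 * execve does not search PATH) and wait for it to finish.
 * Returns the child's exit status, or -1 if fork/waitpid failed or the
 * child was terminated by a signal. */
int run_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: execve only returns if it failed. */
        execve(argv[0], argv, environ);
        perror("execve");
        _exit(127); /* _exit skips atexit handlers and stdio flushing
                       inherited from the parent */
    }

    /* Parent: retry waitpid if a signal interrupts it. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parts that tend to be missing from a "90% right" version are the error paths: checking the fork result, calling `_exit` in the child when execve fails, and handling EINTR in the wait loop.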

Same when I asked it to write a device driver for some simple peripheral. It had the shape of an answer but with random hallucinated numbers.

I've also noticed that because there is a ton of noob-level code on the internet, it will tend to do noob-level things too. For the device driver, for example, it inserted fixed delays to wait for the device to perform an operation rather than monitoring for when it had actually finished.
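To make the fixed-delay point concrete, here is a minimal sketch of the alternative: poll a status register until a busy bit clears, with a timeout as a safety net. Everything device-specific is an assumption for illustration: the `STATUS_BUSY` bit, the `read_status_fn` callback, and the `fake_dev` simulation that stands in for real hardware so the example can run in user space.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define STATUS_BUSY 0x01u /* hypothetical "operation in progress" bit */

/* Callback that reads the device's status register. */
typedef uint32_t (*read_status_fn)(void *ctx);

/* Simulated device: reports busy for a fixed number of reads, then idle. */
struct fake_dev {
    int reads_until_idle;
};

uint32_t fake_read_status(void *ctx)
{
    struct fake_dev *dev = ctx;
    if (dev->reads_until_idle > 0) {
        dev->reads_until_idle--;
        return STATUS_BUSY;
    }
    return 0;
}

static int64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Poll until the busy bit clears or timeout_ms elapses.
 * Returns true if the device finished, false on timeout. */
bool wait_until_idle(read_status_fn read_status, void *ctx, int timeout_ms)
{
    assert(read_status != NULL);

    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 100000 }; /* 100 us */
    int64_t deadline = now_ms() + timeout_ms;

    for (;;) {
        if ((read_status(ctx) & STATUS_BUSY) == 0)
            return true;  /* done as soon as the device says so */
        if (now_ms() >= deadline)
            return false; /* device is stuck or slower than expected */
        nanosleep(&pause, NULL);
    }
}
```

Compared with a fixed delay, this returns as soon as the device is actually done and reports a failure if it never finishes, instead of silently carrying on. A real driver would often go further and wait on an interrupt rather than polling at all.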

I wonder if coding AIs would benefit from fine-tuning on programming best practices so they don't copy beginner mistakes.

I used a web project in the demo because I figured it would be familiar to a wide range of developers, but actually many nontrivial pieces of Plandex have been built with the help of Plandex itself.

That's not to say it's perfect or will never make "noob-level" mistakes. That can definitely happen and is ultimately a function of the underlying model's intelligence. But I can at least assure you that it's quite capable of going far beyond a trivial web project.

It's also on me to show more in-depth examples, so thanks for calling it out. I'd love it if you would try some of the projects you mention and let me know how it goes.

So basically you don't have any nontrivial examples. What else is to be expected?
Check out some of the test prompts here for examples of larger tasks: https://github.com/plandex-ai/plandex/blob/main/test/test_pr...
Here's a prompt I used to build the AWS infrastructure for Plandex Cloud with Plandex: https://github.com/plandex-ai/plandex/blob/main/test/test_pr...
It's not something I would consider a complex job. Even a simple prompt to ChatGPT could produce a working CDK template.
Here's another one, for the backend of a Stripe billing system: https://github.com/plandex-ai/plandex/blob/main/test/test_pr...

It seems like more examples demonstrating relatively complex tasks would be helpful, so I'll work on those.

I'm certainly not trying to claim that it can handle any task. The underlying model's intelligence and context size do place limits on what it can do. And it can definitely struggle with code that uses a lot of abstraction or indirection. But I've also been amazed by what it can accomplish on many occasions.