Hacker News
by cryptoz 701 days ago
Ooooh, this is interesting. I think I'm building something quite similar! May I ask, how do you solve the code modification problem? In your demo video it shows the AI prompt is modifying code, not just generating it first-time, but I am curious how you do it. Are you using diffs?

I wrote about my approach here using ASTs: https://www.codeplusequalsai.com/static/blog/prompting_llms_...

You wrote in your post that you 'regenerate' a file - is that how you do it? Is it reliable? How does that work on big files? Does it fail at reproducing the rest of the file that should remain unchanged sometimes?

Thanks for answering any of these! Great project!

2 comments

Thank you! I read your blog post and checked out your project! If I understood it correctly, you’re trying to build a software engineering team in a box. Basically from first issue, to code, to live apps. Very interesting approach adding the collaboration angle! ASTs are neat but I’d imagine it could get hard to manage with more complex code.

In our case, we regenerate the `main.py` file each time. One of the hacks we did was to start with boilerplate code, which is why you see it modifying the code as opposed to generating from scratch the first time. We also feed the model with some context/rules on app building using our web framework, so the output is more bounded.

We haven’t tested it on really big files yet, though I'd imagine it could be a problem later. At the moment, we don’t generate HTML, JS/TS, or React code from scratch, so our files tend to be smaller than they would be if we did. Our UI is defined via the `properties.json` file, which abstracts much of the underlying code and keeps the files small. It’s much easier for LLMs to generate JSON and map it to UI behavior than to generate all of the client code needed to do the same thing.

We don’t have issues with the LLM changing function/method code, but it occasionally implements one of the boilerplate methods we didn’t explicitly ask for. In those cases, a developer has to remove that code manually, which is why showing a code diff is critical.
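As a rough illustration of that review step (not Dropbase's actual code), the sketch below compares the old and regenerated `main.py`, flags any function that changed without being requested, and renders a unified diff for the developer. The helper names `function_sources`, `unexpected_changes`, and `render_diff` are hypothetical, and the approach assumes both versions parse as valid Python.

```python
import ast
import difflib


def function_sources(source: str) -> dict[str, str]:
    """Map each function/method name to a position-independent dump of its AST.

    Simplification: methods with the same name in different classes collide.
    """
    tree = ast.parse(source)
    return {
        node.name: ast.dump(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


def unexpected_changes(old: str, new: str, requested: set[str]) -> set[str]:
    """Return names of functions added, removed, or modified that were not requested."""
    before, after = function_sources(old), function_sources(new)
    changed = {
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    }
    return changed - requested


def render_diff(old: str, new: str, filename: str = "main.py") -> str:
    """Produce a unified diff of the two file versions for human review."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"a/{filename}",
            tofile=f"b/{filename}",
        )
    )


# Hypothetical example: the user asked only for `load`, but `save` changed too.
OLD = "def load():\n    pass\n\ndef save():\n    pass\n"
NEW = "def load():\n    return [1, 2]\n\ndef save():\n    print('saved')\n"

flagged = unexpected_changes(OLD, NEW, requested={"load"})
print(flagged)
print(render_diff(OLD, NEW))
```

Comparing AST dumps rather than raw text means whitespace or comment-only differences are not flagged, while the diff still shows the developer exactly what to remove.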

Many other hacks come down to lots of prompt engineering! Something along the lines of "Only implement or modify a method/function corresponding to a user's prompt. Leave all others intact."
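To make the idea concrete, here is a minimal sketch of how such a prompt might be assembled before each regeneration. Everything except the quoted rule is an assumption for illustration: the `build_prompt` function, the section labels, and the `framework_notes` parameter are hypothetical and are not taken from Dropbase.

```python
# The rule text is quoted from the comment above; the rest is illustrative.
RULES = (
    "Only implement or modify a method/function corresponding to a user's prompt. "
    "Leave all others intact."
)


def build_prompt(current_code: str, user_request: str, framework_notes: str = "") -> str:
    """Assemble one prompt asking the model to return the whole updated file."""
    sections = [
        "You are editing a single Python file. Return the complete updated file.",
        f"Rules:\n{RULES}",
    ]
    if framework_notes:
        # Optional context about the web framework, to keep the output bounded.
        sections.append(f"Framework notes:\n{framework_notes}")
    # The current file (boilerplate on the first run) goes in verbatim,
    # so the model modifies existing code instead of starting from scratch.
    sections.append(f"Current main.py:\n{current_code}")
    sections.append(f"User request:\n{user_request}")
    return "\n\n".join(sections)


prompt = build_prompt(
    current_code="def get_table():\n    pass\n",
    user_request="Make get_table return an empty list.",
)
print(prompt)
```

Passing the current file in full on every call is what makes the first generation look like a modification: the model always starts from existing code.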

Happy to chat more!

Also you might find this blog post we wrote interesting: https://www.dropbase.io/post/an-internal-tools-builder-that-...

It seems to me the more killer product here is the "Writing two files to build a webapp", and you could comfortably rip out ChatGPT and market to a wider audience?

I like your take on that! I hadn't thought about it that way before but "Writing two files to build a webapp" indeed sounds quite intriguing. And we could extend that idea to "...and deploy it with 1 click" or some version of that.

I'm curious about what audience you have in mind and what kinds of apps you would be interested in building this way. Would love to hear more of your thoughts!

Edit: I should add that our main motivation for integrating GPT is that we had to introduce some new concepts to make this experience work, which increased the app-building learning curve. We thought having GPT generate code and highlighting diffs would be a neat way to teach users how to develop apps without reading a lot of documentation.

Aha, thanks for that detailed answer! Really fascinating to hear others' approaches to this area of building simple but full apps with LLMs. I'll definitely be following your progress, curious to see where this goes. And I will read your blog post this afternoon!

Hi cryptoz! I'm curious to read about your approach, but it seems I'm not the only one: your website went offline.

Hm, might be a DNS issue, not sure. I'll look into it, thanks!

Loaded fine for me; I clicked it just a few minutes ago.