by LifeIsBio 388 days ago
One of my favorite applications of multimodal LLMs thus far is the ability to:

1. Draw a DAG of whatever pipeline I’m working on with pen and paper.

2. Take a photo of the graph, mistakes and all.

3. Ask ChatGPT to translate the image into mermaid.js

Given how complicated the pipelines I'm working with are, and how sloppy the hand-drawn image is, it's truly amazing how well this workflow works.
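
For readers who haven't seen Mermaid output, here is a minimal sketch of the kind of text the translation step produces. The pipeline step names are hypothetical, invented for illustration, and the helper `edges_to_mermaid` is not part of any library; it only shows that a DAG boils down to a `graph TD` header plus one `-->` line per edge.

```python
from typing import Iterable


def edges_to_mermaid(edges: Iterable[tuple[str, str]], direction: str = "TD") -> str:
    """Render (upstream, downstream) pairs as a Mermaid flowchart definition."""
    lines = [f"graph {direction}"]
    for upstream, downstream in edges:
        lines.append(f"    {upstream} --> {downstream}")
    return "\n".join(lines)


# Hypothetical pipeline steps, invented for illustration only.
pipeline = [
    ("fetch_reads", "align"),
    ("align", "call_variants"),
    ("align", "qc_report"),
    ("call_variants", "annotate"),
]

print(edges_to_mermaid(pipeline))
```

Because the output is this small and regular, it is easy to diff the LLM's transcription against the photo and fix a wrong edge by hand.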

3 comments

I recently did a variation of this where, instead of drawing, I just drafted a few quick bullet points and some text describing at a high level what the system should do. Then I asked ChatGPT to identify use cases and generate sequence diagrams for each use case in puml format (PlantUML). Shockingly effective, and it took about five minutes. This was a technical proposal that I shared with a few partner companies to provide a detailed plan to a customer. It came after several online meetings, spaced over a few weeks, in which we negotiated the details. Pretty important document, and it was well received. PlantUML looks decent enough that you can get away with sticking the resulting diagrams in a document.
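
As a rough illustration of what a generated puml sequence diagram looks like, here is a small sketch that turns one use case into PlantUML text. The use case, participant names, and the helper `sequence_to_plantuml` are all assumptions made up for this example, not anything from the actual proposal.

```python
def sequence_to_plantuml(title: str, messages: list[tuple[str, str, str]]) -> str:
    """Render (sender, receiver, label) triples as a PlantUML sequence diagram."""
    lines = ["@startuml", f"title {title}"]
    for sender, receiver, label in messages:
        lines.append(f"{sender} -> {receiver}: {label}")
    lines.append("@enduml")
    return "\n".join(lines)


# Hypothetical use case; single-word participant names avoid the need for quoting.
place_order = [
    ("Customer", "Portal", "submit order"),
    ("Portal", "Billing", "create invoice"),
    ("Billing", "Portal", "invoice id"),
    ("Portal", "Customer", "confirmation"),
]

print(sequence_to_plantuml("Place order", place_order))
```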

I'm a busy person. I don't have hours of time that I can take out of my schedule to generate what I regard as write only documentation (nobody will ever read or truly value it) that ticks the box of "we have stuff to point at when somebody asks (which nobody ever will)", which has a lowish value. Sometimes it's nice to have. The above is a fine example. People will glance at it, give me a little thumbs up, and then give me permission to proceed as planned and bill accordingly. Job done. It's not a reference design that anyone will ever look at for more than a few seconds.

After a few decades in the industry, I'm extremely skeptical of the value of diagrams vs. the time required to produce them. I just don't see it. A lot of good software gets produced without them. You don't need blueprints for your blueprints, which is what source code is (a blueprint for automatically compiling into working software). People value such traits as structure, readability, conciseness in source code for a reason: it allows them to treat source code as design assets. I don't write UML, I stub out data classes and interfaces instead. And then I refactor them over and over again. Diagrams just slow me down.

But a few minutes is about the threshold for me to spend brain cycles on producing them and enriching documentation that I'm writing anyway in text form. Quickly jot down some notes. Don't waste any time whatsoever obsessing about the awkward syntax of these micro languages, and just get the essentials nailed. I bet I can get it down to a minute or so with better LLMs and larger context windows. "Examine this project, produce an overview diagram of all the database tables". That's a prompt I'd write. In the same way, letting LLMs document code is a great use of time.
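
For comparison, the "overview diagram of all the database tables" can also be approximated without an LLM when the schema is machine-readable. The following is a minimal sketch that assumes a SQLite database and emits a Mermaid `erDiagram`; the `users`/`orders` schema and the function name `schema_to_mermaid` are hypothetical, and every foreign key is drawn as one-to-many, which is an assumption rather than something read from the schema.

```python
import sqlite3


def schema_to_mermaid(conn: sqlite3.Connection) -> str:
    """Emit a Mermaid erDiagram covering every user table in a SQLite database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    lines = ["erDiagram"]
    relations = []
    for table in tables:
        lines.append(f"    {table} {{")
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _cid, name, col_type, _notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            # SQLite allows columns without a declared type; label those "unknown".
            lines.append(f"        {col_type or 'unknown'} {name}")
        lines.append("    }")
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            # Assumption: draw each foreign key as parent one-to-many child.
            relations.append(f'    {fk[2]} ||--o{{ {table} : "{fk[3]}"')
    return "\n".join(lines + relations)


# Hypothetical schema, created in memory for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        total REAL
    );
    """
)
print(schema_to_mermaid(conn))
```

This only captures what the schema declares. The appeal of the prompt-based approach is that an LLM can also pick up relationships that exist only by convention in the code.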

> write only documentation (nobody will ever read or truly value it)

But what's the point of producing such documentation? I could imagine that the process of creating it could be somehow beneficial (committing to memory, finding discrepancies, etc). If it's not, why can't it just be skipped?

Documentation is a tool for creating shared understanding. If you don’t need to share your understanding, don’t write docs.

Note however that sharing understanding works on the people axis and on the time axis. Docs allow you to share your current understanding with your future self. They’d better be general enough to be true then, though.

Nowadays I find Gemini Pro able to accurately document a complex workflow within minutes just by looking at the sources, and sometimes even just the logs, so the value of low-level docs is questionable. High-level requirements (essentially how it's supposed to work and what for) are very valuable, as they allow you and the model to cross-check whether things work as intended.

I write and read lots of documentation. Diagrams are not a common feature of that.

I'd rather have a single concept diagram than a thousand words of docs. To each their own I guess.

None other than ticking boxes and shutting up the people that keep asking for such things to be produced. Who then invariably don't have the attention span to do anything with the diagram. That's literally the only reason I have for creating them. Otherwise it's a tedious activity that gets in the way of developing, slows me down, and just interrupts my creative process. I usually have better things to do.

And as you might understand from what I just said, I rarely produce any diagrams. I've been active as a developer since before UML got popular and then peaked and then faded into obscurity. I still have a signed (by Martin Fowler) copy of UML Distilled on a shelf somewhere gathering dust. First edition and everything. I don't think it's very valuable. Waste paper basically. But contact me if you feel otherwise. It's in pristine condition because I never did much more than thumb through it and shelve it.

25 years ago, any self-respecting architect had expensive licenses for things like Rational Rose or Visio. And they'd be fiddling with those tools for hours to produce detailed class and other diagrams. And those diagrams were as useless then as they are now. Epic waste of time. People stopped buying and using those tools. This was once a very big industry that has now imploded to next to nothing. Nobody is buying, very few people waste budget on this crap. It's a niche market with some niche revenue. Tens of millions of developers ignore these tools.

What do PlantUML, Mermaid, and other OSS diagramming tools have in common? The people that make them don't eat their own dogfood to document how their own software works. You can have some fun looking for diagrams in OSS projects. With few exceptions, this is not a thing (devops people seem to have a weird obsession with diagramming. And overengineering). I'm not aware of many serious OSS projects where developers have bothered to document even a tiny fraction of their software with diagrams. Including all the major OSS UML diagramming tools.

The documentation for these contains plenty of examples of course (typically very simplistic). Just not any that document how the tool is designed or works. I'm not judging. I wouldn't bother either for reasons that I articulated above. But I find it ironic that even diagram tool developers don't seem to feel an urge to use diagrams for their own stuff. Makes you wonder why they bother creating the tool? You'd have to be passionate about diagramming tools but not so that you'd want to use them for your own software.

The answer is literally in the same sentence you quoted…

Care to share your prompt(s)?

I draw a fair bit on a Kindle Scribe. I’d love to try this, but I bet your prompt would be helpful.

There's really not much to share. I rewrite the prompt each time, but here's a recent one:

> I have an image of a hand drawn workflow diagram. I’d like to turn it into a mermaid.js file.

(with the image attached)

Alternatively, ask the LLM to create a mermaid DAG of your current code.

That's an interesting idea. A lot of times what I'm drawing is a blend of what the code is versus what I want it to be post-refactor.
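
For a sense of what a "mermaid DAG of your current code" might contain, here is a minimal sketch that builds one mechanically from a single Python module using the standard `ast` module. It assumes the code is Python and only tracks direct calls between top-level functions; the `example` source and the helper name `call_graph_to_mermaid` are hypothetical. It captures only the "what the code is" half, which can then be edited by hand toward the post-refactor shape.

```python
import ast


def call_graph_to_mermaid(source: str) -> str:
    """Emit Mermaid edges for calls between top-level functions in one module."""
    tree = ast.parse(source)
    functions = {
        node.name: node for node in tree.body if isinstance(node, ast.FunctionDef)
    }
    edges = set()
    for caller, node in functions.items():
        for child in ast.walk(node):
            # Only direct calls by bare name, e.g. `clean(x)`; methods are ignored.
            if (
                isinstance(child, ast.Call)
                and isinstance(child.func, ast.Name)
                and child.func.id in functions
            ):
                edges.add((caller, child.func.id))
    lines = ["graph TD"]
    lines += [f"    {a} --> {b}" for a, b in sorted(edges)]
    return "\n".join(lines)


# Hypothetical module source, invented for illustration only.
example = '''
def load():
    return [1, 2, 3]

def clean(rows):
    return [r for r in rows if r]

def run():
    return clean(load())
'''

print(call_graph_to_mermaid(example))
```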