by beardedwizard 693 days ago
I love it, but I can't help but wonder why I almost never see complete or up to date diagrams in an enterprise or at scale software engineering setting, despite there being so many tools in this space (mermaid, uml, draw.io, graphviz, etc). I wonder what the barrier is or how to make tools like this fundamentally different so that we would see more adoption.

This comes up frequently in the context of secure design review, or more generally when outside stakeholders need to understand a foreign system.

Nobody argues against diagrams as good practice, but so few actually make them. That tells me incentives/costs are still off, despite good intent.

Information extraction from design docs could be one approach to suggest a diagram for free but that creates a dependency on the fidelity of the design document.

6 comments

> why I almost never see complete

That's a huge scope question on its own. What's "complete"? How low level or wide do you go? What about the view of internals of the external systems you rely on? What about their mapping to teams/owners? C4 helps a bit here by saying "whole organisation throws everything possible into this and when you want to see it, you use scoped views instead".
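To make the "one big model, scoped views" idea concrete, here is a rough Python sketch. It is not how C4 or any C4 tool actually implements views; the model contents, the `scoped_view` name, and the hop-count cutoff are all made up for illustration.

```python
from collections import deque

# Hypothetical model: everything the organisation knows about, stored as
# directed "uses" edges. All names are invented for this example.
MODEL = {
    "web-app": ["orders-api", "auth-service"],
    "orders-api": ["orders-db", "payments-gateway"],
    "auth-service": ["users-db"],
    "payments-gateway": ["bank-api"],
    "reporting": ["orders-db"],
}


def scoped_view(model, focus, depth):
    """Return the edges reachable from `focus` within `depth` hops."""
    seen = {focus}
    edges = []
    queue = deque([(focus, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the requested scope
        for target in model.get(node, []):
            edges.append((node, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, dist + 1))
    return edges


# A narrow view for someone who only cares about the orders service.
print(scoped_view(MODEL, "orders-api", 1))
```

The model stays "complete" in one place, and each audience gets a slice sized to its question instead of the whole graph.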

> or up to date diagrams

"When exactly does the system change" is tricky too. When you change the code? When you enable the new system? Who knows that the diagram of the system exists and where? It's a bit of an issue of unknown unknowns. Again, the C4 idea helps here a bit, but there are still going to be random projections of state at a given time saved in various places.

Another big issue is that structure and presentation are very different. Generated diagrams of infrastructure or code look shit, universally. On the other hand, well presented diagrams are just snapshots in time that get preserved for years even if they're outdated. I mean, this is what the C4 page shows as one of their results: https://c4model.com/img/alternative-1.png which is an unreadable mess, and this is what they initially complain about: https://c4model.com/img/sketch-2.jpg which is perfectly sized and context specific goodness. The first is good for knowledge preservation, but if anyone needs to understand things, I'm making the second one for them.

In my dreams, you have docs and source code in the same repo, and update both in a single pull request. In reality, confluence...
I've lived this life first-hand, and it is refreshing after experiencing the vast open sea of half-assed Confluence sprawl, huge PDF design documents that aren't kept up to date with changing requirements, etc.

The other sibling comment regarding rendering documentation source to Confluence or <required external system> to keep the corp overlords happy is a great middle ground IMO.

Merely as a "for your consideration," if you already have a build pipeline, rendering the in-repo docs and pushing them into Confluence could be a middle ground between "corporate overlords" and the workflow that works best for you and your team.
I tried to solve this after years of frustration, but it's a complex task, especially since "complete" means there is a lot of noise in the diagrams. Plus the data gathering is an immense task: not everything is in the repo even with world class Terraform, so you have to connect to external sources.
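For a sense of what that pipeline step might look like, here is a hedged Python sketch. It only builds the update request and does not send it. The payload shape reflects my understanding of Confluence's content REST API (PUT to `/rest/api/content/{id}` with the next version number and a `storage` body), so check it against your instance's docs. The base URL, page id, token, and bearer-style auth are placeholders and assumptions, and the HTML is assumed to come from whatever renderer your docs build already uses.

```python
import json
import urllib.request


def build_page_update(base_url, page_id, title, html, current_version, token):
    """Build (but do not send) a PUT request that replaces a page body."""
    payload = {
        "id": page_id,
        "type": "page",
        "title": title,
        # Assumption: the caller already fetched the current version number;
        # an update has to carry the next one.
        "version": {"number": current_version + 1},
        "body": {"storage": {"value": html, "representation": "storage"}},
    }
    return urllib.request.Request(
        url=f"{base_url}/rest/api/content/{page_id}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: token auth; some setups use basic auth instead.
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder values; in CI these would come from the build environment.
req = build_page_update(
    "https://confluence.example.com",
    "12345",
    "Service overview",
    "<p>Rendered from docs/overview.md</p>",
    7,
    "dummy-token",
)
print(req.get_method(), req.full_url)
```

Sending it would be a `urllib.request.urlopen(req)` call in the pipeline job, with the repo remaining the source of truth and Confluence treated as a read-only mirror.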

https://github.com/specfy/specfy

My take is that it's a bit worse than that: the whole doc tooling scene is a dumpster fire. I want to maintain technical documentation under revision control (same revision control system as I use for source code), and I want to be able to embed diagrams in said documents (also under revision control) and I want to wysiwyg edit those things. Amazingly in 2024 this does not exist.
> I want to maintain technical documentation under revision control

This kind of effort is sabotaged at every place I've ever worked, usually because product or leadership insists that it moves to Confluence, and then predictably this increases friction and no one ever updates those diagrams again. People with "20 years experience in industry" can't be bothered to log in to GitHub, and in the best case instead of Confluence they want Lucidchart, where inevitably the whole org is sharing like 3 seats.

Partly this is just laziness or ignorance, but partly it's deliberate overreach into engineering business because there are lots of people who see actual information as a threat to their preferred narratives (think hype/sales promises/etc). Plus you can't say that you didn't know about a design flaw if there is a 6-month-old diagram calling out the need for a solution in big red letters. Since a picture is worth a thousand words, and since a git hash can get a date attached to it, dead-on-arrival docs/diagrams tend to decrease accountability.

JetBrains' recent Writerside tool is a great addition to this space. It's not exactly wysiwyg; it's a side-by-side live preview with different source languages (right now, markdown and some xml language are supported, but they want to support more popular doc languages).

It allows you to specify definitions in one location, and reference them in other pages. So if you have for example a documentation page which is about "installing our nodejs package", you can write a version in vars.json or vars.xml or whatever, and reference it in your installation documentation page.

When you later have another page which is saying "Hey, we found a bit of an issue with ...this-or-that... package because we're using $NODE_VERSION, and the package hasn't been updated so we have ...such-and-so... workaround", you know when you update the $NODE_VERSION in your vars, that you need to take a look at that package again.

It's an extremely basic part of software development, being able to define variables, but I think this is a very good sign of powerful documentation tooling.
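The idea is tool-independent, so here is a minimal Python sketch of it using the standard library's `string.Template`. This is not Writerside's mechanism or syntax; the file names, the `vars.json` contents, and the helper names are hypothetical.

```python
import json
from string import Template

# Hypothetical contents of a vars.json file kept next to the docs.
VARS_JSON = '{"NODE_VERSION": "18.19.0"}'

# Hypothetical doc pages that may reference $NODE_VERSION.
PAGES = {
    "install.md": "Install Node.js $NODE_VERSION before running npm install.",
    "known-issues.md": "The workaround applies while we are on $NODE_VERSION.",
    "faq.md": "No variables here.",
}


def render(pages, variables):
    """Substitute variables into every page."""
    return {name: Template(text).substitute(variables) for name, text in pages.items()}


def pages_using(pages, variable):
    """List the pages to re-read when `variable` changes."""
    return sorted(name for name, text in pages.items() if "$" + variable in text)


variables = json.loads(VARS_JSON)
print(render(PAGES, variables)["install.md"])
print(pages_using(PAGES, "NODE_VERSION"))
```

The second helper is the part that matters for the workflow described above: bumping the variable gives you a list of pages whose surrounding prose may now be wrong.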

The only major downside of the platform is that their XML (HTML) extensions to markdown are proprietary, and as such the platform isn't able to grow outside of JetBrains' oversight. I'm still debating whether I want to switch from GitLab's Wiki (which is very basic but is really really easy to edit for anyone on the team) to a solution like JetBrains' Writerside.

I think this captures the sentiment I have been seeing and feeling myself. Docs and diagrams die because they aren't code. When the code changes, the docs are left behind.

What is interesting here is that mermaid and graphviz could be committed to source control, but I have never seen it done anywhere, likely because the burden of drawing/updating is still too high (hence your wysiwyg comment).
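One way to lower that burden is to generate the diagram text from data that already lives in the repo, rather than drawing it. A small Python sketch, assuming a hypothetical list of (source, target, label) edges; where that list comes from in a real codebase is the hard part and is not shown here.

```python
def to_mermaid(edges):
    """Render (source, target, label) tuples as Mermaid flowchart text."""
    lines = ["flowchart LR"]
    # Sort so the output is stable and produces clean diffs in review.
    for source, target, label in sorted(edges):
        lines.append(f"    {source} -->|{label}| {target}")
    return "\n".join(lines)


# Hypothetical edges; in practice these might be derived from config or IaC.
EDGES = [
    ("web", "api", "HTTPS"),
    ("api", "db", "SQL"),
]

print(to_mermaid(EDGES))
```

The resulting text can be committed alongside the code, so a change to the edges shows up as an ordinary line diff in the pull request.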

I helped implement Mermaid as a part of Apollo GraphQL’s public documentation to address the “docs as versioned artifact with code” challenge. It’s a win, but still another Yak to shave since public docs are a product, not the source code for the internal teams’ use.
Have you taken a look at swimm.io? Been curious about it for a while now but not a fan of the friction to demo it.
Org files with embedded PlantUML blocks get close to what you want.
Who is using this? I have only seen plantuml written about on blogs, never actually in use.
I’ve used it for a long time; even if I don’t end up sharing with others, it can help me quickly visualize things (the ease of change often makes it better than pen and paper). At one of my jobs we had an architect join and he was amazing; I really respected him. He used PlantUML and would open it up and start writing sequence diagrams with us. He would make sure complex flows that were actively being implemented were kept up to date, and they were useful references.
If you want to understand the underlying problem, think about geographical mapping (e.g. Google Maps). When drawing a map, how do you decide what gets featured? Roads, businesses? How big can the text be? How do you keep it all readable?

Diagramming is the same problem, but usually lacking the context to decide what to draw. The same stakeholders may be interested in different combinations of features depending on which business questions they're trying to answer. Geographic mapping is practically a constrained domain by comparison. And it's supposed to be feasible to do it automatically for systems diagrams? /hard-skeptic

I understand the argument but think this is a straw man based on my experience. The issue isn't so much that people can't decide on the level of detail, it's that the diagram is drawn only once or in most cases never.

A dataflow diagram for example is mostly self constraining in scope.

Automating it is another problem entirely and I agree it has a lot of hard challenges.

Honestly, I think you are using free (and old) diagramming tools, and you are getting what you pay for. There are plenty of tools designed specifically for documenting large systems, but they cost money.
Which ones are those? Who is using them? I don't limit this discussion to free tools, the list I gave happens to include some well known free things.