The gotcha is that users only read the smallest possible amount of docs (which is usually zero), at the most "focused" place (e.g. the docs for exactly one command they suspect is misbehaving, not for all of the commands used in the script, and definitely not the intro to the docs where the concepts are explained), and the doc writers don't bother to duplicate the information in all the relevant places.
There are two types of useful documentation: Documentation that builds a mental model for a user, and documentation that answers questions that cannot be answered from the mental model. The first is done through tutorials and quickstarts, and the second through API references.
There's a third type of documentation: Documentation that describes behavior that is contrary to the mental model. This documentation is useless to a user, as there is no reason to seek it out. The benefit to a developer is that they can document bugs instead of fixing them, and blame the users for it.
To be fair, if the documentation isn't meeting the users' needs it is poor documentation.
One of my more tongue-in-cheek maxims is that too much documentation is worse than too little. With too much documentation the information might be there, but you aren't able to find it. The outcome is the same, but with too much documentation you've just wasted an hour failing.
It's slightly tongue-in-cheek because you can push the amount of documentation pretty far, but you have to think about how to organise it and how users will get the required information when and if they need it.
> if the documentation isn't meeting the users' needs it is poor documentation
Unless they are not reading it, or properly paying attention. Or following an unofficial document and blaming the core project for failings in that. Sometimes it is on the user.
> With too much documentation the information might be there, but you aren't able to find it.
It can also become a huge burden to maintain, meaning it is in danger of becoming out of date or inconsistent.
> Unless they are not reading it, or properly paying attention. [...] Sometimes it is on the user.
If they make no effort, sure. But looking for an answer to a question and finding something that seems to work is a perfectly reasonable way to use documentation. If there's a dangerous gotcha and it isn't documented right there, then the documentation is structured badly.
> Or following an unofficial document
That could be a very clear symptom of bad documentation.
Sometimes it's on the user, but if lots of users are failing to use your documentation, maybe consider that the documentation is bad.
I do agree with this in the end. I also think people often underestimate how challenging it is to create good docs.
I've seen this pattern often:
- Person doesn't know how to do something.
- They fumble around until they get it working.
- Once it's finally working, they write a doc about how they did it (because our "poor documentation" is a common pain point and therefore a popular problem to attack).
- If you actually search our internal docs, you find at least one other doc describing the same process.
I've seen this happen in many different contexts, even onboarding. I've seen multiple people join the company, and each one wrote down their own "onboarding painpoints" doc to hopefully help the next person, without even noticing the existence of the previous person's equivalent doc.
So again, I still agree with you. Even in these scenarios I described, I'm willing to believe that there is some way we could've structured our docs that could've prevented these issues. But I have no idea what it is, and seeing all the futile attempts at improving the situation makes me bristle a little at armchair "poor docs"-style comments (not that that's even related. I've gone a little off topic here!)
LLMs augmented with RAG have great potential for docs as well.
Have a problem, ask the LLM and it will reference the docs. So you don’t have to read through 40 pages just to find an answer.
Some products already make use of it for their docs. More will in the future.
The advantage then is that you can keep the docs up to date, and the LLM can pull from them, hopefully pinpoint the relevant sections accurately, and summarize an answer for the user.
I also think some startups will come along that focus on providing this kind of service. Several such startups probably exist already. Similar to how there are some companies from before LLMs existed that focused purely on better access to docs of open source products.
Documentation has to meet the user, not the other way around. Otherwise it's poor documentation. Docker should update their docs to show what 90%+ of people are interested in and leave the deep dives for later.
One has beautiful clear examples of what 99% of people are going to want to do. The other is some kind of secret language people must decipher, especially if they are new to the language. "What is 'm'? What is 'I'?"
I interpret each directive in a Dockerfile as creating a new layer of an image. So this ARG-before-FROM gotcha doesn't feel like a gotcha to me, but rather, the consequence of literally interpreting "ARG" and not knowing the side-effects of a directive in a Dockerfile. (Yes, even WORKDIR, ENTRYPOINT, and related instructions create a layer, albeit a 0-byte one)
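For anyone who hasn't hit this one, here is a minimal sketch of the ARG-before-FROM behaviour under discussion. The `VERSION` name and the alpine base image are placeholders chosen for illustration. The scoping rule is the documented one: an ARG declared before the first FROM is visible to FROM lines only, and has to be redeclared inside a stage before it can be used there.

```dockerfile
# VERSION is a placeholder name; alpine is just a convenient small base image.
ARG VERSION=3.19

# Works: an ARG declared before FROM can be used in the FROM line.
FROM alpine:${VERSION}

# The surprise: inside the stage VERSION is not set, so this prints "version=".
RUN echo "version=${VERSION}"

# Redeclaring the ARG without a value brings the pre-FROM default into this stage.
ARG VERSION
# Now this prints "version=3.19".
RUN echo "version=${VERSION}"
```

Whether you read that as a gotcha or as the literal consequence of where the ARG sits, the redeclaration is the part people tend to miss.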
If you need to write in the docs about a surprise that a user otherwise wouldn't have expected, maybe it's a sign that the surprise should be fixed up such that it's not surprising behaviour.
> I don't see the gotcha, that's how it is supposed to work. It's just their purpose
The issue here is that docker evolved rather rapidly and in a “let a thousand flowers bloom” sort of manner. And because of that you have these subtle but confusing differences between behaviors that aren’t really all that consistent.
A good example of this is how the shell is handled from layer to layer (sorta this) or even how CMD and ENTRYPOINT behave (or don’t).
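A rough sketch of what those two examples might look like in practice. This is one reading of the point, not necessarily the exact cases meant above, and the base image, paths, and the `GREETING` variable are arbitrary:

```dockerfile
FROM alpine:3.19

# Each RUN starts a fresh shell, so shell state does not carry over.
RUN cd /tmp && export GREETING=hello
# Back in the default working directory, and GREETING is unset again.
RUN pwd && echo "greeting=${GREETING}"

# WORKDIR and ENV are the instructions that persist across later steps.
WORKDIR /tmp
ENV GREETING=hello
RUN pwd && echo "greeting=${GREETING}"

# Exec-form ENTRYPOINT: CMD supplies default arguments, and any arguments
# given to `docker run` replace CMD.
ENTRYPOINT ["echo", "entrypoint got:"]
CMD ["default-arg"]

# By contrast, a shell-form ENTRYPOINT (e.g. ENTRYPOINT echo hi) would ignore
# both CMD and any arguments passed to `docker run`.
```

None of this is undocumented, but each rule lives on a different instruction's reference page, which is where the inconsistency starts to bite.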
If the spec has allowances for behaviors like this, generating warnings would be the best possible outcome (e.g. when referencing a variable that theoretically isn’t set). Maybe certain runtime / runc / build envs complain but the author didn’t see the complaint.