by mjayhn 2235 days ago
I'm on the ops-architect end. I read design docs and usually steal all of the PlantUML and everything else I can find, because nobody (aside from the developer of the feature) can ever help me track down or troubleshoot problems, and I find most developers to be very disorganized (I was too until recently, and it's still something I work on daily). So the documentation is appreciated. I also write a lot of ops-focused docs and tools: stuff to lower MTTR and generally help my ops teams relax and be lost less often when alerts come in. An ops person should never get an alert at 2am and be completely lost, with no playbooks or guidance or anything because of a lack of implementation documentation. I've been there, it sucks, and it will make you regret your employment.

I don't think most of that gets read either. But I still do it. I just assume most people are siloed in their own projects and barely have time to get those done, so reading ancillary stuff is on the back burner. But if some emergency ever comes up, at least it's there, and if I'm on a mountain, I've already gotten everything out of my head, so you won't need me to walk you through how something works over the phone.

With that said, I face immense frustration when I go into a scaffolded microservice repo hoping that some golang developer updated the readme to actually be relevant rather than the scaffold template, and 99% of the time for internal stuff it has never, ever been touched. Please do some documentation for ops, new devs, etc. That extra 30 minutes to 2 hours of work will save half a day for some new person trying to launch your service in KIND or whatever. Now compound that over every new person, every outage, and every time you've had to explain something that you didn't document. The README.md in every single one of your microservices should not be an untouched base template. It's not hard work to give someone who has never touched it some breadcrumbs on what your service does.
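If you want to catch the untouched-template case automatically, something like the following could run in CI. This is only a rough sketch under my own assumptions: the function name `readme_is_untouched` and the 0.9 similarity threshold are made up for illustration, and it assumes you can get hold of both the repo's readme and the scaffold template as text.

```python
import difflib


def readme_is_untouched(readme, template, threshold=0.9):
    """Return True when the readme is still nearly identical to the scaffold template.

    Hypothetical helper: the name and the default threshold are illustrative
    assumptions, not part of any existing tool.
    """
    def normalize(text):
        # Compare non-empty, stripped lines so whitespace-only edits don't count as changes.
        return [line.strip() for line in text.splitlines() if line.strip()]

    # ratio() is 1.0 for identical line sequences and 0.0 when no lines match.
    ratio = difflib.SequenceMatcher(None, normalize(readme), normalize(template)).ratio()
    return ratio >= threshold
```

A CI job could read the two files, call this, and fail the build (or just nag) when it returns True. The threshold is a judgment call: too high and a one-line edit passes, too low and legitimately short readmes get flagged.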

Half the time I go in there when something is already broken and the stress is high, hoping for simple stuff to get me going: even just some basic UML of what APIs it's talking to, what Kubernetes resources it needs, what RBAC permissions, etc. Imagine you had never touched the service, and leave some pointers and best practices for someone who has never touched or troubleshot it (like your ops people when you throw it over the fence for deployment).

Having to figure out your microservice architecture via container logs when SHTF is miserable.