by randolando
2608 days ago
I like the principles that you outlined about balancing automation with repeatable human processes. But you lost me at writing your own config management system. That seems at odds with rule number 3, because to debug anything, sysadmins must have a fundamental understanding of the systems/network/OS plus in-depth knowledge of your custom homegrown config manager and the language it's implemented in. Happy to hear it works for you though. Personally, I'm willing to accept the shortcomings of an existing config management system like Ansible because it's already understood by a large percentage of sysadmins and the core program does not require any maintenance time/effort from my team.
Ultimately, it turned out:
1. We had already grown a custom configuration manager within the tool.
Both tools support many ways of doing the same thing, so we had to pick one way and constrain ourselves to it (e.g., master-minion vs. salt-ssh vs. masterless). This happened quite a bit. And, as usual, with enough use a pattern emerges. Plus, some things we needed simply did not exist and had to be built.
2. We had already learned large portions of the tool.
Ansible and Salt are simple tools when used for simple tasks. When using either for not-so-simple tasks, one invariably meets portions of their code/behaviour one doesn't expect to meet.
3. Any sysadmin we'd hire would need to know the configuration tool in-depth, anyway.
And, to our surprise, the vast majority knew only the basics, if that. We are, on principle, opposed to using something in an important capacity without understanding it well enough, and we needed to be certain that any sysadmin we'd entrust with the responsibility did know their tool in-depth. So we learned the tools in-depth ourselves.
4. The tools are NOT simple.
Once we'd learned Ansible and Salt ourselves, we found they were actually quite complex. That made sense: they have to take care of so many conditions, variations, and different situations.
5. Any sysadmin we'd hire would need to know programming and a programming language.
We already had extensions in these tools, custom modules written in a real programming language. And, in this day and age, anyone with a responsibility as important as managing our production servers has to be able to program anyway.
----
Our current ops tool is only 1.2k (library) + 1.3k (config mgmt, sans static) LOC, the config mgmt is in plain Python, and it is vastly simpler than learning how Ansible or Salt work or how to write Ansible or Salt modules. The in-depth knowledge required is much smaller too (since we don't have to take care of all the many ways Ansible and Salt could be used, or the platform differences Ansible and Salt need to worry about): just Python, SSH, SFTP, Rsync, Git, Sqitch, TLS, and the OS we have chosen, almost all of which one needs to know anyway.
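To give a rough idea of what "config mgmt in plain Python" over SSH can look like, here is a minimal sketch. It is not our actual tool: the names `Host`, `ssh_argv`, and `ensure_line` are made up for illustration, and it assumes the stock `ssh` client locally and a POSIX `grep` and `printf` on the remote side. It shows one idempotent step: append a line to a remote file only if it is not already there.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import Callable, List

# A runner takes an argv list and returns the process exit code.
# subprocess.call fits this shape; tests can pass a fake instead.
Runner = Callable[[List[str]], int]


@dataclass
class Host:
    name: str           # hypothetical host name
    user: str = "root"  # assumed login user


def ssh_argv(host: Host, remote_cmd: str) -> List[str]:
    """Build the local argv that runs remote_cmd on host via ssh."""
    return ["ssh", f"{host.user}@{host.name}", remote_cmd]


def ensure_line(host: Host, path: str, line: str,
                run: Runner = subprocess.call) -> bool:
    """Make sure `line` is present in `path` on `host`.

    Returns True if the file was changed, False if it was already fine.
    """
    q_line, q_path = shlex.quote(line), shlex.quote(path)

    # grep -q (quiet) -x (whole line) -F (fixed string): exit 0 if found.
    check = f"grep -qxF -- {q_line} {q_path}"
    if run(ssh_argv(host, check)) == 0:
        return False

    # Not found (or file missing): append the line.
    append = f"printf '%s\\n' {q_line} >> {q_path}"
    if run(ssh_argv(host, append)) != 0:
        raise RuntimeError(f"could not update {path} on {host.name}")
    return True
```

Calling something like `ensure_line(Host("db1.example.invalid"), "/etc/hosts", "10.0.0.1 db1")` twice would change the file the first time and do nothing the second. Passing the runner in as a parameter is just one way to keep such a step testable without a real server.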