| "Provisioning and Bootstrapping machines is still a black art." At the risk of appearing to build a strawman, I'm purposefully taking this quote slightly out of context, because this strikes me as a far more general belief, which is used to justify these CM systems as a solution. Herein lies the danger of circular reasoning, or of a self-imposed problem. Since I'm an open-minded sysadmin, I always keep an eye on the likes of puppet[1], but I have continued to reject them. Much of it has to do with philosophy: use what I can that's tried and true. Very nearly all the right pieces for provisioning and configuration management already exist[2]. "Make sure your machine is 100% described by your tool." This, sadly, smacks of perfectionism, which is known as the enemy of Good Enough, yet I agree that these tools demand it. The vast majority of this kind of description is already done by the OS via the package system and init/upstart[3]. To duplicate this kind of description with a separate tool is, to me, incomprehensible. What's more, for at least the past 5 years, brand name server hardware[4] has had, in the BMC, without any special add-on cards or option keys, enough IPMI-over-LAN support that one can, over the same physical Ethernet as one of the regular network interfaces, set the next boot to be PXE and trigger a reset. From that point, a fully functioning server can be up in 5-10 minutes[5]. With those kinds of provisioning times, why would I want to bother with something that requires the extra step of "black art" bootstrapping[6]? At the most extreme, to make a configuration change on a running system, I'd just need to trigger the installation of a new package version on the relevant systems. The best part of such a scheme is that I don't need to make any further customization choices, like puppet vs. chef. All the infrastructure I need (DHCP, DNS, TFTP, kickstart or debian-installer, local mirror/repo) is a Good Idea to have anyway, and it's all standard. 
I would expect any moderately experienced sysadmin to be able to debug all those pieces, without learning a DSL or a particular system's quirks. One also benefits from years of evolution of such tools, including "free" redundancy and pre-existing plugins for monitoring. The only thing that's left is some kind of higher-level templating, which can be added as a wrapper around all of the standard things. So far, the only tool I've found that doesn't want to take over everything all at once (and works fine with incremental takeover/integration of the underlying tools) is Cobbler[7]. Not all problems can be solved with (custom) software. [1] It was this month's BayLISA topic. [2] Growing up with parents in the semiconductor industry, my exposures to Unix (and VMS, TOPS, and VM/CMS, none of which "stuck") and Lego were around the same time, so there's a deep-seated analogy there. [3] which are, of course, configured by files which can be contained in packages, so, really, just the package system. [4] that is, any rackmount server which can be ordered with a cable management arm. That there is such a differentiating factor belies the notion of "commodity" hardware. I find it to be merely a euphemism for "lowest common denominator" hardware. [5] I've observed this scale easily, with no slowdown, to 30 clients against one sub-$1k boot/repo server. [6] "Because we use cloud providers" is a weak answer, since, besides being a self-imposed problem with other unique issues, it gets remarkably expensive beyond a few dozen (if that) instances. [7] When I last dove into it a couple of years ago, it was clearly focused on kickstart and the Red Hat/Fedora world, with Debian/Ubuntu barely an afterthought. |
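For anyone who hasn't done the IPMI step the quoted post describes (set the next boot to PXE, then trigger a reset), here is a rough sketch of what it usually looks like with the standard `ipmitool` CLI, wrapped in Python. The function names, the BMC address, the username, and the password file path are all made up for illustration; it also assumes `ipmitool` is installed and the BMC has IPMI-over-LAN enabled. It defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import List


def ipmi_base(host: str, user: str, password_file: str) -> List[str]:
    """Common ipmitool arguments for talking to a BMC over the LAN.

    -I lanplus selects the IPMI v2.0 LAN interface; -f reads the
    password from a file so it does not show up in the process list.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-f", password_file]


def pxe_reprovision_commands(host: str, user: str, password_file: str) -> List[List[str]]:
    """Return the two commands: boot from PXE next, then reset the chassis."""
    base = ipmi_base(host, user, password_file)
    return [
        base + ["chassis", "bootdev", "pxe"],    # next boot goes to the network
        base + ["chassis", "power", "reset"],    # hard reset to start that boot
    ]


def reprovision(host: str, user: str, password_file: str, dry_run: bool = True) -> None:
    """Print the commands (dry run) or actually execute them."""
    for cmd in pxe_reprovision_commands(host, user, password_file):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if ipmitool exits non-zero


if __name__ == "__main__":
    # Hypothetical BMC address and credentials file; dry run only.
    reprovision("192.0.2.10", "admin", "/etc/ipmi.pass")
```

After the reset, everything else (DHCP, TFTP, kickstart or debian-installer, the local repo) is what actually does the install; this is only the trigger.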
In my experience, "making a configuration change on a running system" is not an extreme case. It happens all the time, and 5-10 minutes of downtime for a reboot just to add a comment to an Apache configuration file is insanity. Especially if there's a problem and you need to roll back.
Frankly, if you are booting machines in 5-10 minutes with all the packages they need, you've almost entirely solved the bootstrapping problem anyway. There are just a few security bits left.
edit: I did a quick check of our Subversion repository; it looks like our commits per month are in the 80-100 range. Systems get reconfigured, at least in minor ways, a LOT. Installing a new version of a package for every single change would be a huge amount of overhead. Far more than the one-time overhead of bootstrapping cfengine. In fact, we could do it that way if we wanted, but we don't.