by oyving 6189 days ago
I am probably missing something about Fabric, but it seems to me it solves a problem already solved by most unix distributions.

Why not utilize the host packaging system for deploying your code and applications?


Because it's sometimes simpler not to, especially if you're planning to do more than just internal deployment.

You may end up dealing with multiple operating systems or package tools, for example, at which point Python's ubiquitous setup.py (and the tools which work with it on every OS) looks very attractive.

You may end up needing to install App A with Version 2.0 of Dependency B, and App C with Version 3.5, on the same server, at which point virtualenv starts looking really attractive.
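
As a rough illustration of that second case, here is a minimal sketch that prints the commands for giving each app its own virtualenv with its own pinned dependency. The paths under /srv, the app names, and the package name dependency-b are made up for the example; only `virtualenv DIR` and `pip install name==version` are real commands.

```shell
#!/bin/sh
# Print the commands that give one app a private environment with one pinned
# dependency. All paths and package names here are hypothetical.
venv_plan() {
    app=$1
    dep=$2
    version=$3
    env_dir="/srv/$app/env"
    printf 'virtualenv %s\n' "$env_dir"
    printf '%s/bin/pip install %s==%s\n' "$env_dir" "$dep" "$version"
}

# Two apps on the same server, each with its own version of the dependency.
venv_plan app-a dependency-b 2.0
venv_plan app-c dependency-b 3.5
```

Piping the output to sh would actually run it; each app then imports its own copy of the dependency without touching the system-wide packages.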

These sorts of situations are extremely common in the real world, and "use the OS package tool" isn't a one-size-fits-all solution for them.

There are classes of problems that can't be solved at the system level, like deploying to more than one system type or across multiple machines at the same time.
Sorry, but I wonder what you're talking about. Package managers are designed exactly for this purpose.

It's not even hard to leverage the power of apt (the most advanced of the bunch) for your own deployments.

In essence you set up a local mirror and add it to the sources.list of all your hosts. Then you learn how to roll your own debs and push them to the mirror. The actual deployment happens via apt-get - which can be triggered by cron, a shell script, puppet, a sweaty admin at 4am, or whatever fits your bill.

Working with the system this way, instead of alongside it (or even against it), has various advantages. Most importantly you get proper dependency management. Need to roll back from app version 3 to version 2, where version 2 depends on an older version of foo? No-brainer, apt takes care of that for you, both ways, and in much more complicated cases too. Need a package or package version that's not in the official repos? No problem, roll your own and make your application package depend on that.

With a bit of elbow grease you can also have it mangle your database and other auxiliary infrastructure appropriately, within the respective pre-/post-install scripts.

Fabric and capistrano are just expressions of the old "if you don't understand you're doomed to reinvent, poorly" meme.

I agree with you. But now I want to launch and release to several EC2 and Rackspace machines, in parallel. apt doesn't help with that. It also doesn't help with releasing to multiple machines simultaneously (including different types).

If I have 5 debian machines that need to be updated, I should be able to do that with a single command and it should happen in parallel. The same applies if I have 5 debian machines and 5 red hat machines (etc...). I'm advocating a tool that is aware of the existing system specific package managers rather than a replacement of them.
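
A minimal sketch of that idea: a wrapper that asks the host which package manager it has and defers to it. It only knows about apt-get and yum, the tools named in this thread, and the function name install_cmd is made up for illustration.

```shell
#!/bin/sh
# install_cmd PACKAGE: print the install command for whichever package
# manager is found on PATH. Only apt-get and yum are considered here.
install_cmd() {
    if command -v apt-get >/dev/null 2>&1; then
        printf 'apt-get -y install %s\n' "$1"
    elif command -v yum >/dev/null 2>&1; then
        printf 'yum -y install %s\n' "$1"
    else
        echo "no supported package manager found" >&2
        return 1
    fi
}
```

The printed command can be run locally or handed to ssh once per host, so the same driver works for the Debian and the Red Hat machines.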

> I agree with you. But now I want to launch and release to several EC2 and Rackspace machines, in parallel. apt doesn't help with that.

Of course it does. What makes you think it doesn't?

> If I have 5 debian machines that need to be updated, I should be able to do that with a single command and it should happen in parallel.

reprepro -Vb . includedsc stage1 myapp_2.0-1.dsc

That drops a new pkg onto the mirror where the staging hosts pick it up within one minute, from cron. I could use the "live" distro instead of "stage1" to roll it out to production. We use sections if we want to limit the push to individual groups of hosts.
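
The host side of that setup might look roughly like the sketch below, which prints a sources.list entry and a cron.d line that polls the mirror every minute. The mirror URL and the "main" component are assumptions; "stage1" and "myapp" are taken from the command above. Repository signing and trust setup are left out.

```shell
#!/bin/sh
# Hypothetical internal mirror; replace with the real one.
MIRROR_URL="http://mirror.internal/debian"
SUITE="stage1"

# sources_entry URL SUITE: the line a staging host carries in sources.list.
# The "main" component is an assumption.
sources_entry() {
    printf 'deb %s %s main\n' "$1" "$2"
}

# cron_entry PACKAGE: a cron.d line (note the user field) that refreshes the
# package index every minute and installs the newest version of PACKAGE.
cron_entry() {
    printf '* * * * * root apt-get -qq update && apt-get -y -qq install %s\n' "$1"
}

sources_entry "$MIRROR_URL" "$SUITE"
cron_entry myapp
```

Pointing a host at "live" instead of "stage1" would be a one-word change in the first function's arguments.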

> The same applies if I have 5 debian machines and 5 red hat machines (etc...)

If you mix linux distributions in a production environment then you have bigger problems to resolve first.

> I'm advocating a tool that is aware of the existing system specific package managers rather than a replacement of them.

Those who don't understand are doomed to reinvent, poorly...

I think we mostly agree, we're just looking at the problem from different directions.

> Of course it does.

Can apt launch EC2 instances and execute scripts (that are not part of the package) before and after installation? Can it update security group settings and request and assign static IP addresses? My understanding is that apt does not help with these problems, so we write scripts or use tools like Fabric to do this. These scripts/tools are aware of the package manager in that they call the commands to make things happen. This is the level I'm talking about at which there are open problems.

> If you mix linux distributions in a production environment then you have bigger problems to resolve first.

In an ideal world this is true, but it does happen. For example, one vendor may require a specific type or version of OS, different from the rest. A business may also choose to change the OS from one release to the next.

It's important to be aware of what is possible and account for it ahead of time. Again, I'm not arguing against using apt or yum or rpm. I'm suggesting that it's helpful to not tie your process to a specific one unless you have complete control over the environment, now and for the foreseeable future.

> Can apt launch EC2 instances and execute scripts (that are not part of the package) before and after installation? Can it update security group settings and request and assign static IP addresses? My understanding is that apt does not help with these problems, so we write scripts or use tools like Fabric to do this.

Well, apt does not launch EC2 instances; you launch them, after you've defined their role in your central configuration server.

The first thing a launched instance does (in rc.local) is "apt-get install bootstrap". The bootstrap package contains everything a node needs to come alive. Ours consists of not much more than a script that immediately runs via the post-install hook. This script is where the magic happens: it connects to the "hivemind" and gathers the configuration data, based on the node name that the instance was parametrized with at startup. According to the role it is asked to assume, it will install the appropriate application packages (we call them "logic bombs"). For sanity it makes sense to just name the packages after the role. We have packages for "faceplate", "db", "queue" and such.

The packages will depend on other packages as needed and most of them contain pre-install hooks for initialization tasks (e.g. mount an EBS volume for a database node, claim an elastic IP, mangle DNS, etc.).
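
The role-to-package step can be sketched roughly as follows. The flat roles file is only a stand-in for the "hivemind" lookup, which is not described here, and fetch_role, role_package and bootstrap_cmd are hypothetical names; the role names are the ones mentioned above.

```shell
#!/bin/sh
# fetch_role NODE ROLES_FILE: look up a node's role in a file of
# "node role" lines. The file is a stand-in for the real config server.
fetch_role() {
    awk -v n="$1" '$1 == n { print $2; found = 1 } END { exit !found }' "$2"
}

# role_package ROLE: packages are named after the role, so this only
# validates the role and rejects anything unknown.
role_package() {
    case "$1" in
        faceplate|db|queue) printf '%s\n' "$1" ;;
        *) echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# bootstrap_cmd NODE ROLES_FILE: print the install command for this node.
bootstrap_cmd() {
    role=$(fetch_role "$1" "$2") || return 1
    pkg=$(role_package "$role") || return 1
    printf 'apt-get -y install %s\n' "$pkg"
}
```

A post-install script would run the printed command instead of printing it; the dependency chain and pre-install hooks of the role package then do the rest.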

Well, long story short, I think the key mistake of capistrano and fabric is to assume Push where you really want Pull. Once that is realized life becomes much easier.

> My understanding is that apt does not help with these problems, so we write scripts or use tools like Fabric to do this.

Apt is of course just one part of the toolchain and scripts will always be involved either way. My point is that a toolchain built around apt most likely has no need for something like fabric. Fabric is just not a very useful abstraction in a scenario involving more than a handful of hosts.

> In an ideal world this is true, but it does happen. For example, one vendor may require a specific type or version of OS, different from the rest. A business may also choose to change the OS from one release to the next.

Well, these are problems technology can't fix. These are problems only the HR department can fix.

> I'm suggesting that it's helpful to not tie your process to a specific one unless you have complete control over the environment, now and for the foreseeable future.

There is a word for systems where nobody assumes "complete control": abandoned.

It's not the packaging, but Fabric (and Capistrano and friends) all look like shell scripts to me.
Think of them as shell-script frameworks. They include commonly used tasks/libraries that are useful for deployment and build tasks. Yes, you could use shell script, but these frameworks give you a more convenient environment. They are also easier to set up and are more portable, since they have fewer external dependencies.
> They are also easier to set up and are more portable, since they have fewer external dependencies.

Sorry, but WTF? This made my toe-nails curl up.

You can't get a much easier setup than "already installed". You can't get much more portable than bash. And you can't get many fewer dependencies than zero.

Seriously, sit down and write the equivalent shell script to whatever fabric/capistrano recipe you're currently using. I'm quite sure you'll be a bit baffled about why you bothered with them in the first place.

To me it seems like Fabric/Capistrano were invented by people, for people, who are afraid to learn the bash syntax. This is unjustified, bash syntax is ugly but trivial.

Try getting a moderately complex shell script to run across different platforms. I dare you.

While it might be a safe bet these days to assume that bash exists (though there are no guarantees), you can't really do anything with the shell alone: you have to call external commands, and they vary from platform to platform. FreeBSD has all sorts of annoying small variations on the standard GNU utilities (or was it the other way around?). And Windows doesn't even have a standard shell.

> To me it seems like Fabric/Capistrano were invented by people, for people, who are afraid to learn the bash syntax.

To me it seems like you never actually used shell script for anything serious.

> Try getting a moderately complex shell script to run across different platforms. I dare you.

That's a broken premise. Your deployment script doesn't need to be (and should not be) complex by any metric. Your dependencies are ssh, tar, mv, cp, rsync/git/svn and a very small number of other utilities which are easily tested or wrapped for compatibility. If you think you need more then you're likely doing it wrong (e.g. trying to reinvent version control and package management at the same time).
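
For a sense of scale, here is a minimal sketch of the target-side half of such a script: unpack a tarball into a release directory and repoint a "current" symlink at it. The releases/current layout and the function name are assumptions for illustration, not something described above. Note that `tar -z` is not in POSIX, although GNU and BSD tar both support it.

```shell
#!/bin/sh
# unpack_release TARBALL ROOT ID: unpack TARBALL into ROOT/releases/ID,
# then repoint ROOT/current at it. The layout is hypothetical.
unpack_release() {
    rel="$2/releases/$3"
    mkdir -p "$rel" || return 1
    tar -xzf "$1" -C "$rel" || return 1
    # Remove the old link first: moving a new link over a symlink that
    # points at a directory would land inside that directory instead.
    rm -f "$2/current"
    ln -s "releases/$3" "$2/current"
}
```

The tarball would get to the host with scp or rsync and the function would be invoked over ssh; rolling back is the same symlink step pointed at an older release directory.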

> and they vary from platform to platform

That's the other broken premise. You don't "build once, run anywhere". You build platform specific modules and only trigger them centrally. Puppet shows the way.

> To me it seems like you never actually used shell script for anything serious.

Hm, let me think, I've created and managed a deployment of >20 racks. But yeah, nothing serious.

> Your deployment script doesn't need (and should not) be complex by any metric.

I kind of agree, but complexity is a relative concept. mv and cp don't exist on Windows. And you really don't have to use exotic commands to run into compatibility issues between bsd and linux. It's not long ago I had an error reported due to the fact that readlink doesn't behave the same on linux and bsd. I don't think readlink is in the "too complicated" basket.
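
Which readlink behavior differed is not stated here. One commonly cited gap is the GNU `-f` (canonicalize) option, which some BSD-derived implementations (macOS for many years) did not support. Assuming that is the kind of difference meant, a wrapper that sticks to POSIX `cd -P` and `pwd -P` is one way around it. The function name real_dir is made up, and it resolves only the directory part of a path.

```shell
#!/bin/sh
# real_dir PATH: print the physical (symlink-free) directory that contains
# PATH, without relying on readlink -f. Runs in a subshell so the caller's
# working directory is untouched.
real_dir() {
    ( cd -P "$(dirname "$1")" 2>/dev/null && pwd -P )
}
```

A script that wants to find its own install directory could call `real_dir "$0"` and get the same answer on Linux and the BSDs.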

> That's the other broken premise. You don't "build once, run anywhere".

Why would I prefer to write three different deployment scripts, if I could write one? Am I missing something here?

> Puppet shows the way.

I don't know Puppet. I'll have a look at it.

> You can't get much more portable than bash.

I was with you up to this point, bash is not portable, and it is a hideous shell.

Any sane hacker writes all their scripts that are meant to be portable in standard bourne shell. Bourne is to bash as C is to C++. And people that put #!/bin/bash at the top of scripts that can be run by any bourne shell will burn in gnu/hell for the rest of eternity.
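
To make the distinction concrete, a small sketch: prefix matching is often written with bash's `[[ ... ]]`, but a plain `case` does the same job under any Bourne-style /bin/sh. The function name and the host naming scheme are made up for the example.

```shell
#!/bin/sh
# bash-only version:   if [[ $host == web* ]]; then ...; fi
# Portable version:    case pattern matching, available in every Bourne shell.
is_web_host() {
    case "$1" in
        web*) return 0 ;;
        *)    return 1 ;;
    esac
}

for host in web01 db01 web02; do
    if is_web_host "$host"; then
        echo "$host is a web host"
    fi
done
```

Nothing here needs #!/bin/bash, so the script runs unchanged where only a minimal sh is installed.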

> I was with you up to this point, bash is not portable

I have yet to encounter a system where bash wasn't available, so yes I'd say it's quite portable.

You are right, though, it would be more consistent to go with the vanilla bourne shell.

> and it is a hideous shell.

Well, that's a holy war I'm not so interested in. In my opinion all shells are quite horrible. The idea is to pick the lowest common denominator and bash just happens to be the most popular of the bunch. Your chances of finding a working /bin/bash on any given system are still orders of magnitude higher than finding a working capistrano/fabric along with the corresponding ruby/python toolchain.

But again, I agree that if you're forced to deal with esoteric platforms then your chances of finding a working bourne shell are even higher than that.

> I have yet to encounter a system where bash wasn't available, so yes I'd say it's quite portable.

Most systems (other than linux and OS X) do not include bash by default, and not everyone is able or willing to install it (and its dependencies) just to run a silly shell script. sh, on the other hand, is almost as universal as ed(1), and implements the most sane subset of bash anyway.

Other systems (eg., Plan 9) don't have bash available at all (this is considered by some as a feature), while bourne is supported, if only for backwards compatibility.

> In my opinion all shells are quite horrible.

Most mainstream shells are indeed horrible (don't get me started on csh), but [there are some quite sane shells out there](http://rc.cat-v.org).

> The idea is to pick the lowest common denominator and bash just happens to be the most popular of the bunch.

lowest common denominator != most popular of the bunch.

> Your chances of finding a working /bin/bash on any given system are still orders of magnitude higher than finding a working capistrano/fabric along with the corresponding ruby/python toolchain.

And your chances of finding a working /bin/sh on any given system are still orders of magnitude higher than finding a working /bin/bash.

If you're distributing software to end users a packaging system is probably the way to go, but if you're looking to deploy code from development environments to staging and production servers then something like Fabric or Capistrano is the way to go.
Why? With the host packaging system I get versioned packages, through host management tools I can control which packages my production, staging, and development hosts should have, and I can describe my dependencies on host libraries and software through the host system's own tools.

Together with distribution systems like apt I can also significantly ease deployment.

I can see that executing some commands over a set of hosts at the same time could be useful, but it doesn't sound like a killer feature to me.

As for deploying from staging to production servers, it sounds tidier to build proper packages, deploy them in staging and test there, and then deploy the same packages to production.

> I can see that executing some commands over a set of hosts at the same time could be useful

for h in $hosts; do ssh $h "my command"; done

> As for deploying from staging to production servers, it sounds more tidy to build proper packages to deploy in staging and test before deploying the same packages to production.

Amen.

dsh is worth looking at. It's essentially a for loop that runs ssh, but it can also get named groups of machines from config files, run them in parallel, and prefix output lines with the machine they came from.
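
A rough sketch of that behavior in plain sh, which is not how dsh itself is implemented: start one job per host in the background, tag every output line with the host name, and wait for all of them. The RUNNER variable is a hypothetical hook so the function can be tried without real hosts; when it is unset, ssh is used.

```shell
#!/bin/sh
# run_all CMD HOST...: run CMD on every host in parallel and prefix each
# output line with the host it came from. Host names containing "/" or "&"
# would confuse the sed expression; plain host names are assumed.
run_all() {
    cmd=$1
    shift
    for h in "$@"; do
        ${RUNNER:-ssh} "$h" "$cmd" 2>&1 | sed "s/^/$h: /" &
    done
    wait
}
```

Something like `run_all uptime web01 web02 db01` would then print one tagged line per host, in whatever order the hosts happen to answer.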