| Disclaimer: infrastructure secrets management is my profession. This is a much harder problem than people realize. If you have a fixed set of machines that need secrets, then encrypting a bag of secrets to each machine's public key (so that only its private key can decrypt it) works OK. But in autoscaling / automated / ephemeral scenarios, it doesn't work. You need an RBAC scheme for machines that builds layers of trust: each machine is placed into a role by a trusted service, script or person. Communication between the machines and the secrets service goes over verified TLS. Every access to, or modification of, a secret is recorded for audit purposes. And people and machines should both be treated as first-class actors. Furthermore, secrets should be kept off permanent media; per the 12-factor guidelines, secrets should come from environment variables. Don't entangle secrets management with other tools like configuration management; otherwise you make it harder to switch architectures down the road. Don't create workflows that only ops can control, leaving developers out in the cold, or you increase organizational friction. And if your secrets management processes are opaque to security and compliance people, they won't trust them as much as they would a transparent system. Here's an example of how we approach the problem: http://blog.conjur.net/chef-cookbook-uploads-with-conjur |
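
To make the moving parts above concrete, here is a rough sketch of the role / audit / environment-variable idea as a toy in-memory service. It is not how Conjur or any real product works. `SecretsService`, `launch_env`, and the actor, role, and secret names are all made up for illustration. TLS verification and real authentication are left out entirely.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class SecretsService:
    """Hypothetical toy service: role-based reads plus an audit trail."""
    secrets: dict = field(default_factory=dict)    # secret name -> value
    grants: dict = field(default_factory=dict)     # secret name -> roles allowed to read it
    roles: dict = field(default_factory=dict)      # actor id (person or machine) -> role
    audit_log: list = field(default_factory=list)  # one entry per access or change

    def enroll(self, trusted_by, actor, role):
        # A trusted service, script, or person places the actor into a role.
        self.roles[actor] = role
        self._record(trusted_by, "enroll", f"{actor} -> {role}", True)

    def put(self, actor, name, value, readers):
        # Store or update a secret and the set of roles that may read it.
        self.secrets[name] = value
        self.grants[name] = set(readers)
        self._record(actor, "update", name, True)

    def fetch(self, actor, name):
        # Allowed only if the actor's role is among the secret's readers.
        allowed = self.roles.get(actor) in self.grants.get(name, set())
        self._record(actor, "read", name, allowed)  # denied attempts are logged too
        if not allowed:
            raise PermissionError(f"{actor} may not read {name}")
        return self.secrets[name]

    def _record(self, actor, action, target, allowed):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor, "action": action,
            "target": target, "allowed": allowed,
        })


def launch_env(service, actor, names):
    """Build an environment mapping for a child process; nothing is written to disk."""
    env = dict(os.environ)
    for name in names:
        env[name] = service.fetch(actor, name)
    return env


service = SecretsService()
service.enroll("provisioner", "host-42", "web")  # machine actor
service.put("alice", "DB_PASSWORD", "s3cret", readers={"web"})  # human actor
env = launch_env(service, "host-42", ["DB_PASSWORD"])
```

The resulting `env` could be passed to something like `subprocess.run(cmd, env=env)`, so the secret only ever lives in process memory. A machine that was never enrolled gets a `PermissionError`, and the failed attempt still shows up in `audit_log`.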
This makes using ssh-agent with a reasonable timeout incredibly painful.
So you're left with either re-entering your passphrase every 5/10/15 minutes, or basically never. Using smartcards for humans and TPMs for servers is a step in the right direction, but it seems ssh-agent is still missing this basic functionality - or am I missing something?