by timeattack 2168 days ago
It's a cool and interesting application of the technology, but it doesn't really seem practical.

When you're unable to access a machine using your standard SSH keys, it usually means you're unlikely to be able to log in remotely by other means either.

As an emergency login there are two common options:

* in case of cloud: use the remote VM console provided by the hosting provider.

* in case of bare-metal: use IPMI to access the machine console directly.

6 comments

Hey there — I'm the author of this post.

There are a few scenarios where I imagined this approach being useful:

* If you have any kind of remote dependency in your SSH auth flow (LDAP, or an online CA, or automated Ansible playbooks to push keys), any of those might fail and render the host otherwise inaccessible.

* It's becoming more common to not ever SSH into machines. So, what if emergency SSH access is the only way to access a host? Some companies even go a few steps further: When a host is SSH'd into, it is considered "tainted by humans", is quarantined and eventually shut down.

* Some hosts should never allow root access to anyone. For example, there's no reason for anyone to have root on a bastion host. So, what if the only way to get root on some hosts is with the emergency key?

While you could use the cloud VM console for emergency access in these cases, having a hardware key provides even more security and would let you turn off cloud VM access.

Of course if you broke your SSHD config, or have a network issue that prevents you from reaching the host, this won't magically fix any of that. IPMI is good for that though.

> While you could use the cloud VM console for emergency access in these cases, having a hardware key provides even more security and would let you turn off cloud VM access.

I'm not sure it's more secure, but I suppose it depends on the provider. Your control of your account's admin key (or password) is the last bastion of security for most providers.

> Of course if you broke your SSHD config, or have a network issue that prevents you from reaching the host, this won't magically fix any of that. IPMI is good for that though.

This is why I just use the providers' emergency management (or IPMI). Easier to have one method of emergency access that always works regardless of the guest. The guest's root (or emergency) account can still have a pretty darned complex password.

> It's becoming more common to not ever SSH into machines

This is a reality for me. At work we run a handful of distributed clusters. If anyone does the equivalent of SSHing into a box and poking around (in our case, `kubectl exec`), the infrastructure team gets an alert, then follows up with whoever invoked the command. If they are debugging, we shift whatever resources they need into dev. If they are not debugging, they will probably get questioned by their boss. (Fortunately, most of the time this chat results in, "oh wow I didn't know about the APM/Metrics/Graphs/Logs/etc setup we had, I'll check that next time".)

There’s also the access control feature of this approach. You can give someone temp access to a host.
That's exactly the reason why we use certificate-based SSH access at my employer's: for our suppliers. It looks like the author reached pretty far to find alternate reasons to deploy this :/
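For anyone wondering what the temp-access part looks like in practice, here is a rough sketch of signing a key with a short validity window. The function name, the identity `supplier-42`, and the principal `deploy` are made up for illustration; only the `ssh-keygen` flags are real.

```shell
#!/bin/bash
# Sketch: sign a user's public key with a CA key so it is valid only briefly.
# The function name and argument order are illustrative, not a standard tool.
issue_short_lived_cert() {
  local ca_key=$1 user_pub=$2 identity=$3 principal=$4 lifetime=${5:-+1h}
  # -s: CA private key, -I: key identity that sshd records when the cert is used,
  # -n: principal (login name) the cert is good for, -V: validity interval.
  ssh-keygen -q -s "$ca_key" -I "$identity" -n "$principal" -V "$lifetime" "$user_pub"
}

# Example (hypothetical paths and names):
#   issue_short_lived_cert ./emergency_ca ./alice.pub supplier-42 deploy +1h
#   ssh-keygen -L -f ./alice-cert.pub   # prints the "Valid: from ... to ..." line
```

The host only needs to trust the CA's public key (the `TrustedUserCAKeys` option in sshd_config), so nothing has to be pushed to or removed from the host when the certificate expires.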
> Valid: from 2020-06-24T16:53:03 to 2020-06-24T16:03:03

Almost! I think you meant:

Valid: from 2020-06-24T16:53:03 to 2020-06-24T17:03:03
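A window that ends before it starts is easy to catch mechanically. A minimal sketch, assuming both timestamps use the same zero-padded format as above (the function name is made up):

```shell
#!/bin/bash
# Succeeds only if the window ends after it starts. Relies on both timestamps
# using the same zero-padded YYYY-MM-DDTHH:MM:SS format, which sorts correctly
# as plain strings.
validity_window_ok() {
  local from=$1 to=$2
  [[ "$from" < "$to" ]]
}

validity_window_ok 2020-06-24T16:53:03 2020-06-24T16:03:03 || echo "invalid: ends before it starts"
validity_window_ok 2020-06-24T16:53:03 2020-06-24T17:03:03 && echo "ok"
```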

Yep, the most common way I've lost access to machines is by messing up the iptables/ipfw rules. Read a post here about avoiding that by having a timed reset with sleep.
For people asking: you can create a resetfw.sh script, for iptables:

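  #!/bin/bash
  # Open all default policies first so flushing cannot lock you out.
  iptables -P INPUT ACCEPT
  iptables -P FORWARD ACCEPT
  iptables -P OUTPUT ACCEPT
  # Flush the nat, mangle, and filter tables, then delete user-defined chains.
  iptables -t nat -F
  iptables -t mangle -F
  iptables -F
  iptables -X

  chmod +x resetfw.sh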

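and add it, for example, to the /etc/cron.hourly directory. (On Debian-based systems, run-parts skips file names containing a dot, so name the copy resetfw rather than resetfw.sh there.)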

This way you can test your iptables rules and they'll get cleared every hour. Once you've checked they are OK, you can delete this cron job.

(NOTE: I'm typing from memory, haven't tested this)

https://manpages.debian.org/stretch/iptables/iptables-apply....
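The idea behind that tool is confirm-or-roll-back: apply the new rules, then undo them unless you confirm in time. A bare-bones sketch of the pattern follows. The function and argument names are hypothetical, and this is not how iptables-apply itself is implemented.

```shell
#!/bin/bash
# Sketch of confirm-or-roll-back: run an apply command, then run the rollback
# command unless the operator types "y" within the timeout.
# Run it inside tmux or screen so a dropped session does not kill the script.
apply_with_rollback() {
  local apply_cmd=$1 rollback_cmd=$2 timeout=${3:-30} answer=""
  eval "$apply_cmd" || return 1
  printf 'Keep the new rules? Type y within %s seconds: ' "$timeout" >&2
  # read -t fails on timeout or EOF; in both cases we fall through to rollback.
  read -r -t "$timeout" answer || true
  if [ "$answer" = "y" ]; then
    echo "kept"
  else
    eval "$rollback_cmd"
    echo "rolled back"
  fi
}

# Example (as root, with hypothetical rule files):
#   apply_with_rollback 'iptables-restore < new.rules' 'iptables-restore < old.rules' 30
```

If the new rules cut off your session, you can't answer the prompt, so the old rules come back after the timeout.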

Or use `at` to run `iptables-restore`. Simpler than setting up a cron job (and if you're doing it manually, cron has a bunch of gotchas that bite me in the ass once in a blue moon).
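Roughly like this, as a sketch. The backup path, the function name, and the DRY_RUN switch are my own placeholders; `at`, `atrm`, `iptables-save`, and `iptables-restore` are the real tools.

```shell
#!/bin/bash
# Sketch: schedule a one-off firewall rollback with at(1).
# RULES_BACKUP and DRY_RUN are illustrative assumptions, not standard names.
RULES_BACKUP=${RULES_BACKUP:-/root/iptables.known-good}

schedule_rollback() {
  local minutes=${1:-10}
  local job="iptables-restore < $RULES_BACKUP"
  if [ -n "${DRY_RUN:-}" ]; then
    # Only show what would be scheduled.
    echo "would run in $minutes minutes: $job"
  else
    echo "$job" | at now + "$minutes" minutes
  fi
}

# Typical sequence (as root):
#   iptables-save > "$RULES_BACKUP"   # snapshot the working rules first
#   schedule_rollback 10              # note the job number that at prints
#   ...change rules, confirm you can still log in...
#   atrm <job number>                 # cancel the rollback once satisfied
```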

Yes. Although IIRC (it may have changed, I haven't looked "recently") the iptables-* commands are distro-specific, as in not all distros have (or had) them.
You might add a daily task to remove that task just in case you forget. That way you avoid lockout but don't end up opening yourself up accidentally.
Or possibly just turn iptables off, in the same cron.hourly.
Ah yes, that's simpler: systemctl stop iptables. Also need to do systemctl disable iptables just in case, otherwise if the server reboots the iptables service will restart.
This has happened to me as well. Where could I read about this method?
Maybe this:

   service network stop && sleep 10 && service network start
The worst is: sudo ifdown eth0 && ifup eth0
Link?
IPMI is painfully insecure, and therefore assumes the existence of a completely separate, protected network. Some people don't colocate more than a few machines (and therefore can't justify the extra infrastructure for an IPMI OOB network), don't want to pay extra for a colo provider to provide IPMI OOB, and/or don't trust their colo provider to have access to such a sensitive and insecure thing.

Having an emergency method to connect is an excellent idea.

Not sure about other vendors, but I know Cisco offers dial-in capabilities for managing routers, switches, etc. The dial-in modem on the router is connected to a landline.

Has this approach ever been taken by server admins?

A standard for emergency IPMI or other console-type access would be welcome. Vendors have certainly done a bad job in this space. Break-glass access isn't a new thing.
I think it depends. I've worked in places that had something like the following setup.

- Hardware in datacenters with operators who were not experts on the applications running.

- All remote access was done using a short term (~1 day) ssh keys. There was an authentication service to generate these.

It was pretty easy to imagine that the authentication service would go down. In this case a selection of people who worked on the infrastructure had longer-term keys on HSMs. (With very high logging and alerting for any use). It would actually make sense for these to be CA keys so that they could access different user accounts or similar.

TL;DR you are assuming a very basic SSH auth setup. As the regular setup gets more complicated having something like this as a backup makes sense.

> All remote access was done using a short term (~1 day) ssh keys. There was an authentication service to generate these.

This is weird. Really weird.

Did that service use a more secure authentication storage than a password protected key?

It’s really not - by limiting the life of keys, and having a service generating them, you can more effectively lock things down when someone leaves, rather than going round revoking keys from servers. Something we’re experimenting with at work is EC2 Instance Connect, which uses your AWS credentials to push a key to a target instance with 1 minute validity - no more managing keys on instances, and revoking access is just a change to an IAM policy.
As opposed to having a few bastion-hosts, and requiring people to log in there in order to then ssh on to their final destinations -- in that case, revoking their keys is as simple as wiping their accounts on the bastion hosts.
Even with a few bastion hosts things get hard to track quickly as you end up with multiple clusters (dev/staging/UAT/production), and potentially multiple production clusters in different regions.
It seems weird but has several advantages. Most places screw up defunct account cleanup and privilege management.

A process like this allows you to ensure that people have the access they need and makes it easy to get them the privilege separation needed.

Yes, the system used multi-factor auth and could be locked for suspicious activities.