

	by merpkz 46 days ago
	How do those of you who run Docker in production manage the nftables firewall on hosts running containers? By design, the Docker daemon creates and manages a set of firewall rules to forward traffic between containers, route ingress traffic into containers, and masquerade outgoing container traffic. That works well until an admin needs to alter the host's firewall to allow or deny other traffic unrelated to Docker: restarting nftables, or even applying new nftables rules, usually purges all the Docker-created rules (because of the flush ruleset in /etc/nftables.conf) and effectively breaks everything until the Docker daemon is restarted and the rules are re-created. I have partially solved this by using nftables filter chains with different names (admin_input/admin_output) and an input hook with negative priority, so that traffic I choose to block is evaluated before the Docker rules are applied. That feels a bit like a hack, but so far it is the only way I have found. It is good practice in this day and age to run local firewalls on all hosts with a default-deny policy, so that only explicitly allowed traffic can pass; that can severely limit the blast radius during a compromise.
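The separate-chain approach described above can be sketched as a small generator for an nftables config. This is only an illustration under stated assumptions: the table name `admin`, the priority value -10, and the example SSH source addresses are hypothetical, and the generated file resets only its own table (declare, delete, re-create) rather than flushing the whole ruleset.

```python
def render_admin_table(ssh_sources, priority=-10):
    """Render an nftables config that manages only its own table.

    ssh_sources: IPv4 addresses allowed to reach SSH (hypothetical examples).
    priority: hook priority; a negative value runs before priority-0 filter chains.
    """
    lines = [
        "#!/usr/sbin/nft -f",
        "# Declare, delete, re-create: only this table is reset on reload,",
        "# so tables created by other tools are left untouched.",
        "table inet admin",
        "delete table inet admin",
        "table inet admin {",
        "    chain admin_input {",
        f"        type filter hook input priority {priority}; policy drop;",
        "        ct state established,related accept",
        '        iifname "lo" accept',
    ]
    if ssh_sources:
        # An empty set would be a syntax error, so the rule is skipped entirely.
        sources = ", ".join(ssh_sources)
        lines.append(f"        ip saddr {{ {sources} }} tcp dport 22 accept")
    lines += ["    }", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_admin_table(["192.0.2.10", "192.0.2.11"]), end="")
```

In nftables a drop verdict is final, while an accept only ends evaluation in the current base chain, so a packet accepted here is still seen by later chains on the same hook. The input hook covers traffic addressed to the host itself; forwarded container traffic goes through the forward hook and would need its own chain.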

8 comments

dizhn 46 days ago

My containers run in dedicated "docker host" VMs. And I never expose ports on 0.0.0.0, just the private internal IP. Most (all) of my docker hosts do not have a public IP anyway. I use wireguard to access them myself. If they need to be public I reverse proxy with caddy from my web server (or use Authentik's embedded proxy). These servers have access to the same private LAN which could be hardened without having the issues you brought up.

By the way, most Docker-based setups do not actually need the userland proxy that Docker runs automatically. Disable it in /etc/docker/daemon.json:

{
    "userland-proxy": false
}
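If the daemon config already contains other settings, the key can be merged in rather than overwriting the file. A minimal sketch, assuming the file content is passed in as text; the helper name `disable_userland_proxy` is made up for illustration, and `log-driver` in the usage example just stands in for whatever keys are already present.

```python
import json


def disable_userland_proxy(daemon_json_text):
    """Return daemon.json content with "userland-proxy" set to false.

    Existing keys are preserved; empty input is treated as an empty config.
    """
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["userland-proxy"] = False
    return json.dumps(config, indent=4) + "\n"


if __name__ == "__main__":
    existing = '{"log-driver": "journald"}'
    print(disable_userland_proxy(existing), end="")
```

The Docker daemon has to be restarted before the change takes effect.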

zbentley 46 days ago

https://www.macchaffee.com/blog/2024/you-have-built-a-kubern...

Like, if that works for you, more power to you. But that is a lot of moving parts in exchange for using a tool whose value prop is that it doesn't have many.

chickensong 46 days ago

That's neither Kubernetes nor a lot of moving parts, just basic sysadmin setup for good hygiene and peace of mind.

dizhn 46 days ago

I wish. There's nothing like Kubernetes here nor the features it gives you or any need for them. Just some basic sys admin stuff that works well for me.

hkpack 46 days ago

This is the way, ended up using identical setup.

KetoManx64 46 days ago

What would the config look like if I have my docker containers split up over multiple VMs?

dizhn 46 days ago

I have all of mine on the same (or accessible) internal LAN so they can all talk to each other. You can get the connection going with Wireguard if they are in different places in terms of networking.

KetoManx64 46 days ago

As in you have a VLAN just for the docker containers to talk to each other on?

dizhn 46 days ago

Amounts to the same thing, but no. Proxmox servers with two bridged interfaces: one interface has a public IP, the other is on a private subnet such as 10.0.10.0/24. Multiple bare-metal servers are connected by WireGuard and have access to each other's private subnets; another one might be 10.0.20.0/24, for example. Set up the routes and you are good to go. Firewall to taste. My private LAN is all open.

This is not just for docker. There are other vms and lxc containers too.
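Routing between sites like this only works if the private subnets do not overlap. A quick sanity check, sketched with Python's standard `ipaddress` module; the helper name is hypothetical and the subnets are the examples from the comment above.

```python
import ipaddress
from itertools import combinations


def overlapping_subnets(subnets):
    """Return every pair of subnets (as strings) that overlap each other."""
    nets = [ipaddress.ip_network(s) for s in subnets]
    return [(str(a), str(b)) for a, b in combinations(nets, 2) if a.overlaps(b)]


if __name__ == "__main__":
    # An empty result means the subnets can be routed side by side.
    print(overlapping_subnets(["10.0.10.0/24", "10.0.20.0/24"]))
```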

KetoManx64 46 days ago

Very interesting way to set things up. Thanks for the breakdown! It's given me some ideas for our non-prod Proxmox cluster.

Lord_Zero 46 days ago

Could you elaborate on your setup? Is the docker host also your web server on which you run caddy?

dizhn 46 days ago

No, it just needs to have a route to the internal IP of the docker host, and you expose your ports on that IP. Let me know if you need more details. You could also put the reverse proxy (Caddy in my case) on the docker host.

BigTuna 46 days ago

I reverse proxy everything through a Caddy instance running on the same machine, so I avoid the firewall dance entirely by just prefixing all my port assignments in the compose file with the loopback IP (e.g. 127.0.0.1:3000:3000). Nftables denies everything except 80 and 443, and I don't have to worry about restarts or flushes breaking things.

selfmodruntime 46 days ago

A really nifty thing is that you can also of course bind this to the device's tailscale ip!

Also you don't even need the loopback address if the traffic is between one container and another, just a bridge network is fine.

giobox 46 days ago

This is how I self-host all my home services (Home Assistant, pfSense, Frigate etc.). I do not for the life of me understand why so many folks running self-hosted services for themselves put them on the public internet.

Caddy will even do fully automated valid TLS certificates for private IP ranges via DNS ACME challenge for free etc with renewals handled, so all my internal self-hosted sites have properly terminated TLS too, accessible by connected VPN clients.

It's funny that for many of us in our day job, we stand up private services behind a VPN all the time so only work clients can access it, but when self hosting don't bother with a simple wireguard/tailscale config etc.

selfmodruntime 45 days ago

A lot of people using Docker or even k8s don't know that, by default, a service is available to all other services via the service name defined in the compose file or your YAML specs. Docker Compose builds an implicit bridge network. Most internet tutorials get this wrong and bind ports publicly to your IPv4 interface, so if you follow them you'll accidentally expose your database or similar to the public web.
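A rough way to spot the pattern in a compose file is to check which host address each short-form port mapping binds to. The sketch below assumes default daemon settings, where a mapping without a host IP is published on all interfaces; the function name is hypothetical, and only the short `[HOST_IP:]HOST_PORT:CONTAINER_PORT[/proto]` string form is handled.

```python
import ipaddress


def binds_all_interfaces(mapping):
    """Return True if a short-form port mapping is published on all interfaces.

    Examples of the accepted form: "3000:3000", "127.0.0.1:3000:3000",
    "[::1]:3000:3000", "0.0.0.0:5432:5432/tcp".
    """
    spec = mapping.split("/", 1)[0]  # drop an optional "/tcp" or "/udp" suffix
    if spec.startswith("["):
        # Bracketed IPv6 host address, e.g. "[::1]:3000:3000".
        host_ip = spec[1:spec.index("]")]
    else:
        parts = spec.split(":")
        host_ip = parts[0] if len(parts) == 3 else None
    if host_ip is None:
        return True  # no host IP given: published on every interface by default
    return ipaddress.ip_address(host_ip).is_unspecified


if __name__ == "__main__":
    for m in ["3000:3000", "127.0.0.1:3000:3000", "0.0.0.0:5432:5432/tcp"]:
        print(m, binds_all_interfaces(m))
```

Mappings bound to loopback or to a specific private address are reported as not public, which matches the setups described earlier in the thread.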

danparsonson 46 days ago

This is surely the easiest and I would guess the safest way, and has the added benefit that your proxy (nginx in my case) can handle SSL for you, making certificate deployment a breeze.

sneak 46 days ago

On my docker hosts there is no other traffic unrelated to docker. Everything goes in containers.

merpkz 46 days ago

Well, as an example, we usually set incoming rules to allow SSH only from administrator IP addresses and TCP 10050 only from the Zabbix monitoring server, and leave the few ICMP types that are required; the rest is dropped and logged.

For the forward chain, we allow the Docker network ranges to route between themselves, and only for the services actually used in containers. We allow outgoing container connections to our DNS servers, the centralized HTTP proxy server, and monitoring; containers are not allowed to route to anything else.

Output is similar: we only allow our DNS servers, NTP, the HTTP proxy, the centralized rsyslog where everything goes, the Zabbix monitoring server, and a few ICMP types. Nothing else gets out, and everything else is logged.

With the advent of the supply chain attacks we often read about here, it's just a matter of time before some container is compromised, and this seems like the only viable way to at least somewhat limit the impact when such an event occurs.

nijave 46 days ago

To expand, you can use privileged containers, host networking, capabilities, etc. if the software really needs it. In that case, Docker basically becomes an init system/service manager, but you get a single daemon managing everything.

gomoboo 46 days ago

I put a firewall ahead of the Docker host so that they aren't running on the same system. Docker can do what it wants to on the host without stepping on my firewall rules.

declan_roberts 46 days ago

It makes sense but that's more overhead and the spirit of the post seems to be "can we just docker compose and be done with it?"

przemub 46 days ago

I use UFW, and this config: github.com/chaifeng/ufw-docker

The only modification is that I pin containers to an IPv4 address so I can limit the forward rule to that address.

ornornor 46 days ago

Adding to the other answers: many cloud providers, including more reasonably priced ones like Hetzner, offer a firewall as a service, so you can configure the firewall there instead of on the OS itself.

nijave 46 days ago

I don't. I'd run other workloads on separate hosts.

papascrubs 46 days ago

firewalld supports docker and handles all of its routing/changes. I've standardized on using it in my environment.