by nijave 1804 days ago

>Accessing production on the command line is an anti-pattern

Seems to be at odds with

>then one can create some sort of debugging service that runs on another port and deploy it to investigate the bug

In many cases, that's just SSH. Most of the time I'm not copying files around; I want to connect to the real environment where firewall rules, API keys, permission systems, overlay networks, etc. are in place. If there's a stuck process (say, lock contention), it's much easier to just SSH in, attach gdb, and check the stack to see what it's doing. Some languages, like Java, have pretty rich tooling out of the box for remotely connecting to processes. For others, like Python and Ruby, you just use gdb.
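
A rough sketch of what "SSH in and run gdb" can look like once you're on the box, assuming gdb is installed there and you have permission to ptrace the process. The function names are made up for illustration; the gdb flags (`-p`, `-batch`, `-ex`) are the standard ones.

```python
import subprocess
import sys
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb invocation that attaches to `pid`, prints every
    thread's native stack, and then exits (batch mode)."""
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_stacks(pid: int, timeout: float = 30.0) -> str:
    """Run gdb against the stuck process and return whatever it printed.
    The process is paused while gdb is attached, so keep this short."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) == 2:
    print(dump_stacks(int(sys.argv[1])))
```

This only gives you native (C-level) frames. For Python or Ruby you'd probably want the interpreter's gdb extensions on top to map those back to source lines.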

Either way, there's no copying data necessary--you just need access to the running process. For a large system with hundreds of identical servers, I don't want to deploy a debug service everywhere; I just want to connect to the one with an issue and check that.

Snapshotting works sometimes, but I used stuck processes as an example since that's usually where all the remote-debugging/logging approaches fall apart. And, as it so happens, things like lock contention tend to be really hard to recreate in synthetic or simulated environments that don't have real, authentic load.

Keep in mind that doesn't mean "go crazy with `root` in production". You can combine that strategy with scripting and tooling to drain/isolate/quarantine servers where the stuck process is still running but they don't have live traffic being routed to them.
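
To make the drain/quarantine idea concrete, here's a toy sketch. `ServerPool` is a hypothetical stand-in for whatever your load balancer or service discovery actually exposes, not a real API. The point is only the ordering: pull the host out of rotation first, debug second, put it back (or recycle it) last.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ServerPool:
    """Hypothetical in-memory model of a load balancer's backend pool."""
    in_rotation: Set[str] = field(default_factory=set)
    quarantined: Set[str] = field(default_factory=set)

    def quarantine(self, host: str) -> None:
        """Stop routing live traffic to `host` but leave it running,
        so the stuck process is still there to inspect."""
        if host not in self.in_rotation:
            raise KeyError(f"{host} is not in rotation")
        self.in_rotation.remove(host)
        self.quarantined.add(host)

    def release(self, host: str) -> None:
        """Return a quarantined host to rotation after debugging."""
        if host not in self.quarantined:
            raise KeyError(f"{host} is not quarantined")
        self.quarantined.remove(host)
        self.in_rotation.add(host)


pool = ServerPool(in_rotation={"app-1", "app-2", "app-3"})
pool.quarantine("app-2")   # app-2 keeps running, gets no live traffic
# ... SSH to app-2 and inspect the stuck process here ...
pool.release("app-2")
```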

I see this "ZOMG NO ONE TOUCH PROD" mentality a lot in highly regulated environments, but it's usually more sustainable to isolate the in-scope functionality as narrowly as possible so you don't drag more systems into scope than necessary (e.g. put the billing functionality in a microservice to limit PCI scope).