HN Mirror

by orf 2563 days ago

1. Why are you running docker volume prune in production?

2. Why are you running docker on ad-hoc machines you need to prune?

3. Why do you even need root access on production machines to fiddle around with docker commands?

While this is obviously a bad bug (and there are many with Docker), it seems more of an operational procedures failure than anything else. You could be saying:

“Beware of rm -rf /, it just deleted 20gb of production data”

Ok. Sure. But why are your tools and procedures putting you in a position to make that mistake?

3 comments

reaperducer 2563 days ago

One of the most bothersome parts of HN is when someone tells us about something that happened, and out comes a ream of second-guessing replies. "Why didn't you just do this?" and "Why didn't you just do that?" and any number of "It's so easy to just thing instead!"

We don't know his environment. We don't know his company's policies. We don't know his hardware, connectivity, or budget issues. These kinds of passive-aggressive responses are almost never helpful.

orf 2563 days ago

When you reduce it down the title here is “giving people access to running arbitrary, manual and presumably unrestricted maintenance commands in production leads to issues”.

That’s not a surprise, and maybe the issue at the core here is not really Docker. That’s all.

james_s_tayler 2563 days ago

I agree with both of you. It's not helpful to second-guess without knowing the context, and the op wasn't necessarily in control of it. But at the same time, if you are someone in control of the context (which you aren't really if you are a line level employee), you should be aware this is a bad pattern. If you are a line level employee and this is being imposed on you for some reason or other, you should sound an alarm if you know to: "hey, for the record - this is a monumentally bad idea - just saying".

I've seen plenty of stuff in my career where I've gone on record to say "hey - we really shouldn't do this". Nothing got done about it. But hey, I did what I could.

Recently I learned about Rasmussen's dynamic safety model. I think this is a very handy mental model to have. It's the human factors that make what we do really hard. Often line level practitioners know better than they are allowed to do in practice and trying to fight organizational politics to Do The Right Thing can be an uphill battle.

q_queue 2563 days ago

Sure, but regardless of people doing dumb things, it's still worth asking "why did docker delete non-orphaned named volumes?" -- though you could also question whether someone was actually mistaken about them not being "orphaned" - you could probably arrange an unfortunate timing collision between someone running prune and a container being respawned.

uponcoffee 2563 days ago

Right, that's what raising an issue with the software maintainers is for.

Aside from anecdotes, there's little value in further discussion beyond the PSA that is the original post; save for prevention/recovery of such events.

windexh8er 2563 days ago

It almost sounds as if the daemon was in the process of starting the containers when the prune command was issued. If it were run with `-f` and the container wasn't running, those volumes would be deleted. I tried this on a test system and didn't get the results in the issue.
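
For anyone who wants to check before pruning, here is a minimal sketch of previewing what Docker currently reports as unused and refusing to prune if a volume you care about is on that list. The helper names `preview_prune` and `safe_prune` are made up for illustration; the only Docker commands assumed are `docker volume ls -q --filter dangling=true` and `docker volume prune -f`. It does not close the timing window described above, since a container could still be starting between the listing and the prune.

```shell
#!/bin/sh
# Hypothetical helper: list the volumes Docker currently reports as
# dangling (not referenced by any container), one name per line.
preview_prune() {
    docker volume ls -q --filter dangling=true
}

# Hypothetical helper: refuse to prune if any volume name passed as an
# argument shows up in the dangling list.
# Usage: safe_prune important_volume another_volume
safe_prune() {
    dangling=$(preview_prune) || return 1
    for keep in "$@"; do
        # -x matches whole lines, -F treats the name as a fixed string.
        if printf '%s\n' "$dangling" | grep -qxF -- "$keep"; then
            echo "refusing to prune: '$keep' is listed as dangling" >&2
            return 1
        fi
    done
    docker volume prune -f
}
```

Calling `safe_prune` with the names of the volumes that must survive turns a silent deletion into a visible refusal, at least for the case where the volume is already reported as dangling when the check runs.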

int0x80 2563 days ago

Well, no. rm -rf / is a completely different beast. It is documented and expected for starters.

They may have valid reasons to do that, even if not common.

sethammons 2563 days ago

thinking of the `rm -rf` one, here is a fun take:

  export WORKDIR=Home/me/proj
  ...
  rm -rf /$WORKDIR

If something unsets $WORKDIR or does not set it at all, wave bu-bye to your everything. And before you say "who would do that?!" -- I believe I heard that happened to a build of RedHat that also had some kind of force push and auto-pull and build on their version control so every connected person had their version of the software nuked. If not for the non-connected individuals, the entire software would be gone apparently. Or so the legend goes.
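
A hedged sketch of one way to guard against that in POSIX shell: `${WORKDIR:?}` makes the expansion fail when the variable is unset or empty, so `rm` never gets to see a bare `/`. The function name `clean_workdir` and its optional base-directory argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: delete "$base/$WORKDIR", refusing to run when
# WORKDIR is unset or empty. With no argument, base is empty and the
# target is "/$WORKDIR", as in the snippet above.
clean_workdir() {
    base=${1:-}
    # ${WORKDIR:?...} aborts a non-interactive shell on failure; the
    # subshell keeps that abort from killing the calling script.
    ( rm -rf -- "$base/${WORKDIR:?WORKDIR is unset or empty}" )
}
```

`set -u` at the top of a script also catches the unset case, but not the case where the variable is set to an empty string, which is why the `:?` form is used here.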

remram 2563 days ago

Also the Steam Linux client.
