Hacker News new | ask | show | jobs
by bch 624 days ago
Tcl Improvement Proposal (TIP) 602[0].

[0] https://core.tcl-lang.org/tips/doc/trunk/tip/602.md

1 comments

One example from the document:

> Consider the naive attempt to clean out the /tmp directory.

> cd /tmp

> foreach f [glob *] {file delete -force $f}

> A file ~ or ~user maliciously placed in /tmp will have rather unfortunate consequences.

i once managed to create a directory named ~ using the mirror tool written in perl. then i naively tried to remove it using "rm -r ~" and started wondering why removing an empty directory would take so long, until it dawned on me...

i learned a few new habits since then. i almost never use rm -r and i avoid "*" as a glob by itself. instead i always try to qualify "*" with a path, remove files first: "rm dir/*"; and then remove the empty directory. "rmdir dir/"

if i do want to use rm -r, it is with a long path. eg in order remove stuff in the current directory i may distinctly add a path: rm -r ../currentdir/*" instead of "rm -r *"

related, i also usually run "rm -i", but most importantly, i disable any alias that makes "rm -i" the default, because in order to override the -i you need to use -f, but "rm -i -f" i NOT the same thing as "rm". rm has three levels of safety: "rm -i; rm; rm -f". if "rm -i" is the default the "rm" level gets disabled, because "rm -i -f" is the same as "rm -f"

My main safety habit is to avoid slashless paths.

Bad:

    rm *
Okay:

    rm ./*
    rm /tmp/d/*
    rm */deadmeat
    rm d/*
Then again, I commonly use dangerous things like `mv somefile{,.away}` that are easy to get wrong, so maybe don't trust my advice too much.

  rm -rf "$TSTDIR"/etc
is pretty dangerous when you forget to set the env var
Fair! Upvoted.

I guess I'm not likely to type that into the shell, or if I do, I then tab-complete to expand it.

I could definitely see myself using that in a shell script, though. I tend to do validity checks there:

    if ! [ -d "$TSTDIR" ]; echo "$TSTDIR not found, stupid" >&2; exit 1; fi
but that's kind of irrelevant, since if I need it to exist then I won't be removing it. Plus, I could totally see myself doing

    if [ -d "$TESTDIR" ]; then
      rm -rf "$TSTDIR"/etc
    fi
In bash, `set -u` or `"${TSTDIR:?Error: TSTSDIR is required.}/etc"` protects from such errors.
My safety technique is to echo the commands before I do the actual commands as a sanity check, e.g.

for i in $(find something); do echo "rm -f $i"; done

(bash example as my TCL is rusty)

Change your do block to `printf %q\ rm -f "$i" ; echo` and it won't lie about spaces. In case HN has "trimmed" my post in some way, as it often does, that's: percent q backslash space space. Works in bash/zsh, but not dash, probably not whatever your sh is. Can make a function of it trivially, but you have to handle the $# -eq 0 case, return whatever printf returns, etc.
When deleting, if it is more than a few specifically named files I will use a "find ... -delete" invocation.

I like it for two reasons. Find feels like it has more solidly defined patterns and recursion than shell globing and by leaving off the "-delete" it give me a chance to inspect the results before committing to my actions.

Without testing, I wonder if find follows symlinks. I’m pretty sure rm doesn’t.

Edit: Just checked and find doesn’t by default.

Very cool of you to post this. Too many people won't post stories like this, but I've done very similar multiple times. I think it definitely helps reinforce proper habits, and is the best way to cut your teeth on technology. It's also great for anyone new to read something like this, and be able to avoid something so devastating, and maybe make lesser mistakes, but still learn from both!
when you get to be as old as i am, these are the war stories you share with your kids and grandkids around the campfire ;-)

like the one from my colleague who once fat fingered fsck into mkfs and i lost my personal homepage because of it. what makes me uncomfortable about that story is that it was not my fault. if it were it would have been easier to tell. but at the time i was quite frustrated and my colleague felt that despite me trying to not get angry at him. i still feel really bad about my reaction then, adding to his predicament, since he had to live with the guilt about losing our website and everyone's personal home directory. it's bad when a mistake causes you to loose something personal, but so much worse when you loose someone elses stuff.

talking about mistakes is how we learn from them. the important part is not to get embarrassed about them. however that requires an environment where we are not blaming each other when something goes wrong.

i could have made that mistake myself. and i applied this lesson to my own learning as if i had.

blessed be the pessimist, for he hath made backups...

Absolutely, the worst mistakes are the biggest opportunities for learning. Destroying your own files is painful. Destroying EVERYONE'S files is a lesson that is more painful and something you will be more careful not to repeat.

I try to tell as many of these stories in person as possible, to let everyone know that unless you have dealt with this kind of accident, you're either in the 1% or due for your turn. I'd like to think that sharing these stories might at least give someone some pause before they go ahead and throw caution to the wind.

> if "rm -i" is the default the "rm" level gets disabled, because "rm -i -f" is the same as "rm -f"

You can use "\rm" to invoke the non-aliased version of the command. I made "rm -i" the default using an alias and occasionally use "\rm" to get the decreased safety level you described. I think it is more convenient that way.

I love zsh auto completion for this stuff. It automatically escapes really messed up paths like paths with new lines or emojis and crazy characters like that. Its really rare but I still intentionally practiced removing these things just so I can do it safely if it ever happens.
I've long fantasized about a tool I call "expect" that safeguards against crazy stuff like that.

It has a syntax of your expectations, functionally existing as a set of boundaries, and you can hook it to always run as a wrapper for some set of commands. It essentially stages the wrapped command and if none of the boundaries are violated it goes through. Otherwise it yells at you and you need to manually override it.

For instance, pretend I'm ok with mv being able to clobber except in some special directory, let's call it .bitcoin or whatever. (chattr can also solve this, it's just an example). The tool can be implemented relying on things like bpf or preload

Originally I wanted it as a SQL directive ... a way to safeguard a query against doing like `update table set field=value expect rows=1` where you meant to put in the where clause but instead blew away an entire column. I think this would be especially useful surfacing it in frameworks and ORMs some of which make these accidents a bit too easy.

When it comes to SQL, I will often write a SELECT with very explicit search (WHERE) criteria for this very reason. Then copying that statement, commenting the original, and pasting to change into an UPDATE or DELETE statement seems to be a technique that works well for me. The SELECT tells me exactly what I'm going to UPDATE or DELETE, and once I have that, changing the syntax is very minimal. In the case of an ORM, you might have to write a tool that only listens on LOCALHOST to run these statements first.
I always write the where first. It's kinda like thinking in RPN or postfix. I put the parts in out of order in a way that prioritizes the minimization of error.

But this is stupid. These are computers, we can make whatever we want. Executing a delete or update should, if one desires, not have to be database knifeplay.

I know what you mean, I do the same. I agree, but at the same time, it's difficult to start building in protections for the user. Where do you start and where do you stop? I have been forced to do the extreme to protect the user, and then you are asked why things are so difficult to use. I think to make something for someone that concentrates in the technology, as well as a beginner, means you've got to give up so much power (or create a secondary syntax/interface for both audiences). It would be nice to be able to set modes, but then it's going to be database specific unless it has proven itself to be useful across engines. Like most standardization, then you play syntax games between vendors. It would be nice to at least be able to write an UPDATE or DELETE statement with a leading character or keyword to display affected rows.
For sql specifically, “limit 2” is my default way to write “expect 1”; if it affects two rows, I know that I have screwed up, whereas “limit 1” can be wrong without my noticing.
That's not a terrible solution although expect sounds like a simple safety mechanism for feeble-minded people like me who do simple queries.

I actually know postgres people. I should probably ask them

just to clarify, this has nothing to do with the "expect" that is the other major application of tcl other than tk?
None.

I was just reminded of a good idea I never implemented

Oh yeah, re: SQL expect - I always wish joins had a "cardinality assertion", like a regex *?+ (or ! for exactly one)
You could also create a file named "-i" in your home dir.
Why not learn to enjoy life knowing you get a second chance instead ?:

https://github.com/rushsteve1/trash-d

because then i become to rely on stuff being in the trash and i'd be less careful when deleting, which means i have to double check when i clean the trash. that's extra work. . and since the trash is one single folder for the whole desktop, that means the trash is full of stuff from all over the place, making a review extra hard. in most cases i know something needs to be gone, so i'd rather delete it on the spot.

besides that, the primary reason for deleting stuff is to gain space. moving things to trash doesn't help with that.

what i sometimes do though, when mass deleting, is to move stuff to be deleted into a new folder (usually called "del" or "delete"). then verify the contents of the folder before removing it.

what would be more useful is a kind of trash implementation that does not take space, in that it keeps files around but reports them as unused space that can be overwritten when space is needed. kind of like undelete is possible on some filesystems. so that gone is gone because i can't control when deleted space gets used up, but in a panic situation i can revert the most recent deletes.

Isn't that just tilde-expansion happening at the wrong moment?
I once created a file named *. Sweats were sweated that day.
I've heard that one of the Unix founding fathers had a directory with 125 files that all had single-byte names: one for each ASCII symbol except slash, dot and null. He would then test any new utility against this directory and chew the careless programmer out if it couldn't correctly handle every one of these names.