by gauravagarwalr 3908 days ago
I have found the `find` command and its API to be the hardest to learn. After 3+ years on a *NIX machine, spending close to 70% of my time on the command line when not browsing, I am still not comfortable with `find`.
7 comments

Personal experience: implement a command-line stack machine, like an RPN calculator, and it'll click. Each argument normally just pops the next arg to determine its state. I purposely made myself use find for about 6 months and I still struggle with flags sometimes, especially when chaining more complex expressions.
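A small sketch of that left-to-right evaluation, for anyone who wants to try it. The scratch directory and file names are made up for illustration; `-type`, `-name`, `-o`, and the escaped parentheses are standard find operators.

```shell
# Hypothetical scratch tree so the sketch is self-contained.
demo=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$demo/src" "$demo/build"
touch "$demo/src/main.c" "$demo/src/util.h" "$demo/build/main.o" "$demo/notes.txt"

# find evaluates the expression left to right for every path it visits:
#   -type f            the path must be a regular file (an implicit AND follows)
#   \( ... -o ... \)   a grouped OR of two name tests
# AND binds tighter than -o, so dropping the parentheses can change the result.
matches=$(cd "$demo" && find . -type f \( -name '*.c' -o -name '*.h' \) | sort)
printf '%s\n' "$matches"

rm -rf "$demo"
```

Running find from inside the scratch directory keeps the printed paths relative, which makes the output easier to read.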

Find is nirvana with SSDs. I can throw find commands at the root, -exec'ing out to grep, on my 512GB SSD and they finish very quickly.

Lastly, Cygwin + find is still faster than Windows Explorer search in my experience, especially if you have an SSD. This is also a great way to start teaching yourself the Unix command line.

Not sure if I'm doing anything wrong, but I find cygwin/msys find very slow on my Windows SSD. I mean, I still generally use it just because I'm familiar with it, but if I'm looking for a file in a huge directory I look up whatever flags I need for dir (dir /s something something).
Its startup is incredibly slow and its operations are slower than native, but then you aren't doing native things.

The argument commonly used is productivity vs. speed of operation. More often than not, even slow file systems are faster than I can humanly perceive.

I combine "find ./" and grep instead of learning how to use find. (I'm a terrible person.)
For all I know you may indeed be a terrible person ;), but this isn't why. The composability of Unix commands is probably its greatest strength.
-name and -iname are the options to know.

But yeah, I think that's the default novice way of using find. I cannot agree with you and the article hard enough - I have an intense, irrational aversion to using 'find' that's absolutely incomparable to my feelings on any other Unix tool. I really can't think of any other utility which needs so many inane arguments to achieve a basic level of functionality. I even hate cracking open the manpages on it - I'll try to bring myself to do it, and then I decide "you know what, fuck it, I'll grep for it".

This is where I throw in a quick plug for bropages. Best tool ever, cannot recommend it strongly enough. It's the second thing I install on a new system, right after etckeeper.

I have found it helpful to learn small parts of the `find` API a bit at a time. For example, instead of `find ./ -print | grep "name_of_file"`, I use `find ./ -name "name_of_file"` or `find ./ -name '* name_of_file *'` (without the spaces -- I can't seem to surround text with asterisks without it triggering HN's formatting (yes I tried escaping them)) if you want to fuzzily search for a filename. You can replace `-name` with `-iname` if you want your search to be case-insensitive.
You mean that's not how you're supposed to use it?

It's such an utterly terrible tool and badly needs replacing.

    find . | grep 'abc' === find . -name 'abc'

    find . | grep -i 'abc' === find . -iname 'abc'

Here, saved you a pipe :)

(Oh, and with -exec and -delete, not needing that pipe is incredibly useful. Find is an incredibly powerful command.)
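To make the -exec and -delete point concrete, here is a minimal sketch. The scratch directory and file names are invented for the demo; `-exec ... {} +` is standard, while `-delete` is a GNU/BSD extension, so treat its availability as an assumption about your system.

```shell
# Hypothetical scratch files so the sketch is self-contained.
demo=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
printf 'TODO: fix this\n' > "$demo/a.txt"
printf 'all done\n'       > "$demo/b.txt"
printf 'scratch\n'        > "$demo/c.tmp"

# -exec CMD {} + appends as many matched paths as fit to a single CMD call,
# so grep runs once here instead of once per file.
todo=$(cd "$demo" && find . -type f -name '*.txt' -exec grep -l 'TODO' {} +)
printf '%s\n' "$todo"

# -delete removes each match directly, with no pipe to rm or xargs.
(cd "$demo" && find . -type f -name '*.tmp' -delete)
left=$(cd "$demo" && find . -type f | wc -l | tr -d ' ')
echo "files left: $left"

rm -rf "$demo"
```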

Sorry if this is pedantic, but actually:

    find . | grep 'abc' === find . -name '*abc*'

    find . | grep -i 'abc' === find . -iname '*abc*'
The -name and -iname options do a verbatim file name check if you don't include those wild-card asterisks. I've been inconvenienced by having to go back and add them often enough that this is burned into my brain. :-)
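A quick way to check this yourself, sketched with made-up file names in a scratch directory. It also shows one more difference: grep matches anywhere in the printed path, while -name only looks at the last path component.

```shell
# Hypothetical scratch tree so the sketch is self-contained.
demo=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
mkdir -p "$demo/abc_dir"
touch "$demo/abc" "$demo/xabcx.txt" "$demo/ABC.md" "$demo/abc_dir/other"

# Count the lines on stdin; tr strips the padding some wc versions add.
count() { wc -l | tr -d ' '; }

exact=$(cd "$demo" && find . -name 'abc' | count)       # 1: ./abc only
glob=$(cd "$demo" && find . -name '*abc*' | count)      # 3: abc, xabcx.txt, abc_dir
iglob=$(cd "$demo" && find . -iname '*abc*' | count)    # 4: the above plus ABC.md
grepped=$(cd "$demo" && find . | grep 'abc' | count)    # 4: also ./abc_dir/other

echo "$exact $glob $iglob $grepped"
rm -rf "$demo"
```

So even with the asterisks, the two forms are not strictly equivalent: `./abc_dir/other` shows up in the grep version because the directory name is part of the path.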
I just read the original post (the story about someone writing to Dennis Ritchie) which this comment thread is about. Noticed a mention of find and cpio. So wanted to say:

find, in general, is quite a powerful command [1], and one of the ways of using it, is piping its output to cpio or other commands, to act upon the found filenames (which can include relative or full paths, depending on how you call find). There is also the -exec option to find, which can do more or less what I said above (piping find's output to other commands), but without using a pipeline.

Those are some good reasons why you might want to persevere and learn to use find, despite its awkward syntax. And it's not really that difficult.

[1] One such example is a command I used to use a lot: some variation on piping the output of find to cpio, using the -p option of cpio (for pass, IIRC). E.g.:

$ find . -name some_wildcard -print | cpio -pdmuv dirname

which would copy an entire directory subtree from one place in the file system to another (or to a directory in a different file system). (Something like XCOPY ... /s /v /e in DOS.) And those cpio options could be used to do things like keeping the permissions of the target files the same as those of the source, or not, etc. Imagine how much time that could save (over having to manually change the permissions after the copy, particularly when there are many files). And that's just one example of the power of find (along with other Unix commands).
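A runnable sketch of that find-to-cpio pattern, under some assumptions: the directory layout and the `*.conf` pattern are invented for the demo, the -v flag is dropped to keep the output quiet, and the copy is skipped if cpio is not installed.

```shell
# Hypothetical source and destination trees so the sketch is self-contained.
src=$(mktemp -d "${TMPDIR:-/tmp}/find-src.XXXXXX")
dst=$(mktemp -d "${TMPDIR:-/tmp}/find-dst.XXXXXX")
mkdir -p "$src/etc/app"
touch "$src/etc/app/a.conf" "$src/etc/b.conf" "$src/etc/readme.txt"

if command -v cpio >/dev/null 2>&1; then
    # Run find from inside the source tree so the names are relative;
    # cpio -p then recreates those relative paths under "$dst".
    #   -p pass-through copy    -d create directories as needed
    #   -m keep mtimes          -u overwrite unconditionally
    (cd "$src" && find . -name '*.conf' -print | cpio -pdmu "$dst" 2>/dev/null)
    copied=$(cd "$dst" && find . -type f | sort)
    printf '%s\n' "$copied"
else
    echo "cpio not installed; skipping the copy" >&2
fi

rm -rf "$src" "$dst"
```

Only the matching files are copied, with their parent directories created on demand; `readme.txt` is left behind.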

xargs is worse than find, I'd say, on account of its useless default mode of operation.

If it accepted newline-delimited input, it would cater for every non-pathological case, rather than (as it does now) failing miserably on many reasonable file names. (Of course, you have xargs -0 - but not all tools output appropriate data. And it accepts shell-style quoting - but do we really want that contagion to spread?)

The bizarre thing is that except for oddities like `echo * | xargs`, every tool I've ever met that outputs lists of file names outputs them with newline as the separator. And any that currently don't, would be better off doing that, anyway, I'd argue - since most Unix tools are line-oriented.

Agreed. POSIX xargs is bafflingly inane. GNU xargs, at least, supports `xargs -d '\n'` to regain some line-oriented sanity. I prefer to use GNU parallel these days, though, since it's line-oriented by default and a bit more ergonomic.
xargs is very powerful when you want to run one command with many arguments, but each argument is newline-delimited. That's basically all I use it for, and it saves me a lot of looping and subshell creation.

For example, when purging backups from Mercurial after a revert:

find . -name '*.orig' | xargs rm

But that's exactly it - if any of the filenames have spaces, that will fail, since xargs will try to delete each part separately. From the man page:

Because Unix filenames can contain blanks and newlines, this default behaviour is often problematic; filenames containing blanks and/or newlines are incorrectly processed by xargs.
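The failure and the usual workaround can be reproduced in a scratch directory. This is a sketch with invented file names; `-print0` and `xargs -0` are available in GNU and BSD tools, but assume nothing about more minimal systems.

```shell
# Hypothetical scratch files, one with a space in its name.
demo=$(mktemp -d "${TMPDIR:-/tmp}/find-demo.XXXXXX")
touch "$demo/plain.orig" "$demo/with space.orig" "$demo/keep.txt"

# Default xargs splits on blanks, so "./with space.orig" reaches rm as two
# names that don't exist; rm complains and the real file survives.
# (xargs exits non-zero here, hence the "|| true".)
(cd "$demo" && find . -name '*.orig' | xargs rm 2>/dev/null) || true
after_default=$(cd "$demo" && find . -type f | wc -l | tr -d ' ')

# NUL-delimited names: -print0 and -0 agree on a separator that cannot
# appear inside a file name, so the remaining .orig file is removed.
(cd "$demo" && find . -name '*.orig' -print0 | xargs -0 rm)
after_nul=$(cd "$demo" && find . -type f | wc -l | tr -d ' ')

echo "$after_default $after_nul"
rm -rf "$demo"
```

The first pipeline leaves two files behind (the one with the space, plus `keep.txt`); the second leaves only `keep.txt`. Running everything from inside the scratch directory keeps the mis-split names from touching anything elsewhere.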

Excluding filenames containing newlines, I wonder how many weird corner cases this would reveal:

    ls | tr '\n' '\0' | xargs -0 rm
I've been annoyed by ls and xargs not playing together here and there, but the above only just occurred to me. Not sure if it's a good idea yet or not!
find . -name '*.orig' -delete
Blatant plug: My pain at all these unnecessary hieroglyphics & man page work is why I built Crab, so I could use SQL to find things and an EXEC command to run commands on them.

etia.co.uk

Really it shouldn't have to be this hard.

Interesting that that's the direction you went with this.

I've always wished that mysql supported a more unix-like interface, where a few characters of composable interfaces would do, instead of word after word of semi-natural-language syntax.

Wordiness isn't really a problem, because I'm a fast typer and most systems have tab completion and history recall.

For me the benefit is that for lots of problems, whether databases or the filesystem, thinking in terms of sets feels really natural. I can't imagine why you'd want to pipe grep results to grep -v rather than say "and not"

I guess we all prefer the language whose syntax we know the best.

I remember seeing Crab a while back on HN, and thinking I don't see why you would want this, but when you mentioned thinking in sets it was way more clear. It may be because I don't use SQL often that it didn't click. Perhaps that would be a good thing to mention on the website.
I've only been getting to grips with it over the past year, and even then I'm not doing anything advanced. I have, however, found that knowing only a subset of its features has dramatically improved my efficiency at the terminal. Awk and sed have been useful to learn too - I need to check out gawk. Apparently if trying to use awk and grep (emphasis on trying), I should just use gawk.
What do you find hard to learn about find? I don't think it's particularly hard to learn, as Unix commands go. A bit more difficult than the average command, maybe. Of course not everything about it may be well designed (as some other comments here say), but that holds true to some extent for any command or language.