by mnaydin 3430 days ago
I'd use the find command with the -printf option (GNU find has this option but POSIX find doesn't define it) instead of ls. For instance:

find /path/to/dir -type f -printf "%s\n" | awk ' { s += $0 } END { print s " bytes" } '

The find command has much more powerful file-filtering capabilities than ls, and it copes better with unusual characters in filenames.
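To make the one-liner above concrete, here is a minimal sketch that runs it against a scratch directory. The directory layout and file names are made up for illustration, and it assumes GNU find, since -printf is not in POSIX.

```shell
# Scratch directory with hypothetical file names, including awkward ones.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'hello' > "$dir/plain.txt"                      # 5 bytes
printf 'abc'   > "$dir/name with spaces"               # 3 bytes
printf 'xy'    > "$dir/sub/$(printf 'two\nlines')"     # 2 bytes, newline in the name

# GNU find prints each regular file's size in bytes, one per line;
# awk adds them up. File names never reach awk, so odd names are harmless.
total=$(find "$dir" -type f -printf '%s\n' | awk '{ s += $0 } END { print s " bytes" }')
echo "$total"    # prints "10 bytes"

rm -rf "$dir"
```

Because only the sizes are printed, the embedded space and newline in the names never enter the pipeline at all.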

1 comments

Thanks. Yes, I'm aware (long-time Unix guy) that find is generally a better option than even a recursive ls (ls -R) for finding files under a directory and processing them in some way, often together with xargs to get around the argument-length limit. But mine was just a quick example, so I didn't use find. Actually, find is also better for this example, because with it you do not have to deal with the per-directory header lines like "dirname:" and "total n" (n blocks) that ls outputs. (The headers may not matter for my example, because I only process field 5, but they can matter for other kinds of processing of the output.)
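As a rough sketch of the header point, the following compares ls -lR with find driving ls -ld on a made-up scratch directory. It sticks to POSIX features (-exec ... {} +), and assumes plain file names and owner/group names without spaces, so that field 5 really is the size.

```shell
# Hypothetical scratch layout: one file at the top, one in a subdirectory.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'hello' > "$dir/a.txt"       # 5 bytes
printf 'abc'   > "$dir/sub/b.txt"   # 3 bytes

# ls -lR mixes "dirname:" header lines in with the file entries.
header_lines=$(ls -lR "$dir" | grep -c ':$')

# find passes only regular files to ls -ld, so every output line is a
# file entry and field 5 is that file's size in bytes.
total=$(find "$dir" -type f -exec ls -ld {} + | awk '{ s += $5 } END { print s " bytes" }')
echo "$total"    # prints "8 bytes"

rm -rf "$dir"
```

This still parses ls output, so it inherits the usual trouble with newlines in names; it only shows that the headers and "total n" lines go away.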

There is also the -print0 option to find to handle filenames with newlines in them.

Note that -print0 is an extension rather than part of POSIX find (as of POSIX.1-2008); GNU find supports it, as do the BSD finds.
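A small sketch of why -print0 matters, using a made-up file whose name contains a newline. It assumes a find with -print0 and an xargs with -0 (both available in GNU and BSD userlands), and a tr that accepts '\0'.

```shell
# Two files; one has a newline embedded in its name (hypothetical names).
dir=$(mktemp -d)
touch "$dir/normal" "$dir/$(printf 'split\nname')"

# Newline-delimited output: the odd name spans two lines, so two files
# look like three entries.
by_line=$(find "$dir" -type f -print | wc -l)

# NUL-delimited output: one terminator per file, whatever the name contains.
by_nul=$(find "$dir" -type f -print0 | tr -cd '\0' | wc -c)

echo "lines: $by_line, NUL-terminated names: $by_nul"

# xargs -0 splits on the same NUL bytes, so each name arrives intact.
find "$dir" -type f -print0 | xargs -0 rm -f
rmdir "$dir"
```

The final rmdir only succeeds if both files were really removed, which is a quick check that xargs -0 received the names unbroken.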

POSIX has -print, but interestingly, on some Unixes I have seen find print the filenames it finds by default even when -print is omitted.

That's the expected behaviour. Quoting from the spec:

If no expression is present, -print shall be used as the expression. Otherwise, if the given expression does not contain any of the primaries -exec, -ok, or -print, the given expression shall be effectively replaced by: ( given_expression ) -print

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/fi...
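The quoted rule, including the parentheses in "( given_expression ) -print", can be seen with a short experiment. This is only a sketch on a made-up scratch directory; the file names are hypothetical.

```shell
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.log"

# No action given: -print is supplied for you, so these two agree.
implicit=$(find "$dir" -type f | sort)
explicit=$(find "$dir" -type f -print | sort)

# No action given: treated as ( -name '*.txt' -o -name '*.log' ) -print,
# so both files are listed.
both=$(find "$dir" -name '*.txt' -o -name '*.log' | sort)

# Explicit -print: the implied AND binds tighter than -o, so -print
# attaches to the second -name only and a.txt is never printed.
only_log=$(find "$dir" -name '*.txt' -o -name '*.log' -print)

printf '%s\n' "$both" "---" "$only_log"
rm -rf "$dir"
```

The last case is the usual surprise: once you write any of -exec, -ok, or -print yourself, the automatic parenthesised -print no longer applies, and you have to add \( ... \) by hand.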

Yes, I wasn't implying the behavior is wrong. Was just mentioning it. Anyway, thanks for that link, which explains why. That Open Group info on POSIX utilities is a great resource for when you want to know the comprehensive, well-specified behavior of the commands.