by dunham 243 days ago
Do PDFs count?

    % mdfind -onlyin ~ kind:pdf |wc -l
       11116
(2k of those are in my directory of GitHub checkouts, and there are duplicates in there.)

They totally do! And epub, mobi, djvu...

    prhodes@troubadour:~/Downloads/pdfs$ find . -name "*.pdf" | wc -l
    18952
    prhodes@troubadour:~/Downloads/pdfs$ find . -name "*.epub" | wc -l
    2385
    prhodes@troubadour:~/Downloads/pdfs$ find . -name "*.djvu" | wc -l
    1384
    prhodes@troubadour:~/Downloads/pdfs$ find . -name "*.mobi" | wc -l
    125

(There are definitely duplicates in those, FWIW)
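Since duplicates inflate the counts above, here is a rough sketch of counting distinct files by content rather than by name. The function name `count_unique` is made up for illustration, and it assumes that `cksum`'s CRC plus byte count is a good enough fingerprint; collisions are possible, so treat the result as an estimate.

```shell
# Count files with distinct content under a directory, for one extension.
# Assumption: two files with the same CRC and the same size are duplicates.
# count_unique is a hypothetical helper name, not a standard tool.
count_unique() {
  dir="$1"
  ext="$2"
  # cksum prints "<crc> <size> <path>"; keep only the first two fields,
  # collapse identical pairs, then count what is left.
  find "$dir" -type f -iname "*.${ext}" -exec cksum {} + \
    | awk '{ print $1, $2 }' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}
```

Something like `count_unique ~/Downloads/pdfs pdf` would then give the number of distinct PDFs instead of the raw file count.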
Why are you downloading mobi files? Seriously, I only do that if there is no epub, and only keep it long enough to convert it.
Pretty much that: if the only copy I can find is .mobi, or occasionally just by mistake.
Is mdfind a Windows executable? Is there a standalone version that I might be able to use on the rare occasions I need to fight with somebody's Windows box?
mdfind is built into macOS. It's similar to find, which should already be on your *nix system if you have one, and which can be installed on Windows with Cygwin. On Windows itself, you'd use PowerShell and Get-ChildItem (I don't think it's case sensitive, but I don't use PS much).
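On the case-sensitivity point: with find, `-name` matches case-sensitively, while `-iname` (available in both GNU and BSD find) ignores case. A minimal sketch, where `count_ext` is just an illustrative helper name:

```shell
# Count files with a given extension, ignoring case (.pdf, .PDF, .Pdf ...).
# count_ext is a hypothetical helper name used only for this example.
count_ext() {
  dir="$1"
  ext="$2"
  # tr strips the leading padding that BSD wc adds to its output.
  find "$dir" -type f -iname "*.${ext}" | wc -l | tr -d ' '
}
```

For example, `count_ext ~ pdf` would count every PDF under the home directory regardless of how the extension is capitalized. File names containing newlines would be overcounted by `wc -l`.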
Thank you
mdfind is the command-line interface to macOS Spotlight, which is the global file index. So it can do things like full-text search, in addition to matching metadata values or sizes bigger than X.

I don't know Windows well enough to know the equivalent. But I think there is an index on Windows, and PowerShell may be able to poke at it.

Thank you
Why do you make PDFs of github checkouts?
I meant that I have repositories checked out of GitHub that contain PDF files. Most repositories that I check out of GitHub are in ~/code, which I back up in case they disappear upstream. But it does look like a lot of those files actually are academic papers.