by e12e 4421 days ago
Right. It's not entirely straightforward to link up the mime database (via e.g. file) with generating filters for use by find. Basing filters on filenames isn't a very good idea -- and actually a little regressive in my opinion -- after all, project/bin/foo (executable) might be a python or perl or whatever script -- not just a binary file.

But first getting all files via find, then testing with file, and finally matching against mime-type doesn't sound like something that's going to be as fast as possible...
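
For reference, a rough Python sketch of that three-step pipeline (walk the tree, ask file for the mime-type, filter on it). The names iter_files, mime_types, find_by_mime and the batch size of 500 are made up for illustration, and it assumes the file command is on PATH. Passing paths to file in batches avoids one process per file, but nothing here has been benchmarked.

```python
import os
import subprocess


def iter_files(root):
    """Step 1: yield every regular path under root, like a bare find."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def mime_types(paths, batch_size=500):
    """Step 2: ask file(1) for the mime-type of each path, in batches."""
    paths = list(paths)
    for start in range(0, len(paths), batch_size):
        batch = paths[start:start + batch_size]
        # --brief prints only the type, one line per path given.
        result = subprocess.run(
            ["file", "--brief", "--mime-type", "--"] + batch,
            capture_output=True, text=True, check=False,
        )
        yield from zip(batch, result.stdout.splitlines())


def find_by_mime(root, prefix):
    """Step 3: keep the paths whose mime-type starts with prefix."""
    return [path for path, mime in mime_types(iter_files(root))
            if mime.startswith(prefix)]
```

Something like find_by_mime("project", "text/") would then pick up project/bin/foo if it is a script, whatever its name or extension.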

I tried to see if maybe gvfs (gio - gnome io) could help, but couldn't really find anything directly applicable (although there is a set of gvfs command line tools, like gvfs-ls, gvfs-info, gvfs-mime).

2 comments

> after all project/bin/foo (executable) might be a python or perl or whatever script -- not just a binary file.

One of the big features of ack that the find/grep combo can't replicate is checking the shebang of the file to detect its type. In ack's case, Perl and shell programs are detected both by extension:

  --type-add=perl:ext:pl,pm,pod,t,psgi
  --type-add=shell:ext:sh,bash,csh,tcsh,ksh,zsh,fish
And by checking the shebang:

  --type-add=perl:firstlinematch:/^#!.*\bperl/
  --type-add=shell:firstlinematch:/^#!.*\b(?:ba|t?c|k|z|fi)?sh\b/
Run `ack --dump` to see a list of all the definitions.
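
To make the two rules concrete, here is a small Python sketch that applies the same extension lists and first-line patterns as the definitions above. It is not how ack itself is implemented. The names TYPES and detect_type are hypothetical, and trying extensions before the first line is a simplification chosen for this example.

```python
import os
import re

# Extension lists and first-line patterns copied from the definitions above.
TYPES = {
    "perl": (
        {"pl", "pm", "pod", "t", "psgi"},
        re.compile(r"^#!.*\bperl"),
    ),
    "shell": (
        {"sh", "bash", "csh", "tcsh", "ksh", "zsh", "fish"},
        re.compile(r"^#!.*\b(?:ba|t?c|k|z|fi)?sh\b"),
    ),
}


def detect_type(path):
    """Return 'perl', 'shell', or None for the file at path."""
    # Rule 1: match on the filename extension.
    ext = os.path.splitext(path)[1].lstrip(".")
    for name, (extensions, _pattern) in TYPES.items():
        if ext in extensions:
            return name
    # Rule 2: match the shebang on the first line.
    try:
        with open(path, "r", errors="replace") as handle:
            first_line = handle.readline(256)
    except OSError:
        return None
    for name, (_extensions, pattern) in TYPES.items():
        if pattern.search(first_line):
            return name
    return None
```

With this, an extensionless project/bin/foo that starts with a perl shebang is still classified as perl.
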
I'd prefer checking the magic numbers in general (or resource forks) -- and listing based on mime-types -- rather than just shebang/extension. I'm sure there are frameworks ready for doing this -- both gnome and kde (among others) have been working on this for a while. You need it to be able to display (correct) file icons, for example. And once one goes down that route, it might be beneficial to leverage one of the frameworks for file-search (from locate db to something based on xapian or what-not) -- rather than find-style traversal.
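
As an illustration of what a magic-number check looks like, here is a minimal Python sketch. The MAGIC table and the name sniff_mime are hypothetical, and the table only covers a handful of well-known signatures. A real mime database such as the one behind file is far more complete.

```python
# A few well-known leading byte signatures and the mime-type they indicate.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"\x1f\x8b", "application/gzip"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_mime(path):
    """Guess a mime-type from the first bytes of the file, or None."""
    with open(path, "rb") as handle:
        head = handle.read(16)  # longer than any signature in MAGIC
    for signature, mime in MAGIC:
        if head.startswith(signature):
            return mime
    return None
```

The filename plays no part here, so a PNG saved as foo.txt still comes back as image/png.
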
You may be right about the extensions.

Thanks for suggesting gvfs. I'll investigate it and similar databases from other packages (I know at least KDE has its own).

I suppose this might be too late, but it might be worth having a look at tracker[1] and tracker-search[2]. Alternatives include recoll and Beagle (now defunct?).

[1] https://wiki.gnome.org/Projects/Tracker

[2] http://www.mankier.com/1/tracker-search