Ask HN: Grep-like tools for full-text search?
9 points by joiguru 1596 days ago
Hello,

Like many in this community, I maintain my notes in text files. I am looking for command line tools that I can use to index and search them.

Are there existing tools that do that? My Google skills are not producing good results.

Thanks

6 comments

I use Recoll:

https://www.lesbonscomptes.com/recoll/

There is a GUI, but it has a perfectly serviceable command-line interface as well. For example, this command lists all files with the exact phrase 'moved permanently':

    recoll -t -l '"moved permanently"'
And this command finds all files ending in '.py' and under a directory called 'Dropbox' that contain the exact words "import" and "logging":

    recoll -t -l "ext:py dir:Dropbox/ Import Logging"
It can take some time and effort to configure, but I have found it to be well worth it.

You may get some mileage out of Swish-e[1].

And while it's not its primary purpose, Lucene comes with some CLI programs[2]. They are there mainly as a demo, but if you feel like writing some code you might be able to adapt them to your needs.

[1]: https://www.esa.org/tiee/search/html/readme.html

[2]: https://lucene.apache.org/core/3_5_0/demo.html

Why not grep? Are you saying you want full text search with like, stemming and result ranking?

If I can piggyback on your question, my notes are in Word files and I sure wish I could grep those. I bet none of the answers at https://superuser.com/questions/70343/grep-in-microsoft-word work any more because Microsoft will keep changing the file format.
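
For what it's worth, a modern .docx file is a ZIP archive with the body text stored in `word/document.xml`, so a rough grep can be sketched with the Python standard library alone. This is only a sketch under assumptions: it handles .docx (not the legacy binary .doc format), it only looks at the main document body, and the helper names `docx_paragraphs` and `grep_docx` are made up for illustration.

```python
import re
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used by elements inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Yield the plain text of each non-empty paragraph in a .docx file.

    Hypothetical helper: 'path' may be a filename or a file-like object.
    """
    with zipfile.ZipFile(path) as archive:
        root = ElementTree.fromstring(archive.read("word/document.xml"))
    for para in root.iter(W_NS + "p"):
        # Text runs live in <w:t> elements; join them per paragraph.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:
            yield text


def grep_docx(path, pattern, ignore_case=True):
    """Return the paragraphs that match a regular expression (like grep -i)."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    return [p for p in docx_paragraphs(path) if regex.search(p)]
```

Something like `grep_docx("notes.docx", "moved permanently")` would then return the matching paragraphs. Headers, footers, footnotes and comments live in other parts of the archive and are not searched here.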

Because grep is not a "full text search tool".

Full text search [1] is different from regular expression search.

[1] https://en.wikipedia.org/wiki/Full-text_search
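
To make the difference concrete: grep scans every file on every query, while a full-text engine builds an index once and then answers queries from it, usually with some ranking. Below is a toy sketch of that idea, not how Recoll, Swish-e or Lucene actually work. The `TinyIndex` class, its AND-only query semantics and the TF-IDF-style score are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Toy inverted index mapping term -> {document name: term count}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, name, text):
        """Index one document under the given name."""
        self.doc_count += 1
        for term, count in Counter(tokenize(text)).items():
            self.postings[term][name] = count

    def search(self, query):
        """Return names of documents containing every query term, best first."""
        terms = tokenize(query)
        if not terms or any(t not in self.postings for t in terms):
            return []
        # AND semantics: intersect the posting lists of all query terms.
        names = set.intersection(*(set(self.postings[t]) for t in terms))
        scores = {}
        for name in names:
            # Term frequency weighted by a simple inverse document frequency.
            scores[name] = sum(
                self.postings[t][name]
                * math.log(1 + self.doc_count / len(self.postings[t]))
                for t in terms
            )
        # Highest score first; ties broken alphabetically for stable output.
        return sorted(names, key=lambda n: (-scores[n], n))


index = TinyIndex()
index.add("a.txt", "Redirect: moved permanently, then moved again")
index.add("b.txt", "The file was moved")
index.add("c.txt", "Nothing relevant here")
results = index.search("moved")  # a.txt ranks above b.txt (two hits vs one)
```

A real engine adds on-disk storage, phrase queries, stemming and better scoring, but the lookup-then-rank shape is the same.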

I use this for searching code across multi-repo setups; it should work just as well for multiple text documents:

AG: The Silver Searcher https://github.com/ggreer/the_silver_searcher

You can search inside files using grep:

    grep -Ril "term" .

Ripgrep is good.

ripgrep is great, but it's a regexp search, not full text. Full-text search includes stemming.
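
A rough illustration of the stemming point: a literal or regexp match for "searched" misses a note that says "searching", while a stem-based comparison finds it. The suffix stripper below is deliberately crude and is an assumption for demonstration only; real engines use proper stemmers (Porter, Snowball and the like), and the function names are hypothetical.

```python
import re

# Deliberately crude suffix list; NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text):
    """Return the set of crude stems of all words in the text."""
    return {crude_stem(w) for w in re.findall(r"[A-Za-z]+", text)}


def regex_match(query, text):
    """Case-insensitive whole-word match, roughly what grep -w -i does."""
    pattern = r"\b" + re.escape(query) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def stemmed_match(query, text):
    """Match if the query's stem equals the stem of any word in the text."""
    return crude_stem(query) in stems(text)


note = "I was searching my notes yesterday"
literal_hit = regex_match("searched", note)    # False: the exact word is absent
stemmed_hit = stemmed_match("searched", note)  # True: both reduce to "search"
```

A stripper this simple gets plenty of words wrong, which is why dedicated stemmers exist.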