by phiresky 2563 days ago
> It's not clear whether libripgrep would be a good fit for this project or not

I actually looked into using libripgrep for this, but decided against it because (a) I didn't want to handle arg parsing myself (ripgrep has sooo many arguments), and (b) the documentation was missing or hard to find.

The main reason it might still be a good idea is that ripgrep currently has no notion of a single file yielding multiple "files", so all line prefixes are "hardcoded" (e.g. in PDFs, Page X: hello is just a prefix added to every line). I also can't rely on ripgrep's binary detection right now, because from ripgrep's perspective it would have to happen for "parts of files".
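To make the "hardcoded" prefix concrete, here is a rough sketch of the idea in Python. It is not the actual adapter code; `prefix_pages` and its input (a list of per-page text strings) are made up for illustration. The point is that the searcher only ever sees one flat stream of lines, with the page number baked into the text itself.

```python
def prefix_pages(pages):
    """Flatten per-page text into one stream, tagging every line with its page.

    `pages` is a hypothetical input: one string of extracted text per page.
    """
    out = []
    for number, text in enumerate(pages, start=1):
        for line in text.splitlines():
            # The page number becomes part of the line content, so the
            # searcher cannot tell it apart from the document's own text.
            out.append(f"Page {number}: {line}")
    return "\n".join(out)


flat = prefix_pages(["hello\nworld", "second page"])
print(flat)
```

Because the prefix is part of each line, a search for "Page" would also match every line, which is one of the downsides of doing it this way.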

It would be great if ripgrep had a slightly more advanced preprocessing API: one that allows returning multiple "files" per input filename, maybe even with a "sourcemap" of line <-> page, etc.
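Roughly what I have in mind, as a Python sketch. None of this exists in ripgrep; `VirtualFile`, `expand`, and `search` are hypothetical names, and the input format (member name mapped to a list of page texts) is an assumption made only to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class VirtualFile:
    """One logical "file" produced from a single input path (hypothetical)."""
    name: str        # e.g. "docs.zip/a.pdf"
    lines: list      # searchable text, one entry per line, no prefixes
    sourcemap: list  # sourcemap[i] is the label for lines[i], e.g. "Page 2"


def expand(path, members):
    """Turn one input path into several virtual files with a line -> page map.

    `members` is an assumed input: member name -> list of per-page text.
    """
    files = []
    for member, pages in members.items():
        lines, sourcemap = [], []
        for number, text in enumerate(pages, start=1):
            for line in text.splitlines():
                lines.append(line)
                sourcemap.append(f"Page {number}")
        files.append(VirtualFile(f"{path}/{member}", lines, sourcemap))
    return files


def search(files, needle):
    """Stand-in for the searcher: match on raw lines, report via the sourcemap."""
    hits = []
    for f in files:
        for i, line in enumerate(f.lines):
            if needle in line:
                hits.append((f.name, f.sourcemap[i], line))
    return hits


hits = search(
    expand("docs.zip", {"a.pdf": ["hello\nworld"], "b.pdf": ["nothing", "hello again"]}),
    "hello",
)
```

With the location kept outside the text, the searcher matches only on real content and can still report "Page 2" for a hit, and things like binary detection could be done per virtual file instead of per input path.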