They are more or less equivalent. One has obscure feature X other has obscure feature Y, one is a bit faster on A, other is a bit faster on B, the defaults are a bit different, and one is written in Rust, the other in C++.
Pick the one you like, or both. I have both on my machine, and tend to use the one that does what I want with the least options. I also use GNU grep when I don't need the speed or features of either ug and rg.
One thing I never liked about ripgrep is that it doesn't have a pager. Yes, it can be configured to use the system-wide ones, but it's an extra step (and every time I have to google how to preserve colors) and on Windows you're SOL unless you install gnu utils or something. The author always refused to fix that.
Ugrep not only has a pager built in, but it also allows searching the results which is super nice! And that feature works on all supported platforms!
Interesting - for me a built-in pager is an antifeature. I don't want to figure out how to leave the utility. Worst of all, pager usually means that sometimes you get more pages and you need to press q to exit, and sometimes not. Annoying. I often type yhe next command right away and the pager means I get stuck, or worse, pager starts doing something in response to my keys (looking at you, `git log`).
Then again I'm on Linux and can always pipe to less if I need to. I'm also not the target audience for ugrep because I've never noticed that grep would be slow. :shrug:
Some terminal emulators (kitty for sure) support "open last command output in pager". Works great with a pager that can understand ANSI colors - less fussing around with variables and flags to preserve colors in the pager
I referenced bat because I've found that suggesting cygwin sometimes provokes a negative reaction. The GP also mentioned needing to install GNU tooling as if it were a negative.
I'm sure you know but windows command prompt always came with its inbuilt pager -- more. So, you could always do "dir | more" or "rg -p "%*" | more ". (more is good with colors without flags)
I didn't! I'm not a Windows user. Colors are half the battle, so that's good. Will it only appear if paging is actually needed? That's what the flags to `less` do in my wrapper script above. They are rather critical for this use case.
For me, it's a lot easier to compile a static binary of a C++ app than a Rust one. Never got that to work. Also nice to have compatibility with all of grep's arguments.
For an up-to-date performance comparison of the latest ugrep, please see the ugrep performance benchmarks [at https://github.com/Genivia/ugrep-benchmarks]. Ugrep is faster than GNU grep, Silver Searcher, ack, sift. Ugrep's speed beats ripgrep in most benchmarks.
Although being faster in some cases, ripgrep lacks archive search support (no, transparent decompression ignoring the archive structure is not enough) which works great in ugrep.
I find myself returning to grep from my default of rg because I'm just too lazy to learn a new regex language. Stuff like word boundaries "\<word\>" or multiple patterns "\(one\|two\)".
If you consider it "the weirdest ever", I'm guessing that I'm probably older than you. I've certainly been using regex long before PCRE became common.
As a vim user I compose 10s if not 100s of regexes a day. It does not use PCRE. Nor does sed, a tool I've been using for decades. Do you also recommend not using these?
I use all of those tools but the inconsistency drives me crazy as it's hard to remember which syntax to use where. Here's how to match the end of a word:
ripgrep, Python, JavaScript, and practically every other non-C language: \b
Did you know that not all of those use the same definition of what a "word" character is? Regex engines differ on the inclusion of things like \p{Join_Control}, \p{Mark} and \p{Connector_Puncuation}. Although in the case of \p{Connector_Punctuation}, regex engines will usually at least include underscore. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...
And then there's \p{Letter}. It can be spelled in a lot of ways: \pL, \p{L}, \p{Letter}, \p{gc=Letter}, \p{gc:Letter}, \p{LeTtEr}. All equivalent. Very few regex engines support all of them. Several support \p{L} but not \pL. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...
Isn't that inconsistent with the way Perl's regex syntax was designed? In Perl's syntax an escaped non-ASCII character is always a literal [^1], and that is guaranteed not to change.
That's nice for beginners because it saves you from having to memorize all the metacharacters. If you are in doubt you on whether something has a special meaning, you just escape it.
Yes, it's inconsistent with Perl. But there are many things in ripgrep's default regex engine that are inconsistent with Perl, including the fact that all patterns are guaranteed to finish a search in linear time with respect to the haystack. (So no look-around or back-references are supported.) It is a non-goal of ripgrep to be consistent with Perl. Thankfully, if you want that, then you can get pretty close by passing the -P/--pcre2 flag.
With that said, I do like Perl's philosophy here. And it was my philosophy too up until recently. I decided to make an exception for \< and \> given their prevalence.
It was also only relatively recently that I made it possible for superfluous escapes to exist. Prior to ripgrep 14, unrecognized escapes were forbidden:
I had done it this way to make it possible to add new escape sequences in a semver compatible release. But in reality, if I were to ever add new escape sequences, it use one of the ascii alpha-numeric characters, as Perl does. So I decided it was okay to forever and always give up the ability to make, e.g., `\@` mean something other than just matching a literal `@`.
`\<` and `\>` are forever and always the lone exceptions to this. It is perhaps a trap for beginners, but there are many traps in regexes, and this seemed worth it.
Note that `\b{start}` and `\b{end}` also exist and are aliases for `\<` and `\>`. The more niche `\b{start-half}` and `\b{end-half}` also exist, and those are what are used to implement the -w/--word-regexp flag. (Their semantics match GNU grep's -w/--word-regexp.) For example, `\b-2\b` will not match in `foo -2 bar` since `-` is not a word character and `\b` demands `\w` on one side and `\W` on the other. However, `rg -w -e -2` will match `-2` in `foo -2 bar`:
They are more or less equivalent. One has obscure feature X other has obscure feature Y, one is a bit faster on A, other is a bit faster on B, the defaults are a bit different, and one is written in Rust, the other in C++.
Pick the one you like, or both. I have both on my machine, and tend to use the one that does what I want with the least options. I also use GNU grep when I don't need the speed or features of either ug and rg.