| HN Mirror



	by dmlerner 904 days ago
	Why not ripgrep?

10 comments

GuB-42 904 days ago

Why not ugrep?

They are more or less equivalent. One has obscure feature X, the other has obscure feature Y; one is a bit faster on A, the other is a bit faster on B; the defaults are a bit different; and one is written in Rust, the other in C++.

Pick the one you like, or both. I have both on my machine, and tend to use the one that does what I want with the fewest options. I also use GNU grep when I don't need the speed or features of either ug or rg.

tredre3 904 days ago

One thing I never liked about ripgrep is that it doesn't have a pager. Yes, it can be configured to use the system-wide ones, but it's an extra step (and every time I have to google how to preserve colors) and on Windows you're SOL unless you install gnu utils or something. The author always refused to fix that.

Ugrep not only has a pager built in, but it also allows searching the results which is super nice! And that feature works on all supported platforms!

bornfreddy 904 days ago

Interesting - for me a built-in pager is an antifeature. I don't want to figure out how to leave the utility. Worst of all, a pager usually means that sometimes you get more pages and you need to press q to exit, and sometimes not. Annoying. I often type the next command right away and the pager means I get stuck, or worse, the pager starts doing something in response to my keys (looking at you, `git log`).

Then again I'm on Linux and can always pipe to less if I need to. I'm also not the target audience for ugrep because I've never noticed that grep would be slow. :shrug:

amethyst 904 days ago

You might appreciate setting `PAGER=cat` in your environment. ;)

Git obeys that value, and I would hope that most other UNIXy terminal apps do too.
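
A minimal sketch of why this works, assuming a tool chooses its pager the way many Unix programs do: use `$PAGER` if it is set, otherwise fall back to `less`. The `paged` helper is hypothetical and only illustrates the convention; it is not how Git or any particular tool is implemented.

```shell
# Hypothetical helper: run a command and send its output through $PAGER,
# falling back to `less` when PAGER is unset or empty.
paged() {
    # Left unquoted on purpose so a value like "less -R" splits into words.
    "$@" | ${PAGER:-less}
}

# With PAGER=cat the output passes straight through; nothing waits for 'q'.
PAGER=cat paged printf 'one\ntwo\n'
```

Putting `export PAGER=cat` in a shell startup file applies the same effect to every program that honors the variable.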

bornfreddy 904 days ago

Oh, wow, thank you! I must try this.

VTimofeenko 904 days ago

Some terminal emulators (kitty for sure) support "open last command output in pager". Works great with a pager that can understand ANSI colors - less fussing around with variables and flags to preserve colors in the pager.

burntsushi 904 days ago

This is what I do personally:

    $ cat ~/bin/rgp
    #!/bin/sh
    exec rg -p "$@" | less -RFX

Should work just fine. For Windows, you can install `bat` to use a pager if you don't otherwise have one. You don't need GNU utils to have a pager.

anjanb 904 days ago

hi @burntsushi,

   fan of your tool. like its speed and defaults.

I use windows : didn't understand what you mean by "install `bat`" to use a pager.

I use cygwin and WSL for my unix needs. I have more and less in cygwin for use in windows.

burntsushi 904 days ago

I referenced bat because I've found that suggesting cygwin sometimes provokes a negative reaction. The GP also mentioned needing to install GNU tooling as if it were a negative.

bat is a fancy pager written in Rust. It's on GitHub: https://github.com/sharkdp/bat

anjanb 904 days ago

I'm sure you know but windows command prompt always came with its inbuilt pager -- more. So, you could always do "dir | more" or "rg -p "%*" | more ". (more is good with colors without flags)

burntsushi 904 days ago

I didn't! I'm not a Windows user. Colors are half the battle, so that's good. Will it only appear if paging is actually needed? That's what the flags to `less` do in my wrapper script above. They are rather critical for this use case.
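
For reference, here is the wrapper script from earlier in the thread rewritten as a shell function, with a comment per flag. This is only an annotated sketch; the flag descriptions follow the `rg` and `less` man pages.

```shell
# Same pipeline as the ~/bin/rgp script, written as a function.
rgp() {
    # rg -p (--pretty): force colors, headings and line numbers even when piped.
    # less -R: render ANSI color escapes as colors, not as visible ^[ codes.
    # less -F: exit right away if all the output fits on one screen.
    # less -X: skip terminal init/deinit so the output stays visible after exit.
    rg -p "$@" | less -RFX
}
```

The `-F` and `-X` pair is what makes the pager appear only when the results are longer than one screen.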

ilyagr 904 days ago

I don't believe bat is a pager; it's more of a pretty-printer that tends to call less.

Two pagers that should work on Windows are https://github.com/walles/moar (golang) and https://github.com/markbt/streampager (Rust). There might also be a newer one that uses Rust, I'm unsure.

ttyprintk 903 days ago

I'd recommend ov for Windows users.

https://github.com/noborus/ov

bat on Windows does page, but I believe it's only available on Choco and not winget.

MrDrMcCoy 904 days ago

For me, it's a lot easier to compile a static binary of a C++ app than a Rust one. Never got that to work. Also nice to have compatibility with all of grep's arguments.

datadeft 904 days ago

> to compile a static binary

Cargo is one of the main reasons to use Rust over C++. I am pretty sure there is more involved with C++ than this:

    rustup target add x86_64-unknown-linux-musl
    cargo build --target=x86_64-unknown-linux-musl

devraza 904 days ago

From the ugrep README:

For an up-to-date performance comparison of the latest ugrep, please see the ugrep performance benchmarks [at https://github.com/Genivia/ugrep-benchmarks]. Ugrep is faster than GNU grep, Silver Searcher, ack, sift. Ugrep's speed beats ripgrep in most benchmarks.

codetrotter 904 days ago

Do these performance comparisons take into account the things BurntSushi (ripgrep author) pointed out in the ripgrep issue linked elsewhere ITT? https://github.com/BurntSushi/ripgrep/discussions/2597

Either way, ripgrep is awesome and I’m staying with it.

devraza 904 days ago

Agreed - ripgrep is great, and I'm not planning to switch either. The performance improvement is tiny, anyways.

Conscat 904 days ago

The best practical reason to choose this is its interactive features, like regexp building.

philkrylov 903 days ago

Although it is faster in some cases, ripgrep lacks archive search support (no, transparent decompression that ignores the archive structure is not enough), which works great in ugrep.

0cf8612b2e1e 904 days ago

I assume the grep compatible bit is attractive to some people. Not me, but they exist.

derriz 904 days ago

I find myself returning to grep from my default of rg because I'm just too lazy to learn a new regex language. Stuff like word boundaries "\<word\>" or multiple patterns "\(one\|two\)".

masklinn 904 days ago

That seems like the weirdest take ever: ripgrep uses pretty standard PCRE patterns, which are a lot more common than posix’s bre monstrosity.

To me the regex language is very much a reason to not use grep.

derriz 904 days ago

A bit hyperbolic, no?

If you consider it "the weirdest ever", I'm guessing that I'm probably older than you. I've certainly been using regex long before PCRE became common.

As a vim user I compose 10s if not 100s of regexes a day. It does not use PCRE. Nor does sed, a tool I've been using for decades. Do you also recommend not using these?

comex 904 days ago

I use all of those tools but the inconsistency drives me crazy as it's hard to remember which syntax to use where. Here's how to match the end of a word:

ripgrep, Python, JavaScript, and practically every other non-C language: \b

vim: \>

BSD sed: [[:>:]]

GNU sed, GNU grep: \> or \b

BSD grep: \>, \b, or [[:>:]]

less: depends on the OS it's running on
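
A quick way to check the GNU row of that list, assuming GNU grep is the `grep` on the path (BSD grep and other tools may need one of the other spellings):

```shell
# Assumes GNU grep. Three sample lines, one word per line.
words='cat
cats
concat'

# \> anchors at the end of a word: "cat" and "concat" end there, "cats" does not.
printf '%s\n' "$words" | grep 'cat\>'

# \b matches at either edge of a word, so here it selects the same two lines.
printf '%s\n' "$words" | grep 'cat\b'
```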

burntsushi 904 days ago

Did you know that not all of those use the same definition of what a "word" character is? Regex engines differ on the inclusion of things like \p{Join_Control}, \p{Mark} and \p{Connector_Punctuation}. Although in the case of \p{Connector_Punctuation}, regex engines will usually at least include underscore. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...

And then there's \p{Letter}. It can be spelled in a lot of ways: \pL, \p{L}, \p{Letter}, \p{gc=Letter}, \p{gc:Letter}, \p{LeTtEr}. All equivalent. Very few regex engines support all of them. Several support \p{L} but not \pL. See: https://github.com/BurntSushi/rebar/blob/f9a4f5c9efda069e798...

pbhjpbhj 904 days ago

`pcregrep`, or `grep -P`, uses PCRE though, AFAIUI.

burntsushi 904 days ago

ripgrep's regex syntax is pretty similar to grep -E. So if you know grep -E, most of that will transfer over.

Also, \< and \> are in ripgrep 14. Although you usually just want to use the -w/--word-regexp flag.
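
To make the `grep -E` comparison concrete, here is a small sketch using GNU grep only (so it runs without ripgrep installed). It rewrites the two BRE patterns mentioned upthread into the ERE style; the sample input is made up for illustration.

```shell
# Assumes GNU grep. Made-up sample input.
input='one fish
two fish
red fish
someone else'

# BRE with GNU's \| extension: grouping and alternation need backslashes.
printf '%s\n' "$input" | grep '\(one\|two\) fish'

# The same search in ERE (grep -E): plain ( | ) without backslashes.
printf '%s\n' "$input" | grep -E '(one|two) fish'

# -w stands in for a hand-written \<one\>: "someone" is not the word "one".
printf '%s\n' "$input" | grep -w 'one'
```

The first two commands print the same two lines; the third prints only `one fish`.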

xoranth 904 days ago

> Also, \< and \> are in ripgrep 14

Isn't that inconsistent with the way Perl's regex syntax was designed? In Perl's syntax an escaped non-alphanumeric character is always a literal [^1], and that is guaranteed not to change.

That's nice for beginners because it saves you from having to memorize all the metacharacters. If you are in doubt about whether something has a special meaning, you just escape it.

[^1]: https://perldoc.perl.org/perlrebackslash#The-backslash

burntsushi 904 days ago

Yes, it's inconsistent with Perl. But there are many things in ripgrep's default regex engine that are inconsistent with Perl, including the fact that all patterns are guaranteed to finish a search in linear time with respect to the haystack. (So no look-around or back-references are supported.) It is a non-goal of ripgrep to be consistent with Perl. Thankfully, if you want that, then you can get pretty close by passing the -P/--pcre2 flag.

With that said, I do like Perl's philosophy here. And it was my philosophy too up until recently. I decided to make an exception for \< and \> given their prevalence.

It was also only relatively recently that I made it possible for superfluous escapes to exist. Prior to ripgrep 14, unrecognized escapes were forbidden:

    $ echo '@' | rg-13.0.0 '\@'
    regex parse error:
        \@
        ^^
    error: unrecognized escape sequence
    $ echo '@' | rg '\@'
    @

I had done it this way to make it possible to add new escape sequences in a semver compatible release. But in reality, if I were to ever add new escape sequences, they would use one of the ASCII alpha-numeric characters, as Perl does. So I decided it was okay to forever and always give up the ability to make, e.g., `\@` mean something other than just matching a literal `@`.

`\<` and `\>` are forever and always the lone exceptions to this. It is perhaps a trap for beginners, but there are many traps in regexes, and this seemed worth it.

Note that `\b{start}` and `\b{end}` also exist and are aliases for `\<` and `\>`. The more niche `\b{start-half}` and `\b{end-half}` also exist, and those are what are used to implement the -w/--word-regexp flag. (Their semantics match GNU grep's -w/--word-regexp.) For example, `\b-2\b` will not match in `foo -2 bar` since `-` is not a word character and `\b` demands `\w` on one side and `\W` on the other. However, `rg -w -e -2` will match `-2` in `foo -2 bar`:

    $ echo 'foo -2 bar' | rg -w -e '\b-2\b'
    $ echo 'foo -2 bar' | rg -w -e -2
    foo -2 bar
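
Since the comment says these semantics match GNU grep's `-w`, the same pair of searches can be tried with GNU grep alone. A small sketch, assuming GNU grep is installed:

```shell
# Assumes GNU grep.
# \b needs a word character on exactly one side. The space and the '-' are
# both non-word characters, so there is no boundary in front of "-2".
echo 'foo -2 bar' | grep -e '\b-2\b' || echo 'no match'

# -w only requires that the match is not directly next to word characters.
echo 'foo -2 bar' | grep -w -e -2
```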

xoranth 904 days ago

Ok, makes sense. And thanks for the detailed explanation about word boundaries and the hint about the --pcre2 flag (I hadn't realized it existed).

jedisct1 904 days ago

Fuzzy matching is the main reason I switched to ugrep. This is insanely useful.

meindnoch 904 days ago

Because this is faster?

bsdpufferfish 904 days ago

ripgrep stole the name but doesn’t follow the posix standard.