	by alex_dev 3482 days ago
	Looks like it uses regexp... There isn't any benchmark code as one would expect when making a claim that it's "fast".

2 comments

laumars 3482 days ago

Regexp definitely isn't something you'd want to be using if your primary goal is speed. When running tight loops in string parsing, I've found that splitting the string and then cycling through the range of indices in the resulting slice was several times faster than Regexp matching. Obviously the performance difference will vary depending on the expression and application, but that was enough to convince me to think twice about future usage of Regex: does the problem really need Regex, or am I just using it lazily? The latter is a practice I'd slipped into after years of Perl hacking.
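For illustration, the two approaches compared above could look roughly like this in Go. This is only a sketch: the function names and the `[^/]+` pattern are invented for the example and are not taken from the project under discussion, and it makes no claim about how large the speed difference actually is.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// segmentRe is a hypothetical pattern that matches one path segment at a time.
var segmentRe = regexp.MustCompile(`[^/]+`)

// segmentsByRegexp extracts path segments using the regexp package.
func segmentsByRegexp(path string) []string {
	return segmentRe.FindAllString(path, -1)
}

// segmentsBySplit extracts the same segments by splitting on '/' and
// walking the indices of the resulting slice.
func segmentsBySplit(path string) []string {
	parts := strings.Split(path, "/")
	out := make([]string, 0, len(parts))
	for i := range parts {
		if parts[i] != "" { // skip empty pieces from leading, trailing or doubled slashes
			out = append(out, parts[i])
		}
	}
	return out
}

func main() {
	fmt.Println(segmentsByRegexp("/users/42/posts"))
	fmt.Println(segmentsBySplit("/users/42/posts"))
}
```

To measure the difference for a particular workload, both functions can be wrapped in `testing.B` benchmarks and run with `go test -bench`.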

jules 3482 days ago

That depends entirely on the regex implementation. If the implementation uses a DFA to match multiple regexes simultaneously then the performance will be as good as a trie because a DFA is more or less a trie.

vanderZwan 3482 days ago

> That depends entirely on the regex implementation

True, and anyone who knows that Russ Cox is a core member of the Go team will have a hard time suppressing a smirk when reading this :)

https://swtch.com/~rsc/regexp/

laumars 3482 days ago

True. I was talking specifically about the same Regexp package as the one used in the topic project though.

I assumed that would have been obvious given the context; however, I apologise for not stating that in my comment and shall amend it appropriately. [edit: I can't add an amendment to my previous post now]

mypalmike 3482 days ago

True. But nowadays most regex implementations are quite good (apparently Go's is not - I haven't used it).

That said, most regex performance problems are PEBKAC. Writing a fast regex is hard and requires a pretty thorough understanding of parser theory. And many who use regexes don't understand that it's critical to precompile them for performance. You don't get a fast parser when you rebuild the DFA each time you use it.

*edit: a word
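A minimal sketch of the precompilation point in Go, assuming a hypothetical `/users/<id>` route. The function names and the pattern are invented for the example. Whatever the engine builds internally, the first function builds it once and the second rebuilds it on every call.

```go
package main

import (
	"fmt"
	"regexp"
)

// userRoute is compiled once, at package initialisation, and reused on every call.
var userRoute = regexp.MustCompile(`^/users/([0-9]+)$`)

// userIDPrecompiled matches against the already-compiled pattern.
func userIDPrecompiled(path string) (string, bool) {
	m := userRoute.FindStringSubmatch(path)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// userIDRecompiled is the anti-pattern: it pays the compilation cost on every call.
func userIDRecompiled(path string) (string, bool) {
	re := regexp.MustCompile(`^/users/([0-9]+)$`)
	m := re.FindStringSubmatch(path)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	fmt.Println(userIDPrecompiled("/users/42"))
	fmt.Println(userIDRecompiled("/users/42"))
}
```

Both functions return the same result; only the amount of repeated work differs.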

aikah 3482 days ago

Go regexps are slow (https://goo.gl/r0K2xw); the problem is not regexps but Go's implementation of regexps. So let's not blame regexps when regexps aren't the problem. Because by that logic, people shouldn't use the sort package either ...

laumars 3482 days ago

You're making a distinction where one doesn't need to be made. It doesn't matter if regexp is generally slow or if it's Go's implementation specifically - if you're using Go and wanting something where performance is your primary goal then you're generally best to avoid using regexp.

aikah 3482 days ago

> You're making a distinction where one doesn't need to be made. It doesn't matter if regexp is generally slow or if it's Go's implementation specifically - if you're using Go and wanting something where performance is your primary goal then you're generally best to avoid using regexp.

Avoiding regexp doesn't fix Go's implementation of regexp. Making it faster does. Your argument is preposterous. If the Go team really cared about performance it would fix its regexp implementation.

SEJeff 3482 days ago

I think you're missing the point of the discussion entirely. When you're more or less doing string splitting, regex (regardless of performance) really is the wrong tool for the job. In this use case (URL routing), a tree-based data structure such as a trie or radix tree is better suited.
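As a rough sketch of what a tree-based approach to URL routing can look like in Go: a trie keyed by path segment, with `:name` segments capturing a parameter. All names here (`node`, `add`, `lookup`, the string handlers) are hypothetical and not taken from any particular router. It is deliberately simplified: literal segments win over parameters, there is no backtracking, and the first parameter name registered at a given position is the one that is kept.

```go
package main

import (
	"fmt"
	"strings"
)

// node is one path segment in the routing trie.
type node struct {
	children map[string]*node // literal child segments
	param    *node            // child that matches any single segment (":name")
	paramKey string           // parameter name captured when this node is reached
	handler  string           // hypothetical handler name; empty means no route ends here
}

func newNode() *node { return &node{children: map[string]*node{}} }

// segments splits a path on '/' and drops empty pieces.
func segments(path string) []string {
	return strings.FieldsFunc(path, func(r rune) bool { return r == '/' })
}

// add registers a pattern such as "/users/:id/posts".
func (n *node) add(pattern, handler string) {
	cur := n
	for _, seg := range segments(pattern) {
		if strings.HasPrefix(seg, ":") {
			if cur.param == nil {
				cur.param = newNode()
				cur.param.paramKey = seg[1:]
			}
			cur = cur.param
			continue
		}
		next, ok := cur.children[seg]
		if !ok {
			next = newNode()
			cur.children[seg] = next
		}
		cur = next
	}
	cur.handler = handler
}

// lookup walks the trie one segment at a time, preferring literal matches.
func (n *node) lookup(path string) (string, map[string]string, bool) {
	cur := n
	params := map[string]string{}
	for _, seg := range segments(path) {
		if next, ok := cur.children[seg]; ok {
			cur = next
			continue
		}
		if cur.param != nil {
			cur = cur.param
			params[cur.paramKey] = seg
			continue
		}
		return "", nil, false
	}
	if cur.handler == "" {
		return "", nil, false
	}
	return cur.handler, params, true
}

func main() {
	root := newNode()
	root.add("/users/:id/posts", "listPosts")
	root.add("/users/new", "newUser")
	fmt.Println(root.lookup("/users/42/posts"))
	fmt.Println(root.lookup("/users/new"))
}
```

Lookup cost here depends on the number of segments in the path rather than on the number of registered routes, which is the usual argument for this structure.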

aikah 3482 days ago

> I think you're missing the point of the discussion entirely. When you're more or less doing string splitting, regex (regardless of performance) really is the wrong tool for the job. In this use case (URL routing), a tree-based data structure such as a trie or radix tree is better suited.

I'm not missing the point of the discussion. Regex is not the wrong tool for the job. You deemed it the wrong tool for the job. And deeming it the wrong tool for the job doesn't fix Go regex being slower than in other languages. The two issues are not separate. People like you talk about performance as a goal while dismissing obvious performance issues in the standard library as "wrong tool for the job".

You're not going to convince anybody with this kind of argument, aside from gophers who already think like you do. I'm not one of them.

SEJeff 3482 days ago

FWIW, if the router was in C or Rust or .NET (which has a JIT for its regex engine), I would still tell you regex is the wrong tool for splitting a URL on '/'. How many people have to tell you facts for you to believe them? Forget your pointless anti-Go bias. A regex, while perfectly good for certain types of pattern matching, makes no sense here.

I have often taught the same lesson to my junior colleagues who use regex in Python or C++ code where splitting the string would be simpler, more maintainable, and faster.

literallycancer 3482 days ago

A person looking to use an HTTP router most likely isn't going to rewrite/fix the regexp package just so they can use this router when there are already other routers that are faster as is. Do you dispute that?

duaneb 3482 days ago

If you use a broken hammer to attempt to insert a screw, you're crazy for using the hammer, not because it's broken.

cyphar 3481 days ago

The comment linked is about CSV parsing being slow (the GitHub issue linked in that comment shows that `regexp` doesn't show up in the benchmarks at all). So I don't understand why you linked a 6-year-old thread that's been revived by unrelated topics?

creshal 3482 days ago

Regexes are the problem, because they're simply the wrong tool for the job.

aikah 3482 days ago

> Regexes are the problem, because they're simply the wrong tool for the job.

For what job? Extracting route variables from paths? They are the right tool for the job; only in the Go community are they deemed the "wrong tool for the job". Your statement embodies everything that is wrong with the Go community. Instead of finding a solution to a problem, you guys spend your time shifting the blame onto "bad practices".

rudolf0 3482 days ago

Wouldn't it make more sense to use something faster and simpler for most routing, and then an optional argument for regular expressions? Lots of web frameworks use that approach.
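A minimal sketch of that hybrid approach in Go, assuming a hypothetical `router` type with invented method names: exact paths go through a map lookup, and regular expressions are an opt-in fallback that is only consulted when no exact route matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// regexRoute pairs a compiled pattern with a hypothetical handler name.
type regexRoute struct {
	re      *regexp.Regexp
	handler string
}

// router tries cheap exact matches first and regex routes only as a fallback.
type router struct {
	static map[string]string
	regex  []regexRoute
}

func newRouter() *router { return &router{static: map[string]string{}} }

// handle registers an exact path.
func (r *router) handle(path, handler string) { r.static[path] = handler }

// handleRegexp registers an optional regex route; the pattern is compiled once here.
func (r *router) handleRegexp(pattern, handler string) {
	r.regex = append(r.regex, regexRoute{re: regexp.MustCompile(pattern), handler: handler})
}

// match returns the handler for path, preferring exact routes over regex routes.
func (r *router) match(path string) (string, bool) {
	if h, ok := r.static[path]; ok {
		return h, true
	}
	for _, rr := range r.regex {
		if rr.re.MatchString(path) {
			return rr.handler, true
		}
	}
	return "", false
}

func main() {
	r := newRouter()
	r.handle("/about", "about")
	r.handleRegexp(`^/files/.+\.png$`, "image")
	fmt.Println(r.match("/about"))
	fmt.Println(r.match("/files/logo.png"))
}
```

Routes that never need a regex never touch the regexp package, so its cost is only paid by the routes that opt in.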