by alex_dev 3482 days ago
Looks like it uses regexp... There isn't any benchmark code as one would expect when making a claim that it's "fast".
2 comments

Regexp definitely isn't something you'd want to be using if your primary goal is speed. When running tight loops in string parsing, I've found that splitting the string and then cycling through the range of indices in the resulting slice was several times faster than Regexp matching. Obviously the performance difference will vary depending on the expression and application, but that was enough to convince me to think twice about future usage of Regexp - as to whether the problem needed a regex or if I was just using one lazily. The latter being a practice I'd slipped into after years of Perl hacking.
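To make the comparison concrete, here is a minimal sketch of the two approaches for pulling segments out of a path. The function names and the sample path are made up for illustration, and nothing here is a benchmark; how large the gap is will depend on the pattern and the input.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// segmentRe is compiled once; it matches each run of non-slash characters.
var segmentRe = regexp.MustCompile(`[^/]+`)

// segmentsRegexp extracts path segments with the regexp package.
func segmentsRegexp(path string) []string {
	return segmentRe.FindAllString(path, -1)
}

// segmentsSplit extracts the same segments with strings.Split, skipping the
// empty strings produced by leading, trailing, or doubled slashes.
func segmentsSplit(path string) []string {
	var out []string
	for _, s := range strings.Split(path, "/") {
		if s != "" {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	path := "/users/42/posts" // hypothetical example path
	fmt.Println(segmentsRegexp(path))
	fmt.Println(segmentsSplit(path))
}
```

Both functions return the same segments for this input, so either can be dropped into a `testing.B` benchmark to measure the difference on your own data.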
That depends entirely on the regex implementation. If the implementation uses a DFA to match multiple regexes simultaneously then the performance will be as good as a trie because a DFA is more or less a trie.
> That depends entirely on the regex implementation

True, and anyone who knows that Russ Cox is a core member of the Go team will have a hard time suppressing a smirk when reading this :)

https://swtch.com/~rsc/regexp/

True. I was talking specifically about the same Regexp package as the one used in the topic project though.

I assumed that would have been obvious given the context; however, I apologise for not stating that in my comment and shall amend it appropriately. [edit: I can't add an amendment to my previous post now]

True. But nowadays most regex implementations are quite good (apparently Go's is not - I haven't used it).

That said, most regex performance problems are PEBKAC. Writing a fast regex is hard and requires a pretty thorough understanding of parser theory. And many who use regexes don't understand that it's critical to precompile them for performance. You don't get a fast parser when you rebuild the DFA each time you use it.
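A minimal sketch of the precompilation point in Go, assuming a made-up `/users/<id>` pattern. `regexp.MatchString` compiles its pattern on every call, while a package-level `regexp.MustCompile` does that work once. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// userPattern is a hypothetical route pattern used only for illustration.
const userPattern = `^/users/([0-9]+)$`

// userRe is compiled once at package initialisation and reused for every match.
var userRe = regexp.MustCompile(userPattern)

// matchRecompiled compiles the pattern on every call, which is the thing to
// avoid in a hot path.
func matchRecompiled(path string) bool {
	ok, err := regexp.MatchString(userPattern, path)
	return err == nil && ok
}

// matchPrecompiled reuses the already compiled pattern.
func matchPrecompiled(path string) bool {
	return userRe.MatchString(path)
}

func main() {
	for _, p := range []string{"/users/42", "/users/abc"} {
		fmt.Println(p, matchRecompiled(p), matchPrecompiled(p))
	}
}
```

The two functions give the same answers; the only difference is how often the pattern gets compiled.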

*edit: a word

Go regexps are slow (https://goo.gl/r0K2xw); the problem is not regexps but Go's implementation of regexps. So let's not blame regexps when regexps aren't the problem. Because by that logic, people shouldn't use the sort package either ...
You're making a distinction where one doesn't need to be made. It doesn't matter if regexp is generally slow or if it's Go's implementation specifically - if you're using Go and performance is your primary goal, then you're generally best off avoiding regexp.
> You're making a distinction where one doesn't need to be made. It doesn't matter if regexp is generally slow or if it's Go implementation specifically - if you're using Go and wanting something where performance is your primary goal then you're generally best to avoid using regexp.

Avoiding regexp doesn't fix Go's implementation of regexp. Making it faster does. Your argument is preposterous. If the Go team really cared about performance, it would fix its regexp implementation.

I think you're missing the point of the discussion entirely. When you're more or less doing string splitting, regex (regardless of performance) really is the wrong tool for the job. In this use case (URL routing), a tree-based data structure such as a trie or radix tree is better suited.
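For anyone who hasn't seen one, here is a rough sketch of a segment-based trie router. It is an illustration of the idea only, not the code of any particular router: the `:name` parameter syntax, the type and method names, and the use of handler names instead of real handlers are all assumptions. It also does no backtracking, so a static segment always wins over a parameter at the same position.

```go
package main

import (
	"fmt"
	"strings"
)

// node is one path segment in the trie.
type node struct {
	children  map[string]*node // static children keyed by segment
	param     *node            // child matching any single segment, e.g. ":id"
	paramName string           // name of the parameter this node captures
	handler   string           // hypothetical handler name; "" means no route ends here
}

func newNode() *node { return &node{children: map[string]*node{}} }

// split breaks a path into its non-empty segments.
func split(path string) []string {
	return strings.FieldsFunc(path, func(r rune) bool { return r == '/' })
}

// add registers a pattern such as "/users/:id".
func (n *node) add(pattern, handler string) {
	cur := n
	for _, seg := range split(pattern) {
		if strings.HasPrefix(seg, ":") {
			if cur.param == nil {
				cur.param = newNode()
				cur.param.paramName = seg[1:]
			}
			cur = cur.param
			continue
		}
		next, ok := cur.children[seg]
		if !ok {
			next = newNode()
			cur.children[seg] = next
		}
		cur = next
	}
	cur.handler = handler
}

// lookup walks the trie one segment at a time. It returns "" when no route
// matches; static segments are preferred over parameters.
func (n *node) lookup(path string) (string, map[string]string) {
	cur := n
	params := map[string]string{}
	for _, seg := range split(path) {
		if next, ok := cur.children[seg]; ok {
			cur = next
		} else if cur.param != nil {
			params[cur.param.paramName] = seg
			cur = cur.param
		} else {
			return "", nil
		}
	}
	return cur.handler, params
}

func main() {
	root := newNode()
	root.add("/users/:id", "showUser")
	root.add("/users/new", "newUser")
	fmt.Println(root.lookup("/users/42"))
	fmt.Println(root.lookup("/users/new"))
}
```

Lookup cost grows with the number of segments in the request path rather than with the number of registered routes, which is the property the trie suggestion is after.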
> I think you're missing the point of the discussion entirely. When you're more or less doing string splitting, using regex (regardless of performance) really is the wrong tool for the job. In this use case (url routing) a tree based data structure aka a trie or radix tree, are better suited.

I'm not missing the point of the discussion. Regex is not the wrong tool for the job. You deemed it the wrong tool for the job. And deeming it the wrong tool for the job doesn't fix Go regex being slower than in other languages. The 2 issues are not separate. People like you talk about performance as a goal while dismissing obvious performance issues in the standard library as "wrong tool for the job".

You're not going to convince anybody with this kind of argument, aside from gophers who already think like you do. I'm not one of them.

FWIW, if the router were in C or Rust or .NET (which has a JIT for its regex engine), I would still tell you regex is the wrong tool for splitting a URL on '/'. How many people have to tell you facts for you to believe them? Forget your pointless anti-Go bias. A regex, while perfectly good for certain types of pattern matching, makes no sense here.

I have often taught the same lesson to junior colleagues who use regex in Python or C++ code where splitting the string would be simpler, more maintainable, and faster.

A person looking to use an HTTP router most likely isn't going to rewrite/fix the regexp package just so they can use this router when there are already other routers that are faster as is. Do you dispute that?
If you use a broken hammer to attempt to insert a screw, you're crazy for using a hammer at all, not because the hammer is broken.
The comment linked is about CSV parsing being slow (the linked GitHub issue in the comment shows that `regexp` doesn't show up in the benchmarks at all). So I don't understand why you linked a 6-year-old thread that's been revived by unrelated topics?
Regexes are the problem, because they're simply the wrong tool for the job.
> Regexes are the problem, because they're simply the wrong tool for the job.

For what job? Extracting route variables from paths? They are the right tool for the job; only in the Go community are they deemed the "wrong tool for the job". Your statement embodies everything that is wrong with the Go community. Instead of finding a solution to a problem, you guys spend your time shifting the blame onto "bad practices".

Wouldn't it make more sense to use something faster and simpler for most routing, and then an optional argument for regular expressions? Lots of web frameworks use that approach.
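Something like the following sketch is what that hybrid could look like: an exact-match map handles the common case, and regex routes are an opt-in fallback that is only consulted when nothing else matches. The `router` type, its method names, and the string handler names are all hypothetical, not any framework's actual API.

```go
package main

import (
	"fmt"
	"regexp"
)

// regexRoute pairs a precompiled pattern with a hypothetical handler name.
type regexRoute struct {
	re      *regexp.Regexp
	handler string
}

// router tries cheap exact matches first and regex routes last.
type router struct {
	static map[string]string // exact path -> handler name
	regex  []regexRoute      // optional, checked only when no exact path matches
}

func newRouter() *router { return &router{static: map[string]string{}} }

// handle registers an exact path.
func (r *router) handle(path, handler string) { r.static[path] = handler }

// handleRegexp registers an optional regex route; the pattern is compiled once here.
func (r *router) handleRegexp(pattern, handler string) {
	r.regex = append(r.regex, regexRoute{re: regexp.MustCompile(pattern), handler: handler})
}

// lookup returns the handler name and whether any route matched.
func (r *router) lookup(path string) (string, bool) {
	if h, ok := r.static[path]; ok {
		return h, true
	}
	for _, rr := range r.regex {
		if rr.re.MatchString(path) {
			return rr.handler, true
		}
	}
	return "", false
}

func main() {
	r := newRouter()
	r.handle("/about", "about")
	r.handleRegexp(`^/files/.+\.(png|jpg)$`, "image")
	fmt.Println(r.lookup("/about"))
	fmt.Println(r.lookup("/files/cat.png"))
	fmt.Println(r.lookup("/missing"))
}
```

Requests that hit an exact route never touch the regex engine, so only the routes that genuinely need a pattern pay for one.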