Hacker News
by teach 1975 days ago
This is wonderful. I love to see technology enhancing experts' ability to do what they already do, but faster/more accurately.

Also, I'm a big fan of regex. I think -- probably thanks to jwz's famous quote -- a lot of younger programmers avoid them but they're fantastic for MATCHING. Using them in a Google sheet is a killer MVP to prove out something like this.

1 comments

I'm good at reading/writing regex and use them a lot, but I always worry about their maintainability. They're a common source of hard-to-pinpoint bugs.

I suppose I still use them because I don't know of a better way to do things.

We were amazed at how far we were able to get with them – if solving a problem with a regular expression produces two problems, we should now have 13,000 problems. That they worked so well is down to the subeditor who compiled (and still maintains!) the rule corpus – beyond the sheer volume, quite a few of the rules are carefully ordered. Because style guide matches are fairly sparse in content, and usually fairly specific about what they match (even if it's difficult to produce a correction), it turned out to be a surprisingly tractable problem to produce something useful with regular expressions alone – but we'd never have discovered that unless someone had spent literally years doing it!
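
As a rough illustration of why rule order can matter, here is a minimal Python sketch. The two rules, the `Rule` type, and `find_matches` are all made up for the example and are not taken from the real corpus. It assumes one plausible policy: an earlier rule claims the text it matches, and any later match overlapping that text is dropped.

```python
import re
from typing import List, NamedTuple, Tuple


class Rule(NamedTuple):
    pattern: "re.Pattern[str]"
    message: str


# Hypothetical style rules. The specific phrase is listed before the
# general word so that it gets first claim on overlapping text.
RULES = [
    Rule(re.compile(r"\bInternet of Things\b"), "use 'internet of things'"),
    Rule(re.compile(r"\bInternet\b"), "use 'internet'"),
]


def find_matches(text: str, rules: List[Rule]) -> List[Tuple[int, str, str]]:
    """Apply rules in order; skip matches overlapping an earlier rule's match."""
    claimed: List[Tuple[int, int]] = []  # spans already taken by earlier matches
    results: List[Tuple[int, str, str]] = []
    for rule in rules:
        for m in rule.pattern.finditer(text):
            start, end = m.span()
            if any(start < c_end and c_start < end for c_start, c_end in claimed):
                continue  # an earlier, higher-priority match covers this text
            claimed.append((start, end))
            results.append((start, m.group(0), rule.message))
    return sorted(results)  # report in document order


matches = find_matches("The Internet of Things relies on the Internet.", RULES)
for offset, matched_text, message in matches:
    print(offset, repr(matched_text), message)
```

With the rules in this order, the phrase is reported once and only the second, standalone "Internet" triggers the general rule. Swap the two rules and the general one claims the word inside the phrase first, so the more specific advice is never shown.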

General maintainability is a priority, and we'd like to improve our rule management tooling to make rule maintenance accessible to editorial staff. We're also working on making noisy rules match more specifically, which usually involves migrating the initial regex into LanguageTool for, e.g., pattern-matching on part-of-speech.