Hacker News
by tyingq 2825 days ago
I don't know if any are simpler, but here's a Stack Overflow question that points to other similar tools: https://stackoverflow.com/questions/17891932/open-source-rul...

We started RL3 more than 10 years ago because we had several projects with a huge number of patterns. We found that the other projects available at the time were too syntax-heavy, which made it complicated to support and manage a large library of patterns. So we tried to keep the power of regex and add new features (such as named patterns, modules, templates, and lookup dictionaries) while minimizing additional syntax. As a result, we enabled a team of computational linguists (i.e., not programmers) to develop and support huge libraries of NER patterns and document classification rules quite easily.
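To make the idea of named patterns plus lookup dictionaries a bit more concrete, here is a rough sketch in plain Python `re`. This is not RL3 syntax. Every name in it (`PATTERNS`, `TITLES`, `dictionary`, `named`, `extract_people`) is hypothetical and only illustrates the general approach of composing small reusable pieces into one larger pattern.

```python
import re

# Hypothetical reusable sub-patterns (not RL3 syntax, just plain regex).
PATTERNS = {
    "first_name": r"[A-Z][a-z]+",
    "last_name": r"[A-Z][a-z]+",
}

# Hypothetical lookup dictionary of job titles.
TITLES = ["Chief Executive Officer", "CEO", "CTO", "Founder"]


def dictionary(words):
    """Turn a word list into a regex alternation, longest entries first."""
    ordered = sorted(words, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(w) for w in ordered) + ")"


def named(name, body):
    """Wrap a sub-pattern in a named capture group."""
    return f"(?P<{name}>{body})"


# Compose the pieces: "<First> <Last>, <Title>".
PERSON_WITH_TITLE = re.compile(
    named("first", PATTERNS["first_name"]) + r"\s+"
    + named("last", PATTERNS["last_name"]) + r",\s+"
    + named("title", dictionary(TITLES)) + r"\b"
)


def extract_people(text):
    """Return one dict of named groups per match found in the text."""
    return [m.groupdict() for m in PERSON_WITH_TITLE.finditer(text)]


if __name__ == "__main__":
    sample = "Contact Jane Doe, CEO or John Smith, Founder of Acme."
    for person in extract_people(sample):
        print(person)
```

Even in this toy version the composed pattern is hard to read once a few pieces are combined, which is the maintenance problem described above for large pattern libraries.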
Are there any public projects based on this engine?
Yes. The most notable are https://www.aihitdata.com and https://www.happygrumpy.com. The first crawls corporate websites (~25 million) and extracts key information such as people and contacts. The second is a sentiment analysis tool.
Thanks. Consider expanding the docs with more examples on how to extract different types of structured data.