by MrLeap 3026 days ago
Your story reminds me of a tool I wrote to help lawyers classify a feed of texts as a test set for a project.

Our main initiative was creating a heuristic-based classifier (think lots of regex). On my own initiative, I trained ML classifiers while we worked on it. As development went on, the ML classifiers were rapidly catching up with the heuristic-based one. Unfortunately it was kind of a one-off data processing task, and when time ran out the regex machine was still in the lead.

I was modestly proud of the legalese DSL generator I wrote up. The lawyers didn't even know they were writing CoffeeScript as they typed out what the documents were, what the key dates were, etc. :D

That CoffeeScript formed the basis of our accuracy testing suite. It was as fundamental as it was huge. That team ended up creating a couple thousand tests in less than a month.

1 comment

I'd be interested in hearing more about this. Fancy dropping me an email (address is in profile)?