Hacker News
by ejk314 4400 days ago
I just wrote a script (https://github.com/Glank/repeat_test) to see if I could catch these sorts of errors. It does OK. I looked for groups of repeated lines based on their Levenshtein percent difference. Then, for each repeating group, if the percent difference between any pair of lines is an outlier, it returns a positive result.
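
A minimal sketch of that idea in Python, not the code from the linked repo. The details here are assumptions made for illustration: lines are grouped only when consecutive, percent difference is edit distance divided by the longer line's length, and "outlier" means more than two standard deviations from the mean pairwise difference. The names `repeat_groups`, `has_outlier`, `threshold` and `k` are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def percent_diff(a: str, b: str) -> float:
    """Edit distance as a fraction of the longer line (assumed definition)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def repeat_groups(lines, threshold=0.3, min_size=3):
    """Group consecutive non-blank lines that stay within `threshold`
    of the first line of the current group."""
    groups, current = [], []
    for line in (raw.strip() for raw in lines):
        if current and line and percent_diff(current[0], line) <= threshold:
            current.append(line)
            continue
        if len(current) >= min_size:
            groups.append(current)
        current = [line] if line else []
    if len(current) >= min_size:
        groups.append(current)
    return groups


def has_outlier(group, k=2.0):
    """True if any pairwise percent difference lies more than k standard
    deviations from the mean of all pairwise differences in the group."""
    diffs = [percent_diff(a, b) for a, b in combinations(group, 2)]
    spread = pstdev(diffs)
    if spread == 0:
        return False  # every pair differs by the same amount
    centre = mean(diffs)
    return any(abs(d - centre) > k * spread for d in diffs)


def find_suspicious(lines):
    """Return the repeating groups that contain an outlier pair."""
    return [g for g in repeat_groups(lines) if has_outlier(g)]
```

For example, `find_suspicious(["a.x += b.x", "a.y += b.y", "a.z += b.z", "a.w += b.z"])` flags the group, because the last two lines differ by one character while every other pair differs by two. With only a handful of lines per group, a two-sigma rule like this is crude, so expect false positives on real code.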