by LeanderK 3005 days ago
I don't know how you are doing this, but with NNs there's still the chance that the result contains significant parts of the textbooks. I am not a lawyer, but because you are only interested in syntactic equality (the semantics being the same anyway), I think a simple algorithm, like something based on edit distance, may be able to exclude such cases.
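To make the edit-distance idea concrete, here is a rough sketch of what such a filter could look like. It assumes character-level Levenshtein distance and a similarity cutoff of 0.8; both are illustrative choices, and the names `edit_distance`, `similarity`, and `looks_copied` are made up for this example rather than taken from any existing pipeline.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity: 1.0 for identical strings, lower as they diverge."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def looks_copied(generated: str, passages: Iterable[str],
                 threshold: float = 0.8) -> bool:
    """Flag output that is nearly identical to any source passage.

    The 0.8 threshold is an assumed value and would need tuning.
    """
    return any(similarity(generated, p) >= threshold for p in passages)
```

Calling `looks_copied(sentence, textbook_sentences)` on each generated sentence would then flag near-verbatim matches for exclusion. The dynamic program is quadratic in the string lengths, so for a large corpus you would probably compare sentence by sentence and prefilter candidates first.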