Hacker News
by CyberRabbi 1804 days ago
You’re free to privately research with this data but commercializing other people’s work using ML is theft.

Edit: commercialization of the derived work is one factor US law explicitly considers in making a fair use determination. That said, even if it weren’t commercialized, it may still be infringement, and I believe it is.

3 comments

Commercializing isn't really the issue; it's still copyright infringement even if you release it for free (i.e. piracy), since it's unauthorized redistribution (i.e. copying).
Even if we accept that (which many wouldn't, as most licenses say little about research), the research would never be very useful if you can never make a comparable dataset to use in the real world.
I get that the problem is commercializing, but the theories around copyright that are being deployed here would prevent even free, open-source NLP research from becoming a reality.