Hacker News
Ask HN: How can I keep ChatGPT from processing my work?
2 points by catinblack 1180 days ago
OpenAI and ChatGPT use text and data that are available on the internet. However, not everything that is publicly available is under a free licence, and not all of it should or can be processed. How can I prevent my texts from being processed?
2 comments

You could poison your texts in various ways, but the things that might keep an AI or LLM from digesting your work are probably the same things that hurt SEO and accessibility.
Essentially, the rule of thumb is: if a human (biological intelligence) can read and learn from your work, you should assume an artificial intelligence will, one way or another, read and learn from it too.
Yes and no. AI is not a human. It is a model, a tool, a piece of software. I don't want my work to go into someone else's software. I think people who keep their open-source work on GitHub, and so feed the Copilot tool for free, have the same problem.

But for now I wanted to focus on the texts.

Not all of us feel this way. As GP said, if you put your work out there, it is there for anyone to consume. I put my work out there without requiring permission, so that it is truly open and free to use for anyone, without discrimination.

Why do you want to discriminate over who may use your work?

It's not even so much that I want to do it as that I want to know whether it is possible. Personally, I think it's quite an important issue and we should be able to talk about it and make an informed decision.

Anyway, today on Hacker News there was a post about abandoning GitHub because the developer doesn't want Copilot to learn from his code.

I know a lot of people who are asking themselves the same question as me: how do I avoid feeding my blog posts to ChatGPT in the future? Suppose someone publishes unique data and wants it to be available to humans, not machines.

In addition, the law gives us copyright, and we should be able to decide whether our code, texts, and graphics can be copied and processed in someone else's software.

In the meantime, I found two solutions: https://www.seroundtable.com/chatgpt-bot-user-agent-35131.ht... https://shortbuzz.in/blog/shortbuzz.in/how-to-block-openai-c...
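Both links appear to describe blocking by user agent in robots.txt. Here is a rough sketch of what that could look like, checked with Python's standard-library robots.txt parser. The `ChatGPT-User` and `CCBot` tokens and the example.com URL are my assumptions, not something I have verified against either link, and robots.txt is only a request: a crawler that ignores it will not be stopped.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a blog. The user-agent tokens below are
# assumptions: "ChatGPT-User" for OpenAI's agent, "CCBot" for Common Crawl.
ROBOTS_TXT = """\
User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent, url="https://example.com/blog/post"):
    """Return True if ROBOTS_TXT permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ChatGPT-User", "CCBot", "Mozilla/5.0"):
        print(agent, "allowed" if allowed(agent) else "blocked")
```

The catch-all `User-agent: *` block keeps ordinary browsers and search engines allowed, so this should not hurt SEO the way poisoning the text would.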

Copyright has an exception called Fair Use, where one of the main purposes is to support education, or perhaps training. The legality is not clear here.

Again, why discriminate against one or more groups? Are you going to ban Common Crawl from accessing your content? Its data is used for training, but for many other things as well.