Hacker News
by cmiles74 535 days ago
It has never been argued that copyright law should apply to information that people learn, whether from reading books or newspapers, watching television, or appreciating art like paintings or photographs.

Unlike a person, a large language model is a product built and sold by a company. While I am not a lawyer, I believe many of the copyright arguments around LLM training revolve around the idea that copyrighted content should be licensed by the company training the LLM. In much the same way that people are not allowed to scrape the content of the New York Times website and then pass it off as their own, OpenAI should be barred from scraping the New York Times website to train ChatGPT and then selling the service without providing some dollars back to the New York Times.