by canadianwriter 1088 days ago
Not sure what you are implying here - just because something is free doesn't mean you can use it in a commercial product....

What about search engines?

If you post something to the public internet, you lose privacy ... that's how the internet works.

For this we have robots.txt and authentication ... if a site allows you to browse their content, it's free to take, whatever the purpose.

Search engines have a special legal carve-out, but otherwise granting access to browse a site ABSOLUTELY DOES NOT mean you have any rights to take it and do whatever you want with it. In the US, all works are automatically granted a copyright with all rights reserved, and the owner can choose to relax or waive those rights at their discretion. Most blog posts, social media posts, etc. do not waive those rights.

- Crawl Limitations: Search engines typically adhere to guidelines provided by website owners through the robots.txt file. This file instructs web crawlers on which parts of a website they are allowed to access and index. Website owners can use these instructions to control the extent to which search engines crawl and display their copyrighted content.

- Indexing vs. Displaying: Search engines primarily index web pages to create a searchable database of information. They do not generally host or display full copyrighted content directly. Instead, search results usually provide brief snippets, page titles, and links that direct users to the original source. This approach aims to respect copyright by driving traffic to the copyright holders' websites.

- Fair Use Considerations: In some cases, search engines may display limited portions of copyrighted content under the fair use doctrine, which allows for the limited use of copyrighted material for purposes such as commentary, criticism, news reporting, or educational purposes. The application of fair use can be subjective and depends on the specific circumstances of each case.
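
For anyone who hasn't seen how the robots.txt part works in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the bot names (`ExampleSearchBot`, `ExampleLLMBot`), and the `example.com` URLs are made up for illustration; a real crawler would fetch the file from the site instead of using an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone is kept out of /private/,
# and one (made-up) crawler is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleLLMBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# A crawler with no specific rule falls back to the "*" group.
print(may_fetch(parser, "ExampleSearchBot", "https://example.com/blog/post"))
print(may_fetch(parser, "ExampleSearchBot", "https://example.com/private/notes"))

# The crawler that is named explicitly is blocked everywhere.
print(may_fetch(parser, "ExampleLLMBot", "https://example.com/blog/post"))
```

Note that this only answers "did the site owner ask this crawler to stay away from this path?". Compliance is voluntary on the crawler's side, and the file says nothing about what you may do with the content once it has been fetched.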

Replace "search engine" with "LLM"; it's (practically) the same.