|
|
|
|
|
by ryandrake
228 days ago
|
|
People need to have a better mental model of what it means to host a public web site, and what they are actually doing when they run the web server and point it at a directory of files. They're not just serving those files to customers. They're not just serving them to members. They're not just serving them to human beings. They're not even necessarily serving files to web browsers. They're serving files to every IP address (no matter what machine is attached to it) that is capable of opening a socket and sending GET. There's no such distinct thing as a scraper--and if your mental model tries to distinguish between a scraper and a human user, you're going to be disappointed. As the web server operator, you can try to figure out if there's a human behind the IP, and you might be right or wrong. You can try to figure out if it's a web browser, or if it's someone typing in curl from a command line, or if it's a massively parallel automated system, and you might be right or wrong. You can try to guess what country the IP is in, and you might be right or wrong. But if you really want to actually limit access to the content, you shouldn't be publishing that content publicly. |
|
Legally in the US a “public” web server can have any set of usage restrictions it feels like even without a login screen. Private property doesn’t automatically give permission to do anything even if there happens to be a driveway from the public road into the middle of it.
The law cars about authorized access not the specific technical implementation of access. Which has caused serious legal trouble for many people when they make seemingly reasonable assumptions that say access to someURL/A12.jpg also gives them permission to someURL/A13.jpg etc.