by _vegp
4250 days ago
This situation comes up in the news world all the time. A company or agency has the original databases, Excel sheets, what have you, but doesn't consider that publishing them in a "human-readable" format is nearly the same thing as publishing the raw data. Try calling the place for a copy, and they'll hang up on you. Yet it doesn't occur to them that a crafty outsider can probably reconstruct the original by scraping. What's particularly interesting here is guessing the motivation behind publishing. Was the information a trade secret, or did a middle manager want to show that their team is ahead of the others? Or are these feathers meant to show the company has the know-how and capability? Whatever the case, most web-published data isn't initially considered published data by the publishers, who in turn don't think to state any restrictions governing it. That's when we scrape and make use of it - and even where there are restrictions on republishing, you can still do transformative derivative work and claim it as such. The fun legalese part is what happens when they discover what you're doing and try to lash out, or interrupt a standing scrape. One time, all it took to unblock access was to show up at a meeting and get yelled at by a police captain for 30 minutes. Our retort started with "In the interest of public safety, ..."
|
I'd like to hear more about your example.