by heinrichhartman 1370 days ago
The case is more nuanced than it looks from the outset. In this case YTDL is executed on the provider's premises, not on the users' machines. So you are not distributing a tool, you are providing a service that scrapes videos from YouTube.

I still think that a DMCA takedown should not apply (and the regulation is BS) but this looks to be a different case. E.g. I could completely understand why Google would not want such a feature in your product for Business reasons.

7 comments

> you are providing a service that scrapes videos from YouTube. ... > I could completely understand why Google would not want such a feature in your product

So, when Google scraped the internet to provide a search engine, that was okay, but when some other site scrapes something, it is not?

If they are doing something illegal, by all means sue or report them.

But Google acting as police, judge, jury and executioner here shows why it's a problem to have a single unregulated company with so much power.

Google did not only scrape the web.

- Google recorded half of the planet with cameras scanning license plates, addresses, faces. You see those blurred in Google Maps, but the license plates and faces are not blurred in the internal data at Google. I have yet to see confirmation that no data correlation is occurring.

- They scanned the Wifi MAC addresses associated with physical addresses "by mistake".

- They scanned public records of companies and published them, sometimes together with the private address of the owner, and invite you as the owner to "take ownership" of their Google Maps data...

> They scanned the Wifi MAC addresses associated with physical addresses "by mistake".

https://www.theguardian.com/technology/2010/may/15/google-ad...

  German request for data audit reveals the web giant 'accidentally' stored payload information from open networks

> Google recorded half of the planet with cameras

not to be a google apologist, but they only ever record public areas. Do you hold tourists that take photos to the same standard?

> are not blurred in the internal data at Google.

whatever is in internal google is irrelevant, since the data is not exposed publicly. Do you expect a tourist who took the same photo with a license plate in it to blur the photo in their own private album? I would expect a tourist who publishes the photo (say, on Facebook) to blur out the license plate, but not when it's in their own private album.

> sometimes the private address of the owner

how did google get the private address of the owner of a public business? I don't quite understand the claim to the wrong doing.

A single person that collects data at the same scale as Google? I will absolutely hold him to the same standard.

I mean, that was a plot point in a Batman movie not too long ago. And even Batman doesn’t deserve that kind of power.

> whatever is in internal google is irrelevant, since the data is not exposed publicly

The data may not be exposed publicly but can still be sold or used against you.

> they only ever record public areas. Do you hold tourists that take photos to the same standard?

If those tourists are creating a massive database of everybody, then yes.

The issue isn't really the recording itself. The issue is what is done with the data afterwards.

>not to be a google apologist, but they only ever record public areas. Do you hold tourists that take photos to the same standard?

Tourists don't do the same thing. Why do you think there is almost no Google Street View in Germany?

> whatever is in internal google is irrelevant, since the data is not exposed publicly...

GDPR -> "What information must be given to individuals whose data is collected?"

https://ec.europa.eu/info/law/law-topic/data-protection/refo...

> how did google get the private address of the owner of a public business? I don't quite understand the claim to the wrong doing.

Google got it because some owners have their company registered to their own private address. The problem is the extra step of publicly exposing that info and pressuring you to take ownership of it, because otherwise it will be shown on Google Maps, a product that currently has 98% of the market. Why do you think you have the option of blurring the view of your house in Google Maps?

> Do you hold tourists that take photos to the same standard?

Scale, intent and usage matter in law.

> they only ever record public areas

They tried to get around that with Pokemon Go with some success. I don't know if all that content they scraped is publicly available however.

What is a public area? Anywhere that can be viewed from any spot accessible to the public? So the only non-public areas are rooms with no windows?

If I go to Google Street view I can see a lot of details from inside my house, same with the houses of my neighbors.

If you stop and stare inside someone's window in public, people will notice and wonder what the hell you're doing. Nothing's stopping anyone from staring through people's windows through Google Street view though.

In most "free" countries, people are free to stand outside your house on the public street and look towards your house as much as they want. They can even take photos.

So you can wonder what the hell they're doing as much as you want, because what you're suggesting is perfectly legal.

> So, when Google scraped the internet to provide a search engine, that was okay, but when some other site scrapes something, it is not?

No, but when Google scraped the Internet, nobody did anything about it.

The value they added by making everything searchable was so great.

Giving content creators platform flexibility is clearly a threat to YouTube.

Double standards are twice as good.

> The value they added by making everything searchable was so great.

Altavista, yahoo

++Google

Everyone was within their rights to deny Google access to their machines for scraping, but that would have been impossible to do. The web is only a pseudo-public apparatus, filled with private computers. While you could technically deny access to a specific person, you could never actually implement that or follow it up legally.

While scraping, Google follows robots.txt file rules
https://www.youtube.com/robots.txt

Looks like scraping is allowed then, if your robot is named "Mediapartners-Google*". And you can name your robot whatever you like (there's no legally enforced or de facto registry), so my robot:

"Mediapartners-Google-im-cleary-not-a-member-of/1.0"

Can legally scrape anything on google's youtube. Perfect.
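
To make the mechanics concrete, here is a minimal sketch of how a robots.txt check keys off nothing but the name the robot reports for itself. It uses Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, and the example.com URL are assumptions made up for illustration, not the real contents of youtube.com/robots.txt. The trailing `*` is left out of the agent name because, as far as I can tell, Python's parser compares names by plain substring and would treat the asterisk literally. This only shows what the file says, not whether anything is legal.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only (NOT YouTube's real file).
SAMPLE_ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Allow: /

User-agent: *
Disallow: /watch
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = SAMPLE_ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper: the decision depends only on the self-reported
    user_agent string, which nobody verifies.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/watch?v=abc"
    for agent in (
        "Mediapartners-Google-im-cleary-not-a-member-of/1.0",
        "SomeOtherBot/1.0",
    ):
        print(agent, "->", is_allowed(agent, url))
```

Under these assumptions the made-up robot name passes the check simply because it contains the allowed name, while any other name falls through to the catch-all rule.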

Robots.txt is a thing.

If robots.txt were a thing, Google would never have built its search engine. Everyone would just allow Altavista and disallow everyone else, like they do with Googlebot and Bingbot now.

Also, the issue here is not that Google banned their tool from accessing YouTube because the tool didn't obey robots.txt. The issue is that Google, the search engine, nuked their site, because they didn't like its content.

I agree with this, but I also feel Google should sue them for this feature, instead of removing them from Google Search. I feel like the second is an abuse of their massive power on the internet.

The EU is rubbing their hands...

> I could completely understand why Google would not want such a feature in your product for Business reasons.

yeah I can understand this too, but it looks to me like Google is using its monopoly power in one industry (search) to protect its monopoly power in another industry (video streaming)

Hum, isn't copyright about publication? The mere act of downloading doesn't necessarily imply this, and there's still legitimate cause for publication as in fair use. Also, wouldn't a DMCA complaint have to concern specific material? (To me, this doesn't make much sense.)

Downloading even for caching is prohibited afaik.

But again, GIPHY has the exact same feature, and I can find GIPHY just fine.

Maybe GIPHY negotiated something with Google? Or the fact that it's only short extracts means they don't see it as scraping?

> The case is more nuanced than it looks from the outset. In this case YTDL is executed on the provider's premises, not on the users' machines. So you are not distributing a tool, you are providing a service that scrapes videos from YouTube.

So the rule is that you can't (as a user of Kapwing) use a computer you don't own (and are just renting in some capacity) to download your own videos? Or to download videos for "fair use" use cases?

I mean who cares where the software runs?

> So the rule is that you can't use a computer you don't own (and are just renting in some capacity) to download your own videos? Or to download videos for "fair use" use cases?

One of the (many) reasons the DMCA is bad is because it makes using your existing rights harder. You've mentioned "fair use" here. But the submitted article is talking about:

> (2) No person shall manufacture, import, offer to the public, provide, or otherwise traffic in any technology, product, service, device, component, or part thereof, that—

> (A) is primarily designed or produced for the purpose of circumventing a technological measure that effectively controls access to a work protected under this title;

You can't use your right to fair use if the IP holder has implemented an access control, because the DMCA makes circumventing an access control unlawful.

The fact that the IP industry was able to implement a law that interferes with free speech shows how out of control they are.

Sad. Who decides which law has priority though, when the laws are in conflict?

Apparently HTTP clients are not banned by this law, even though they are necessarily "part of a product" that is designed to circumvent protection... and they are the main thing doing the actual copying. This law is still not achieving its full potential. :D

In the same vein, there's conflict between exceptions offered by one law (fair use), and bans by this same law (software to exercise this right).

I guess the courts decide, but in practice courts don't go looking for conflicts in laws to resolve on their own so that normal people can follow the laws more easily; they just decide cases brought forward by someone. And that someone is usually whoever benefits the most from a particular outcome, and is thus able to pay for the case.

The whole system seems a bit prejudiced in how it works.

> I could completely understand why Google would not want such a feature in your product for Business reasons.

But if Google actually acts on that desire, they're so far into antitrust territory they'll never find their way out on their own. We can only hope that some of those teeth might have grown back.