Hacker News
by _ncuy 394 days ago
Google hit the jackpot with their acquisition of YouTube and it's now paying dividends. YouTube is the largest single source of data and traffic on the Internet, and it's still growing fast. I think this data will prove incredibly important to robotics as well. It's a shame they sold Boston Dynamics because of bad PR, in one of their dumbest ever moves.
6 comments

"Growing fast" is questionable these days.

There is an ever-growing percentage of AI-generated videos in every day's set of new uploads.

How long until more than half of uploads in a day are AI-generated?

Even if the content were 100% AI-generated (which is the furthest thing from reality today), human engagement with the content is a powerful signal that can be used by AI to learn. It would be like RLHF with free human annotation at scale.
Won't human engagement be replaced by AI engagement too, if it isn't already being replaced?
The AI is not paying for watching videos yet.
Indeed, it's the advertisers who are paying for AI to watch videos...
And paying for my sofa to watch an unskippable 50s ad while I make a coffee.
Google already invests a tremendous amount of resources into identifying and preventing fraudulent ad impressions -- I don't see that changing much until AI is so cheap that it makes sense to run a full agent for pennies per hour. Sadly.
Yes it will. Soon humans will be the minority on the internet. I wrote some guesses about this 2 years ago: https://art.cx/blog/12-08-22-city-of-bots
And Google is in the best possible position to detect it if they want to exclude it from their datasets.
They're never going to manage to do that, just on a technical level.

Plus some users might want to legitimately upload things with AI-generated content in them.

I'm pretty sure YouTube saves the metadata from all the video files uploaded to it. It seems pretty trivial to exclude videos uploaded without camera model or device setting information. I seriously doubt even a tiny fraction of people uploading AI content to YouTube are taking the time to futz about with the XMP data before they upload it. Sure, they'll miss out on a lot of edited videos doing that, but that's probably for the best if you're trying to create a data set that's maintaining fidelity to the real world. Lots of ways to create false images without AI
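A minimal sketch of that kind of filter, assuming each upload's metadata has already been parsed into a dict. The field names `camera_make` and `camera_model` are hypothetical stand-ins, since real container and XMP field names vary by device and format, and nothing here reflects how YouTube actually stores or uses metadata.

```python
# Hypothetical metadata keys; real container/XMP field names vary by device and format.
CAMERA_FIELDS = ("camera_make", "camera_model")


def looks_camera_captured(metadata):
    """Return True if every camera field is present and non-empty."""
    return all(str(metadata.get(field) or "").strip() for field in CAMERA_FIELDS)


def filter_for_training(videos):
    """Keep only videos whose metadata suggests capture on a physical device."""
    return [v for v in videos if looks_camera_captured(v.get("metadata") or {})]


# Made-up example uploads, for illustration only.
uploads = [
    {"id": "a1", "metadata": {"camera_make": "Apple", "camera_model": "iPhone 15"}},
    {"id": "b2", "metadata": {}},
    {"id": "c3", "metadata": {"camera_make": "", "camera_model": "  "}},
]

kept = filter_for_training(uploads)
print([v["id"] for v in kept])
```

Metadata is easy to forge, so this would only catch uploaders who don't bother, and it throws away edited footage too, which is the trade-off described above.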
"Since launching in 2023, SynthID has watermarked over 10 billion images, videos, audio files and texts, helping identify them as AI-generated and reduce the chances of misinformation and misattribution. Outputs generated by Veo 3, Imagen 4 and Lyria 2 will continue to have SynthID watermarks.

Today, we’re launching SynthID Detector, a verification portal to help people identify AI-generated content. Upload a piece of content and the SynthID Detector will identify if either the entire file or just a part of it has SynthID in it.

With all our generative AI models, we aim to unleash human creativity and enable artists and creators to bring their ideas to life faster and more easily than ever before."

From the page linked in the post....

So there are different ways to detect AI-generated content (videos/images at least). (https://www.nature.com/articles/s41586-024-08025-4 <-- paper on SynthID / watermarking and detecting it with LLMs)

I somewhat doubt that YT cares much about AI content being uploaded, as long as it’s clearly marked as such.

What they do care about is their training set getting tainted, so I imagine they will push quite hard to have some mechanism to detect AI; it’s useful to them even if users don’t act on it.

> They're never going to manage to do that, just on a technical level

Why not? Given enough data, it's possible to train models to differentiate - especially since humans can pick up on the difference pretty well.

> Plus some users might want to legitimately upload things with AI-generated content in it

Excluding videos from training datasets doesn't mean excluding them from Youtube.

I agree, especially because in practice the vast majority of AI-generated videos uploaded to YouTube are going to be from one of about 3 or 4 generators (Sora, Veo, etc.). May change in the future, but at the moment the detection problem is pretty well constrained.
> Excluding videos from training datasets doesn't mean excluding them from Youtube.

Ah then sure. It was this part that was problematic.

If users are still allowed to upload flagged content, then false positives almost don't matter, so YouTube could just roll out some imperfect solution and it would be fine.

In the future, a new intelligent species will roam the Earth and ask, "Why did their civilization fall?" The answer? These Homo sapiens strip-mined the Earth and exacerbated climate change to generate enough power to make amusing cat videos...
It's the much-feared paperclip apocalypse, but we did it to ourselves with cat clips.
And those videos were either not watched by anyone human or not truly watched by being part of an endless feed of similar slop.
how do you truly watch an ai-generated cat video
use your eyes. write a detailed and elaborate review on your blog of the cat and his antics. seems easy enough?
At this point heat death through cat videos sounds more appealing than nuclear apocalypse, lol
We don't have an energy problem on earth. We have a capitalism problem.

Renewable energy can easily provide enough energy sustainably. Batteries can be recycled. Solar panels are glass/plastic and silicon.

Nuclear is feasible, and fusion will happen in 50 years one way or the other.

Existence is what it is. If it means being able to watch cat videos, so be it. We are not watching them for nothing, we watch them for happiness.

> Existence is what it is. If it means being able to watch cat videos, so be it. We are not watching them for nothing, we watch them for happiness.

Well that's just your opinion.

Yes, we can generate electricity, but it would be nice if we used it wisely.

Of course it's my opinion, it's my comment after all.

Nonetheless, survival can't be the goal of life. After all, the moon will drift away from Earth in the future, the sun will explode, and even if we survive that as a species, all bonds between elements will dissolve.

It also can't be about passing on your DNA, because your DNA has little to no impact after just a handful of generations.

And no, the goal of our society has to be to have as much energy available to us as possible. So much energy that energy doesn't matter. There are enough ways of generating energy without any real issue at all: fusion, renewable energy directly from the sun.

There is also no inherent issue right now preventing us all from having clean, stable energy besides capitalism. We have the technology, we have the resources, we have the manufacturing capacity.

To finish my comment: it's not about energy, it's about entropy. You need energy to create entropy. We don't even consume the energy of the sun, we use it for entropy and dissipate it back to space after.

On the other hand, take one look at the way they caption a video in their dataset, and you have seen like 90% of the "secret sauce" of generative art. All this supposed data and knowledge, and anyone who has worked 1 day on Imagen or Veo could become a serious competitor.

The remaining 10% is the solution to generating good hands, of course. And do you think YouTube has been helping anyone achieve that?

I hear BD aren't making much money anyway so I wonder if they couldn't just buy them back for not much loss overall.
Why are videos important for robotics?
If you can generate a realistic video stream that responds to player movements and interactions, you can train your robot using that video stream. It's much more scalable than building physical environments and performing real-world training.

Of course the alternative is to use game engines, but it's possible that AI would generate a more realistic video stream for the same money spent. Those recent AI-generated videos certainly look much more realistic than any game footage I ever saw.

Game engines require a lot of additional work to make them suitable for that task, too: deep integration for sensor data, inputting maps and assets, plus the basic mismatch that these workflows are centered around Windows GUI tools whereas robotics is happening on the Linux command line.
object detection i'd guess.
Why should YouTube have the advantage here? Every competitor also has access to these videos(?)
Easy access to the videos without having to download them from Google (and without Google trying to stop you from scraping them, which they will) is an enormous advantage. There's way, way too much on YouTube for anyone to index and use over the internet, and especially not at full resolution.
That is the other perk: Google has all those videos stored in original quality locally.

It wouldn't be hard for Google to poison competitor training just by throttling bandwidth.

Google is making money hosting these videos, and users are freely uploading them. A competitor would have to scrape/download them, store them, and process them all at their own cost, along with having much less metadata available (which videos are most viewed, which segments, what do people repeat, what do people skip, what do people watch after this video, which video generates the most ad revenue, etc.)
> Google is making money hosting these videos

This isn't certain. Google do not break out Youtube revenues nor costs. Hosting this amount of videos, globally, redundantly, the vast majority of which are basically never watched, cannot be cheap.

It's entirely plausible that Google's wider benefits from YouTube (such as training video generation algorithms and better behaviour tracking for better targeted ads across the internet) are enough to compensate for YouTube in particular losing money.

> Google do not break out Youtube revenues nor costs.

Google does break out Youtube revenue.

Latest 10-K: https://abc.xyz/assets/77/51/9841ad5c4fbe85b4440c47a4df8d/go...

See page 10 for YouTube ads revenue.

My bad, I thought that applied to both. But they don't break out costs, so in reality we don't know if YouTube is profitable or not.
Videos without metadata are not as useful. Google also has details on which videos are watched where, which parts people skip, all the videos that are blocked for various reasons, how videos perform with humans over time, and so on. They can focus on videos with signals that indicate that humans prefer those videos or clips.
Do they, though? Are competitors actually downloading all these videos? Supposedly there are 5 billion videos on YouTube (https://seo.ai/blog/how-many-videos-are-on-youtube), downloading all of that is a LOOOOT of data and time.

I mean, you could limit yourself to the most popular or most interesting 100 million, but that's still an enormous amount of data to download.
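A rough back-of-envelope sketch of what "enormous" could mean here. The 5 billion and 100 million video counts come from the comment above; the 100 MB average file size and the sustained 10 Gbit/s link are pure assumptions picked for illustration, so treat the results as order-of-magnitude only.

```python
AVG_VIDEO_BYTES = 100 * 10**6   # assumption: 100 MB per video on average
LINK_BITS_PER_SECOND = 10 * 10**9  # assumption: sustained 10 Gbit/s, no throttling
SECONDS_PER_DAY = 86_400


def total_bytes(num_videos, avg_bytes=AVG_VIDEO_BYTES):
    """Total storage needed for num_videos at the assumed average size."""
    return num_videos * avg_bytes


def transfer_days(num_bytes, link_bps=LINK_BITS_PER_SECOND):
    """Days needed to move num_bytes over a link of link_bps bits per second."""
    return num_bytes * 8 / link_bps / SECONDS_PER_DAY


for label, count in [("all ~5 billion", 5 * 10**9), ("top 100 million", 100 * 10**6)]:
    size = total_bytes(count)
    petabytes = size / 10**15
    print(f"{label}: ~{petabytes:.0f} PB, ~{transfer_days(size):,.0f} days on one link")
```

Under these assumptions the full catalogue is hundreds of petabytes and years of transfer on a single link, while the 100 million subset is around 10 PB and about three months. Different size or bandwidth guesses shift the numbers proportionally.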

Just wanted to mention the latter: you don't need all videos. It's indeed a lot of data, but doable, so I am not sure I would count this as a big advantage.
You are incredibly naive if you don’t see full, unrestricted access to YT as an advantage.
presumed datasets: 1. it's petabytes of data in the public/listed/free tier videos. 2. there are paywalled videos. 3. there are private/unlisted videos.

google will have access to all of these. competitors will have to do tons of network interactions with google to pull in only the first set. (which google could detect and block depending on how these competitors go about it)

Most YouTube videos use stock footage, or the face of some YouTuber.

If we look at the Veo 3 examples, these are not typical YouTube videos; instead they seem to recreate CGI movies, or actual movies.