by motbob 1960 days ago
Fwiw, there have been instances of Google straight up not being able to restore deleted content. When you go to delete videos, you do get warnings that deletion cannot be undone. I wouldn't be surprised if Google didn't have great records of deleted content.

Maybe Google deleted this tiny channel, or maybe it was user error by Project Censored (the apparent organizers of the conference), or maybe it was a publicity stunt by Project Censored. All three possibilities seem about equally likely.

3 comments

> Fwiw, there have been instances of Google straight up not being able to restore deleted content. When you go to delete videos, you do get warnings that deletion cannot be undone

Since there are (former, no-NDA) YouTube SWEs/SREs here: can anyone shed light on how YouTube’s storage system works? How much storage redundancy and retention is available for the vast majority of videos that only ever get a few views?

> I wouldn't be surprised if Google didn't have great records of deleted content.

It would surprise me if Google didn't keep everything. Harvesting data is part of their business model.

> Soft deletion implies that once data is marked as such, it is destroyed after a reasonable delay. The length of the delay depends upon an organization’s policies and applicable laws, available storage resources and cost, and product pricing and market positioning, especially in cases involving much short-lived data. Common choices of soft deletion delays are 15, 30, 45, or 60 days.

https://sre.google/sre-book/data-integrity/#first-layer-soft...

I work at Google, not on anything related to this though.
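
To make the quoted soft-deletion idea concrete, here is a minimal sketch in Python. It is not how YouTube or any Google system actually works; the `SoftDeleteStore` class, its method names, and the 30-day delay (one of the "common choices" in the quote) are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional

# Assumption: 30 days, picked from the "common choices" in the quote above.
SOFT_DELETE_DELAY = timedelta(days=30)


@dataclass
class Record:
    payload: bytes
    deleted_at: Optional[datetime] = None  # None means the record is live


class SoftDeleteStore:
    """Hypothetical in-memory store: delete marks, purge destroys."""

    def __init__(self, delay: timedelta = SOFT_DELETE_DELAY) -> None:
        self._records: Dict[str, Record] = {}
        self._delay = delay

    def put(self, key: str, payload: bytes) -> None:
        self._records[key] = Record(payload)

    def get(self, key: str) -> Optional[bytes]:
        # Soft-deleted records are hidden from normal reads.
        record = self._records.get(key)
        if record is None or record.deleted_at is not None:
            return None
        return record.payload

    def delete(self, key: str, now: datetime) -> None:
        # Only mark the record; the payload is still on "disk".
        record = self._records.get(key)
        if record is not None and record.deleted_at is None:
            record.deleted_at = now

    def undelete(self, key: str, now: datetime) -> bool:
        # Restoration only works while the delay has not elapsed
        # and the record has not been purged.
        record = self._records.get(key)
        if record is None or record.deleted_at is None:
            return False
        if now - record.deleted_at >= self._delay:
            return False
        record.deleted_at = None
        return True

    def purge(self, now: datetime) -> int:
        # Periodic job: destroy everything whose delay has elapsed.
        expired = [
            key
            for key, record in self._records.items()
            if record.deleted_at is not None
            and now - record.deleted_at >= self._delay
        ]
        for key in expired:
            del self._records[key]
        return len(expired)
```

The point of the sketch is that both claims upthread can be true at once: a deleted item can be restorable for a few weeks, and still be genuinely unrecoverable once the purge job has run.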

Then consider yourself surprised. Deleted data is actually deleted. Keeping it around is a huge liability.
I don't know how things work these days, but a few years ago, Google's GFS didn't support deletion. The "delete" flag only meant "don't replicate this data" and it was just simpler for them to keep it around until the disk died.

Source: This was published by Google in their research papers on GFS. Sorry, can't remember which paper.

Without reading the paper, I'm about 99% positive it also would mark the data as unused, which would allow the disk space to be reclaimed and overwritten. Disks aren't write once.

It also likely would reindex it, meaning that you can't find it if you go looking for it unless you happen to know which disk it's on already and it hasn't been overwritten yet.

So basically recycle bin rather than shredder
But said recycle bin is actually automatically emptied periodically.
GFS is ancient obsolete technology that is no longer used (the whitepaper was published almost two decades ago, and it described a system already built and in use!). Also, I don't think you're interpreting it correctly either.
Under GDPR I'm fairly certain that they've implemented actual deletion. GDPR even requires companies to go so far as to scan old tapes to delete user records on request by a 30-day deadline, iirc.
Good SRE practices suggest keeping data of certain sorts around for a while before the deletion is fully completed.
Google censorship is more equal than the other possibilities. They themselves have committed a lot of it lately, compared to, say, the owner having done it. Purely based on perception, without any proof, I'd say your last statement is unlikely.
"More equal"? The idea that YouTube deleted some random videos that no one has ever heard of and then lied about it is "more equal" than the idea that an organization whose stated position is basically "Google bad because censor" wants to make Google seem bad?
This was an academic channel conference that many people viewed. Perhaps not a popular channel for you but still worthy of existing.
I don’t think they’re questioning whether it’s worthy of existing, but rather why Google would choose to target this video specifically given that it has ~0 reach and the potential PR consequences of lying.
Sure, but in a sea of billions of views a day[0] YT isn't targeting specific channels - at this scale, there's going to be at least one moderation mistake for every million views (and that's being generous).

0: https://blog.youtube/press/
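
For a rough sense of scale, the arithmetic behind that comment can be sketched as below. The 1 billion views per day is an assumed lower bound for "billions", and the one-per-million rate is the commenter's own guess, not a measured figure; the function name is made up for illustration.

```python
def expected_mistakes_per_day(views_per_day: int, views_per_mistake: int) -> int:
    """Back-of-the-envelope count of moderation mistakes per day."""
    return views_per_day // views_per_mistake


# Assumption: 1 billion views/day, the low end of "billions of views a day".
VIEWS_PER_DAY = 1_000_000_000
# From the comment: roughly one mistake for every million views.
VIEWS_PER_MISTAKE = 1_000_000

print(expected_mistakes_per_day(VIEWS_PER_DAY, VIEWS_PER_MISTAKE))  # 1000
```

Under those assumptions that is on the order of a thousand mistaken moderation actions a day, with no targeting required.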