Hacker News
by ds 811 days ago
All the existing data broker removal tools are flawed because they rely on manual labor to remove you from sites, primarily done by people in third world countries.

We @ https://redact.dev are working on a pure software mechanism for doing these opt-outs directly from your own device. We already have full mass deletions for over 40 social media sites and utilities.

6 comments

I really dislike the trend of making everything a subscription service. I can imagine a niche market that wants to continuously delete content older than an arbitrary window, but isn't this the sort of service that most users would only need sporadically?

The pricing seems to implicitly acknowledge this: $35/m billed monthly vs $8/m billed annually! Would you really expect anyone to intentionally renew monthly? I can't argue that people forgetting to unsubscribe pays the bills, but as a business model it leaves a bad taste.
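To put rough numbers on that gap, here is a minimal sketch using only the two prices quoted above. It assumes the annual plan is charged as a single upfront payment of 12 × $8 and ignores taxes and discounts; the function names are hypothetical, chosen for illustration.

```python
# Prices quoted in the comment above (USD).
MONTHLY_PRICE = 35          # per month, billed monthly
ANNUAL_PRICE_PER_MONTH = 8  # per month, billed annually (assumed: 12 months upfront)


def monthly_plan_cost(months: int) -> int:
    """Total paid after `months` months on the monthly plan."""
    return MONTHLY_PRICE * months


def annual_plan_cost() -> int:
    """Total paid for one year on the annual plan."""
    return ANNUAL_PRICE_PER_MONTH * 12


def breakeven_months() -> int:
    """Smallest number of months at which the monthly plan costs more
    than a full year on the annual plan."""
    months = 1
    while monthly_plan_cost(months) <= annual_plan_cost():
        months += 1
    return months


print(annual_plan_cost())     # 96
print(monthly_plan_cost(12))  # 420
print(breakeven_months())     # 3
```

Under those assumptions, anyone who stays on the monthly plan into a third month has already paid more than a full year would have cost.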

Data brokers are like the hydra, one goes down and another 2 new ones pop up. It's a lot of work to keep on top of deletions if you want privacy.
Not really. There's a fairly small and stable number of companies that actually collect and resell information about you. There is also about a zillion ephemeral web front ends that republish this data, however. I suspect this is done for a reason, but a bit of sleuthing quickly reveals who the big players are.

These "data removal" services spend a lot of effort going after the frontends, which is pretty self-serving: they can show the customer that there's something new to remove every single month or quarter, so you have to keep paying forever.

“There is also about a zillion ephemeral web front ends that republish this data, however”

yes this is what I mean, you need to contact each one to have data removed… there are hundreds of these

What else could they do? They're working within a system that the government designed, and the government always designs things to keep people running on the hamster wheel.
Request deletion from backend brokers? Many have some mechanisms for opt-out, either in general or for people in specific states (e.g., California).
OK so if Optery reports 330 removals, how many removals did they actually have to do on their end? A hundred? Thirty? Ten? Why should we care? If you pay a man to remove the snow from your driveway, would you be upset if he used a plow rather than a shovel?
I don’t necessarily doubt you, but do you have any source for this, or in general any information on the landscape of data brokers?

It’s hard to imagine what the situation actually looks like behind the scenes.

there are a handful of large data brokers like LexisNexis, they’ll sell their scraped public data to anyone

random companies will buy the data, do a little collation and merge datasets from multiple sources, start their own frontend, and resell it to consumers doing Google searches for phone numbers, names, etc

because they’re not directly affiliated with the primary brokers, there are hundreds of these independent frontends… and unless you contact each one, they can all resell their data (even to each other)

so if you miss one removal, it’s possible your data gets picked up by a new frontend from that frontend… your data can kind of proliferate through this gross ether forever
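A toy model of that proliferation, as a rough sketch only: the site names, the resale graph, and the function `exposed_sites` are all hypothetical, and it assumes a site that honored an opt-out keeps suppressing the record afterwards.

```python
from collections import deque


def exposed_sites(resells_to, holders, opted_out):
    """Return every site that can end up with the record.

    resells_to: dict mapping a site to the sites it sells data to
    holders:    sites that currently have the record
    opted_out:  sites where a removal request succeeded (assumed to
                keep suppressing the record afterwards)
    """
    opted_out = set(opted_out)
    remaining = set(holders) - opted_out
    exposed = set(remaining)
    queue = deque(sorted(remaining))
    # Breadth-first walk along resale links from every remaining holder.
    while queue:
        site = queue.popleft()
        for buyer in resells_to.get(site, []):
            if buyer not in exposed and buyer not in opted_out:
                exposed.add(buyer)
                queue.append(buyer)
    return exposed


# Hypothetical resale graph: one broker feeding two frontends,
# one of which resells to a newer frontend.
resells_to = {
    "broker": ["frontend_a", "frontend_b"],
    "frontend_b": ["frontend_c"],
}
holders = {"broker", "frontend_a", "frontend_b"}

# Miss a single removal (frontend_b) and the record spreads again.
print(sorted(exposed_sites(resells_to, holders, {"broker", "frontend_a"})))
# ['frontend_b', 'frontend_c']

# Remove it everywhere and nothing is left to resell.
print(sorted(exposed_sites(resells_to, holders, holders)))
# []
```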

This explains some trends where posts are being edited on Reddit with nonsense then deleted. Personally, I think this kind of behavior makes the web poorer as a knowledge base. Yes you have a right to do it with your own content, but doing it at scale makes the internet a less useful tool and it makes me a bit sad since the scrapers will already have the data anyway.
Those are mostly in response to reddit's API changes. By editing the comments before deletion, the archives also get wiped and it takes a bit more effort for reddit to restore deleted comments behind users' backs.

Yes, it makes the web poorer as a knowledge base, but it's in response to companies like reddit ruining the internet by baiting in users, changing the agreement and then trying to keep the content that was written under the previous agreements.

Hopefully it just makes sites remove the ability to edit or delete things once they've been published. Especially forums where things have been referenced by other things.

As much as I routinely fine-tune and fix up a comment after initially writing it, I will happily go back to the old days before such ability became common, in trade for the sanity of references that don't disappear or change meaning after the fact. The typos don't hurt as much as the Swiss cheese and schizo conversations.

A good compromise in the meantime would be the Internet Archive. A lot of useful data is preserved there.

This made me curious about archivist ethics: https://www2.archivists.org/statements/saa-core-values-state...

> Privacy: Archivists recognize that privacy is an inherent fundamental right and sanctioned by law. They establish procedures and policies to protect the interests of the donors, individuals, groups, and organizations whose public and private lives and activities are documented in archival holdings. As appropriate and mandated by law, archivists place access restrictions on collections to ensure that privacy and confidentiality are maintained, particularly for individuals and groups who have had no voice or role in collections’ creation, retention, or public use. Archivists should maintain transparency when placing these restrictions, documenting why and for how long they will be enacted. Archivists promote the respectful use of culturally sensitive materials in their care by encouraging researchers to consult with those represented by records, recognizing that privacy has both legal and cultural dimensions. Archivists respect all users’ rights to privacy by maintaining the confidentiality of their research and protecting any personal information collected about the users in accordance with their institutions’ policies.

Personally I think we need the ability to delete more, not less.

Yes, I do see the irony of writing that here. :'(

The problem with the wholesale deletion of comments is that it also affects other people. For example, if we have a back-and-forth constructive conversation here and one of us deletes all their comments, then the value of the other person's comments is diminished, and sometimes they even become incomprehensible.

It's pretty clear you're putting something in the public when you're commenting on HN; this isn't a surprise and nothing is done surreptitiously. If you contribute to a debate in some TV discussion programme then you can't have that deleted later either.

And there are options without wholesale deletion: specific comments can be deleted or edited for specific reasons, and your account can be "soft-deleted" by changing your username to something random.

If you want to have more ephemeral temporary conversations then that's fair! But HN is not the right platform for that, IMHO.

that's not actually a flaw

a real flaw is that companies in this niche are actually centralizing data to re-sell while adding a new line in the dataset that says "wanted to remove their data footprints"

easyoptouts.com, which I work on, doesn't use manual labor - everything is automated! It's definitely important to avoid giving more people access to the data.

edit: and as a result of automation, our prices are also way lower than most similar services

Many data brokers make it very difficult to remove your info, on purpose, of course. That is why the legit removal providers have to rely on manual labor for some of them. I'd love to see it fully automated, but I'll believe it when I see it. Last I checked, Optery was removing 325+. Best of luck; you have a long way to go.

Edit: this looks like a totally different service. Mass deletion of old posts is one thing, removing PII from data brokers is another.

In other words, would you describe your site as the Gillette razor attachment mechanism of online data deletion?