Who else feels a sense of horrible dread and frustration that every minutia of my (and your) online activity is recorded for eternity and exploited to the fullest extent?
Oh, yes, the algorithms supposed to target you specifically are often quite bad and annoying. For example on Amazon, when you just bought a coffee machine, your recommendations will show you a bunch of coffee machines. Well, I just made a choice and ordered a coffee machine, I am not going to buy another one for quite some time.
Bad for you != bad for the advertiser, and bad for the advertiser is not necessarily correlated to annoying to you.
While your particular case may need just one coffee machine, there are other scenarios. If you just bought a coffee machine, perhaps it will be defective and you'll need a replacement. Perhaps you didn't know about the one they're showing you and will return the first one to buy this one. Perhaps you are just getting into this coffee stuff and realize that a second one for the office would be nice. Perhaps you're buying a few to compare them.
It could be rational to show you the ad instead of a random individual if the combined probability of these scenarios is greater than the rate at which random individuals buy coffee machines.
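To put some numbers on that comparison, here is a minimal sketch in Python. Every probability in it is invented purely for illustration (nothing here is measured data), the names are hypothetical, and the scenarios are assumed to be mutually exclusive so that their probabilities can simply be added.

```python
# Hypothetical probabilities that someone who just bought a coffee machine
# buys another one soon. All numbers are invented for illustration.
recent_buyer_scenarios = {
    "replace_defective_unit": 0.010,
    "swap_for_better_model": 0.005,
    "second_machine_for_office": 0.005,
    "buying_several_to_compare": 0.002,
}

# Assumed rate at which a random visitor buys a coffee machine.
random_visitor_rate = 0.004


def recent_buyer_rate(scenarios):
    """Sum the scenario probabilities, treating them as mutually exclusive."""
    return sum(scenarios.values())


def worth_targeting(scenarios, baseline):
    """True if the recent buyer is more likely to purchase than a random visitor."""
    return recent_buyer_rate(scenarios) > baseline


print(round(recent_buyer_rate(recent_buyer_scenarios), 3))
print(worth_targeting(recent_buyer_scenarios, random_visitor_rate))
```

With these made-up inputs the recent buyer comes out ahead of the random visitor, but the conclusion flips as soon as the baseline rate exceeds the summed scenarios, which is the whole point of the comparison.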
Or, yeah, it could be that their ad-showing algorithm just tells them you have the word "coffee" in your recent browsing history.
Targeted advertising isn't about giving you good recommendations. It's about advertisers picking which demographic they want to influence. It may be presented as a "recommendation", but it's still an ad.
Not really horrified. Yes they save it, but they don't seem to be able to use it well. The ads I get on Facebook are sometimes really interesting (i.e. targeted well), on other pages it's mostly stuff I bought a while ago or topics where I don't even have a clue why they think that could be interesting.
And even though they can save it forever, it's outdated pretty quickly. If someone has information that I clicked a button 5 times when I visited a page in '09, this will have zero value now. The information is only valid for hours or days.
I guess most of this information is used to test behavior on pages and to optimize them. And I can only support that. Tracking like this is the reason why popular pages are intuitive. They perform A/B testing extensively to see what works best. I don't see an issue with that.
But the goal of that optimization is not to make your life better, it's to make money for them.
They may be pretty bad at it now, but one thing about technology is that it gets better and better, and their goal is to encourage consumption of their product.
I'm thinking about buying a new laptop right now. I'm trying to make a smart choice, weighing the funds I have available, my need for it, the options available, my preferences for various brands and features, and my own desire to have new shiny stuff.
The advertisers can distort this rational decision. Right now they just have a poorly-targeted generic bit of text that I don't really read in the sidebar of certain websites, and I feel comfortable with this level of influence. If they were super-persuasive at selling their product to me, Present Me would consider that theft. Future Me would probably be grateful to the advertiser, and that's terrifying.
A Self-Driving, Self-Selling Tesla might show up at my door, perform an inspection of my current car, and, in its silky voice, deliver an irrefutable argument why I must never get in that car again and should instead hop in for a free ride across state lines so I can take out a home equity loan and cash in my 401k to buy it. Yikes!
The optimistic side of this is that maybe they'll eventually move beyond market research and on to individual research to give us stuff that we actually want. "Oh, LeifCarrotson is filtering out our 1366x768 TN panels, doesn't seem to care about thickness, has recently read about the Samsung 960 Pro? Let's build him one with a big 9-cell battery and longer travel keyboard, a good screen, and one of those SSDs. And he seems to be running Linux? Let's swap our default touchpad for one with an open driver, and donate a few percent of the profits to the EFF, that's sure to make him happy."
> But the goal of that optimization is not to make your life better, it's to make money for them.
I don't pay to use services online. They somehow have to make money. Until we start donating/paying for each service we use, we have to expect that people will make money otherwise.
I'm not even talking about the free services, though.
I am talking about the ordinary consumer goods manufacturers that buy the ad space that may or may not be sold on free online services. They're the ones who want that ad space, who benefit from the targeting, and who make the whole operation work.
How they make money, how they secure and anonymize the data, and with whom they share it are all important considerations that are almost never divulged (there are exceptions).
You are implicitly assuming that this won't change. Databases persist indefinitely.
> The information is only valid for hours or days.
This simply isn't true. You seem to be thinking only about single data points, not the entire picture of your life that is painted when you aggregate all of the captured data. For example, timestamps of your clicks build a pattern of when you use your computer, and the domain names you have visited (including the order in which you visited them) probably give a reasonably accurate estimate of your political views, personal beliefs, and other data that you haven't shared on the web. (Bayesian analysis, machine learning, and other modern analysis methods do amazing things with minimal data.)
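As a rough illustration of how little work the aggregation step takes, here is a minimal Python sketch. The click log, the domain names, and the function name are all hypothetical, and this is plain counting rather than the Bayesian or machine-learning analysis mentioned above; the point is only that individually trivial events add up to a pattern.

```python
from collections import Counter
from datetime import datetime

# Hypothetical click log: (ISO timestamp, domain). Invented for illustration.
clicks = [
    ("2017-03-06T07:45:00", "news.example"),
    ("2017-03-06T23:10:00", "forum.example"),
    ("2017-03-07T07:50:00", "news.example"),
    ("2017-03-07T23:30:00", "health.example"),
    ("2017-03-08T23:05:00", "forum.example"),
]


def activity_profile(log):
    """Count clicks per hour of day and per domain across the whole log."""
    hours = Counter(datetime.fromisoformat(ts).hour for ts, _ in log)
    domains = Counter(domain for _, domain in log)
    return hours, domains


hours, domains = activity_profile(clicks)
print(hours.most_common(1))    # the hour of day with the most activity
print(domains.most_common(2))  # the most frequently visited domains
```

Five rows already suggest a morning news habit and late-night browsing. A real log would have many thousands of rows per person, and the same two counters would still be only a few lines of code.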
> I guess
It might be a good idea to not base your risk analysis on a guess. If you have no other option and have to guess the level of risk, you should assume the worst. Assuming benevolence (or incompetence) without evidence is incredibly foolish.
From personal experience as a DBA: the effort usually consists of not deleting rows out of the database. The expense is a few cents a month for gigabytes of data.
Worst case scenario, the companies will simply set up a separate data warehouse style data store, and make it available to anyone who wants it internally. If we as consumers are lucky, they will scrub the data of PII before moving it to the data warehouse.
They will probably put it in some archive where they delete it after 10 years. Or they keep it somewhere on a drive where it could be accessed but will never be, since no one has a reason to use the data.
Or they are intentionally dialing back on the accuracy.
Being too accurate in your advertising is like delving into the uncanny valley of CGI. It gets spooky and makes people intentionally shy away from your products.
So, perhaps they're following the same trajectory as CGI: adding intentional inaccuracy to mask their actual targeted ads. If I saw the one thing I wanted in a lineup of four other poorly targeted ads, I'd be less likely to consider it as spooky, and more likely to treat it as I would any other advertisement. These companies follow you around the internet for years, after all. They can afford to play a long game.
> too accurate in your advertising is like delving into the uncanny valley
This isn't speculation - it's standard practice now in some companies to try to avoid "scaring" the customer with something that reveals how much modern advertising looks like a stalker. A well-known example is Target, which discovered[1] that it could predict pregnancies very early and very accurately:
> At which point someone asked an important question: How are women going to react when they figure out how much Target knows?
> “If we send someone a catalog and say, ‘Congratulations on your first child!’ and they’ve never told us they’re pregnant, that’s going to make some people uncomfortable,” Pole told me. “We are very conservative about compliance with all privacy laws. But even if you’re following the law, you can do things where people get queasy.”
The article then tells the story of the time an angry father stormed into Target after the company had sent ads for maternity clothing and nursery furniture to his high-school-age daughter. He was angry at what he thought was an attempt to coerce his daughter, but later apologized when he discovered that Target was right.
We now live in a world where businesses can infer significant attributes like pregnancy from subtle changes in their customers' buying patterns. When stalker-like behavior is built into modern business models, the smarter businesses realize that most people will hate you if you act like a creepy spy.
I sometimes wonder if I should be more concerned with privacy and my data than I am. I don't engage in risky behaviors, and use mailinator as much as possible. But I also have nothing to hide, and rarely share anything critical apart from cc# with amazon. If they want to know how often I show up and track events, I couldn't care less. I suppose it's good more people know what they're capable of. I know people who think they need to go Snowden when their daily life is just finding answers on Stack Overflow and watching Joe Rogan on YouTube.
You do know that the ad won't necessarily be relevant to you, right? The ads will be "relevant" according to the advertisers who think you're the kind of target they want to trick into buying their product.
Regarding that "minor event": it is minor in isolation, but the point is that you are generating a large amount of those events that are being aggregated into databases where they are unlikely to be deleted. The ad(s) you might get isn't important. What you should be concerned about is the detailed pattern-of-life analysis that can be done at any time in the future by anybody that buys a copy of that database.
>You do know that the ad won't necessarily be relevant to you, right?
Perhaps, but I don't really see the difference in the end. Anecdotally, I see ads for IntelliJ IDEA because I look up a lot of coding-related things. So does it matter that I'm getting this ad because Google's ad network has decided I like coding, or because IntelliJ selected some parameters and said "send this ad to coders"?
The goal of either operator would be to send me an ad they think I'm most likely to click on, and so from the user perspective it's indistinguishable where the ad is actually directed from.
>Regarding that "minor event": it is minor in isolation, but the point is that you are generating a large amount of those events that are being aggregated into databases where they are unlikely to be deleted.
Alright, here's the thing. I don't love the idea of tracking but I'm not repulsed by it either. I always see "tracking is evil!" as if it's the final say on the matter, especially on this website. But I've yet to see a convincing argument that it's something which I should be actively concerned about, or that it's something making my life worse in any way.
Often what I see is people using emotional words like "surveillance", when really we're talking about a computer algorithm that matches ads with interest groups. It inspires imagery of somebody watching you through your computer, which I suspect is the point. This sort of language strikes me as hyperbolic, and in some cases dishonest.
I realize this is outside the scope of your original comment, but understand that when making claims like "tracking is evil", or some variation of that view, unless there's something tangible to point to and say "this is how it makes your life worse", it just doesn't register on my radar.
It might be easy to dismiss that view as short-sighted, but ultimately I consider it more pragmatic than taking ethical stances on what seem to be largely speculative concerns. Why would an ad company sell user data when that's their entire competitive advantage? In the case of Google (and I suspect most others), their privacy policy explicitly prohibits them from selling user data.
Ultimately I find that technology improves my life in many ways, and I try not to fear it unless I see a real cause for concern. And on this particular issue I haven't seen that yet.
They're used to develop accurate psychological profiles, which are then exploited by government agencies, data aggregators, etc.
I find the surveillance economy worrying because, while it makes the market efficient, it makes government too efficient and allows for bad behavior on the part of marketers (e.g., targeting addiction-susceptible people via "machine learning" for deniability).
Leaking my personal information to pay for websites isn't economics I like.
First of all, it might not be about you. If you are privileged enough to not have to worry about being the target of prejudices, hatred, or the occasional witch hunt, then you might not see the necessity of keeping secrets. Some people are not that lucky.
> making my life worse
It probably isn't making your life worse at the moment. Right now we have barely scratched the surface of what is possible. Most of the current uses are fairly benign (such as ads). The concern isn't about current uses. The problem is the open-ended risk of anyone abusing that data in the future.
You are essentially making the bet that either 1) nobody will ever invent a use of your data that is harmful to you, or 2) nobody will ever abuse your data, or 3) that your data won't actually live forever. Clause 1 is already broken in some areas, clause 2 denies human nature, and clause 3 has a lot of evidence suggesting data rarely disappears.
> something tangible
Ok, let's consider insurance companies and/or banks. These businesses would really like to get their hands on more data that could give them excuses to raise your rates or deny your loan or insurance coverage. Sure, we have laws and regulations that theoretically prevent some types of data from being used. The legal situation becomes less clear when none of the prohibited data is used directly but it can be inferred from other types of data that are.
If you think this is a theoretical concern, then you need to read about the deplorable practice known as "redlining"[1], where data was used as a cover for racial hatred and forced segregation. We already see problems with various types of data being used in police work and judicial situations where certain combinations of "unrelated" data are actually a reasonable proxy for race.
Can you say with certainty that a future insurance company won't be able to take all of the data points you've been generating in ad networks - with absolutely no "PII" - and find some pattern in your history that can be used as a reason to raise your rates? Or deny coverage? This is only one stupid example; it will be a lot more subtle as we learn new analysis methods and creative data manipulation methods.
However, you asked for something tangible, so you should look at this[2] map of Amsterdam. Each black dot on the map represents 10 Jews. The Nazis commissioned this map from the local civil servants that managed the census data. I doubt those civil servants thought that their recent change to the census to include a question asking for religious affiliation could ever be dangerous. Three-quarters of the people represented by the dots on that map were murdered in the camps. Yes, this is an extreme example. I hope ad tracking data won't end up being used for that level of evil. Unfortunately, there are a lot of possibilities between "serving an ad" and "genocide", even though they are both data problems.
> Why would an ad company sell user data when that's their entire competitive advantage?
Selling data can become another source of revenue if the company has significant financial problems. Given the recent-ish trend of companies to agglutinate into a single power (or small group of powers), the transfer of data might be "internal" instead of a sale. Also, you're assuming it would be the ad company's choice; bankruptcy courts may see the data as a valuable asset to be liquidated, and governments may simply take the data using various methods.
>you might not see the necessity of keeping secrets
I wanted to address this first. I'm not a believer in "you have nothing to hide if you're not doing anything wrong", so I can appreciate the argument that some people may be more vulnerable than others: for instance, whistleblowers who may need to maintain anonymity in all situations.
In these cases, however, I believe some responsibility lies with those at risk to opt out via the appropriate settings, or to avoid using services that require tracking. Similarly, responsibility exists for the companies involved to make those opt-outs accessible, and to not use misleading language.
>Ok, lets consider insurance companies and/or banks. These businesses would really like more to get their hands on data that could give them excuses to raise your rates, deny your loan or insurance coverage.
As you said, laws do exist to protect the user against discrimination in these cases. If there are ways for companies to route around them then I'm not familiar with them, but I would imagine that doing so opens them up to the potential of being heavily fined.
>If you think this is a theoretical concern, then you need to read about the deplorable practice known as "redlining", where data was used as a cover for racial hatred and forced segregation.
This is a good example, and certainly drives your point home. I agree it's absolutely a concern how data is collected in cases such as these. That said, open data can also be used for good. Consider medical studies that can look at entire populations for trends, or data that can help inform governments about the pain points in their region. My takeaway is that we need to be very careful about how data is aggregated and anonymized to avoid this sort of targeting.
So on that point I don't necessarily disagree with you, but my stance is that we shouldn't throw the baby out with the bathwater. Like any tool, data collection can be used for good or evil. We should be concerned with how we enable its use for evil rather than demonize the tool itself.
It's true that everything that can be recorded will be recorded, but the actual way to not draw attention to yourself is to let everything be recorded. NoScript/Ghostery/etc users are actually the outliers. Similar to what happens with Tor.
I can find relief in the fact that it's extremely unlikely someone will look for your specific data because of the sheer volume of it. Website owners will just look at some charts and metrics in their analytics platform and that's it.
If the fullest extent to which they can exploit my activities is to show me mostly irrelevant shit that I'll never buy, then I'm not worried about it. It'd be like worrying about all those 'psychics' that pretend to know what you're thinking by repeating what you've just said back at you. Whatever power they find there is an exploitation of other people's careless ignorance, which is something I do dread on occasion.
I think, historically, most people have lived in very tight quarters, quite close to a lot of other people. Indeed, that's still true for many of us!
Some of these very 'always nearby' people would be family members and/or greatly trusted, but even one layer beyond that would be a lot of people who we would consider acquaintances today.
My question: isn't it possible that the amount of privacy some people had in the last couple of centuries was rather anomalous? That our default level of privacy has always been very low?
Edit: To be clear, I do not intend to minimize the value of privacy, whether it's a recent thing or not. However, some historical context is, I believe, useful.
To state it another way: isn't it possible that the amount of freedom some people had in the last couple of centuries was rather anomalous? That our default level of freedom has always been very low?
I would posit both your and my "questions" are true. Does that truth make it any less worth fighting for? One might even argue, there is no such thing as freedom without privacy.
This is true for many things though. Isn't it true that the level of healthcare available recently is rather anomalous, historically speaking? Or the quantity of food available?
Technology can grant us advantages if we allow it. It can also take them away.
> Most of the web is broken with NoScript installed.
More accurately, most of the web is broken, whether or not NoScript is installed — it's just that you can see that it's broken when NoScript is installed.
I consider it a feature, more often than not. Google anticipating my search, Flash-heavy eye candy and media, and numerous unaffiliated servers baked into web pages all clog my 1mb connection and make my browser run like molasses. How many Targetimg CDNs does it take to display a single product page? Last I checked it was 4, plus a half-dozen or more other servers, presumably for ads. I choose not to use many big retailers' sites because of it. Again, for me, that is a feature to keep them at bay from anything beyond our potential transactional relationship.
99% of the distraction, tracking, and annoyance of the modern web is due to JS. Blocking it will not "break most sites"; you can usually read just fine. Selectively allowing JS per domain works great. Try it, you might be surprised and not talk down to us anymore.
Maybe all you visit is blogs. But there are lots of websites that rely on JS in order to express their intentions. It's like watching a movie on mute because the English accents bother you, and without the possibility of subtitles.
PS - how come all website product recommendations still suck? Data collection is rather ahead of data usage it seems...