by belval 1730 days ago
Eh Minecraft is a video game, I just don't see the privacy risk with "typical" Windows-level telemetry happening there.

At some point we have to accept that this data actually helps them fix real issues too, they're not monetizing their players directly with it.

12 comments

>At some point we have to accept that this data actually helps them fix real issues too

It is strange that this community of all communities has such a resistance to this idea. Software developers should know better than anyone how difficult it is to identify and fix vague software problems without having specific details about the problem. Yes, there is a negotiation between the value of telemetry and privacy and often too much privacy is sacrificed. But I am always surprised to hear developers say all telemetry is bad.

There are at least two distinct issues here this community is concerned about:

1. Telemetry is unethical if the users didn't provide informed consent (opt-in) for it. It's not just a theoretical point - anyone who's worked in the tech sector for a while should know most companies cannot be trusted to behave ethically (especially if they took VC funding).

2. There's a certain dysfunction/antipattern that's popular in tech sector, called being a "data-driven company". It's the practice of making decisions through divination from data collected through extensive telemetry, to the exclusion of other knowledge sources (like e.g. actually talking to your users, hallway testing, or thinking things through). This leads to software being optimized in questionable directions - so in a sense, you could say that adding telemetry implies an increased chance the software will become worse over time.

Worse in a really insidious way too. Optimizing based on engagement, for instance, could be maximizing the amount of time people spend using a service while minimizing unseen variables people care about, like their emotional state while engaged. It’s sort of an inevitable thing that optimizers do to things they don’t measure, and it’s an extremely difficult problem to put everything people care about into the equation. Like for instance, think of how horrible a polynomial fit gets for a function just outside the window you are fitting as you add more terms.
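To make the polynomial analogy concrete, here is a minimal sketch. It swaps a least-squares fit for exact Lagrange interpolation (the polynomial that passes through every sample), and the test function, the window [-1, 1] and the probe point x = 1.2 are all assumptions chosen for illustration. It compares the error just outside the window as more terms are added.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the unique polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def runge(x):
    # Illustrative smooth function (Runge's classic example), not from the thread.
    return 1.0 / (1.0 + 25.0 * x * x)


def extrapolation_error(num_points, x_outside=1.2):
    # Equally spaced samples inside the fitting window [-1, 1].
    xs = [-1.0 + 2.0 * k / (num_points - 1) for k in range(num_points)]
    ys = [runge(x) for x in xs]
    return abs(lagrange_eval(xs, ys, x_outside) - runge(x_outside))


errors = {n: extrapolation_error(n) for n in (3, 5, 7, 9, 11)}
for n, err in errors.items():
    print(f"degree {n - 1:2d}: error at x=1.2 is {err:.3g}")
```

Inside the window every polynomial matches the samples exactly, so the measured objective looks perfect at each degree. The damage only shows up at the point nobody measured.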
> But last week, Facebook revealed that it had manipulated the news feeds of over half a million randomly selected users to change the number of positive and negative posts they saw. It was part of a psychological study to examine how emotions can be spread on social media.

Doing such a study would be the first step in optimizing for a good emotional state. But it (quite understandably) led to an outcry which stopped it dead in its tracks.

https://www.nytimes.com/2014/06/30/technology/facebook-tinke...

Sentiment analysis will tell you people have positive feelings about cute animals and negative feelings about friends getting cancer. An extremely difficult problem to put everything people care about into the equation.
I think it's way more complicated than that. Like you could measure how much better someone feels after seeing cute animals. Short term happiness isn't the only goal and needs to be traded off against other things, such as informing the user about what their friends are doing (you probably do want to know if they get cancer even if learning about it is sad), trending topics so you are clued into what everyone else is talking about, and of course ads. Learning more about the numbers allows better tradeoffs.
> Telemetry is unethical if the users didn't provide informed consent (opt-in) for it.

And if the only way to opt out is not use the product. Or stop using it particularly.

>There's a certain dysfunction/antipattern that's popular in tech sector, called being a "data-driven company".

This is sarcasm. Right?

I (like most here) have personally benefitted from extensive software telemetry but am generally against it in my personal life. But I also write business software where the expectation of privacy is already moot.

Gaming is a weird case, though: telemetry is already in heavy use, the privacy risk is indeed minimal, and gamers don't seem to care anyway. I can't tell you how many GDC talks I've watched that discussed player heatmaps, incident (like death) reports, and whatnot used to fine-tune game balance.

There are numerous places where telemetry is completely inappropriate, like one's operating system. An idle computer should indeed be 100% idle, internal housekeeping exempted. (I recently installed FreeBSD on a new server, did some setup, and basked in the glory of htop showing 32 cores at 0.0% and a root process list that was under a page long. I wish other operating systems could follow that example.)

> FreeBSD on a new server

It always makes me happy to see how short the list returned from `ps aux` is with FreeBSD. Whereas if you go onto a typical Linux box (even just a raspberry pi!) and run that, you get at least a screenful of processes doing who knows what. (Just my small experience.)

This is also the experience with Ubuntu on Windows WSL:

  $ ps aux
  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
  root         1  3.5  0.0   8944   332 ?        Ssl  11:38   0:00 /init
  root         7  0.0  0.0   8944   228 tty1     Ss   11:38   0:00 /init
  dando        8  1.7  0.0  16804  3396 tty1     S    11:38   0:00 -bash
  dando       32  0.0  0.0  17392  1916 tty1     R    11:38   0:00 ps aux
A strange irony that the most purist Unix-like Linux is to be found in the belly of the Windows beast, for the equivalent of those hundreds of processes doing who knows what on a typical native Linux install are instead all in Windows Task Manager.
Tip: If a process is a service-host, run `tasklist /svc` and that way you can actually know what's going on in each process.
If you install a server distro, the number of default processes is pretty minimal. Run GNOME on FreeBSD and you'll get the same huge list of processes.
Not so sure about that. My last fresh install of Ubuntu Server (admittedly like 4-5 years ago now) had several cores hovering in the 20-30% range while idle, and nothing was installed on the server yet. "Ooop, guess I'm going back to Debian"
> If you install a server distro, the number of default processes is pretty minimal.

It depends on which processes. If you include kernel threads (which show as processes), the number of processes can get pretty huge, especially on many-core servers (several of these kernel threads are per-core).

I don't know whether on FreeBSD kernel threads show up as separate processes; if they don't, it might explain part of the difference.
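One rough way to check how much of a Linux process list is kernel threads: procps `ps` prints the executable name in square brackets when a process has no command line, which is typically the case for kernel threads. The sketch below separates the two groups on that convention. The sample listing is made up for illustration, and the bracket test is a heuristic rather than a guarantee.

```python
def split_ps_commands(ps_output):
    """Split `ps aux` text into (kernel_threads, user_processes) by COMMAND."""
    kernel, user = [], []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        # `ps aux` has 10 fixed columns; everything after them is the command.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        command = fields[10]
        if command.startswith("[") and command.endswith("]"):
            kernel.append(command)
        else:
            user.append(command)
    return kernel, user


# Hypothetical sample output, not captured from a real machine.
SAMPLE = """
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 168000 11000 ? Ss 10:00 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S 10:00 0:00 [kthreadd]
root 12 0.0 0.0 0 0 ? S 10:00 0:00 [ksoftirqd/0]
root 18 0.0 0.0 0 0 ? S 10:00 0:00 [ksoftirqd/1]
root 640 0.0 0.2 72000 6000 ? Ss 10:00 0:00 /usr/sbin/sshd -D
"""

kernel_threads, user_processes = split_ps_commands(SAMPLE)
print(len(kernel_threads), "kernel threads,", len(user_processes), "user processes")
```

On a real box, feeding it the output of `ps aux` would show how much of the "screenful" is per-core kernel threads rather than daemons.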

I totally agree. Telemetry is invaluable to making software better.

Transparency is key here. If projects explained the steps taken to anonymize the data (either provably using Differential Privacy, or approximated via some other means), I feel like people might trust them more. Even with DP, though, the server does see IP addresses, even if it doesn't know what the telemetry is, and that alone might cross the line for some people. Even if the project promises to not log them.
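As a rough illustration of the "provably anonymized" idea, here is a minimal sketch of randomized response, one of the simplest local differential privacy mechanisms. The 50/50 probabilities, the 30% true rate and the population size are assumptions for the example. Each client adds noise before reporting, so no single report reveals that user's true value, yet the aggregate rate can still be estimated.

```python
import random


def randomize(truth, rng):
    """Report the true bit half the time, otherwise a fair coin flip."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5


def estimate_rate(reports):
    """Invert the noise: P(report is True) = 0.5 * rate + 0.25."""
    mean = sum(reports) / len(reports)
    return 2.0 * (mean - 0.25)


rng = random.Random(0)  # fixed seed so the run is repeatable
true_rate = 0.30        # e.g. the share of users who hit some bug (assumed)
population = [rng.random() < true_rate for _ in range(100_000)]
reports = [randomize(bit, rng) for bit in population]
estimated = estimate_rate(reports)
print(f"true rate about {true_rate:.2f}, estimated {estimated:.3f}")
```

With these probabilities a true "yes" is reported as "yes" 75% of the time and a true "no" 25% of the time, which corresponds to epsilon = ln 3. It does nothing about the IP address problem mentioned above, since the server still sees where each report came from.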

The problem with explanations and promises is that there's a lot of bad water that has flowed under those bridges. They amount to just saying "trust me".

As a dev, telemetry makes me nervous because so many projects rely on it too heavily and make bad design decisions because the telemetry blinds them.

Absolutely. There are steps that software can take to build this trust, though, like

- Have a screen that shows all telemetry that is going to be or has been sent

- Ask for permission to send any telemetry, or certain types of telemetry (e.g. crash reports)

- Publicly share the collected telemetry.
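The first two points in the list above could look something like this. It is a minimal sketch, and `TelemetryOutbox` and its method names are hypothetical, not taken from any real project. Events are queued only for categories the user opted in to, the queue can be shown on a review screen before anything leaves the machine, and everything sent is kept in a visible log.

```python
from dataclasses import dataclass, field


@dataclass
class TelemetryOutbox:
    consented: set = field(default_factory=set)   # categories the user allowed
    pending: list = field(default_factory=list)   # queued, not yet sent
    sent_log: list = field(default_factory=list)  # everything that was sent

    def grant(self, category):
        self.consented.add(category)

    def record(self, category, payload):
        """Queue an event only if the user opted in to its category."""
        if category not in self.consented:
            return False
        self.pending.append({"category": category, "payload": payload})
        return True

    def review(self):
        """What a 'show me what will be sent' screen would display."""
        return list(self.pending)

    def flush(self, send):
        """Send queued events through `send` and keep a visible log."""
        for event in self.pending:
            send(event)
            self.sent_log.append(event)
        self.pending.clear()


outbox = TelemetryOutbox()
outbox.grant("crash_report")
outbox.record("crash_report", {"signal": "SIGSEGV"})  # queued
outbox.record("usage", {"screen": "settings"})        # dropped: no consent
delivered = []
outbox.flush(delivered.append)  # `delivered.append` stands in for a network call
```

The third point, publishing the collected data, is a server-side policy and is not covered here.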

> I totally agree. Telemetry is invaluable to making software better.

Can you give one example of a program that was improved by using telemetry? And no, trashing the UI in the name of change or modernism does not count.

Thank you.

We introduced some opt-in telemetry in dolphin-emu.org several years ago. I remember several times we discovered things we would likely have completely missed otherwise:

- We found out by looking at the distribution of software version that we had a strong holdout on one specific revision. It turns out we had a regression in a niche feature which was very important to a sub-community of our users, and users were basically telling each other to just use that old version. No bug report was filed until we found out via analytics and asked.

- We have a "game quirks" mechanism where the emulator reports weird edge cases that happen very rarely. Current list: https://github.com/dolphin-emu/dolphin/blob/master/Source/Co..., example usage: https://github.com/dolphin-emu/dolphin/blob/ffdc8538a162b1ca... . We used this to find games that use currently unimplemented or stubbed features.

- The list of popular games being played on the emulator was extremely surprising because it turns out there's a huge disconnect in what most NA/EU players are playing and what JP players are playing. This led to us adding a bunch of new games to the list we regularly test for performance and stability regressions. Would you have guessed that Inazuma Eleven GO: Strikers 2013 is in the top10 of emulated games on Dolphin?
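The version-holdout finding in the first bullet comes down to a simple aggregate. Below is a minimal sketch under assumed data: the version labels, the 10% threshold and the function name are hypothetical, and this is not Dolphin's actual analytics code.

```python
from collections import Counter


def find_holdouts(reported_versions, latest, min_share=0.10):
    """Return old versions that still hold at least `min_share` of installs."""
    counts = Counter(reported_versions)
    total = sum(counts.values())
    return {
        version: count / total
        for version, count in counts.items()
        if version != latest and count / total >= min_share
    }


# Made-up distribution: one old revision keeps a suspiciously large share.
reports = ["r100"] * 25 + ["r180"] * 5 + ["r200"] * 70
holdouts = find_holdouts(reports, latest="r200")
print(holdouts)  # {'r100': 0.25}
```

A spike like this says nothing about why users stay on the old revision. As the comment describes, the developers still had to go and ask.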

This is a fascinating answer that has persuaded me to your side. I recall opting out of Dolphin telemetry because I simply couldn't be bothered to check what would be sent, but seeing not only examples of what data is sent but also how it's used in such a positive way will definitely have me turning the telemetry on next time I go to use it.
We have a privacy policy which describes what we collect in a bit more detail, see https://dolphin-emu.org/docs/privacy/

https://github.com/dolphin-emu/dolphin/blob/master/Source/Co... is the actual code which collects most of the information. We do multiple things to avoid being able to track user activity too much -- for example, while every instance of Dolphin has a unique ID so we can do things like unique counts, events that happen within a play session are associated to truncated_hash(unique ID + game ID) and not directly with the unique ID. This means that we can only correlate events from the same user playing the same game, but not* two events from one user playing different games.

* Our implementation is a bit weak given that the set of all gameids is small and enumerable. We could probably do better there.
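A minimal sketch of the truncated_hash(unique ID + game ID) idea and of the weakness noted in the footnote. The hash function, separator, truncation length and example IDs are all assumptions, and this is not Dolphin's actual code.

```python
import hashlib


def session_key(unique_id, game_id, length=16):
    """Truncated SHA-256 of unique ID + game ID, as a hex string."""
    digest = hashlib.sha256(f"{unique_id}:{game_id}".encode("utf-8")).hexdigest()
    return digest[:length]


def link_games(unique_id, known_game_ids, observed_keys):
    """The weakness: with a small, known game list, keys can be enumerated."""
    return [g for g in known_game_ids if session_key(unique_id, g) in observed_keys]


# Hypothetical IDs for illustration only.
key_a = session_key("user-1234", "GAME01")
key_b = session_key("user-1234", "GAME02")
print(key_a != key_b)  # events from different games carry unrelated keys

# Anyone who holds the unique ID can recompute every candidate key.
linked = link_games("user-1234", ["GAME01", "GAME02", "GAME03"], {key_a, key_b})
print(linked)
```

The keys look unrelated on their own, but because the game list is small, recomputing the hash for each candidate game recovers the link.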

The word “telemetry” can mean a lot more than “detailed usage data to inform design decisions”

I’m talking extremely basic things, like, what are the most popular crashes in the app? Which did I introduce in the most recent version? Why is there an increase in end to end latency in fetching data from the server? Stuff that would fall under “bug fixes and improvements” that you would likely not notice.

The word telemetry became common when it became more than optional crash reports and update checks.

The last example sounds detailed enough. And some privacy considerations are moot when the app is a client for your server.

Windows.
You mean that calc.exe opening in 3 seconds is better than instantly? Or that the disappearing ribbon and title bar on windows in a multi-monitor setup is better than before? Or the newly designed shutdown menu? Or the white fonts on a light background? Or disappearing scrollbars, because someone thought that was a good idea? Or the new save dialog, where you need 3 clicks just to be able to select the folder where you want to save?
Windows went downhill fast since introduction of telemetry.
Windows 10 is better than any previous version of Windows. So how did it go downhill?
Windows has been getting worse for the last 3 or so generations of Windows.
Windows 10 is better than any previous version. I mean my Windows laptop is far more stable than any distro I run on my desktop and work laptop.
How? I mean, in what way, and in response to which metrics?
Office.
I completely agree. However, Microsoft has demonstrated user-hostile tactics in the telemetry field so people don't feel inclined to trust them.
It helps that when you have a front row seat at a butcher's house, you become somewhat averse to eating meat as part of your regular diet. From that perspective, it is really not that surprising.
I like this analogy, let's run with it even further! Can the aversion be caused by watching butchers that kill animals in an inhumane way, in a dirty shed?

(i.e. if you observe telemetry data being abused at the company you work for and that is tolerated, you'll be wary of any telemetry?)

It’s the difference between always sending something and a pop up asking if you want to send a bug report.
How about performance issues and silent errors? Finding areas that need to be optimized because a lot of users have subtle issues kinda needs passive data collection.

As long as it is anonymous, it's a good thing, IMO.

Yup. A previous company I worked at used assert calls in the code. In a debug build of the firmware, if an assert failed, it would crash the device and on the display it would show the file and line number of the assert call. In production firmware, it would silently ignore it, and if you had opted in to sending usage data, it would phone home and report the failed assert, though I don't know what extra data was included.
The key still being "if you had opted in".
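The assert behaviour described above could be sketched like this. It is a loose Python model of what the commenter describes for firmware: the class name, the `outbox` list standing in for "phone home", and the report fields are all hypothetical.

```python
class AssertReporter:
    def __init__(self, debug_build, opted_in):
        self.debug_build = debug_build
        self.opted_in = opted_in
        self.outbox = []  # stand-in for reports that would be sent home

    def check(self, condition, file, line):
        if condition:
            return
        if self.debug_build:
            # Debug build: stop immediately and show where it failed.
            raise AssertionError(f"assert failed at {file}:{line}")
        if self.opted_in:
            # Production build: keep running, report only with consent.
            self.outbox.append({"file": file, "line": line})
        # Production build without consent: silently ignore.


prod_opted_in = AssertReporter(debug_build=False, opted_in=True)
prod_opted_in.check(False, "motor.c", 42)

prod_opted_out = AssertReporter(debug_build=False, opted_in=False)
prod_opted_out.check(False, "motor.c", 42)

print(prod_opted_in.outbox, prod_opted_out.outbox)
```

The consent check sits on the reporting path only, so opting out changes what is sent and not how the device behaves.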
> As long as it is anonymous, it's a good thing, IMO.

As long as users give informed consent, then it's acceptable.

> It is strange that this community of all communities has such a resistance to this idea.

I don't think many people here have any resistance to this. It's simply true. I think a lot of people, though, think that benefit isn't sufficient to overcome the drawbacks.

Because the visual difference between abusive telemetry and benign or useful telemetry is 0. Same with disclosures.

And we are a very abused culture sitting in the middle of what I hope is peak surveillance capitalism.

Beat someone with a stick enough, and when you go to scratch your back and they flinch, it's not sensible to deride them for being irrational.

I don't think what type of application it is determines if a person has a right to their privacy if they want it. Same goes for data, we don't have to "accept" anything.

On the other hand, if it wasn't Microsoft's brand being connected to this and it wasn't called telemetry but "Automatic Bug Report Sharing" nobody would make a peep.

The same goes for the kind of data or application: if people know what type of data would be shared and what it would look like, they might indeed understand that this isn't some nefarious profiling but just normal software improvement. If you are a developer, bug reports are nearly useless unless it's super repeatable with normal conditions and a few clearly defined steps... or if there is a useful automatic reporting system in place.

But none of that reduces someone's expectation of privacy or acceptance of data sharing.

> they're not monetizing their players directly with it

How do you know? Did they pinkie swear on it? Did they add any kind of T&C around their use of Telemetry specifically?

Telemetry that can be used to identify frequently used game functionality could equally be used to identify game functionality that can be monetized. Identifying how a user dies is a fantastic way to identify items which can be sold to prevent those common deaths.

Precisely. Another commenter said they were shocked to see this community have a negative reaction to telemetry in this instance (and/or generally), and this is why: For-Profit corporations have routinely and repeatedly demonstrated that they should never be given the benefit of the doubt in this area. The developers and engineers in charge of game balance or mechanics may be people we can intellectually relate to and understand the positive value of telemetry, but we more than most should also be aware of how those same people have management to answer to who see something to be abused to improve monetization.

It can be a good thing, yes. Why on earth would anyone assume it would be used and only used in positive, privacy respecting ways in this day and age?

>Identifying how a user dies is a fantastic way to identify items which can be sold to prevent those common deaths.

Just as commonly, it is done to identify bugs and issues, tweak the game balance to get it to a better state, or to see which things players like to do (as opposed to what they say on forums and in focus groups) in order to provide more of similar features.

It is no different from engagement logging in a lot of applications. It is far from always being done for the purpose of monetization. If my business application has a lot of users stuck on a particular screen for prolonged periods of time due to the UX being confusing, I would want to know about it and address it. A lot of times it is hard to gauge how bad it is, because users might just tolerate it if it is ok, or maybe it is something users attribute to their own fault (i.e., they might feel it is just them being confused about it, not that the UX is confusing for everyone). Engagement logging would help me identify that problem and pinpoint it very fast.
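For the "users stuck on a particular screen" case, a minimal sketch of the kind of aggregate that would surface it. The event shape, screen names, timings and the 3x threshold are assumptions for illustration.

```python
from collections import defaultdict
from statistics import median


def slow_screens(events, factor=3.0):
    """Flag screens whose median dwell time is at least `factor` times the overall median."""
    by_screen = defaultdict(list)
    for screen, seconds in events:
        by_screen[screen].append(seconds)
    overall = median(seconds for _, seconds in events)
    return sorted(
        screen
        for screen, times in by_screen.items()
        if median(times) >= factor * overall
    )


# Made-up (screen, seconds spent) events.
events = [
    ("login", 5), ("login", 7),
    ("dashboard", 10), ("dashboard", 12),
    ("export", 90), ("export", 120), ("export", 75),
]
flagged = slow_screens(events)
print(flagged)  # ['export']
```

Note that nothing here needs a user identifier: per-screen timings are enough to spot the confusing screen, which is relevant to the privacy concerns raised elsewhere in the thread.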

>How do you know [they are not using it for monetizing]?

How do you know they are? I am more surprised they didn't already have that type of telemetry for a while, because that's one of the most common ways to identify pain points in the gameplay balancing, see which features might need more attention/rework, identify potentially very tricky to catch bugs, and come up with ideas for new features that users might like based on actual data (as opposed to hearsay and public feedback, which can get significantly skewed by selection bias and other factors).

It did not help them fix problems in Teams, Windows and Office 365. Why do you think Minecraft will be different?
Citation needed. Not every bug will be solved by telemetry, but that doesn't mean we wouldn't be in a very different state without it.
Windows-level telemetry is not "typical"; it includes a dozen or more different spy subsystems. Most spyware apps only include 1-3.

If you are judging based on Windows as a benchmark, almost all software is going to come in as acceptable.

I just meant to say that wherever the line is, I don't think Minecraft telemetry is anything to get angry about.
Well Windows is a little bigger than most apps, wouldn't you say?
Typical telemetry could well consist of ‘what processes is the user running’, ‘what kind of hardware do they have’, ‘where is the user’ and ‘what is their data that can be matched to their advertising identifiers’. Which is the typical data that can be monetized.
The problem is an irrational degree of distrust in the industry as a whole.

It's not that I blame those who distrust any form of telemetry. Many companies are eager to harvest as much personal information as possible. Depending upon one's definition of personal information, some of the data acquired by telemetry can be used for that purpose. In many cases the data collection process is deliberately opaque. In the remaining cases, very few people have the ability to verify that what is actually sent reflects what they are told is sent. That's before factoring in Microsoft's involvement here, since they have a negative reputation in some circles due to their past business practices.

I think the software industry as a whole has earned a high degree of distrust on these issues. It's not irrational.
Ha ha ha, I came here to troll, or at least to point out the same thing tongue-in-cheek.

But seriously though, it's not on the same level as, say, Visual Studio Code sending back telemetry which includes potentially secret code or anything. Or Defender willy-nilly sending "samples" back to the mothership for analysis.

I sincerely don't mind it if a game sends this info back, I think it's actually a good use of Telemetry.

> At some point we have to accept that this data actually helps them fix real issues too, they're not monetizing their players directly with it.

What do we know about that?

Microsoft have long lost the benefit of the doubt with their shady data collection practices.

What kind of evil things have they done with the data? Genuinely curious.
They're literally Microsoft. They've used up their lifetime supply of goodwill in the 90s.
Collected it without consent.
> Eh Minecraft is a video game, I just don't see the privacy risk with "typical" Windows-level telemetry happening there.

But then Minecraft is a video game. Why do they need to spy on their customers?

Spying is not the purpose of telemetry in most cases.
Minecraft is a game played mostly by kids (minors). Tracking their behaviour, finding out what they try (even if just looking for ways to game the system) is pretty useful info.
But the explanation just does not satisfy me.

I mean, world building is an expensive but actually pretty simple, predictable operation. If they want to see how it performs on a slower computer they don't need to get telemetry, just actually run it on slower hardware.

As if all problems are reproducible at will on all the different sets of software / hardware out there ...
In theory it ought to be, but since drivers are generally closed source we actually have to test on everything.
What I meant is that you don't know about a problem until someone has had it. With millions of players, you need feedback that something is not working properly, so reports / telemetry are a must-have. Most serious online games have such a mechanism.