by stjo 1217 days ago
> Homebrew’s analytics are now sent both to Google Analytics and our new, self-hosted InfluxDB instance hosted in the EU.

> If you had previously set HOMEBREW_NO_ANALYTICS because you didn’t like Google Analytics and/or data being sent to the USA: please consider unsetting this and setting HOMEBREW_NO_GOOGLE_ANALYTICS instead, allowing analytics data to be sent to our new InfluxDB host.

My package manager was reporting my actions to Google and I didn’t even know about that. Great…

11 comments

Homebrew doesn't "report" anything to Google, much less anything of yours (implying unique identification). This is an unnecessary editorialization.

You can see exactly how Homebrew does analytics here[1], and you can see the sum total of the information collected here[2]. No identifiable information is collected or retained.

[1]: https://github.com/Homebrew/brew/blob/master/Library/Homebre...

[2]: https://formulae.brew.sh/analytics/

>You can see exactly how Homebrew does analytics here[1], and you can see the sum total of the information collected here[2]. No identifiable information is collected or retained.

If my computer is sending data to Google, then Google has my IP address and can correlate that data however they want. You simply cannot claim that identifiable information isn't collected or retained unless you work for Google.

I don't know if you work with Homebrew or not, but I would be much more comfortable if they used something like Plausible for analytics.

They're planning to completely drop and delete all Google data in 90 days, once they are sure the new plan works out.

FTA:

> We expect to migrate entirely from Google Analytics to our self-hosted InfluxDB instance in ~90 days at which point we will remove all Google Analytics and destroy all existing data.

> I would be much more comfortable if they used something like Plausible for analytics

Their intent is to migrate entirely away from Google to the San Francisco based InfluxData, which I guess is marginally better in some ways but seems to entirely miss the point of people's objections in spirit. Other than people's personal subjective trust of one US company over another, there's no inherent difference between Alphabet & InfluxData from a technical standpoint.

InfluxData do collect IPs per https://www.influxdata.com/legal/privacy-policy/

> our new, self-hosted InfluxDB instance

Given that it’s a self-hosted instance, InfluxData shouldn’t be able to exfiltrate anything from it.

Maybe the self-hosted is planned but I was just going on this line which uses the managed service: https://github.com/Homebrew/brew/blob/master/Library/Homebre...
From the Homebrew documentation:

A Homebrew analytics user ID, e.g. 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB. This is generated by uuidgen and stored in the repository-specific Git configuration variable homebrew.analyticsuuid within $(brew --repository)/.git/config. This does not allow us to track individual users, but does enable us to accurately measure user counts versus event counts. The ID is specific to the Homebrew package manager, and does not permit Homebrew maintainers to e.g. track you across websites you visit.
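To illustrate what "user counts versus event counts" means in the passage above, here is a minimal sketch. It is not Homebrew's code: the event layout and field names are hypothetical, and `uuid.uuid4()` merely stands in for the `uuidgen` call the documentation mentions.

```python
import uuid

# Hypothetical event log: each event carries only an install ID and an action.
install_a = str(uuid.uuid4())  # random ID, standing in for uuidgen output
install_b = str(uuid.uuid4())

events = [
    {"install_id": install_a, "action": "install fzf"},
    {"install_id": install_a, "action": "install jq"},
    {"install_id": install_b, "action": "install fzf"},
]

# Event count: every row counts once.
event_count = len(events)

# User count: rows sharing an install ID collapse into one.
user_count = len({e["install_id"] for e in events})

print(event_count, user_count)
```

Without the per-install ID, only the event count could be computed; with it, repeated events from one install can be told apart from events spread across many installs.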

IANAL, but a UUID is definitely personal data under the GDPR:

‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

Also see recital 30:

Natural persons may be associated with online identifiers provided by their devices, applications, tools and protocols, such as internet protocol addresses, cookie identifiers or other identifiers such as radio frequency identification tags. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the natural persons and identify them.

The GDPR doesn't only take into account whether an identifier can currently be used to identify a person, but also whether the data can be correlated in the future to do so (e.g. by correlating package installs with visiting project websites, thus deanonymizing the UUID).

The only safe way to abide by the GDPR is to avoid storing any non-essential data without consent.

I am pretty sure that Homebrew have been violating the GDPR for years by making analytics opt-out. Sadly, anyone who tries to warn them gets banned from their issue tracker.

Read your own cite: nothing about the UUID in question is associable with an identified or identifiable natural person, which is what the GDPR concerns.

We do not have the ability to correlate your package installs (again, we do not know what you install) with your browsing history, and we do not store any information that would allow us (or an adversary) to do so.

> Read your own cite: nothing about the UUID in question is associable with an identified or identifiable natural person, which is what the GDPR concerns.

This is false and a misunderstanding of the GDPR. It is not about whether identification is currently possible, but whether it would be possible if the data were correlated with other data.

What differs pseudonymisation from anonymisation is that the latter consists of removing personal identifiers, aggregating data, or processing this data in a way that it can no longer be related to an identified or identifiable individual. Unlike anonymised data, pseudonymised data qualifies as personal data under the General Data Protection Regulation (GDPR). Therefore, the distinction between these two concepts should be preserved.

https://edps.europa.eu/press-publications/press-news/blog/ps...

‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;

https://gdpr-info.eu/art-4-gdpr/

So, basically if we have a data set with three columns:

Personal name, UUID, Action (e.g. brew install fzf)

Removing the first column is pseudonymization, and thus qualifies as personal data under the GDPR. Removing the first and the second column is anonymisation and is not personal data.

Again IANAL, but it is clear from the GDPR that the only thing you could do without consent is e.g. recording what packages get installed/uninstalled, but without a UUID.
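The three-column example above can be sketched in a few lines. This is only an illustration of the distinction the commenter draws, with made-up names and UUIDs; whether a given data set counts as anonymous in the legal sense is a separate question.

```python
from collections import Counter

# Hypothetical raw records: (personal name, UUID, action).
raw = [
    ("Alice", "uuid-1", "brew install fzf"),
    ("Alice", "uuid-1", "brew install jq"),
    ("Bob", "uuid-2", "brew install fzf"),
]

def pseudonymise(records):
    """Drop the name column but keep the per-install UUID."""
    return [(uid, action) for _name, uid, action in records]

def anonymise(records):
    """Drop both name and UUID; keep only per-action counts."""
    return Counter(action for _name, _uid, action in records)

pseudo = pseudonymise(raw)
anon = anonymise(raw)

# Pseudonymised rows can still be grouped into one profile per UUID.
profiles = {}
for uid, action in pseudo:
    profiles.setdefault(uid, []).append(action)
```

In the pseudonymised form, everything `uuid-1` did is still linked together, so learning who `uuid-1` is reveals the whole profile. In the counted form there is no per-install row left to link.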

Apply the counterfactual: what would have to be the case in order to correlate the UUID in question with user data?

We do not store anything else that could correlate with that UUID. We don't expose it to anybody else and it's unclear how, even if we did, it would result in personal correlation.

> Apply the counterfactual: what would have to be the case in order to correlate the UUID in question with user data?

> We do not store anything else that could correlate with that UUID. We don't expose it to anybody else and it's unclear how, even if we did, it would result in personal correlation.

You can argue against this, but it's simply how the GDPR defines personal data, and if you violate it, someone could report you to their data protection authority.

Secondly, the GDPR does not just do this to protect citizens against direct use of their personal data (I think most Homebrew users would be immediately convinced that you wouldn't misuse this data, including me), but also scenarios that are outside of your control. Such as: Google decides to violate the GDPR against your will and correlates the data. Or: Google Analytics gets hacked, the data set becomes available on the black market or wherever and people correlate the data with other leaked data.

So, how would it be possible if it was correlated with other data?
They'd be well advised to make this opt-in only for legal reasons. This is not going to go down well in a lot of places and they might get exposed to lawsuits.
Opt-in analytics are useless, unless a large part of your userbase just clicks through the entire wizard without thinking; there’s little overlap between that kind of user and Homebrew’s userbase.
Ok, so don't do analytics.
Yes, hardware2win, have a worse product. Not everything needs to be phoning home all the time.
Then have a worse product for the users?
Homebrew was fine before they started collecting analytics. There are plenty of great package managers outside the macOS ecosystem that don't use analytics.
So, just because you weren't affected then no one was?
Yes, you don't get to decide to violate your users' rights and surveil them because you think it will improve your product.
1. It is their choice what software they use, isn't it? It's not like Chromium on Android being pushed on you.

2. There is "reasonable" / "good faith" data that, in my opinion, can be sent, e.g. crash logs and stats like package popularity.

You just create drama over nothing.

I've used data like this to improve my software countless times and there is nothing shady about it at all; everything is about what you collect.

There's a difference between a keylogger or stealing nudes and tech data.

Not my problem. They don't have a right to spy on users.
Stop it. They are not spying on users.
Anytime "analytics" are opt-out instead of opt-in is spying in my book.
"The alternative is useless" is not a valid legal defense.
As in the post: they're intending to drop the GA part entirely within 90 days, and it sounds like the new metrics are entirely anonymous, and so not covered by GDPR etc. IANAL but as far as I can tell that should avoid all legal concerns once GA is gone.
Why does a package manager need to track their users at all?

If you want usage statistics for packages just track how often individual packages are downloaded on the server side. A maintainer has no need to know who's installing what.

> If you want usage statistics for packages just track how often individual packages are downloaded on the server side. A maintainer has no need to know who's installing what.

To be clear: Homebrew has no idea which users are installing what. We only store counters for package install, failure, etc. events, and everything that's stored is visible on the Homebrew website[1].

Homebrew's architecture doesn't really have a "server side" in the way your suggestion requires: the formulae and bottle components rely heavily on public services like GitHub Packages and GitHub Pages, which don't offer those kinds of analytics.

FD: Member of Homebrew.

[1]: https://formulae.brew.sh/analytics/

What kind of decisions do you make based on the analytics? Do you drop unpopular packages? To me, one of the advantages of a package manager is having a huge database of long-tail packages that are just one command away. If you kept only the popular packages you might as well just have an installer that installs them all together in bulk.
> Do you drop unpopular packages?

Yes: Homebrew deprecates and/or disables packages if we see evidence that they're unmaintained and not actually supported on the platforms we support, or only used by a tiny fraction of users while also requiring disproportionate maintainer time (e.g. due to complex or flaky builds).

The goal is to balance conflicting user interests: 99% of users want maintainer effort focused on the top 100 (or 500, or 1000) packages, and many of those packages also require significant maintainer effort (e.g. making sure that they don't cause transitive breakages).

> Why does a package manager need to track their users at all?

According to https://docs.brew.sh/Analytics they use it to measure how often formulas fail to install, to get overall metrics on which OS versions are used, and to correlate those (i.e. to tell on which OS versions specific packages fail to install correctly).

> A maintainer has no need to know who's installing what

Aside from the IP, they don't know who's installing what, and in the new model announced in this post they now don't store IPs or any other user token at all, so it should be purely anonymous aggregate metrics.
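As a rough illustration of the correlation described above (install failures per formula per OS version), here is a minimal sketch over anonymous events. The event fields are hypothetical and this is not how Homebrew's InfluxDB setup is actually queried; the point is only that the computation needs no IP address or user token.

```python
from collections import defaultdict

# Hypothetical anonymous events: no IP address, no user token.
events = [
    {"formula": "fzf", "os": "13", "ok": True},
    {"formula": "fzf", "os": "13", "ok": True},
    {"formula": "fzf", "os": "12", "ok": False},
    {"formula": "fzf", "os": "12", "ok": True},
    {"formula": "jq", "os": "12", "ok": True},
]

def failure_rates(events):
    """Return the install failure rate for each (formula, OS version) pair."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for e in events:
        key = (e["formula"], e["os"])
        totals[key] += 1
        if not e["ok"]:
            failures[key] += 1
    return {key: failures[key] / totals[key] for key in totals}

rates = failure_rates(events)
```

Here `rates[("fzf", "12")]` stands out against `rates[("fzf", "13")]`, which is the kind of signal a maintainer could use to spot a formula that breaks on one OS version.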

I agree that they don't need to know the "who", but it is perfectly understandable that they want to know "what" is being installed. And as part of the "what", they would want to know on which platform, and whether the install succeeded or failed, and probably a few other metrics about the install to ensure that things are working correctly and identify gaps that should be filled.

Based on what I read on the site, that looks like exactly what they are doing, and they are explicitly NOT storing information that would identify "who".

Correct: there's no identifiable information being stored, either before or with these changes.
They can already glean a lot of this since they run the hosted formula db anyway. A 90-day analytics capture isn't a big deal IMO.
Aside from the IP? IP nowadays is all you need…
I think nowadays IP is less and less relevant when the majority of people sit on dynamic IPs.
They've been quite clear about what they store and it's not IPs.
All stuff that should be in a trouble ticket from a whiney user. Which we know this type of user would be.

Edit: Also, this is for macOS. Choose a few standard OSes to support and test them. If a system update will fix the issue then it shouldn't be fixed at the package manager level.

Much rather deal with anonymized telemetry blowing up than tickets from whiney ass users.
> Why does a package manager need to track their users at all?

Do any of you actually work in this industry shipping software products to end users? Without telemetry the problem there is literally one of trying to read the mind of your end users to figure out what they're doing, hoping that your internal CI manages to reflect the configuration in their environment.

I think HN has a very varied audience - some work in the industry, others want A/B testing to be made illegal on the grounds that it is non-consensual mind-control experimentation :P
The groups of people who work in the industry and those who believe A/B testing is psychological experimentation aren’t disjoint.
I am in both groups. I work in the industry and I am so tired of colleagues wanting to grab all the data they can get their grubby hands on and then barely use it at all for anything useful. So many companies collect data just in case.
Users report issues to GitHub? It's not like Brew users aren't sophisticated in that sense.

In addition to being INCREDIBLY slow, now I have to worry about what it might spy on. If I have a problem I'm more than happy to go to GitHub (or whichever site it's hosted on), and report it.

I imagine many of us work shipping software to end users and also respect their right to privacy, and only track their actions with informed consent.
This industry has managed to ship software products without telemetry just fine - mass-collecting usage data from end users is only a relatively recent trend.
Any actual arguments?

I don't see why something that's little more than a file server needs telemetry.

Homebrew is a package manager with thousands of packages, not a file server. We maintain those packages, and knowing when they break (or can be deprecated due to lack of use) is critical to the project's sustenance.
Have you looked at the analytics yet? Or are you only speaking from ideological priors?

The most valuable one I’d guess is package install error rates. Seems pretty useful to me.

> something that's little more than a file server

You're doing an awful disservice to Homebrew.

Maybe they want to include the most common packages in their unit tests, or understand usage patterns so they can prioritize development?

It’s very hard to write and maintain good software without knowing how it’s used. No package manager needs to know how you specifically use it, but aggregate data and the ability to identify scenarios it does not handle well are both very important for SW lifecycle.

It's 2023. Hard drive space shouldn't be an issue. Test installing the full software suite, make it work, and you know the lesser installs will all work.
Do you think “hard drive space” is the constraining factor when building and testing over 6.5k third party packages?

Do you really not see any advantage to maintainers having visibility into what packages people actually use?

How much would you be willing to pay so that Homebrew can maintain a large amount of hardware covering nearly all configurations?
> Test installing the full software suite, make it work

Are you paying for the compute?

Assuming their claims of anonymity are true, they won't be tracking users at all.

I imagine they can get much richer metrics through this as opposed to only tracking downloads on the server side.

I'm not saying I like it. In fact, I plan to keep it disabled. I'm just saying it's a bit naïve to think client-side analytics are the same as server-side download tracking.

Richer how, exactly? I fundamentally don't 'get' what richness they actually need.

If anything they need money, of course, and to know their software works for their users. Prior to release have a test system install the full base, test those packages work, and you know anything less will work too.

I haven't looked at what they're actually collecting, but here's a few things that come to mind:

- Time to install packages
- Versions of things
- Has the compilation (when required) failed? What dependency versions are installed?
- CPU architecture
- OS version
- ...

There's a lot more that can be sent from the client that's not available on the server side.

> I fundamentally don't 'get' what richness they actually need.

That's fine. Perhaps you could ask them instead of ranting about what you don't know or don't 'get' in a public forum?

> to know their software works for their users

Sounds like you're not very far from understanding why they want better telemetry.

> Prior to release (...) and you know anything less will work too.

Things break in unexpected ways. OSs are complex systems and there's a lot of interactions between components. Homebrew's user base is enormous and very diverse. There's 2 different architectures, many OS versions, lots of environment variables that might be set differently in each user's systems, different versions of libraries, ... I could go on but I think you get the picture.

Edit: s/collected/collecting/

This discussion on GitHub reveals the mindset of the Homebrew people: https://github.com/Homebrew/brew/pull/6745
The Homebrew folks think this is a non-problem. You may agree or disagree, but the pull request is certainly a non-solution to this maybe-problem. Not just because it gained zero traction and did not get merged anywhere, but also because it's just an obscure band-aid. Either opt-out anonymous telemetry is a good idea, or it's a problem. If to you it's a problem, advocate for its removal in its entirety.

So even if I have an issue with telemetry, good on the Homebrew maintainers for ignoring this MR.

The status quo is different variables and even different mechanisms for each program. Many badly documented. Only out of context is DO_NOT_TRACK obscure.

I think you know developers who reject informed consent will never adopt an informed consent model. The proposal was the best users could hope for realistically. Did you never compromise?

"Without telemetry, developers rely on bug reports and surveys to find out when their software isn’t working or how it is being used. Both of these techniques are too limited in their effectiveness."

https://research.swtch.com/telemetry-intro

It's hard to send stuff over the Internet without exposing some personal information, like your IP address.

I guess they might send it over TOR to get around that.

Isn’t it GDPR compliant if you never store the source ip at all? So from a GDPR perspective there’s no user data to track and remove.

I’m not sure how organizations get audited to prove that they actually do that and that there’s no other way to reidentify users (eg, I download the prepend package every day and that’s unusual enough to link that it’s me, prepend, the author of that package, etc etc).

This is exactly the use case that Oblivious HTTP is being built for in the IETF.
We only have their pinky promise that the new analytics are anonymous. For all we know this might be a PR operation because people increasingly dislike Google, and they'll sell the "anonymous" analytics to Google under the table.

I'll make it a goal to stop all their tracking on the level of my router.

> We only have their pinky promise that the new analytics are anonymous.

Isn’t Homebrew open source? One could audit the source themselves. If you’re talking about what happens on the server side yes, but I don’t see the difference with any other computer I connect with over the internet.

[1] https://github.com/Homebrew/brew

Homebrew is not a custodian of any personally identifiable data.
Does that include IP addresses? Because I think that is considered PII
Homebrew does not store IP addresses, so yes.

You can see the totality of the information stored on the Homebrew website[1].

[1]: https://formulae.brew.sh/analytics/

From the post:

> Our self-hosted InfluxDB instance does not store either anonymised IP addresses or an anonymised user token so it has additional privacy benefits over Google Analytics.

Homebrew should support the DO_NOT_TRACK environment variable.

https://consoledonottrack.com/

It doesn't look likely though. I don't think it looks good that comments pointing out that Homebrew's existing behaviour (collecting analytics without obtaining informed consent from users) violates the law have been classified as abuse and hidden!

https://github.com/Homebrew/brew/pull/6745

> Homebrew should support the DO_NOT_TRACK environment variable.

No, rather something like TRACK_ME opt-in variable.

Or, just not track.
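The difference between the two models argued over above can be sketched as follows. This is a hypothetical illustration, not Homebrew's logic: `HOMEBREW_NO_ANALYTICS` is the real opt-out variable, `DO_NOT_TRACK=1` is the proposed convention from consoledonottrack.com (which Homebrew does not implement, per the thread), and `TRACK_ME` is just the name suggested in the comment above.

```python
import os

def analytics_allowed_opt_out(env):
    """Opt-out model: send unless the user has set a 'no' variable."""
    if env.get("HOMEBREW_NO_ANALYTICS"):  # any non-empty value disables
        return False
    if env.get("DO_NOT_TRACK") == "1":  # proposed convention, an assumption here
        return False
    return True

def analytics_allowed_opt_in(env):
    """Opt-in model: send only if the user has explicitly set TRACK_ME=1."""
    return env.get("TRACK_ME") == "1"  # hypothetical variable name

if __name__ == "__main__":
    print(analytics_allowed_opt_out(os.environ))
    print(analytics_allowed_opt_in(os.environ))
```

With an empty environment the opt-out check returns `True` and the opt-in check returns `False`, which is exactly the default the two sides of this thread disagree about.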
In retrospect (consoledonottrack operator here) I never should have pushed an opt out standard; it legitimizes opt-out which is indefensible and unethical.

Opt-in by advance consent is the only way. Homebrew devs are unethical jerks.

Use nixpkgs and don't look back.

What’s unethical about counting errors with no PII?
Errors are the property of the user on the system in which they occurred. Exfiltrating them without consent is unethical and oftentimes illegal, and leaks the user's IP to Google. Homebrew has no claim to them without the consent of the user. It's simple spyware.

Unless you report with Tor, it's not without PII. (Homebrew also includes a unique install UUID supercookie which persists, so every analytics data point includes PII in addition to IP address which allows Google to track that user's physical travel history.)

> consent

Users are informed upon installation.

> IP to Google.

Lots of people seem to care. I haven't heard a reason why though.

But regardless, Homebrew is deprecating and nuking the google system, so that's nice.

Consent doesn't work that way. Imagine posting a sign at the entrance to your house saying "all those who enter here consent to being groped".

Notice is not consent. Your statement that users are notified is a red herring. Users must also consent, not just be told of an entity's plan to violate/assume consent.

Surveillance without consent is unethical no matter who the data goes to, or whether or not the data is anonymized or otherwise stripped of PII. It's stealing of information unless the user agrees to it in advance.

I appreciate you changing your mind on this.
Why don't they just use something like Plausible for this? I switched to Plausible for all my analytics and it works great.

https://plausible.io/

Whilst Plausible does look great, one of their main goals, as I understand it, is to focus on simple core analytics for websites, not general-purpose do-it-all analytics for everything.

If you're not a website, and so your metrics are arbitrary events with metadata instead of page views, this tends to quickly run into awkward mess.

Self hosting an InfluxDB to capture core metrics seems like a good solution that avoids privacy concerns without being too complex imo.

It is fine for medium-size applications, but you do have to register all events (goals) on the website before you can use them [0]. I have written the Qt/QML plugin for Plausible [1], and I don't think predefining them is too bad.

[0] https://plausible.io/docs/goal-conversions

[1] https://gitlab.com/kelteseth/qml-plausible

Ah, neat, ok! I'll take another look. For now I've been testing out Posthog (https://posthog.com). They seem more focused on this use case - they let you do arbitrary queries and build graphs over all event data without having to predefine goals or anything, and they have an open-source & self-hostable version, in addition to a EU-hosted cloud option. Now that GA is so clearly dying (finally) it's an exciting space!
Thanks for sharing. I was looking for other options and this looks like a good candidate. Do you or anyone else have any other suggestions?
If you are into self-hosting: https://uxwizz.com
jitsi as in a segment alternative?
In your .bashrc or similar: export HOMEBREW_NO_ANALYTICS=1
That’s not the point. I don’t want any software sending analytics unless I specifically allowed it.
FWIW, it gives you a warning and the chance to disable analytics before sending any analytics. https://docs.brew.sh/Analytics

> Homebrew gathers anonymous aggregate user behaviour analytics using Google Analytics (until our in-progress migration to our own InfluxDB). You will be notified the first time you run brew update or install Homebrew. Analytics are not enabled until after this notice is shown, to ensure that you can opt out without ever sending analytics data.

Agreed, I was shocked when I installed the Dart programming language and found out it sends analytics by default. A programming language!
An implementation of a build system for a programming language
You know, as the law requires in the EU
That is a very simplifying view of the legal situation and that's not helpful at all.

First, it only applies if you collect PII - depending on what they collect, they might not be subject to the GDPR at all.

Second, informed consent is only one of the options that allows collection and storage of PII. There are various other reasons that allow collection and storage of PII, among them "legitimate interest". For example, it is considered legitimate to store webserver logs containing PII (IP addresses) for purposes of fraud analysis, detecting unauthorized system access, etc. Whether a specific collection of data is legitimate under those clauses depends on the specifics of a case (who has access, what's the exact purpose, how long you store, ...) - ask a lawyer if you need an assessment.

Depending on what they log and how they log, they may be either in the clear or in a bad place, but it's definitely not as simple as "the law requires no logging".

The analytics page describes them tracking information across time with a unique user identifier. They claim that identifier doesn't identify you, but it's attached to an exact Brew install so it does track your personal account on your machine at the very least; I'd classify that as PII.

Had they not submitted unique user tokens I think you might be right. However, that's not how the analytics seem to work.

The law does allow logging for a variety of things but in this case I'd say they're in the wrong. They assume that it's okay because they don't track you across websites and that's good to know, but that's not the point.

I don't think that's true - AFAICT there's no EU law banning analytics. EU law just restricts storing & processing _personal_ data (GDPR) and storing unnecessary data on machines without consent (ePrivacy/'cookie law').

If you want to log fully anonymized data, without persistent tracking ids and without leaking personal data to 3rd parties en route (so no "send it to Google and they promise to anonymize the IP afterwards") then you're all good (but IANAL!).

The only reason you see all those cookie notices and GDPR consent requests is because so few companies are willing to accept even the tiniest tradeoff in their metrics to protect their users' privacy.

> EU law just restricts storing & processing _personal_ data (GDPR)

To be clear to anyone reading: using Google Analytics without a non-Google-hosted anonymisation step breaks GDPR. This _has_ been litigated in court in several countries. There's no "ifs" or "buts" about it.

There's an implicit _for websites_ on there. Has it been litigated for non-website use like this where the program can control which fields are being submitted?
That is not actually how the GDPR works. Anonymous telemetry without PII does not need any consent.
Besides consent there is also the possibility of legitimate interest under the GDPR.
That wouldn't be it
I don't think the law is applicable to a software project. For example, GDPR is applicable to organisations that are processing personal data.

I'm fairly sure that an open-source piece of code that you download and install yourself isn't in scope.

The GDPR applies to anything that collects PII about EU citizens.
I feel the same way.

I think it’s not cool when orgs track telemetry with opt out. But it’s not cool like when you’re at a party and you go off and fart in the corner as no one’s there and then a few seconds later someone walks by and smells it.

Continuing the analogy, telemetry with no opt out is like farting silently amongst a group of people. And tracking identified user requests while selling data is like slapping each person at the party while farting in their face.

And I guess opt in telemetry is like holding in your fart and people notice and might feel some discomfort at your discomfort.

What.
Then you can either: 1) not use the software, or 2) analyze the software's source code to understand what it does before using it.

I tend to assume good will WRT telemetry in well-known and independent open-source projects.

You forgot 3) Complain at different intensities, up to the shaming, about the unethical dark patterns employed by the software, no matter whether it is open source or not, to make authors of the software aware, that what they do is not welcome by their users.
Disagree, it should be the norm.
I use Little Snitch to alert on any outbound connections and make a decision. The google stuff immediately got a permanent blackhole for Homebrew. Anything I'm uncertain of I'll give a short-term approval (30mins) to not break anything. After a couple of rounds of execution (and sometimes some trial & error) you can usually work out which requests are essential and which are some notifications/tracking thing.
So you didn't see the notice brew(1) gives you on first run?
I can't recall, though this approach isn't specific to Homebrew. I block it permanently at a network/process level rather than having to remember to set an ENV var.
Thanks, a quick search on the home brew page brought no guidance on how to configure these settings. I can't remember if this is in the .bashrc or in some other obscure config file, would be great to mention this if you recommend changing settings to your user base.
It prints a notice when you run it:

https://docs.brew.sh/Analytics

Yeah, thanks, I set

export HOMEBREW_NO_ANALYTICS=1

immediately!

> My package manager was reporting my actions to Google and I didn’t even know about that. Great…

This has been a known thing for a long, long time. That is on you for not knowing about it. And it isn't doing anything that you are trying to imply in your statement anyway.

No kidding. First I’m hearing about it.
Probably not, as the installer tells you in the same blurb it tells you how to add brew to your .profile.

What it doesn't tell you unless you read the URL, is all you have (had) to do is "brew analytics off".

I find this very invasive and I am glad I never used Homebrew and always used MacPorts instead.
Well I have bad news for you there too

https://ports.macports.org/statistics/faq/

I really do think that people who are concerned about telemetry should install Little Snitch and also look at each of these programs' documentation. Telemetry is called out for most of them, as well as how to disable it. It’s also not inherently bad, as it can be a valuable tool to improve their systems, though obviously it could be abused too.

You need to actively install software to report statistics on macports. See further down the linked FAQ.
Exactly!

> To start submitting statistics, install the mpstats port in your MacPorts installation.

```
sudo port install mpstats
```

Time to write a client to spam both of these unauthenticated data collection endpoints with random noise data to render them useless.