by bhearsum2 721 days ago
I have worked on this in the past, and still touch it from time to time.

This data comes from attribution data (https://firefox-source-docs.mozilla.org/browser/components/a...) and the installation ping (https://firefox-source-docs.mozilla.org/toolkit/components/t...).

Most of the Firefox installers/packages for Windows & macOS on https://archive.mozilla.org/ carry a sentinel value in their attribution data that indicates "this is a build that Mozilla produced and uploaded to archive.mozilla.org", and such a build is considered to be from a known source. (Anyone who downloads directly from the archive and redistributes the file will preserve this data, so those installs are also considered known.)

Downloads that flow through www.mozilla.org and meet certain conditions (most notably, Do Not Track being disabled) will have this data overwritten at download time with UTM parameter information (see the first link above). These are considered to be from a known source as well (our own website!).
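
To make the download-time step concrete, here is a rough Python sketch of the decision. It is not the real www.mozilla.org code: the function name, the `ARCHIVE_SENTINEL` value, and the fallback to the sentinel when no UTM parameters are present are all assumptions for illustration, and Do Not Track is the only one of the "certain conditions" it models.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical placeholder; the real sentinel value is not given here.
ARCHIVE_SENTINEL = {"source": "archive.mozilla.org"}


def attribution_for_download(referring_url: str, do_not_track: bool) -> dict:
    """Pick the attribution data to leave in the installer (illustrative only)."""
    if do_not_track:
        # Condition not met: leave the sentinel from the archive build untouched.
        return dict(ARCHIVE_SENTINEL)

    query = parse_qs(urlparse(referring_url).query)
    # Keep only UTM parameters, taking the first value of each.
    utm = {key: values[0] for key, values in query.items() if key.startswith("utm_")}

    # Assumption: with no UTM parameters there is nothing to overwrite with.
    return utm or dict(ARCHIVE_SENTINEL)
```

Either branch yields a "known" install, since both the sentinel and the UTM data identify where the build came from.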

I'm not an _expert_ on the analysis side of this data, but I believe that install pings that don't contain attribution data are considered "unknown". (There may be other cases that end up in the "unknown" bucket as well - I honestly don't know.) On Windows, there's a _shockingly_ high percentage of installs that fall into this bucket.
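
For what it's worth, my understanding of the bucketing can be sketched like this in Python. The ping layout (a dict with an `attribution` key) and the function names are made up for illustration, and the real analysis may well use more rules than "no attribution data means unknown".

```python
from collections import Counter


def classify_install(ping: dict) -> str:
    """Bucket one install ping (hypothetical schema, illustrative only)."""
    # Missing or empty attribution data falls into the "unknown" bucket.
    return "known" if ping.get("attribution") else "unknown"


def bucket_shares(pings: list) -> dict:
    """Return the fraction of pings that land in each bucket."""
    counts = Counter(classify_install(ping) for ping in pings)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: count / total for bucket, count in counts.items()}
```

Running `bucket_shares` over a batch of install pings gives the kind of known/unknown split I'm describing.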