Hacker News
by kohtatsu 2175 days ago
Show me what data you've collected and would like to share, including metadata, and if you ask nicely then most of the time I'd be more than happy to share it.

Things like emoji usage, page navigations, feature uses, etc. Ideally anonymous; no IP, user agent, etc, just a small byte or two packed properly can go a long way.
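To make the "byte or two" idea concrete, here is a minimal sketch of packing a handful of feature-use flags into one byte. The flag names are hypothetical (borrowed from the DDG examples further down the thread), and the one-byte big-endian layout is an assumption, not anything a real product is known to send.

```python
from enum import IntFlag


class Feature(IntFlag):
    # Hypothetical flag names; one bit per feature.
    USED_ADVANCED_SEARCH = 1 << 0
    USED_COUNTRY_TOGGLE = 1 << 1
    TABBED = 1 << 2
    FILTERED = 1 << 3


def pack(flags: Feature) -> bytes:
    """Encode the set of used features as a single byte."""
    return int(flags).to_bytes(1, "big")


def unpack(payload: bytes) -> Feature:
    """Decode a one-byte payload back into feature flags."""
    return Feature(int.from_bytes(payload, "big"))


# Bits 0 and 2 are set, so the report is the single byte 0x05.
report = pack(Feature.USED_ADVANCED_SEARCH | Feature.TABBED)
```

The payload carries no identifier at all, so anything that links it to a person would have to come from the transport (IP, headers), not from the data itself.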

1 comments

>Ideally anonymous

The problem is that it's actually quite hard to reliably anonymize data, especially once you start combining data sets from multiple sources. That's the problem differential privacy tries to solve in a mathematically rigorous way.
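For a rough feel of what that looks like in practice, here is a sketch of the Laplace mechanism applied to a simple count. The function names, the epsilon value, and the count of 1200 are all made up for illustration, and real deployments need more care (floating-point behaviour, privacy budgets across repeated queries) than this shows.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise calibrated to epsilon.

    A counting query has sensitivity 1 (one person changes it by at most 1),
    so the noise scale is 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)  # seeded only so the sketch is repeatable
released = noisy_count(true_count=1200, epsilon=0.5, rng=rng)
```

A smaller epsilon means more noise and a stronger guarantee; the aggregate stays useful because the noise averages out, while any single person's contribution is masked.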

See, for example, how researchers partially de-anonymized the Netflix Prize data by cross-referencing it with IMDb reviews.
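A much simplified sketch of that kind of linkage, with toy data invented for illustration (these are not real Netflix or IMDb records, and the actual research used fuzzier scoring than the exact matching here):

```python
# Toy data, invented for illustration only.
anonymized = {
    "user_17": {("Movie A", "2005-03-01"), ("Movie B", "2005-03-09"), ("Movie C", "2005-04-02")},
    "user_42": {("Movie A", "2005-03-02"), ("Movie D", "2005-05-11")},
}
public = {
    "alice": {("Movie A", "2005-03-01"), ("Movie C", "2005-04-02")},
    "bob": {("Movie D", "2005-05-11")},
}


def best_match(profile, candidates):
    """Return the candidate id sharing the most (title, date) pairs, plus the overlap size."""
    scored = {cid: len(profile & ratings) for cid, ratings in candidates.items()}
    winner = max(scored, key=scored.get)
    return winner, scored[winner]


# Map each public name to the pseudonymous record it most resembles.
links = {name: best_match(ratings, anonymized) for name, ratings in public.items()}
```

Neither data set contains a name next to a pseudonym, yet a couple of shared (title, date) pairs are enough to tie them together in this toy case.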

It's not possible to reliably anonymize data and still be able to infer something from it; the idea is logically broken. The whole point of anonymity is that the parties analyzing the data cannot infer anything about any individual, or about any group of individuals of a size chosen by the people seeking protection. "Differential privacy" instead assumes that "privacy" means not being able to infer information about relatively small sets of individuals, where the size is chosen by the parties who want to analyze the data. It isn't, of course; that's a pretty dystopian use of the word privacy. That's why corporations love this stuff: privacy without privacy is a godsend to them.

Depends on the data.

DDG:

  { used_advanced_search }
  { used_country_toggle }
  { tabbed, *tab_maps }
  { filtered, *filter_date }
  { os "iOS", *ver "13.5", browser "Safari" }

iOS:

  Mail
  { disabled_remote_images }
  { flagged_mail }
  
  Keyboard
  { emoji_keyboard_via_globe }
  { *emoji_use "100-1000", *emojis [ ":)" ":P" ":(" ] }

Each of these could be stored separately, without metadata, and then aggregated with no problem. Things marked * could be left out, and some values could be randomly shifted up or down a bucket and such.
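Roughly what I have in mind, as a sketch: one counter per event with nothing tying events together, plus a bucket that sometimes gets nudged to a neighbour before it is reported. The bucket ranges other than "100-1000", the `p_move` probability, and the `Aggregator` class are assumptions for illustration, and this kind of jitter is not by itself a formal differential-privacy guarantee.

```python
import random
from collections import Counter

BUCKETS = ["0", "1-10", "10-100", "100-1000", "1000+"]  # hypothetical ranges


def jitter_bucket(bucket: str, rng: random.Random, p_move: float = 0.2) -> str:
    """With probability p_move, report a neighbouring bucket instead of the true one."""
    i = BUCKETS.index(bucket)
    if rng.random() < p_move:
        i += rng.choice([-1, 1])
    return BUCKETS[max(0, min(i, len(BUCKETS) - 1))]  # clamp at the ends


class Aggregator:
    """Keeps one counter per event name and nothing that links events to each other."""

    def __init__(self) -> None:
        self.counts = Counter()

    def record(self, event: str) -> None:
        self.counts[event] += 1


rng = random.Random(0)  # seeded only so the sketch is repeatable
agg = Aggregator()
agg.record("disabled_remote_images")
agg.record("flagged_mail")
agg.record("flagged_mail")
agg.record("emoji_use:" + jitter_bucket("100-1000", rng))
```

Because each event arrives on its own, the server only ever sees totals like "flagged_mail: 2" and cannot reconstruct which events came from the same device.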