by zmjjmz 1083 days ago
I'm curious if you feel strongly about the 'open internet' as well, i.e. that platforms and content should be publicly available by default (as opposed to e.g. Elon's recent move to stop logged-out users from viewing tweets).

To me, these datasets don't come from some nefarious violation of privacy (surveillance) of the people who authored the content, but from an aggregation of public content that was previously uncontroversial.

If these data were pulled from private content, it would be a different story, although I'm sure that many companies' TOS allow the use of your 'private' content to train their models.

Going forward, I wonder if we may see a regulated 'AI opt-out' for your data (I'm not sure if the new EU act stipulates this).

1 comment

All personal data collection should be opt-in only.

So if you hold data that is attributable to a specific person, the only legal way to store it is with the consent of the individual who created it, given for the explicit and narrow purposes the data is being used for.

So for example, if you use any personal data for scoring in recommendation systems, then that specific use case must be agreed to by the user. If you use personal data to cluster people into affinity groups that are then shown different content than other groups, that explicit use case must be agreed to by the user. And so on for every other use.
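
To make "explicit and narrow purposes" concrete, here is a rough sketch of what per-purpose opt-in could look like in code. Everything in it is made up for illustration: `ConsentRegistry`, `users_for_purpose`, and the purpose strings are hypothetical names, not any real system's API.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    """Hypothetical store mapping each user to the purposes they opted in to."""
    _grants: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, user_id: str, purpose: str) -> None:
        # Consent is recorded per purpose, never as a blanket approval.
        self._grants.setdefault(user_id, set()).add(purpose)

    def revoke(self, user_id: str, purpose: str) -> None:
        self._grants.get(user_id, set()).discard(purpose)

    def allows(self, user_id: str, purpose: str) -> bool:
        # Opt-in only: no record for this user and purpose means no consent.
        return purpose in self._grants.get(user_id, set())


def users_for_purpose(registry: ConsentRegistry, user_ids: list[str], purpose: str) -> list[str]:
    """Keep only the users whose data may be used for this one purpose."""
    return [u for u in user_ids if registry.allows(u, purpose)]
```

The point of the design is that agreeing to "recommendation_scoring" says nothing about "affinity_clustering": each use has to be granted separately, and the default answer is no.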

The enforcement mechanism is simple. When attributable personal data is found to be stored without consent:

1. The data must be deleted immediately.
2. All previous revenues derived from the data must be transferred to the user in question.
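
As a toy illustration of that rule, assuming (and this is a big assumption) that revenue could be attributed to individual records at all, the enforcement step might look something like this. `StoredRecord`, `revenue_cents`, and `enforce` are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class StoredRecord:
    user_id: str
    purpose: str
    has_consent: bool
    revenue_cents: int  # hypothetical: revenue attributed to this record


def enforce(records: list[StoredRecord]) -> tuple[list[StoredRecord], dict[str, int]]:
    """Return (records allowed to remain, payout owed to each user in cents)."""
    kept: list[StoredRecord] = []
    payouts: dict[str, int] = {}
    for record in records:
        if record.has_consent:
            kept.append(record)
        else:
            # Step 1: the record is dropped, i.e. deleted from storage.
            # Step 2: revenue derived from it is owed to the user.
            payouts[record.user_id] = payouts.get(record.user_id, 0) + record.revenue_cents
    return kept, payouts
```

The hard part in practice would be the attribution that this sketch simply takes as given.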

That rule + enforcement mechanism should put concrete boots on everyone collecting data.