by studius 1979 days ago
Without even any AI or ML:

* Average counts of emojis, messages, acronyms, etc. could be misused by a human reader to infer things like age, religion, or ethnicity. If we frequently use poop emojis, we could be considered immature. If we frequently use anger emojis, we could be considered hot-tempered. If we frequently use sad emojis, we could be considered depressed.

* Counts of specific words or phrases in each language could be misused to infer physical and mental health problems that may be affecting an individual.

* The median length of a user's messages, not including URLs, quoted text, or backtick code blocks, could be used to judge whether the user tends to be concise or wordy.
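To show how little it takes, here is a rough sketch of the last bullet in plain Python with no ML at all. The regexes and the function names (`stripped_length`, `median_message_length`) are my own assumptions for illustration, not how any particular chat analytics tool does it, and the patterns are deliberately naive.

```python
import re
from statistics import median

# Naive patterns, assumed for illustration only.
CODE_RE = re.compile(r"```.*?```|`[^`]*`", re.DOTALL)  # backtick blocks and inline code
URL_RE = re.compile(r"https?://\S+")                   # http(s) URLs
QUOTE_RE = re.compile(r'"[^"]*"')                      # text in double quotes


def stripped_length(message: str) -> int:
    """Length of a message after dropping code, URLs, and quoted text."""
    text = CODE_RE.sub("", message)
    text = URL_RE.sub("", text)
    text = QUOTE_RE.sub("", text)
    # Collapse the whitespace left behind by the removals.
    return len(" ".join(text.split()))


def median_message_length(messages) -> float:
    """Median stripped length across one user's messages."""
    return median(stripped_length(m) for m in messages)


if __name__ == "__main__":
    sample = [
        "ok",
        "see https://example.com now",
        'he said "hello there" twice',
        "run `ls -la` please",
    ]
    print(median_message_length(sample))
```

A dozen lines of standard library code is enough to label someone "wordy" or "terse", which is the point: the statistic is cheap to produce and easy to over-interpret.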

That data could be sold and could make you unemployable because the statistics indicate that you're overly angry, when the anger emoji was just your shtick at your last job.

Even if you're using the data just to see when people are working, it could be misused. Say you have a team of employees who are productive but don't communicate much, and their manager doesn't get along with them and wants to move to a different team. When a higher-level manager is cutting headcount, the team manager could claim that the team has always been lazy and never listens, and the higher-level manager then fires the manager and the entire team, based on the analysis of the IMs and the manager's account. Since the data isn't a true indicator of work, it shouldn't be used even as supporting evidence of a lack of work.

1 comment

Under GDPR this kind of information is called personal data; the term personally identifiable information (PII) is more common in US usage.