HN Mirror



	by throwaway8291 2336 days ago
	They claim machine learning, but I'd guess their workhorse is a "grep -f badwords tweets.json | makepdf | send boss@employer" kind of thingy.

4 comments

allovernow 2336 days ago

They're definitely using some form of sentiment analysis, but IMO these are exactly the kind of results you'd get by turning a programmer with little to no actual data science background loose to train nets, with limited or no intuitive understanding of the bias that training data introduces. And what's worse is that in their business false positives probably aren't even considered a problem. It's an extra-wide, supercharged net.

E.g., the "BIG DICK ENERGY" post being flagged as bigotry and sexism. No doubt they trained on a limited set of hand-curated "questionable posts", and I wouldn't be surprised if they used a source like 4chan and just automatically assigned negative labels to the vast majority of posts.


BlueTemplar 2336 days ago

The problem here is more likely that they don't have anyone with a human sciences background... but then that person would likely just tell them their whole premise is flawed?


allovernow 2336 days ago

I don't think so. There are valid use cases for sentiment analysis, but you need to understand the limitations of your training data and probably still want humans to QC at least some representative proportion of flagged posts, if you're going to do this legitimately. Of course a company like this just wants to sell any garbage they can dig up.
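As a rough sketch of what "QC a representative proportion of flagged posts" could look like: draw a random sample of the flagged posts, have a human label each one, and estimate how often the flags were right. The function names, the 10% default, and the boolean label format are all assumptions made for illustration, not anything the company is known to do.

```python
import random


def sample_for_review(flagged_posts, fraction=0.1, seed=None):
    """Pick a simple random sample of flagged posts for human review.

    `fraction` is a hypothetical review rate; at least one post is
    sampled whenever anything was flagged at all.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not flagged_posts:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(flagged_posts) * fraction))
    return rng.sample(flagged_posts, k)


def estimated_precision(review_labels):
    """Share of reviewed flags that the human reviewer agreed with.

    `review_labels` is a list of booleans, True meaning the flag was correct.
    """
    if not review_labels:
        raise ValueError("need at least one reviewed post")
    return sum(review_labels) / len(review_labels)
```

If the estimated precision on the sample is low, that is a hint the classifier is casting too wide a net and nothing it flags should go out unreviewed.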


BlueTemplar 2336 days ago

Well, I will admit that I am quite ignorant of what exactly a background check is, but I just don't see how such a morally fraught question can legally be decided by anyone other than a psychologist.

In fact, considering the moral hazard, I don't see how using an AI, even one as simple as "grep", to help that psychologist in the cost-minimizing context of a private company would not lead down an unacceptable slippery slope where the psychologist ends up just rubber-stamping the decisions of the AI and its creator. Maybe someone with a dual data "science" / psychology degree would be acceptable, but I'm guessing he/she wouldn't be able to use any "black box" AI...


isoskeles 2335 days ago

That’s not what moral hazard means.

https://en.wikipedia.org/wiki/Moral_hazard


BlueTemplar 2334 days ago

Right, it's probably inappropriate to use this term at this specific place in my argument.

However, and this is probably not just a coincidence, the broader issue that this is only one facet of is the information asymmetry between citizens and corporations...


Balgair 2336 days ago

People who live in Scunthorpe or Penistone would have a great time with these things, I imagine.

https://en.wikipedia.org/wiki/Scunthorpe_problem
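For anyone who hasn't run into it, here is a minimal sketch of how a "grep -f badwords"-style filter produces this kind of false positive, and how matching whole words avoids the most obvious cases. The blocklist and function names are made up for illustration; this is not a claim about how any real product works.

```python
import re

# Hypothetical blocklist, kept mild on purpose.
BAD_WORDS = ["ass", "hell"]


def flag_substring(text, bad_words=BAD_WORDS):
    """Naive filter: flags a word if it appears anywhere, even inside another word."""
    lowered = text.lower()
    return [w for w in bad_words if w in lowered]


def flag_whole_word(text, bad_words=BAD_WORDS):
    """Slightly less naive: only flags words that appear as standalone tokens."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return [w for w in bad_words if w in tokens]
```

With these definitions, `flag_substring("I passed my class, hello world")` flags both entries because of "passed" and "hello", while `flag_whole_word` on the same text flags nothing. Whole-word matching still knows nothing about context or intent, so it only removes the substring false positives.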


tor291674 2330 days ago

AWS and Azure both have services for this now. Tweet sentiment analysis is practically the first getting-started tutorial you land on in either one's docs.


dylan604 2336 days ago

That's some pretty high-level stuff. I imagined their machine learning was a large, cubicle-filled room of humanoid machines manually sifting through people's profiles.


thecleaner 2336 days ago

And I guarantee that this will be faster than using Hadoop with a distributed NoSQL database, which I presume is what the company is actually doing.


0xff00ffee 2336 days ago

ML? No. But regular expressions are a type of artificial intelligence the same way an A* search is.
