by aetherson 2620 days ago
It is my feeling that sentiment analysis has a ways to go. Here are a few of the comments I've made that the system has described as among my saltiest:

Oh, sorry, "nm" means "nanometer" to me, but of course nautical miles. (Score: -0.5. Comment is entirely taking responsibility for misinterpreting someone)

Well, if he was trying 1M combinations every 40 seconds, for $7 per hour, and he didn't need to use hundreds of dollars per hour of commute time, let's say 10 hours = $70. That's 900M combinations per hour, so 9B combinations in 10 hours. If he was trying combinations using upper-case characters, lower-case characters, numbers, and let's say 20 symbols, that's 82 possible combinations for each one. We'd expect him to find the password after exhausting half of the search set, so we want log base 82 of 18B. That suggests 5 characters. If he let's say just used lower-case characters and numbers, that's log base 36 of 18B, which suggests 7 characters. (Score: -0.32. Comment is 100% technical, with no meaningful sentiment.)

Sorry, I submitted this article earlier with the wrong link. (Score: -0.27. System appears to regard legitimate, largely bloodless apologies as salty.)

Note that the article is from approximately 20 years ago. (Score: -0.24. This is to some degree a critical comment, but it's a short, straightforward statement of fact.)

Probably too late to reply, but I mean things like :ets.method or :queue.whatever. (Score: -0.20. Probably it's cueing off the first phrase?)
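The arithmetic in the password comment above can be checked with a short sketch. It takes the comment's own figures as given (900M guesses per hour for 10 hours, and a search space twice the budget because a hit is expected at the halfway point). The function name `password_length` is made up for illustration.

```python
import math

def password_length(search_space: float, alphabet_size: int) -> float:
    """Return n such that alphabet_size ** n == search_space."""
    return math.log(search_space, alphabet_size)

guesses_per_hour_quoted = 900e6  # figure as stated in the comment
hours = 10
# A hit is expected after exhausting half the space,
# so the space that fits the budget is twice the number of guesses.
search_space = 2 * guesses_per_hour_quoted * hours  # 18B

full_alphabet = 26 + 26 + 10 + 20  # upper + lower + digits + 20 symbols
lower_and_digits = 26 + 10

print(round(password_length(search_space, full_alphabet), 2))
print(round(password_length(search_space, lower_and_digits), 2))

# The same budget recomputed from the stated rate of 1M guesses per 40 seconds.
guesses_per_hour_from_rate = 1e6 * 3600 / 40
print(guesses_per_hour_from_rate)
```

With the quoted figures this gives roughly 5.4 and 6.6 characters, in line with the comment's "5" and "7". Note that 1M guesses every 40 seconds works out to 90M per hour rather than 900M; starting from that rate shrinks the space tenfold and gives roughly 4.8 and 5.9 characters.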

4 comments

Thanks for trying my app! And thank you for taking the time to respond!

It does have a LONG way to go. These are valid criticisms, and all of them are areas we are working to improve in our next ML model (sad vs. negative, rating numbers as "salty", and so on).

This model is pretty simple. It uses TextBlob and looks for a combination of negative sentiment (not necessarily condescending) and subjectivity. Essentially, these are hand-built heuristics derived by weighting each word in the sentence. Not a great way to make predictions.
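A rough sketch of what a word-weighting heuristic of this kind looks like. This is not the app's actual code and does not call TextBlob; the lexicon values, `score`, and `saltiness` are all hypothetical and exist only to show why word-level weights misread apologies and condolences.

```python
import re

# Hypothetical (polarity, subjectivity) weights, invented for illustration.
# These are NOT TextBlob's real lexicon values.
LEXICON = {
    "sorry": (-0.5, 1.0),
    "wrong": (-0.5, 0.9),
    "late": (-0.3, 0.6),
    "terrible": (-1.0, 1.0),
    "awful": (-1.0, 1.0),
    "thanks": (0.2, 0.2),
    "great": (0.8, 0.75),
}

def score(text):
    """Return (polarity, subjectivity) as the mean over words found in LEXICON."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[w] for w in words if w in LEXICON]
    if not hits:
        return 0.0, 0.0
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return polarity, subjectivity

def saltiness(text):
    """Hypothetical combination: negative polarity scaled by subjectivity."""
    polarity, subjectivity = score(text)
    return max(0.0, -polarity) * subjectivity

print(saltiness("Sorry, I submitted this article earlier with the wrong link."))
print(saltiness("Oh man, that's terrible news. R.I.P. Tim May."))
print(saltiness("Note that the article is from approximately 20 years ago."))
```

Under these made-up weights, the apology and the condolence both come out as "salty" because the scorer only sees "sorry", "wrong", and "terrible", with no notion of who the negativity is aimed at.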

The model is FAR from great. But great from afar. At a high level (overall user saltiness) it performs better.

An unlabeled dataset of this size presents some unique challenges, but in our testing of the new model (based on SOTA BERT fine-tuning and a large labeled training dataset) the results look promising. I'm really looking forward to getting it deployed.

I am encouraged by the words of @pg, who said: "you can and should give users an insanely great experience with an early, incomplete, buggy product, if you make up the difference with attentiveness.

Can, perhaps, but should? Yes. Over-engaging with early users is not just a permissible technique for getting growth rolling. For most successful startups it's a necessary part of the feedback loop that makes the product good. Making a better mousetrap is not an atomic operation. Even if you start the way most successful startups have, by building something you yourself need, the first thing you build is never quite right. And except in domains with big penalties for making mistakes, it's often better not to aim for perfection initially."

No problem. By the way, here is another place where the machine scoring seems to depart from human intuition about negative sentiment: almost all of the highly negative comments it has flagged, both mine and others', are relatively short.

I know that there are multi-paragraph laments about how dumb other people are or whatever on HN. In general, those strike me as seeming more salty than even deeply negative one-sentence putdowns. Like, sure, "Javascript is awful" is clearly negative sentiment. But spilling a few hundred words on the topic of "Javascript is awful" is surely more so?
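One possible explanation, offered as a guess and not something confirmed in this thread: if the score is an average over scored words, it measures intensity per word, not total negativity. The sketch below uses invented per-word polarities to show a one-word putdown outranking a long rant under a mean, while a sum ranks them the other way.

```python
def mean_score(word_scores):
    """Average polarity over the scored words (0.0 if there are none)."""
    return sum(word_scores) / len(word_scores) if word_scores else 0.0

def total_score(word_scores):
    """Summed polarity, which grows with the amount of negative text."""
    return sum(word_scores)

# Hypothetical per-word polarities, invented for illustration.
short_putdown = [-1.0]  # e.g. only "awful" is scored in "Javascript is awful"
# A long rant: many strongly negative words, plus milder and positive ones.
long_rant = [-1.0] * 8 + [-0.2] * 10 + [0.4] * 6

print(mean_score(short_putdown), mean_score(long_rant))
print(total_score(short_putdown), total_score(long_rant))
```

Here the short putdown has the more negative mean, even though the rant contains eight times as many strongly negative words. Whether the app's scorer actually averages this way is an assumption.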

That is interesting. I'll have to explore that correlation and make sure that we have a good baseline comparison for the v2.0 model.

Thanks!

My "saltiest" comment was:

Oh man, that's terrible news. R.I.P. Tim May.

So yeah, the sentiment analysis could use some work.

It seems like this is heavily biased towards the most active users: Dang, tptacek, jaquesm, TeMPOraL, etc.

Likely negative-scoring words: sorry, mean(s), exhaust(ing), submit(ted), wrong, late, whatever.