by PaulHoule 965 days ago

It's a big question on my mind now. I am thinking about adding something to my YOShInOn RSS reader that picks up articles from, say, ScienceX, looks up the original sources, automatically tests whether the papers are open access, and gives me the tools to quickly make a judgement call about what kind of link to share on which platform. In the immediate term this will be done after the system chooses articles to show me (about 5-10% of the articles it ingests), but someday it might be done before that.
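A minimal sketch of that triage step, assuming the original-source lookup has already happened and that some open-access check is available as a function. The `Article` fields, `choose_link`, and the `is_open_access` callback are all hypothetical names for illustration, not anything YOShInOn is known to contain, and the output is only a suggestion for a human to accept or override.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Article:
    title: str
    news_url: str                    # the news write-up the reader ingested
    paper_url: Optional[str] = None  # original source, if one was found


def choose_link(article: Article, is_open_access: Callable[[str], bool]) -> str:
    """Suggest a URL to share: the paper if it is open access, else the write-up.

    `is_open_access` is an assumed callback (for example a lookup against
    whatever open-access index is available); it is injected so the rule
    can be tried without any network access.
    """
    if article.paper_url and is_open_access(article.paper_url):
        return article.paper_url
    return article.news_url
```

Passing the check in as a function keeps the decision rule separate from however open-access status actually gets determined.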

There is the question of what my judgement is and the question of what your judgement is. The primary selection process (the one that shows me articles) is great right now: I am upvoting maybe 220 articles out of 300 in a cycle. I pick out links to post to HN, and I am currently adding links to the queue faster than I am posting them, which means I can definitely raise the quality of what I post, but the definition of "quality" is where I get stuck. It has all sorts of factors, such as a lack of annoyingness (I hate those cookie popups, but there is a lot of good news behind them). There are also articles that look really good to me at first (I like what they set out to do), but when I look at them again I realize they didn't accomplish what they set out to do.

I do think votes and comments are worth something, but I also know that I could get more of both by posting clickbait articles. On one level I want to post things that are enlightening, and boy, I get frustrated that y'all just don't care about robotics or chemical recycling of polymers or Arduino projects. (Though my real secret ambition is to get a #1 post about sports...)

Somehow I want to pose the problem of posting to HN, Mastodon, etc. as a sequential recommendation problem, which means I have to go back and look at all the papers on the subject that YOShInOn has collected for me. Also, I am likely to put some more work into "quality models", particularly a stacked model for predicting votes, if not comments, on HN articles, a broad topic model based on data from Tildes (is it sports? music? science?), and particularly sentiment models.
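To make the "sequential" framing concrete, here is a rough sketch in which what gets posted next depends on what was posted recently. It is a greedy heuristic with a topic-repeat penalty, not a learned sequential recommender from the literature. The `quality` and `topic` fields stand in for the outputs of the quality and topic models mentioned above, and every name and constant is an assumption for illustration.

```python
from collections import deque


def pick_next(queue, recent_topics, repeat_penalty=0.5):
    """Return the queued item with the best quality after penalizing recent topics."""
    def adjusted(item):
        repeats = sum(1 for t in recent_topics if t == item["topic"])
        return item["quality"] - repeat_penalty * repeats
    return max(queue, key=adjusted)


def plan_posts(queue, n, window=3, repeat_penalty=0.5):
    """Greedily order up to n posts, remembering the last `window` topics posted."""
    remaining = list(queue)
    recent = deque(maxlen=window)  # sliding window of recently posted topics
    plan = []
    while remaining and len(plan) < n:
        item = pick_next(remaining, recent, repeat_penalty)
        remaining.remove(item)
        recent.append(item["topic"])
        plan.append(item)
    return plan


queue = [
    {"title": "a", "topic": "science", "quality": 0.90},
    {"title": "b", "topic": "science", "quality": 0.85},
    {"title": "c", "topic": "sports", "quality": 0.60},
]
order = [item["title"] for item in plan_posts(queue, n=3)]
```

With these made-up scores the sports item is scheduled between the two science items, even though its raw quality score is the lowest.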

That last one is on my mind because I'm thinking about the emotional tone of what I post to Mastodon. Some days I think I should just stick to posting flower photos because they get good engagement. But past the people calling everybody a "fascist" who get amplified there, there is a "silent majority" of people on Mastodon who try to avoid the news and other inflaming topics, so I am torn between being unrelentingly positive and trying to balance positive and negative articles to make a more appealing feed for the good people of Mastodon. It would probably be 2-3 days of labeling work to make a sentiment model, but if I had to find and categorize 5000 angry toots it would kill me. Instead, I am thinking now about grading my own posts (which the system is going to do inference on anyway) and also grading high-engagement and predicted high-engagement submissions to HN, to make a model that finds high-engagement posts that aren't clickbait.
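Once both kinds of scores exist, the last step could be as simple as a filter. The sketch below assumes each candidate already carries a predicted `engagement` score and a `clickbait` probability from two separate models. The field names and both thresholds are made up for illustration and would need tuning against graded posts.

```python
def non_clickbait_picks(posts, min_engagement=0.7, max_clickbait=0.3):
    """Keep posts predicted to engage without looking like clickbait.

    Both scores are assumed to be in [0, 1]; results are ordered by
    predicted engagement, highest first.
    """
    keep = [
        p for p in posts
        if p["engagement"] >= min_engagement and p["clickbait"] <= max_clickbait
    ]
    return sorted(keep, key=lambda p: p["engagement"], reverse=True)


candidates = [
    {"title": "polymer recycling", "engagement": 0.75, "clickbait": 0.05},
    {"title": "you won't believe this", "engagement": 0.95, "clickbait": 0.90},
    {"title": "robot arm", "engagement": 0.80, "clickbait": 0.10},
    {"title": "obscure note", "engagement": 0.20, "clickbait": 0.02},
]
picks = [p["title"] for p in non_clickbait_picks(candidates)]
```

Keeping the two scores separate, rather than training one combined "quality" number, makes it easier to see why a given post was dropped.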