by DupDetector 5657 days ago
So, just out of interest to understand your position better, would you advocate that every time an item sinks without trace, the submitter should subtly alter the URL and resubmit it every day until it gains traction?

ADDED IN EDIT:

From 5 days ago: http://news.ycombinator.com/item?id=2002081

From 4 minutes ago: http://news.ycombinator.com/item?id=2020721

Identical items. There are two reasons it didn't get traction last time. One is that it wasn't noticed. The other is that it isn't of value. If it's not of value, it shouldn't be resubmitted. If it simply didn't get noticed then the chances of valuable items getting noticed would go up if people didn't resubmit the same things over and over again.

Resubmission is bad. Helping people find decent things - even when old - has to be better. Reducing the noise will help.


> Helping people find decent things - even when old - has to be better

How are previous submissions with 1 point and 0 comments 'decent', exactly? Because you frequently link to such posts.

Maybe you should tweak your algorithm to only detect dupes with >10 votes and >10 comments. Linking to dead submissions is pointless.
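For illustration, the threshold suggested above could look something like the sketch below. It is not DupDetector's actual code, which is not shown in this thread; the `Submission` record, the function names and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical record of an earlier submission of the same story.
    item_id: int
    points: int
    comments: int


def worth_linking(earlier, min_points=10, min_comments=10):
    """Return True only if the earlier submission beat both thresholds."""
    return earlier.points > min_points and earlier.comments > min_comments


def filter_dupes(earlier_submissions, min_points=10, min_comments=10):
    """Keep only the earlier submissions that are worth linking to."""
    return [
        s for s in earlier_submissions
        if worth_linking(s, min_points, min_comments)
    ]


# Made-up sample data: a dead submission, a lively one, and a borderline one.
candidates = [
    Submission(item_id=1, points=1, comments=0),
    Submission(item_id=2, points=42, comments=17),
    Submission(item_id=3, points=25, comments=10),
]
kept = filter_dupes(candidates)
print([s.item_id for s in kept])
```

Both comparisons are strict, following the ">10" wording, so a submission with exactly 10 comments is still skipped.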

You seem to have misunderstood. I'm not saying that the DupDetector is helping to find good, older submissions. The idea is to help reduce the endless repeats, to provide space, and then, instead of endlessly repeating the same submissions, to have a way that people can find the good ones that are already there.

But what running DupDetector has shown me is that the problem is endemic. There is no way that I can see to reduce the rate of re-submissions. People have got very annoyed when the problem has been pointed out. Indeed, many people don't see it as a problem. The very fact that people see an enormous number of posts by DupDetector tells us two things: There are a lot of duplications, and people don't want to know.

Some discussion here: http://news.ycombinator.com/item?id=2013666

So that's fair enough. As I say, I'm terminating the experiment. I've learned that the current level of noise caused by constant re-submissions of the same stories, and often the exact same items, won't stop. So instead I'm concentrating on finding items that fit with PG's stated preference:

    Stories on HN don't have to be about hacking, because
    good hackers aren't only interested in hacking, but they
    do have to be deeply interesting.
(From http://ycombinator.com/newswelcome.html)

On any day, look at the "newest" page and ask yourself how many are really "deeply interesting", how many are amusing diversions, and how many are diverting, but useless. Ask how much you learn from each one. The answer I suspect will be "not much".

Look at the "news" page and ask the same question. The problem is that if enculturated "old hands" don't read the "newest" page, then the "news" page will deteriorate, because the people voting on new items won't be voting the right things for the right reasons.

We see that now. Lots of submissions get votes because they are, in PG's words, "intensely but shallowly interesting."

So I'm going away for a while to see if I can do that. I'll still read stuff, but every time I do I will consciously remind myself that I can get content-free entertainment pretty much anywhere, and that I'm looking specifically for stuff that's "deeply interesting." If I figure out a way to find it automatically I'll certainly come back and share it, although I don't expect I'll succeed. It's a hard problem, and I'm certainly no smarter than others who have worked on it before me.

Currently the best option seems to be to have a small community of like-minded people. That's what HN used to be, and I think that despite PG's excellent efforts, it's getting too big to hold together.

But I'll work on it, and I'll certainly share any useful findings. If there are any.

Thanks for the reply. I understand a little more about what you're trying to achieve now. And actually I agree with you.

> Lots of submissions get votes because they are, in PG's words, "intensely but shallowly interesting."

Agreed. The "High quality typefaces" (nee "25 new free high quality fonts") post is a prime example imho. And yet trying to point this out is frowned upon massively - complaining in any way that a popular submission isn't hackerly or doesn't fit is a surefire way to get downvoted into oblivion (and, I guess, whinging isn't going to get anyone anywhere, so fair enough).

Perhaps we need more emphasis on flagging articles as opposed to upvoting them.

A HN style site where you can only downvote articles, and the least downvoted rise to the top would be interesting.
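A rough sketch of that ranking idea, assuming each article carries nothing but a downvote count. The `Article` type, the `front_page` function and the sample data are hypothetical, and a real site would presumably also need some age decay, which is left out here.

```python
from typing import NamedTuple


class Article(NamedTuple):
    # Hypothetical article record: a title and its downvote count.
    title: str
    downvotes: int


def front_page(articles):
    """Order articles so that the least downvoted come first.

    sorted() is stable, so articles with equal downvotes keep
    their original (submission) order.
    """
    return sorted(articles, key=lambda a: a.downvotes)


# Made-up sample data.
articles = [
    Article("A", 7),
    Article("B", 0),
    Article("C", 3),
    Article("D", 0),
]
ranked = [a.title for a in front_page(articles)]
print(ranked)
```

One consequence of this scheme is that a brand-new article with no votes at all starts at the top, so the "newest" page and the front page would largely coincide until people start voting.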

> But what running DupDetector has shown me is that the problem is endemic. There is no way that I can see to reduce the rate of re-submissions. People have got very annoyed when the problem has been pointed out. Indeed, many people don't see it as a problem. The very fact that people see an enormous number of posts by DupDetector tells us two things: There are a lot of duplications, and people don't want to know.

It looks to me like this is an issue rooted in friction between those folks who spend a great deal of time here and those folks who spend less time here. Those folks who are here a great deal see all the duplicate submissions and find them annoying -- "BTDT, got the t-shirt, can we puhleez move on??". Those folks who spend less time here see it as a new submission or submit it without having any idea it was previously submitted. It sounds to me a little like the running battle you see on some email lists between folks who get the digest (and have hissy fits about people not trimming replies) and folks who get individual emails (and feel that trimming replies is too time consuming and loses too much context, and thus hurts communication).

I'm sorry you are frustrated by this. And I'm sorry if this issue contributes to "evaporative cooling" of this site (ie by you and others leaving -- my general impression is that you are a significant contributor and that the folks most annoyed by duplicates are usually also important, valued members). But I don't think that vilifying individuals who have posted something in good faith is a constructive solution to a systemic problem.

Please do let us know if you come up with something that might effectively address this issue.

Peace. Thanks for all you have contributed. I imagine many people here will miss you if you leave.

> It looks to me like this is an issue rooted in friction between those folks who spend a great deal of time here and those folks who spend less time here. Those folks who are here a great deal see all the duplicate submissions and find them annoying

Yeah I think that's a lot to do with it. I've noticed time and time again with online communities that the longer you use one, the more likely you are to start complaining that it's not as good as it used to be. It happens everywhere, and the problem is almost invariably you, not it.

I don't have a "position". I was just replying to someone's question. My submissions tend to generate little or no interest. I have never resubmitted them. But I don't see any reason to make a big deal out of it, at least not in this particular case. Due to timing or some other reason, this submission is actually generating discussion when the first one didn't. It doesn't look to me like trolling or anything negative.

Just out of interest, your profile indicates you are a 'robot' but your posts sound human generated to me. Care to clarify/elaborate/enlighten me? (I've been wondering this for a day or two but didn't see any point in starting a thread to ask. This seems like a convenient time to ask.)

Thanks.

Thanks for your reply. Because of my current concerns (referenced below) I find it useful to get opinions and comments from people whose names I recognise on HN.

To answer your specific question, DupDetector was an attempt at two things.

Firstly, to separate concerns - to separate my "curating" efforts of cross-referencing and duplicate finding from my occasional contributions of original material and links. I wanted to see the effects of those activities separately. In particular, I wanted people looking at my "submissions" page from my profile to see the submissions, not the cross-referencing.

Secondly, to make the process more comprehensive and more automatic. I wrote a few small scripts and had them generate comments. I never let it run automatically, and always submitted them "by hand". That meant that they weren't as obviously robotic as I would eventually have intended them to be. I have also written a couple of comments and submitted them under that ID because that was part of the separation of concerns mentioned above.

I've learned a lot from the experiment, and over time I will distill that and feed it back to the community. You can see some of the discussion here:

http://news.ycombinator.com/item?id=2013666

http://news.ycombinator.com/item?id=2021596

I replied to your second link already. I will note that the profile "DupDetector" came across to me as obnoxious, cold, unfriendly and hostile and as having an ugly agenda. Part of the value your prior efforts to cross-reference duplicates had is that they were associated with a well-known, trusted member of a community who was (by all appearances to me, as someone who isn't here all that much) held in quite high esteem by the community as a whole. That serves a very different social function from a robot intent on weeding out duplicates as ugly and intolerable imperfections in our collective garden.

I am about to get off line and do other things for a bit. Might as well stop here. I'm not a hacker. I'm a people person who has been dragged kicking and screaming into learning a little code because it serves other interests of mine which have been poorly served by other approaches. Historically, hackers and other more technically oriented types tend to not much appreciate my views, especially so on issues like this one. In the interest of not wasting your time or mine: sayonara. :-)