Hacker News
by antimora 978 days ago
For finding answers, Stack Overflow may be useful, but for posting questions, it is not. I ceased posting questions a while ago due to the aggressive moderation, which often labeled my questions as duplicates or too common. I was very upset with the direction Stack Overflow had taken, as it seemed to embody the very criticisms it had once had about forums. The Stack Overflow community has lost its vibrancy and has become stagnant. I am pleased that platforms like ChatGPT are emerging as real competitors.
5 comments

If they didn't moderate aggressively, they would probably end up with a bunch of duplicate questions. I really like the fact there's not a ton of dups on SO (compare other help sites). They also have a high standard for those questions, to the point where it will take 10+ minutes to post a good question. I don't ask questions either, mainly because I don't want to spend 30+ minutes writing a good question AND that's a good thing.
IMO, the aggressive de-duplication of questions means that people will keep updating the answers as things change, which is really nice as you don't have to worry about the responses to a question becoming out-of-date/stale.
I don't think people object to de-duping true duplicates. The problem is near duplicates, where the original question/answer didn't quite cover what you're asking. In those cases, it can be frustrating getting past the SO duplicate filter, which can sometimes be too aggressive.

Or, when searching for answers, you keep chasing your tail because the mods kept killing off the variants of the question you cared about, and you have to cobble together answers from the comments because those will often be all you get.

Or (and arguably worse), you'll find a relevant question but the answers are inundated with "like the real answer, but variation for situation X" because people are posting to the main question because they know the almost-but-actually-not dupes will all get closed out.

There still are a ton of duplicate questions. When questions get marked as duplicate they don't get merged or anything, they just lock the most recent ones (even if they are much better than the older one).

Their moderation doesn't prevent duplicate questions; it just makes it more frustrating when the question you find has been marked as duplicate.

Counterpoint: I've never had a question moderated as duplicate, and I've been using StackOverflow for 15 years.
It can still happen. I've been actively using SO for 14 years and had my first question marked as a duplicate last week. The question was already 9 years old and had been thoroughly answered. Last week somebody decided to "optimize" it, which sent it to the review queue and finally led to its closing.
Would be almost hilarious if it was not my next thing to search for
Your experience must be an outlier. After I posted I read other comments here and others had the same experience (just search "duplicate" here).
It does seem likely that I'm an outlier. OTOH it could be that only people who have had bad experiences are posting their experiences and the silent majority without bad experiences aren't posting. It's so hard to tell in a forum format.
I think the loudest people are the ones talking about their bad experience. I've been on SO for, I think, 13 years. I've rarely had questions closed as duplicates and if one was it truly was a duplicate and I found my answer on the referenced question.

I don't see how all the people complaining about moderation don't look at all the low quality questions and see why moderation is what it is.

Personally I don't remember having gotten questions closed. However, sometimes other people's good questions have gotten closed, making me annoyed that I didn't get to see any, or more, answers.
> I've been using StackOverflow for 15 years.

I guess you and any experienced user know the "meta game" of allowed questions. Survivorship bias, kinda.

Yes, seems a likely explanation. Being on the internet prior to the eternal September probably helps too.
> For finding answers, Stack Overflow may be useful, but for posting questions, it is not.

That is mostly as it should be. The primary original intent of StackOverflow is to curate a good Q&A, not to provide an interactive support/mentoring service. Questions are (supposed to be) evaluated by what they add to the repository of questions and answers.

That being said - there has certainly been a culture of harshness toward newbie question askers in large parts of SO.

Is ChatGPT a competitor? I always viewed LLMs in this context as an interface for SO. They're not contributing new answers/code, right?
Suppose you are just learning the C++ language. There is a syntax error or type mismatch. The error message is 50 lines long and it does not make any sense to you. You already spent 10 minutes on it. What do you do now?

If there is an experienced C++ developer around who is happy to answer questions, great. Otherwise you are stuck with having to figure it out by trial and error and Google-fu. If you post it to a forum, it could take a while for anyone to respond. If you post it to stackoverflow, very likely nobody wants to look at your horrible code (natural for a beginner, you know), and your question gets downvoted.

By contrast, ChatGPT can look at your code and explain very clearly what is wrong with it within seconds.

This is just one example. And it's not only for beginners. I have found that ChatGPT can answer high-level questions in programming very well. The alternative would be searching the Internet and sifting through all the noise to find the answers.

Isn’t it worth considering that the reason ChatGPT can do those things is that it was trained on data from platforms like Stack Overflow? This is hard to quantify but my guess would be that without the SO data it wouldn’t be as useful.
You are correct.

This is why reddit and twitter both locked down their APIs, due to the data being very high quality and immensely valuable for training.

Too bad they all did it too late, no one saw ChatGPT coming. And since all the data was already scraped from Stackoverflow, it no longer has any value for OpenAI. Stackoverflow is rapidly declining in volume, so future data from it is irrelevant.

It is not worth considering that because that is irrelevant to the fact that ChatGPT is a competitor now.
It bootstraps from the internet, including SO, but in fact they do have coders in house teaching it stuff: https://www.semafor.com/article/01/27/2023/openai-has-hired-...
Oh wow that's interesting, thanks for sharing! I thought that the value in today's LLMs comes from the wealth of "free" crowd-sourced answers online. It seems like OpenAI is trying to create more source material specifically for "basic coding", though I wonder how cost-effective that is in practice. Or effective in general, for that matter.
Problem is ChatGPT is feeding on SO
Exactly. I've already hit the dreaded "knowledge cutoff" of 2021.

What happens when this training data is 10 years old? Where will we get new data?

In the meantime we'll get used to functioning with these models on the assumption that the "knowledge cutoff" will just be moved. But if there is little new data, how? Furthermore, how do we prevent feeding ML generated data to ML training in the first place?