Hacker News
by Vermyndax 775 days ago
If I wanted to use OpenAI, I would. If I wanted to use StackOverflow, I would. Now I only get to use OpenAI no matter what.

This hellscape is forming way too fast.

7 comments

The article says that they're partnering to incorporate OpenAI's algorithms into a generative AI solution that SO was already working on in parallel to their Q&A sites, and to allow data from SO sites to be accessible to OpenAI's own solutions.

It doesn't indicate that generative AI is going to be shoehorned into StackOverflow's websites. It would seem counterproductive, in fact, to do that, since the gist of this seems to be that StackOverflow provides a large wealth of organized, validated human-generated knowledge, which is exactly the sort of thing you want to train LLMs on. Feeding AI-generated data back into that would diminish the value of the data SO hosts for that purpose.

Too bad OpenAI already scraped all of this data years ago and is in a position of power here.
Not sure what you mean. Sure, they've scraped a lot of data, but websites are in a position to inhibit further scraping, so it's in their interests to cooperate with data sources they want to rely on.

I'm not sure what "position of power" you could be referring to. Power to do what, with respect to what? OpenAI has useful tools that Stack Overflow wants to apply to its own use cases, and Stack Overflow has good data for training LLMs on. Seems like a straightforward alignment of incentives.

OpenAI has enough motivation to circumvent whatever anti-scraping measures stackoverflow could muster.

I assume stackoverflow's metrics (traffic, number of new questions and answers) are down by an amount they are not happy with, so they are eager to strike any deal before their ship sinks.

At least that's how I read the news piece. Personally, I'm as often on stackoverflow as I've ever been, whereas my chatGPT usage is down to almost zero.

> OpenAI has enough motivation to circumvent whatever anti-scraping measures stackoverflow could muster.

And even greater motivation to just cooperate with StackOverflow for mutual benefit, rather than engage in a ridiculous arms race with them.

> I assume stackoverflow's metrics (traffic, number of new questions and answers) are down by an amount they are not happy with, so they are eager to strike any deal before their ship sinks.

I'm not sure I'd understand the connection to this even if that were true. The value StackOverflow seems to be bringing to the table is specifically a large dataset of human-curated technical knowledge. Both parties in this arrangement would have strong interest in ensuring that StackOverflow continues to generate this data through its user-centric Q&A website. I'm not sure how a deal with OpenAI would prevent their "ship" from "sinking" if that were the situation they were in.

> Personally, I'm as often on stackoverflow as I've ever been, whereas my chatGPT usage is down to almost zero.

Same here. ChatGPT is a nice novelty, but I haven't found all that much productive use for it. Most people I know who do use it regularly are using it for either correcting their spelling/grammar, or as a conversational-interface search engine, neither of which I find to be superior to proofreading my own writing or evaluating information from its original sources after doing a conventional search.

But there might be a value-add for StackOverflow in the latter case: finding specific answers to complex questions can be a hit-or-miss proposition, and ChatGPT might at least provide a more efficient way of finding the articles that answer your questions, if implemented properly.

Of course, implementing it properly would likely involve designing the LLM to track the sources of the data it's tokenizing, and present a 'bibliography' for each of its answers, rather than just blindly compositing data from all sources into single probability values.
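
To make the 'bibliography' idea concrete, here is a rough sketch of the simplest version: a retrieval step that keeps each source URL attached to its text and returns both. This is only an illustration under loud assumptions, not how ChatGPT or any StackOverflow product works. The `Post` type, the `answer_with_sources` function, the word-overlap scoring, and the example.com URLs are all hypothetical, and a real system would put an LLM on top of the retrieved text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    url: str   # hypothetical source URL, carried alongside the text
    text: str


def tokenize(text: str) -> set[str]:
    # Crude lowercase word split; a real system would use a proper tokenizer.
    words = (w.strip(".,?!()").lower() for w in text.split())
    return {w for w in words if w}


def answer_with_sources(question: str, posts: list[Post], top_k: int = 2):
    """Rank posts by word overlap with the question and return the
    matching texts together with the URLs they came from."""
    q = tokenize(question)
    scored = [(len(q & tokenize(p.text)), p) for p in posts]
    # Keep only posts sharing at least one word, best match first.
    hits = sorted((s for s in scored if s[0] > 0),
                  key=lambda s: s[0], reverse=True)[:top_k]
    answer = " ".join(p.text for _, p in hits)
    bibliography = [p.url for _, p in hits]
    return answer, bibliography


# Hypothetical data, for illustration only.
posts = [
    Post("https://example.com/q/1", "Use a reducer to update Redux state."),
    Post("https://example.com/q/2", "Redshift sorts rows using sort keys."),
    Post("https://example.com/q/3", "Redux state should be immutable."),
]
answer, sources = answer_with_sources("How do I update Redux state?", posts)
print(answer)
print("Sources:", sources)
```

The point of the sketch is just that the source never gets separated from the text, so every answer can list where its pieces came from.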

StackOverflow released a data bundle that anyone could use to prevent scraping.
I hope that StackOverflow people understand this. And that they do not panic because their usage/engagement metrics are down quite a bit over the last few years.
Might very well be in panic mode. They're also partnering with Indeed to bring back a new version of StackOverflow Jobs.

https://meta.stackexchange.com/questions/399440/testing-a-ne...

Regarding usage, I was on SO.

I specialize in Amazon Redshift.

I've written a lot of PDFs about Amazon Redshift - serious stuff, deep technical investigations and explanations, published along with the source code which produces the evidence which the PDF is based on - and when people asked questions where I'd written up the answer, I pointed them at the appropriate PDF.

After some months, I received a direct message from the staff, which looked to me like a pro forma, a standard message sent in this situation, saying that I was promoting my site and should not do so. It was well written and polite.

That's fine - I have no problems with that, it's their web-site.

What I did not like, however, and what came over as slimy, was that the staff had also deleted every post I had made.

This was not mentioned, at all, in the well written and polite message, which then of course became disingenuous. If you're going to do something serious like that, you need to tell people, not let them discover it for themselves.

This was for all posts, where I'd explained something directly or pointed to a PDF - presumably it's a standard action SO take in this situation.

I deleted my account and left.

SO corporate has been trying to shoehorn AI into the sites ever since it became the latest buzzword. It's been largely laughably bad and is alienating the community, who don't want it and aren't asking for it.
Can't we continue to use StackOverflow as normal? Wouldn't that normal use case (using the web page) be unencumbered by any AI stuff?
Honestly, it's not clear that SO actually gets anything out of this deal, other than:

> provide attribution to the Stack Overflow community within ChatGPT

...and that didn't seem important enough for OpenAI to bother to mention it on any of their media channels that I've seen.

so, who knows?

It feels like it's a whole lot of nothing to me, and in exchange they're letting OpenAI have all of their Q&A data.

I doubt it will make any significant difference to S/O for most people; and anyone who thinks putting S/O links in a chatGPT response is going to drive traffic back to S/O is kiddddddddddding themselves.

I feel like they are already very similar, in the sense that any answer you read should be assumed wrong at first, and has to prove itself correct before you put anything in your code.
Conversely, if you don't want to use OpenAI and/or SO, you are free to do so. SO has no obligation to continue losing users for your whims.

On top of this, you could say the same about any disruptive technology.

Honestly I barely use stack anymore. I know I'm not the only one and they're losing their lunch just like experts-exchange
Yeah, me too. I don't even entirely understand why I don't use StackOverflow anymore.
I can tell you exactly why my engagement is down with the site. It’s because every time I ask a question, it gets closed as a duplicate by people who clearly haven’t read my question carefully. It’s exhausting and not really worth the effort to fight for it to be reopened.
Yep, knowing this problem well, I asked a question the other day and defensively linked to the other similar questions to explain why they were not duplicates. My question was still closed with the claim of it being a duplicate. Last time I’ll ever bother trying to use SO again.

The decision to close my question in spite of it having a clear technical difference made no sense at all. It honestly felt like a bot that just noticed that a lot of the content of the question was related to other questions, a bot without the ability to understand why the question is literally different.

Why is SO like this these days? Is it just because there is such a large history of content on the site that it’s easy for people who don’t want to think to just mark questions closed?

Sometimes questions get answered despite them being closed. These are often the most useful!
Overzealous moderation, and the average age of a question/answer being like 8 years.

There are very few novel questions, and the ones that are there use outdated APIs.

I've come to use ChatGPT instead.

The reason is that while using SO you generally find similar errors, then read the answers and try to make sense of the problem you are having. That's fantastic, but being able to explicitly state your problem and ask follow-up questions about it is even better.

Yesterday I had to work on a project using Redux. It has been a while since I touched that technology, so I went ahead and gave ChatGPT a summary of it, asking whether my assumptions were correct. From there I gave a couple more explanations, asked a couple of questions, and I was done. I think this ability to probe further with questions is too good a feature to pass up.

moderation there is done so poorly it continues to discourage users from participating while not really slowing down entropy as the site ages and the number of posts grows

moderation there is done so poorly it has become a meme of sorts, so even if and when it improves, any improvement in perception will lag... and because users choose to use the site based on their perception of its value rather than its true value, it has sort of become a vicious cycle

It's full of assholes now and people generally prefer not to be around those.
May I ask what you use instead?
Documentation, GitHub issues, language forums, reddit. Nowadays it seems more often that those resources help me work around the issues I'm encountering rather than stackoverflow. There are also the AI tools that help me easily get answers to the question "how do I do X in language/framework Y"
Not OP, but I’ve been trying to formulate problems in ways that first principles and primary sources (language docs, etc) can answer. It’s more work but also more rewarding and a better learning experience for me.
I feel like they are announcing that OpenAI is going to be getting worse at answering technical questions.

I use OpenAI because StackOverflow answers are just the absolute wrong answer. A combination of gaslighting (you shouldn't be having this problem), dogmatic enforcement of good ideas that started as guidelines, and problematic example code that should not be trusted. You are better off with a reddit thread or a blog post, and much better off with actual documentation. StackOverflow is the thing that causes the bugs and the tech debt in the first place.

At least now OpenAI's competition has a fighting chance, because their models won't be poisoned by SO

If you want to be the only customer of a service, and have them do exactly what you want, you can foot the entire bill.
What is the point of your comment? We are not allowed to complain about a service we don’t own anymore?