Hacker News
by janalsncm 1074 days ago
ChatGPT is the killer app. It’s a Google killer. It is better than the SEO listicle garbage filling the internet. Even if it’s not always accurate it’s still better in a lot of circumstances. There’s a reason Sundar rushed Bard out of the gate even though it is clearly inferior.
3 comments

Step 1. Humans write copy to get humans to buy their garbage; humans counter by tuning out and switching channels.

Step 2. Humans write SEO copy for machines to rank them higher.

Step 3. LLM writes copy for machines to rank them higher.

Step 4. Human uses LLM to try to distill the LLM generated SEO spam for any remaining signal.

Also to your point:

> SEO listicle garbage filling the internet.

The feeling that the LLM is better than what you described is going to be very temporary. After that, the mountains of LLM-generated bullshit are going to be too much for even an LLM to make meaningful sense of.

You're missing the point. If we want to know something, we won't even have to google it; we will just ask an LLM. There will be no market for websites full of it because we can just directly ask it to answer our questions.

The only "if" to all this is if we will destroy the LLMs by feeding them their own diarrhea. I expect a sort of natural selection here to play out, especially in the open source space. Ones that are trained on LLM generated blogspam will probably, I expect, get outperformed by ones that are trained on genuine information, or at the very least ones made using new techniques that adequately filter noise.

> If we want to know something, we won't even have to google it; we will just ask an LLM. There will be no market for websites full of it because we can just directly ask it to answer our questions.

How will it learn anything new?

> Ones that are trained on LLM generated blogspam will probably, I expect, get outperformed by ones that are trained on genuine information, or at the very least ones made using new techniques that adequately filter noise.

Yes, humans are notorious for only seeking out high quality, accurate data, especially when it conflicts with our priors.

To say nothing of our ability to assess the accuracy or truthiness of information in the first place (look at how many people take, on faith, that ChatGPT isn’t wrong as often as it is right).

But there's still no way to get an LLM to only output "fact", because that's not a property of language.
That's also true of a web search engine; but an LLM could (in principle, not saying it's there yet) spot inconsistencies in the source data and notice disagreement.
I’m not following. If ChatGPT gets worse, OpenAI can simply not update it. Or revert to a previous version.

For Google, they’re at the mercy of whatever the internet has.

ChatGPT is also at the mercy of whatever the internet has. Including more and more of what it was used to generate.
It isn’t though. Like I said, if the model gets worse OpenAI can simply not release a new version.

You also have to consider the money angle. As using ChatGPT and other chatbots becomes more popular, people will stop producing garbage internet articles because they will be less popular and therefore less profitable. Bloggers who enjoy writing will continue to do so because it was never about the money, they just enjoy writing.

Further, the internet is only one small portion of information available to train on. There’s a lot of other data out there, including real-world conversations.

> Like I said, if the model gets worse OpenAI can simply not release a new version.

So now it's got great information about the Model T Ford but knows nothing about our new mars colony?

I don't think "just don't update the model" is a likely option.

You don’t update models to add new information. That’s extremely inefficient and susceptible to catastrophic forgetting. If you want the model to have new information you update an offline knowledge base. So yes you can simply not update the model.
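
A minimal sketch of that separation, assuming a toy keyword-overlap retriever and a stand-in for the model. The names `KnowledgeBase` and `frozen_model` are hypothetical, not any real API, and the Mars fact is made up for illustration. The point is only that the "model" function never changes, yet its answer changes once a document is added to the knowledge base.

```python
import re

# Words ignored when matching, so common filler does not count as a hit.
STOPWORDS = {"the", "is", "a", "an", "what", "at", "in", "was", "of"}


def tokenize(text):
    """Lowercase content words; a crude stand-in for a real embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


class KnowledgeBase:
    """Hypothetical offline store that can be updated without retraining."""

    def __init__(self):
        self.docs = []

    def add(self, text):
        self.docs.append(text)

    def search(self, query, top_n=1):
        # Rank documents by how many content words they share with the query.
        q = tokenize(query)
        scored = [(len(q & tokenize(d)), d) for d in self.docs]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [d for _, d in scored[:top_n]]


def frozen_model(question, kb):
    """Stand-in for an LLM with fixed weights: it only relays what was retrieved."""
    hits = kb.search(question)
    return hits[0] if hits else "I don't know."


kb = KnowledgeBase()
kb.add("The Model T Ford was introduced in 1908.")
before = frozen_model("What is new at the Mars colony?", kb)

kb.add("The Mars colony opened a second habitat.")  # made-up fact, illustration only
after = frozen_model("What is new at the Mars colony?", kb)
```

Here `before` is the fallback answer and `after` is the newly added document, with no change to `frozen_model` in between. A real system would swap the keyword overlap for embeddings and pass the retrieved text to an actual model.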
I understand what you are saying but to me it sounds very handwavy and (not to be disrespectful) naive.

How would LLM upstarts be able to counter the massive commercial interests? As with Google, they will also end up preferring money over usefulness, at the latest once they have a wide user base.

The ways of distinguishing spam from signal with LLMs are also even less proven.

And not updating a model means that it will be stuck in the COVID-19 era forever.

I’ll push back on this, at least in its current iteration. I just asked it to list some restaurants near my apartment (major intersection in San Jose) and it wasn’t particularly close. While there are several restaurants less than half a mile from the intersection, ChatGPT listed restaurants from several miles away.

Given the “weights in a matrix” architecture of ChatGPT, I’m not sure it’s possible to store enough data to make the query practical to answer. Say there are a couple hundred intersections in my city. You have to store the token of “restaurant name” “close to” “intersection” for each intersection. I don’t know the size of Google’s Maps DB, but I would guess it’s several Gigabytes per city. From my understanding of the theory, you would need to store BOTH the LLM weights AND the Maps data for ChatGPT to have a shot at generating good answers for that type of query.

I’m happy to be wrong here. If I’m misunderstanding something, please let me know.
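
For what it's worth, here is a rough sketch of what a "restaurants near this intersection" query looks like when it is answered from structured data instead of model weights. The place names and coordinates are invented for illustration and `PLACES` stands in for a maps database; only the haversine formula is standard.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Invented rows standing in for a maps database (not real places).
PLACES = [
    ("Hypothetical Taqueria", 37.3390, -121.8920),
    ("Hypothetical Noodle Bar", 37.3400, -121.8900),
    ("Hypothetical Diner", 37.4000, -121.9500),
]


def nearby(lat, lon, radius_miles=0.5):
    """Return (name, distance) pairs within the radius, nearest first."""
    hits = [(name, haversine_miles(lat, lon, plat, plon)) for name, plat, plon in PLACES]
    return sorted((h for h in hits if h[1] <= radius_miles), key=lambda h: h[1])


# Query point is an arbitrary spot in San Jose, chosen for illustration.
results = nearby(37.3382, -121.8863)
```

The lookup is exact and cheap once the rows exist, which is the part a language model has no reliable way to reproduce from its weights alone.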

Well you’re right since ChatGPT isn’t hooked up to the internet, so certain queries aren’t good use cases. Adding maps info to a language model would be a pretty bad idea (even if it didn’t hallucinate) since it can change at any time, which would require more (expensive) training.

What Bing does is to use your query to search the web and use the top N search results in the context window for the chat.
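
Roughly, that pattern can be sketched like this. It is an illustration under assumptions, not Bing's actual implementation: `search_web` is a hypothetical function returning canned results, and a character budget stands in for the real token limit of the context window.

```python
def search_web(query):
    """Hypothetical search call; returns canned (title, snippet) pairs, best first."""
    return [
        ("Result A", "Snippet about asking for a raise."),
        ("Result B", "Snippet about salary negotiation."),
        ("Result C", "Snippet about performance reviews."),
    ]


def build_prompt(query, results, top_n=2, budget_chars=500):
    """Pack the top N results into the prompt, then append the user's question."""
    header = "Answer using only the numbered sources below."
    footer = f"Question: {query}"
    remaining = budget_chars - len(header) - len(footer)
    sources = []
    for i, (title, snippet) in enumerate(results[:top_n], start=1):
        line = f"[{i}] {title}: {snippet}"
        if len(line) > remaining:
            break  # context is full; lower-ranked results are dropped
        sources.append(line)
        remaining -= len(line)
    return "\n".join([header, *sources, footer])


query = "How can I convince my boss to give me a raise?"
prompt = build_prompt(query, search_web(query), top_n=2)
```

The resulting `prompt` is what would be handed to the chat model, so fresh information arrives through the context window and the weights stay untouched.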

However I’ll push back on your pushback. ChatGPT doesn’t need to be perfect to be a killer app. It is highly flawed. Maybe it was a bit too strident to say ChatGPT will kill Google search, but it’s strictly better for a lot of squishy queries that don’t have a factual basis.

"How can I convince my boss to give me a raise?" gets you a listicle on Google and a highly specific response on ChatGPT. And if some of the advice doesn’t apply, you can continue directing the conversation. It’s an idea generator, even if some of the ideas are bad or don’t make sense.

Tangentially related, I had GPT-4 plan the sightseeing on my latest holiday.

It both picked out the interesting places of note, and then I asked it to plan them in such a way that made sense walking-wise (so I wasn't backtracking) and it did so without a hiccup.

You're not wrong at all, it doesn't know everything.

But it does know a lot of things and can be super useful. Personally I think search engine is a terrible use case, unless you use the Bing-enabled version, or Bing Chat.

I've used it to write pretty complicated scripts where I had no idea what I was doing, rebuild crusty httpd configs from first principles, explain disassembled code, explain regular code, explain configs, read dmidecode and lspci for me and make a pcie slot report... It's bloody brilliant.

Other: read and translated my blood tests. Accurately!

> ChatGPT is the killer app. It’s a Google killer. It is better than the SEO listicle garbage filling the internet.

And yet it hallucinates URLs when I ask it to cite its sources. It's still Google search with a little patience for me.

It is not a direct replacement for search engines, but it will seriously dent their market share.

If you are looking for a location on the internet, use a search engine. LLMs do not memorise the data sources verbatim.

If you want to know how to do something, it will normally give you a better answer than you would find by googling around multiple blogs. No location on the internet needed.