
	by janalsncm 1121 days ago
	ChatGPT is the killer app. It’s a Google killer. It is better than the SEO listicle garbage filling the internet. Even if it’s not always accurate it’s still better in a lot of circumstances. There’s a reason Sundar rushed Bard out of the gate even though it is clearly inferior.

3 comments

jnsaff2 1121 days ago

Step 1. Humans write copy for humans to buy their garbage; humans counter by tuning out and switching channels.

Step 2. Humans write SEO copy for machines to rank them higher.

Step 3. LLM writes copy for machines to rank them higher.

Step 4. Human uses LLM to try to distill the LLM generated SEO spam for any remaining signal.

Also to your point:

> SEO listicle garbage filling the internet.

The feeling that the LLM is better than what you described is going to be very temporary. Then the mountains of LLM-generated bullshit are going to overwhelm even an LLM's ability to make meaningful sense of them.

friend_and_foe 1121 days ago

You're missing the point. If we want to know something, we won't even have to google it; we will just ask an LLM. There will be no market for websites full of it because we can just directly ask it to answer our questions.

The only "if" to all this is if we will destroy the LLMs by feeding them their own diarrhea. I expect a sort of natural selection here to play out, especially in the open source space. Ones that are trained on LLM generated blogspam will probably, I expect, get outperformed by ones that are trained on genuine information, or at the very least ones made using new techniques that adequately filter noise.

mcphage 1121 days ago

> If we want to know something, we won't even have to google it; we will just ask an LLM. There will be no market for websites full of it because we can just directly ask it to answer our questions.

How will it learn anything new?

jmye 1121 days ago

> Ones that are trained on LLM generated blogspam will probably, I expect, get outperformed by ones that are trained on genuine information, or at the very least ones made using new techniques that adequately filter noise.

Yes, humans are notorious for only seeking out high quality, accurate data, especially when it conflicts with our priors.

To say nothing of our ability to assess the accuracy or truthiness of information in the first place (look at how many people take, on faith, that ChatGPT isn’t wrong as often as it is right).

mrguyorama 1121 days ago

But there's still no way to get an LLM to only output "fact", because that's not a property of language.

ben_w 1120 days ago

That's also true of a web search engine; but an LLM could (in principle, not saying it's there yet) spot inconsistencies in the source data and notice disagreement.

janalsncm 1121 days ago

I’m not following. If ChatGPT gets worse, OpenAI can simply not update it. Or revert to a previous version.

For Google, they’re at the mercy of whatever the internet has.

jnsaff2 1121 days ago

ChatGPT is also at the mercy of whatever the internet has. Including more and more of what it was used to generate.

janalsncm 1121 days ago

It isn’t though. Like I said, if the model gets worse OpenAI can simply not release a new version.

You also have to consider the money angle. As using ChatGPT and other chatbots becomes more popular, people will stop producing garbage internet articles because they will be less popular and therefore less profitable. Bloggers who enjoy writing will continue to do so because it was never about the money, they just enjoy writing.

Further, the internet is only one small portion of information available to train on. There’s a lot of other data out there, including real-world conversations.

Volundr 1121 days ago

> Like I said, if the model gets worse OpenAI can simply not release a new version.

So now it's got great information about the Model T Ford but knows nothing about our new Mars colony?

I don't think "just don't update the model" is a likely option.

janalsncm 1121 days ago

You don’t update models to add new information. That’s extremely inefficient and susceptible to catastrophic forgetting. If you want the model to have new information you update an offline knowledge base. So yes you can simply not update the model.
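A minimal sketch of the idea in the comment above, assuming a toy setup: the model stays frozen while a separate knowledge base is edited. `FrozenModel`, `KnowledgeBase`, `ask`, and the Mars colony fact are all hypothetical stand-ins for illustration, not how any particular product works.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class FrozenModel:
    """Hypothetical stand-in for a trained LLM; it is never retrained here."""
    version: str

    def answer(self, question: str, facts: str) -> str:
        # A real model would generate text conditioned on the facts.
        # This stub simply returns the facts it was handed, if any.
        return facts if facts else "I don't know."


@dataclass
class KnowledgeBase:
    """Hypothetical offline store that can change without touching the model."""
    entries: Dict[str, str] = field(default_factory=dict)

    def update(self, topic: str, fact: str) -> None:
        self.entries[topic.lower()] = fact

    def lookup(self, question: str) -> str:
        # Return every stored fact whose topic appears in the question.
        q = question.lower()
        return " ".join(fact for topic, fact in self.entries.items() if topic in q)


def ask(model: FrozenModel, kb: KnowledgeBase, question: str) -> str:
    return model.answer(question, kb.lookup(question))


model = FrozenModel(version="v1")
kb = KnowledgeBase()
before = ask(model, kb, "What do we know about the Mars colony?")

# New information arrives: only the knowledge base changes.
kb.update("Mars colony", "Placeholder fact: a Mars colony exists.")
after = ask(model, kb, "What do we know about the Mars colony?")
```

Here `before` is the fallback answer and `after` contains the new fact, while `model.version` is still `"v1"`. Whether this works well in practice depends on how good the lookup step is, which this sketch does not address.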

jnsaff2 1121 days ago

I understand what you are saying but to me it sounds very handwavy and (not to be disrespectful) naive.

How would LLM upstarts be able to counter the massive commercial interests? As with Google, they will also succumb to preferring money over usefulness, at the latest once they have a wide user base.

There is also an even less proven way of distinguishing spam from signal with LLMs.

And not updating a model means that it will be stuck in the COVID-19 era forever.

distortedsignal 1121 days ago

I’ll push back on this, at least in its current iteration. I just asked it to list some restaurants near my apartment (major intersection in San Jose) and it wasn’t particularly close. While there are several restaurants less than half a mile from the intersection, ChatGPT listed restaurants from several miles away.

Given the “weights in a matrix” architecture of ChatGPT, I’m not sure it’s possible to store enough data to make the query practical to answer. Say there are a couple hundred intersections in my city. You have to store the tokens for “restaurant name” “close to” “intersection” for each intersection. I don’t know the size of Google’s Maps DB, but I would guess it’s several gigabytes per city. From my understanding of the theory, you would need to store BOTH the LLM weights AND the Maps data for ChatGPT to have a shot at generating good answers for that type of query.

I’m happy to be wrong here. If I’m misunderstanding something, please let me know.

janalsncm 1121 days ago

Well, you’re right: ChatGPT isn’t hooked up to the internet, so certain queries aren’t good use cases. Adding maps info to a language model would be a pretty bad idea (even if it didn’t hallucinate) since it can change at any time, which would require more (expensive) training.

What Bing does is to use your query to search the web and use the top N search results in the context window for the chat.
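A rough sketch of that pattern, assuming a toy in-memory document list instead of a real web search and a simple word-overlap score instead of a real ranker. The function names, the sample documents, and the prompt wording are all made up for illustration; this is not Bing's actual pipeline.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    # Lowercase the text and keep alphanumeric words only.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_n(query: str, documents: List[str], n: int = 2) -> List[str]:
    """Rank documents by how many query words they share; keep the best n."""
    q = tokenize(query)
    scored = [(len(q & tokenize(doc)), i, doc) for i, doc in enumerate(documents)]
    # Highest overlap first; ties keep the original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for score, _, doc in scored[:n] if score > 0]


def build_prompt(query: str, documents: List[str], n: int = 2) -> str:
    """Place the top-n results in the context ahead of the user's question."""
    results = top_n(query, documents, n)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(results))
    return (
        f"Search results:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the search results above."
    )


# Hypothetical documents standing in for search results.
docs = [
    "Taqueria Uno is a restaurant near First and Main.",
    "The Model T was produced by Ford.",
    "Cafe Dos is a restaurant two blocks from First and Main.",
]
prompt = build_prompt("restaurant near First and Main", docs)
```

The resulting `prompt` contains the two restaurant entries and leaves out the unrelated one. The language model only ever sees this prompt, so fresh information reaches it through the search step rather than through retraining.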

However, I’ll push back on your pushback. ChatGPT doesn’t need to be perfect to be a killer app. It is highly flawed. Maybe it was a bit too strident to say ChatGPT will kill Google search, but it’s strictly better for a lot of squishy queries that don’t have a factual basis.

“How can I convince my boss to give me a raise?” gets you a listicle on Google and a highly specific response on ChatGPT. And if some of the advice doesn’t apply, you can continue directing the conversation. It’s an idea generator, even if some of the ideas are bad or don’t make sense.

Nerada 1121 days ago

Tangentially related, I had GPT-4 plan the sightseeing on my latest holiday.

It picked out the interesting places of note, and when I asked it to order them so the route made sense walking-wise (so I wasn't backtracking), it did so without a hiccup.

literalAardvark 1121 days ago

You're not wrong at all, it doesn't know everything.

But it does know a lot of things and can be super useful. Personally I think search is a terrible use case, unless you use the Bing-enabled version or Bing Chat.

I've used it to write pretty complicated scripts where I had no idea what I was doing, rebuild crusty httpd configs from first principles, explain disassembled code, explain regular code, explain configs, read dmidecode and lspci for me and make a pcie slot report... It's bloody brilliant.

Other: read and translated my blood tests. Accurately!

thefz 1121 days ago

> ChatGPT is the killer app. It’s a Google killer. It is better than the SEO listicle garbage filling the internet.

And yet it hallucinates URLs when I ask it to cite its sources. It's still Google search with a little patience for me.

MattPalmer1086 1121 days ago

It is not a direct replacement for search engines, but it will seriously dent their market share.

If you are looking for a location on the internet, use a search engine. LLMs do not memorise the data sources verbatim.

If you want to know how to do something, it will normally give you a better answer than you would find by googling around multiple blogs. No location on the internet needed.
