by dmw_ng 1221 days ago
I have been using it as a search replacement for most of the past month and only found two subtly wrong answers. This covers legal questions, researching product differences, wiring diagrams, suggesting books to read, correcting misremembered quotes, and about a hundred other tasks.

Of course I'm still relying on Google in the background, but increasingly rarely, and I presume all the negative commentary we've been seeing online comes from folk who simply haven't tested it in anger yet. Today's ChatGPT hallucination is yesterday's Google blogspam, etc. Folk for some reason continue to act like the old world was perfect. This is much closer to perfection than anything we ever had, and infinitely more comprehensive. Google as we knew it is already dead, because the medium Google was built for just got made obsolete. This is far closer to a new Internet iteration (WAIS, FTP, Gopher, HTTP, Web2.0, ...) than it is a new search engine.

Now watch as the search engines try to adapt it to their recency-biased ads model and fail miserably, as what we have is already better than what they were able to sell. It's very unclear whether Bing, Google, or anyone you've heard of will win this round. It's suddenly a very exciting time in tech again.

Another aspect I find very exciting is that these effectively represent a return to a curation-driven Internet: selection of input data for model training is probably an interesting new form of diversification. Who cares about having a site on the World Wide Web if it's not part of the inputs for the language models used by millions of users? That's a completely new structure for the dissemination of ideas, marketing, "SEO" etc., and a brand new form of mass media.

6 comments

I don't know what you've been searching for that you've only found two subtly wrong answers. It frequently gives me incorrect answers, some of which are subtle and some of which are obvious. It's given me incorrect code, told me about incorrect APIs, explained deep learning concepts incorrectly, given me wrong answers about science-related questions, made up characters wholesale when I asked it about Irish mythology, given me made-up facts about (admittedly niche) philosophers.

I'm glad you've found use out of it, but I can't imagine using it as a search replacement for my use cases.

Edit: And I don't see why it would be surprising that ChatGPT doesn't have all of the answers. The underlying model is much, much smaller than what it would take to encode all of the knowledge it was trained on. It's going to make things up a lot of the time (since it's not good at remaining silent).

Exactly my experience. And if you point out the errors, often it will correct itself (most of the time) and explain why it was incorrect before (sometimes).

I'm going to echo other people's skepticism and give a concrete example that's easy to reproduce and which has virtually no dependence on real experience in the physical world. Try asking it about public transit wayfinding trivia. Pure text matching, well-defined single letter / digit service names, closed system of semantic content. All there is are services and stations, and each service is wholly defined by the list of stations it stops at and each station is wholly defined by the list of services that stop at it. This should be a language model's bread and butter. No complexity, no outside context, just matching lists of text together.

I talked to it about the NYC subway. Every time I nudged it with a prompt to fix a factual error or omission, it would revise something I didn't ask for and introduce new errors. It was inconsistent in astounding ways. Ask it what stations the F and A have in common twice and you'll get two wrong answers. Ask it to make a list putting services in categories, it will put the same service into more than one contradictory category. Point this out, it will remake the list and forget to include that service entirely. And that's when it isn't confidently bullshitting about which trains share track and which direction they travel.
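
For reference, the kind of question described above reduces to a plain list/set lookup. Here is a minimal sketch in Python. The service letters and station names are made up for illustration and are not the real NYC subway map, and the helper names `stations_index` and `common_stations` are hypothetical.

```python
# Toy data only: these service letters and station names are invented for
# illustration and are NOT the real NYC subway map.
services = {
    "X": ["Alpha St", "Bravo Sq", "Charlie Av", "Delta Blvd"],
    "Y": ["Bravo Sq", "Echo Pl", "Delta Blvd", "Foxtrot Rd"],
    "Z": ["Golf St", "Echo Pl"],
}


def stations_index(services):
    """Invert service -> stations into station -> set of services."""
    index = {}
    for service, stops in services.items():
        for stop in stops:
            index.setdefault(stop, set()).add(service)
    return index


def common_stations(services, a, b):
    """Return the stations served by both a and b, in a's stop order."""
    b_stops = set(services[b])
    return [stop for stop in services[a] if stop in b_stops]


print(common_stations(services, "X", "Y"))            # ['Bravo Sq', 'Delta Blvd']
print(sorted(stations_index(services)["Echo Pl"]))    # ['Y', 'Z']
```

A deterministic lookup like this gives the same answer every time it is asked, which is the consistency the comment found missing.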

Bullshit is worse than a lie. For a lie is the opposite of the truth and thus always uncovered. But bullshit is uncorrelated with the truth, and may thus turn out to be right, and may thus cause you to trust the word of the bullshitter far more than they deserve.

I've been spending some time trying to get a sense of how it works by exploring where it fails. When it makes a mistake, you can ask questions using the Socratic method until it says the true counterpart to its mistake. It doesn't comment on noticing a discrepancy even if you try to get it to reconcile its previous answer with the corrected version that you guided it to. If you ask specifically about the discrepancy it will usually deny the discrepancy entirely or double-down on the mistake. In the cases where it eventually states the truth through this process, asking the original question that you started with will cause it to state the false version again, despite obviously contradicting what it said in the immediately previous answer.

ChatGPT is immune to the Socratic method. It's like it has a model of the world that was developed by processing its training data but it is unable to improve its conceptual model over the course of a conversation.

These are not the kinds of logical failures that a human would make. It may be the most naturalistic computing system we've ever seen but when pushed to its limits it does not "think" like a human at all.

> If you ask specifically about the discrepancy it will usually deny the discrepancy entirely or double-down on the mistake.

I have had the exact opposite experience. I pasted error messages from code it generated, I corrected its Latin grammar, and I pointed out contradictions in its factual statements in a variety of ways. Every time, it responded with a correction and (the same) apology.

This makes me wonder if we got different paths in an A/B test.

How the hell does one A/B test a language model that even the designers don’t fully understand?

Of course, I’m sure that once you start plugging engagement metrics into the model and the model itself conducts A/B tests on its output… hoo boy….

I pasted error messages from code it generated, and eventually it just kept generating the same compiler error. When I applied the "Socratic method" and explained the answer to it based on Stack Overflow answers, it would at first pretend to understand by transforming the relevant documentation I had pasted in, but once I asked it the original question, it basically ignored all the progress and kept creating the same code with the same compiler errors.

It's incredible at writing rich and persuasive comments that take the momentum out of bigoted Facebook posts. An extended family member is unfortunately all aboard the election fraud and "groomer" trains, posting absurd and hateful stuff constantly every day (in classic Facebook style, many of these posts "do not violate the community guidelines"). I and a couple of other younger members of the family have taken to using ChatGPT to gently but firmly counter every lie and misdirection he tries to make. I'm not sure if it's deeply changed his mind or heart yet, but he posts much less extremist content now and has actually resumed posting wholesome and funny things like he did before going down the rabbit hole.

It’s nice to get quick in-context answers about concepts and their relationships. Sometimes I have a vague notion, and ChatGPT resolves my hunch quite quickly without my having to read through a (sometimes ad-spammed) article.

Google should be concerned.

Your entire post is questionable the moment you write something like "Google as we knew it is already dead".

Yeah, no.

Yeah, it is.

I can't even augment ChatGPT with Google results. How can it be a replacement for Google?

When I ask it for things that are obviously on Stack Overflow but hard to spot or understand because they are pointlessly clever or weird, it is nigh unusable. It is a complete waste of time. Even if you paste in the Stack Overflow answers it will take some iterating, and at that point I am teaching an unteachable AI.