[quote]BULLSHIT involves language or other forms of communication intended to appear authoritative or persuasive without regard to its actual truth or logical consistency.[/quote]
That is an emotionally manipulative definition, and it anthropomorphizes the LLMs. They don't "intend" anything; they're not trying to trick you or sound persuasive.
I've never used Claude, but Perplexity often says that no definitive information about a topic could be found, and then tries to make some generalized inferences. There's a difference between a specific implementation, and the technology in general.
In any case, it's worthwhile for people to understand the limitations of the technology as it exists today. But calling it "bullshit" is a mischaracterization, one I believe is based on an emotional need to feel superior and to dismiss the capabilities more thoroughly than they deserve.
It's a little like someone saying in the industrial revolution, "the steam shovel is too rigid, it will NEVER have the dexterity of a man with a shovel!" While that is true and important to know, it focuses on the wrong thing: it misses the advantages while amplifying the negatives.
As the technology exists today, it is imperfect, often prone to mistakes, and unable to relay confidence levels. These problems may be addressed in future implementations.
That's the same message, without any emotional baggage, or overly dismissive tone.
> Being persuasive (i.e., churn out convincing prose) is how LLMs were designed to be.
No. They were designed to churn out accurate prose that accurately reflects their model of reality. They're just imperfect. Using the term bullshit is cynical and emotional. And again, it anthropomorphizes the LLM; it implies agency.
> that accurately reflects their model of reality.
you are also seemingly anthropomorphising the technology by assigning to it some concept of having a “model of reality”.
LLM systems output an inference of the next most likely token, given: the input prompt, the model weights and the previously output tokens [0].
that is all. no models of reality involved. “it” doesn’t “know” or “model” anything about “reality”. the systems are just fancy probability maths pipelines.
probably generally best to avoid using the word “they” in these discussions. the english language sucks sometimes. :shrug:
[0]: yes i know it is a bit more complicated than that.
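A minimal sketch of the next-token step described above, assuming a toy four-word vocabulary and made-up scores (`vocab` and `logits` are hypothetical; a real model computes its scores from the weights and the full context, which is not shown here). It only illustrates the "scores to probabilities to one sampled token" part:

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token(vocab, logits, rng):
    """Sample one token in proportion to its softmax probability."""
    probs = softmax(logits)
    token = rng.choices(vocab, weights=probs, k=1)[0]
    return token, probs

# Hypothetical vocabulary and scores, standing in for a real model's output.
vocab = ["cat", "dog", "the", "."]
logits = [2.0, 1.0, 0.5, -1.0]

token, probs = next_token(vocab, logits, random.Random(0))
print(token, [round(p, 3) for p in probs])
```

Generating a whole reply is this step in a loop, with each sampled token appended to the context before the next scores are computed.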
It literally has a mathematical model that maps what would, colloquially at least, be known as reality. What exactly do you think those math pipelines represent? They're not arbitrary numbers; they are generated from actual data that is generated by reality. There's no anthropomorphizing at all.
a corpus of training data from the internet is finite.
any finite number divided by infinity ends up tending towards zero.
so, mathematically at least, the training data is not a sufficient sample of reality because the proportion of reality being sampled is basically always zero!
fun with maths ;)
> What exactly do you think those math pipelines represent?
probability distributions of human language, in the case of text only LLMs.
which is a very small subset of stuff in reality.
-
also, training data scraped from the public internet is a woeful representation of “reality” if you ask me.
that’s why i think LLMs are bullshit machines. the systems are built on other people’s bullshit posted on the public internet. we get bullshit out because we made a bunch of bullshit. it’s just a feedback loop.
(some of the training data is not bullshit. but there is a lot of bullshit in there).
Where in the loss function of LLM training is the relationship between their model of reality and their predicted tokens? Any internal model an LLM has is an emergent property of its underlying training.
(And, given the way instruct/chat models are finetuned, I would say convincing/persuasive is very much the direction they are biased)
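To make the loss-function point concrete, here is a rough sketch of the usual next-token cross-entropy objective, assuming a hypothetical two-word distribution after the prompt "The capital of France is" (the numbers are invented, and finetuning or RLHF objectives are not covered). The loss depends only on the probability given to whatever token the training text contains, not on whether that token is true:

```python
import math

def cross_entropy(probs, target_index):
    """Loss for one position: negative log-probability of the observed token."""
    return -math.log(probs[target_index])

# Hypothetical model output over ["Paris", "Lyon"].
probs = [0.9, 0.1]

# The same prediction is scored purely against what the corpus happened to say.
loss_if_corpus_says_paris = cross_entropy(probs, 0)
loss_if_corpus_says_lyon = cross_entropy(probs, 1)

print(round(loss_if_corpus_says_paris, 3), round(loss_if_corpus_says_lyon, 3))
```

If the corpus said "Lyon", training would push the model toward "Lyon"; nothing in this term checks the claim against the world.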
No, neither the lesson nor the quote is anthropomorphizing LLMs. It is not the LLM that "intends"; it is the people who design the systems and those who make or provide the training data. In the LLM systems used today, the RLHF process especially is used to steer towards plausible, confident and authoritative-sounding output, with little to no priority for correctness or truth.