by ta8645 496 days ago
That is an emotionally manipulative definition and anthropomorphizes the LLMs. They don't "intend" anything, they're not trying to trick you, or sound persuasive.

They address this in lesson 2:

> According to philosopher Harry Frankfurt, a liar knows the truth and is trying to lead us in the opposite direction.

> A bullshitter either doesn't know the truth, or doesn't care. They are just trying to be persuasive.

Being persuasive (i.e., churn out convincing prose) is how LLMs were designed to be.

Some pushback on this, but it remains true.

Easy to see when - for example - Claude gushes about how great all your ideas are.

Also the stark absence of "I don't know."

I've never used Claude, but Perplexity often says that no definitive information about a topic could be found, and then tries to make some generalized inferences. There's a difference between a specific implementation, and the technology in general.

In any case, it's worthwhile for people to understand the limitations of the technology as it exists today. But calling it "bullshit" is a mischaracterization, one I believe is based on an emotional need for us to feel superior and to dismiss the capabilities more thoroughly than they deserve.

It's a little like someone saying in the industrial revolution, "the steam shovel is too rigid, it will NEVER have the dexterity of a man with a shovel!". While that is true and important to know, it focuses on the wrong thing: it misses the advantages while amplifying the negatives.

If not bullshit then what would you call it?

As the technology exists today: imperfect, often prone to mistakes, and unable to relay confidence levels. These problems may be addressed in future implementations.

That's the same message, without any emotional baggage, or overly dismissive tone.

That would be great if those who are selling the technology described it that way. I, and apparently others, feel like maybe "bullshit" is a better counter to the current marketing for LLMs.

This is patently false. They are trained to generate correct responses.

Then comes the question of what is a correct response...

ps: I fail to detect whether your comment was ironic or not.

There are different criteria in use for that. But sycophantic behavior is not the goal. It's something model builders actively try to prevent.

> Being persuasive (i.e., churn out convincing prose) is how LLMs were designed to be.

No. They were designed to churn out accurate prose that accurately reflects their model of reality. They're just imperfect. You're being cynical and emotional to use the term bullshit. And again, it anthropomorphizes the LLM, it implies agency.

> that accurately reflects their model of reality.

you are also seemingly anthropomorphising the technology by assigning to it some concept of having a “model of reality”.

LLM systems output an inference of the next most likely token, given: the input prompt, the model weights and the previously output tokens [0].

that is all. no models of reality involved. “it” doesn’t “know” or “model” anything about “reality”. the systems are just fancy probability maths pipelines.

probably generally best to avoid using the word “they” in these discussions. the english language sucks sometimes. :shrug:

[0]: yes i know it is a bit more complicated than that.
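
To make the "next most likely token" step concrete, here is a minimal sketch, not how any particular model is implemented. It assumes the model has already produced one raw score (logit) per vocabulary entry; the toy vocabulary, the scores, and the names `softmax` and `next_token` are all made up for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token(vocab, logits, temperature=1.0, rng=None):
    """Pick the next token from hypothetical per-token scores.

    temperature == 0 means greedy decoding (always the top-scoring token);
    otherwise the token is sampled from the softmax distribution.
    """
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return vocab[best]
    probs = softmax([x / temperature for x in logits])
    rng = rng or random
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical scores for the token following "The capital of France is"
vocab = ["Paris", "London", "banana"]
logits = [3.0, 1.0, -2.0]
print(next_token(vocab, logits, temperature=0))
```

Nothing in this step checks the chosen token against the world. Whether the scores themselves encode something about the world is the part being argued about in this thread.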

> no models of reality involved.

It literally has a mathematical model that maps what would, colloquially at least, be known as reality. What exactly do you think those math pipelines represent? They're not arbitrary numbers; they are generated from actual data that is generated by reality. There's no anthropomorphizing at all.

reality is infinite.

a corpus of training data from the internet is finite.

any finite number divided by infinity ends up tending towards zero.

so, mathematically at least, the training data is not a sufficient sample of reality because the proportion of reality being sampled is basically always zero!

fun with maths ;)

> What exactly do you think those math pipelines represent?

probability distributions of human language, in the case of text only LLMs.

which is a very small subset of stuff in reality.

-

also, training data scraped from the public internet is a woeful representation of “reality” if you ask me.

that’s why LLMs i think are bullshit machines. the systems are built on other people’s bullshit posted on the public internet. we get bullshit out because we made a bunch of bullshit. it’s just a feedback loop.

(some of the training data is not bullshit. but there is a lot of bullshit in there).

You're really missing the point and getting lost in definitions. The entire point of human language is to model reality. Just because it is limited, inexact, and imperfect does not disqualify it as a model of reality.

Since LLMs are directly based on that language, they are definitely based on and are a model of reality. Are they perfect? No. Are they limited? Yes. Are they "bullshit"? Only to someone who is judging emotionally.

> probably generally best to avoid using the word “they” in these discussions. the english language sucks sometimes.

Thanks for this specific sentence.

Subscribed to your RSS feed. Although I will never know for sure if a human being posts there or a bot of some sort.

Where in the loss function of LLM training is the relationship between their model of reality and their predicted tokens? Any internal model an LLM has is an emergent property of their underlying training.

(And, given the way instruct/chat models are finetuned, I would say convincing/persuasive is very much the direction they are biased)

> Where in the loss function of LLM training is the relationship between their model of reality and their predicted tokens?

In the part where their loss function is to predict text that humans would consider a sensible completion, in a fully general sense of that goal.

"Makes sense to a human" is strongly correlated to reality as observed and understood by humans.
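
For reference, the pretraining objective usually described for these models is cross-entropy on the next token: the loss is the negative log of the probability the model gave to whatever token actually came next in the training text. A minimal sketch, with made-up probabilities and a hypothetical `cross_entropy` helper:

```python
import math


def cross_entropy(probs, target_index):
    """Loss for one prediction: -log of the probability assigned
    to the token that actually appeared next in the training text."""
    return -math.log(probs[target_index])


# Hypothetical model output over the vocabulary ["round", "flat"]
# for the context "the earth is".
probs = [0.9, 0.1]

# If the training document continues with "round", the loss is low.
print(cross_entropy(probs, 0))

# If the training document continues with "flat", the same prediction
# is penalized heavily, because the target is the text, not the fact.
print(cross_entropy(probs, 1))
```

The only target in this formula is the text itself, so any link to reality comes indirectly, through how well the training text tracks reality.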

No, neither the lesson nor the quote is anthropomorphizing LLMs. It is not the LLM that "intends", it is the people who design the systems and those who make/provide the training data. In the LLM systems used today, the RLHF process especially is used to steer towards plausible, confident and authoritative-sounding output, with little to no priority for correctness/truth.