by ta8645 496 days ago
> Being persuasive (i.e., churn out convincing prose) is how LLMs were designed to be.

No. They were designed to churn out accurate prose that accurately reflects their model of reality. They're just imperfect. Using the term bullshit is cynical and emotional. And again, it anthropomorphizes the LLM; it implies agency.


> that accurately reflects their model of reality.

you are also seemingly anthropomorphising the technology by assigning to it some concept of having a “model of reality”.

LLM systems output an inference of the next most likely token, given: the input prompt, the model weights and the previously output token [0].

that is all. no models of reality involved. “it” doesn’t “know” or “model” anything about “reality”. the systems are just fancy probability maths pipelines.

probably generally best to avoid using the word “they” in these discussions. the english language sucks sometimes. :shrug:

[0]: yes i know it is a bit more complicated than that.
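A minimal sketch of the next-token loop described in the comment above, under loud assumptions: `TOY_LOGITS` is a made-up lookup table standing in for model weights, and it conditions only on the previous token, whereas a real LLM scores the next token from the whole context with a neural network. `softmax` and `generate` are illustrative names, not any library's API.

```python
import math

# Hypothetical toy "weights": unnormalised scores (logits) for the next token,
# keyed by the previous token only. This is an assumption for illustration.
TOY_LOGITS = {
    "<s>": {"the": 2.0, "a": 1.0},
    "the": {"cat": 1.5, "dog": 1.0, "</s>": -1.0},
    "a": {"cat": 0.5, "dog": 1.5, "</s>": -1.0},
    "cat": {"sat": 2.0, "</s>": 0.5},
    "dog": {"sat": 1.0, "</s>": 1.5},
    "sat": {"</s>": 3.0},
}


def softmax(logits):
    """Turn a dict of scores into a probability distribution."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def generate(prompt, max_new_tokens=10):
    """Greedy decoding: repeatedly append the most probable next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = softmax(TOY_LOGITS[tokens[-1]])
        next_token = max(probs, key=probs.get)
        if next_token == "</s>":  # end-of-sequence marker stops generation
            break
        tokens.append(next_token)
    return tokens


print(generate(["<s>"]))
```

Real systems usually sample from the distribution rather than always taking the top token; greedy selection is used here only to keep the sketch deterministic.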

> no models of reality involved.

It literally has a mathematical model that maps what would, colloquially at least, be known as reality. What exactly do you think those math pipelines represent? They're not arbitrary numbers; they are generated from actual data that is generated by reality. There's no anthropomorphizing at all.

reality is infinite.

a corpus of training data from the internet is finite.

any finite number divided by infinity ends up tending towards zero.

so, mathematically at least, the training data is not a sufficient sample of reality because the proportion of reality being sampled is basically always zero!

fun with maths ;)
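The sampling-fraction argument above can be written as a limit. The symbols are introduced here for illustration only: n is the fixed, finite size of the training corpus and N is the size of "reality", allowed to grow without bound.

```latex
\lim_{N \to \infty} \frac{n}{N} = 0 \qquad \text{for any fixed finite } n
```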

> What exactly do you think those math pipelines represent?

probability distributions of human language, in the case of text only LLMs.

which is a very small subset of stuff in reality.

-

also, training data scraped from the public internet is a woeful representation of “reality” if you ask me.

that’s why i think LLMs are bullshit machines. the systems are built on other people’s bullshit posted on the public internet. we get bullshit out because we made a bunch of bullshit. it’s just a feedback loop.

(some of the training data is not bullshit. but there is a lot of bullshit in there).

You're really missing the point and getting lost in definitions. The entire point of human language is to model reality. Just because it is limited, inexact, and imperfect does not disqualify it as a model of reality.

Since LLMs are directly based on that language, they are definitely based on and are a model of reality. Are they perfect? No. Are they limited? Yes. Are they "bullshit"? Only to someone who is judging emotionally.

and herein lies the rub.

> The entire point of human language is to model reality.

is it? are you absolutely certain of that fact? is language not something that actually has a variety of purposes?

fiction novels usually do not describe our reality, but imagined realities. they use language to convey ideas and concepts that do not necessarily exist in the real world.

ref: Philip K. Dick.

> Since LLMs are directly based on that language, they are definitely based on and are a model of reality.

so LLMs are an approximation of an approximate model of reality? sounds like the statistical equivalent of taking an average of averages!

i am playing with you a bit here. but hopefully you see what i’m getting at.

by approximating something that’s approximate to start with, we end up with something that’s even more approximate (less accurate), but easier than doing it ourselves.

which is the whole USP of these things. why think about things when ChatGPT can output some approximation of what you might want?
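For readers unfamiliar with the "average of averages" remark above, here is a small sketch of the statistical pitfall it alludes to. The numbers are hypothetical; the point is only that averaging group averages can drift from the true overall average when group sizes differ (here 35 versus 20).

```python
from statistics import mean

# Hypothetical data: two groups of very different sizes.
group_a = [10, 10, 10, 10, 10, 10, 10, 10]  # 8 values, mean 10
group_b = [50, 70]                          # 2 values, mean 60

# Mean over all ten raw values.
overall_mean = mean(group_a + group_b)

# Mean of the two group means, which ignores how large each group is.
mean_of_means = mean([mean(group_a), mean(group_b)])

print(overall_mean)
print(mean_of_means)
```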

> imagined realities.

Imagined realities are a real part of reality.

> so LLMs are an approximation of an approximate model of reality?

Yes, and we as humans have a mental model that is just an approximation of reality. And we read books that are just an approximation of another human's approximation of reality. Does that mean that we are bullshit because we rely on approximations of approximations?

You're being way too pedantic and dismissive. Models are models, regardless of how limited and imperfect they are.

> probably generally best to avoid using the word “they” in these discussions. the english language sucks sometimes.

Thanks for this specific sentence.

Subscribed to your RSS feed. Although I will never know for sure if a human being posts there or a bot of some sort.

Where in the loss function of LLM training is the relationship between their model of reality and their predicted tokens? Any internal model an LLM has is an emergent property of their underlying training.

(And, given the way instruct/chat models are finetuned, I would say convincing/persuasive is very much the direction they are biased in.)
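For reference on the loss function asked about above: the standard pretraining objective for text LLMs is next-token cross-entropy, the average of -log p(observed token). The sketch below computes it for hand-written probabilities; `next_token_loss` and the example distributions are hypothetical, and the extra objectives used when finetuning instruct/chat models are not shown.

```python
import math


def next_token_loss(predicted_probs, observed_tokens):
    """Average cross-entropy: -mean(log p(observed token)).

    predicted_probs: one dict per position, mapping token -> probability.
    observed_tokens: the token that actually came next in the training text.
    """
    losses = [
        -math.log(probs[token])
        for probs, token in zip(predicted_probs, observed_tokens)
    ]
    return sum(losses) / len(losses)


# Hypothetical predictions for two positions in a training text.
confident = [{"cat": 0.9, "dog": 0.1}, {"sat": 0.8, "ran": 0.2}]
unsure = [{"cat": 0.5, "dog": 0.5}, {"sat": 0.5, "ran": 0.5}]
observed = ["cat", "sat"]

# The loss is lower when more probability is put on the observed tokens.
print(next_token_loss(confident, observed))
print(next_token_loss(unsure, observed))
```

The only training signal in this objective is which token appeared next in the text. Any internal model shows up indirectly, through whatever helps predict those tokens.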

> Where in the loss function of LLM training is the relationship between their model of reality and their predicted tokens?

In the part where their loss function is to predict text that humans would consider a sensible completion, in a fully general sense of that goal.

"Makes sense to a human" is strongly correlated to reality as observed and understood by humans.