


	by lsy 376 days ago
	The example given for inverting an embedding back to text doesn't help the idea that this effect is reflecting some "shared statistical model of reality": What would be the plausible whalesong mapping of "Mage (foaled April 18, 2020) is an American Thoroughbred racehorse who won the 2023 Kentucky Derby"? There isn't anything core to reality about Kentucky, its Derby, the Gregorian calendar, America, horse breeds, etc. These are all cultural inventions that happen to have particular importance in global human culture because of accidents of history, and are well-attested in training sets. At best we are seeing some statistical convergence on training sets because everyone is training on the same pile and scraping the barrel for any differences.

5 comments

jychang 376 days ago

I fail to see how that matters. You're implying that all reality is cultural, but that seems irrelevant. The same thing would apply to scientific facts, but whales not having a word for science doesn't make it not real.

If we had somehow discovered LLMs right after Newton discovered the theory of gravity, and then a while later Einstein discovered General Relativity, GR would not have been in the training set of the neural net. That doesn't make GR any less of a description of reality! You also can't convert General Relativity into whalesong!

But you CAN explain General Relativity in English, or in Chinese to a person in China. So the fact that we can create a mapping from the concept of General Relativity in the neural network of the brain of a human in the USA using English, to someone in China using Chinese, to an ML model, is what makes it a shared statistical model of reality.

You also can't convert General Relativity to the language of "infant babble"; does that make General Relativity any less real?

zer00eyz 376 days ago

> You're implying that all reality is cultural,

Let's look at two examples of cultural reality:

Fan death in South Korea, where people believe that a fan running while you sleep can kill you.

The book "Pure, White and Deadly", where we discredited the author and his findings and spent decades blaming fat, while packing on the pounds with high fructose corn syrup.

An LLM isn't going to find some intrinsic truth, that we are ignoring, in its data set. An LLM isn't going to find issues in the reproducibility / replication crisis. I have not seen one discredit a scientific paper with its own findings.

To be clear, LLMs can be amazing tools, but garbage in, garbage out still applies.

sigmoid10 376 days ago

You are describing the state of LLMs from 2 years ago, when they were basically just pre-trained on the internet and then fine-tuned to follow a particular instruction format. Current models still use this as a first step, but are then trained a lot using reinforcement learning, which has given them much better skills at reasoning and logic than human-tainted data ever could. See how Grok 4, for example, still eagerly dismisses all those right-wing hoaxes, despite being massively tuned to favour right-wingers by its creators carefully selecting pre-training data.

otabdeveloper4 376 days ago

You have some sort of very confused idea of what reinforcement learning is. (Which is probably why you're being downvoted.)

sigmoid10 375 days ago

I suggest you read something like the DeepSeek R1 paper, because you and everybody else here seem to have no clue how it works (which is not surprising tbh).

indymike 375 days ago

> That doesn't make GR any less of a description of reality!

When you layer the concept of awareness into the mix, it does alter reality for an individual or LLM. Awareness creates interesting blind spots in our statistical models of reality.

cainxinth 376 days ago

It doesn’t matter whether the Kentucky Derby is core to reality. The point is it is part of reality. If you want to model reality with 100% accuracy, you need to know about the Kentucky Derby. The author is arguing that models are converging on something close to the platonic ideal representation. So, a perfect model with perfect translatability would in fact be able to communicate the concept of a four-legged land animal (named after a being capable of impossible feats) that attempts to be faster than other animals to win a reward for a rider on its back. Whether the platonic representation hypothesis is correct or not, and whether our models will ever actually get that good, are different questions.

cmiles74 376 days ago

Agreed. They aren't converging on a statistical model of reality, they are converging on a statistical model of their training data. In the case of LLMs and the size of the training data, it's possible they are also converging on some commonality between all text. I doubt this reveals a core truth, but maybe it will give us some insight into what we all agree certain chunks of text represent (when I use this idiom, everyone understands I mean this).

benlivengood 376 days ago

You also can't translate "Mage (foaled April 18, 2020) is an American Thoroughbred racehorse who won the 2023 Kentucky Derby" into Hellenistic Greek or some modern indigenous languages because there isn't enough shared context; you'd need to give humans speaking those languages a glossary for any of the translation to make sense, or allow them to interrogate an LLM to act as the glossary.

I'd say our current largest LLMs probably contain sufficient detail to explain a concept like a named race horse starting from QCD+gravity and ending up at cultural human events, given a foothold of some common ground to translate into a new unknown language. In a sense, that's what a model of reality is. I think it's possible because LLMs figure out translation between human languages by default with enough pretraining.

waldrews 376 days ago

Your point holds, but the example of Hellenistic Greek seems ill-chosen to make that case - they had horse races and calendars, and mythological mage equivalents that would be reasonable to name a horse after, so the only thing left to map is 'American' as a geographic proper name and 'an important race in America' - which is about as translatable as it gets. Maybe if we pick one of the many cultures that never had horses, their translation would have to throw in so much context that the corresponding text is structurally different?

cwmoore 376 days ago

The LLM almost caught a 42 slork babelfish.

cwmoore 376 days ago

Man"splain a concept like...given a foothold of some common ground to translate into a new unknown language. In a sense, that's what a model of reality is" not. It is a story with motivated characters.

edwardbernays 376 days ago

Why QCD? Quantum chromodynamics, the quantized theory of the nuclear strong force? There is also QED, quantum electrodynamics, which is the quantized field theory for electrodynamics, and then also QFD (quantum flavordynamics) for the weak force. Do you seriously mean to imply that the quantum field theory corresponding to ONLY the strong force, plus gravity, explains every emergent phenomenon from there to culture? Fully half of the fundamental forces we account for, in two disparate theoretical frameworks?

messe 376 days ago

"I mean, what are we, to believe that this is some sort of a, a magic xylophone or something? Boy, I really hope somebody got fired for that blunder"[1]

Just replace QCD with "known/understood quantum theories" and move on with your life. That's not the important part of the comment you're replying to.

[1]: https://youtu.be/pYrRqMHQY7o

edwardbernays 376 days ago

No, it actually is kind of important. Because he obviously does not understand what he's talking about, nor does he apparently have a good mental model of the standard model and its relation to prior work. He is just using buzzwords and leaning on that for sloppily gesturing at his imagination saying "look at all of our treasures, we can already solve all the most fundamental issues! We already have all of the pieces!"

Forgive me if I insist somebody show the most minute amount of competence before entertaining their absolutely wild speculation regarding whether the corpus of our species can explain physics-to-culture.

benlivengood 376 days ago

I don't mean expressing human culture in purely mathematical terms, which sounds intractable.

I mean expressing the relationships and abstractions between the different levels at which we model the world. If you need to explain horses to whales, you probably need to drop down to a biological level to at least explain keratin for hooves to e.g. the baleen whales. Other than that, common experiences of mammals probably suffice to explain social and mating differences (assuming sufficient abstractions in this hypothetical whale language).

If you need to explain horses to aliens, you'd drop all the way down to mathematics and logic and go back up through particle physics to make sure both parties were grounded in the way we talk about objects and systems before explaining Earth biology and evolutionary history. Behavioral biology would have to be the base for introducing cultural topics and expressing any differences in where we lump behaviors in biology vs. culture, etc.

My basic claim is that if we had a pretrained model over human languages and one hypothetical alien language then either a human or an alien could learn to speak the other's language and understand the internal world model used by the other, because the amount of information contained in large LLMs covers enough of our human world model that it can translate between human languages and also answer questions about how our world-models at various levels of abstraction are related to and built upon each other via definitions.

I am less certain if that same hypothetical LLM could accurately translate between an alien language and human language; I think that the depth required to translate across potentially several layers of abstractions might not fit in the context windows and attention-span of LLMs. I think ~current LLMs will be able to accurately translate inter-species on earth if we can get enough animal language+behavior data into them.

benlivengood 376 days ago

My mistake; not being a physicist I routinely mix up which theory combines the 3 quantum forces. The standard model doesn't have a nice acronym and QFT doesn't necessarily mean standard model.

photonthug 376 days ago

> You also can't translate .. into Hellenistic Greek or some modern indigenous languages because there isn't enough shared context; you'd need to give humans speaking those languages a glossary for any of the translation to make sense

What? By substitution, this means you can translate it. As long as we're assuming a large enough basis of concept vectors of course it works.

> I'd say our current largest LLMs probably contain sufficient detail to explain a concept like a named race horse starting from QCD+gravity and ending up at cultural human events

What? I'm curious how you'd propose to move from gravity to culture. This is like TFA's assertion that the M+B game might be as expressive as 20 questions / universal. M or B is just (bad,sentient) or (good,object). Sure, entangling a few concepts is slightly more expressive than playing with a completely flattened world of 1/0 due to some increase in dimensionality. But trying to pinpoint anything like (neutral,concept) fails because the basis set isn't fundamentally large enough. Any explanation of how this could work will have to cheat, like when TFA smuggles new information in by communicating details about distance-from-basis. For example, to really get to the word or concept of "neutral" from the inferred good/bad dichotomy of bread/Mussolini, you would have to answer "Hmmmmmm, closer to bread I guess" in one iteration and then "Umm.. closer to Mussolini I guess" when asked again, and then have the interrogator notice the uncertainty/hesitation/contradiction and then infer neutrality. This is just the simple case; physics to culture seems much harder.

edwardbernays 376 days ago

I completely believe that physics to culture is intractable given our current corpus, to the degree it's probably a nonsense claim. There are so many emergent phenomena that introduce confounding effects at every level of abstraction.

Also, why QCD? Quantum chromodynamics, the quantized theory of the nuclear strong force? There is also QED, quantum electrodynamics, which is the quantized field theory for electrodynamics, and then also QFD (quantum flavordynamics) for the weak force. Does OP seriously mean to imply that the quantum field theory corresponding to ONLY the strong force, plus gravity, explains every emergent phenomenon from there to culture? Fully half of the fundamental forces we account for, in two disparate theoretical frameworks?

OP's comment is not serious speculation.

Invictus0 376 days ago

You could settle this fairly easily by training two small models on highly disparate datasets, maybe historical Chinese texts and historical Greek texts, in their native languages, and seeing if the same similarities recur.
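
One rough way to check whether "the same similarities recur" is to compare the two models' pairwise similarity structure over a shared list of concepts, rather than comparing the raw vectors (which live in unrelated coordinate systems). The sketch below is only an illustration of that idea, not anyone's actual experiment: the function names are made up, the embeddings are hand-written toy vectors rather than outputs of trained models, and it assumes you already have some way to pair up corresponding concepts across the two corpora.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_similarities(embeddings):
    """Cosine similarity for every concept pair (i < j), in a fixed order."""
    n = len(embeddings)
    return [cosine(embeddings[i], embeddings[j])
            for i in range(n) for j in range(i + 1, n)]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def representation_agreement(emb_a, emb_b):
    """Correlate the two models' pairwise-similarity patterns.

    Row i of emb_a and row i of emb_b must refer to the same concept.
    """
    return pearson(pairwise_similarities(emb_a), pairwise_similarities(emb_b))

# Toy data (hypothetical): four concepts, e.g. horse, mare, river, stream.
emb_a = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Same geometry as emb_a, but rotated 90 degrees and scaled by 2.
emb_b = [[0.0, 2.0], [-0.2, 1.8], [-2.0, 0.0], [-1.8, 0.2]]
# Different geometry: "horse" sits next to "river" instead of "mare".
emb_c = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

print(representation_agreement(emb_a, emb_b))  # very close to 1.0
print(representation_agreement(emb_a, emb_c))  # noticeably lower
```

Because cosine similarity ignores rotation and uniform scaling, two models that arrange concepts the same way score near 1 even if their coordinates look nothing alike. A high score between models trained on disjoint corpora would be some evidence for convergence; a low score would support the "it's just the shared training pile" reading.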