by ykl 30 days ago
I think of most things you can get to by guess and checking as definitionally inside of the hull; most forms of guess and checking are you take some existing thing, randomize a bunch of its parameters, and see what you get. Whereas with something like relativity, there's not even a starting point that you can randomize and guess/check from the pre-existing knowledge space that will lead you to relativity. That's more like, adding a new dimension to the space entirely.

It's possible LLMs can handle this after all! But at least so far we only have existence proofs of humans doing this, not LLMs yet, and I don't think it's easy to be certain how far away LLMs are from doing this. I should distinguish between LLMs and AI more generally here: I'm skeptical LLMs can do this, but I think some other kind of more complete AI almost certainly can.

I suppose you could just, I dunno, randomly combine words into every conceivable sentence, treat each new sentence as a theory to somehow test, and brute force your way through the infinite possible theories you could come up with. But at that point you're closer to the whole infinite random monkeys producing Shakespeare thing than you are to any useful conclusion about intelligence.

I think your point about “you could randomly generate a sequence of words, which could in principle produce a text interpretable as expressing any particular expressible-as-a-sequence-of-words novel good idea” pretty much refutes the idea that guessing and checking can only result in things inside such a convex hull, unless said hull already contains everything. Of course, there’s a significant role to play by the “checking” part.

Like, “take a random sequence of bits and interpret it as Unicode” is at one end of a scale, and “take a random sequence of words in a language” is just a tad away from it, and the scale continues in that direction for quite a while.
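To make the two ends of that scale concrete, here is a rough Python sketch. Everything in it is made up for illustration: the function names (`random_bits_as_text`, `random_words`, `guess_and_check`), the tiny vocabulary, and especially the checker, which cheats by already knowing the target sentence. In any real setting the checker is the hard part.

```python
import random

def random_bits_as_text(n_bytes, rng):
    """One end of the scale: random bytes decoded as UTF-8.
    Invalid byte sequences are replaced rather than raising an error."""
    raw = bytes(rng.randrange(256) for _ in range(n_bytes))
    return raw.decode("utf-8", errors="replace")

def random_words(vocab, n_words, rng):
    """A step along the scale: random words drawn from an existing vocabulary."""
    return " ".join(rng.choice(vocab) for _ in range(n_words))

def guess_and_check(generate, check, max_tries):
    """Generate candidates until one passes the check.
    Returns (candidate, tries), or (None, max_tries) if nothing passed."""
    for tries in range(1, max_tries + 1):
        candidate = generate()
        if check(candidate):
            return candidate, tries
    return None, max_tries

rng = random.Random(0)  # seeded so runs are repeatable

# Hypothetical toy vocabulary and a toy checker that already knows the answer.
vocab = ["time", "is", "relative", "absolute", "space"]
found, tries = guess_and_check(
    generate=lambda: random_words(vocab, 3, rng),
    check=lambda s: s == "time is relative",
    max_tries=100_000,
)
print(found, tries)

# The other end of the scale: mostly unreadable, but every text is reachable in principle.
print(repr(random_bits_as_text(12, rng)))
```

With 5 words and 3 slots there are only 125 candidates, so the word-level generator hits the target quickly; the byte-level generator covers a vastly larger space and almost never produces anything readable. The generator decides how much of the space is reachable, and the checker does all the work of deciding what is worth keeping.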

This assumes that everything outside of the convex hull can already be described using existing language. If you need new language to describe what is outside of the convex hull, is this something an LLM can do?

I actually don't know the answer to that; my understanding is that LLMs by nature of what they are can't understand concepts that are independent of the existing language they are trained on, but I don't have enough in-depth nitty-gritty knowledge of like, core LLM implementation details and architecture and stuff to know if that understanding is correct or not.

I suppose it is conceivable that there are some useful ideas that cannot be described in terms of language we understand (e.g. if there are ideas that are alien to us and beyond what can be described using https://en.wikipedia.org/wiki/Natural_semantic_metalanguage#... ), but, if there are, I'm not sure those are ideas we can communicate to one another?

By "If you need new language" do you mean like, coining new words?

I don't see what would prevent them from doing this? LLMs can process text that includes newly coined terms, and respond to that text in ways that use those newly coined words in accordance with the descriptions of the meanings given for those new words in the prompt. They can also make up new words+definitions when asked to do so. Now, whether they can, without being told to do so, recognize that it would be useful to coin a new word for something, and then start using it, I don't know of any instances of this, but based on the previous two things, I don't see a reason to expect this to be fundamentally beyond what they can do?

I don't know what it would mean for a concept to be "independent of the existing language they are trained on". If there are ideas that can't be expressed in terms of the semantic primes (the primitives that every idea we can express can be expressed in terms of), then I guess such an idea would be independent of our language, but I think that's a much stricter condition than what you mean (and I'm not sure if there even are any good ideas that can't be indirectly expressed in terms of semantic primes -- I kind of suspect not, unless they are like, ideas that are too big to fit in a human mind anyway).

Of course, the outputs these models produce are causally downstream from the data they are trained on, and the distribution they produce over text is largely based on the distribution over text in the training data, but altered in a number of ways (for example, to make them implement the character of the "assistant" persona).