| HN Mirror

Perhaps we've been talking past each other then. I've been trying to show that these things can do reasoning, and my upper benchmark is "like a human". If you're starting from "these things might be doing what humans are doing / capable of performing similar tasks" then we're largely aligned.

I still find the further question interesting, though.

> I read you (correct me if I'm wrong) as giving this way too much agency. To change my mind that it's doing something unexpected I would ask for logs on the calculations it does, and be able to correlate that to the training set. I have to be able to falsify the conclusions I'm asked to make. I know some people claim we don't understand these algorithms but I assume that's just hyperbole and with the correct measures we could follow every step.

This one is tricky. We know exactly what they do. Interpreting that is very hard though: they're a big pile of mathematical operations with billions of magic constants and it... works. We can see exactly what they do, but if I could see every synapse firing in your brain I'd still not be able to understand how it works in a useful manner. So at one level we understand them completely, but at another level we really don't.

> I'm sure Hiroshima, Fukushima and other dangers of radiation are in the training set, as are all of the other steps you mention, it goes round and round testing the numbers based on training.

Just to be clear here, there is no recursion other than when you add more text. There is not an algorithm saying "identify parts X, then look in database Y, now summarise...". They're trained essentially to just predict the next word given some text. There's some later training to make them more conversational. The capabilities you see are just a consequence of that.
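To make the "no recursion other than when you add more text" point concrete, here's a toy sketch. It is not how a real LLM works internally: I'm swapping the neural network for a simple word-pair counter, and `train_bigram`, `predict_next` and `generate` are names I made up for illustration. The only thing it's meant to show is the shape of the outer loop: predict one word, append it, predict again.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Count which word follows which. A toy stand-in for 'predict the next word'."""
    counts = defaultdict(lambda: defaultdict(int))
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context, rng):
    """Sample the next word given the text so far.

    Assumption: this toy model only looks at the last word; a real LLM
    conditions on the whole context.
    """
    if not context:
        return None
    options = counts.get(context[-1])
    if not options:
        return None  # nothing ever followed this word in the training text
    words = list(options)
    weights = [options[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(counts, prompt, max_new_words=10, seed=0):
    """Repeatedly predict one word and append it to the input."""
    rng = random.Random(seed)
    context = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(counts, context, rng)
        if nxt is None:
            break
        # The only "loop" there is: the new word becomes part of the input.
        context.append(nxt)
    return " ".join(context)


if __name__ == "__main__":
    model = train_bigram("a pound of feathers weighs the same as a pound of helium")
    print(generate(model, "a pound", max_new_words=8))
```

Nowhere in there is a step that says "identify the parts, look them up, summarise". Whatever structure shows up in the output has to come from the predictor itself.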

Othello-GPT is shown just moves. It ends up building an internal model of the board.

> I did however give you an example that would surprise me, if it considered mass and environment in a way that proves that it understands the problem for what it is. If it told me weight is a human construct and requires gravity/movement and how it depends. An intelligent human doesn't necessarily answer the question it is asked in the way it is phrased. It identifies and irons out misunderstandings, assumptions and other details important to correctly understand the problem, and may even rephrase the question to give a proper response. That would show me a deep understanding of the problem and maybe freak me out a little, but only if the hallucinations are gone and those can be difficult to spot.

Let's try and investigate that then, that sounds interesting. I'm not sure I understand what you mean by weight being a human construct (the answers I get explain the difference between weight and effective weight right off the bat, and that's the only real distinction I can see). Perhaps this is too simple; the answer is quite straightforward.

> If it told me weight is a human construct and requires gravity/movement and how it depends

I asked "Which is the most, a pound of feathers or a pound of helium?" with largely just your statement as the system message and got

> The question seems to contain an intrinsic confusion. When we discuss weight, both a pound of feathers and a pound of helium would weigh the same - a pound. The difference, though, comes in their volume and density. A pound of helium would take up a lot more space than a pound of feathers considering the density of helium is lower than the density of feathers. If you were implying which would be more in terms of volume, then a pound of helium would be significantly more than a pound of feathers.

> However, I might be wrong if we take into account that helium, being a gas, is usually measured in terms of its volume at standard temperature and pressure, rather than by weight like solid or loose materials such as feathers. Also, the weight of a pound can vary slightly depending on where on Earth it is measured due to differences in gravity. However, these factors don't fundamentally change the answer to the question as it was posited.
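For reference, the weight versus effective weight distinction can be put in rough numbers. This is a back-of-the-envelope sketch, not something from the model's answer: the gas densities are approximate textbook values at 0 °C and 1 atm, the feather density is my own rough assumption (about that of solid keratin), and `effective_weight_newtons` is just a helper I named for this example.

```python
G = 9.80665             # standard gravity, m/s^2
POUND_KG = 0.45359237   # one avoirdupois pound in kg (exact by definition)
RHO_AIR = 1.293         # kg/m^3, dry air at 0 °C and 1 atm (approximate)
RHO_HELIUM = 0.1786     # kg/m^3, helium at 0 °C and 1 atm (approximate)
RHO_FEATHER = 1300.0    # kg/m^3, ASSUMPTION: roughly solid keratin


def true_weight_newtons(mass_kg):
    """Gravitational force on the mass, ignoring the surrounding air."""
    return mass_kg * G


def effective_weight_newtons(mass_kg, density, fluid_density=RHO_AIR):
    """True weight minus the buoyant force from the displaced fluid (Archimedes)."""
    volume = mass_kg / density
    buoyancy = fluid_density * volume * G
    return true_weight_newtons(mass_kg) - buoyancy


if __name__ == "__main__":
    print("true weight of a pound:", true_weight_newtons(POUND_KG), "N")
    print("feathers in air:", effective_weight_newtons(POUND_KG, RHO_FEATHER), "N")
    print("helium in air:", effective_weight_newtons(POUND_KG, RHO_HELIUM), "N")
```

Under those assumptions the true weights are identical, the feathers' effective weight is a hair lower, and the helium's comes out negative (it floats), which is the sort of distinction the answer above is gesturing at.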

Perhaps instead you could give me a short question and the kind of answer that would surprise you? I know this thread has gone on for some time, but personally this is interesting to me. If you want to move off HN, feel free to drop me an email; I have a vested interest in understanding how people view LLMs.