| > Fully rediscovering what took humans many years to do off-the-cuff is an outrageously high bar. > What features of a question would you look for to identify whether it's "taking several answers from a database and merging them together" or performing some reasoning? I've asked a few times but don't understand what you're expecting. You're misunderstanding me, I'm setting no bars, and I have no threshold where this changes. We humans are also just looking things up in our database and doing deductions. We do some computing on urgency as well, like how when we hear a bang our mind goes for danger first before realizing it was harmless, but very similar to what these AIs do. Probabilities and experience. Fresh and novel ideas are very rare in humans as well, and not something I demand before I would consider someone a human. I did however give you an example that would surprise me, if it considered mass and environment in a way that proves that it understands the problem for what it is. If it told me weight is a human construct and requires gravity/movement and how it depends. An intelligent human doesn't necessarily answer the question it is asked in the way it is phrased. It identifies and irons out misunderstandings, assumptions and other details important to correctly understand the problem, and may even rephrase the question to give a proper response. That would show me a deep understanding of the problem and maybe freak me out a little, but only if the hallucinations are gone and those can be difficult to spot. This is, just like us, performing calculations and database look-ups. It may feel like it's doing something else but it's not. What would happen if we leave the weights as they are but switch the words? It would give us complete gibberish, but it's no less correct than it was before and it's not even giving us different answers, only the translations to language get distorted. 
Most people would call it stupid and pointless even if the only change is our interpretation of the answers. I'm sure Hiroshima, Fukushima and other dangers of radiation is in the training set, as are all of the other steps you mention, it goes round and round testing the numbers based on training. Remember how this chain started, you claimed: > They're not just retrieving stored text like pulling the most relevant passage from a database. If they were they'd not be able to deal with things outside the training set. To which I simply replied: > It's not taking a single answer from a database no, it's taking several based on probability and merging them into what it thinks we're looking for. I read you (correct me if I'm wrong) as giving this way to much agency. To change my mind that it's doing something unexpected I would ask for logs on the calculations it does, and be able to correlate that to the training set. I have to be able to falsify the conclusions I'm asked to make. I know some people claim we don't understand these algorithms but I assume that's just hyperbole and with the correct measures we could follow every step. If there are things there which I can not trace I would be very impressed, and honestly a little afraid. They are not trained for every single task, but approximations based on similarities have proven to be very capable even when we think we're out of context (we're not, it doesn't understand context and doesn't care, but neither do most humans). |
I still find the further question interesting, though.
> I read you (correct me if I'm wrong) as giving this way to much agency. To change my mind that it's doing something unexpected I would ask for logs on the calculations it does, and be able to correlate that to the training set. I have to be able to falsify the conclusions I'm asked to make. I know some people claim we don't understand these algorithms but I assume that's just hyperbole and with the correct measures we could follow every step.
This one is tricky. We know exactly what they do. Interpreting that is very hard though: they're a big pile of mathematical operations with billions of magical constants and it... works. We can see exactly what they do, but if I could see every synapse firing in your brain I'd still not be able to understand how it works in a useful manner. So at one level we obviously understand them, but at another level we really don't.
> I'm sure Hiroshima, Fukushima and other dangers of radiation is in the training set, as are all of the other steps you mention, it goes round and round testing the numbers based on training.
Just to be clear here, there is no recursion other than when you add more text. There is not an algorithm saying "identify parts X, then look in database Y, now summarise...". They're trained essentially to just predict the next word given some text. There's some later training to make them more conversational. The capabilities you see are just a consequence of that.
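To make "no recursion other than when you add more text" concrete, here is a minimal sketch of that outer loop. The model below is a toy word-count table that I made up for illustration (`train_bigram`, `predict_next` and `generate` are hypothetical names, not anything from a real LLM library), and it only looks at the last word. A real LLM replaces `predict_next` with a neural network conditioned on the whole context, but as far as I understand it the loop around it has the same shape: predict one token, append it, repeat.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context, rng):
    """Sample one next word given the text so far.

    Assumption for this toy: only the last word of the context is used.
    Returns None when the last word was never followed by anything in training.
    """
    followers = counts.get(context[-1])
    if not followers:
        return None
    choices, weights = zip(*followers.items())
    return rng.choices(choices, weights=weights, k=1)[0]


def generate(counts, prompt, max_new_words, seed=0):
    """The only 'loop': predict a word, append it to the text, predict again."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(max_new_words):
        nxt = predict_next(counts, words, rng)
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)


corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigram(corpus)
print(generate(model, "the cat", max_new_words=6))
```

Note there is no step that says "identify the parts" or "look this up, then summarise". Whatever structure shows up in the output has to come from inside the predictor itself.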
Othello-GPT, for example, is shown only sequences of moves, and it ends up building an internal model of the board.
> I did however give you an example that would surprise me, if it considered mass and environment in a way that proves that it understands the problem for what it is. If it told me weight is a human construct and requires gravity/movement and how it depends. An intelligent human doesn't necessarily answer the question it is asked in the way it is phrased. It identifies and irons out misunderstandings, assumptions and other details important to correctly understand the problem, and may even rephrase the question to give a proper response. That would show me a deep understanding of the problem and maybe freak me out a little, but only if the hallucinations are gone and those can be difficult to spot.
Let's try to investigate that then; it sounds interesting. I'm not sure I understand what you mean by weight being a human construct (in its answers to me it explains the difference between weight and effective weight off the bat, and that's the only real difference). Perhaps this example is too simple, as the answer is quite straightforward.
> If it told me weight is a human construct and requires gravity/movement and how it depends
I asked "Which is the most, a pound of feathers or a pound of helium?" with largely just your statement as the system message and got:
> The question seems to contain an intrinsic confusion. When we discuss weight, both a pound of feathers and a pound of helium would weigh the same - a pound. The difference, though, comes in their volume and density. A pound of helium would take up a lot more space than a pound of feathers considering the density of helium is lower than the density of feathers. If you were implying which would be more in terms of volume, then a pound of helium would be significantly more than a pound of feathers.
> However, I might be wrong if we take into account that helium, being a gas, is usually measured in terms of its volume at standard temperature and pressure, rather than by weight like solid or loose materials such as feathers. Also, the weight of a pound can vary slightly depending on where on Earth it is measured due to differences in gravity. However, these factors don't fundamentally change the answer to the question as it was posited.
Perhaps instead you could give me a short question and the kind of answer that would surprise you? I know this thread has gone on for some time, but personally this is interesting to me. If you want to move off HN, feel free to drop me an email; I have a vested interest in understanding how people view LLMs.