| If you've not looked at it, I really recommend Othello-GPT. That is an experiment explicitly designed to tackle this kind of question: has it just seen enough moves that it knows what should come next?

> Why doesn't it tell me how much air it displaces under what circumstances?

You can ask it and it'll answer.

> If the fix is better training data does that mean it's reasoning or regurgitating?

More training data helps with things you can just bring to the fore, same as a lot of learning. More useful training data, though, can also help reasoning, which makes sense: deliberate training of people helps improve their logical reasoning capabilities. I know that doesn't guarantee that's what LLMs are doing, but humans benefit significantly from both more teaching and better teaching.

> Very impressive, but is it any more original than classic search engines' old trick of regular expressions to figure out if I mean the currency or weight when I ask "1 pound =" with the contexts USD or kg after "="? Does it understand the input, or are there just enough discussions in the training data to make it look like it is?

I'd be interested to know any requirements around this to clearly show the difference. I tried asking what would happen if I filled a balloon at a children's party with a gas made of atoms that have 1 proton and 100 neutrons: https://chat.openai.com/share/71224df4-5c6c-45f7-88fd-eec316... (tl;dr: "In the context of a child's birthday party, introducing such a balloon would be a grave mistake.")

It identifies:

* Whether it would float or not and why
* That it would be radioactive, and likely types of radiation from it
* What that would mean to the balloon
* How people would react and the likely consequences of releasing it in a room of children

This is an element that does not exist, in a setting where nothing like this has happened before, with details ranging from types of decay, consequences and human emotional reactions to something like this. Yes, there are real things you can use as a base (e.g. how do people respond to events that kill people), but I feel it's an example of where it's beyond a search engine summary. |
I skimmed it and read the conclusion. It looks interesting, and I'll take a closer look when I have time.
The prompt covers a subject that goes completely over my head, so I can't tell how well it reasons. I don't know what 1 proton to 100 neutrons means, but I gather it's radioactive. I don't think it's far-fetched that it draws the same conclusion from the training set, because to you it seems obvious, and it is probably well known to anyone who knows the subject. Kind of like how it would understand that "hotter than the sun" is super hot, and could correlate that to different melting points. But I wouldn't say it understands the concept of temperature. Given the right prompt it might give you the impression it does.
The feelings in the scenario read like any PR comment after a tragedy. "We feel shock and disbelief" and so on. The scenario being hypothetical doesn't change that, since it's all probabilities. It acts just like you'd think it would. The earlier example with the helium balloon is similar: it assumes a human context rather than considering the form and environment the helium is in. True intelligence might not even consider the presence of atmosphere as the norm. "It has no weight outside of your human constraints" would be novel.
Let's say it has the odd numbers between 1 and 9 in the database. Given the prompt 2 and 8 you will get back 1, 3, 7, 9, sprinkled with some natural language, and we get the impression it's intelligent.
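To make the toy example concrete, here is a rough Python sketch of the lookup I have in mind. The name `nearest_entries` and the "closest by absolute difference" rule are my own assumptions for illustration; this is not a claim about how an LLM is actually implemented.

```python
# Toy "database" from the example: the odd numbers between 1 and 9.
DATABASE = [1, 3, 5, 7, 9]


def nearest_entries(prompt, database=DATABASE):
    """Return the stored values closest to any number in the prompt.

    Hypothetical helper: "closest" here means smallest absolute
    difference, with ties kept.
    """
    hits = set()
    for p in prompt:
        # Smallest distance from this prompt number to anything stored.
        best = min(abs(d - p) for d in database)
        # Keep every stored value at that distance.
        hits.update(d for d in database if abs(d - p) == best)
    return sorted(hits)


print(nearest_entries([2, 8]))  # [1, 3, 7, 9]
```

Neither 2 nor 8 is stored anywhere, yet the answer looks sensible, which is the impression I'm describing.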
Are you saying it understands the effect the neutron to proton ratio has, as opposed to just comparing the vectors closest to your prompt that it builds the answer from? Being tested on new and hypothetical examples only means it will be further from the vectors but still close enough to give us the impression it understands the subject. If the training data didn't include the words neutron or proton it would have no idea where to begin.
In my first comment that started this chain I said:
> I don't see why not. It's not taking a single answer from a database no, it's taking several based on probability and merging them into what it thinks we're looking for.
I don't think even this latest answer is any proof of anything other than that. Are you claiming there is? And what are you claiming is happening?