Very impressive, but is it any more original than classic search engines' old trick of regular expressions to figure out if I mean the currency or weight when I ask "1 pound =" with the contexts USD or kg after "="? Does it understand the input, or are there just enough discussions in the training data to make it look like it is? I'm not convinced it's not the latter.

It uses context to figure out we're trying to convert something to something else. Then it adds all those numbers up. Taking helium into consideration is no doubt interesting, but they've also polished that task since that was the common critique they got so very wrong with the first release (which I mentioned they had fixed).

I'm not qualified to assess this part of the answer;

> "If the balloons displace more than 100g of air when filled with helium, then they would effectively weigh less than if they were left empty. If they displace exactly 100g of air, then the balloons would have the same weight as if they were left empty."

I don't know enough to understand how much 100g of helium is and how it behaves. And it doesn't try to explain it to me, it mentions it then takes the easy route assuming it's a trick question. What does that tell you? I guess there are similar discussions around and it gives me the summary. Why doesn't it tell me how much air it displaces under what circumstances? Temperature etc, it should be easy if it's not just a simple discussion on a random forum. A conversion regex could do it.

This comment[1] has a very impressive example. But anything I'm qualified to assess has mostly been meh. If the fix is better training data does that mean it's reasoning or regurgitating? The mistakes it makes are what tells me how it works, not when it tricks me that it's correct. To me it's a very well polished search engine summary.

[1]: https://news.ycombinator.com/item?id=37219351
> Why doesn't it tell me how much air it displaces under what circumstances?
You can ask it and it'll answer.
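For a sense of scale, here is a rough back-of-the-envelope sketch of that calculation. It assumes ideal-gas behaviour, a balloon at ambient temperature and pressure (real balloons are slightly over-pressured), and an approximate molar mass for dry air. The function names and default conditions are my own choices for illustration and are not taken from the chat.

```python
# Rough ideal-gas estimate. Constants and names are illustrative assumptions.
R = 8.314462618       # molar gas constant, J/(mol*K)
M_HELIUM = 4.002602   # g/mol
M_AIR = 28.96         # g/mol, approximate mean for dry air


def helium_volume_m3(helium_g, temp_c=20.0, pressure_pa=101_325.0):
    """Volume taken up by helium_g grams of helium at ambient T and P."""
    moles = helium_g / M_HELIUM
    return moles * R * (temp_c + 273.15) / pressure_pa


def displaced_air_g(helium_g, temp_c=20.0, pressure_pa=101_325.0):
    """Mass of ambient air that would fit in that same volume."""
    volume = helium_volume_m3(helium_g, temp_c, pressure_pa)
    air_moles = pressure_pa * volume / (R * (temp_c + 273.15))
    return air_moles * M_AIR


if __name__ == "__main__":
    # Temperature changes the volume, but not the displaced mass.
    for temp_c in (0.0, 20.0, 35.0):
        volume = helium_volume_m3(100.0, temp_c)
        air = displaced_air_g(100.0, temp_c)
        print(f"{temp_c:5.1f} C: {volume:.3f} m^3, displaces {air:.1f} g of air")
```

Under these assumptions 100g of helium fills about 0.6 cubic metres at 20 C and displaces roughly 720g of air. As long as the helium and the surrounding air share the same temperature and pressure, the displaced mass depends only on the ratio of molar masses, so temperature alone does not change it.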
> If the fix is better training data does that mean it's reasoning or regurgitating?
More training data helps with things you can just bring to the fore, same as a lot of learning. More useful training data, though, can also help reasoning, which makes sense: deliberate training helps people improve their logical reasoning. I know that doesn't guarantee that's what LLMs are doing, but humans benefit significantly from both more teaching and better teaching.
> Very impressive, but is it any more original than classic search engines' old trick of regular expressions to figure out if I mean the currency or weight when I ask "1 pound =" with the contexts USD or kg after "="? Does it understand the input, or are there just enough discussions in the training data to make it look like it is?
I'd be interested to know what requirements would clearly show the difference. I tried asking what would happen if I filled a balloon at a children's party with a gas made of atoms that have 1 proton and 100 neutrons: https://chat.openai.com/share/71224df4-5c6c-45f7-88fd-eec316...
(tl;dr: "In the context of a child's birthday party, introducing such a balloon would be a grave mistake.")
It identifies:
* Whether it would float or not and why
* That it would be radioactive, and likely types of radiation from it
* What that would mean to the balloon
* How people would react and the likely consequences of releasing it in a room of children
This is an isotope that does not exist, in a setting where nothing like this has happened before, with details ranging from types of decay to consequences and human emotional reactions. Yes, there are real things you can use as a base (e.g. how people respond to events that kill people), but I feel it's an example of something beyond a search engine summary.
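The "would it float" part is at least easy to sanity-check by hand. A minimal sketch, assuming the gas behaves ideally, that the atomic mass is roughly the nucleon count (1 + 100 = 101 u, ignoring binding energy), and that it is either monatomic or diatomic like ordinary hydrogen. The helper names are hypothetical, and this is my own arithmetic rather than anything from the chat output.

```python
M_AIR = 28.96  # g/mol, approximate mean molar mass of dry air


def molar_mass_estimate(protons, neutrons, atoms_per_molecule=1):
    """Crude molar mass in g/mol: nucleon count times atoms per molecule."""
    return (protons + neutrons) * atoms_per_molecule


def density_relative_to_air(molar_mass):
    """At equal temperature and pressure, ideal-gas density scales with molar mass."""
    return molar_mass / M_AIR


def floats_in_air(molar_mass):
    """A balloon gas only provides lift if it is lighter than the air it displaces."""
    return molar_mass < M_AIR


if __name__ == "__main__":
    for label, atoms in (("monatomic", 1), ("diatomic", 2)):
        m = molar_mass_estimate(1, 100, atoms)
        print(label, m, round(density_relative_to_air(m), 2), floats_in_air(m))
```

Either way the gas comes out several times denser than air, so under these assumptions the balloon would sink rather than float, before even considering the radioactivity.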