by RandomLensman
985 days ago
Get it to explain something, let's say dynamic hedging for derivatives, and then ask it to explain exactly how to hedge something specific. Or describe some physical situation with a quirk and then ask it to derive the implications. Someone on HN had an example of asking it to imagine entropy working in reverse in a cup of coffee with sugar dissolved. While it discussed sugar spontaneously forming crystals and other things, it never considered what the water would do, for example, let alone bigger issues such as whether water could even exist. Again, humans are often poor at these things, too, but if it had "mechanized" reasoning capabilities instead of "replicative" ones (i.e., just repeating stuff), I would expect it to do generally better.
|
I find these questions generally poor at gauging anything when people haven't first given them to a representative sample of people as a benchmark. Consider that not long ago there was a tedious trend of people posting "difficult" order-of-operations questions involving basic arithmetic, and a significant proportion of people in the threads would continue to belabour and argue for the wrong result even after having been told in excruciating detail how to apply the rules. In other words: I think people here tend to massively overestimate the reasoning ability of the average person.
E.g., for the example questions here, I'd bet the average person can't give a satisfactory definition of entropy, much less say what it does "forwards" before even considering "reverse". So why would we treat this as a benchmark of whether or not an LLM can reason?