by randomdata
637 days ago
Or just a way to compel the model to do more work without needing to ask (isn't that what o1 is all about?). If you do ask for the extra effort, it works fine.
+ How many "r"s are found in the word strawberry? Enumerate each character.
- The word "strawberry" contains 3 "r"s. Here's the enumeration of each character in the word:
-
- [omitted characters for brevity]
-
- The "r"s are in positions 3, 8, and 9.
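
For anyone who wants to double-check the positions quoted above without trusting a model, here is a minimal sketch. The helper name `letter_positions` is made up for illustration; it just counts characters with 1-based indexing, the same way the answer does.

```python
def letter_positions(word: str, letter: str) -> list[int]:
    """Return the 1-based positions at which `letter` occurs in `word`."""
    # `letter_positions` is a hypothetical helper, not part of any library.
    return [i for i, ch in enumerate(word, start=1) if ch == letter]


positions = letter_positions("strawberry", "r")
print(len(positions), positions)  # 3 [3, 8, 9]
```

This agrees with the model's answer here: three "r"s, at positions 3, 8, and 9.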
|
|
Even if these models did have a concept of the letters that make up their tokens, the problem would still exist. We catch these mistakes, and can work around them by rewording the question until the model answers correctly, because it is easy to see how wrong the output is. But fixing that particular problem tells us nothing about whether these models are correct in more complex use cases, where the errors are harder to spot.
When people use these models for actual useful work, they don't reword their queries to make sure they get the correct answer. If the models can't answer a question when asked normally, they can't be trusted.