by jonplackett
408 days ago
Just that what I thought would be better models don’t do it right. I was expecting this model to be nowhere near ChatGPT. Although someone above is saying 4o-mini got it right, so maybe it’s meaningless. Or maybe thinking less helps…
|
Try re-running your test on the same model multiple times with the identical prompt, or with variations of the prompt. Depending on how much context the service you choose keeps for you across a conversation, the behavior can change. Something as simple as following up an incorrect response with a request to try again because the result was wrong can give different results.
Statistically, the model will eventually hit on the right combination of vectors and generate the right words from the training set. And as I noted before, this problem has a very high probability of being in the training data used to build all of the easily available models.
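If you want to try this, here is a rough sketch of the repeat-and-tally idea. Everything named here is made up for illustration: `ask_model` is whatever function you write to call your service of choice, and `make_fake_model` is only a random stand-in so the script runs without any API. The "eventually" part is plain arithmetic, assuming each run is independent (no context carried over between attempts) and has the same chance `p_correct` of being right, which a real service may not give you.

```python
import random
from collections import Counter
from typing import Callable


def rerun_prompt(ask_model: Callable[[str], str], prompt: str, runs: int = 10) -> Counter:
    """Send the identical prompt `runs` times and tally the distinct answers."""
    tally: Counter = Counter()
    for _ in range(runs):
        # Normalize lightly so "Yes" and "yes " count as the same answer.
        tally[ask_model(prompt).strip().lower()] += 1
    return tally


def chance_of_at_least_one_hit(p_correct: float, runs: int) -> float:
    """Probability of at least one correct answer in `runs` independent tries."""
    return 1.0 - (1.0 - p_correct) ** runs


def make_fake_model(p_correct: float, seed: int = 0) -> Callable[[str], str]:
    """Hypothetical stand-in for a real model: answers 'right' with probability p_correct."""
    rng = random.Random(seed)

    def ask(prompt: str) -> str:
        return "right" if rng.random() < p_correct else "wrong"

    return ask


if __name__ == "__main__":
    model = make_fake_model(p_correct=0.3)
    print(rerun_prompt(model, "same prompt every time", runs=20))
    print(chance_of_at_least_one_hit(0.3, 10))
```

Swap the fake model for a real call and compare the tally across models. A single pass or fail from one run tells you very little either way.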