by Mizza
1211 days ago
The chain-of-thought prompting in section 4.5 is extremely interesting to me, but it looks like they're missing a test group: what is the performance if the image is simply described and the task is then evaluated using only the text of that description, rather than the description combined with the image?
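
To make the missing group concrete, here is a rough sketch of the three conditions side by side. The `describe` and `answer` callables are hypothetical stand-ins for the model, and none of the names come from the paper; this only illustrates the comparison I have in mind, not how the authors ran their evaluation.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-ins for the model (assumptions, not the paper's API):
#   describe(image) returns a text description of the image.
#   answer(question, image, description) returns the model's answer;
#   either image or description may be None when it is withheld.
Describe = Callable[[str], str]
Answer = Callable[[str, Optional[str], Optional[str]], str]


def run_conditions(
    image: str, question: str, describe: Describe, answer: Answer
) -> Dict[str, str]:
    """Answer the same question under three input conditions."""
    description = describe(image)
    return {
        # Baseline: the model sees only the image.
        "image_only": answer(question, image, None),
        # Description combined with the image.
        "image_plus_description": answer(question, image, description),
        # The missing group: the description alone, image withheld.
        "description_only": answer(question, None, description),
    }
```

If `description_only` scored close to `image_plus_description`, that would suggest most of the gain comes from the text of the description rather than from looking at the image a second time.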