by xianshou
545 days ago
This doesn't replicate using gpt-4o-mini, which always picks Flight B even when Flight A is made somewhat more attractive. Source: just ran it on 0-20 newlines with 100 trials apiece, raising temperature and introducing different random seeds to prevent any prompt caching.
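For anyone who wants to repeat that check, here is a rough sketch of the sweep described above: 0 to 20 newlines, 100 trials each, a fresh random seed per trial. The `choose` callable, the stub, the prompt text, and the decision to append the newlines at the end of the prompt are all assumptions for illustration, not the original setup. Swap the stub for a real model call.

```python
import random
from collections import Counter
from typing import Callable, Dict


def run_trials(
    choose: Callable[[str, int], str],
    base_prompt: str,
    max_newlines: int = 20,
    trials: int = 100,
) -> Dict[int, Counter]:
    """Tally which option the model picks for each newline count."""
    results: Dict[int, Counter] = {}
    for n in range(max_newlines + 1):
        # Assumption: the extra newlines are appended to the prompt.
        prompt = base_prompt + "\n" * n
        tally: Counter = Counter()
        for _ in range(trials):
            # A new seed per trial, so no two calls are identical.
            seed = random.randrange(2**31)
            tally[choose(prompt, seed)] += 1
        results[n] = tally
    return results


def stub_choose(prompt: str, seed: int) -> str:
    """Hypothetical stand-in for a model call; always answers 'B'."""
    return "B"


results = run_trials(stub_choose, "Which flight do you pick, A or B?")
for n, tally in results.items():
    print(n, dict(tally))
```

If the tallies show the same option winning at every newline count, the effect did not replicate for that model.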
But the meat of the paper is the Shapley value estimation algorithm in Appendix A4. And Appendix A5 shows that different models are expected to give different results.