| It's not about a non-binary threshold. The problem is having a threshold in the first place. Say that given the evidence there's a 9 % chance the null hypothesis is true. A frequentist used to a 10 % significance level would then say the effect is true. A frequentist trained on a 5 % significance level would say it is false. But that's just an arbitrary cutoff that by itself means nothing. If instead we look at a practical scenario where we would use this result, we understand the problem space better. Maybe we have figured out how to get limited rights to the transpositions of famous music to other A4s, and this would cost a lot to do, but earn us some money if we do it and the effect is real. Should we acquire those rights or not? Ask the 5 % frequentist and they would say "there's no significant difference, so you shouldn't." Ask the 10 % frequentist and they would say the opposite. Who do we listen to? Let's ask the poker player. They will ask "What exactly does it cost to do this, and how much will you earn if it works out? That matters!" So let's say it costs us $100 per song to get the rights to the transposed version, and we think it will earn us $102 per song if the effect is true. Now we can just plug in, remembering that there's a 9 % chance the null hypothesis is correct: -100 + 102 * 0.91 = -7.18 per song. Not good. What if we made $127 per song with the same cost? -100 + 127 * 0.91 = +15.57 per song. Worth doing, at the same probability of the null hypothesis! In other words, you can't determine what's significant until you know how you will use the result. Statistics by itself is meaningless. It gains meaning only when it's used to choose between actions. |
That's a binary decision, which is bad, so we shouldn't.
In accordance with the confidence probability, we should buy into a percentage of the rights, and transpose the music to an interpolated value between A=432 and A=440 Hz.
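
For anyone who wants to check the arithmetic in the quoted comment, here is a minimal sketch in Python. The function names `expected_profit` and `break_even_payoff` are made up for illustration, and it takes the quoted numbers at face value: a certain $100 cost per song, a payoff that only arrives if the effect is real, and a 9 % chance that the null hypothesis holds. The break-even figure is not in the quoted comment; it just follows from setting the same expression to zero.

```python
def expected_profit(cost, payoff, p_null):
    """Expected profit per song: the cost is paid for certain,
    the payoff is earned only if the effect is real (probability 1 - p_null)."""
    return -cost + payoff * (1 - p_null)


def break_even_payoff(cost, p_null):
    """Payoff per song at which the expected profit is exactly zero."""
    return cost / (1 - p_null)


# The two scenarios from the quoted comment.
print(round(expected_profit(100, 102, 0.09), 2))  # -7.18
print(round(expected_profit(100, 127, 0.09), 2))  # 15.57

# Derived here, not stated in the quote: the payoff needed to break even.
print(round(break_even_payoff(100, 0.09), 2))     # 109.89
```

Under those assumptions the decision flips somewhere around $110 per song, with the 9 % unchanged, which is the quoted comment's point that the probability alone does not settle whether to act.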