| The issue is with false negatives (not finding real problems). It is hard to say that 100% of issues were found, because how can you quantify the number of problems that weren't found? The article's claim that you can somehow quantify the percentage of usability problems found is rather absurd. How can you quantify the total number of usability issues in a program a priori? Also, I don't think that graph ever reaches 100%; I think it only approaches 100% (but then I'm no math whiz, so I will gladly accept correction). They've successfully performed statistical sleight of hand. The assumption that you can quantify the total number of defects is untenable, but they leave that part out of their article and just show the nice charts and math to lend strength to their broken hypothesis. I think this is unfortunate, because there are probably some great insights in this article and the intent is good. Usability and Human-Centered Computing/Human-Computer Interaction tend to suffer from this type of "pseudo-science" that uses fuzzy statistics to present great-sounding findings. I was made aware of this trend by the following article: Wayne D. Gray and Marilyn C. Salzman, "Damaged merchandise? A review of experiments that compare usability evaluation methods", Human-Computer Interaction, vol. 13, pp. 203-261, 1998. |
Nielsen just claims that if you test with 15 users, you can be pretty sure you have found all problems you will ever find - you most likely won't find any new problems by using an additional 15 or 50 tests. (But obviously you cannot be 100% sure.)
But whether his numbers are based on sound research, I won't judge.
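
For anyone who wants to check the "approaches but never reaches 100%" point, here is a rough sketch of the kind of curve being argued about. It assumes the usual simple model in which each test user independently uncovers any given problem with the same probability. The function name `proportion_found` and the default value of 0.31 are my own assumptions for illustration (0.31 is the per-user detection rate I recall Nielsen quoting; it is not stated anywhere in this thread). Note that the result is a share of a presumed total number of problems, which is exactly the quantity the first comment says cannot be known in advance.

```python
def proportion_found(n_users: int, p_detect: float = 0.31) -> float:
    """Expected share of problems seen at least once by n_users testers.

    Assumes every user independently detects each problem with
    probability p_detect (0.31 is an assumed value, not from this thread).
    """
    # A problem is missed only if every one of the n_users misses it.
    return 1.0 - (1.0 - p_detect) ** n_users


if __name__ == "__main__":
    for n in (1, 5, 15, 50):
        print(f"{n:>2} users: {proportion_found(n):.6%}")
```

Under these assumptions the curve gets very close to 100% by 15 users, but for any finite number of users and any detection rate below 1 it stays strictly below 100%. So both posts are compatible: the curve is asymptotic, and additional tests past 15 add very little under this model.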