Information Re-Retrieval: Repeat Queries in Yahoo’s Logs
Abstract: "This paper explores repeat search behavior through the analysis of a one-year Web query log of 114 anonymous users and a separate controlled survey of an additional 119 volunteers. Our study demonstrates that as many as 40% of all queries are re-finding queries. Re-finding appears to be an important behavior for search engines to explicitly support, and we explore how this can be done."
I am absolutely making some assumptions here, but because 40% is a large effect, you don't need as many samples to be confident in it.
The other way of looking at it is that maybe it's actually 35% or 45%, but either way that's still interesting, even with a rougher approximation of the actual "answer". If, for some reason, you needed to know whether it was 40% or 40.01% because that mattered to you, then you would absolutely be annoyed at the small sample size.
If the finding were 2%, then we would care about an uncertainty of +/- 5%, since the finding would be dwarfed by the margin of error. That's a smaller effect size, so you would need more samples to separate reality from the noise.
I am, by the way, pulling all of these numbers out of my ass. Your stats 101 class will teach you the formulas to calculate the actual error bars at work here, as well as the assumptions you need to make about the distribution of the data to use those formulas.
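For a rough feel of what that stats 101 formula looks like, here is a minimal sketch using the normal-approximation (Wald) interval for a proportion. It makes assumptions the paper does not: it treats the 114 users as 114 independent yes/no observations (the 40% is really a fraction of queries, and the abstract doesn't give the query count), and it uses z = 1.96 for a 95% interval. The function names are made up for illustration.

```python
import math

def wald_margin(p_hat, n, z=1.96):
    """Half-width of the normal-approximation (Wald) interval for a proportion.

    Assumes n independent observations; z=1.96 corresponds to ~95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def samples_needed(p_hat, margin, z=1.96):
    """Smallest n for which the Wald half-width is at most `margin`."""
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

if __name__ == "__main__":
    n = 114  # assumption: one independent observation per user
    for p_hat in (0.40, 0.02):
        m = wald_margin(p_hat, n)
        print(f"p = {p_hat:.2f}, n = {n}: +/- {m * 100:.1f} percentage points")

    # How many samples to pin each estimate down to a quarter of its own size?
    for p_hat in (0.40, 0.02):
        print(f"p = {p_hat:.2f}: n >= {samples_needed(p_hat, p_hat / 4)}")
```

Under those assumptions the 40% figure comes with roughly 9 points of slack either way, which still leaves it clearly large, while a 2% figure at the same n has a margin bigger than the estimate itself. The Wald approximation is also known to behave badly for proportions near 0 or 1, so for the 2% case you'd want a better interval (e.g. Wilson) in practice.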