by bob_theslob646 2531 days ago
Holy smokes that is an amazing story. Do you have any more? I am always fascinated to hear how researchers design studies to prevent people from just spamming in any old answer.

I'm not sure what the state of the art is nowadays. Back in the day, the method was so new that we just assumed no bots would be able to complete the experiment successfully, or that they would stand out like a sore thumb when we analysed the data. Modern frameworks will eventually have to deal with this, I'm sure. The structured way in which experiments are defined also makes it easy to develop tools to automate them. Unfortunately I'm not into the field enough anymore to know how people are dealing with it.

The biggest problem we had was with people on high-latency connections. The effects we were looking for were measured in tens of milliseconds. In order to tease out these effects, we had to be very particular about the timing of when certain stimuli were presented to the participant. High-latency page reloads (which were unavoidable in the system we built our method on) would mess with this high-precision presentation requirement. We measured the latency, but did not pre-emptively exclude anyone based on it, hence the high % of people with unusable data in our initial validation experiment.

For subsequent experiments I built a "loader" screen that would pretend to be loading the experiment. What it was in fact doing was refreshing the webpage several times (while advancing a progress bar, of course) to measure the latency of the participant's connection. Connections with high average latency + high latency variance were excluded. The thresholds were based on what we found in the early validation study.
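
To make the exclusion rule concrete, here is a rough Python sketch of the decision the loader feeds into. It is not the original code: the function names and both threshold values are made up for illustration (the real cut-offs came from the validation study), and it assumes a connection is excluded only when both the average and the spread are high.

```python
from statistics import mean, pstdev

# Hypothetical cut-offs; the real thresholds came from the validation
# study and are not given here.
MAX_MEAN_MS = 150.0
MAX_STDEV_MS = 50.0


def latency_summary(samples_ms):
    """Return (mean, standard deviation) of page-reload times in ms."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two latency samples")
    return mean(samples_ms), pstdev(samples_ms)


def should_exclude(samples_ms, max_mean_ms=MAX_MEAN_MS, max_stdev_ms=MAX_STDEV_MS):
    """Exclude when the connection is both slow on average and unstable.

    Assumption: both conditions must hold. Swap `and` for `or` if either
    one alone should be enough to exclude a participant.
    """
    avg, spread = latency_summary(samples_ms)
    return avg > max_mean_ms and spread > max_stdev_ms
```

For example, `should_exclude([40, 45, 42, 41])` keeps a fast, steady connection, while `should_exclude([100, 400, 150, 500])` rejects a slow and jittery one.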

Surprisingly, after throwing out the high-latency data, there were no other exclusions necessary. It seemed that for our validation study, all participants were very attentive during the experiment.

In "in-person" cases, researchers would add attention checks to their experiments, with the logic that failing these attention checks is by itself no indication of "spamming", but weird quirks in the data + failed attention checks would be. One that my supervisor was fond of was throwing instruction screens into the middle of runs of trials, each requiring you to press a very specific button to continue. E.g., the experiment has you press 'A' and 'J' constantly, and to continue you have to press 'N'. Secretly though, 'A' and 'J' were also valid ways to continue with the experiment. The thought being that if you were hammering one of those buttons to get through the experiment quickly, you would also skip straight past the instruction screen with it.
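
A rough sketch of that attention-check logic in Python, purely as an illustration. The key names come from the example above; the function names and the `has_data_quirks` flag are hypothetical stand-ins for whatever data-quality analysis a researcher would actually run.

```python
TRIAL_KEYS = {"A", "J"}  # keys used during ordinary trials
CONTINUE_KEY = "N"       # key the instruction screen asks for


def failed_attention_check(key_pressed):
    """A trial key on the instruction screen suggests it was not read."""
    return key_pressed.upper() in TRIAL_KEYS


def flag_participant(instruction_keys, has_data_quirks):
    """Flag only when failed checks coincide with odd-looking data.

    `instruction_keys` holds the key pressed on each instruction screen;
    `has_data_quirks` is a hypothetical boolean from a separate analysis.
    """
    failures = sum(failed_attention_check(k) for k in instruction_keys)
    return failures > 0 and has_data_quirks
```

So a participant who pressed 'J' on an instruction screen is only flagged if their data also looks off; a failed check on its own is not treated as evidence of spamming.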