
by rococode 2513 days ago

This is definitely a space that needs innovation! How do you plan to handle the case of survey takers being real people but just skimming through the survey?

In my experience running research studies, this was the main problem with MTurk. Things like bots and blatantly junk answers were relatively rare and easy enough to detect and filter. What was much harder to deal with was the fairly high volume of users who just want to finish the survey as fast as possible. It's not so bad for short surveys, but anything over 5 minutes starts to have issues with low quality responses.

We had to introduce several control questions to check for consistency in answers and measure time to find outliers. But, it was not an ideal setup and there were still many survey takers we suspected were not paying much attention. The breakdown for our studies was something like 60% good quality answers, 35% low quality answers that are hard to distinguish from high quality, and 5% total junk answers. We ended up doing an in-person study where we got much cleaner results, presumably because people pay better attention when they feel someone is watching them.
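For anyone curious what that kind of filtering can look like, here is a minimal sketch in Python. It is not the exact setup from these studies: the `Response` fields, the `flag_suspect_responses` name, and both thresholds (`min_time_ratio`, `max_pair_gap`) are assumptions made up for illustration. It flags responses that finished much faster than the median, or whose paired control answers disagree.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Response:
    respondent_id: str
    seconds_taken: float
    # Pairs of answers to control questions that should agree, e.g. the
    # same question asked twice with different wording (hypothetical layout).
    control_pairs: list[tuple[int, int]]


def flag_suspect_responses(responses, min_time_ratio=0.5, max_pair_gap=1):
    """Return {respondent_id: [reasons]} for responses that look inattentive.

    Both thresholds are illustrative assumptions, not validated cutoffs.
    """
    if not responses:
        return {}
    typical = median(r.seconds_taken for r in responses)
    flagged = {}
    for r in responses:
        reasons = []
        # Timing check: far quicker than the typical respondent.
        if r.seconds_taken < min_time_ratio * typical:
            reasons.append("too_fast")
        # Consistency check: paired control answers differ by too much.
        if any(abs(a - b) > max_pair_gap for a, b in r.control_pairs):
            reasons.append("inconsistent_controls")
        if reasons:
            flagged[r.respondent_id] = reasons
    return flagged
```

A filter like this only catches the outliers. It does nothing about respondents who skim at a plausible pace and happen to answer the control questions consistently.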

Wages don't seem to do much for this. We were paying a relatively generous $4 per HIT, estimating 30 minutes per HIT when in reality the average time was under 20.

So I'm wondering, are you able to share with us how exactly the "unusual data patterns" and "technical and behavioral checks" can help ensure quality in an ecosystem where users are generally motivated to 1) finish surveys as fast as possible, and 2) appear as if they are giving high quality responses when they are not?

1 comment

psb31 2513 days ago

It’s a great question and a tough problem. We have written a little bit about the problem of ‘slackers’ (participants with low attentiveness and engagement) [1,2]. The things we’re doing right now to address this problem are to 1) test for attentiveness and engagement before participants take part in real studies, 2) distribute surveys more broadly to reduce the % of “professional survey takers” in each study and 3) educate researchers about ways to use good attention and engagement tests [3] in their studies. We can then feed this data back into our system so we can iteratively improve data quality.

I can’t go into too much detail about the attention and engagement checks we have built in, but if you sign up as a participant and pay attention you might spot some. In the long run, we think good feedback systems and high trust will be key so that our pool iteratively gets better and participants don’t feel as incentivized to cheat. It’s key to make sure that participants feel that their high effort responses get fairly rewarded (both financially and with social appreciation).

[1] https://blog.prolific.co/how-to-improve-your-data-quality/

[2] https://blog.prolific.co/bots-and-data-quality-on-crowdsourc...

[3] https://researcher-help.prolific.co/hc/en-gb/categories/3600...