Psychology turns to online crowdsourcing to study the mind, but it’s not without its pitfalls

You may not know this, but a great deal of our data about the human mind is based on a relatively small but intensively studied population: first-year undergraduate university students.

There has long been concern about the over-reliance on students as a source of data, particularly around lack of demographic diversity and limited sample sizes. Both concerns have been implicated in the current crisis in psychological research, in which many key effects have not been replicated by subsequent studies.

But now there’s a new tool in the psychologist’s arsenal, one that has been shown to produce valid data and that can help broaden the population of test subjects: Amazon’s Mechanical Turk.

Mechanical Turk is the most popular online data collection platform. People register with the platform, then choose from thousands of advertisements searching for participants. Compared to the slow slog of testing first-year students in a laboratory, Mechanical Turk offers the opportunity to collect hundreds of responses at a modest cost in a matter of hours.

Psychologists have embraced Mechanical Turk with gusto, and many recent studies have drawn their data from Mechanical Turk subjects. However, the new system is not without its drawbacks.

Too good to be true?

Recently, researchers working with Mechanical Turk have raised concerns about whether it comes with hidden costs. The original goal of many researchers using this platform was to conduct scientifically valid studies in large and diverse samples.

But does online data really provide the solution? Here are some pros and cons.

Sample size

Mechanical Turk offers an unparalleled opportunity to collect large samples, particularly in comparison to traditional undergraduate participant pools, many of which cap annual testing at 200 participants.

In contrast, there are more than 500,000 workers registered on Mechanical Turk. However, researchers must be careful to prevent Mechanical Turk workers from participating in the same study more than once and thus invalidating the results.
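As a rough illustration of guarding against repeat participation, the sketch below keeps only the first submission from each worker. The record layout and the "worker_id" field name are assumptions made for this example, not the platform's actual export format.

```python
def keep_first_submissions(submissions):
    """Return the submissions with repeat participation removed.

    Each submission is a dict with a hypothetical "worker_id" key.
    Only the first submission seen for each worker is kept.
    """
    seen = set()
    unique = []
    for row in submissions:
        worker = row["worker_id"]
        if worker in seen:
            continue  # this worker has already taken part, so drop the repeat
        seen.add(worker)
        unique.append(row)
    return unique


# Made-up example data: worker W1 submits twice.
submissions = [
    {"worker_id": "W1", "score": 4},
    {"worker_id": "W2", "score": 2},
    {"worker_id": "W1", "score": 5},
]
unique = keep_first_submissions(submissions)
```

Filtering after the fact only cleans the data set. Stopping a worker from starting the same study twice would have to happen on the platform side.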

Diversity

Psychologists are concerned with whether findings generalise beyond student samples, or beyond so-called “WEIRD” (Western, Educated, Industrialised, Rich and Democratic) samples.

This is important not only for their own theoretical closure, but also to increase the general public’s confidence in the overall validity and importance of the findings. Mechanical Turk workers are certainly more demographically diverse than undergraduate students.

Reliability

One barrier to obtaining reliable data is participant engagement, which is easier to ensure and monitor with students in the lab than it is with Mechanical Turk workers in their own homes.

To combat this issue, researchers typically integrate checks into questionnaires to identify participants who are not paying attention, such as asking: “This is a test item, please answer ‘not at all’ for this question”. However, there are now reports that Mechanical Turk workers are adept at spotting such questions.
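A check of this kind might be scored along the following lines. This is a minimal sketch: the field names ("participant", "attention_check") and the response format are invented for the example, and the expected answer is taken from the sample item quoted above.

```python
EXPECTED_ANSWER = "not at all"


def passed_attention_check(response, item="attention_check"):
    """Return True if the participant gave the instructed answer.

    `response` is a dict of questionnaire answers. A missing answer
    counts as a failed check.
    """
    answer = response.get(item, "")
    return answer.strip().lower() == EXPECTED_ANSWER


# Made-up responses: P2 gives the wrong answer and P3 skips the item.
responses = [
    {"participant": "P1", "attention_check": "Not at all"},
    {"participant": "P2", "attention_check": "very much"},
    {"participant": "P3"},
]
attentive = [r for r in responses if passed_attention_check(r)]
```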

Fortunately, larger sample sizes do allow researchers to “wash out” the noise of less-than-perfect data. This means that relationships can be detected in noisy data given a large enough sample, but that averages across different samples are likely to differ.
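The first half of that claim can be illustrated with a small simulation. The sketch below is built entirely on assumed numbers (a modest true relationship, normally distributed noise, samples of 30 versus 3,000) and uses no real study data. It estimates the same relationship many times at each sample size and compares how much the estimates vary.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulated_correlations(n, repeats, rng, slope=0.3):
    """Estimate the correlation `repeats` times from samples of size n.

    Each y is a weak function of x (the assumed slope) plus random noise.
    """
    results = []
    for _ in range(repeats):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [slope * x + rng.gauss(0, 1) for x in xs]
        results.append(pearson(xs, ys))
    return results


rng = random.Random(42)  # fixed seed so the run is repeatable
small = simulated_correlations(n=30, repeats=100, rng=rng)
large = simulated_correlations(n=3000, repeats=100, rng=rng)

# Spread of the estimates across repeated samples at each size.
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(f"spread at n=30: {spread_small:.3f}, at n=3000: {spread_large:.3f}")
```

Under these assumptions the estimates from the large samples cluster tightly around the true value, while those from the small samples vary widely from one sample to the next.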

Naivety

Another concern is the naivety of research participants. Study results are unlikely to be valid if participants know the procedure and expected hypotheses in advance.

Lack of naivety in this form has the potential to significantly alter results, and thus impact on replicability. Here, undergraduate samples present an advantage: students typically complete only a handful of studies in their first year, and most before being exposed to detailed information about psychology.

In contrast, some Mechanical Turk workers treat study completion as a full-time job, completing hundreds of studies per week. More concerning still is the availability of online communities in which workers trade information about study hypotheses and procedures and offer tips on completing studies quickly, which inevitably comes at the cost of psychological engagement.

The future of online data collection

Researchers turned to online data collection platforms like Mechanical Turk because they offered quick, cheap and apparently scientifically valid solutions to problems implicated in the replication crisis.

Although Mechanical Turk allows for collection of large and diverse samples, it comes with other costs that may compromise scientific rigour, including questionable quality and validity of results.

This means Mechanical Turk is useful, but only to the extent that researchers are aware of, and compensate for, its pitfalls. Doing so includes:

1) embedding novel attention checks to keep ahead of savvy workers;

2) ensuring workers do not complete studies more than once;

3) avoiding common procedures that workers have seen hundreds of times; and

4) diversifying onto other online platforms (or creating an Australian platform that is better suited to the requirements of local researchers).

Overall, online samples should be used as a complement to, not a replacement for, traditional student samples. Both methods have their own strengths and weaknesses, but together produce better science.

While Mechanical Turk isn’t the silver bullet that psychology researchers had hoped for, harnessing its benefits and offsetting its costs will ensure the future of online data collection is still bright.

Author Bios: Michael Humphreys is Emeritus Professor of Cognitive Psychology, Katharine H. Greenaway is a Research Fellow in Social Psychology and Sarah Bentley is a Researcher in Social Psychology at The University of Queensland.