Data collection via the internet revolutionises research methods but also poses new dangers for data quality.
Interview with a fraudster in Bangladesh
The InnovateMR and OpinionRoute teams interviewed a current fraudster who uses YouTube to create training videos on how to commit survey fraud. He can make upwards of US$400 per month across 400+ surveys. He also teaches 1:1 classes to avoid potential legal ramifications.
Here are four categories of survey fraud with suggestions for how to stop them:
1. Inattentive respondents
The first group consists of inattentive respondents. They are distracted and generally aren’t motivated by the money; they simply put little effort into giving accurate answers for the payment on offer. Inattentive respondents are a wide-ranging segment of individuals: students who misread the instructions, people who fill in questionnaires while watching television. They might input random responses, gibberish, or low-effort text. Most importantly, distracted respondents are not necessarily dishonest. Some may simply consider the reward insufficient to warrant real time and effort.
How to stop the inattentive respondents
- Provide a time limit to fill out questions to ensure that the participants do not have the time to become distracted by television or the internet.
- Ask the participants a few questions to clarify the instructions after the survey (to make sure they understood them).
- Record page-view and timing data
- Save a timestamp for every page load and every time a question is shown.
- Track the number of times the survey page is minimised or hidden.
- Track the amount of time spent reading instructions.
- Look for unusual time patterns. Did someone take just 2 seconds to go through the instructions? Did a participant take 40 minutes to answer the questionnaire, with a five-minute gap between questions?
- Use attention checks (aka Instructional Manipulation Checks, or IMCs). Keep them extremely easy, to be fair: an attention check should test attention, not memory, and the instruction it relies on should not be hidden away.
- Incorporate open-ended questions that demand more than a single-word answer, then look for low-effort responses.
- Examine your data with careless response measures like consistency indices or response pattern analysis.
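The timing and response-pattern checks above can be sketched in code. The following is a minimal illustration, not a validated screening tool: it assumes Likert-style numeric answers and per-question timings have already been collected, and all field names and thresholds are hypothetical.

```python
# Illustrative careless-response screen: combines a "longstring" index
# (longest run of identical consecutive answers, a common straight-lining
# measure) with a per-question speed check. Thresholds are assumptions.

def longstring(answers):
    """Length of the longest run of identical consecutive answers."""
    longest = run = 1
    for prev, cur in zip(answers, answers[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def flag_careless(respondent, max_run=8, min_seconds=2.0):
    """Flag a respondent who straight-lines or answers implausibly fast."""
    too_uniform = longstring(respondent["answers"]) >= max_run
    too_fast = sum(respondent["times"]) / len(respondent["times"]) < min_seconds
    return too_uniform or too_fast

r = {"answers": [3, 3, 3, 3, 3, 3, 3, 3, 3, 1],
     "times":   [1.1, 0.9, 1.3, 1.0, 0.8, 1.2, 1.1, 0.9, 1.0, 1.1]}
print(flag_careless(r))  # True: straight-lining and very fast
```

Flagged respondents shouldn’t be dropped automatically; a flag is a prompt for manual review, since some attentive people genuinely answer quickly.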
2. Dishonest Respondents
Dishonest respondents submit fake prescreening details to qualify for the largest possible volume of research studies, increasing their earnings potential. The impact of lying on the quality of your data depends on two factors:
The significance of prescreening: in a study that compares respondents by gender, participants who misreport their gender can invalidate the comparison and render the data useless. However, if gender doesn’t play a role in the study’s design, a lie about gender is irrelevant.
The uniqueness of your study: impostors are more frequent when recruiting target groups with low incidence rates in the population. Demand for specific target groups can be very competitive, and studies with broad eligibility fill quickly. A respondent who wishes to maximise their earnings therefore needs access to the studies that fill more slowly, which they get by claiming to belong to one (or several) specific demographics.
How to stop dishonest respondents
- Never disclose in your study’s description or title which characteristics you’re looking for. This could give impostors the information they need to lie their way into qualifying for the survey.
- Re-ask the prescreening questions in the first few minutes of your study (and at the end, where it isn’t too burdensome). This confirms that your participants’ prescreening answers are still current and can reveal lying participants who have forgotten their initial answers.
- Re-ask screener questions that are hard to answer unless the person is honest. If you require participants who take antidepressants, ask for the name of their medication and the dose. A determined impostor could look up a plausible answer… but most dishonest respondents won’t bother with the time and effort!
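Re-asked prescreening answers only help if you actually compare them against the originals. A minimal sketch of that comparison, assuming both answer sets are stored as dictionaries keyed by question (all field names here are illustrative):

```python
# Hypothetical check: compare a participant's intake screener answers
# against the same questions re-asked inside the survey itself.

def prescreen_mismatches(screener, reasked):
    """Return the fields whose re-asked answer differs from the screener."""
    def normalize(value):
        # Tolerate trivial differences in case and whitespace.
        return str(value).strip().lower()
    return [key for key in screener
            if key in reasked and normalize(screener[key]) != normalize(reasked[key])]

screener = {"age_band": "35-44", "medication": "Sertraline 50mg"}
reasked  = {"age_band": "25-34", "medication": "sertraline 50mg"}
print(prescreen_mismatches(screener, reasked))  # ['age_band']
```

A mismatch isn’t proof of lying on its own (people mistype, and some attributes genuinely change), so mismatched cases are best reviewed rather than excluded outright.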
3. Cheating Respondents
The third group of bad data comes from cheaters who intentionally submit false data to your study. Remember that cheaters are not always attempting to deceive you: some might be confused about what information you’re trying to collect, or unsure whether they’ll be paid if they fail to perform “well”.
Some believe the study’s rewards depend on their performance (i.e., they think they’ll only get paid if they score 100% on a test, so they search for the right answers). Others think you only want a certain kind of response (e.g., always very positive or enthusiastic answers), or use aids (pen and paper) to perform better than they really can.
The last type of cheater doesn’t take your survey seriously, or completes it with friends or while drunk. To clarify the distinction: dishonest respondents provide false demographic details to gain access to your research, while cheating respondents provide false details within the questionnaire itself. An individual participant could be both dishonest and a cheater, harming data quality in different ways.
It’s important to remember that these groups aren’t separate. A dishonest respondent may use bots; inattentive respondents may cheat; and so on. There’s likely plenty of overlap, since most bad actors don’t care which strategies they employ as long as they’re maximising their profits!
How to stop cheating respondents
- Record time spent on each question so you can spot participants who paused to search for answers on Google.
- Ask participants a few questions to clarify the task’s instructions after the exercise (to ensure they understood the task and didn’t cheat unintentionally).
- Establish precise criteria for screening data to identify unusual behaviour.
- As simple as it sounds, you could ask an open-ended question at the end of the study: “Did you cheat?”
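The first suggestion above, recording per-question times, can be turned into a simple heuristic: a long pause on a knowledge question is consistent with the respondent switching tabs to look up the answer. This sketch uses an illustrative, unvalidated threshold:

```python
# Sketch: flag questions where the respondent paused long enough that an
# off-page lookup (e.g. a Google search) is plausible. The 90-second
# threshold is an assumption and should be tuned per study.

def suspicious_lookups(question_times, pause_seconds=90):
    """Indices of questions whose response time exceeds the pause threshold."""
    return [i for i, t in enumerate(question_times) if t > pause_seconds]

times = [12, 9, 140, 15, 200, 11]  # seconds per question
print(suspicious_lookups(times))   # [2, 4]
```

Combined with the browser-visibility tracking mentioned in the inattentive-respondent section, a long pause that coincides with the page being hidden is especially telling.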
4. Automated bots
Automated bots are semi-autonomous software designed to take online surveys and complete them with minimal human involvement. Bots are typically identified by random or nonsensical, low-effort text responses. Because they are not human, there are various ways to identify bots in your data; ill-intentioned humans are more difficult to spot…
How to stop automated bots from taking surveys
- Include a captcha at the beginning of the questionnaire, and you’ll stop the vast majority of automated bots from submitting any answers. If your study is highly interactive (such as a cognitive or reaction-time test), bots will also struggle to finish it.
- Include open-ended questions (e.g., “What did you think of this research?”). Examine your data for low-effort or incoherent responses to these questions. Most bot answers are vague, and you might see the same phrases used across multiple responses.
- Look for random answer patterns in your data. For a more straightforward approach, duplicate several questions at different points within the survey. Human respondents will answer duplicated questions consistently, while a bot will either contradict itself or give the same answer to everything.
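Two of the bot heuristics above, repeated open-text phrases across respondents and agreement on duplicated questions, can be sketched as follows. This is an illustrative starting point, not a production detector, and all names are assumptions:

```python
# Sketch of two bot heuristics: verbatim-repeated open-text answers
# across respondents, and agreement between duplicated questions.
from collections import Counter

def repeated_open_text(responses, min_count=3):
    """Open-text answers that appear verbatim across many respondents."""
    counts = Counter(r.strip().lower() for r in responses)
    return {text for text, n in counts.items() if n >= min_count}

def duplicate_item_agreement(first_pass, second_pass):
    """Share of duplicated questions answered identically both times."""
    same = sum(a == b for a, b in zip(first_pass, second_pass))
    return same / len(first_pass)

answers = ["Good survey", "good survey ", "GOOD SURVEY", "I liked the colours"]
print(repeated_open_text(answers))              # {'good survey'}
print(duplicate_item_agreement([1, 2, 3, 4], [1, 2, 3, 5]))  # 0.75
```

A very low agreement score suggests random (possibly automated) answering, while a perfect score across every respondent and every duplicated item can itself be suspicious, so both tails are worth inspecting.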