MarketTools recently completed an interesting research study that examined whether address validation (as opposed to name/address validation) is an adequate measure of a research respondent’s validity or “realness”. The question we sought to answer through this research was: Is it reasonable to consider “loosening” the TrueSample match criteria for hard-to-reach segments of the population, so as to preserve online panel capacity in those groups?
As you may know, TrueSample uses a combination of name and address information to determine whether a respondent is a “real” person (as opposed to a fictitious persona). For hard-to-reach demographic segments, such as 18-24s and Hispanics, where panel capacity is limited, we wanted to understand whether research quality could be preserved with a less stringent match criterion, such as address validation alone. So we undertook a research project to investigate.
Research Methodology

We devised a research methodology that would allow us to determine whether the data collected from a sample of panelists who passed the “address validation” test differed from the data collected from those who failed it. We did the same for a sample of panelists who passed the “name/address validation” test.
This methodology was founded on research we conducted and documented back in 2008, which showed the difference in data collected from TrueSample-certified valid and invalid respondents. The 2008 research showed that, on answers to attitudinal questions in a survey, invalid respondents invariably scored lower than valid respondents. Essentially, this means that on the attitudinal rating scales, these respondents tended to rate less positively than the valid respondents. We took a similar approach in the two tests conducted for this current study:
TEST 1 – Name/Address Validation: This test compared the data collected from respondents whose name and address combination was confirmed to exist in the real world, and who therefore “passed” name/address validation, vs. those whose name and address could not be matched and who therefore “failed” name/address validation. This test allowed us to re-establish the trend seen in the 2008 research.
TEST 2 – Address-Only Validation: This test compared the data collected from respondents whose address had been confirmed to exist in the real world and therefore passed address-only validation vs. those whose address could not be confirmed and therefore failed address-only validation.
Theory to be Tested: If address validation alone is a sufficient indicator of “realness,” then respondents who pass address-only validation should provide different data from those who fail it.
Results of Study

As we can see in the charts below, when we compare the survey results from the respondents who failed name/address validation to those from the respondents who passed (Sample 1 [left]), there is a marked difference between the results. Just as in the original TrueSample test, the scores on the same questions from the respondents who passed validation (x-axis) were consistently higher than the scores from those who failed (y-axis). Hence the dots lie predominantly below the 45-degree line. (A dot on the 45-degree line implies the two groups’ responses are similar.)
On the other hand, when we look at the respondents who passed address-only validation and compare them to those who failed (Sample 2 [right]), we can clearly see that the responses are spread on both sides of the 45-degree line. This implies there is no bias toward higher or lower scores between these two types of respondents – meaning they are very similar to each other. In other words, address-only validation does not effectively separate the two types of respondents and therefore cannot distinguish real respondents from non-real ones.
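The 45-degree-line reading of the charts can be sketched in a few lines of code. This is a minimal illustration, not our analysis code, and the mean scores below are entirely made-up placeholder numbers: for each question, we pair the mean score of the group that passed validation (x) with the mean score of the group that failed (y), and count which side of the line each point falls on.

```python
def compare_groups(passed_means, failed_means):
    """Count, per question, which side of the 45-degree line each point falls on.

    passed_means: mean attitudinal scores for respondents who passed validation (x-axis)
    failed_means: mean scores for respondents who failed validation (y-axis)
    """
    counts = {"below": 0, "above": 0, "on_line": 0}
    for x, y in zip(passed_means, failed_means):
        if y < x:
            counts["below"] += 1    # failed group scored lower (the 2008 pattern)
        elif y > x:
            counts["above"] += 1    # failed group scored higher
        else:
            counts["on_line"] += 1  # the two groups answered identically
    return counts

# Hypothetical mean scores for six questions (illustrative only):
name_addr = compare_groups([4.2, 3.8, 4.5, 3.9, 4.1, 4.4],
                           [3.6, 3.1, 4.0, 3.3, 3.7, 3.9])
addr_only = compare_groups([4.0, 3.9, 4.2, 3.8, 4.1, 4.3],
                           [4.1, 3.7, 4.3, 3.9, 3.9, 4.4])
print(name_addr)  # all points below the line -> the groups differ
print(addr_only)  # points on both sides -> the groups look alike
```

With numbers shaped like the study’s findings, the name/address comparison puts every point below the line, while the address-only comparison scatters points on both sides – the pattern described above.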
Key Finding: These results indicate that address-only validation is not a reliable measure of a research respondent’s realness and therefore is not an adequate tool for improving sample quality.
Our team continues to investigate this issue to determine the best mechanisms for increasing panel capacity while preserving research quality. We are now looking into the impact of re-validating failed respondents after a given period of time. Check back soon for the results.