Survey Reliability vs. Survey Validity: What’s the Big Deal?
You may ask yourself or find others asking you about the “reliability” or “validity” of data that comes out of employee surveys. It’s not uncommon for those two words to be used interchangeably. But in survey research, the two are far from synonymous.
Why is it important to know the difference?
Because in most cases, sourcing and hiring a survey partner, or DIY survey creation and implementation, falls to HR.
Making sure you’re working with a demonstrably valid and reliable survey ensures you get data you can confidently tell others is, indeed, dependable. Knowing the difference between survey validity and reliability also clarifies why adding or modifying questions, wording, flow, or format from one survey to the next can have consequences.
There are many components of survey design that contribute to high (or low) quality data. The time and effort it takes respondents to complete a survey, the number of points on the rating scale, how those points are phrased, the order of questions, and the layout are just a start.
Validity is concerned with the accuracy of your survey. It depends on asking questions that truly measure what they are supposed to measure. For instance, to what extent is an employee engagement survey actually measuring engagement?
Reliability, on the other hand, is concerned with consistency: the degree to which the questions used in a survey elicit the same kind of information each time they’re asked. This is particularly important when it comes to tracking and comparing results against past internal surveys and benchmarks from external sources. Changes to wording or structure may result in different responses.
Consider this example of a misguided change to a valid question:
- Valid Question: My organization inspires me to do my best work.
- Misguided Question: I’m inspired to do my best work here at my organization.
While the two questions look almost interchangeable at first glance, the “valid” question has been tested many times, so we know how respondents interpret it: one’s organization is the source of inspiration. And because it is a “valid” question, we expect results to be similar over time; any change would reflect a true change in attitude, as opposed to a change in interpretation.
On the other hand, the “misguided” question cannot be considered valid, since it has not been tested to see how respondents interpret it. Unless it is validated (through a debriefing exercise with many respondents, as well as predictability of response among similar randomly generated populations), we don’t know whether it is truly measuring what it is supposed to measure. Is it measuring inspiration by the organization, by one’s work, or by both?
Although distinctly different, survey validity and reliability are inextricably linked.
Survey reliability on its own doesn’t establish validity.
An employee engagement survey can have high reliability – consistent responses year after year, from one organization to the next – but low validity if the wrong questions are asked. When response data from a low-validity survey is plotted on a graph, it doesn’t form the normal distribution shape of a bell curve, with most data near the middle; instead, responses are scattered, with large numbers at either end of the scale. Skewed responses suggest the questions may not be properly structured. The result when validity becomes an issue? Bad information that doesn’t measure what you intended and jeopardizes sound decision-making.
Conversely, if responses are noticeably inconsistent year after year from one survey to the next, yet the distribution still forms a normal bell curve, then your questions may be valid – but your survey reliability, not so much.
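To see what eyeballing the response distribution might look like in practice, here is a minimal sketch that tallies ratings into a text histogram. The ratings are invented for demonstration; a pile-up near the middle of the scale is the rough bell shape described above, while heavy counts at both ends would be the scattered pattern that warrants a second look at the questions.

```python
# Illustrative tally of 1-5 ratings to eyeball the distribution shape.
# The ratings below are invented for demonstration.
from collections import Counter

ratings = [3, 4, 3, 2, 3, 4, 5, 3, 2, 4, 3, 3, 4, 2, 3]

counts = Counter(ratings)
for point in range(1, 6):
    # One '#' per response at each scale point.
    print(f"{point}: {'#' * counts[point]}")
```

A real analysis would use far more responses and formal statistics, but even this crude tally makes a U-shaped (polarized) distribution immediately visible.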
Several factors can interfere with survey reliability, among them changes in participants, environment, timing, and the survey itself. A survey conducted just after union negotiations or a layoff, for instance, is likely to have impaired reliability. A seemingly insignificant change to how a question is asked, or a new section of questions that demands more time from respondents, can impact reliability too.
That’s why reputable survey vendors review a questionnaire’s statistical properties, following a comprehensive step-by-step analysis that looks at the attributes of every single item respondents are meant to answer.
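One common statistic in that kind of item-level review is Cronbach’s alpha, a standard measure of internal consistency across a set of related items. The sketch below uses invented ratings purely for illustration and is not any particular vendor’s methodology.

```python
# Illustrative Cronbach's alpha: internal consistency of an item set.
# All ratings below are invented for demonstration.
from statistics import pvariance

# Rows = respondents, columns = items (1-5 ratings on related questions).
responses = [
    [4, 4, 5, 4],
    [3, 3, 3, 4],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 4, 3],
]

k = len(responses[0])                               # number of items
item_vars = [pvariance(col) for col in zip(*responses)]
total_var = pvariance([sum(row) for row in responses])

# alpha = (k / (k-1)) * (1 - sum of item variances / variance of totals)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values near 1 indicate the items hang together as a scale; a low alpha would flag an item set for the kind of step-by-step review described above.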