Surveys, Response Rates, and Nonresponse
Jeffrey Stanton, Syracuse University
Abstract
The survey method is one of the most popular methods in Information Systems research. One problem that plagues most survey researchers is nonresponse. As theories get more complex and more constructs must be measured, surveys tend to get longer, and this can reduce response rates. Some traditions have developed around acceptable response rates, such as these: “In my area a 45% response rate is considered quite good...” or “that research can’t possibly be valid given a response rate under 20%.” Research shows that these and other myths about response rate are incorrect. In this tutorial, participants will learn four things: 1) the important difference between response rate and nonresponse bias, and why it is more important to minimize the latter than to maximize the former; 2) the full range of response-enhancing techniques whose efficacy has been documented in the methods literature; 3) one particular method of response enhancement in detail – survey and scale shortening; and 4) survey design methods that allow for the detection of the presence and magnitude of nonresponse bias. At the end of the tutorial, participants will have the skills and knowledge to build assessment and control of nonresponse into their survey methods.
Why do we care if our study has a low response rate?
Low Response Rates…
…cause smaller data samples, which decrease statistical power, widen confidence intervals, and may limit the types of statistical techniques that can effectively be applied to the collected data.
…undermine the perceived credibility of the collected data.
…undermine the actual generalizability of the collected data because of nonresponse bias. Where nonresponse bias exists, survey results can produce misleading conclusions that do not generalize to the entire population.
Research History (timeline: 1939–2010)
F. Stanton (1939) wrote one of the first empirical pieces on the topic in the Journal of Applied Psychology, entitled “Notes on the validity of mail questionnaire returns.”
Suchman and McCandless’s (1940) Journal of Applied Psychology article was titled “Who answers questionnaires?”
A significant early event to draw interest in this topic occurred in 1948.
In the 1948 U.S. presidential election, pre-election polls by major newspapers and polling organizations predicted a victory by New York State Governor Thomas E. Dewey by margins ranging from 5 to 15 percentage points. Instead, the victory by incumbent president Harry S. Truman was an embarrassment for the emerging public opinion polling community. What caused the failure?
Research Today
Extensive literature on techniques to increase response rates: response-enhancing strategies.
Statistical methods of compensating for nonresponse through imputation of missing data; Rubin (1987) developed a book-length treatment of methods for imputing data in sample surveys.
Characteristics of nonrespondents and nonresponse bias: Rogelberg, S. G., Spitzmüller, C., Little, I. S., & Reeve, C. L. (2006). Understanding response behavior to an online special topics organizational satisfaction survey. Personnel Psychology, 59, 903-923.
Organizational Survey Response Rates
Youssefinia (2003) examined 58 organizational surveys conducted over five years by two consulting firms.
Anseel, Lievens, and Schollaert (2008) analyzed 2,037 surveys, covering 1,251,651 individual respondents, published in 12 journals in I/O Psychology, Management, and Marketing during the period 1995-2008.
You predict: What is a typical response rate in an organizational survey? What is the trend over recent years for response rates in organizations?
What is an acceptable response rate for your study?
Trick Question?
Industry and academic standards only put a response rate into context. The fact that everyone else also achieves 30%, 50%, or 70% response does not help to demonstrate that the reported research is free from nonresponse bias. In the absence of good information about the presence, magnitude, and direction of nonresponse bias, ignoring the results of a study with a 10% response rate -- particularly if the research question explores a new and previously unaddressed issue -- is just as foolish as assuming that one with a response rate of 80% is unassailable.
The Nature of Nonresponse Bias
Here ‘PNR’ refers to the proportion of nonrespondents, ‘Xres’ is the respondent mean on a survey-relevant variable, and ‘Xpop’ is the population mean on the corresponding survey-relevant variable, if it were actually known. Overall, the impact of nonresponse on survey statistics depends on the percentage not responding and the extent to which those not responding are systematically different from the whole population on survey-relevant variables.
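The original slide shows this relationship as a formula image that is not reproduced above. A standard decomposition consistent with those definitions (it introduces Xnr, the nonrespondent mean, which is not named in the slide text) is:

$$
\mathrm{Bias} \;=\; \bar{X}_{\mathrm{res}} - \bar{X}_{\mathrm{pop}} \;=\; P_{\mathrm{NR}}\left(\bar{X}_{\mathrm{res}} - \bar{X}_{\mathrm{nr}}\right)
$$

so the bias in the respondent mean shrinks toward zero as either the nonrespondent proportion or the respondent–nonrespondent difference shrinks.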
Error/Bias due to Non-Response (source: http://www.idready.org/courses/2005/spring/survey_SamplingFrames.pdf)
[Diagram: the sample splits into Respondents (µ1, α1, r1, F1) and Non-Respondents (µ0, α0, r0, F0); two questions drive the bias: How many? How different?]
Sample: N=100; 10% response (n=10): 5 say YES, 5 say NO.
If non-respondents resemble respondents, then a low response rate is not a problem.
Sample: N=100; 70% response (n=70): 35 say YES, 35 say NO.
Even when response rates are “high,” substantial potential for error still exists.
Worst Case Scenario Exercise
What are the two worst things that could happen? Choose one scenario below and run the numbers (a short calculation sketch follows):
1. Sample N=100; response rate 30%; percent YES for respondents = 40% (12 YES votes)
2. Sample N=100; response rate 30%; percent YES for respondents = 90% (27 YES votes)
3. Sample N=100; response rate 90%; percent YES for respondents = 50% (45 YES votes)
4. Sample N=100; response rate 90%; percent YES for respondents = 80% (72 YES votes)
If you counted all non-respondents as YES votes or NO votes, what would be the range of results in each of these scenarios?
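A minimal sketch of the bounding calculation, using the scenario numbers from the exercise above; the code itself is illustrative and not part of the original exercise.

```python
# Worst-case bounds on the population YES proportion: count every
# non-respondent first as a NO vote, then as a YES vote.
scenarios = [
    {"N": 100, "responses": 30, "yes": 12},
    {"N": 100, "responses": 30, "yes": 27},
    {"N": 100, "responses": 90, "yes": 45},
    {"N": 100, "responses": 90, "yes": 72},
]

for i, s in enumerate(scenarios, start=1):
    nonrespondents = s["N"] - s["responses"]
    low = s["yes"] / s["N"] * 100                       # all non-respondents vote NO
    high = (s["yes"] + nonrespondents) / s["N"] * 100   # all non-respondents vote YES
    print(f"Scenario {i}: observed {s['yes']}/{s['responses']} YES; "
          f"possible population range {low:.0f}%-{high:.0f}% YES")
```

For scenario 1, for example, the true population result could lie anywhere from 12% to 82% YES.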
Previous Examples: Proportions
Of course, rating scale means can be similarly impacted by non-response error. What about correlations? Word on the street is that correlations are fairly robust against non-response error.
We ran a simulation: 300 runs of random samples of n=500 from a larger population where rho=0.284 between a rating scale and a criterion scale. Half of the samples had biased nonresponse, in which those favorable on the rating scale were twice as likely to respond.
Results: the unbiased samples had slightly suppressed correlations (a decline of r=0.038); biased samples had more suppression (a decline of r=0.066); the difference in suppression was statistically significant, p<.001; there were no sign reversals of correlations in any of the 300 samples.
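A rough sketch of this kind of simulation follows. It is not the original code; the specific response probabilities, the median split, and the random seed are assumptions made for illustration.

```python
# Compare correlation estimates under random vs. biased nonresponse.
import numpy as np

rng = np.random.default_rng(42)
rho, n, runs = 0.284, 500, 300
cov = [[1.0, rho], [rho, 1.0]]

def sample_corr(biased: bool) -> float:
    x, y = rng.multivariate_normal([0, 0], cov, size=n).T
    if biased:
        # respondents favorable on the rating scale are twice as likely to respond
        p = np.where(x > np.median(x), 0.8, 0.4)
    else:
        p = np.full(n, 0.6)          # response unrelated to the rating scale
    keep = rng.random(n) < p
    return np.corrcoef(x[keep], y[keep])[0, 1]

unbiased = [sample_corr(False) for _ in range(runs // 2)]
biased = [sample_corr(True) for _ in range(runs // 2)]
print("mean r, unbiased nonresponse:", round(float(np.mean(unbiased)), 3))
print("mean r, biased nonresponse:  ", round(float(np.mean(biased)), 3))
```

The biased condition restricts the range of the rating scale, which attenuates the observed correlation relative to the random-nonresponse condition.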
Case Study Exercise
Instructions: Read the brief case and make some notes. Discuss with others at your table. Generate as many ideas as you can and write them on the sheets. Prepare to report back to the complete group.
Case overview: You are preparing a climate survey. You are unlikely to get a response rate much above 45%. How can you prepare for possible criticism of the results?
N-BIAS
Response rate alone is an inaccurate and unreliable proxy for study quality. While improving response rates is a worthy goal, researchers’ major efforts and resources should go into understanding the magnitude and direction of bias caused by nonresponse, if it exists. Rogelberg and Stanton (2006) advocate that researchers conduct a nonresponse bias impact assessment (N-BIAS), regardless of how high a response rate is achieved.
N-BIAS Methods
N-BIAS is presently composed of nine techniques:
1. Archival Analysis
2. Follow-up Approach
3. Wave Analysis
4. Passive Nonresponse Analysis
5. Interest Level Analysis
6. Active Nonresponse Analysis
7. Worst Case Resistance
8. Benchmarking/Norms
9. Demonstrate Generalizability
N-BIAS: How it Works
Similar to a test validation strategy. In amassing evidence for validity, each of several different validation methods (e.g., concurrent validity) provides a variety of insights into validity. Each assessment approach has strengths and limitations. There is no one conclusive approach and no particular piece of evidence that is sufficient to ward off all threats. Assessing the impact of nonresponse bias requires development and inclusion of different types of evidence, and the case for nugatory impact of nonresponse bias is built on multiple pieces of evidence that converge with one another.
Technique 1: Archival Analysis
Most common technique. The researcher identifies an archival database that contains the members of the whole survey sample (e.g., personnel records). That data set, usually containing demographic data, can be described: 50% female; 40% supervisors; etc. After data collection, code numbers on the returned surveys (or access passwords) can be used to identify respondents, and by extension nonrespondents. Using this information, the archival database can be partitioned into two segments: 1) data concerning respondents; and 2) data concerning nonrespondents.
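A sketch of how such a partitioned archival comparison might be run, assuming a personnel-records file with a code number, a gender field, and a tenure field; the file names, column names, and tests are illustrative, not from the original slides.

```python
# Compare respondents vs. nonrespondents on archival demographics.
import pandas as pd
from scipy.stats import chi2_contingency, ttest_ind

archive = pd.read_csv("personnel_records.csv")            # whole survey sample
respondent_ids = set(pd.read_csv("returned_codes.csv")["code"])

archive["responded"] = archive["code"].isin(respondent_ids)

# Categorical demographic (e.g., gender): chi-square test of independence
crosstab = pd.crosstab(archive["responded"], archive["gender"])
chi2, p_cat, _, _ = chi2_contingency(crosstab)

# Continuous demographic (e.g., tenure): independent-samples t test
resp = archive.loc[archive["responded"], "tenure_years"]
nonresp = archive.loc[~archive["responded"], "tenure_years"]
t, p_cont = ttest_ind(resp, nonresp, equal_var=False)

print(f"Gender vs. response: chi2={chi2:.2f}, p={p_cat:.3f}")
print(f"Tenure: respondents M={resp.mean():.1f}, "
      f"nonrespondents M={nonresp.mean():.1f}, p={p_cont:.3f}")
```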
So, if you found differences like the above, do you have nonresponse bias?
Technique 2: Follow-up Approach
Using identifiers attached to returned surveys (or access passwords), respondents can be identified, and by extension nonrespondents. The follow-up approach involves randomly selecting and resurveying a small segment of nonrespondents using an alternative modality, often the telephone. The full or abridged survey is then administered. In the absence of identifiers, telephone a small random sample, ask whether or not they responded to the initial survey, and follow up with survey-relevant questions.
Technique 3: Wave Analysis
By noting in the data set whether each survey was returned before the deadline, after an initial reminder note, after the deadline, and so on, responses from pre-deadline surveys can be compared with those of late responders on actual survey variables (e.g., compare job satisfaction levels).
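A minimal sketch of a wave comparison, assuming a file of returned surveys with a return timestamp and a job satisfaction scale score; variable names and the deadline are assumptions.

```python
# Compare early-wave vs. late-wave responders on a survey variable.
import pandas as pd
from scipy.stats import ttest_ind

surveys = pd.read_csv("returned_surveys.csv", parse_dates=["returned_at"])
deadline = pd.Timestamp("2024-06-01")

early = surveys.loc[surveys["returned_at"] <= deadline, "job_satisfaction"]
late = surveys.loc[surveys["returned_at"] > deadline, "job_satisfaction"]

t, p = ttest_ind(early, late, equal_var=False)
print(f"Early wave M={early.mean():.2f}, late wave M={late.mean():.2f}, p={p:.3f}")
# If late responders (a rough stand-in for nonrespondents) look like early
# responders, that is one piece of evidence against serious nonresponse bias.
```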
Technique 4: Passive Nonresponse Analysis
Rogelberg et al. (2003) found that the vast majority of nonresponse can be classified as being passive in nature (approx. 85%). Passive nonresponse does not appear to be planned. When asked (upon receipt of the survey), these individuals indicate a general willingness to complete the survey – if they have the time. Given this, it is not surprising that they generally do not differ from respondents with regard to job satisfaction or related variables.
Technique 5: Interest Level Analysis
Researchers have repeatedly found that interest level in the survey topic is one of the best predictors of a respondent’s likelihood of completing the survey. As a result, if interest level is related to attitudinal standing on the topics making up the survey, the survey results are susceptible to bias. E.g., if low-interest individuals tend to be more dissatisfied on the survey constructs in question, results will be biased “high.”
Technique 6: Active Nonresponse Analysis
Active nonrespondents, in contrast to passive nonrespondents, are those who overtly choose not to respond to a survey effort. The nonresponse is volitional and a priori (i.e., it occurs when initially confronted with a survey solicitation). Active nonrespondents tend to differ from respondents on a number of dimensions typically relevant to the organizational survey researcher (e.g., job satisfaction).
Technique 7: Worst Case Resistance
Given the data collected from study respondents in an actual study, one can empirically answer the question of what proportion of nonrespondents would have to exhibit the opposite pattern of responding to adversely influence sample results. This is a similar philosophy to the one used in meta-analyses when considering the “file-drawer problem.” By adding simulated data to an existing data set, one can explore how resistant the data set is to worst-case responses from non-respondents.
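A sketch of a worst-case resistance check for a single YES/NO item; the sample size, respondent count, and YES count are illustrative placeholders for your own data.

```python
# How many nonrespondents would have to vote NO before the majority flips?
n_sample = 100
n_respondents = 70
yes_votes = 45                      # observed among respondents

nonrespondents = n_sample - n_respondents
observed_pct = yes_votes / n_respondents * 100

for k in range(nonrespondents + 1):
    # assume k nonrespondents vote NO and the rest vote YES
    overall = (yes_votes + (nonrespondents - k)) / n_sample * 100
    if overall < 50:
        print(f"Result drops below a 50% majority only if at least {k} of "
              f"{nonrespondents} nonrespondents vote NO "
              f"(observed {observed_pct:.0f}% YES among respondents).")
        break
else:
    print("No pattern of nonrespondent NO votes can pull the result below 50%.")
```

The same logic extends to scale means: append simulated worst-case rows to the data set and see how extreme they must be before the substantive conclusion changes.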
Technique 8: Benchmarking
Using measures with norms for the population under examination, compare the means and standard deviations of the collected sample to the norms.
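A small sketch of a benchmarking check, assuming published norms for the measure; the norm values and file name are placeholders, not real norms.

```python
# Compare the collected sample to published norms for the same measure.
import numpy as np
from scipy.stats import ttest_1samp

scores = np.loadtxt("satisfaction_scores.csv")   # one scale score per respondent
norm_mean, norm_sd = 3.6, 0.8                    # published norms (assumed values)

t, p = ttest_1samp(scores, popmean=norm_mean)
d = (scores.mean() - norm_mean) / norm_sd        # standardized difference from norm
print(f"Sample M={scores.mean():.2f} vs. norm M={norm_mean}; d={d:.2f}, p={p:.3f}")
```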
Technique 9: Demonstrate Generalizability
By definition, nonresponse bias is a phenomenon that is peculiar to a given sample under particular study conditions. Triangulating with a sample collected using a different method, or varying the conditions under which the study is conducted, should also change the composition of the nonrespondent group.
N-BIAS: Conclusion
Nonresponse can be problematic on a number of fronts. Do what you can to facilitate response. In the inevitable case of nonresponse, engage in the N-BIAS approach in an attempt to accumulate information that provides insight into the presence or absence of problematic nonresponse bias. Engage in as many techniques as feasible: one is better than none, and two are better than one. Most published literature uses none! Each approach has a different purpose, and each has positives and negatives. Use the N-BIAS information collected to decide on next steps and to educate your audience.
Key References
Armstrong, J. S., & Overton, T. S. (1977). Estimating nonresponse bias in mail surveys. Journal of Marketing Research, 14 (Special Issue: Recent Developments in Survey Research), 396-402.
Baruch, Y. (1999). Response rate in academic studies – A comparative analysis. Human Relations, 52(4), 421-438.
Bosnjak, M., Tuten, T. L., & Wittman, W. W. (2005). Unit (non)response in web-based access panel surveys: An extended planned-behavior approach. Psychology and Marketing, 22, 489-505.
Dillman, D. A. (2000). Mail and Internet Surveys: The Tailored Design Method. New York, NY: John Wiley & Sons.
Groves, R., Presser, S., & Dipko, S. (2004). The role of topic interest in survey participation decisions. Public Opinion Quarterly, 68, 2-31.
Rogelberg, S. G., & Luong, A. (1998). Nonresponse to mailed surveys: A review and guide. Current Directions in Psychological Science, 7, 60-65.
Rogelberg, S. G., Luong, A., Sederburg, M. E., & Cristol, D. S. (2000). Employee attitude surveys: Examining the attitudes of noncompliant employees. Journal of Applied Psychology, 85(2), 284-293.
Rogelberg, S. G., Fisher, G. G., Maynard, D., Hakel, M. D., & Horvath, M. (2001). Attitudes toward surveys: Development of a measure and its relationship to respondent behavior. Organizational Research Methods, 4, 3-25.
Rogelberg, S. G., Conway, J. M., Sederburg, M. E., Spitzmuller, C., Aziz, S., & Knight, W. E. (2003). Profiling active and passive nonrespondents to an organizational survey. Journal of Applied Psychology, 88(6), 1104-1114.
Rubin, D. (1987). Multiple Imputation for Nonresponse in Surveys. New York, NY: John Wiley & Sons.
Tomaskovic-Devey, D., Leiter, J., & Thompson, S. (1994). Organizational survey response. Administrative Science Quarterly, 39, 439-457.
Weiner, S. P., & Dalessio, A. T. (2006). Oversurveying: Causes, consequences, and cures. In A. I. Kraut (Ed.), Getting Action From Organizational Surveys: New Concepts, Methods, and Applications (pp. 294-311). San Francisco, CA: Jossey-Bass.
Yammarino, F. J., Skinner, S. J., & Childers, T. L. (1991). Understanding mail survey response behavior: A meta-analysis. Public Opinion Quarterly, 55, 613-639.
Segment 2: Methods to Facilitate Response
Quick brainstorm: I have thought of 12 response facilitation techniques. How many can you as a group come up with in three minutes?
Methods To Facilitate Response
Actively publicize the survey. Personally notify your potential participants that they will be receiving a survey in the near future. Provide incentives, if appropriate; inexpensive items such as pens, key chains, or certificates for free food/drink can increase responses.
Keep the survey to a reasonable length. A theory-driven approach to survey design helps determine what is absolutely necessary to include in the survey instrument. Do not use the “kitchen sink” approach. What is a reasonable length?
Be sensitive to the actual physical design of your survey. For example, how questions are ordered may impact respondent participation. A study by Roberson and Sundstrom (1990) suggests placing the more interesting and easy questions first and demographic questions last.
Send reminder notes. Response rates may bump up 3-7% with each reminder note, but keep in mind that there is a point of diminishing returns when you irritate people who have chosen not to participate. Give everyone the opportunity to participate (e.g., paper surveys where required, scheduling time off the phone in call centers, etc.). At one company, for example, most surveys run for 10 business days and span three work weeks. Track response rates so that survey coordinators can identify units with low response rates and contact the responsible manager to increase responses.
Foster commitment to the survey effort. For example, you can involve a wide range of employees (across many levels) in the survey development process. Link the content of the survey to important business outcomes. Provide respondents with survey feedback after the project is completed. Be careful not to abandon your participants once you have gotten the data you wanted from them; you are paving the way for future survey efforts.
Personalization of the survey invitation; a personal signature as part of the cover letter.
Topic salience.
Even when controlling for the presence of other techniques, advance notice, personalization, identification numbers, and salience are associated with higher response rates. Because of survey fatigue and declining response rates, we need to do more just to get the same results as in the past. Target your facilitation strategy to whom you are surveying: for top executives, Anseel found that salience of the survey topic mattered most and that incentives were counterproductive, whereas incentives worked for unemployed individuals.
Segment 3: Survey Reduction Techniques in Detail
Quick brainstorm: I have thought of seven ways of reducing a survey. How many reduction methods can you think of in three minutes?
Primary Goal: Reduce Administration Time
Secondary goals:
Reduce perceived administration time.
Increase the engagement of the respondent with the experience of completing the instrument – lock in interest and excitement from the start.
Reduce the extent of missing and erroneous data due to carelessness, rushing, test forms that are hard to use, etc.
Increase the respondents’ ease of experience (maybe even enjoyment!) so that they will persist to the end AND respond again next year (or whenever the next survey comes out).
Conclusions?
Make the survey SEEM as short and compact as possible.
Streamline the WHOLE EXPERIENCE, from the first call for participation all the way to the end of the final page of the instrument.
Focus test-reduction efforts on the easy stuff before diving into the nitty-gritty statistical stuff.
Instruction Reduction
Fewer than 4% of respondents make use of printed instructions (Novick and Ward, 2006, ACM-SIGDOC).
Comprehension of instructions only influences novice performance on surveys (Catrambone, 1990, HCI).
Instructions on average are written five grade levels above the average grade level of respondents; 23% of respondents failed to understand at least one element of instructions (Spandorfer et al., 1993, Annals of EM).
Instruction Reduction: Conclusions
Unless you are working with a special/unusual population, you can assume that respondents know how to complete Likert scales and other common response formats without instructions.
Most people don’t read instructions anyway. When they do, the instructions often don’t help them respond any better!
If your response format is so novel that people require instructions, then you have a substantial burden to pilot test in order to ensure that people comprehend the instructions and respond appropriately. Otherwise, do not take the risk!
Archival Demographics
Most survey projects seek to subdivide the population into meaningful groups: gender, race/ethnicity, age; managers and non-managers; exempt and non-exempt; part time and full time; unit and departmental affiliations. Demographic data are critical, yet demographic data often comprise one page, 5-15 questions, and 1-3 minutes of administration time per respondent.
Self-completed demographic data frequently contain missing fields or intentional mistakes.
Archival Demographics
For the sake of anonymity, these data can be de-identified up front and attached to a randomly generated (alphanumeric) code – in other words, have the demographic form contain a code, and that code is matched to the survey.
Respondents should feel like demographics are not serving to identify them in their survey responses.
You could offer respondents two choices: match (or automatically fill in) some or all demographic data using the code number provided in the invitation email (or on a paper letter), or have them fill in the demographic data themselves (on web-based surveys, a reveal can branch respondents to the demographics page).
Forms/Interface Design
From Don Dillman’s Tailored Design Method, key form/interface design goals: non-subordinating language, no embarrassment, no drudgery, readability, simplicity.
Drudgery – Questions that require data lookup, calculation, interpolation, or recall of specific events from the distant past; the response process should give a sense of forward momentum and achievement.
Readability – Grade level should match the respondent population’s reading capability.
Simplicity – Layout should draw the eye directly to the items and response fields; the response method should fit respondents’ experience and expectations.
Discuss: Any particularly frustrating surveys? Particularly easy/streamlined ones?
Eligibility, Skip Logic, and Branching
Eligibility: If a survey has eligibility requirements, the screening questions should be placed at the earliest possible point in the survey. (Eligibility requirements can appear in instructions, but this should not be the sole method of screening out ineligible respondents.)
Skip logic: Skip logic actually shortens the survey by setting aside questions for which the respondent is ineligible (a small sketch follows below).
Branching: Branching may not shorten the survey, but it can improve the user experience by offering questions specifically focused on the respondent’s demographic or reported experience.
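A toy sketch of skip logic as a data structure, assuming an in-memory question list and a single eligibility rule; the question IDs, wording, and canned answers are illustrative only.

```python
# Skip logic: an eligibility answer removes the questions that depend on it.
questions = [
    ("eligible", "Have you stayed in a hotel in the last year? (yes/no)"),
    ("stay_sat", "How satisfied were you with your most recent stay? (1-5)"),
    ("overall",  "Any other comments about travel this year?"),
]
skip_rules = {"eligible": lambda ans: {"stay_sat"} if ans == "no" else set()}

def run_survey(canned_answers):
    answers, skipped = {}, set()
    for qid, text in questions:
        if qid in skipped:
            continue                              # question set aside by skip logic
        answers[qid] = canned_answers[qid]
        skipped |= skip_rules.get(qid, lambda a: set())(answers[qid])
    return answers

print(run_survey({"eligible": "no", "stay_sat": "4", "overall": "none"}))
# -> {'eligible': 'no', 'overall': 'none'}  (stay_sat was skipped)
```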
Implications: Eligibility, Skip Logic, and Branching
Discuss: Have you ever answered a survey where you knew that your answer would predict how many questions you would have to answer after that? E.g., “How many hotel chains have you been to in the last year?”
If users can predict that their eligibility, the survey’s skip logic, or survey branching will lead to longer, more complex, or more difficult or tedious responses, they may: abandon the survey, or back up and change their answer to the conditional question to one that leads to less work (if the interface permits it).
Branch design should try not to imply what the user would have experienced in another branch. Paths through the survey should avoid causing considerably more work for some respondents than for others – if at all possible.
Panel Designs and Multiple Administration
Panel designs measure the same respondents on multiple occasions. Typically, either predictors are gathered at an early point in time and outcomes at a later point in time, or both predictors and outcomes are measured at every time point. (There are variations on these two themes.) Panel designs are based on maturation and/or intervention processes that require the passage of time. Examples: career aspirations over time, person-organization fit over time, training before/after – discuss others? Minimally, panel designs can help mitigate (though not solve) the problem of common method bias; e.g., responding to a criterion at time 2, respondents tend to forget how they responded at time 1.
Panel Designs and Multiple Administration
Survey designers can apply the logic of panel designs to their own surveys. Sometimes you have to collect a large number of variables (no measure shortening), and it is impractical to do so in a single administration. Generally speaking, it is better to have many short, pleasant survey administrations with a cumulative “work time lost” of an hour than one long, grinding hour-long survey. The former can get you happier and less fatigued respondents and, hopefully, better data. In the limit, consider the implications of a “Today’s Poll” approach to measuring climate, stress, satisfaction, or other attitudinal variables: one question per day, every day….
Unobtrusive Behavioral Observation
Surveys appear convenient and relatively inexpensive in and of themselves; however, the cumulative work time lost across all respondents may be quite large. Methods that assess social variables through observations of overt behavior rather than self-report can provide indications of stress, satisfaction, organizational citizenship, intent to quit, and other psychologically and organizationally relevant variables.
Examples: cigarette breaks over time (frequency, number of incumbents per day); garbage (weight of trash before/after a recycling program); social media usage (tweets, blog posts, Facebook); wear of floor tiles; absenteeism or tardiness records; incumbent, team, and department production quality and quantity measures.
Unobtrusive Behavioral Observation
Most unobtrusive observations must be conducted over time: establish a baseline for the behavior, then examine subsequent time periods for changes/trends over time. This is generally much more labor-intensive data collection than surveys, and results should be cross-validated with other types of evidence.
Scale Reduction and One-item Measures
Standard scale construction calls for “sampling the construct domain” with items that tap into different aspects of the construct and refer to various content areas. Scales with more items can include a larger sample of the behaviors or topics relevant to the construct.
[Diagram: two overlapping circles, Construct Domain and Item Content. The overlap is RELEVANT (measuring what you want to measure); item content outside the construct domain is CONTAMINATED (measuring what you don’t want to measure); construct domain not covered by item content is DEFICIENT (not measuring what you want to measure).]
Scale Reduction and One-item Measures
When fewer items are used, by necessity they must be either more general in wording to obtain full coverage (hopefully) or more narrow to focus on a subset of behaviors/topics. Internal consistency reliability reinforces this trade-off: as the number of items gets smaller, the inter-item correlation must rise to maintain a given level of internal consistency (see the formula below). However, scales with fewer than 3-5 items rarely achieve acceptable internal consistency without simply becoming alternative wordings of the same questions.
Discussion: How many of you have taken a measure where you were being asked the same question again and again? Your reactions? Why was this done?
The one-item solution: A one-item measure usually “covers” a construct only if it is highly non-specific. A one-item measure has a measurable reliability (see Wanous & Hudy, ORM, 2001), but the concept of internal consistency is meaningless.
Discuss: A one-item knowledge measure vs. a one-item job satisfaction measure.
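The trade-off can be made concrete with the standardized-alpha (Spearman-Brown) relationship; this is standard psychometrics rather than a formula taken from the original slides:

$$
\alpha \;=\; \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
$$

where k is the number of items and r̄ is the mean inter-item correlation. Holding α = .80 constant, ten items need r̄ ≈ .29, while three items need r̄ ≈ .57.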
One-item Measure Literature
Nagy (2002): single-item measures of each of the five JDI job satisfaction facets correlated between .60 and .72 with the full-length versions of the JDI scales.
Patrician (2004): review of single-item graphical representation scales, the so-called “faces” scales.
Shamir & Kark (2004): a single-item graphic scale for organizational identification.
Oshagbemi (1999): single-item job satisfaction scales systematically overestimate workers’ job satisfaction.
Loo (2002): single-item measures work best on “homogeneous” constructs.
Scale Reduction: Technical Considerations
Items can be struck from a scale based on three different sets of qualities:
1. Internal item qualities refer to properties of items that can be assessed in reference to other items on the scale or the scale’s summated scores.
2. External item qualities refer to connections between the scale (or its individual items) and other constructs or indicators.
3. Judgmental item qualities refer to those issues that require subjective judgment and/or are difficult to assess in isolation from the context in which the scale is administered.
The most widely used method for item selection in scale reduction is some form of internal consistency maximization. Corrected item-total correlations provide diagnostic information about internal consistency, and in scale reduction efforts item-total correlations have been employed as a basis for retaining items for a shortened scale version. Factor analysis is another technique that, when used for scale reduction, can lead to increased internal consistency, assuming one chooses items that load strongly on a dominant factor.
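A brief sketch of the corrected item-total approach, assuming a data file with one column per item; the file name, column names, and the cutoff of five retained items are illustrative.

```python
# Rank items by corrected item-total correlation as a basis for a short form.
import pandas as pd

items = pd.read_csv("scale_responses.csv")        # one column per scale item

def corrected_item_total(df: pd.DataFrame) -> pd.Series:
    total = df.sum(axis=1)
    # correlate each item with the total score computed WITHOUT that item
    return pd.Series({c: df[c].corr(total - df[c]) for c in df.columns})

ranked = corrected_item_total(items).sort_values(ascending=False)
short_form = ranked.head(5).index.tolist()        # e.g., keep the top 5 items
print(ranked.round(3))
print("Candidate short form:", short_form)
```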
Scale Reduction II
Despite their prevalence, there are important limitations of scale reduction techniques that maximize internal consistency. Choosing items to maximize internal consistency leads to item sets that are highly redundant in appearance, narrow in content, and potentially low in validity. High internal consistency often signifies a failure to adequately sample content from all parts of the construct domain. To obtain high values of coefficient alpha, a scale developer need only write a set of items that paraphrase each other or are antonyms of one another. One can expect an equivalent result (i.e., high redundancy) from using the analogous approach in scale reduction, that is, excluding all items but those highly similar in content.
Scale Reduction III
IRT provides an alternative strategy for scale reduction that does not focus on maximizing internal consistency. One should retain items that are highly discriminating (i.e., moderate to large values of a) and attempt to include items with a range of item thresholds (i.e., b) that adequately cover the expected range of the trait in measured individuals. IRT analysis for scale reduction can be complex and does not provide a definitive answer to the question of which items to retain; rather, it provides evidence for which items might work well together to cover the trait range.
Relating items to external criteria provides a viable alternative to internal consistency and other internal qualities. Because correlations vary across different samples, instruments, and administration contexts, an item that predicts an external criterion best in one sample may not do so in another. Choosing items to maximize a relation with an external criterion also runs the risk of decreasing discriminant validity between the measures of the two constructs.
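For reference, the a and b parameters mentioned above are those of the standard two-parameter logistic (2PL) IRT model; the formula is standard and is not reproduced from the slides themselves:

$$
P_i(\theta) \;=\; \frac{1}{1 + e^{-a_i(\theta - b_i)}}
$$

where θ is the latent trait, a_i is the item’s discrimination, and b_i is its threshold (the trait level at which a positive response becomes more likely than not).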
Scale Reduction IV
The overarching goal of any scale reduction project should be to closely replicate the pattern of relations established within the construct’s nomological network. In evaluating any given item’s relations with external criteria, one should seek moderate correlations with a variety of related scales (i.e., convergent validity) and low correlations with a variety of unrelated measures.
Researchers may also need to examine criteria beyond statistical relations to determine which items should remain in an abbreviated scale: clarity of expression, relevance to a particular respondent population, the semantic redundancy of an item’s content with other items, the perceived invasiveness of an item, and an item’s “face” validity. Items lacking apparent relevance, or that are highly redundant with other items on the scale, may be viewed negatively by respondents. To the extent that judgmental qualities can be used to select items with face validity, both the reactions of constituencies and the motivation of respondents may be enhanced.
A simple strategy for retention that does not require IRT analysis is stepwise regression: rank-ordered item inclusion in an “optimal” reduced-length scale that accounts for a nearly maximal proportion of variance in its own full-length summated scale score. Order of entry into the stepwise regression is a rank-order proxy indicating item goodness (a sketch follows below). Empirical results show that this method performs as well as a brute-force combinatorial scan of item combinations; the method can also be combined with human judgment to pick items from among the top-ranked items (but not in strict ranking order).
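A sketch of the forward stepwise idea described above, assuming a data file with one column per item; file and column names are illustrative, and this is not the authors’ own implementation.

```python
# Rank items by how much variance in the full-length summated score they
# explain when entered forward, one at a time.
import pandas as pd
import statsmodels.api as sm

items = pd.read_csv("scale_responses.csv")
total = items.sum(axis=1)                      # full-length summated scale score

remaining, selected, order = list(items.columns), [], []
while remaining:
    # pick the item that adds the most R^2 given the items already selected
    best = max(
        remaining,
        key=lambda c: sm.OLS(total, sm.add_constant(items[selected + [c]])).fit().rsquared,
    )
    selected.append(best)
    remaining.remove(best)
    r2 = sm.OLS(total, sm.add_constant(items[selected])).fit().rsquared
    order.append((best, round(r2, 3)))

print(order)   # rank-ordered items with cumulative R^2 toward the full score
```

The printed order of entry serves as the rank-order proxy for item goodness; a judgment-informed short form can then be drawn from the top-ranked items.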
Segment 4: Pitfalls, Trade-offs, and Justifications
Quick brainstorm: What complaints have you heard from managers when you ask them if you can survey their employees?
Evaluating Surveying Costs and Benefits
Wang and Strong’s Data Quality Framework (1996; JoMIS): top five and bottom five data quality concerns of N = 355 managers [table not reproduced].
Inference: A survey effort is seen as more valuable to the extent that it is completed quickly and cost-effectively, such that results make sense to managers and seem unbiased.
Trade-offs with Reduced Surveys
The shorter the survey…
…the higher the response rate
…the less work time that is lost

PACIS Survey Workshop

  • 1.
    Surveys, Response Rates,and NonresponseJeffrey StantonSyracuse University1
  • 2.
    AbstractThe survey methodis one of the most popular methods in Information Systems research. One problem that plagues most survey researchers is nonresponse. As theories get more complex and more constructs must be measured, surveys tend to get longer, and this can reduce response rates. Some traditions have developed around acceptable response rates, such as these: “In my area a 45% response rate is considered quite good...” or “that research can’t possibly be valid given a response rate under 20%.” Research shows that these and other myths about response rate are incorrect. In this tutorial, participants will learn four things: 1) The important difference between response rate and nonresponse bias, and why it is more important to minimize the latter rather than maximizing the former; 2) the full range of response enhancing techniques that have had their efficacy documented in the methods literature; 3) a focus on a particular method of response enhancement – survey and scale shortening; and 4) survey design methods that allow for the detection of the presence and magnitude of nonresponse bias. At the end of the tutorial, participants will have the skills and knowledge to build assessment and control of nonresponse into their survey methods.2
  • 3.
    Why do wecare if we our study has a low response rate?
  • 4.
    Low Response Rates…causesmaller data samples which decrease statistical power, increase confidence intervals, and may limit the types of statistical techniques that can effectively be applied to the collected data. …undermine the perceived credibility of the collected data…undermine the actualgeneralizability of the collected data because of nonresponse bias. Where nonresponse bias exists, survey results can produce misleading conclusions that do not generalize to the entire population
  • 5.
    Research History1939F. Stanton(1939) wrote one of the first empirical pieces on the topic in the Journal of Applied Psychology entitled, “Notes on the validity of mail questionnaire returns.”Suchmanand McCandless’s (1940) Journal of Applied Psychology article titled, “Who answers questionnaires?” A significant early event to draw interest in this topic occurred in 1948.19482010
  • 6.
    In the 1948U.S. presidential election, pre-election polls by major newspapers and polling organizations predicted a victory by New York State Governor, Thomas E. Dewey, ranging between 5 to 15 percentage points.Instead, the victory by incumbent president Harry S. Truman was an embarrassment for the emerging public opinion polling community. What caused the failure?
  • 7.
    Research TodayExtensive literatureon techniques to increase response rates: response enhancing strategiesStatistical methods of compensating for nonresponse through imputation of missing data Rubin (1987) developed a book length treatment of methods for imputing data in sample surveys. Characteristics of Nonrespondents & Nonresponse BiasRogelberg, S.G., Spitzmüller, C., Little, I.S., & Reeve, C.L. (2006). Understanding Response Behavior to an Online Special Topics Organizational Satisfaction Survey. Personnel Psychology, 59, 903-923
  • 8.
    Organizational Survey ResponseRatesYoussefinia (2003) examined 58 organizational surveys conducted over five years by two consulting firms. Anseel, Lievens, Schollaert(2008) Analyzed 2037 surveys, covering 1,251,651 individual respondents, published in 12 journals in I/O Psychology, Management, and Marketing during the period 1995-2008.You predict:What is a typical response rate in an organizational survey?What is the trend over recent years for response rates in organizations?
  • 9.
  • 10.
    What is anacceptable response rate for your study?
  • 11.
    Trick Question?Industry andacademic standards only put a response rate into context The fact that everyone else also achieves 30%, 50%, or 70% response does not help to demonstrate that the reported research is free from nonresponse bias. In the absence of good information about presence, magnitude, and direction of nonresponse bias, ignoring the results of a study with a 10% response rate -- particularly if the research question explores a new and previously unaddressed issue -- is just as foolish as assuming that one with a response rate of 80% is unassailable.
  • 12.
    The Nature ofNonresponse Bias where ‘PNR’ refers to the proportion of non-respondents, ‘Xres’ is the respondent mean on a survey relevant variable and ‘Xpop’ is the population mean on the corresponding survey relevant variable, if it were actually known. Overall, the impact of nonresponse on survey statistics depends on the percentage not responding and the extent to which those not responding systematically different from the whole population on survey relevant variables.
  • 13.
    Error/Bias due toNon-Response(http://www.idready.org/courses/2005/spring/survey_SamplingFrames.pdf)Non-Respondentsµ0, α0, r0, F0Respondentsµ1, α1, r1, F1HowMany?HowDifferent?May 15-17, 2008
  • 14.
    Sample:N=100If non-respondents resemblerespondents, then low response rate is not a problem.n=5 say YES10% Responsen=10n=5 say NO
  • 15.
    Sample:N=100Even when responserates are “high” substantial potential for error still exists.n=35 say YES70% Responsen=70n=35 say NO
  • 16.
    Worst Case ScenarioExerciseWhat’s are the worst two things that could happen?Choose one scenario below and run the numbers…1. Sample N=100; Response rate 30%; Percent YES for respondents = 40% (12 YES votes)2. Sample N=100; Response rate 30%; Percent YES for respondents = 90% (27 YES votes)3. Sample N=100; Response rate: 90%; Percent YES for respondents = 50% (45 YES votes)4. Sample N=100; Response rate: 90%; Percent YES for respondents = 80% (72 YES votes)If you counted all non-respondents as YES votes or NO votes, what would be the range of results in each of these scenarios?
  • 17.
    Previous Examples: ProportionsOfcourse, rating scale means can be similarly impacted by non-response errorWhat about correlations?Word on the street is that correlations are fairly robust against non-response errorWe ran a simulation – 300 runs of random samples of n=500 from a larger population where rho=0.284 between a rating scale and a criterion scaleHalf of the samples had biased nonresponse where those favorable on a rating scale were twice as likely to respondResults showed that the unbiased samples had slightly suppressed correlations: a decline of r=0.038Biased samples had more suppression: a decline of r=0.066Difference in suppression was statistically significant, p<.001No sign reversals of correlations in any of the 300 samples
  • 18.
    Case Study ExerciseInstructions:Readbrief case, make some notesDiscuss with others at your table Generate as many ideas as you can, write ideas on sheetsPrepare to report back to complete group Case Overview:You are an preparing a climate surveyUnlikely to get response rate much above 45%How can you prepare for possible criticism of results?
  • 19.
    N-BIASResponse rate aloneis an inaccurate and unreliable proxy for study quality. While improving response rates is a worthy goal, researchers’ major efforts and resources should go into understanding the magnitude and direction of bias caused by non-response, if it exists. Rogelberg and Stanton (2006) advocate that researchers should conduct a nonresponse bias impact assessment (N-BIAS), regardless of how high a response rate is achieved.
  • 20.
    N-BIAS MethodsN-BIAS ispresently composed of eight techniquesArchival AnalysisFollow-up ApproachWave AnalysisPassive Nonresponse AnalysisInterest Level AnalysisActive Nonresponse AnalysisWorst Case ResistanceBenchmarking/NormsDemonstrate Generalizability
  • 21.
    N-BIAS: How itWorksSimilar to a test validation strategy. In amassing evidence for validity, each of several different validation methods (e.g., concurrent validity) provides a variety of insights into validity. Each assessment approach has strengths and limitations. There is no one conclusive approach and no particular piece of evidence that is sufficient to ward off all threats. Assessing the impact of nonresponse bias requires development and inclusion of different types of evidence, and the case for nugatory impact of nonresponse bias is built on multiple pieces of evidence that converge with one another.
  • 22.
    Technique 1: ArchivalAnalysisMost common techniqueThe researcher identifies an archival database that contains the members of the whole survey sample (e.g. personnel records).That data set, usually containing demographic data, can be described:50% Female; 40% Supervisors, etcAfter data collection, code numbers on the returned surveys (or access passwords) can be used to identify respondents, and by extension nonrespondents. Using this information, the archival database can be partitioned into two segments: 1) data concerning respondents; and 2) data concerning nonrespondents.
  • 23.
    So, if youfound the above do you have nonresponse bias?
  • 24.
    Technique 2: Follow-upApproachUsing identifiers attached to returned surveys (or access passwords), respondents can be identified and by extension nonrespondents.The follow-up approach involves randomly selecting and resurveying a small segment of nonrespondents, using an alternative modality and often by phone. The full or abridged survey is then administered.In the absence of identifiers, telephone a small random sample and ask whether they responded or not to the initial survey. Follow-up with survey relevant questions
  • 25.
    Technique 3: WaveAnalysisBy noting in the data set whether each survey was returned before the deadline, after an initial reminder note, after the deadline, and so on, responses from pre-deadline surveys can be compared with the late responders on actual survey variables (e.g. compare job satisfaction levels).
  • 26.
    Technique 4: PassiveNonresponse AnalysisRogelberg et al. (2003) found that the vast majority of nonresponse can be classified as being passive in nature (approx. 85%).Passive nonresponse does not appear to be planned.When asked (upon receipt of the survey), these individuals indicate a general willingness to complete the survey – if they have the time. Given this, it is not surprising that they generally do not differ from respondents with regard to job satisfaction or related variables.
  • 27.
    Technique 5: InterestLevel AnalysisResearchers have repeatedly identified that interest level in the survey topic is one of the best predictors of a respondent’s likelihood of completing the survey.As a result, if interest level is related to attitudinal standing on the topics making up the survey, the survey results are susceptible to bias.E.g., if low interest individuals tend to be more dissatisfied on the survey constructs in question, results will be biased “high”
  • 28.
    Technique 6: ActiveNonresponse AnalysisActive nonrespondents, in contrast to passive nonrespondents, are those that overtly choose not to respond to a survey effort. The nonresponse is volitional and a priori (i.e. it occurs when initially confronted with a survey solicitation).Active nonrespondents tend to differ from respondents on a number of dimensions typically relevant to the organizational survey researcher (e.g. job satisfaction)
  • 29.
    Technique 7: WorstCase ResistanceGiven the data collected from study respondents in an actual study, one can empirically answer the question of what proportion of nonrespondents would have to exhibit the opposite pattern of responding to adversely influence sample results.Similar philosophy as what occurs in meta-analyses when considering the “file-drawer problem”By adding simulated data to an existing data set, one can explore how resistant the dataset is to worst case responses from non-respondents.
  • 30.
    Technique 8: BenchmarkingUsingmeasures with norms for the population under examination, compare means and standard deviations of the collected sample to the norms
  • 31.
    Technique 9: DemonstrateGeneralizabilityBy definition, nonresponse bias is a phenomenon that is peculiar to a given sample under particular study conditions.Triangulating with a sample collected using a different method, or varying the conditions under which the study is conducted should also have effects on the composition of the nonrespondents group.
  • 32.
    N-BIAS: ConclusionNonresponse canbe problematic on a number of frontsDo what you can to facilitate responseIn the inevitable case of nonresponse, engage in the N-BIAS approach in an attempt to accumulate information to provide insight into the presence and absence of problematic nonresponse biasEngage in as many techniques as feasible. 1, is better than 0, 2 is better than 1. Most published literature has none!Each approach has a different purpose, each has positives and negatives.Use N-BIAS information collected to decide on next steps and educate your audience
  • 33.
    Armstrong, J. S.,& Overton, T. S. (1977). Estimating nonresponse bias in mail surveys Journal of Marketing Research, 14 (Special Issue: Recent Developments in Survey Research), 396-402.Baruch, Y. (1999). Response rate in academic studies – A comparative Analysis. Human Relations, 52 (4), 421-438. Bosnjak M., Tuten, T.L., & Wittman , W. W. (2005). Unit (non) response in web-based access panel surveys: An extended planned-behavior approach. Psychology and Marketing, 22, 489-505.Dillman, Don A. (2000). Mail and Internet Surveys: The Tailored Design Method. New York, NY, US: John Wiley & Sons, Inc.Groves, R., Presser, S., & Dipko, S. (2004). The role of topic interest in survey participation decisions. Public Opinion Quarterly, 68, 2-31. Rogelberg, S.G., & Luong, A. (1998). Nonresponse to mailed surveys: A review and guide. Current Directions in Psychological Science, 7, 60-65.Rogelberg, S.G., Luong, A., Sederburg, M.E., & Cristol, D.S. (2000). Employee Attitude Surveys: Examining the Attitudes of Noncompliant Employees. Journal of Applied Psychology, 85(2), 284-293.Rogelberg, S. G., Fisher, G. G., Maynard, D, Hakel, M.D., & Horvath, M. (2001). Attitudes Toward Surveys: Development of a Measure and its Relationship to Respondent Behavior. Organizational Research Methods, 4, 3-25.Rogelberg, S. G., Conway, J. M.., Sederburg, M. E., Spitzmuller, C., Aziz, S., Knight, W. E. (2003). Profiling Active and Passive-non-respondents to an Organizational Survey. Journal of Applied Psychology, 88 (6), 1104-1114.Rubin, D. (1987). Multiple imputation for nonresponse in surveys. New York, NY, US: John Wiley & Sons, Inc.Tomaskovic-Devey, D., Leiter, J., & Thompson, S. (1994). Organizational survey response. Administrative Science Quarterly, 39, 439-457Weiner, S.P. & Dalessio, A.T. (2006). Oversurveying: Causes, consequences, and cures. In A.I. Kraut (Ed.), Getting Action From Organizational Surveys: New Concepts, Methods, and Applications. (pp 294-311) San Francisco, California: Jossey-BassYammarino, F.J., Skinner, S.J., & Childers, T.L. (1991). Understanding mail survey response behavior: A meta-analysis. Public Opinion Quarterly, 55, 613–639.Key References
  • 34.
    Segment 2:Methods toFacilitate ResponseQuick brainstorm: I have thought of 12 response facilitation techniques. How many can you as a group come up with in three minutes?34
  • 35.
    Methods To FacilitateResponseActively publicize the survey. Personally notify your potential participants that they will be receiving a survey in the near future. Provide incentives, if appropriate. Inexpensive items such as pens, key chains, or certificates for free food/drink can increase responses.
  • 36.
    Keep the surveyto a reasonable length. A theory-driven approach to survey design helps determine what is absolutely necessary to include in the survey instrument. Do not use the “kitchen sink” approach.What is a reasonable length?Be sensitive to the actual physical design of your survey. For example, how questions are ordered may impact respondent participation. A study by Roberson and Sundstrom (1990) suggests placing the more interesting and easy questions first and demographic questions last.
  • 37.
    Send reminder notes.Response rates may bump up 3-7% with each reminder note, but keep in mind that there's a point of diminishing returns when you irritate people who have chosen not to participate. Give everyone the opportunity to participate (e.g., paper surveys where required, scheduling time off the phone in the call centers, etc.). At one company for example, most surveys run for 10 business days and span across three work weeks. Track response rates so that the survey coordinators can identify units with low response rates and contact the responsible manager to increase responses. 
  • 38.
    Foster commitment tothe survey effort. For example, you can involve a wide range of employees (across many levels) in the survey development process. Link the content of the survey to important business outcomes. Provide respondents with survey feedback after the project is completed. Be careful not to abandon your participants once getting the data you wanted from them. You are paving the way for future survey efforts.Personalization of the survey invitation. Personal signature as part of cover letter.Topic salience
  • 39.
    Even when controllingfor the presence of other techniques, advance notice, personalization, identification numbers, and salience, are associated with higher response rates.Because of survey fatigue and declining response rates we need to do more just to get the same results as in the past.Target facilitation strategy to who you are surveying.For top executives, Anseel found that salience of the survey topic was most key. Incentives were counterproductiveIncentives worked for unemployed individuals
  • 40.
    Segment 3:Survey reductiontechniques in detailQuick Brainstorm: I have thought of seven ways of reducinga survey. How many reduction methods can you thinkof in three minutes?40
  • 41.
    Primary Goal: ReduceAdministration TimeSecondary goalsReduce perceived administration timeIncrease the engagement of the respondent with the experience of completing instrument  lock in interest and excitement from the startReduce the extent of missing and erroneous data due to carelessness, rushing, test forms that are hard to use, etc.Increase the respondents’ ease of experience (maybe even enjoyment!) so that they will persist to the end AND that they will respond again next year (or whenever the next survey comes out)Conclusions?Make the survey SEEM as short and compact as possibleStreamline the WHOLE EXPERIENCE from the first call for participation all the way to the end of the final page of the instrumentFocus test-reduction efforts on the easy stuff before diving into the nitty-gritty statistical stuff41
  • 42.
    Instruction ReductionFewer than4% of respondents make use of printed instructionsNovick and Ward (2006, ACM-SIGDOC)
  • 43.
    Comprehension of instructionsonly influences novice performance on surveysCatrambone (1990; HCI)
  • 44.
    Instructions on averageare written five grade levels above average grade level of respondent; 23% of respondents failed to understand at least one element of instructionsSpandorfer et al. (1993; Annals of EM)42
  • 45.
    Instruction ReductionConclusionsUnless youare working with a special/unusual population, you can assume that respondents know how to complete Likert scales and other common response formats without instructions
  • 46.
    Most people don’tread instructions anyway. When they do, the instructions often don’t help them respond any better!
  • 47.
    If your responseformat is so novel that people require instructions, then you have a substantial burden to pilot test, in order to ensure that people comprehend the instructions and respond appropriately. Otherwise, do not take the risk!43
  • 48.
    Archival DemographicsMost surveyprojects seek to subdivide the population into meaningful groupsgender, race/ethnicity, agemanagers and non-managersexempt and non-exemptpart time and full timeunit and departmental affiliationsDemographic data are criticalDemographic data often comprise one page, 5-15 questions, and 1-3 minutes of administration time per respondent
  • 49.
    Self-completed demographic datafrequently containing missing fields or intentional mistakes44
  • 50.
    Archival DemographicsFor thesake of anonymity, these data can be de-identified up front and attached to randomly generated code (alphanumeric) - in other words, have the demographic form contain a code, and that code is matched to the survey.
  • 51.
    Respondents should feellike demographics are not serving to identify them in their survey responses.
  • 52.
    You could offerrespondents two choices: match (or automatically fill in) some/all demographic data using the code number provided in your invitation email (or on a paper letter); they fill in the demographic data (on web-based surveys, a reveal can branch respondents to the demographics page)45
  • 53.
    From Don Dillman’sTailoredDesign Method: Key Form/Interface Design Goals – Non-subordinating language, No embarrassment, No drudgery, Readability, SimplicityDrudgery – Questions that require data lookup, calculation, interpolation, recall of specific events from distant past; response process should give a sense of forward momentum and achievementReadability – Grade level should match respondent population reading capabilitySimplicity – Layout should draw the eye directly to the items and response fields; response method should fit respondents’ experience and expectations Discuss: Any particularly frustrating surveys? Particularly easy/streamlined ones? 46Forms/Interface Design
  • 54.
  • 55.
    EligibilityIf a surveyhas eligibility requirements, the screening questions should be placed at the earliest possible point in the survey.(eligibility requirements can appear in instructions, but this should not be the sole method of screening out ineligible respondents)Skip LogicSkip logic actually shortens the survey by setting aside questions for which the respondent is ineligible.BranchingBranching may not shorten, but can improve the user experience by offering questions specifically focused to the respondent’s demographic or reported experience.48Illustration credit: Vovici.comEligibility, Skip Logic, and Branching
  • 56.
    Discuss: Ever answera survey where you knew that your answer would predict how many questions you would have to answer after that?e.g., “How many hotel chains have you been to in the last year?”If users can predict that their eligibility, the survey skip logic, or survey branching will lead to longer responses, more complex responses, or more difficult or tedious responses, they may:Abandon the surveyBackup and change their answer to the conditional with less work (if the interface permits it).49Implications: Eligibility, Skip Logic, and BranchingIllustration credit: Vovici.com
  • 57.
    Branch design should try not to imply what the user would have experienced in another branch.Paths through the survey should avoid causing considerably more work for some respondents than for others– if at all possible.50Implications: Eligibility, Skip Logic, and BranchingIllustration credit: Vovici.com
  • 58.
Panel Designs and Multiple Administration. Panel designs measure the same respondents on multiple occasions. Typically, either predictors are gathered at an early point in time and outcomes at a later point, or both predictors and outcomes are measured at every time point (there are variations on these two themes). Panel designs are based on maturation and/or intervention processes that require the passage of time; examples include career aspirations over time, person-organization fit over time, and training before/after (discuss others?). Minimally, panel designs can help mitigate (though not solve) the problem of common method bias: when responding to a criterion at time 2, respondents tend to forget how they responded at time 1.
Panel Designs and Multiple Administration (continued). Survey designers can apply the logic of panel designs to their own surveys. Sometimes you must collect a large number of variables (no measure shortening is possible), and it is impractical to do so in a single administration. Generally speaking, it is better to have many short, pleasant survey administrations with a cumulative work time lost of an hour than one long, grinding hour-long survey; the former can yield happier, less fatigued respondents and, hopefully, better data. In the limit, consider the implications of a "Today's Poll" approach to measuring climate, stress, satisfaction, or other attitudinal variables: one question per day, every day. A sketch of splitting an item pool across short administrations follows.
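One simple way to operationalize this is to deal a long item pool into several short waves, as in the hypothetical sketch below; the item names, wave count, and timing assumption are invented, and a respondent code carried on every wave would allow the waves to be merged back into one record per person.

```python
# Minimal sketch: split a long item pool into several short administrations
# ("waves"), so that no single sitting is long.
import random

ITEM_POOL = [f"item_{i:02d}" for i in range(1, 61)]   # hypothetical 60-item pool
N_WAVES = 4                                           # four short sittings

def assign_items_to_waves(items, n_waves, seed=42):
    """Shuffle once, then deal items into roughly equal-sized waves."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    return [shuffled[w::n_waves] for w in range(n_waves)]

waves = assign_items_to_waves(ITEM_POOL, N_WAVES)
for w, wave_items in enumerate(waves, start=1):
    # Assumes roughly one item per minute, purely for illustration.
    print(f"Wave {w}: {len(wave_items)} items (~{len(wave_items)} minutes)")

# In practice one would likely keep all items of a given scale in the same wave,
# so that each construct is measured at a single occasion.
```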
Unobtrusive Behavioral Observation. Surveys appear convenient and relatively inexpensive in and of themselves; however, the cumulative work time lost across all respondents may be quite large. Methods that assess social variables through observation of overt behavior rather than self-report can provide indications of stress, satisfaction, organizational citizenship, intent to quit, and other psychologically and organizationally relevant variables. Examples: cigarette breaks over time (frequency, number of incumbents per day); garbage (weight of trash before/after a recycling program); social media usage (tweets, blog posts, Facebook); wear of floor tiles; absenteeism or tardiness records; incumbent, team, and department production quality and quantity measures.
Unobtrusive Behavioral Observation (continued). Most unobtrusive observations must be conducted over time: establish a baseline for the behavior, then examine subsequent time periods for changes and trends (see the sketch below). Data collection is generally much more labor intensive than for surveys, and results should be cross-validated with other types of evidence.
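A minimal sketch of that baseline-then-trend logic; the weekly counts are invented, and a real application would draw them from archival records (badge data, production logs, social media counts).

```python
# Sketch: compare a baseline period to a follow-up trend for an unobtrusive measure.
import numpy as np

# Eight weeks of baseline observation, then eight weeks after an intervention
# (e.g., weekly cigarette-break counts) -- invented numbers.
baseline_weeks = np.array([41, 38, 44, 40, 43, 39, 42, 41])
followup_weeks = np.array([40, 37, 35, 33, 31, 30, 28, 27])

baseline_mean = baseline_weeks.mean()

# Fit a straight line to the follow-up period to describe the trend.
week_index = np.arange(len(followup_weeks))
slope, intercept = np.polyfit(week_index, followup_weeks, deg=1)

print(f"Baseline mean: {baseline_mean:.1f} per week")
print(f"Follow-up trend: {slope:+.1f} per week (starting near {intercept:.1f})")
# A sustained downward slope relative to a flat baseline is the kind of pattern
# that should then be cross-validated against other evidence.
```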
Scale Reduction and One-item Measures. Standard scale construction calls for "sampling the construct domain" with items that tap into different aspects of the construct and refer to various content areas. Scales with more items can include a larger sample of the behaviors or topics relevant to the construct. [Figure: overlap of the construct domain and item content. Relevant: measuring what you want to measure. Contaminated: measuring what you don't want to measure. Deficient: not measuring what you want to measure.]
Scale Reduction and One-item Measures (continued). When fewer items are used, by necessity they must be either more general in wording to obtain full coverage (hopefully) or narrower, focusing on a subset of behaviors/topics. Internal consistency reliability reinforces this trade-off: as the number of items gets smaller, the average inter-item correlation must rise to maintain a given level of internal consistency (see the sketch below). However, scales with fewer than three to five items rarely achieve acceptable internal consistency without simply becoming alternative wordings of the same question. Discussion: How many of you have taken a measure where you were asked the same question again and again? Your reactions? Why was this done? The one-item solution: a one-item measure usually "covers" a construct only if it is highly non-specific. A one-item measure has a measurable reliability (see Wanous & Hudy, 2001), but the concept of internal consistency is meaningless for it. Discuss: a one-item knowledge measure vs. a one-item job satisfaction measure.
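This trade-off can be made concrete with the standardized form of coefficient alpha, alpha = k*r / (1 + (k-1)*r), where k is the number of items and r the average inter-item correlation. The short sketch below (a simplified illustration that assumes roughly equal item variances) solves for the r needed to hold alpha at .80 as items are cut.

```python
# Sketch of the alpha / scale-length trade-off using the standardized alpha formula.
def required_interitem_r(target_alpha: float, k: int) -> float:
    """Average inter-item correlation needed to reach target_alpha with k items."""
    return target_alpha / (k - target_alpha * (k - 1))

for k in (10, 6, 3, 2):
    r = required_interitem_r(0.80, k)
    print(f"{k:2d} items -> mean inter-item r of {r:.2f} needed for alpha = .80")
# 10 items need r of about .29; 3 items need about .57; 2 items need about .67,
# which is why very short scales drift toward near-paraphrase items.
```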
One-item Measure Literature.
- Nagy (2002): research using single-item measures of each of the five JDI job satisfaction facets found correlations between .60 and .72 with the full-length versions of the JDI scales.
- Patrician (2004): review of single-item graphical representation scales, the so-called "faces" scales.
- Shamir & Kark (2004): a single-item graphic scale for organizational identification.
- Oshagbemi (1999): research finding that single-item job satisfaction scales systematically overestimate workers' job satisfaction.
- Loo (2002): single-item measures work best on "homogeneous" constructs.
Scale Reduction: Technical Considerations. Items can be struck from a scale based on three different sets of qualities:
1. Internal item qualities: properties of items that can be assessed in reference to other items on the scale or the scale's summated scores.
2. External item qualities: connections between the scale (or its individual items) and other constructs or indicators.
3. Judgmental item qualities: issues that require subjective judgment and/or are difficult to assess in isolation from the context in which the scale is administered.
The most widely used method for item selection in scale reduction is some form of internal consistency maximization. Corrected item-total correlations provide diagnostic information about internal consistency, and in scale reduction efforts they have been employed as a basis for retaining items for a shortened scale version; a computational sketch follows. Factor analysis is another technique that, when used for scale reduction, can lead to increased internal consistency, assuming one chooses items that load strongly on a dominant factor.
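A minimal computational sketch of those internal statistics, assuming item responses in a respondents-by-items NumPy array; the example data are simulated.

```python
# Compute Cronbach's alpha and corrected item-total correlations for an item matrix.
import numpy as np

def cronbach_alpha(X: np.ndarray) -> float:
    """Coefficient alpha from an items matrix (respondents x items)."""
    k = X.shape[1]
    item_variances = X.var(axis=0, ddof=1)
    total_variance = X.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

def corrected_item_total(X: np.ndarray) -> np.ndarray:
    """Correlation of each item with the sum of the remaining items."""
    k = X.shape[1]
    total = X.sum(axis=1)
    return np.array([np.corrcoef(X[:, j], total - X[:, j])[0, 1] for j in range(k)])

# Simulated example: 200 respondents, 8 Likert-type items driven by one trait.
rng = np.random.default_rng(0)
true_score = rng.normal(size=200)
X = np.clip(np.round(3 + true_score[:, None] + rng.normal(scale=1.0, size=(200, 8))), 1, 5)

print("alpha:", round(cronbach_alpha(X), 3))
print("corrected item-total r:", np.round(corrected_item_total(X), 2))
# In a reduction effort, items with the lowest corrected item-total correlations
# are the usual candidates for removal -- subject to the content cautions that follow.
```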
Scale Reduction II. Despite their prevalence, scale reduction techniques that maximize internal consistency have important limitations. Choosing items to maximize internal consistency leads to item sets that are highly redundant in appearance, narrow in content, and potentially low in validity. High internal consistency often signifies a failure to adequately sample content from all parts of the construct domain: to obtain high values of coefficient alpha, a scale developer need only write a set of items that paraphrase each other or are antonyms of one another. One can expect an equivalent result (i.e., high redundancy) from the analogous approach in scale reduction, that is, excluding all items but those highly similar in content.
Scale Reduction III. IRT provides an alternative strategy for scale reduction that does not focus on maximizing internal consistency: retain items that are highly discriminating (i.e., with moderate to large values of a), and include items with a range of item thresholds (i.e., b) that adequately covers the expected range of the trait in the measured individuals (see the sketch below). IRT analysis for scale reduction can be complex and does not provide a definitive answer to the question of which items to retain; rather, it provides evidence about which items might work well together to cover the trait range. Relating items to external criteria provides a viable alternative to internal consistency and other internal qualities. Because correlations vary across samples, instruments, and administration contexts, an item that predicts an external criterion best in one sample may not do so in another, and choosing items to maximize a relation with an external criterion runs the risk of decreased discriminant validity between the measures of the two constructs.
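As a rough illustration of that selection logic, the sketch below computes the 2PL item information function, I(theta) = a^2 * P(theta) * (1 - P(theta)), for a handful of invented item parameters; the a and b values are hypothetical, and in practice they would come from fitting an IRT model to real response data.

```python
# Sketch: inspect 2PL item information to favor discriminating items whose
# thresholds spread across the trait range.
import numpy as np

def item_information(theta, a, b):
    """2PL information for one item across a grid of theta values."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a ** 2 * p * (1 - p)

theta_grid = np.linspace(-3, 3, 121)

# Hypothetical item parameters: (discrimination a, threshold b).
items = {
    "item_1": (1.8, -1.5),
    "item_2": (0.6,  0.0),   # weak discriminator -- a likely cut
    "item_3": (1.6,  0.0),
    "item_4": (1.7,  1.4),
    "item_5": (1.5, -0.2),
}

# Rank by discrimination, and note where each item's information peaks;
# retained items should jointly cover the expected theta range.
for name, (a, b) in sorted(items.items(), key=lambda kv: -kv[1][0]):
    info = item_information(theta_grid, a, b)
    peak_theta = theta_grid[info.argmax()]
    print(f"{name}: a={a:.1f}, b={b:+.1f}, peak info={info.max():.2f} at theta={peak_theta:+.1f}")
```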
Scale Reduction IV. The overarching goal of any scale reduction project should be to closely replicate the pattern of relations established within the construct's nomological network. In evaluating any given item's relations with external criteria, one should seek moderate correlations with a variety of related scales (i.e., convergent validity) and low correlations with a variety of unrelated measures. Researchers may also need to examine criteria beyond statistical relations to determine which items should remain in an abbreviated scale: clarity of expression, relevance to a particular respondent population, the semantic redundancy of an item's content with other items, the perceived invasiveness of an item, and an item's "face" validity. Items lacking apparent relevance, or that are highly redundant with other items on the scale, may be viewed negatively by respondents. To the extent that judgmental qualities can be used to select items with face validity, both the reactions of constituencies and the motivation of respondents may be enhanced. A simple strategy for retention that does not require IRT analysis is stepwise regression: items are rank-ordered for inclusion in an "optimal" reduced-length scale that accounts for a nearly maximal proportion of variance in its own full-length summated scale score, with order of entry into the stepwise regression serving as a rank-order proxy for item goodness (see the sketch below). Empirical results show that this method performs as well as a brute-force combinatorial scan of item combinations; the method can also be combined with human judgment to pick items from among the top-ranked items (but not in strict ranking order).
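The stepwise idea can be approximated with a simple greedy forward selection, sketched below on simulated data; this is an illustrative stand-in for the procedure described above, not a reproduction of any published analysis.

```python
# Greedy forward selection: repeatedly add the item that most improves prediction
# of the full-length scale total, and treat order of entry as item "goodness."
import numpy as np

def r_squared(X_sub: np.ndarray, y: np.ndarray) -> float:
    """R^2 from an OLS regression of y on the selected items (plus intercept)."""
    design = np.column_stack([np.ones(len(y)), X_sub])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1 - resid.var() / y.var()

def forward_rank(X: np.ndarray) -> list:
    """Return column indices in their order of entry."""
    y = X.sum(axis=1)                      # full-length summated scale score
    remaining, order = list(range(X.shape[1])), []
    while remaining:
        best = max(remaining, key=lambda j: r_squared(X[:, order + [j]], y))
        order.append(best)
        remaining.remove(best)
    return order

# Simulated 10-item scale with items of varying quality.
rng = np.random.default_rng(1)
true_score = rng.normal(size=300)
X = true_score[:, None] * rng.uniform(0.4, 1.2, size=10) + rng.normal(size=(300, 10))

entry_order = forward_rank(X)
print("Order of entry (best first):", entry_order)
print("Candidate 4-item short form:", sorted(entry_order[:4]))
```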
Segment 4: Pitfalls, Trade-offs, and Justifications. Quick brainstorm: What complaints have you heard from managers when you ask them whether you can survey their employees?
Evaluating Surveying Costs and Benefits. Wang and Strong's data quality framework (1996, Journal of Management Information Systems). [Table: top five and bottom five data quality concerns rated by N = 355 managers.] Inference: a survey effort is seen as more valuable to the extent that it is completed quickly and cost-effectively, and its results make sense to managers and seem unbiased.
Trade-offs with Reduced Surveys. The shorter the survey…
- the higher the response rate;
- the less work time that is lost;
- the higher the chance that one or more constructs will perform poorly if the measures are not well established or well developed;
- the less information that might be obtained about each respondent and their score on a given construct;
- the more you have to sell its meaningfulness to decision makers who will act on the results.
Potential Pitfalls of Surveys Containing Abbreviated Scales:
- Unacceptably low (or high!) internal consistency reliability.
- Loss of validity relationships.
- Difficulty or inability to compare to, or equate with, prior time periods of data collection (e.g., if the items or measures cannot be matched).
- Loss of perceived credibility ("spending so little time on a test… it can't be very good").
Justifications for Reduced Surveys:
- Reduced administration time saves money.
- Reduced administration time reduces employee frustration, increases response rates, and fosters goodwill.
- When the reduction process is careful and systematic, the validity and usability of results are preserved for many applications.
[Figure: lost-productivity cost as a function of administration time and pay level.] For brief surveys, lost productivity is nugatory, even for highly paid employees; as administration time goes up, the lost-time cost becomes excessive for highly paid employees. The arithmetic behind the curve is sketched below.
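The cost side of that curve is simple arithmetic: number of respondents times minutes of administration times hourly labor cost. The figures in the sketch below are invented purely to show the calculation.

```python
# Back-of-the-envelope lost-work-time cost; all figures are hypothetical.
def lost_time_cost(n_respondents: int, minutes: float, hourly_rate: float) -> float:
    """Total cost of respondent time spent on the survey."""
    return n_respondents * (minutes / 60.0) * hourly_rate

for minutes in (5, 20, 60):
    cost = lost_time_cost(n_respondents=1000, minutes=minutes, hourly_rate=75.0)
    print(f"{minutes:2d}-minute survey, 1,000 respondents at $75/hr: ${cost:,.0f}")
# 5 minutes -> $6,250; 20 minutes -> $25,000; 60 minutes -> $75,000.
```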
[Figure: IRT test information function, 18-item scale vs. 6-item scale.]
Caveats on Interpretation. Validity relations do not tell the whole story. A validity coefficient will decline to the extent that there is extensive reordering of score levels between the predictor and the criterion; but when comparing individual scores to a cut score or other standard, a short form can create localized mixing of scores that is not reflected in diminished validity (see the simulation sketch below).
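A small simulation (all values invented) can make the cut-score point concrete: the short form stays highly correlated with the long form, yet a noticeable share of respondents near the cut land on different sides of it.

```python
# Simulation sketch: high long/short correlation can coexist with reclassification
# around a cut score. Precision values are invented for illustration.
import numpy as np

rng = np.random.default_rng(7)
n = 5000
true_score = rng.normal(size=n)

long_form = true_score + rng.normal(scale=0.30, size=n)    # long-form-like precision
short_form = true_score + rng.normal(scale=0.60, size=n)   # short-form-like precision

# Hypothetical "top 20%" decision rule applied to each form.
pass_long = long_form >= np.quantile(long_form, 0.80)
pass_short = short_form >= np.quantile(short_form, 0.80)

print("long/short correlation:", round(np.corrcoef(long_form, short_form)[0, 1], 2))
print("classification agreement at the cut:", round((pass_long == pass_short).mean(), 2))
# The correlation stays high, but respondents near the cut score can flip sides
# under the short form -- the "localized mixing" described above.
```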
Justification to Stakeholders: Do's and Don'ts.
- Do document the savings in administration time and the corresponding reduction in lost work time.
- Do assess appropriate reliability in short-form scales (e.g., alpha).
- Do check response rate, abandoned forms, missing data levels, and intentional mal-response; compare with previous administration cycles.
- Do compare correlations between short-form and long-form scales.
- Do show off the elegance of the design of your reduced survey.
- Don't facilitate year-over-year comparisons on scale scores or percentiles unless an equating study has been conducted.
- Don't allow decisions about individual respondents to be made using short-form scales without studies that re-assess cut score levels and related concerns.
- Don't assume that an inversion in the relative position of two specific respondents (or two departments) over time reflects a reliable change.
Bibliography
Binning, J. F., & Barrett, G. V. (1989). Validity of personnel decisions: A conceptual analysis of the inferential and evidential bases. Journal of Applied Psychology, 74, 478-494.
Catrambone, R. (1990). Specific versus general procedures in instructions. Human-Computer Interaction, 5, 49-93.
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2008). Internet, mail, and mixed-mode surveys: The tailored design method. Hoboken, NJ: Wiley.
Donnellan, M. B., Oswald, F. L., Baird, B. M., & Lucas, R. E. (2006). The Mini-IPIP scales: Tiny-yet-effective measures of the Big Five factors of personality. Psychological Assessment, 18, 192-203.
Emons, W. H. M., Sijtsma, K., & Meijer, R. R. (2007). On the consistency of classification using short scales. Psychological Methods, 12, 105-12.
Girard, T. A., & Christiansen, B. K. (2008). Clarifying problems and offering solutions for correlated error when assessing the validity of selected-subtest short forms. Psychological Assessment, 20, 76-8.
Hinkin, T. R. (1995). A review of scale development practices in the study of organizations. Journal of Management, 21, 967-988.
Levy, P. (1968). Short-form tests: A methodological review. Psychological Bulletin, 6, 410-416.
Loo, R. (2002). A caveat on using single-item versus multiple-item scales. Journal of Managerial Psychology, 17, 68-75.
Lord, F. M. (1965). A strong true-score theory, with applications. Psychometrika, 30, 239-270.
Nagy, M. S. (2002). Using a single item approach to measure facet job satisfaction. Journal of Occupational and Organizational Psychology, 75, 77-86.
Novick, D. G., & Ward, K. (2006). Why don't people read the manual? Paper presented at SIGDOC '06, the 24th Annual ACM International Conference on Design of Communication.
Oshagbemi, T. (1999). Overall job satisfaction: How good are single versus multiple-item measures? Journal of Managerial Psychology, 14, 388-403.
Patrician, P. A. (2004). Single-item graphic representational scales. Nursing Research, 53, 347-352.
Shamir, B., & Kark, R. (2004). A single item graphic scale for the measurement of organizational identification. Journal of Occupational and Organizational Psychology, 77, 115-123.
Bibliography (continued)
Smith, G. T., McCarthy, D. M., & Anderson, K. G. (2000). On the sins of short form development. Psychological Assessment, 12, 102-111.
Spandorfer, J. M., Karras, D. J., Hughes, L. A., & Caputo, C. (1995). Comprehension of discharge instructions by patients in an urban emergency department. Annals of Emergency Medicine, 25, 71-74.
Stanton, J. M., Sinar, E., Balzer, W. K., & Smith, P. C. (2002). Issues and strategies for reducing the length of self-report scales. Personnel Psychology, 55, 167-194.
Wanous, J. P., & Hudy, M. J. (2001). Single-item reliability: A replication and extension. Organizational Research Methods, 4, 361-375.
Widaman, K. F., Little, T. D., Preacher, K. J., & Sawalani, G. M. (2011). On creating and using short forms of scales in secondary research. In K. H. Trzesniewski, M. B. Donnellan, & R. E. Lucas (Eds.), Secondary data analysis: An introduction for psychologists (pp. 39-61). Washington, DC: American Psychological Association.
About the Presenter: Jeff Stanton, PhD. Jeffrey Stanton is Associate Vice President for Research at Syracuse University. Dr. Stanton's research focuses on organizational behavior and technology. He is the author of more than 40 peer-reviewed journal articles as well as two books, The Visible Employee: Using Workplace Monitoring and Surveillance to Protect Information Assets – Without Compromising Employee Privacy or Trust and Information Nation: Educating the Next Generation of Information Professionals. Stanton's methodological expertise is in psychometrics, including the measurement of job satisfaction and job stress, as well as research on creating abridged versions of scales; he is on the editorial board of Organizational Research Methods and is an associate editor at Human Resource Management. Dr. Stanton's research has been supported through 15 grants and supplements, including the National Science Foundation's CAREER award. He received his Ph.D. in Industrial and Organizational Psychology from the University of Connecticut in 1997. Contact information: Jeffrey M. Stanton, PhD, Syracuse University, School of Information Studies, 316 Hinds Hall, Syracuse, NY 13244. Voice: (315) 443-2979. Email: [email protected]. Web: http://ischool.syr.edu/facstaff/member.aspx?id=223