The U.S. Consumer Expenditure Interview Survey asks many filter questions to identify the items that households purchase. Each reported purchase triggers follow-up questions about the amount spent and other details. We test the hypothesis that respondents learn how the questionnaire is structured and underreport purchases in later waves to reduce the length of the interview. We analyze data from 10,416 four-wave respondents over two years of data collection. We find no evidence of decreasing data quality over time; instead, panel respondents tend to give higher quality responses in later waves. The results also hold for a larger set of two-wave respondents.
Although counts of the novel Coronavirus (SARS-CoV-2) infections and deaths are reported by several sources online, precise estimation of the exposed proportion of the population is not possible in most areas of the world. Estimates of the prevalence of other diseases in the United States are often obtained through in-person seroprevalence surveys. The availability of testing only for individuals with symptoms, combined with stay-at-home and social distancing mandates to stem the spread of the disease, limits in-person data collection options. A probability-based mail survey with at-home, self-administered testing is a feasible method to safely estimate SARS-CoV-2 antibody prevalence within the United States while also easing burden on the U.S. public and health care system. This mail survey could be a one-time, cross-sectional design, or a repeated cross-sectional or longitudinal survey. We discuss several options for designing and conducting this survey.
Several studies have shown that high response rates are not associated with low bias in survey data. This paper shows that, for face-to-face surveys, the relationship between response rates and bias is moderated by the type of sampling method used. Using data from Rounds 1 through 7 of the European Social Survey, we develop two measures of selection bias, then build models to explore how sampling method, response rate, and their interaction affect selection bias. When interviewers are involved in selecting the sample of households or respondents for the survey, high reported response rates can in fact be a sign of poor data quality. We speculate that the positive association detected between response rates and selection bias arises from interviewers’ incentives to select households and respondents who are likely to complete the survey.
Panel survey participation can bring about unintended changes in respondents’ behaviour and/or their reporting of behaviour. Using administrative data linked to a large panel survey, we analyse whether the survey brings about changes in respondents’ labour market behaviour. We estimate the causal effect of panel participation on the take‐up of federal labour market programmes by using instrumental variables. Results show that panel survey participation leads to an increase in respondents’ take‐up of these measures. These results suggest that panel survey participation not only affects the reporting of behaviour, as previous studies have demonstrated, but can also alter respondents’ actual behaviour.
Administrative data are increasingly important in statistics, but, like other types of data, may contain measurement errors. To prevent such errors from invalidating analyses of scientific interest, it is essential to estimate the extent of measurement errors in administrative data. Currently, however, most approaches to evaluate such errors involve either prohibitively expensive audits or comparison with a survey that is assumed perfect. We introduce the “generalized multitrait-multimethod” (GMTMM) model, which can be seen as a general framework for evaluating the quality of administrative and survey data simultaneously. This framework allows both survey and administrative data to contain random and systematic measurement errors. Moreover, it accommodates common features of administrative data such as discreteness, nonlinearity, and nonnormality, improving on similar existing models. The use of the GMTMM model is demonstrated by application to linked survey-administrative data from the German Federal Employment Agency on income from employment, and a simulation study evaluates the estimates obtained and their robustness to model misspecification. Supplementary materials for this article are available online.
The LISS online panel has made extra efforts to recruit and retain households that were not regular users of the internet into the study. Households were provided with computers and/or internet when necessary. Including these cases made the panel more representative of the Dutch population, by bringing in respondents who were more likely to be older, to live in single-person homes and to have migration backgrounds. This paper replicates five published papers which used LISS data and explores how the conclusions in these papers would have been different had the LISS panel not included the non-internet households. There are strong demographic differences between the internet and non-internet households, and estimates of means would in many cases be biased if these households had not been included. However, across the five replicated studies, few of the published model estimates are substantively affected by the inclusion of these households in the LISS sample.