An Evaluation of the 2016 Election Polls in the United States
Public Opinion Quarterly (IF 4.616). Pub Date: 2018-01-01. DOI: 10.1093/poq/nfx047
Courtney Kennedy , Mark Blumenthal , Scott Clement , Joshua D Clinton , Claire Durand , Charles Franklin , Kyley McGeeney , Lee Miringoff , Kristen Olson , Douglas Rivers , Lydia Saad , G Evans Witt , Christopher Wlezien

The 2016 presidential election was a jarring event for polling in the United States. Preelection polls fueled high-profile predictions that Hillary Clinton's likelihood of winning the presidency was about 90 percent, with estimates ranging from 71 to over 99 percent. When Donald Trump was declared the winner of the presidency, there was a widespread perception that the polls failed. But did the polls fail? And if so, why? Those are among the central questions addressed by an American Association for Public Opinion Research (AAPOR) ad hoc committee. This paper presents the committee's analysis of the performance of preelection polls in 2016, how that performance compares to polling in prior elections, and the extent to which performance varied by poll design. In addition, the committee examined several theories as to why many polls, particularly in the Upper Midwest, underestimated support for Trump. The explanations for which the most evidence exists are a late swing in vote preference toward Trump and a pervasive failure to adjust for overrepresentation of college graduates (who favored Clinton). In addition, there is clear evidence that voter turnout changed from 2012 to 2016 in ways that favored Trump, though there is only mixed evidence that misspecified likely voter models were a major cause of the systematic polling error. Finally, there is little evidence that socially desirable (Shy Trump) responding was an important contributor to poll error.

Donald Trump's victory in the 2016 presidential election came as a shock to pollsters, political analysts, reporters, and pundits, including those inside Trump's own campaign (Jacobs and House 2016). Leading up to the election, three types of information widely discussed in the news media indicated that Democratic nominee Hillary Clinton was likely to win.
First, polling data showed Clinton consistently leading the national popular vote, which is usually predictive of the winner (Erikson and Wlezien 2012), and leading, if narrowly, in Pennsylvania, Michigan, and Wisconsin—states that had voted Democratic for president six elections running. Second, early voting patterns in key states, particularly in Florida and North Carolina, were described in high-profile news stories as favorable for Clinton (Silver 2017a). Third, election forecasts from highly trained academics and data journalists declared that Clinton's probability of winning was about 90 percent, with estimates ranging from 71 to over 99 percent (Katz 2016).

The day after the election, there was a palpable mix of surprise and outrage directed toward the polling community, as many felt that the industry had seriously misled the country about who would win (e.g., Byers 2016; Cillizza 2016; Easley 2016; Shepard 2016). The unexpected US outcome added to concerns about polling raised by errors in the 2014 referendum on Scottish independence, the 2015 UK general election, and the 2016 British referendum on European Union membership (Barnes 2016).

In the weeks after the 2016 US election, states certified their vote totals and researchers began assessing what happened with the polls. It became clear that a confluence of factors made the collective polling miss seem worse than it actually was, at least in some respects. The winner of the popular vote (Clinton) was different than the winner of the Electoral College (Trump). While such a divided result is not without precedent, the full arc of US history suggests it is highly unlikely. With respect to polling, preelection estimates pointed to an Electoral College contest that was less certain than interpretations in the news media suggested (Trende 2016; Silver 2017b).
Eight states with more than a third of the electoral votes needed to win the presidency had polls showing a lead of three points or less (Trende 2016). Trende noted that his organization's battleground-state poll averages had Clinton leading by a very slim margin in the Electoral College (272 to 266), putting Trump one state away from winning the election. Relatedly, the elections in the three Upper Midwest states that broke unexpectedly for Trump (Pennsylvania, Michigan, and Wisconsin) were extremely close. More than 13.8 million people voted for president in those states, and Trump's combined margin of victory was 77,744 votes (0.56 percent). Even the most rigorously designed polls cannot reliably indicate the winner in contests with such razor-thin margins.

Even with these caveats about the election, a number of important questions surrounding polling remained. There was a systematic underestimation of support for Trump in state-level and, to a lesser extent, national polls. The causes of that pattern were not clear but potentially important for avoiding bias in future polls. Also, different types of polls (e.g., online versus live telephone) seemed to be producing somewhat different estimates. This raised questions about whether some types of polls were more accurate and why. More broadly, how did the performance of 2016 preelection polls compare to those of prior elections? These questions became the central foci for an ad hoc committee commissioned by the American Association for Public Opinion Research (AAPOR) in the spring of 2016. The committee was tasked with summarizing the accuracy of 2016 preelection polling, reviewing variation by different poll methodologies, and assessing performance through a historical lens. After the election, the committee decided to also investigate why polls, particularly in the Upper Midwest, underestimated support for Trump.
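As a quick check on the arithmetic above, the reported 0.56 percent can be reproduced from the two figures given in the text (the 13.8 million total is the text's approximate value):

```python
# Reproduce the reported combined margin in Pennsylvania, Michigan, and
# Wisconsin from the approximate figures given in the text above.
total_votes = 13_800_000   # "more than 13.8 million" presidential votes cast
trump_margin = 77_744      # Trump's combined margin of victory, in votes

margin_pct = 100 * trump_margin / total_votes
print(f"Combined margin: {margin_pct:.2f}%")
```

A margin this small is well inside the sampling error of any individual state poll, which is the point of the sentence above: no poll design could reliably call these contests.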
The next section presents several of the main theories for why many polls underestimated Trump's support. This is followed by a discussion of the data and key metrics the committee used to perform its analyses. Subsequent sections of the paper present analyses motivated by the research questions posed here. The paper concludes with a discussion of the main findings and implications for the field.

Theories about Why Polls Underestimated Support for Trump

A number of theories were put forward as to why many polls missed in 2016.[1]

Nonresponse Bias and Deficient Weighting

Most preelection polls have single-digit response rates or feature an opt-in sample for which a response rate cannot be computed (Callegaro and DiSogra 2008; AAPOR 2016). While the link between low response rates and bias is not particularly strong (e.g., Merkle and Edelman 2002; Groves and Peytcheva 2008; Pew Research Center 2012, 2017a), such low rates do carry an increased risk of bias (e.g., Burden 2000). Of particular note, adults with weaker partisan strength (e.g., Keeter et al. 2006), lower educational levels (Battaglia, Frankel, and Link 2008; Chang and Krosnick 2009; Link et al. 2008; Pew Research Center 2012, 2017a), and anti-government views (U.S. Census Bureau 2015) are less likely to take part in surveys. Given the anti-elite themes of the Trump campaign, Trump voters may have been less likely than other voters to accept survey requests. If survey response was correlated with presidential vote and some factor not accounted for in the weighting, then a deficient weighting protocol could be one explanation for the polling errors.

[1] The original committee report (AAPOR 2017) also discussed ballot-order effects. That discussion has been dropped in this paper because there was not strong evidence that such effects were a major contributor to polling errors in 2016. There remains an important debate about the possibility that ballot order affected the outcome of the presidential race in several states, including Michigan, Wisconsin, and Florida.
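The weighting mechanism discussed above can be illustrated with a minimal post-stratification sketch. All numbers below are hypothetical, chosen only to show the direction of the effect when a sample overrepresents college graduates and weighting adjusts cell shares back to population benchmarks:

```python
# Minimal post-stratification sketch on a single variable (education).
# All shares and support rates are hypothetical illustrations, not data
# from the paper.

sample = {
    # cell: (share_of_sample, trump_support_in_cell)
    "college":    (0.55, 0.38),   # overrepresented relative to population
    "no_college": (0.45, 0.52),
}
# Hypothetical population benchmarks for the same cells.
population_shares = {"college": 0.35, "no_college": 0.65}

def estimate(shares):
    """Trump support as a mean over cells, weighted by the given shares."""
    return sum(shares[cell] * sample[cell][1] for cell in sample)

unweighted = estimate({cell: s for cell, (s, _) in sample.items()})
weighted = estimate(population_shares)

print(f"unweighted estimate: {unweighted:.3f}")
print(f"education-weighted:  {weighted:.3f}")
```

With these illustrative numbers the education adjustment raises the Trump estimate from 44.3 to 47.1 percent; a poll that skipped the adjustment would understate his support, which is the direction of error the committee documents.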
