Introduction

As of May 11, 2020, the United States accounted for about 30% of all global COVID-19 deaths (World Health Organization, 2020) and the pandemic was deepening the political division of the nation. Medical experts pleaded for social distancing while protesters armed with assault rifles and brandishing Confederate flags and swastikas stormed a state capitol to demand the lifting of restrictions (Burnett, 2020). Some allegedly went as far as to plan to take over the state capitol and kidnap the governor over COVID restrictions (Jones, 2020). While governors removed restrictions over the summer, these social divides have been exacerbated as people argue over the need for non-pharmaceutical interventions such as masks and social distancing. The United States is in turmoil, medically, economically, and socially. Understanding what drives the spread of COVID-19 and whether these drivers are related to this social turmoil requires identifying the statistical drivers in the general population, and not just the risk factors of its individual victims. This research presents statistical models that include health, demographic, and economic variables to identify these drivers, and demonstrates that there have been two starkly contrasting experiences with the pandemic, one centered on the New York metropolitan area and the other for the rest of the country.

Insights from social learning theory (Flinn, 1997; McElreath et al., 2010) describe how evolved cognitive mechanisms likely work to guide somatic efforts to adapt to the local effects of the pandemic and its associated social disruption. Models of the evolution of cooperation inform our understanding of the mechanisms that enhance cooperation such as costly punishment (Alvard, 2003; Boyd and Mathew, 2007), contingent cooperation (Gurven, 2006), external forcing mechanisms (Huang et al., 2018), and memory and imitation (Liu et al., 2019), providing a deeper understanding of how variable experiences with the pandemic lead people to cooperate with or resist non-pharmaceutical interventions. Given the deep evolutionary roots of these cognitive tendencies, changing behavior to combat the pandemic will need to go far beyond simply making information available to the public and address the realities of how the pandemic is experienced differently, and how people process information about their environments.

A model of the COVID-19 pandemic

The unit of analysis of this study is the county. Because USA FactsFootnote 1 began its data collection at this level, data on COVID-19 deaths were drawn from this source. Data on possible covariates were obtained from public domain U.S. government sources and aggregated for 3143 counties across the United States.

Population density, urban status (U.S. government Rural Urban Continuum Codes),Footnote 2 and a history of foreign travelFootnote 3 were considered because of the simple dynamics of disease spread; transmission of the virus requires introduction and vectors, and is enhanced by the frequency of encounters. Median incomeFootnote 4, per capita ICU bedsFootnote 5, and percent uninsuredFootnote 6 were introduced to capture the potential effects of socio-economic factors such as poverty and poor health care systems. Median male age and COVID-19 comorbidity (percent smokers, obesity, diabetes, extreme alcohol use)Footnote 7 were introduced to capture physiological risk factors (CDC COVID-19 Response Team, 2020). Another potential risk factor is pollution, measured by the amount of particulate matter for each county and drawn from the same source as the comorbidity data. The percent non-Hispanic Black populationFootnote 8 was introduced as it became increasingly clear that the disease was having a disproportionate impact on African Americans in the U.S. and similar minorities in the United Kingdom (Millett et al., 2020; White and Nafilyan, 2020). Finally, the percent of the labor force in occupations affected by government shutdowns (Dey and Loewenstein, 2020) was derived from Bureau of Labor Statistics dataFootnote 9 to account for economic effects on the population.

Some of the model’s key variables were highly correlated, potentially confounding their effects on death rates. For instance, median income had a strong negative correlation with urbanism (r = −0.521) and comorbidity (r = −0.595), the percent uninsured was negatively correlated with income (r = −0.330) and urbanism (r = −0.223), and the percent non-Hispanic Black population was correlated with pollution (r = 0.296), comorbidity (r = 0.391), and median income (r = −0.248). Factor analysis confirmed that the percent non-Hispanic Black population was largely a proxy for these other risk factors and stepwise regressions routinely excluded this variable; consequently it was dropped from the model. Furthermore, initial modeling revealed that 80% of the explained variance in the model was driven by the region around New York City, indicating that separate models should be run on that area versus the rest of the country. Therefore, the dataset was partitioned between the 31 counties that comprise the U.S. Census New York Combined Statistical Area (NYCSA), and the remaining 3112 counties of the U.S.
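The pairwise screening described above can be illustrated with a minimal sketch. It computes Pearson's r for every pair of covariates and flags pairs whose absolute correlation exceeds a cutoff. The function names, the 0.2 cutoff, and the toy values are assumptions introduced here for illustration; they are not the study data or the author's code.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

def correlated_pairs(columns, threshold=0.2):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `threshold` is an arbitrary illustrative cutoff, not one used in the study.
    """
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged

# Invented toy values for five hypothetical counties (NOT the study data).
toy = {
    "median_income": [42.0, 55.0, 61.0, 48.0, 70.0],
    "comorbidity": [0.9, 0.5, 0.2, 0.7, 0.1],
    "pct_uninsured": [14.0, 9.0, 11.0, 12.0, 6.0],
}
for pair in correlated_pairs(toy):
    print(pair)
```

Pairs flagged this way are candidates for the remedies the paper goes on to use, namely dropping a proxy variable, stepwise selection, or regression on orthogonal factor scores.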

The relationship between the per capita COVID death rate and the independent variables described above was captured in a linear model in which the effect of each independent variable, i, is measured by a coefficient, βi, and e is the residual term.

$$\begin{array}{l}{\mathrm{Per}}\,{\mathrm{capita}}\,{\mathrm{COVID}}\,{\mathrm{Deaths}}\\ = \beta _{{\mathrm{pd}}}{\mathrm{Population}}\,{\mathrm{Density}} + \,\beta _{{\mathrm{us}}}{\mathrm{Urban}}\,{\mathrm{Status}} + \,\beta _{{\mathrm{p}}}{\mathrm{Pollution}}\\ + \,\beta _{{\mathrm{icu}}}{\mathrm{Per}}\,{\mathrm{capita}}\,{\mathrm{ICU}}\,{\mathrm{Beds}} + \,\beta _{{\mathrm{mma}}}{\mathrm{Median}}\,{\mathrm{Male}}\,{\mathrm{Age}}\\ + \,\beta _{{\mathrm{cm}}}{\mathrm{COVID}}\,{\mathrm{Comorbidity}} + \,\beta _{{\mathrm{inc}}}{\mathrm{Median}}\,{\mathrm{Income}}\\ + \,\beta _{{\mathrm{ins}}}{\mathrm{Percent}}\,{\mathrm{Uninsured}} + \,\beta _{{\mathrm{for}}}{\mathrm{Foreign}}\,{\mathrm{Travel}}\\ + \,\beta _{{\mathrm{ao}}}{\mathrm{Percent}}\,{\mathrm{Occupations}}\,{\mathrm{Affected}} + e\end{array}$$

Stepwise regression on the raw data and factor scores from principal components analysis are used to counteract the multicollinearity between independent variables, and all coefficients are standardized to allow for comparison of their relative strength of influence. Stepwise regression eliminates independent variables that do not contribute substantially to explaining the variation in the model; therefore, the results presented in Table 1 contain only the coefficients retained by the procedure. Adjusted R-square values that correct for the number of variables retained in the model are presented to represent how well the models fit the data in comparison to one another.
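The following minimal sketch shows one way such a procedure can work. It z-scores every variable so that the fitted coefficients are standardized, and it adds predictors one at a time for as long as adjusted R-square improves. The paper does not state its entry and removal criteria, so the adjusted R-square stopping rule is an assumption made here, as are all function names and the toy data; statistical packages typically use significance thresholds instead.

```python
import math

def zscore(values):
    """Standardize a sequence to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit(z, y, names):
    """OLS on z-scored data; returns (standardized betas, adjusted R^2).

    No intercept is needed because every z-scored variable has mean 0.
    """
    n, p = len(y), len(names)
    xtx = [[sum(u * v for u, v in zip(z[a], z[b])) for b in names] for a in names]
    xty = [sum(u * v for u, v in zip(z[a], y)) for a in names]
    betas = solve(xtx, xty)
    fitted = [sum(beta * z[name][i] for beta, name in zip(betas, names))
              for i in range(n)]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum(yi ** 2 for yi in y)  # y is z-scored, so its mean is 0
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return dict(zip(names, betas)), adj_r2

def forward_stepwise(predictors, outcome):
    """Add predictors one at a time while adjusted R^2 keeps improving.

    Assumed stopping rule for illustration; requires more rows than
    predictors + 1 and no perfectly collinear predictors.
    """
    z = {name: zscore(col) for name, col in predictors.items()}
    y = zscore(outcome)
    chosen, best_betas, best_adj = [], {}, 0.0
    while len(chosen) < len(z):
        trials = [(fit(z, y, chosen + [name]), name)
                  for name in z if name not in chosen]
        (betas, adj), name = max(trials, key=lambda t: t[0][1])
        if adj <= best_adj:
            break
        chosen, best_betas, best_adj = chosen + [name], betas, adj
    return best_betas, best_adj

# Invented toy data for six hypothetical counties (NOT the study data).
toy_predictors = {
    "foreign_travel": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "median_male_age": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
toy_deaths = [3.0, 5.1, 6.8, 9.2, 11.0, 12.9]
betas, adj_r2 = forward_stepwise(toy_predictors, toy_deaths)
print(betas, round(adj_r2, 3))
```

Because adjusted R-square penalizes each added variable through the factor (n − 1)/(n − p − 1), a predictor is retained only if it explains more variance than its cost in degrees of freedom, which is why the retained set can be smaller than the full list of covariates.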

Table 1 Results of per capita COVID-19 deaths raw data models.

The NYCSA models generally explained about 80% of the variance in per capita deaths. The only consistent relationships are a positive association with a history of foreign travel and negative associations with population density and median male age. The negative association with population density was unexpected, although it is restricted to this sample. The negative association with median male age reflects counties with younger populations experiencing more per capita deaths, not a greater risk of death among the young. This finding may be particularly important as it suggests that young people are key vectors in the transmission of the virus to older people. The most consistent and strongest association is with a history of foreign travel, which reinforces the simple logistics of transmission. As of October 8, 2020,Footnote 10 these results remain virtually unchanged.

For the rest of the United States, only two variables exhibited a consistent relationship with COVID-19 deaths, urbanism and a history of foreign travel; urban counties that historically received a lot of foreign travel experience more deaths. Although these relationships are highly statistically significant, the explained variance of the models is much lower than for the NYCSA counties, indicating that the rest of the nation’s experience has been much more variable and the covariate signals are much weaker. As of October 8, 2020, after the pandemic had time to spread throughout the nation, more variables show a statistically significant association with per capita deaths. These include a negative association with median income and median male age, and a positive association with comorbidity and pollution. In other words, poor counties with disproportionately young populations and high levels of comorbidity and pollution experience more deaths per capita. However, the variance explained remains low (R² = 0.166), indicating that the experience with the pandemic outside of the NYCSA remains highly variable and inconsistent.

An alternative means of correcting for multicollinearity is to use scores from orthogonal components of a factor analysis. Principal components analysis using a varimax rotation yielded the following factors (Table 2). These factors, however, require interpretation.

Table 2 Principal components analysis of COVID-19 death model independent variables.

Component 1 (Comorbidity) loads high on comorbidity and low income, with a weak association with rural status. Component 2 (Density Travel) loads very highly on population density and a history of foreign travel, to the point that the two variables are practically one and the same. The five counties that comprise New York City are a prime example. Component 3 (Young Urban) loads highly on young male age, with weaker associations with urban status and pollution. Finally, Component 4 (Insured) reflects people who have health insurance. Table 3 presents the primary results of linear regressions on these factors.
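Written in the same form as the raw-data model, a regression on these four components can be sketched as follows. The symbols F for the component scores and the coefficient subscripts are labels introduced here for illustration and are not the author's notation; the sketch assumes that each county's score on each rotated component enters as a separate regressor.

$$\begin{array}{l}{\mathrm{Per}}\,{\mathrm{capita}}\,{\mathrm{COVID}}\,{\mathrm{Deaths}}\\ = \beta _{{\mathrm{c}}}F_{{\mathrm{Comorbidity}}} + \,\beta _{{\mathrm{dt}}}F_{{\mathrm{Density}}\,{\mathrm{Travel}}}\\ + \,\beta _{{\mathrm{yu}}}F_{{\mathrm{Young}}\,{\mathrm{Urban}}} + \,\beta _{{\mathrm{i}}}F_{{\mathrm{Insured}}} + e\end{array}$$

Because an orthogonal rotation such as varimax leaves the component scores uncorrelated with one another, each coefficient in a model of this form can be interpreted without the multicollinearity that affects the raw variables.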

Table 3 Results of COVID-19 death factor models.

As of May 8, 2020, the only factor that was statistically associated with per capita COVID-19 deaths in the NYCSA was density/travel, which once again reinforces the fundamental role the logistics of viral transmission has played in the pandemic. While lower than in the raw data models, the variance explained is still quite strong. Since May 14, 2020, one other factor, insured, has been statistically significant: high rates of health insurance are associated with lower per capita death rates from COVID.

The variance explained in the non-NYCSA factor score models is typically half to a third as strong as in the NYCSA models, indicating a weaker association with the factors. As of May 8, 2020, density/travel is the strongest covariate with per capita COVID-19 deaths, followed by the urban status of a county. Once again, these factors reinforce the logistics of transmission. Before May 8, comorbidity and insurance are statistically significantly associated with deaths but with the opposite of the expected sign; these relationships flip over time such that by mid-summer comorbidity is positively and being insured is negatively associated with per capita deaths, as expected. As of October 8, 2020, their coefficients are 0.264 and −0.144, respectively.

The results of all of these models indicate that the COVID-19 pandemic has been driven by logistical factors across the U.S. (history of foreign travel, urbanism, population density), but experienced to differing degrees and with differing levels of uncertainty depending on location. Table 4 lists the average per capita death rate by county category.Footnote 11

Table 4 Averaged COVID-19 death rates by county category.

As of May 8, 2020, the NYCSA COVID-19 death rate was over 15 times higher than that of the rest of the country. Over forty-two percent of the counties, a strong plurality, are considered extremely rural and their death rates were over 20 times lower than those experienced in the NYCSA epicenter of the pandemic (Table 4). By October 8, 2020, these gaps had narrowed, but people outside the NYCSA were generally still 4 times less likely to die from COVID-19 than those within it. Depending on where one lives, the experience of the pandemic is very different and the local signals that there is a pandemic at all range from obvious to extremely weak. These differences, moderated by evolved mechanisms for social learning, have predictably led to differing perspectives on and behavior towards the pandemic.

An evolutionary explanation

Simply blaming the backlash against COVID restrictions on fake news (Shiloh Vidon, 2020), social media (de la Garza, 2020), and structural racism (Barber and Barber, 2020) does not address the cognitive mechanisms by which information is influencing behavior. Social and economic factors certainly play a role in individual cases of death (e.g., lack of insurance), but people are also responding to signals in their environment as external forcing mechanisms, and looking to social sources of information to make decisions and weigh trade-offs (see Huang et al., 2018; Kuznar, 2007; Kuznar and Frederick, 2003; Kuznar and Lutz, 2007) between cooperatively following guidelines for non-pharmaceutical interventions and sheltering at home versus working and taking care of their families. Evolutionary psychology can provide a framework for considering how people are grappling with those decisions and understanding why people choose to cooperate by following COVID restrictions in some instances and selfishly ignore or even flout them in others.

Social learning theory emphasizes how people rapidly learn by imitating group members (Flinn, 1997; Hoel et al., 2019; van Leeuwen et al., 2018; Whiten, 2017). Game theoretic models tested in agent-based simulations reinforce the underlying evolutionary logic of how imitation and punishment can enforce normative behavior (Alvard, 2003; Boyd and Mathew, 2007; Huang et al., 2018; Liu et al., 2019). Furthermore, the ability to remember past moves and who reciprocated enhances cooperative behavior (Gurven, 2006; Liu et al., 2019). While shared behaviors may not be optimal, if widely practiced they often are adaptive (i.e., good enough) and can be an extremely efficient and effective mode of learning (McElreath et al., 2010). Social learning is biased toward in-group members and is enhanced by frequent interaction and shared norms (Glowacki and Molleman, 2017; van Leeuwen et al., 2018). Furthermore, norm-enforcing tendencies are intensified under threat conditions (Roos et al., 2015), and have been observed during historic cholera epidemics (Dutta and Rao, 2015). These patterns are present across cultures and in non-human primates, indicating that these cognitive mechanisms have great evolutionary depth (Glowacki and Molleman, 2017; Hoel et al., 2019; McElreath et al., 2010; Santos and Rosati, 2015; van Leeuwen et al., 2018; Whiten, 2017).

Another cognitive ability at which humans excel is learning from little experience (Biederman, 1987), known as one-shot learning. Recent research has identified the neural mechanisms involved in human one-shot learning and has found that in situations of high uncertainty, one-shot learning mechanisms are strengthened (Lee, O’Doherty, and Shimojo, 2015).

I argue that social learning mechanisms and one-shot learning are operant in the current pandemic and are providing people with evolutionarily adaptive perspectives and associated behaviors. The COVID-19 pandemic has created two evolutionary challenges: a threat to health and life, and a threat to individual livelihoods because of the historic shut-down of the economy. Health and subsistence are two key areas of somatic effort in adaptation that have been linked to worldviews and coping strategies in humans under stress (Łukasik et al., 2018). However, as the models show, health and economic threats are not borne equally across demographics such as age, ethnic identity, and the urban/rural continuum.

These patterns add up to two different national experiences with the pandemic, one of urban, and in particular Metro New York, devastation from disease, and one of largely rural suffering because of job loss. Due to social media distortions and selective consumption of information (confirmation bias), the average U.S. resident is not receiving complete and accurate information on the virus, its transmission, and its effects on society. In other words, individuals as decision makers have relied on (1) very limited, idiosyncratic, and rare direct experiences with the virus, and (2) social learning to navigate how to adapt to this rapidly changing and uncertain environmental threat.

Given the uncertainty of the threat and the rarity of signals for most Americans, people are expected to fall back on evolved cognitive mechanisms more than ever, as opposed to cool, measured, lengthy, and thoughtful reflection on scientific facts. The fact that people are operating under threat conditions reinforces their tendencies to rely on social learning mechanisms. Furthermore, social learning mechanisms are strong within-group, and therefore one should expect that social media bubbles will have even more influence than prior to the crisis. Finally, current (as of October 2020) confirmed cases in the United States constitute only about 2.5% of the population (World Health Organization, 2020), meaning that at this point the average American has had little direct experience with the virus. People in urban environments at the heart of the pandemic are more likely to have direct experience, which should reinforce the threat the virus poses through one-shot learning. Also, because their neighbors have been sheltering and taking precautions, imitation will reinforce these behaviors (Liu et al., 2019). In contrast, the likelihood that a person in rural America even knows someone with symptoms is small, minimizing the chance that one-shot learning can occur and reinforcing social learning from peers and information sources they trust. Imitation of their peers will also reinforce their non-compliance with non-pharmaceutical interventions. Costly punishment such as violent protest and intimidation tactics will also reinforce non-compliance of their peers. The scarcity and poor quality of the information that is consumed, combined with the human tendency for social learning and imitation, mean that the mask-clad urbanites sheltering in place and the largely rural working-class protesters are both reacting in predictable, and probably adaptive, ways to the pandemic. Urban dwellers are definitely at higher risk of infection and so reasonably are avoiding contact, even at personal economic loss. People in rural working-class counties in reality have a remote chance of death yet are suffering a major blow to their somatic efforts (Łukasik et al., 2018) to put food on the table.

Many of the variables in this model covary with the political cleavages in the U.S.; support for candidate Donald Trump in the 2016 election is a proxy for these variables and per capita deaths are negatively correlated with the percent of the vote President Trump received in 2016 (May 2, 2020, r = −0.230). Furthermore, the President’s practice of blaming foreign entities for perceived injuries to America resonates well with his constituency (Suedfeld et al., 2020). The pandemic is likely to increase President Trump’s support among his base. While per capita death rates have increased across the United States, the data presented here indicate that they still remain substantially lower than in the key urban centers. Recent polls indicate that the pandemic may be eroding the President’s support as more people in rural areas experience the virus (Skelley and Thomson-DeVeaux, 2020), although whether increased experience with the pandemic lowered the President’s support enough to cost him the November 2020 election remains to be seen. The President’s base in rural America still has received a much less consistent signal of somatic threat from the pandemic, and many in their in-group networks reinforce their original views that the pandemic is of little concern.

Finally, the cavalier attitudes expressed by many young Americans, whether during Spring Break or at “COVID parties,” are also explained by this same evolutionary logic. In reality, they are at much lower risk of a serious infection, let alone death. A Chinese study on February 17, 2020 reported a 0.3% death rate for people in their 20s (Novel Coronavirus Pneumonia Emergency Response Epidemiology Team, 2020), and a report by New York City Health (2020) showed similar figures for youth; it is not surprising that many young people, regardless of race or urban/rural lifestyle, have exhibited highly risky behaviors. Their risky behavior may explain the consistently negative association between median male age and per capita deaths in this study; the young appear to be primary vectors for transmission of the virus. These findings reinforce the use of social learning heuristics that are, in the evolutionary sense, individually adaptive, if risky for society as a whole.

Implications

People are not classically economically rational decision makers; they are evolutionarily rational decision makers. West et al. (2020) identified the key information that rational decision makers should have to combat the spread of the pandemic, along with an APEASE (acceptability, practicability, effectiveness, affordability, spill-over effects, equity) framework for tailoring messages. The research presented here reinforces the need for messages to be tailored to specific audiences, delivered by credible sources, through channels that in-groups use, in a language that will resonate with the target audience, and in a manner that reinforces equity and respects the dignity of the target audience.Footnote 12 The lack of such consistent and tailored communication has probably contributed to the second spread, and to the possibility of further social division and unrest.