Introduction

Reaction time (RT), an index of processing speed or efficiency in the central nervous system (CNS)1, is an essential factor in higher cognitive function2,3 and is profoundly affected by age4. In fact, of the studied demographics, age is the main factor known to influence RT4. Processing speed is an important limiting factor for most aspects of cognition during aging, most notably memory5,6. In studies where processing speed was used as a covariate, the age-related variance in various episodic memory measures was reduced or even eliminated7,8. Moreover, studies comparing varied factors and tests of age-related episodic memory deficit implicate age-related decline in processing speed as the main mediator9,10,11. These findings collectively suggest that RT is a useful index of age-related cognitive decline, healthy brain aging, and neurodevelopment.

RT can be operationally defined as “simple,” which typically involves a non-choice reaction to a visual stimulus (svRT). RT can also be operationally defined as “complex,” which involves a reaction to one or more visual stimuli after recognition (cvrRT) of correct stimuli and inhibiting incorrect stimuli12. svRT demonstrates variability between individuals, which is akin to paired-associate learning (PAL) and is influenced by genetic and environmental factors13. In addition, svRT effects are well noted across the field of neurology; for example, AD and stroke patients show lengthened svRT and higher inter-individual variability14,15. However, due to the limitations of traditional research methods, the body of work concerning RT examined only limited ranges of demographic, health, medical, and lifestyle factors in small cohorts. For example, prior work’s demographics consisted of college-aged students, well-educated older adults16,17,18, or athletes19,20,21.

Further, with notable exceptions22,23,24, many studies had few participants (e.g., n < 1000) and were therefore powered to detect only variables with large effect sizes and were prone to spurious, non-replicable findings25. Consequently, many RT studies had minimal ability to reveal low-frequency factors or those with subtle effect sizes, or to conduct more sophisticated analyses (e.g., growth modeling rather than ANOVA) to find interactions and moderators. Collectively, this suggests that if RT performance is to inform models of disease or normative and atypical aging, we need a deeper understanding of the normal variation of RT and the genetic and environmental factors associated with RT performance.

This study aimed to characterize RT across a broad range of demographic, health, medical, and lifestyle variables commonly associated with cognitive performance and AD risk. To do this, we utilized both the MindCrowd and UK Biobank cohorts26,27, comprising over 233 thousand combined participants, to model RT as a function of 11 or more demographic, health, medical, and lifestyle factors. These factors have been previously associated with aging and cognition28,29,30,31,32,33. Based on our prior work and earlier RT research34, we hypothesized that RT, via its structure of factor association and modifications, would reveal meaningful connections to healthy brain aging.

Results

MindCrowd

As of March 13th, 2020, after filtering (see Data Quality Control in “Methods” section), MindCrowd had recruited 75,666 qualified participants (see Table 1 for Sociodemographic Characteristics and Supplementary Fig. 1A for a histogram of age). We modeled svRT as a function of Age3 and PAL Performance3 (i.e., curvilinear associations), as well as 20 other factors (see Supplementary Fig. 2 for diagnostic regression plots, and Table 2 for each analysis’ n). The omnibus model was significant (Fomnibus[58, 73406] = 858.20, pomnibus < 2.2e−16, Adjusted R2 = 0.40).

Table 1 MindCrowd, UKBb MindCrowd, and UK Biobank’s sociodemographic characteristics.
Table 2 Summary of MindCrowd’s sample sizes (n).

MindCrowd Curvilinear associations: age and paired-associate learning (PAL)

Our model revealed that all three Age polynomials were significantly associated with svRT: Age1 (i.e., linear association, first-degree polynomial, aka slope), Age2 (i.e., quadratic association, second-degree polynomial), and Age3 (i.e., cubic association, third-degree polynomial). On average, from younger to older Age, for a one-year difference (X = 1): (1) Age1 (shift in Y; pAge1 = 3.06E − 18) was associated with 7 ms longer svRT; (2) Age2 (shift in Age1; pAge2 = 3.23E − 17) was associated with 0.15 ms of added svRT length (i.e., 7 + 0.15 ms/year, Fig. 1a); and (3) Age3 (shift in Age2; pAge3 = 1.46E − 34) was associated with a negligible 1.47E − 03 ms shift in added svRT length (i.e., 7 + (0.15 + 1.47E − 03) ms/year, Fig. 1a). In contrast to Age’s association with longer svRT, for each word pair correct on PAL Performance: (1) PAL1 (pPAL1 = 3.77E − 34) was associated with 9 ms shorter svRT; (2) PAL2 (pPAL2 = 2.35E − 14) was associated with 0.32 ms of additional svRT shortening (i.e., 9 + 0.32 ms per word pair, Fig. 1b); and (3) PAL3 (pPAL3 = 7.28E−09) was associated with a small 4.13E − 04 ms shift in added svRT shortening (i.e., 9 + (0.32 + 4.13E − 04) ms per word pair, Fig. 1b).
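To make the polynomial terms concrete, the sketch below evaluates the age-related part of a cubic fit and its year-to-year slope. It is only an illustration: it assumes raw (non-orthogonal) polynomial terms on uncentered age in years, takes the signed coefficients from the Fig. 1a caption, and omits the intercept and all other covariates. None of these choices is confirmed by the text, and the function names are hypothetical.

```python
# Illustrative only: coefficients are the signed estimates from the Fig. 1a caption.
# Assumption (not stated in the text): raw polynomial terms on uncentered age in years.
B_AGE1 = 7.07       # linear term, ms per year
B_AGE2 = -0.15      # quadratic term, ms per year^2
B_AGE3 = 1.47e-03   # cubic term, ms per year^3


def age_component(age: float) -> float:
    """Age-related part of the fitted svRT (ms); intercept and covariates omitted."""
    return B_AGE1 * age + B_AGE2 * age**2 + B_AGE3 * age**3


def age_slope(age: float) -> float:
    """Instantaneous change in fitted svRT per year of age (first derivative)."""
    return B_AGE1 + 2 * B_AGE2 * age + 3 * B_AGE3 * age**2


if __name__ == "__main__":
    for age in (20, 40, 60, 80):
        print(f"age {age}: slope = {age_slope(age):.2f} ms/year")
```

Under these assumptions the slope stays positive across the 18–85 range and is steepest at the oldest ages, which is the sense in which the association is curvilinear rather than a constant per-year shift.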

Fig. 1: MindCrowd: age, paired-associate learning (PAL), and biological sex.
figure 1

MindCrowd analysis (ages 18–85) of simple visual reaction time (svRT). a Linear model fits (line fill ±95% CI, error bars ± SEM) of the median svRT by Age3 (curvilinear model). There was a curvilinear relationship between svRT and Age1 (βAge1 = 7.07, pAge1 = 3.06E − 18), Age2 (βAge2 = −0.15, pAge2 = 3.23E − 17), and Age3 (βAge3 = 1.47E − 03, pAge3 < 1.46E − 34, n = 75,666). b Linear model fits (line fill ±95% CI, error bars ± SEM) of the median svRT by PAL Performance3 (curvilinear model). There was a curvilinear relationship between svRT and paired-associate learning (PAL) performance: PAL Performance1 (βPAL1 = −8.89, pPAL1 = 3.77E − 34), PAL Performance2 (βPAL2 = 0.32, pPAL2 = 2.35E − 14), and PAL Performance3 (βPAL3 = −4.13E − 03, pPAL3 = 7.28E − 09, n = 75,666). c Simple slope analysis of the linear model fit (line fill ±95% CI, error bars ± SEM) for the Age × PAL Performance interaction on median svRT (βAge*PAL = −0.07, pAge*PAL = 1.26E − 59). Slopes are shown at 20 (βAge20*PAL = −5.48, pAge20*PAL < 0.00, n = 1985), 40 (βAge40*PAL = −6.41, pAge40*PAL < 0.00, n = 739), 60 (βAge60*PAL = −7.75, pAge60*PAL < 0.00, n = 1789), and 80 (βAge80*PAL = −9.08, pAge80*PAL < 0.00, n = 344) years of age. d Linear model fits (line fill ±95% CI, error bars ± SEM) of the median svRT by Age3 (curvilinear model) with lines split by Biological Sex. Being a woman was associated with longer svRT compared to being a man from younger to older ages (βSex = −34.26, pSex = 1.26E − 163, nWomen = 47,700, nMen = 27,966).

MindCrowd: sex, education, and handedness

Biological Sex was a significant predictor of svRT (pSex = 1.26E − 163). Being a man was associated with an average of 34 ms (9.63%) shorter svRT response than being a woman (Fig. 1d). Educational Attainment was also a significant factor associated with svRT. Compared to “No High School Diploma,” participants who had “Some College” (pCollege = 8.74E − 05) or a “College Degree” (pCDegree = 2.95E − 17, Fig. 2a) had shorter svRTs. Attending college and attaining a college degree were associated with nearly 15 ms (4.14%) and 32 ms (8.92%) shorter svRT, respectively, compared to not graduating from high school. Handedness was also associated with svRT. Left-handed participants had a nearly 4 ms (1.09%) shorter svRT (pLeft = 0.03, Fig. 2b). This association was present in individuals 20 to 40 years old (p40Left < 0.01, Fig. 2c) but not in individuals 40 to 60 years old (p60Left = 0.07, Fig. 2d).
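The text does not state what reference value the percentages are taken against. As a rough consistency check, and assuming each percentage is simply the absolute β (from the Fig. 1 and Fig. 2 captions) divided by a common reference svRT, that reference can be back-computed as below. The formula and the helper name `implied_reference` are assumptions made for illustration, not the authors' stated method.

```python
# Back-compute the reference svRT implied by each reported (|beta|, percent) pair.
# Assumption: percent = 100 * |beta| / reference; the text does not state this.
reported = {
    "Biological Sex (men)": (34.26, 9.63),
    "Some College": (14.73, 4.14),
    "College Degree": (31.78, 8.92),
    "Left-handed": (3.87, 1.09),
}


def implied_reference(abs_beta_ms: float, percent: float) -> float:
    """Reference RT (ms) such that abs_beta_ms is `percent` percent of it."""
    return abs_beta_ms / (percent / 100.0)


implied = {name: implied_reference(b, p) for name, (b, p) in reported.items()}

if __name__ == "__main__":
    for name, ref in implied.items():
        print(f"{name}: implied reference = {ref:.1f} ms")
```

All four pairs imply a reference of roughly 355 to 356 ms, which is consistent with a single common denominator; which quantity that denominator is (for example, a model intercept or a grand mean) is not something the text confirms.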

Fig. 2: MindCrowd: educational attainment and handedness. MindCrowd analysis (ages 18–85) of simple visual reaction time (svRT).
figure 2

a Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Educational Attainment. Participants who had “Some College” (βCollege = −14.73, pCollege = 8.74E − 05, n = 22,950) or a “College Degree” (βCDegree = −31.78, pCDegree = 2.95E − 17, n = 44,140) had shorter svRTs than those with “No High School Diploma” (n = 1881). b–d Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Handedness. b From 18–85 years old, left-handed participants showed slightly shorter svRTs (βLeft = −3.87, pLeft < 0.03), c an association found in 20–40-year-olds (β40Left = −3.16, p40Left < 0.01, Left-Handed n = 8,449), d but not in 40–60-year-olds (β60Left = −2.69, p60Left = 0.07, Right-Handed n = 66,903).

MindCrowd: health, medical, and lifestyle factors

For health and medical factors associated with svRT, we found that Smoking Status (pSmoking = 1.26E − 03, Fig. 3a) and Reported Dizziness (pDizzy = 0.04, Fig. 3b) were both significant predictors of svRT. Smoking Status was associated with 7 ms (1.99%) lengthened svRT, and Reported Dizziness was associated with nearly a 5 ms (1.37%) lengthened svRT. When compared to participants reporting “no daily medications,” taking “Two” (pMeds2 = 2.00E − 03), “Three” (pMeds3 < 0.01), and “Four” (pMeds4 = 3.51E − 16) Daily Medications were associated with an approximate 5 (1.64%), 6 (1.76%), and 18 (5.01%) ms longer svRTs, respectively (Fig. 3c). Further, Diabetes Mellitus (pDiabetes < 3.36E − 05, Fig. 3d), and Stroke (pStroke = 3.59E − 04, Fig. 3e) were related to 11 (3.16%) and 20 (5.73%) ms longer svRTs, respectively, when compared to participants not reporting either condition. Of note, in this model, both a first-degree family history of Alzheimer’s disease (FHAD; pFHAD = 0.78) and Hypertension (pHyper = 0.52) were not significant predictors of svRT performance (Supplementary Fig. 3A–B).

Fig. 3: MindCrowd: health, medical, and lifestyle factors.
figure 3

MindCrowd analysis (ages 18–85) of simple visual reaction time (svRT). a–b Smoking Status and Reported Dizziness were associated with lengthened svRT. a Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Smoking Status. Participants identifying as a smoker showed lengthened svRT (βSmoking = 7.07, pSmoking = 1.26E − 03, Smoker n = 5793, Non-Smoker n = 69,873). b Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Reported Dizziness. Participants with Reported Dizziness showed lengthened svRT (βDizzy = 4.87, pDizzy = 0.04, Dizziness Reported n = 4749, No Dizziness Reported n = 70,917). c–e Daily Medications, Diabetes, and Reported Stroke were all associated with lengthened svRT. c Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Daily Medications taken. Compared to participants reporting “no daily medications” (n = 33,672), taking “Two” (βMeds2 = 5.84, pMeds2 = 2.00E − 03, n = 9651), “Three” (βMeds3 = 6.24, pMeds3 < 0.01, n = 6656), and “Four” (βMeds4 = 17.82, pMeds4 = 3.51E − 16, n = 10,769) was associated with lengthened svRT. d Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Diabetes Mellitus. Reporting diabetes was associated with lengthened svRT (βDiabetes = 11.23, pDiabetes < 3.36E − 05, Diabetes Reported n = 3887, No Diabetes Reported n = 71,779). e Linear model fits (line fill ±95% CI) of the median svRT by Age3 (curvilinear model) with lines split by Reported Stroke. A reported stroke was associated with longer svRTs (βStroke = 20.38, pStroke = 3.59E − 04, Reported Stroke n = 765, No Reported Stroke n = 74,901).

MindCrowd: two-way interactions

For interactions, we found that Age significantly interacted with PAL Performance (Age × PAL Performance, pAge*PAL = 9.93E − 62). Analysis of simple slopes suggests that each word pair correct was associated with shorter svRT from younger to older ages, that is, at 20 (pAge20*PAL < 0.00), 40 (pAge40*PAL < 0.00), 60 (pAge60*PAL < 0.00), and 80 (pAge80*PAL < 0.00, Fig. 1c) years old. There was a significant Biological Sex × Age interaction (pAge*Sex = 4.61E − 08), indicating that the associated slowing of svRT from younger to older ages in men, compared to women, was 0.36 ms (0.08%) less per one-year difference in age (Fig. 1d). These data suggest that men’s age-associated svRT lengthening was slower when compared to women. Of interest, in both women and men, we found significant Age × Educational Attainment interactions. Compared to Age × “No High School Diploma”, participants reporting having “Some College” (pAge*College = 4.20E − 04) or a “College Degree” (pAge*CDegree = 2.07E − 12) differed significantly in their age-associated change in svRT. These results suggest that attending college or getting a college degree was associated with a 0.65 (0.15%) and 1.31 (0.30%) ms shortened svRT performance per one-year difference in age, respectively (Fig. 2a). The MindCrowd model revealed a significant Age × Reported Stroke interaction (pAge*Stroke = 2.72E − 06). A Reported Stroke was associated with an approximately 2 ms (0.37%) longer svRT per one-year difference in age (Fig. 3e). Lastly, we found a significant Age × Smoking Status interaction (pAge*Smoke = 5.67E − 07). This interaction suggests that Smoking Status was associated with an additional 0.57 ms (0.11%) of svRT lengthening per one-year difference in age (Fig. 3a). See Table 3 for a summary of MindCrowd’s results.
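In a linear model with an Age × PAL term, the simple slope of PAL at a given age is the PAL main effect plus the interaction coefficient times that age. The sketch below writes this out and, as a rough check, computes how much the simple slopes reported in the Fig. 1c caption change per year between adjacent ages. The function names are hypothetical, and the check assumes nothing about how the authors derived those slopes.

```python
# Simple slopes of PAL Performance on svRT at four ages, as reported in the Fig. 1c caption.
reported_slopes = {20: -5.48, 40: -6.41, 60: -7.75, 80: -9.08}  # ms per word pair


def simple_slope(b_pal: float, b_interaction: float, age: float) -> float:
    """Slope of PAL at a given age in a model with a linear Age x PAL term."""
    return b_pal + b_interaction * age


def per_year_change(slopes: dict[int, float]) -> dict[tuple[int, int], float]:
    """Change in the PAL slope per year of age between adjacent reported ages."""
    ages = sorted(slopes)
    return {
        (lo, hi): (slopes[hi] - slopes[lo]) / (hi - lo)
        for lo, hi in zip(ages, ages[1:])
    }


changes = per_year_change(reported_slopes)

if __name__ == "__main__":
    for (lo, hi), delta in changes.items():
        print(f"ages {lo}-{hi}: {delta:.3f} ms per word pair per year")
```

The implied changes (about −0.05 to −0.07 ms per word pair per year) are of the same order as the interaction estimate given in the Fig. 1c caption (βAge*PAL = −0.07). They need not match it exactly, since the reported slopes may have been estimated on age-specific subsets.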

Table 3 Summary of MindCrowd’s main results.

MindCrowd: mobile device

Participants who were identified as using a mobile device to take MindCrowd (i.e., using a touchscreen, n = 7603, age M = 54.06 SD = 14.54 years) had longer svRTs and were older (βAge–Mobile = 14.13, pAge–Mobile < 2e − 16) compared to those who did not use a mobile device (n = 76,775, age M = 45.54 SD = 18.43 years, see Supplementary Fig. 4).

UKBb MindCrowd and UK Biobank

Of the total 75,666 MindCrowd participants, 39,759 between the ages of 40 and 70 were selected to mirror the UK Biobank. This subset is called UKBb MindCrowd from here on to differentiate it from MindCrowd. After filtering (see “Methods” section: Data Quality Control), the UK Biobank cohort had 158,249 participants, derived from a data request we received on 9–19–2019 (see Table 1 for Sociodemographic Characteristics and Supplementary Fig. 1B–C for age histograms). We modeled both the UKBb MindCrowd’s svRT (see Supplementary Fig. 5 for regression diagnostic plots) as well as the UK Biobank’s cvrRT (see Supplementary Fig. 6 for regression diagnostic plots) as a function of 11 shared survey questions (see Table 4 for MindCrowd and UK Biobank’s sample sizes [ns]). The omnibus UKBb MindCrowd (Fmcomni[20, 38871] = 1039, pmcomni < 2.2e − 16, Adjusted R2 = 0.08) and UK Biobank (Fukbbomni[20, 157903] = 1038, pukbbomni < 2.2e−16, Adjusted R2 = 0.13) LMs were both significant. Table 5 summarizes the results from UKBb MindCrowd and the UK Biobank side by side.

Table 4 UKBb MindCrowd and UK Biobank’s sample size (ns) summary.
Table 5 Summary of the key results from UKBb MindCrowd and the UK Biobank.

UKBb MindCrowd and UK Biobank: age and sex

The UKBb MindCrowd cohort revealed Age as a significant predictor of svRT (pAge = 2.00E − 16). The parallel analysis (see “Statistical Methods” section) of Age in the UK Biobank cohort was also significant (pAge = 2.00E − 16) for complex visual recognition reaction time (cvrRT). For the association of Age and RT, UKBb MindCrowd and the UK Biobank showed longer RTs or worse RT performance from younger to older ages, with nearly 6 and 3 ms lengthened RT per year difference of age, respectively (Fig. 4a–b). For UKBb MindCrowd, Biological Sex was a significant predictor of RT (βSex = −40.00, pSex = 8.03E − 71), which was also the case in the UK Biobank (βSex = −18.28, pSex = 2.00E − 16). Being a man in both cohorts was associated with shorter RTs compared to being a woman (Fig. 4c–d). Here, the association of Biological Sex with RT was 40 ms (20.46%) in UKBb MindCrowd and 18 ms (5.16%) in the UK Biobank.

Fig. 4: UK Biobank: age and biological sex.
figure 4

UKBb MindCrowd and UK Biobank analysis (ages 40–70) of visual reaction time (RT). a–b Age was linearly associated with RT. a UKBb MindCrowd linear model fits (line fill ±95% CI, error bars ± SEM) of median simple visual RT (svRT) from young to old Age. b UK Biobank linear model fits (line fill ±95% CI, error bars ± SEM) of median complex visual recognition RT (cvrRT) from young to old Age. UKBb MindCrowd svRT (βAge = 5.75, pAge = 2.00E − 16, n = 39,795) and UK Biobank cvrRT (βAge = 3.40, pAge = 2.00E − 16, n = 158,245) were associated with similar lengthening from younger to older ages. The average 50 ms difference between UKBb MindCrowd svRT (M = 478.66 ms) and UK Biobank cvrRT (M = 528.74 ms) is due to the choice component (i.e., do cards match or not > press button) of the UK Biobank’s cvrRT task compared to UKBb MindCrowd’s stimulus-response (i.e., the pink sphere appears > press button) svRT. c–d Being a man, as compared to being a woman, was associated with shorter visual RT. c UKBb MindCrowd linear model fits (line fill ±95% CI, error bars ± SEM) of median svRT from young to old Age with lines split by Biological Sex. d UK Biobank linear model fits (line fill ±95% CI, error bars ± SEM) of median cvrRT from young to old Age with lines split by Biological Sex. Both UKBb MindCrowd svRT (βSex = −40.00, pSex = 8.03E − 71, 20.46%, Women [M = 489.75 ms, n = 29,640], Men [M = 446.28 ms, n = 10,155]) and UK Biobank cvrRT (βSex = −18.28, pSex = 2.00E − 16, 5.16%, Women [M = 534.98 ms, n = 89,331], Men [M = 520.66 ms, n = 68,914]) found that being a man was consistently associated with shorter RT when compared to being a woman from 40–70 years of age.

UKBb MindCrowd and UK Biobank: education and handedness

Akin to the association found in the MindCrowd analysis, the UKBb MindCrowd and UK Biobank comparison found that Educational Attainment was a significant RT predictor. Indeed, for UKBb MindCrowd, a “High School Diploma” (βHSDiploma = −9.68, pHSDiploma = 2.70E − 01, 1.87%), “Some College” (βCollege = −26.08, pCollege = 1.61E − 03, 5.06%), and a “College Degree” (βCDegree = −49.07, pCDegree = 1.91E − 09, 9.51%), and in the UK Biobank a “High School Diploma” (βHSDiploma = −9.68, pHSDiploma = 1.23E − 33, 1.74%), “Some College” (βCollege = −10.26, pCollege = 1.30E − 40, 1.85%), or a “College Degree” (βCDegree = −11.71, pCDegree = 1.46E − 41, 2.11%) were all associated with shorter RT than “No High School Diploma” (Fig. 5a–b); each contrast was significant except the UKBb MindCrowd “High School Diploma” contrast (p = 0.27). Thus, in both the UKBb MindCrowd and UK Biobank cohorts, more education was associated with shorter RT. Lastly, unlike the MindCrowd analyses, neither the UKBb MindCrowd (pHandedness = 0.40) nor the UK Biobank (pHandedness = 0.36) cohort between the ages of 40–70 found Handedness to be a significant predictor of RT performance (Supplementary Fig. 7).

Fig. 5: UK Biobank: educational attainment and diabetes mellitus.
figure 5

UKBb MindCrowd and UK Biobank analysis (ages 40–70) of visual reaction time (RT). a–b More education was related to shorter visual RT. a UKBb MindCrowd linear model fits (line fill ±95% CI) of median simple visual RT (svRT) from young to old Age with lines split by Educational Attainment. Participants who had a “High School Diploma” (βHSDiploma = −9.68, pHSDiploma = 2.70E − 01, 1.87%, n = 3176), “Some College” (βCollege = −26.08, pCollege = 1.61E − 03, 5.06%, n = 11,139), or a “College Degree” (βCDegree = −49.07, pCDegree = 1.91E − 09, 9.51%, n = 24,875) had shorter svRTs than those with “No High School Diploma” (n = 605). b UK Biobank linear model fits (line fill ±95% CI) of median complex visual recognition RT (cvrRT) from young to old Age with lines split by Educational Attainment. Like the UKBb MindCrowd cohort, participants who had a “High School Diploma” (βHSDiploma = −9.68, pHSDiploma = 1.23E − 33, 1.74%, n = 46,247), “Some College” (βCollege = −10.26, pCollege = 1.30E − 40, 1.85%, n = 77,270), or a “College Degree” (βCDegree = −11.71, pCDegree = 1.46E − 41, 2.11%, n = 23,750) were all associated with shorter cvrRTs when compared to “No High School Diploma” (n = 10,978). c–d Diabetes Mellitus was associated with lengthened visual RT. c UKBb MindCrowd linear model fits (line fill ±95% CI) of median svRT from young to old Age with lines split by diabetes mellitus. d UK Biobank linear model fits (line fill ±95% CI) of median cvrRT from young to old Age with lines split by diabetes mellitus. In both UKBb MindCrowd svRT (βDiabetes = 11.48, pDiabetes = 3.31E − 03, Diabetes Reported n = 2807, No Diabetes Reported n = 36,988) and UK Biobank cvrRT (βDiabetes = 5.48, pDiabetes = 4.80E − 07, Diabetes Reported n = 4969, No Diabetes Reported n = 153,276), reporting diabetes was associated with lengthened RT.

UKBb MindCrowd and UK Biobank: health, medical, and lifestyle factors

In terms of health factors associated with RT, in the UKBb MindCrowd cohort, Diabetes (βDiabetes = 11.48, pDiabetes = 3.31E − 03, 5.87%), Stroke (βStroke = 18.47, pStroke = 4.00E − 02, 9.45%), Hypertension (βHypertension = 7.99, pHypertension = 3.16E − 03, 3.58%), and Dizziness (βDizzy = 12.13, pDizzy = 2.52E − 03, 6.19%) were all significantly associated with longer svRTs. These associations were recapitulated by the UK Biobank. To that end, Diabetes Mellitus (βDiabetes = 5.48, pDiabetes = 4.80E − 07, 1.55%, Fig. 5c–d), Reported Stroke (βStroke = 10.61, pStroke = 6.15E − 07, 2.99%, Fig. 6a–b), Reported Hypertension (βHypertension = 1.14, pHypertension = 0.02, 0.31%, Fig. 6c–d), and Reported Dizziness (βDizzy = 3.21, pDizzy = 3.71E − 14, 0.91%, Fig. 7a–b) were all significantly related to longer cvrRTs; however, the association between Reported Hypertension and cvrRT was small (Fig. 6d). Further, in agreement with the MindCrowd analysis, Smoking Status was significantly related to svRT (βSmoke = 10.18, pSmoke = 0.01, 5.21%) in the UKBb MindCrowd cohort; however, FHAD (βFHAD = −0.07, pFHAD = 0.97) was not. This pattern of associations was reversed in the UK Biobank; that is, Smoking Status was not a significant predictor of cvrRT (βSmoke = 0.47, pSmoke = 0.22, Fig. 7c–d), but FHAD was (βFHAD = 2.36, pFHAD = 3.35E − 05, 0.69%, Fig. 8a–b).

Fig. 6: UK Biobank: reported stroke and hypertension.
figure 6

UKBb MindCrowd and UK Biobank analysis (ages 40–70) of visual reaction time (RT). a–b Reported Stroke was associated with lengthened visual RT. a UKBb MindCrowd linear model fits (line fill ±95% CI) of median simple visual RT (svRT) from young to old Age with lines split by Reported Stroke. b UK Biobank linear model fits (line fill ±95% CI) of median complex visual recognition RT (cvrRT) from young to old Age with lines split by Reported Stroke. In both the UKBb MindCrowd svRT (βStroke = 18.47, pStroke = 4.00E − 02, Reported Stroke n = 2807, No Reported Stroke n = 39,318) and UK Biobank cvrRT (βStroke = 10.61, pStroke = 6.15E − 07, Reported Stroke n = 1237, No Reported Stroke n = 157,008) analyses, a Reported Stroke was associated with lengthened visual RT. c–d Reported Hypertension was associated with lengthened visual RT. c UKBb MindCrowd linear model fits (line fill ±95% CI) of median svRT from young to old Age with lines split by Reported Hypertension. d UK Biobank linear model fits (line fill ±95% CI) of median cvrRT from young to old Age with lines split by Reported Hypertension. Unlike the MindCrowd analysis, hypertension was related to longer svRTs in UKBb MindCrowd (βHypertension = 7.99, pHypertension = 3.16E − 03, Hypertension Reported n = 9676, No Hypertension Reported n = 30,119) and longer cvrRTs in the UK Biobank (βHypertension = 1.14, pHypertension = 0.02, Hypertension Reported n = 32,593, No Hypertension Reported n = 125,652).

Fig. 7: UK Biobank: reported dizziness and smoking status.
figure 7

UKBb MindCrowd and UK Biobank analysis (ages 40–70) of visual reaction time (RT). a–b Reported Dizziness was associated with lengthened visual RT. a UKBb MindCrowd linear model fits (line fill ±95% CI) of median simple visual RT (svRT) from young to old Age with lines split by Reported Dizziness. b UK Biobank linear model fits (line fill ±95% CI) of median complex visual recognition RT (cvrRT) from young to old Age with lines split by Reported Dizziness. UKBb MindCrowd svRT (βDizzy = 12.13, pDizzy = 2.52E − 03, Dizziness Reported n = 2543, No Dizziness Reported n = 37,252) and UK Biobank cvrRT (βDizzy = 3.21, pDizzy = 3.71E − 14, Dizziness Reported n = 42,210, No Dizziness Reported n = 116,035) were lengthened in participants with Reported Dizziness. c–d Smoking Status was associated with lengthened svRT in UKBb MindCrowd, but not cvrRT in the UK Biobank. c UKBb MindCrowd linear model fits (line fill ±95% CI) of median svRT from young to old Age with lines split by Smoking Status. Compared to non-smokers, smokers had longer svRTs (βSmoke = 10.18, pSmoke = 0.01, Smoker n = 2783, Non-Smoker n = 37,012). d UK Biobank linear model fits (line fill ±95% CI) of median cvrRT from young to old Age with lines split by Smoking Status. An association between Smoking Status and cvrRT was not found in the UK Biobank (βSmoke = 0.47, pSmoke = 0.22, Smoker n = 91,312, Non-Smoker n = 66,923).

Fig. 8: UK Biobank: FHAD and biological sex × educational attainment.
figure 8

UKBb MindCrowd and UK Biobank analysis (ages 40–70) of visual reaction time (RT). a–b A first-degree family history of Alzheimer’s disease (FHAD) was related to longer complex visual recognition reaction times (cvrRTs) in the UK Biobank, but not simple visual reaction time (svRT) in UKBb MindCrowd. a UKBb MindCrowd linear model fits (line fill ±95% CI) of median svRT from young to old Age with lines split by reported FHAD. An association between FHAD and svRT was not found in the UKBb MindCrowd cohort (βFHAD = −0.07, pFHAD = 0.97, FHAD Reported n = 13,748, No FHAD Reported n = 26,047). b UK Biobank linear model fits (line fill ±95% CI) of median cvrRT from young to old Age with lines split by reported FHAD. Compared to those reporting No FHAD, FHAD was related to worse cvrRT performance in the UK Biobank (βFHAD = 2.36, pFHAD = 3.35E − 05, FHAD Reported n = 19,741, No FHAD Reported n = 138,504). c–d In the UK Biobank, Biological Sex modified the association of Educational Attainment with cvrRT (Biological Sex × Educational Attainment interaction). Linear model fits (line fill ±95% CI) of the median cvrRT by Age with lines split by Educational Attainment in (c) women and (d) men. Compared to (c) women having “No High School Diploma” (n = 6056), (d) men with a “High School Diploma” (βSex*HSDiploma = −11.24, pSex*HSDiploma = 3.30E − 12, n = 18,505), “Some College” (βSex*College = −10.54, pSex*College = 8.71E − 12, n = 34,230), or a “College Degree” (βSex*CDegree = −12.8, pSex*CDegree = 1.68E − 13, n = 11,257) were associated with shortened cvrRT. See Supplementary Fig. 8, displaying simple effects parsed using estimated marginal means (EMM).

UKBb MindCrowd and UK Biobank: two-way interactions

The “glmulti”35 R package defined two interactions in the UKBb MindCrowd and UK Biobank analysis. In UKBb MindCrowd, we found a significant Age × Biological Sex interaction (pAge*Sex = 2.00E − 02, 0.33%). We found a comparable significant Age × Biological Sex interaction for cvrRT (pAge*Sex = 4.18E − 28, 0.16%) in the UK Biobank. Across both MindCrowd and the UK Biobank, these interactions indicated that the lengthening of RT from younger to older ages in men, compared to women, was over 0.5 ms shorter per one-year difference in age (Fig. 4c–d). In addition, the UK Biobank analysis revealed a significant Biological Sex × Educational Attainment interaction not found in UKBb MindCrowd. Here, men with a “High School Diploma” (pSex*HSDiploma = 3.30E − 12, 3.35%), “Some College” (pSex*College = 8.71E − 12, 3.14%), or a “College Degree” (pSex*CDegree = 1.68E − 13, 3.18%) were significantly associated with shorter cvrRT performance when compared to women having “No High School Diploma” (Fig. 8c). Follow-up analyses of the simple effects via estimated marginal means (EMM, see “Statistical Methods” section) revealed that men who did not graduate high school (EMM = 558.91 ms), compared to men with more education (EMMs = 542.99, 542.98, and 540.41 ms), had markedly longer cvrRTs, more in line with the women’s cvrRT performance (EMMs = 566.80, 562.11, 561.40, and 561.09 ms). The associated difference in cvrRT for men with “No High School Diploma” compared to men with a “High School Diploma” (βMen = 15.92, pMen = 2.00E − 16, 1.75%) was more substantial than that between women with “No High School Diploma” and women with a “High School Diploma” (βWomen = 4.68, pWomen = 3.12E − 04, 1.69%, see Supplementary Fig. 8 for a graph of the EMM).
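The simple-effect contrasts can be checked directly from the estimated marginal means quoted above. The sketch below subtracts each education level's EMM from the “No High School Diploma” EMM within each sex. It assumes the EMMs are listed in ascending order of Educational Attainment, which the text implies but does not state outright; the function name is hypothetical.

```python
# Estimated marginal means (ms) copied from the text, assumed to be ordered as:
# No High School Diploma, High School Diploma, Some College, College Degree.
EMM_MEN = [558.91, 542.99, 542.98, 540.41]
EMM_WOMEN = [566.80, 562.11, 561.40, 561.09]


def contrasts_vs_reference(emms: list[float]) -> list[float]:
    """Difference (ms) between the first level and each later level."""
    reference = emms[0]
    return [round(reference - value, 2) for value in emms[1:]]


men = contrasts_vs_reference(EMM_MEN)
women = contrasts_vs_reference(EMM_WOMEN)

if __name__ == "__main__":
    print("men:", men)
    print("women:", women)
```

The first contrast for men reproduces the reported βMen = 15.92, and the first contrast for women (4.69) is within rounding of the reported βWomen = 4.68. Both are positive, meaning cvrRT is longer without a high school diploma, with the gap about three times larger in men.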

Discussion

Our study’s results illuminate a portion of the intricate relationship between age and RT performance by identifying demographic, health, medical, and lifestyle factors associated with either attenuation or exacerbation of RT lengthening from younger to older ages (see Fig. 9 for an illustrative summary of the results). A large body of work on RT, leading back to Sir Francis Galton in 189036, has consistently demonstrated an age-associated shift in RT37. It is therefore not surprising that the MindCrowd, UKBb MindCrowd, and UK Biobank models revealed slowing of simple visual (svRT, MindCrowd) and complex visual recognition RT (cvrRT, UK Biobank) from younger to older ages. Likewise, the MindCrowd model found that the relationship between svRT and age was modestly curvilinear (Fig. 1a). While this curvilinear relation between RT and age has been noted previously38, both cohorts’ large sample sizes combined with our application of an algorithm-based model definition35 revealed a notable addition to this picture. Specifically, in the MindCrowd analysis, we observed interactions between age and both education (Fig. 2a) and smoking (Fig. 3a). Here, less education and smoking were related to additional slowing of simple visual RT (svRT) on top of the svRT slowing associated with transitioning from younger to older ages. Lastly, the age by reported stroke interaction modeled in MindCrowd was associated with longer RTs (Fig. 3e). This study’s large sample size and broad range of ages and surveyed data place it as one of the most substantial cross-sectional RT evaluations across the aging spectrum. Our findings suggest that smoking and stroke (i.e., cardiovascular health) and amount of education (i.e., cognitive demand or reserve) are factors, modifiable across aging, that are associated with age-associated RT slowing.

Fig. 9: An illustrative summary of the overall results.
figure 9

Data are shown across the MindCrowd (MC), UKBb MindCrowd, and the UK Biobank (UKBb). The color (i.e., red = low negative and blue = high positive) indicates the size of the β (beta coefficient) estimate. “N.S.” indicates that the estimated β value was not statistically significant (α = 0.05).

In the UK Biobank cohort27, we found an association between having an FHAD and lengthened (2.43 ms) cvrRT (Fig. 8b). This association of FHAD with RT or, more specifically, with underlying processing speed is in line with our finding for an episodic verbal memory task (i.e., paired-associate learning [PAL])34, where FHAD was linked to lower PAL performance. Furthermore, a prior functional magnetic resonance imaging (fMRI) study examining medial-temporal lobe activation using a cvrRT task found a ~100 ms RT lengthening in 68 FHAD participants (mean age of 54)39. These data suggest that genetic and environmental factors relating to AD risk are present in individuals with an FHAD. Indeed, the first-degree relatives of identified FHAD participants had either familial early-onset AD or late-onset AD, which also has a high heritability of 79%, suggesting that FHAD participants are in a higher risk category for developing AD. Thus, such shared AD and FHAD factors may relate to sensorimotor function and processing speed (i.e., RT) in a manner analogous to alterations in cognition and memory (i.e., PAL).

We found a correlation between svRT and PAL performance (Fig. 1b). This finding is in line with many prior studies showing the dependence of episodic memory on processing speed, a dependence that grows with age and the incidence of age-related disease (e.g., AD)5,6,7,8,40. The association of svRT with PAL may highlight distributed systems and networks that underlie RT performance. For example, svRT performance could depend on the functioning of many cognitive areas that PAL requires and vice versa. Another possibility is that properly functioning memory and cognitive networks correspond to better RT performance. Evidence for this is suggested by the fact that higher intelligence is related to shorter RT41,42,43. However, diverse factors (e.g., exercise, time of day, meal proximity, and the individual assessing RT) affect RT performance40,43,44, making an accurate estimation of RT’s effect on intelligence, and vice versa, challenging. Together with our prior results, these findings suggest that RT performance could be used as a metric to assess potential AD risk. However, further research, including longitudinal studies, replication, and corroboration of RT’s link to age-related cognitive decline and disease, is necessary to support this notion.

The difference in the effects of FHAD between UK Biobank and MindCrowd (for both MindCrowd and UKBb MindCrowd analyses, Supplementary Fig. 3A and Fig. 8a) could be due to the vast difference in the fraction of FHAD participants in the UK Biobank (FHAD = 13% of total) compared to UKBb MindCrowd (FHAD = 35% of total). While noting that we target those with FHAD for recruitment into MindCrowd, this substantial disparity could also be due to the accuracy of the UK Biobank’s FHAD measure. The UK Biobank’s FHAD was calculated from three separate questions (i.e., mother, father, and siblings AD diagnosis). Adding to this, the United Kingdom uses a massive, detailed, nationwide electronic health record system, facilitating respondent health and medical survey accuracy. Compare this to our single question in MindCrowd, asking participants of all ages to remember if a relative was diagnosed with AD. Another possibility is the RT paradigm used; that is, MindCrowd’s test of svRT compared to the UK Biobank’s use of cvrRT. The fact that complex reaction time requires recognition and the choice to “respond or not respond,” rather than just stimulus-response, may underlie this difference. UKBb MindCrowd svRT was consistently shorter from young to old age than UK Biobank cvrRT, an observation also noted in a prior study that measured both simple and complex RT45. These findings are consistent with the idea that complex RT requires more processing time12,46, and prior work found that the age-associated slowing of RT was greater for choice RT than for simple RT47. Lastly, the differences in FHAD associations across MindCrowd and the UK Biobank may arise because the UK Biobank had over 2× the number of participants compared to MindCrowd (i.e., 158 K vs. 76 K). The larger sample is expected to produce better model definitions and increased statistical power. Increased statistical power may also have enhanced accuracy, validity, and reproducibility.

Numerous RT studies have found sex differences in RT performance, which do not appear to be reduced by practice16,20,45. Consistent with others45,47, men exhibited shorter RTs in each model across cohorts (Figs. 1d and 4c–d). In addition, the analysis of the UKBb MindCrowd and UK Biobank implicated biological sex in RT slowing from younger to older ages. The age interaction with biological sex suggests that, from younger to older ages, being a woman is associated with longer RT compared to being a man. These results essentially replicate a previous sizable study (i.e., 7000 participants) evaluating RT45. Similar to our own, this study found that (1) men consistently outperformed women on all RT measures from younger to older ages, (2) differences in RT performance from younger to older ages were nonlinear, (3) including a third-degree polynomial for age provided the best model fit, and (4) compared to men, women displayed longer RTs consistently from younger to older ages45. Together with our prior study of PAL performance34, these associations replicate prior work and suggest that biological sex affects RT and the age-associated shift in RT.

Educational attainment was associated with svRT in both MindCrowd cohorts and cvrRT in the UK Biobank. Overall, having more education (i.e., reporting higher milestones) was related to shorter svRT and cvrRT (Figs. 2a and 5a–b). However, it is unclear whether individuals with higher processing speed naturally seek more education and what other factors confound this relationship. Further work utilizing both cohorts is necessary to shed light on the effects and modifiers of FHAD and cross-cohort discrepancies. The model of the UK Biobank revealed an interaction between biological sex and education on RT performance. The breakdown revealed that men had similar RT performance if they attained a high school diploma or above. Men who did not graduate high school, however, showed markedly longer RTs, which brought them in line with women’s RT performance. Notably, the associated lengthening of RT for less-educated men was vastly greater than that found in less-educated women (participants reporting “No High School Diploma” made up 3.83% of women and 3.11% of men; see Supplementary Fig. 8).

In MindCrowd, handedness, specifically being left-handed, was associated with shorter svRTs. Prior studies have reported similar associations, where left-handedness was correlated with shorter svRT19,46,48. Hemispheric asymmetries in spatial processing are thought to underlie shortened svRT for the left hand47,49. Handedness was not associated with svRT in UKBb MindCrowd or cvrRT in the UK Biobank. One explanation for the divergent findings is that MindCrowd includes younger participants (i.e., 18–40-year-olds). Indeed, in MindCrowd, the association appears to diminish from younger to older ages. Specifically, in Fig. 2b, the separation between the regression lines for left-handed and right-handed participants shrinks, and the lines eventually cross around the 4th decade of life. Figure 2c shows that the left-handed and right-handed regression lines separate in 20- to 40-year-olds, while Fig. 2d shows that these regression lines are not separate in 40- to 60-year-olds. While purely speculative, differences in social conventions may have played a role. For example, some older participants may have been forced to be right-handed, whereas younger participants were not. This would increase the amount of unexplained variance in older, but not younger, participants across MindCrowd and the UK Biobank.

The MindCrowd analysis incorporated all 13 available health, medical, and lifestyle-related factors, of which six were present and incorporated into the shared UKBb MindCrowd/UK Biobank model (Tables 2 and 4). Before the launch of MindCrowd, these factors were carefully selected based upon their known relation to (1) age-associated alterations, (2) RT performance, and (3) PAL performance. Of the 13 health, medical, and lifestyle factors evaluated in the MindCrowd analysis, we found associations between svRT and the number of daily medications, reported dizziness, smoking status, reported stroke, and diabetes mellitus. Each of these health and medical factors was associated with longer svRTs (Fig. 3). We should note that the number of daily medications is serving as a proxy for overall health. That is, the worse one’s health, the worse one’s performance and the greater the number of medications treating the underlying health conditions. The UKBb MindCrowd (svRT) and the UK Biobank (cvrRT) analyses found similar associations for reported dizziness, reported stroke, diabetes mellitus, and reported hypertension. Although each association differed in magnitude between the two older cohorts, each was related to lengthened RT. The UKBb MindCrowd and UK Biobank analyses found different associations for FHAD (UKBb MindCrowd = no association; UK Biobank = 2.43 ms longer) and smoking status (UKBb MindCrowd = 10 ms longer; UK Biobank = no association). Interestingly, despite some differences, only a few coefficient signs differed between the UKBb MindCrowd and UK Biobank; indeed, most estimations were well within an order of magnitude between the two cohorts (e.g., age, educational attainment, and age by biological sex interaction, Table 5).

Many factors are likely to account for the different associations of smoking and FHAD between UKBb MindCrowd and the UK Biobank (Figs. 7c–d and 8a–b). Some of these include differences in demographics, genetic heterogeneity, and age26. Other candidates include the fractions of participants reporting each factor (e.g., for diabetes mellitus: MindCrowd = 1%, UKBb MindCrowd = 7%, and UK Biobank = 3%). Another factor is that the UK Biobank’s participant number is twice the size of MindCrowd and four times the size of UKBb MindCrowd. Despite our study’s size, the observational and cross-sectional method means that we cannot rule out effects due to confounding variables.

Consequently, while numbers may be close, we do not assume that the UKBb MindCrowd is similar to, or can be directly compared with, the UK Biobank. Furthermore, we observed that UKBb MindCrowd consistently reported larger estimates and standard errors than the UK Biobank. For example, the MindCrowd estimate of the sex difference association was consistently larger (~40 ms), even in the UKBb MindCrowd cohort, than the UK Biobank estimate (~19 ms). This difference demonstrates why the study of neuropsychological traits and disease requires large sample sizes to provide accurate estimations driving better predictive validity.

We strongly advocate for large-scale efforts like ours, the UK Biobank50, and others22. Indeed, studies of this kind have characteristics that provide the unique impact necessary to move the fields of aging and age-related diseases forward. These include the following. (1) Statistical control: our MindCrowd analysis incorporated all 24 available factors, 11 of which were used in the UKBb MindCrowd and UK Biobank model. (2) The inclusion of each predictor controlled for its association with RT, which potentially removed variability (noise), thus enhancing statistical power. (3) The two models used for each of the analyses were selected with little human input by automated application of specific statistical criteria (see Inclusion of polynomials and automatic model selection in “Statistical Methods” section). This likely decreases bias, the probability of overfitting, and multicollinearity. (4) For this study, MindCrowd had over 76 K and the UK Biobank over 158 K participants. Large sample sizes in each cohort were expected to help reduce variance, enhance estimation, select better models, and in turn, enhance statistical power. Expanded statistical power may then enhance accuracy, validity, and reproducibility.

Lastly, a recent genome-wide association study examining associations between RT and single nucleotide polymorphisms (SNP) in the UK Biobank and CHARGE and COGENT consortia noted weak correlations between the reported cognitive-associated SNPs among US and UK cohorts51. Here, MindCrowd presents a future opportunity to resolve these weak associations and get a better picture of potential cohort effects. Taken together, these characteristics increase the likelihood of making accurate inferences regarding associations while boosting predictive validity. These are both necessary and vital attributes when searching for genetic associations and the structure underlying healthy brain aging.

There are potential concerns that arise from web-based studies52. Indeed, limitations of this study include the cross-sectional design, the partial discrepancy between MindCrowd’s svRT test and the UK Biobank’s cvrRT test, and differences in the information collected by the UK Biobank and MindCrowd (e.g., the omission of “prefer not to answer” choices for race and education questions). Acknowledging these drawbacks, we believe that the advantage of meaningfully higher participant numbers and enriched cohort diversity facilitated via online research remediates some disadvantages. For example, the range of error reported in recent internet-based studies of self-reported quantitative traits like height and weight was between 0.3 and 20%53,54,55,56. Previously, we ran simulations on the association between FHAD and PAL by randomly shuffling the FHAD responses (e.g., Yes to No, and No to Yes), introducing sequentially increasing amounts of “error.” We found that even with a subtle effect such as FHAD on PAL performance, due to our cohort size, 24% error would still have only made us commit a Type 1 error 50% of the time34. In line with this notion, another publication demonstrated that online RT studies produce reproducible results57.
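The response-shuffling idea described above can be illustrated with a minimal sketch. This is not the simulation code from the earlier study: the function names, the synthetic cohort, and the 1-point group difference are all hypothetical and serve only to show how flipping a growing fraction of Yes/No responses attenuates an observed group difference.

```python
import random


def flip_labels(labels, error_rate, rng):
    """Return a copy of boolean labels with a fraction `error_rate` flipped."""
    n_flip = round(len(labels) * error_rate)
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [(not lab) if i in flip_idx else lab for i, lab in enumerate(labels)]


def group_difference(scores, labels):
    """Mean score of the 'Yes' group minus mean score of the 'No' group."""
    yes = [s for s, lab in zip(scores, labels) if lab]
    no = [s for s, lab in zip(scores, labels) if not lab]
    return sum(yes) / len(yes) - sum(no) / len(no)


if __name__ == "__main__":
    rng = random.Random(2020)
    # Synthetic cohort (hypothetical numbers): about 30% answer "Yes",
    # and that group scores 1 point lower on average.
    labels = [rng.random() < 0.30 for _ in range(5000)]
    scores = [rng.gauss(20.0 - (1.0 if lab else 0.0), 6.0) for lab in labels]
    for error_rate in (0.0, 0.12, 0.24, 0.36):
        noisy = flip_labels(labels, error_rate, rng)
        print(f"{error_rate:.2f}", round(group_difference(scores, noisy), 3))
```

In a sketch like this, the estimated difference should shrink toward zero as the injected error grows, which is the behavior the cited simulations were designed to quantify.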

Further, we developed an extensive and automated data filtering pipeline (see Data Quality Control and Supplementary Figs. 9–10) to address these concerns and enhance validity and accuracy. Participants with missing data (i.e., raw or filtered) were excluded before analysis (i.e., listwise deletion). Exclusion resulted in dropping 0.3% and 6.1% of MindCrowd and UK Biobank participants, respectively. One of the 25 critical factors had over 5% missing data (see Supplementary Fig. 11 and Supplementary Tables 1 and 2). Reported Dizziness in the UK Biobank had 64.36% missing data. Hence, interpretation of this factor’s association with cvrRT should only be considered for “hypothesis-generation58.” Evaluation of selection bias between retained and excluded participants revealed an overall lower probability of exclusion in MindCrowd and higher likelihood in the UK Biobank (see Supplementary Tables 3 and 4). Notable groups with a higher probability of exclusion included those in the highest age ranges and those reporting hypertension and dizziness. These higher probability groups were found in both studies’ cohorts.

Lastly, it is essential to note that our internet-based svRT task was not designed to directly mirror conventional face-to-face RT testing paradigms. Indeed, we find higher RTs and steeper slopes from younger to older ages than studies assessing svRT via the gold standard, laboratory-based assessments (e.g., refs. 47,59). However, these paradigm differences are not likely to alter our svRT test’s validity or reliability. One reason is that our test is only interpreted within MindCrowd to identify associated factors and reveal individual differences. Despite test paradigm differences, we believe that large cross-sectional studies like MindCrowd, utilizing internet-based testing and remote biosample collection, are vital to moving the field of aging and age-related disease forward (see Opportunities: Unique impact above and ref. 50).

Understanding the modifiable and non-modifiable variables associated with RT and related cognitive function will begin to deconstruct the underlying architecture of elements accounting for the vast heterogeneity seen in individual trajectories of age-associated cognitive decline. Only then will it be possible to develop a healthy brain aging model that is both valid and reliable60. Such a model holds immense potential to attenuate age-related and disease-related cognitive deficits, thus enhancing cognitive healthspan. Any extension of cognitive healthspan, better aligning it to the human lifespan, would be invaluable and increasingly vital when aggregated across the aging population. Mitigating age-related or disease-related cognitive decline, allowing maintenance of independence by even only a few years, would have many benefits. For example, the U.S. could save billions of dollars in health care costs and lost caregivers’ productivity while improving the quality of life for the aging population50. In this study, we revealed several potential factors related to aging and processing speed. Of those, smoking and education, as potentially modifiable factors throughout life, were associated with longer and shorter RTs, respectively, from younger to older ages. With MindCrowd recruitment ever-increasing, our goal is to continue supplying and refining the knowledge necessary to optimize cognitive performance throughout life.

Methods

Study participants MindCrowd: overview

In January 2013, we launched our internet-based study at www.mindcrowd.org. Website visitors 18 years or older were asked to consent to our study before any data collection via an electronic consent form. As of March 17, 2020, we have had 356,674 non-duplicate or distinct visitors to the website. Of these distinct visitors, 194,542 (54%) consented to take part. The final data set had 75,666 (39% of consented individuals) participants who completed the simple visual reaction time (svRT) and paired-associate learning (PAL) tasks and answered 22 demographic, lifestyle, and health questions. The authors confirm they obtained informed consent from each participant and complied with all relevant ethical regulations. Approval for this study was obtained from the Western Institutional Review Board (WIRB study number 1129241).

Study participants MindCrowd: simple visual reaction time (svRT)

After consenting to the study and answering five demographic questions (i.e., age, biological sex, years of education, primary language, and country where they reside), participants were asked to complete a web-based svRT task. We chose svRT because it is a simple central and peripheral nervous system-dependent task influenced by intelligence and brain injury61. Participants were presented with a pink sphere that appeared at random intervals (between 1 s and 10 s) on the screen, and they were instructed to respond as quickly as possible after the sphere appeared by pressing the enter/return key on their keyboard. Once the participant responded, the sphere disappeared until the subsequent trial. Each participant received a total of five trials. The sphere stayed on the screen until the participant responded. The dependent variable, response time in milliseconds (ms), was recorded from the sphere’s appearance on the screen to the participant’s key press or screen touch.

Study participants MindCrowd: paired-associate learning (PAL)

Next, participants were presented with the PAL task. For this cognitive task, during the learning phase, participants were shown 12 word pairs, one word pair at a time (2 s/word pair). During the recall phase, participants were given the first word of each pair and were asked to use their keyboard to type in (i.e., recall) the missing word. This learning-recall procedure was repeated for two more trials. Before beginning the task, each participant received one practice trial consisting of three word pairs not contained in the 12 used during the test. Word pairs were presented in different random orders during each learning and each recall phase. The same word pairs and order of presentation were used for all participants. The dependent variable/criterion was the total number of correct word pairs entered across the three trials (i.e., 12 × 3 = 36, a perfect score).
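As a minimal sketch of how the PAL criterion could be tallied, the example below sums correct recalls over the three trials. The word pairs are placeholders, and the matching rule (case-insensitive, whitespace-trimmed exact match) is an assumption; the text does not state how typed responses were compared with the target words.

```python
def score_pal(word_pairs, responses_by_trial):
    """Total number of correctly recalled targets across all trials.

    word_pairs: dict mapping each cue word to its target word.
    responses_by_trial: list of dicts (one per trial) mapping cue -> typed answer.
    Matching rule is an assumption: case-insensitive, whitespace-trimmed equality.
    """
    total = 0
    for responses in responses_by_trial:
        for cue, target in word_pairs.items():
            typed = responses.get(cue, "")
            if typed.strip().lower() == target.lower():
                total += 1
    return total


# Placeholder pairs for illustration only (the real task used 12 pairs).
pairs = {"lamp": "river", "chair": "cloud"}
trials = [
    {"lamp": "river", "chair": ""},        # trial 1: one correct
    {"lamp": "River ", "chair": "cloud"},  # trial 2: two correct
    {"lamp": "river", "chair": "clown"},   # trial 3: one correct
]
pal_score = score_pal(pairs, trials)
```

With 12 pairs and three trials, the maximum of this sum is 36, matching the perfect score described above.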

Study participants MindCrowd: demographic, medical, health, and lifestyle questions

Upon completing the PAL task, participants were asked to fill out an additional 17 demographic and health/disease risk factor questions. These questions included marital status, handedness, race, ethnicity, number of daily prescription medications, a first-degree family history of dementia, and yes/no responses to the following: seizures, dizzy spells, loss of consciousness (more than 10 min), high blood pressure, smoking status, diabetes mellitus, heart disease, cancer, reported stroke, alcohol/drug abuse, brain disease, and memory problems. Next, participants were shown their results and provided with comparisons to other test takers based on the average scores across all participants’ sex, age, and education demographics. On this same page of the site, participants were given the option to be recontacted for future research (see Supplementary Table 5 for the list of MindCrowd questions asked).

Study participants UK Biobank: study design and aims

The UK Biobank is a long-term study and research resource in the United Kingdom (UK) that investigates links between genetic and environmental exposures and disease development. The UK Biobank’s stated goal is to “build a major resource that can support a diverse range of research intended to improve the prevention, diagnosis, and treatment of illness and the promotion of health throughout society.” The UK Biobank began in 2006. The study is currently following about 500,000 participants in the UK, enrolled at ages 40 to 69. Initial enrollment took place from 2006 to 2010. All participants are monitored for at least 30 years after recruitment and initial assessment (i.e., termed “instance 0” by the Biobank). Potential participants were invited to visit an assessment center, where they completed a questionnaire. Participants were next interviewed about lifestyle, medical history, and nutritional habits. Lastly, vital measurements, such as weight, height, and blood pressure, were taken. The UK Biobank aims to electronically record all health-related changes and events across the entire 30-year study. Notably, this task is aided by the UK’s integrated health system and corresponding electronic health record-keeping, an approach that is not yet possible in the USA.

Study participants UK Biobank: data procurement

All UK Biobank data were derived from Application #43036, entitled “Exploring and Accommodating Heterogeneity in Large-Scale Genetic Analyses” as a “Collaborator Project.” The authors confirm that the UK Biobank obtained informed consent from each participant and complied with all relevant ethical regulations. Approval for this study was obtained from the Research Ethics Committee [11/NW/0382].

Study participants UK Biobank: complex visual recognition reaction time (cvrRT) and educational attainment

Each participant’s cvrRT was based on 12 rounds of the card game Snap. Participants were shown two cards at a time, each with a picture on it. Participants pressed a button on a table in front of them as quickly as possible if the images on the cards matched. For each of the 12 rounds, the following data were collected: the pictures shown on the cards (Index of card A, Index of card B), the number of times the participant clicked the ‘snap’ button, and the latency to first click of the ‘snap’ button. This last record of “latency to click the button” was used as the UK Biobank’s criterion for regression analyses.

For Educational Attainment, the following conversions from UK Biobank (UKBb) answer codes (see http://biobank.ndph.ox.ac.uk/showcase/coding.cgi?id=100305) to MindCrowd (MC) values were made: (a) “UKBb -7 None of the above” to “MC No high school diploma,” (b) “UKBb 2 A levels/AS levels or equivalent” to “MC High school diploma,” (c) “UKBb 3 O levels/GCSEs or equivalent” to “MC High school diploma,” (d) “UKBb 4 CSEs or equivalent” to “MC High school diploma,” (e) “UKBb 5 NVQ or HND or HNC or equivalent” to “MC Some college,” (f) “UKBb 6 Other professional qualifications (e.g., nursing and teaching)” to “MC Some college,” (g) “UKBb 1 College or University degree” to “MC College degree.” All UKBb participants selecting “-3 Prefer not to answer” were removed from the final dataset before model selection and analysis. While we did our best to ensure a similar education measure across UKBb MindCrowd and the UK Biobank, we realize that there are fundamental differences between US and UK schools that we cannot control or eliminate. Table 6 lists the specific UK Biobank data fields from which we derived our factors.
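The conversions listed above can be written as a simple lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical, and it assumes one answer code per participant, whereas the text does not describe how participants with multiple selected qualifications were handled.

```python
# UK Biobank answer code -> MindCrowd education category, as listed in the text.
UKBB_TO_MC_EDUCATION = {
    -7: "No high school diploma",  # None of the above
    2: "High school diploma",      # A levels/AS levels or equivalent
    3: "High school diploma",      # O levels/GCSEs or equivalent
    4: "High school diploma",      # CSEs or equivalent
    5: "Some college",             # NVQ or HND or HNC or equivalent
    6: "Some college",             # Other professional qualifications
    1: "College degree",           # College or University degree
}
PREFER_NOT_TO_ANSWER = -3


def harmonize_education(ukbb_code):
    """Return the MindCrowd category, or None if the participant is to be dropped."""
    if ukbb_code == PREFER_NOT_TO_ANSWER:
        return None
    return UKBB_TO_MC_EDUCATION[ukbb_code]


codes = [1, 3, -3, 6, -7]
harmonized = [harmonize_education(code) for code in codes]
retained = [category for category in harmonized if category is not None]
```

Returning None for “Prefer not to answer” keeps the removal step explicit, so those participants can be filtered out before model selection.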

Table 6 UK Biobank data fields used to derive factors.

Data quality control

For the MindCrowd analysis, a final data set, including all qualifying participants up to March 17, 2020, was generated. See Supplementary Fig. 9 for a flowchart detailing the following filtering steps. This dataset removed participants: (a) with duplicate email addresses (only first entry kept), (b) who did not complete all three rounds of the PAL test, (c) whose primary language was not English, (d) who were not between 18 and 85 years old, (e) whose RT trials were above or below 1.5 × the interquartile range (IQR), and (f) whose median svRT was above or below 1.5 × the IQR of all participants of the same age (Supplementary Fig. 10 details RT and IQR exclusion). Participants from either study were removed if they were missing any data (listwise deletion). Lastly, for the UKBb MindCrowd and UK Biobank analysis, participants were removed if their responses to a demographic, medical, health, and lifestyle question did not match the other study. For example, participants in the UK Biobank who responded to the “Race” question with “Prefer Not to Answer” were removed. “Prefer Not to Answer” was not a choice MindCrowd participants were given on the “Race” question. Removing these participants was done to align UKBb MindCrowd and UK Biobank cohorts as much as possible.
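The IQR-based exclusions in steps (e) and (f) can be sketched as follows. This is a minimal illustration, not the study's filtering pipeline: it assumes the conventional Tukey fences (first quartile minus 1.5 × IQR, third quartile plus 1.5 × IQR) and a particular quartile definition, neither of which is specified in the text, and all function names are hypothetical.

```python
import statistics
from collections import defaultdict


def tukey_fences(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR (assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def within_fences(values, k=1.5):
    """Keep only the values that fall inside the fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if low <= v <= high]


def keep_by_age(records, k=1.5):
    """records: list of (participant_id, age, median_rt_ms).

    Returns the ids whose median RT lies inside the fences computed from all
    participants of the same age. Ages with a single participant are kept,
    which is an assumption made here for simplicity.
    """
    by_age = defaultdict(list)
    for _, age, median_rt in records:
        by_age[age].append(median_rt)
    fences = {a: tukey_fences(v, k) for a, v in by_age.items() if len(v) >= 2}
    kept = []
    for pid, age, median_rt in records:
        low, high = fences.get(age, (float("-inf"), float("inf")))
        if low <= median_rt <= high:
            kept.append(pid)
    return kept


rts_age_30 = [300, 310, 320, 330, 340, 2000]  # made-up median RTs in ms
records = [(f"p{i}", 30, rt) for i, rt in enumerate(rts_age_30)]
records.append(("p6", 50, 450))
kept_ids = keep_by_age(records)
```

Computing the fences within each age group, rather than across the whole cohort, avoids flagging older participants as outliers simply because RT lengthens with age.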

Statistical methods

Statistical analyses were conducted using R62,63 (v4.0.3). For all analyses, multivariate linear regression was performed using the general linear model (LM) to model Median svRT or Median cvrRT (i.e., criterion or dependent variable) as a function of either 24 (MindCrowd) or 11 predictors (UK Biobank analysis). For the MindCrowd analysis, svRT was modeled as a function of PAL Performance and Age raised to the power of three (i.e., to fit and estimate nonlinear associations). Most figures were created using “ggplot2,” which is bundled as a part of the R package “tidyverse64”. Continuous by continuous interactions (i.e., simple slopes) were estimated using the R packages “interactions65,” “sandwich66,” and “jtools67”. Categorical by categorical interactions were estimated using the R package “emmeans”68. Adjustments for multiple comparisons were evaluated using Tukey’s method via the “emmeans” package. Missing data were assessed via the “finalfit69”, “visdat70”, and “naniar71” R packages (see Supplementary Table 6 for a complete list of resources).

All measurements were taken from distinct samples. Model fit and violations of parametric assumptions were evaluated separately in each model. Here, we evaluated different residual plots, assessing normality, homoscedasticity, outliers, residual autocorrelation, and multicollinearity. The MindCrowd LM included all 22 demographic, health, medical, and lifestyle questions. These questions were: Age, Biological Sex, Race, Ethnicity, Educational Attainment, Marital Status, Handedness, Daily Medications, Seizures, Reported Dizziness, Loss of Consciousness, Reported Hypertension, Smoking Status, Heart Disease, Reported Stroke, Alcohol/Drug Abuse, Diabetes Mellitus, Cancer, a First-Degree Family History of Alzheimer’s disease, history of brain disease, whether the test was taken on a mobile device, and the version of the MindCrowd site used. Not surprisingly, the device used to take the RT test in MindCrowd was associated with RT performance. For the UK Biobank analyses, these 11 variables included: Age, Biological Sex, Diabetes mellitus, Handedness, Reported Stroke, Reported Hypertension, Smoking Status, Reported Dizziness, Educational Attainment, and a First-Degree Family History of Alzheimer’s Disease. Examination of each model’s variance inflation factors (VIF) revealed no unexpected factors with a VIF > 5 (i.e., considered “highly” collinear by convention, see Supplementary Table 7).

The MindCrowd analyses included Age and PAL Performance as first through third-degree non-orthogonal polynomials (i.e., cubic regression). This choice was based on empirical evaluations, using Bayesian information criterion (BIC) weights (i.e., Schwarz weights)72. We generated, ran, and recorded BICs across seven models (i.e., base and first-degree through sixth-degree [non-orthogonal] polynomial). BIC weights were calculated from raw BIC values for each model using the “qPCR73” (v1.4–1) R package. The third-degree polynomial model reported the largest BIC weight, and it was 1.46E+270 times more likely to occur than the base (no polynomial) model (BICHighW 9.99E−01 / BICLowW 6.82E−271 = 1.46E+270)72. It is worth noting that a prior study examining both complex and simple RT also included age as a third-degree polynomial. Other similarities included: a relatively large n = 7000, both sexes, an 18–94-year-old age range, and several RT findings72.
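For readers unfamiliar with BIC (Schwarz) weights, the sketch below shows the standard calculation from raw BIC values: each model's weight is exp(−ΔBIC/2) normalized over all candidate models, where ΔBIC is the difference from the lowest BIC. This is a plain Python illustration of that general formula, not the “qPCR” package code used in the analysis, and the BIC values in the example are made up.

```python
import math


def bic_weights(bics):
    """Schwarz weights: exp(-0.5 * delta_BIC), normalized to sum to 1."""
    best = min(bics)
    relative = [math.exp(-0.5 * (bic - best)) for bic in bics]
    total = sum(relative)
    return [r / total for r in relative]


def evidence_ratio(bic_a, bic_b):
    """How many times more likely model A is than model B, given their BICs."""
    return math.exp(0.5 * (bic_b - bic_a))


# Hypothetical BIC values for three candidate models.
example_bics = [1000.0, 1002.0, 1010.0]
weights = bic_weights(example_bics)
ratio_best_vs_second = evidence_ratio(example_bics[0], example_bics[1])
```

Under this formula, the ratio of two weights depends only on the BIC difference between the two models, so a ratio on the order of 1.46E+270 would correspond to a BIC difference of roughly 1,244.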

For the MindCrowd, UKBb MindCrowd, and UK Biobank cohorts, we used the R package “glmulti” (v1.0.8)35 to define our GLM models. glmulti uses full information criterion model selection rather than shrinkage regression methods (e.g., LASSO or LAR)35. We used glmulti to avoid the pitfalls of stepwise selection methods and the unintentional bias introduced via manual or p-value-based model selection. We had glmulti define the “best” (i.e., lowest BIC) MindCrowd and UK Biobank models separately using its genetic algorithm method with marginality set to True. We chose BIC as opposed to other information criterion methods because BIC penalizes model complexity. Two rounds of model selection were run to find pairwise interactions due to package limitations (i.e., millions of potential models). In round 1, the optimal main effects model was selected with all 22 factors included as candidates. In round 2, only the factors selected in the optimal main effects model were then included to select an optimal model, including two-way interactions.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.