The Adolescent Brain Cognitive Development℠ Study (ABCD) is an ongoing multisite, longitudinal neuroimaging study following a cohort of 11,875 youths over 10 years. In this analysis, we use data from the baseline visits at which participants were 9–10 years old. All data used in the current study (i.e., fMRI, questionnaire, neuropsychological tasks) were collected at a single visit or across two visits that occurred within 30 days of each other. The ABCD study® was approved by the institutional review board of the University of California, San Diego (IRB# 160091). Additionally, the institutional review boards of each of the 21 data collection sites approved the study. Informed consent was obtained from all parents and informed assent was obtained from participants. Data can be accessed through registration with the ABCD study at https://nda.nih.gov/abcd.
Demographics for the full sample are shown in Table 1 and the distribution of ADHD symptomatology is shown in Supplemental Fig. 1. While the ABCD Data Analysis, Informatics & Resource Center (DAIRC) creates several indices of data quality, all exclusions were made by the research team of the current study, starting from the total of 11,875 participants enrolled in the ABCD study. For all MRI/fMRI modalities, participants were excluded if they had incomplete data for the CBCL “Attention Problems” scale, were missing sociodemographic covariates, or failed the FreeSurfer quality control assessment performed by the DAIRC. In the DAIRC FreeSurfer quality control, trained technicians reviewed each subject’s cortical reconstruction, judging the severity of five problem types: motion, intensity inhomogeneity, white matter underestimation, pial overestimation, and magnetic susceptibility artifact. Reconstructions were rated for each problem as “none”, “mild”, “moderate”, or “severe”. A subject was recommended for exclusion if any of the five categories was rated as “severe”, and these recommendations were summarized as an overall binary QC score (i.e., pass or fail), which was used in the current manuscript27. For each fMRI task, participants were excluded for having any missing fMRI data on that task, having fewer than two fMRI scans pass the image quality control performed by the DAIRC (which was similar to the DAIRC FreeSurfer QC reported above27), or failing to meet additional quality control criteria specific to this report. Under these additional criteria, participants were excluded for: (1) hemispheric mean beta-weights more than two standard deviations from the sample mean, (2) fewer than 200 degrees of freedom over the two runs, (3) mean framewise displacement > 0.9 mm for both runs, or (4) failure to meet task-specific performance criteria (described in Casey et al.28).
Because of a data processing error (https://github.com/ABCD-STUDY/fMRI-cleanup), participants scanned on Philips scanners were excluded from all fMRI tasks. Additionally, for the Stop Signal Task (SST) only, a small group of participants was excluded because of a glitch in the task (when the stop signal delay is 50 msec, a response that is faster than 50 msec is erroneously recorded as the response for all subsequent Stop trials; see Garavan et al.29). These exclusions resulted in participant totals of 8596 for structural MRI (sMRI), 5417 for the EN-Back task, 5959 for the monetary incentive delay (MID) task, and 5020 for the SST. Participant exclusion steps are reported in Supplemental Table 1.
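The four additional fMRI quality control criteria listed above can be sketched as a simple filter. This is an illustrative sketch only, not the study's code: the `TaskQC` record and its field names are hypothetical, the outlier check is assumed to be applied per hemisphere against that hemisphere's sample mean and standard deviation, and criterion (3) is assumed to mean that mean framewise displacement exceeded 0.9 mm in both runs.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class TaskQC:
    """Per-participant QC summary for one fMRI task (hypothetical field names)."""
    subject_id: str
    left_mean_beta: float
    right_mean_beta: float
    degrees_of_freedom: int          # summed over the two runs
    mean_fd_run1_mm: float
    mean_fd_run2_mm: float
    passed_task_performance: bool


def additional_qc_exclusions(participants):
    """Return {subject_id: [reasons]} for participants failing any criterion."""
    # ASSUMPTION: sample mean and SD are computed separately for each hemisphere.
    stats = {}
    for hemi in ("left_mean_beta", "right_mean_beta"):
        values = [getattr(p, hemi) for p in participants]
        stats[hemi] = (mean(values), stdev(values))

    excluded = {}
    for p in participants:
        reasons = []
        # (1) hemispheric mean beta more than 2 SD from the sample mean
        for hemi, (m, sd) in stats.items():
            if sd > 0 and abs(getattr(p, hemi) - m) > 2 * sd:
                reasons.append(f"{hemi} outlier")
        # (2) fewer than 200 degrees of freedom over the two runs
        if p.degrees_of_freedom < 200:
            reasons.append("fewer than 200 degrees of freedom")
        # (3) ASSUMPTION: mean FD above 0.9 mm in both runs
        if p.mean_fd_run1_mm > 0.9 and p.mean_fd_run2_mm > 0.9:
            reasons.append("mean FD > 0.9 mm in both runs")
        # (4) task-specific performance criteria not met
        if not p.passed_task_performance:
            reasons.append("task performance")
        if reasons:
            excluded[p.subject_id] = reasons
    return excluded
```

Returning the reasons alongside each excluded ID makes it straightforward to tabulate exclusions step by step, in the spirit of Supplemental Table 1.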
Child behavior checklist
The “Attention Problems” empirically derived syndrome scale of the CBCL parent-report questionnaire was used to evaluate ADHD symptomatology as a continuous variable30. Despite its name, the CBCL “Attention Problems” scale evaluates a broad constellation of ADHD symptomatology (attention, hyperactivity, and impulsivity) and has been shown to be an excellent predictor of ADHD diagnosis derived from clinical interview31,32. Additionally, the CBCL internalizing composite was used as a covariate to account for depression and anxiety, which are frequently comorbid in children with ADHD. This composite is made up of the anxious/depressed scale, the withdrawn/depressed scale, and the somatic symptoms scale.
A demographics questionnaire was administered to the participant’s parent/guardian to determine demographic information including sex, age, race, household income, and parental education.
Pubertal Development Scale
Pubertal status was assessed using the pubertal development scale33, which was completed by a parent/guardian and by the participant, with results of the two being averaged. This measure has been shown to have good reliability and to correspond with accepted self-report and biological measures of pubertal development33.
Edinburgh handedness inventory – short form
The Edinburgh Handedness Inventory – Short Form34 is a 4-item self-report scale that produces a handedness score of “right”, “mixed”, or “left” by asking about preferred hand used for writing, throwing, using a spoon, and using a toothbrush. The measure has been shown to have good reliability and to correlate highly with longer measures of handedness34.
Medications survey inventory
Parents completed a survey in which they listed the names and dosages of all medications taken by the child. From this we created a binary variable indicating whether participants were taking one or more stimulant medications prescribed to treat ADHD (e.g., Adderall, Ritalin, amphetamine). For a full list of medications included in this variable, see Supplemental Table 2.
The tasks used in the current study have been described previously28,35 and are detailed in Supplemental Materials; a schematic of the tasks is shown in Supplemental Fig. 2. In short, the EN-Back task was a modified version of a traditional N-Back task in which participants viewed a series of stimuli and for each responded whether that stimulus matched the one they saw N items ago (i.e., “N back”). The current task version incorporated added elements of facial and emotional processing, though these were not a focus of the current analysis. The task had two conditions: a 2-back as its active condition and a 0-back as the baseline condition. d′ (z(hit rate) − z(false alarm rate)) was used as the performance measure for the EN-Back task. The MID task included both anticipation and receipt of reward and loss. In this task, participants viewed an incentive cue for 2 sec (anticipation) and then quickly responded to a target to win or avoid losing money ($5.00 or $0.20). Participants were then given feedback about their performance (receipt). The baseline used was “neutral” trials in which participants completed the same action but with no money available to be won or lost. The current study focused only on reward trials (i.e., trials in which participants win money), as most prior research on ADHD symptomatology has focused on reward trials10. The SST consisted of serial presentations of leftward and rightward facing arrows. Participants were instructed to indicate the direction of the arrows using a two-button response box (the “go” signal), except when the left or right arrow was followed by an arrow pointing upward (the “stop” signal). Participants were also instructed to respond as “quickly and accurately as possible”. Trials were then categorized based on the participant’s accuracy (“correct” and “incorrect”).
The performance variable used for the SST was stop signal reaction time (SSRT), which represents the duration required to inhibit a “go” response after a “stop signal” has been presented and functions as an index for inhibitory speed.
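As an illustration of the EN-Back performance measure described above, d′ = z(hit rate) − z(false alarm rate) can be computed as follows. This is a minimal sketch rather than the study's code; the function and argument names are hypothetical, and the handling of rates of exactly 0 or 1 (pulled inward by half a trial) is an assumption, since the text does not state how such cases were treated.

```python
from statistics import NormalDist


def _bounded_rate(count, total):
    """Proportion kept strictly inside (0, 1) so the z-transform is finite.

    ASSUMPTION: extreme rates are pulled inward by half a trial, 1 / (2 * total).
    """
    rate = count / total
    floor = 1.0 / (2 * total)
    return min(max(rate, floor), 1.0 - floor)


def d_prime(hits, misses, false_alarms, correct_rejections):
    """d' = z(hit rate) - z(false alarm rate), z being the inverse normal CDF."""
    n_targets = hits + misses
    n_lures = false_alarms + correct_rejections
    if n_targets == 0 or n_lures == 0:
        raise ValueError("need at least one target trial and one non-target trial")
    z = NormalDist().inv_cdf
    hit_rate = _bounded_rate(hits, n_targets)
    false_alarm_rate = _bounded_rate(false_alarms, n_lures)
    return z(hit_rate) - z(false_alarm_rate)
```

A participant who responds identically to targets and non-targets gets d′ = 0, and larger values indicate better discrimination.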
Magnetic resonance imaging acquisition and data processing
Structural and functional MRI scans were acquired at sites across the United States using 26 different scanners from two vendors (Siemens and General Electric); three additional sites used Philips scanners, and their data were excluded from analyses because of an error in processing prior to release. MRI sequences are reported in Supplemental Methods and in prior work31. All sMRI and fMRI data were preprocessed by the DAIRC using pipelines that have been detailed in prior work27. Briefly, sMRI data were preprocessed using FreeSurfer (version 5.3)27 to produce CT and CSA measures for each of the 74 Destrieux atlas36 regions of interest in each hemisphere (148 regions total) and GMV for nine subcortical regions in FreeSurfer’s ASEG parcellation in each hemisphere (18 regions total), plus the brainstem, which was not split by hemisphere. All structural MRI data were visually examined by a trained ABCD technician, who rated them from zero to three on five dimensions: motion, intensity inhomogeneity, white matter underestimation, pial overestimation, and magnetic susceptibility artifact. From these ratings, an overall score was generated recommending inclusion or exclusion27. All subjects recommended for exclusion based on their FreeSurfer data were excluded from all analyses in the current study, including fMRI analyses, because the fMRI processing pipeline relies on the FreeSurfer cortical reconstruction.
fMRI data were preprocessed using a multi-program pipeline that yielded neural activation in these same cortical and subcortical regions for each fMRI contrast. The contrasts used for the SST were incorrect stop – correct go and correct stop – correct go; for the EN-Back the only contrast was 2-Back vs. 0-Back; for the MID, contrasts were reward anticipation – neutral anticipation and positive reward outcome – negative reward outcome (i.e., win – loss).
To investigate the relationship between covariates and ADHD symptomatology, we examined the association of each covariate with ADHD symptomatology in a separate mixed effects model for each covariate using R version 3.6.1. Code for all analyses is available at https://github.com/owensmax/ADHD. Covariates were participants’ age, sex, race, pubertal status, handedness, internalizing symptom score from the CBCL, parent’s highest education level, and family income. An additional analysis also included stimulant medication status (yes/no) as a covariate, so we also examined the association of stimulant medication status with ADHD symptomatology. Additionally, framewise displacement from the relevant fMRI scan was used as a covariate, and for structural MRI the average framewise displacement across the fMRI scans was used. In structural MRI analyses, total intracranial volume was also used as a covariate. To account for the large numbers of siblings and multiple data collection sites, family ID (used to denote sibling status) was modeled as a random effect nested inside a random effect of scanner in all mixed effects models (see Footnote 1), as has been recommended37 and is standard on the ABCD Data Exploration and Analysis Portal (DEAP). This nested approach was used because all siblings in each family were scanned at the same site. In these analyses, the covariates were the independent variables and ADHD symptomatology was the dependent variable. Additionally, to investigate whether working memory and inhibitory control were associated with ADHD symptomatology, we examined the association of behavioral performance on the EN-Back and Stop Signal tasks with ADHD symptomatology. d′ (z(hit rate) − z(false alarm rate)) was used as the performance measure for the EN-Back task and stop signal reaction time was used as the performance metric for the SST.
In these analyses, the performance metrics were used as the independent variables, along with all fixed effect covariates, and ADHD symptomatology was used as the dependent variable.
Primary analyses: elastic net regression
Elastic net regression was used to build predictive models for each of the imaging modalities (structural MRI, EN-Back, SST, and MID) using the glmnet package in MATLAB R2018b. Separate models were built for each MRI paradigm (i.e., each of the three fMRI tasks plus sMRI), with all brain variables used as features (i.e., independent variables) and ADHD symptomatology used as the target (i.e., dependent variable). Three versions of the analysis were run: one in which covariates were not accounted for, one in which the ADHD symptomatology score was residualized so that its shared variance with the covariates was removed, and one in which covariates including medication status were residualized from ADHD symptomatology.
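To make the residualization step above concrete, the sketch below removes the variance in a target that is linearly shared with a set of covariates. It is a hedged illustration, not the authors' procedure: ordinary least squares with an intercept is assumed as the residualizing model (the text does not specify the model used), covariates are assumed to be numeric already (categorical ones would need dummy coding), and both function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariates are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residualize(target, covariates):
    """Return target minus its OLS fit on an intercept plus the covariates.

    ASSUMPTION: plain OLS via the normal equations; `covariates` is a list of
    rows, one numeric row per participant.
    """
    design = [[1.0] + list(row) for row in covariates]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return [y - sum(b * v for b, v in zip(beta, r)) for r, y in zip(design, target)]
```

The residuals sum to zero and are uncorrelated with every covariate, so a model trained on them cannot draw on covariate-related variance in the target.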
Initially, data were divided in an 80%/20% split, with the 80% used as a training/validation set for model building and the 20% used as a final external test set. Then, within the training/validation set, elastic net regression model training was conducted in a 5-fold internal cross-validation framework, with four folds (80% of the training/validation data) used for model training and the remaining fold (20%) used as an internal validation set to assess the model’s performance out-of-sample. Regularization hyperparameter tuning was conducted through a further nesting of a 20-fold cross-validation within the 5-fold cross-validation (see Supplemental Methods for details on the cross-validation framework). Prediction accuracy was measured in R². For the internal cross-validation, each R² represents the accuracy of predicting the internal validation set (the 5th fold) using the model built on the training set (folds 1–4); to confirm the model’s generalizability, another R² was derived by predicting the external test set (i.e., 20% of total data) using the most successful of the 5 models built in the 5-fold cross-validation.
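The data-splitting scheme described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: function names and the seed are hypothetical, participants are shuffled individually (family structure is not handled here), R² is assumed to be the coefficient of determination rather than a squared correlation, and the inner 20-fold loop for hyperparameter tuning is omitted.

```python
import random


def holdout_and_folds(n, test_fraction=0.2, n_folds=5, seed=0):
    """Split indices 0..n-1 into an external test set plus k-fold CV splits.

    Returns (test_indices, [(training_indices, validation_indices), ...]).
    """
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    test, development = indices[:n_test], indices[n_test:]

    # Deal the development set into n_folds roughly equal folds.
    folds = [development[i::n_folds] for i in range(n_folds)]
    splits = []
    for k in range(n_folds):
        validation = folds[k]
        training = [i for j, fold in enumerate(folds) if j != k for i in fold]
        splits.append((training, validation))
    return test, splits


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total.

    ASSUMPTION: this definition of R² is used; it can be negative when
    predictions are worse than predicting the mean.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

With 100 participants this gives 20 held out for external testing and five splits of 64 training and 16 validation participants each; the external test indices never appear in any split.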
Primary analysis: confirmatory univariate analyses
To aid with the interpretation of elastic net regression analyses, follow-up analyses were conducted testing the associations of ADHD symptomatology with each of the brain features from the best elastic net regression model from each modality. This was done with a separate mixed effects model for each brain feature, which was used as a fixed effect. Covariates were fixed effects, family and scanner were random effects, and ADHD symptomatology was the dependent variable. Since these analyses were confirming relationships of coefficients found in elastic net models, a threshold of p < 0.05 was used to indicate significance. Regions included in elastic net regression models were only considered as neural correlates of ADHD symptomatology if they were also associated in the same direction in confirmatory univariate analyses.
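The decision rule in the preceding paragraph (a region counts as a neural correlate only if it was retained by the elastic net and was also significantly associated in the same direction in the univariate model) can be expressed compactly. This is a minimal sketch with hypothetical names and data structures; it assumes that "included in the elastic net model" means a nonzero coefficient.

```python
def confirmed_regions(elastic_net_coefs, univariate_results, alpha=0.05):
    """Regions with a nonzero elastic net coefficient that are also significant
    (p < alpha) with the same sign in the confirmatory univariate model.

    elastic_net_coefs:  {region: coefficient}
    univariate_results: {region: (estimate, p_value)}
    """
    confirmed = []
    for region, coef in elastic_net_coefs.items():
        # ASSUMPTION: a zero coefficient means the region was not selected.
        if coef == 0 or region not in univariate_results:
            continue
        estimate, p_value = univariate_results[region]
        same_direction = estimate != 0 and (coef > 0) == (estimate > 0)
        if p_value < alpha and same_direction:
            confirmed.append(region)
    return confirmed
```

Regions whose univariate estimate is significant but opposite in sign to the elastic net coefficient are dropped, which guards against interpreting coefficients that only arise from shared variance among features.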
Secondary analysis: categorical analyses
Because of the non-normal distribution of ADHD symptomatology (see Supplemental Fig. 1), we also completed a supplementary analysis to ensure that results were not being driven by this distribution. For this analysis we created a categorical variable of ADHD symptomatology based on a tertile split of the variable (group 1: no symptoms, 0 on CBCL; group 2: low symptoms, 1–3 on CBCL; and group 3: high symptoms, ≥3 on the CBCL). The three ADHD symptomatology groups were roughly equal in size (see Supplemental Fig. 3). Then we repeated the primary analyses using this categorical ADHD symptomatology variable as the dependent variable. Additionally, to determine if continuous findings of ADHD symptomatology would generalize to a clinical diagnosis of ADHD, we also repeated the primary analysis using ADHD diagnosis from the K-SADS as the dependent variable (K-SADS measure described in Supplemental Methods). There were 727 participants with an ADHD diagnosis vs. 7745 controls with valid sMRI data (N.B., 124 participants from the primary analyses did not have K-SADS ADHD diagnosis data and were excluded from this analysis). See Supplemental Fig. 4 for visualization of distribution of K-SADS diagnosis.