Pan-tissue methylation aging clock: Recalibrated and a method to analyze and interpret the selected features
Introduction
Aging is an unavoidable process that takes place in all living organisms. It is a time-dependent decline of physiological processes. Commonly, the number of years since birth is used as the measure of age (Kalache et al., 2002). The rate of aging differs from one organism to another, and among humans, people of the same age show varied physical and health changes (Baker and Sprott, 1988). It is reasonable to hypothesize that DNA methylation is one of the factors that influence the rate of aging (Ferrucci et al., 2020). The past decade has therefore seen many studies trying to establish aging markers based on epigenetic changes. One such line of research is the development of clocks based on methylation patterns, built with machine learning algorithms. In 2013, Hannum et al. developed a methylation age clock with high prediction accuracy using a blood-derived dataset (Hannum et al., 2013). However, the model performed poorly when applied to other tissues. In the same year, Steve Horvath developed a methylation clock using multiple tissues with near-perfect accuracy, known as DNAm age (Horvath, 2013). In later years, several different epigenetic clocks were developed, some with as few as 3 CpG sites (Lin et al., 2016). To predict lifespan, Levine et al. combined DNA methylation data with blood-derived clinical markers and chronological age to develop DNAm PhenoAge (Levine et al., 2018). GrimAge (Lu et al., 2019), another clock model, uses smoking history and plasma protein levels and was developed to predict lifespan and all-cause mortality. Both clocks were better at predicting lifespan and mortality than Horvath's clock, which uses only methylation data.
The recent development of array technology has introduced more target CpG sites; the latest addition to the Illumina arrays is the EPIC methylation BeadChip, which covers over 850,000 CpG sites. In our study, we wanted to include a broad range of DNA methylation sites. We assumed that this inclusiveness would cover more DNA regions that may be involved in the aging process and would result in identifying more important methylation sites during feature selection. Recently, deep learning models have shown promising performance in various fields, suggesting that applying this technology to age prediction could yield better predictive accuracy. A deep learning-derived clock model has also recently been published (Galkin et al., 2021b).
In the present study, we explored the potential of advances in array technology to develop a DNA methylation clock that can predict the age of different tissues. In the process, we developed a model with higher accuracy and lower error on the test data than existing multi-tissue clocks. In addition, this model performed better in predicting the age of tissues that were not used in training. Further, we analyzed the probes selected by the machine learning model to find the biological meaning behind these DNA methylation age probes, as they capture essential characteristics of the aging epigenome (Horvath and Raj, 2018).
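As a rough illustration of what a methylation clock computes, many published clocks take the form of a penalized linear regression: predicted age is an intercept plus a weighted sum of methylation beta values at the selected CpG probes. The sketch below assumes that form. This excerpt does not state the form of the model developed here, and the probe names, weights, and intercept are hypothetical placeholders, not values from this study.

```python
def predict_age(betas, weights, intercept):
    """Predict age from methylation beta values with a linear clock.

    betas:     dict mapping probe ID -> beta value in [0, 1] for one sample
    weights:   dict mapping probe ID -> regression coefficient
    intercept: regression intercept, in years
    """
    # A clock can only be applied if every selected probe was measured.
    missing = [probe for probe in weights if probe not in betas]
    if missing:
        raise KeyError(f"sample is missing clock probes: {missing}")
    # Weighted sum over the selected probes only; other probes are ignored.
    return intercept + sum(w * betas[probe] for probe, w in weights.items())


# Hypothetical clock with three selected probes (illustrative values only).
example_weights = {"cg_example_1": 40.0, "cg_example_2": -25.0, "cg_example_3": 60.0}
example_intercept = 10.0

# Hypothetical sample; "cg_unused" stands for a probe not selected by the clock.
example_sample = {
    "cg_example_1": 0.5,
    "cg_example_2": 0.2,
    "cg_example_3": 0.4,
    "cg_unused": 0.9,
}

predicted = predict_age(example_sample, example_weights, example_intercept)
print(f"predicted age: {predicted:.1f} years")
```

Probes with positive weights raise the predicted age as they gain methylation, and probes with negative weights lower it, which is why the selected probes can be examined individually for biological meaning.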
Dataset collection
All datasets used in this study were collected from the public database Gene Expression Omnibus. The study process is shown graphically in Fig. 1. If the original study of a dataset included disease samples, these were excluded and only the control samples were used. Therefore, all data used in this study came from normal (non-diseased) individuals. We used datasets from 20 different tissues, comprising 4671 samples, for training and testing. The list of datasets
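The control-only selection described in this section can be sketched as follows. This is a minimal illustration, assuming each sample carries a disease-status annotation. The field names (`status`, `tissue`, `age`), the sample records, and the 80/20 split are hypothetical and are not taken from the study.

```python
import random


def keep_controls(samples):
    """Keep only samples annotated as controls (hypothetical 'status' field)."""
    return [s for s in samples if s["status"] == "control"]


def split_train_test(samples, test_fraction=0.2, seed=0):
    """Shuffle reproducibly and split into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(samples)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical sample annotations, standing in for dataset metadata.
all_samples = [
    {"id": "S1", "tissue": "blood", "age": 34, "status": "control"},
    {"id": "S2", "tissue": "blood", "age": 61, "status": "disease"},
    {"id": "S3", "tissue": "liver", "age": 47, "status": "control"},
    {"id": "S4", "tissue": "brain", "age": 72, "status": "control"},
    {"id": "S5", "tissue": "skin", "age": 25, "status": "control"},
    {"id": "S6", "tissue": "colon", "age": 58, "status": "control"},
]

controls = keep_controls(all_samples)
train_set, test_set = split_train_test(controls)
print(len(controls), len(train_set), len(test_set))
```

Filtering before splitting means that no disease sample can end up in either the training or the test set.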
Performance of the model in training and test data
To verify the model's performance, we used several metrics, namely root mean squared error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2), measured between the actual age and the predicted age (Fig. 2A and C). The R2 value gives the proportion of the variance in actual age that is explained by the prediction. RMSE and MAE measure the average error of the prediction. Table 3 shows the performance values of the clock model on both the test and training data. The R2 value for the training data
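The three metrics named in this section can be computed as in the sketch below. It assumes the standard definitions, with R2 taken as one minus the ratio of the residual sum of squares to the total sum of squares. The excerpt does not say which R2 definition the study used, and the ages in the example are made up.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE and R2 between actual and predicted ages."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    ss_res = sum(e * e for e in errors)   # residual sum of squares
    rmse = math.sqrt(ss_res / n)          # penalizes large errors more strongly
    mae = sum(abs(e) for e in errors) / n # average absolute error, in years

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all actual ages are identical")
    r2 = 1.0 - ss_res / ss_tot            # proportion of variance explained

    return {"rmse": rmse, "mae": mae, "r2": r2}


# Made-up ages for illustration only.
actual_ages = [20, 40, 60, 80]
predicted_ages = [22, 38, 63, 79]
metrics = regression_metrics(actual_ages, predicted_ages)
print({name: round(value, 3) for name, value in metrics.items()})
```

RMSE and MAE are both expressed in years, so they can be read directly as typical prediction error, while R2 is unitless.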
Discussion
Numerous distinct epigenetic changes occur with aging (Kane and Sinclair, 2019). In this study, we developed a machine learning model that can accurately predict the age of a person from DNA methylation levels. Although several studies have developed epigenetic aging clock models, this study is only the second, after the Horvath clock, to use multiple tissues to generate a model, and the first to use a larger number of DNA methylation probes. However, it has been proven that the
Conclusions
In this study, we designed an epigenetic clock using the DNA methylation levels of multiple tissues and analyzed the selected probes for their biological function in aging. As a result, we trained a clock model with higher accuracy in predicting the age of a person. During the early adult phase, methylation levels increased at a constant pace (hypermethylation), and the increase slowed after the age of 80. This shows the evident influence of DNA methylation on aging. We analyzed these
CRediT authorship contribution statement
KAV: Conceptualization, Formal analysis, Data curation, Visualization, Writing – original draft, Writing – review & editing. GWC: Conceptualization, Funding acquisition, Supervision, Data curation, Formal analysis, Visualization, Writing – review & editing.
Competing interest statement
The authors declare that they have no competing interests.
Acknowledgments
We thank all the researchers who made the microarray data used in this study available to the scientific community. We also thank Dr. Shivarama holla kayyar and Ruban Kumar for their opinions on computing. This study was supported by a research fund from Chosun University, 2021.
References (33)
- et al. Biomarkers of aging. Exp. Gerontol. (1988)
- et al. ELOVL2: Not just a biomarker of aging. Transl. Med. Aging (2020)
- et al. GenAge: a genomic and proteomic network map of human ageing. FEBS Lett. (2004)
- et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell (2013)
- et al. Regulation of survival networks in senescent cells: from mechanisms to interventions. J. Mol. Biol. (2019)
- et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics (2014)
- et al. A multidimensional systems biology analysis of cellular senescence in aging and disease. Genome Biol. (2020)
- Adult mesenchymal stem cells for tissue engineering versus regenerative medicine. J. Cell. Physiol. (2007)
- et al. Racial disparities in epigenetic aging of the right vs left colon. J. Natl. Cancer Inst. (2020)
- et al. Measuring biological aging in humans: A quest. Aging Cell (2020)
- Preprocessing, normalization and integration of the Illumina HumanMethylationEPIC array with minfi. Bioinformatics
- Aging Dis.
- DeepMAge: a methylation aging clock developed with deep learning. Aging Dis.
- Methylation of ELOVL2 gene as a new epigenetic marker of age. Aging Cell
- DNA methylation analysis on purified neurons and glia dissects age and Alzheimer's disease-specific changes in the human cortex. Epigenetics Chromatin
- DNA methylation age of human tissues and cell types. Genome Biol.