Reclassification of the conventional risk assessment for aging-related diseases by electrocardiogram-enabled biological age

Introduction
Aging is a time-dependent, progressive decline in the physiological functions of cells and tissues. It significantly increases the risk of various aging-related diseases, including neurodegenerative diseases, cardiovascular (CV) diseases, metabolic diseases, musculoskeletal diseases, and immune system diseases1. Chronological age (CA) is therefore used as an essential risk factor in prediction scores for these diseases2,3. For example, the risk of developing coronary artery disease (CAD) increases with age: age >45 years in men and >55 years in women is considered a risk factor when forming the first diagnostic impression of CAD during the patient interview.
However, aging is a complex process that is difficult to quantify or reduce to the simple passage of time4. In addition to CA, approaches that quantify the decline of physiological function in different organs, known as biological age (BA), are used to assess general health status and disease conditions during the aging process4. In a healthy individual, BA should correspond to CA because the elapsed time matches the tissue aging process. BA is measured mainly by biomarkers or by integrating medical records, vital signs, laboratory data5, and epigenetic modifications to assess aging changes within tissues6,7. Various physiological and developmental factors, such as genetics, lifestyle, nutrition, and comorbidities, might accelerate functional decline in human organs, in which case BA could be much higher than CA. Conversely, a person’s level of health and fitness can significantly affect BA, potentially making it lower than CA. One study suggests that a nutrient-rich diet can help decrease BA8. Similarly, higher activity levels have been associated with a lower BA9. Interestingly, an integrated approach involving dietary modification and increased exercise has been shown to reduce BA more effectively than either strategy alone10. Accelerated BA contributes to a higher risk of all-cause mortality11 and CV diseases5,12. BA and CA could therefore be complementary in risk prediction. We hypothesized that risk assessment combining CA and BA might stratify the risk of aging-related diseases better than assessment by CA alone.
Artificial intelligence (AI)-enabled electrocardiogram (ECG) estimates BA (ECG-BA) through cardiac electrical signals and is used to measure cardiac functional decline13. This is correlated to the overall cardiac health of the patients and is used as a mortality predictor13,14. The increased difference between ECG-estimated BA and CA is linked to increased CV mortality and new onset of heart failure, stroke, coronary artery disease, and atrial fibrillation15. These findings suggest that ECG-BA might provide additional information to CA for predicting CV outcomes. Further, it is now feasible to monitor ECG through wearable devices for the detection and prediction of diseases such as arrhythmias, coronary artery disease, sleep apnea, mental health, and epilepsy, as highlighted by Neri et al.16. The integration of wearable ECG technology with AI algorithms can drastically improve the management of cardiovascular health and other conditions by enabling real-time, non-invasive, cost-effective monitoring and early intervention.
Aging is a systemic process that causes different diseases, including cardiac, neurological, and musculoskeletal diseases. The aging-related functional decline in the heart is involved as part of the systemic aging process. It is possible that cardiac BA might be used as a surrogate biomarker to measure systemic aging and predict aging decline and disease onset in the other organs beyond the heart. It remains unknown whether ECG-BA would provide additional significant information to CA for non-CV diseases and help the doctors differentiate the risk for aging-related diseases by considering the cardiac biological aging process.
In the present study, we developed an AI-ECG model to measure cardiac biological age and examined whether ECG-BA could predict the onset of aging-related diseases beyond the heart and improve risk stratification when added to CA.
Results
ECG-BA model
We first derived the predicted ECG-BA using the deep learning model in a healthy population (20–80 years old) without comorbidities or diseases. In this population, ECG-BA should correlate well with CA and reflect BA during physiological cardiac aging. In the test set, the predicted ECG-BA correlated well with CA (r² = 0.70, p < 0.01, Fig. 1). The mean absolute error (MAE) between CA and ECG-BA was 6.25 ± 0.08 years, and the mean absolute percentage error (MAPE) of the age-prediction model was 15.35 ± 0.27%. The present model had a lower MAE and MAPE than other published models (Table 1)13,17.
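As an illustration of the two error metrics reported above, the following minimal sketch computes MAE and MAPE for paired CA and ECG-BA values. The function names and the toy ages are hypothetical and are not study data; the sketch assumes MAPE is taken relative to CA, which the text does not state explicitly.

```python
def mean_absolute_error(chronological, predicted):
    """Average absolute gap, in years, between predicted and chronological age."""
    gaps = [abs(p - c) for c, p in zip(chronological, predicted)]
    return sum(gaps) / len(gaps)


def mean_absolute_percentage_error(chronological, predicted):
    """Average absolute gap expressed as a percentage of chronological age."""
    ratios = [abs(p - c) / c for c, p in zip(chronological, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical toy values (years), not taken from the study.
ca = [40.0, 50.0, 80.0]
ecg_ba = [44.0, 45.0, 88.0]

mae = mean_absolute_error(ca, ecg_ba)              # (4 + 5 + 8) / 3 years
mape = mean_absolute_percentage_error(ca, ecg_ba)  # each subject is off by 10%
print(f"MAE = {mae:.2f} years, MAPE = {mape:.2f}%")
```

Because MAPE divides by CA, the same absolute error weighs more heavily in younger subjects, which is one reason to report both metrics.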

The correlation of ECG-biological age and chronological age.
Diagnostic performance of CA with/without ECG-BA for aging-related diseases
We further established the diagnostic relevance of ECG-BA for aging-related diseases and analyzed whether adding ECG-BA to CA would improve diagnostic performance. In ROC analyses, the areas under the receiver operating characteristic curve (AUCs) for the diagnosis of aging-related diseases were all higher than 0.73 (Table 2). The AUC of CA for CV diseases, including stroke, CAD, peripheral arterial occlusive disease (PAOD), myocardial infarction (MI), aortic stenosis (AS), atrial fibrillation (AF), sick sinus syndrome (SSS), and atrioventricular block (AVB), was 0.82 ± 0.06. The AUC of CA for non-CV diseases, including Alzheimer’s disease (AD), Parkinson’s disease (PD), osteoarthritis (OA), macular degenerative disease (MDD), and cancers, was 0.86 ± 0.07. The AUCs for non-CV diseases were higher than those for CV diseases. Compared with CA alone, the combination of ECG-BA and CA significantly increased the AUC for the CV diseases stroke, CAD, PAOD, and MI. However, for arrhythmia-related CV diseases, adding ECG-BA did not improve diagnostic performance. Among non-CV diseases, the combination of ECG-BA and CA significantly increased the AUC for AD, OA, and cancers compared with CA alone, but did not improve diagnostic performance for MDD or PD.
Net reclassification improvement after adding ECG-BA on CA
Net reclassification improvement (NRI) was calculated to quantify how well a new model (CA with ECG-BA) reclassified subjects compared with a baseline model (CA alone). Adding ECG-BA to CA yielded a net reclassification improvement in disease diagnosis ranging from 0.09 to 0.29 (Fig. 2). After adding ECG-BA, the percentage of successful reclassification that corrected misdiagnoses made by the CA-only model was 21.0 ± 7.6% across the target diseases (range, 9% to 29%). The NRI improved across age categories when ECG-BA was included alongside CA in predicting both CV diseases (PAOD, MI, and CAD) and non-CV diseases (AD, OA, and cancer). However, exceptions were noted in individuals under 40 years for AD and OA (detailed in Table S1). The percentage of diagnoses reclassified between the models with and without ECG-BA was highest for cancers (29%). The NRI for the cancer subgroup analysis is detailed in Fig. S1. These results indicate that simply adding ECG-BA to CA could correct misclassification by the CA-only model by 9% or more in both CV and non-CV diseases.

NRI, net reclassification improvement.
Discussion
CA remains a clinically relevant standard risk factor for aging-related diseases. According to the AUC and NRI analyses, adding ECG-BA to CA might detect biological aging changes and increase the accuracy of risk stratification for aging-related diseases. Our model demonstrated a mean net reclassification improvement of 21% over chronological age alone, highlighting its potential to identify high-risk individuals more accurately (graphic abstract). As the improvement is not confined to CV diseases but extends to non-CV diseases, the results suggest that ECG-BA could be considered a surrogate marker of systemic aging.
In the present study, we found that ECG-BA, combined with CA, had incremental predictive power for aging-related diseases. The proposed ECG-BA model does not replace conventional risk factors such as CA; instead, it is a complementary tool. The NRI suggested that adding ECG-BA to CA could increase the accuracy of disease classification by more than 9%. This finding indicates that a simple surface ECG test could correct the risk assessment for individuals at high risk whom CA alone mistakenly classified as low risk. Chang et al. used the discrepancy between biological and chronological age to predict cardiovascular mortality and the onset of heart failure, stroke, coronary artery disease, and atrial fibrillation15. That approach may overlook the predictive value of chronological age itself for these conditions, whereas our approach retains CA and adds ECG-BA to it. In the present study, the improvement in predictive value was not confined to CV diseases (PAOD, MI, and CAD) but extended to non-CV diseases (AD, OA, and cancers). This supports the view that the present ECG-BA model is not merely a CV risk factor but a surrogate biomarker of systemic aging and of age-related diseases beyond the heart.
In the Women’s Health Study18, a scoring system was developed to predict CV outcomes, including MI, ischemic stroke, coronary revascularization procedures, and death from cardiovascular causes. The risk factors included age, systolic blood pressure, current smoking, total and high-density lipoprotein cholesterol, hemoglobin A1c, high-sensitivity C-reactive protein, and parental history of MI before age 60. The reclassification percentage attributable to each risk factor ranged from 2.8% for parental history of MI to 13.4% for age. Even though these risk factors are already considered clinically important, the average increase in reclassification percentage was only 6.6 ± 3.4%18. The present work demonstrated that a simple measurement of ECG-BA increased the reclassification percentage by more than 9% compared with CA alone. These findings suggest that reclassification by ECG-BA is clinically relevant and that ECG-BA might be incorporated into clinical risk scores to identify patients at high risk of aging-related diseases.
ECG-BA reflects cardiac aging as ECG detects the physiological changes in the myocardium and cardiac electrical conduction13,15. As myocardial aging might share common mechanisms with vascular aging, the close association between myocardial and vascular aging makes the prediction of ECG-BA for the onset of vascular diseases reasonable. However, it seems ECG-BA is not a good predictor for AF or other arrhythmias. This indicates the pathogenesis of arrhythmias might be more complex than the aging process. For instance, AF has been detected in individuals younger than 65 years of age. A multitude of factors, such as hyperthyroidism, smoking, alcohol consumption, and obesity, have been identified as potential inducers of AF. It is intriguing that ECG-BA is also a good predictor for non-CV aging-related diseases, including cancers, OA, and AD. How could these aging-related diseases be linked to an ECG measurement of cardiac physiology? Chronic inflammation might be one of the plausible mechanisms. Inflammaging is a chronic, sterile, low-grade inflammation during the aging process. This chronic systemic inflammation is common ground for the initiation of cardiac aging, CV, and non-CV diseases, and it synchronizes the onset of cardiac aging and non-CV aging diseases. Therefore, the measurement of cardiac aging by ECG-BA could detect the occurrence of the pathogenesis of non-CV diseases. This may explain why ECG-BA might be applied as the predictor for non-CV aging-related diseases.
There is no gold standard for the estimation of cardiac BA. In addition to ECG-BA, other biomarkers related to cardiac aging might be useful complements. We assessed whether the NRI remained significant for CV death when BA was added to conventional CV risk factors, including CA, sex, diabetes mellitus, and hypertension. Including BA alongside CA and sex enhanced the prediction of CV death risk, yielding an NRI of 33%. Similarly, adding BA to CA, sex, diabetes mellitus, and hypertension resulted in an NRI of 32% (Table S2). However, our study database did not include data on systolic/diastolic blood pressure, total cholesterol, or smoking status. As a result, we could not calculate the NRI against conventional risk scores, such as the Framingham risk scores19,20,21. The ECG data were collected from a single tertiary center, so multi-center validation across a more diverse population is essential for broader applicability. We screened for relevant diseases using ICD diagnostic codes and defined patients without the pre-defined diseases as healthy controls for AI analysis; some patients with mild symptoms or minor ECG abnormalities might therefore not have been captured by ICD codes. The present AI model was developed from ECG data acquired on a Philips ECG machine. Validation on a different system, e.g., GE Healthcare, might be needed because ECG formats and settings can differ between machines.
In our study, incorporating ECG-BA into the prediction of aging-related diseases and into traditional risk prediction models significantly enhanced diagnostic accuracy and risk stratification. This suggests that ECG-BA could be employed as an effective screening tool in routine health assessments, helping clinicians triage and prioritize patients for further diagnostic evaluation. For instance, measuring ECG-BA alongside CA could be considered as additional testing to uncover or monitor potential subclinical or emerging pathologies. This approach may help not only in early detection but also in initiating preventive therapies tailored to a patient’s biological age rather than chronological age alone. By advancing ECG-BA as a surrogate biomarker for systemic aging alongside CA, our study significantly enhanced the predictive power for aging-related diseases. This opens new avenues for application beyond cardiovascular health and highlights the potential of ECG-BA to predict a broad spectrum of age-associated diseases.
Methods
Data source and study population
The ECG dataset was collected from Taipei Veterans General Hospital from 2006 to 2017 (Fig. 3). The cases’ label information of ECGs was linked to the hospital database of medical records to obtain a list of healthy individuals over 20 years old who did not have any related diseases, including hypertension, diabetes mellitus, MI, heart failure, pulmonary embolism, stroke, malignancy, chronic kidney disease, chronic lung disease, rheumatic diseases, sick sinus syndrome, atrioventricular block, permanent pacemaker implantation, cardioverter-defibrillator implantation, atrial fibrillation, atrial flutter, ventricular arrhythmia, or ventricular fibrillation and flutter. A total of 51,061 valid ECGs were recorded from a hospital-based cohort of healthy individuals. Ultimately, 48,783 healthy participants, each associated with a valid ECG under sinus rhythm and aged between 20 and 80 years, were enrolled in the study. Figure 3 illustrates the process involved in generating the training, validation, and test sets for the development of the AI ECG-BA model and clinical implementation in the hospital-based cohort.

CA, chronological age; ECG-BA, artificial intelligence-enabled ECG-estimated biological age; NRI, net reclassification improvement.
The Ninth and Tenth Revisions of the International Classification of Disease (ICD-9 and ICD-10) codes were also utilized to determine the presence of medical diseases (details in the Supplementary Methods).
Overview of artificial intelligence model development and testing of the network
The ECG recordings were collected using a Philips ECG machine. The sampling frequency was 500 Hz, with 10 seconds recorded in each lead. Each input was therefore a 12-lead ECG signal comprising 12 leads × 5000 time points (500 Hz × 10 s), paired with the corresponding CA.
We combined a residual network (ResNet), a squeeze-and-excitation network (SENet), and multitask learning to train a new deep learning model linking ECG to CA. First, batch normalization was implemented in the ResNet architecture to mitigate internal covariate shift and prevent vanishing gradients, thereby facilitating the construction of a deeper network without degradation. Second, the SENet method was adopted to give more weight to crucial feature maps. In brief, a convolution layer slides its filters across the width and height of the input data, taking dot products between the filter entries and the input at each position to obtain feature maps. Finally, we used multitask learning to enhance the proposed method by training several related tasks simultaneously, so that each task could benefit from the knowledge acquired by the others.
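To make the squeeze-and-excitation idea concrete, the sketch below applies one such step to a handful of one-dimensional feature maps. It is written in plain Python so that it is self-contained, rather than in PyTorch, and it is not the study's architecture: the channel count, the reduction to two hidden units, the weight values, and the omission of bias terms are all assumptions made for illustration.

```python
import math


def squeeze_excite(feature_maps, w_reduce, w_expand):
    """Apply one squeeze-and-excitation step to 1-D feature maps.

    feature_maps: list of C channels, each a list of T values.
    w_reduce:     (C_reduced x C) weights of the bottleneck layer.
    w_expand:     (C x C_reduced) weights of the expansion layer.
    """
    # Squeeze: global average pooling summarizes each channel as one number.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: bottleneck with ReLU, then expansion with a sigmoid gate.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_expand]
    gates = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # Scale: each channel is reweighted by its gate, a value in (0, 1).
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates


# Hypothetical toy input: 4 channels x 3 time points.
feature_maps = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0],
                [4.0, 4.0, 4.0], [-1.0, 0.0, 1.0]]
# Hypothetical fixed weights; in a trained network these are learned.
w_reduce = [[0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.25, 0.0]]
w_expand = [[1.0, 0.0], [0.0, 0.0], [0.0, -1.0], [1.0, 1.0]]

scaled, gates = squeeze_excite(feature_maps, w_reduce, w_expand)
print([round(g, 3) for g in gates])
```

The gates act as per-channel attention weights: channels with larger gates pass through nearly unchanged, while channels with smaller gates are attenuated.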
To enhance the generalizability of the proposed method and enable a more objective assessment, we employed 5-fold cross-validation to optimize the selection of hyperparameters. During training, the training set was divided into five subgroups of equal size; one subgroup was held out as validation data while the remaining four were used to train the model. The process was repeated five times, and the mean and standard deviation across the five folds were used as the performance measure. The 5-fold cross-validation also identified the best combination of hyperparameters, which was then used to re-train the model on the whole training set before evaluating the trained model on the hold-out test set.
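The cross-validation procedure described above can be sketched as follows. This is an illustrative outline only: `toy_validation_mae` is a hypothetical stand-in for training the network on four folds and scoring it on the fifth, and the candidate learning rates are an assumed grid, not the values searched in the study.

```python
import random
import statistics


def k_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) once per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        validation = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, validation


def toy_validation_mae(train_idx, val_idx, learning_rate):
    """Hypothetical placeholder for 'train on train_idx, score on val_idx'."""
    return 6.0 + 100.0 * abs(learning_rate - 0.01)


results = {}
for lr in (0.1, 0.01, 0.001):  # assumed candidate grid
    fold_scores = [toy_validation_mae(tr, va, lr)
                   for tr, va in k_fold_splits(n_samples=100)]
    # Mean and standard deviation over the five folds summarize performance.
    results[lr] = (statistics.mean(fold_scores), statistics.stdev(fold_scores))

# The setting with the lowest mean validation error is kept for re-training
# on the full training set and a single evaluation on the hold-out test set.
best_lr = min(results, key=lambda lr: results[lr][0])
print(best_lr, results[best_lr])
```

Keeping the hold-out test set outside this loop means the reported test performance is not influenced by the hyperparameter search.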
PyTorch was used to build the deep learning model. The initial batch size and learning rate for each experiment were 800 and 0.01, respectively. The network weights were optimized using the Adam optimizer. Early stopping was adopted to avoid overfitting. However, to prevent early stopping from halting training before the model converged, each experiment was forced to run for at least 25 iterations, and the early-stopping counter did not start until the 26th iteration.
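The delayed early-stopping rule can be sketched as below. The class name, the `patience` value of 5, and the simulated validation losses are assumptions introduced for illustration; only the 25 forced iterations and the counter starting at the 26th iteration come from the description above.

```python
class EarlyStopping:
    """Stop after `patience` non-improving checks, counted only once the
    forced minimum number of iterations has passed."""

    def __init__(self, patience=5, min_iterations=25):
        self.patience = patience              # assumed value, not from the study
        self.min_iterations = min_iterations  # forced iterations before counting
        self.best_loss = float("inf")
        self.bad_iterations = 0

    def should_stop(self, iteration, val_loss):
        """`iteration` is 1-based. Returns True when training should halt."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.bad_iterations = 0
            return False
        if iteration <= self.min_iterations:
            return False  # counter stays idle during the forced iterations
        self.bad_iterations += 1
        return self.bad_iterations >= self.patience


# Simulated validation losses: improvement for 3 iterations, then a plateau.
val_losses = [3.0, 2.0, 1.0] + [1.0] * 60

stopper = EarlyStopping(patience=5, min_iterations=25)
stopped_at = None
for iteration, loss in enumerate(val_losses, start=1):
    if stopper.should_stop(iteration, loss):
        stopped_at = iteration
        break

print(stopped_at)
```

Without the forced minimum, the same plateau would trigger a stop at iteration 8; with it, the counter begins at iteration 26 and training halts at iteration 30.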
The diagnostic performance of ECG-BA with CA for aging-related disease
In this part, we tested the performance of ECG-BA with CA in diagnosing multiple aging-related diseases. The aging-related diseases included cardiovascular diseases (stroke, CAD, PAOD, MI, AS, AF, SSS, AVB) and non-cardiovascular diseases (AD, PD, OA, MDD, and cancers [colorectal cancer, lung cancer, breast cancer, hepatocellular cancer, and prostate cancer]). The study sample, collected from Taipei Veterans General Hospital between 2006 and 2017, comprised cases of prevalent aging-related diseases and healthy controls. Cases of prevalent disease were defined as individuals who were diagnosed with the disease either within 1 month before or after the date their ECG was performed. For each case of prevalent disease, we randomly selected two healthy controls that matched in sex and were free from aging-related diseases. We then collected the corresponding ECGs for both the disease cases and the healthy controls from the ECG dataset, subsequently deriving the ECG biomarkers (ECG-BAs) through our AI model. The study recruited a total of 27,124 individuals with stroke, 40,407 individuals with CAD, 3536 individuals with PAOD, 2521 individuals with MI, 2737 individuals with AS, 10,237 individuals with AF, 1114 individuals with SSS, 867 individuals with AVB, 1521 individuals with AD, 2957 individuals with PD, 30,262 individuals with OA, 1539 individuals with MDD, and 9689 individuals with cancer across independent test sets to evaluate the diagnostic efficacy of the AI model. The performance of CA with/without ECG-BA in diagnosing each of the aging diseases was then assessed using the following procedure (Fig. 4). First, we explored the performance of CA in diagnosing the disease. This is compatible with the usual clinical scenario in which doctors tend to make the first diagnostic impression of aging-related diseases based on CA. For each disease, we built a conditional logistic regression model that included CA. 
The discriminatory performance of the model was assessed based on the AUC. Second, we hypothesized that ECG-BA may improve the performance of CA in diagnosing specific aging-related diseases. Therefore, for each disease, we built a conditional logistic regression model that includes CA and ECG-BA. Then, by testing the AUC differences, we compared the discriminatory performance of the model with CA per se to that of the model with additionally included ECG-BA.
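For readers unfamiliar with the comparison, the sketch below computes the AUC of two sets of predicted risks for the same subjects, using the rank-based definition (the probability that a randomly chosen case receives a higher score than a randomly chosen control). The predicted risks are hypothetical toy numbers, the conditional logistic regression that would produce them is not shown, and the statistical test for the AUC difference is omitted.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half, matching the Mann-Whitney U formulation.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical predicted risks for six subjects (1 = case, 0 = control).
labels = [1, 1, 1, 0, 0, 0]
risk_ca_only = [0.70, 0.40, 0.55, 0.50, 0.30, 0.20]     # model with CA
risk_ca_plus_ba = [0.80, 0.60, 0.65, 0.45, 0.25, 0.20]  # model with CA + ECG-BA

auc_ca = auc(risk_ca_only, labels)
auc_combined = auc(risk_ca_plus_ba, labels)
delta_auc = auc_combined - auc_ca
print(f"AUC (CA) = {auc_ca:.3f}, AUC (CA + ECG-BA) = {auc_combined:.3f}")
```

In this toy example the CA-only model misranks one case-control pair out of nine, and the combined model ranks all pairs correctly.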

Flowchart of the study design for predicting aging-related diseases and analyzing diagnostic performance.
Net reclassification improvement
NRI is an index that attempts to quantify how well a new model reclassifies subjects as compared to an old model. This was used in our present work to overcome the limitation of the analysis by c-statistics of AUC, as the interpretation of small magnitude changes by c-statistics is difficult22,23. Therefore, we assessed the potential impact of ECG-BA by calculating the net reclassification improvement.
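As a hedged illustration, the sketch below computes a category-free (continuous) NRI in its commonly used net form: the net proportion of cases whose predicted risk moves up under the new model plus the net proportion of controls whose predicted risk moves down. The text does not state whether risk categories or thresholds were used in the study, so this should be read as one standard variant rather than the exact calculation performed; the risks shown are hypothetical.

```python
def net_reclassification_improvement(old_risk, new_risk, labels):
    """Category-free NRI: net upward moves in cases plus net downward
    moves in controls, each as a proportion of its own group."""
    up_case = down_case = n_case = 0
    up_ctrl = down_ctrl = n_ctrl = 0
    for old, new, y in zip(old_risk, new_risk, labels):
        if y == 1:
            n_case += 1
            up_case += new > old
            down_case += new < old
        else:
            n_ctrl += 1
            up_ctrl += new > old
            down_ctrl += new < old
    nri_cases = (up_case - down_case) / n_case
    nri_controls = (down_ctrl - up_ctrl) / n_ctrl
    return nri_cases + nri_controls


# Hypothetical predicted risks for four cases followed by four controls.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
risk_ca_only = [0.4, 0.5, 0.6, 0.7, 0.4, 0.3, 0.5, 0.2]
risk_ca_plus_ba = [0.6, 0.6, 0.5, 0.7, 0.3, 0.2, 0.6, 0.2]

nri = net_reclassification_improvement(risk_ca_only, risk_ca_plus_ba, labels)
print(f"NRI = {nri:.2f}")
```

Here two cases move up and one moves down (net 0.25), and two controls move down and one moves up (net 0.25), so the two components sum to 0.50.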
Statistical analysis and model performance assessment
Categorical variables were described with counts and percentages. Continuous variables were reported as means with standard deviations for normally distributed data and the median and interquartile range for non-normally distributed data. The performance of the deep learning model was evaluated by the MAE and MAPE. CA and ECG-BA were assessed and correlated with the Pearson correlation coefficient. Conditional logistic regression analysis was applied to model the relationship between the ECG-BA and aging diseases, as mentioned above. We performed the receiver operating characteristic curve analysis to assess the performance of the model in assigning a higher probability to the case (discriminatory performance) by estimating the area under the receiver operating characteristic curve. We compared the discriminatory performance of the models with and without ECG-BA by testing the differences in the area under the receiver operating characteristic curve. NRI was used to evaluate the improvement offered by ECG-BA in correctly reclassifying individuals (the percentage of individuals with the disease assigned a higher predicted risk plus the percentage of individuals without the disease assigned a lower predicted risk). The significance of NRI statistics was based on approximate normal distributions. Two-sided p values < 0.05 were considered statistically significant. All analyses were completed using SAS version 9.4 and R.