Assessment of polygenic risk score performance in East Asian populations for ten common diseases

Introduction

Genome-wide association studies (GWASs) have revolutionized our understanding of complex traits by identifying a large number of genetic variants associated with them1,2,3. However, individual genetic variants often contribute only modestly to phenotypic variation, even in highly heritable traits4. This underscores the polygenic nature of most complex traits, in which numerous genetic variants with small effects collectively influence the trait variance5. Consequently, the polygenic risk score (PRS) has emerged as a valuable predictive tool. The PRS aggregates risk information from numerous genetic variants and offers a cumulative measure of an individual’s genetic susceptibility to a disease6. This field is rapidly progressing with advances in methods7 and cataloging8.
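
As a minimal illustration of how a PRS aggregates variant-level information, the sketch below computes a score as the weighted sum of risk-allele dosages. The function name, effect sizes, and dosages are hypothetical and are not taken from this study; the PRS methods evaluated here re-estimate the weights before this summation step.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per variant).

    effect_sizes are per-allele weights, e.g. log odds ratios from a GWAS.
    Both inputs are hypothetical illustrations.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * w for d, w in zip(dosages, effect_sizes))


# Three hypothetical variants for one individual.
example_effect_sizes = [0.10, -0.05, 0.20]
example_dosages = [2, 1, 0]
example_score = polygenic_risk_score(example_dosages, example_effect_sizes)
print(round(example_score, 4))
```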

The PRS has demonstrated the potential to stratify individuals by disease susceptibility in Europeans9,10. PRS estimation revealed a significant increase in risk in the high-risk group. Specifically, individuals in the top 8.0% of the PRS distribution for coronary artery disease (CAD), 6.1% for atrial fibrillation, 3.5% for type 2 diabetes, 3.2% for inflammatory bowel disease, and 1.5% for breast cancer experienced a three-fold increased risk compared to the remaining group9. Additionally, significant differences in obesity prevalence (body mass index [BMI] ≥30 kg/m²) were observed across the deciles of PRS for BMI10. However, this type of PRS analysis has not been well explored beyond European populations.

Recently, large-scale GWASs have been expanded to include other ethnic groups11. In the GWAS Catalog database, East Asian GWAS summary statistics are the second most numerous after European ones12. GWASs were conducted on 220 traits using data from BioBank Japan (BBJ), comprising 170,000 Japanese individuals, the largest East Asian sample studied by GWAS to date13. However, a significant limitation in PRS prediction stems from the smaller sample sizes available for East Asians than for Europeans14. For example, a meta-analysis of standing height included 4,080,687 Europeans and 472,730 East Asians14. This difference in sample size affected the statistical power, resulting in a disparity in SNP heritability for height (50% in Europeans vs. 35% in East Asians). SNP heritability is closely associated with PRS performance metrics, such as the squared correlation (R2) between PRS and trait6,15. Consequently, it may be challenging to achieve predictive accuracy comparable to that observed in European studies.

Therefore, leveraging well-analyzed European GWAS data is crucial to achieve higher predictive performance of the PRS for East Asians16. Practical challenges arise because linkage disequilibrium (LD) patterns differ between East Asian and European populations, rendering the direct use of European GWAS data for PRS estimation in East Asians impractical17. One existing cross-population approach meta-analyzes GWAS summary statistics across multiple populations using the inverse-variance method and then constructs a PRS from the independent single nucleotide polymorphisms (SNPs) that reach statistical significance through the P + T method13,16,18,19. However, this approach does not incorporate population-specific alleles, frequencies, and LD patterns. To address these limitations, the PRS-CSx and CT-SLEB methods were developed to improve cross-population PRS prediction in non-European populations16,20. These methods integrate GWAS data from various ethnic groups with large-scale GWAS data from Europeans to construct the PRS for non-Europeans. PRS-CSx employs a Bayesian technique to enhance the accuracy of PRS prediction by considering genetic effects and LD diversity across distinct ethnic groups16. By leveraging the relationships between genetic associations and LD patterns in distinct ethnic groups, PRS-CSx increases the effective sample size while accommodating specific genetic variation within each ethnic group16. CT-SLEB, on the other hand, is a cross-population empirical Bayes method that combines effect size estimates from multiple GWAS summary statistics across ancestries20. It adjusts for differences in population-specific LD and allele frequencies by borrowing statistical power from larger European GWAS datasets while incorporating population-specific details from the target cohort. CT-SLEB therefore provides improved cross-population PRS predictions, maintaining fidelity to the target population while using well-analyzed European data for better accuracy20.
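
The inverse-variance meta-analysis step mentioned above can be sketched for a single SNP as follows. This is a generic fixed-effect formula rather than the pipeline of any cited study, and the effect sizes and standard errors are hypothetical.

```python
import math


def inverse_variance_meta(betas, standard_errors):
    """Fixed-effect inverse-variance meta-analysis of one SNP.

    Each population's effect estimate is weighted by 1 / SE^2; the pooled
    standard error is sqrt(1 / sum of weights).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Hypothetical per-population estimates for the same SNP
# (first entry: a larger, more precise GWAS; second: a smaller one).
beta_meta, se_meta = inverse_variance_meta([0.10, 0.20], [0.10, 0.20])
print(round(beta_meta, 4), round(se_meta, 4))
```

Because the pooled estimate is a single value per SNP, it carries no population-specific LD or allele-frequency information, which is the limitation noted in the text.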

Recently, PRSs in East Asians have been assessed using cross-population PRS methods21,22. The predictive performance of PRSs for type 2 diabetes was assessed using several cross-population methods, such as PRS-CSx, PRCS-meta, and LDpred2-meta. Among these methods, PRS-CSx significantly increased the predictive performance for type 2 diabetes in East Asians22. PRSs in East Asians were also assessed using PRS-CS and PRS-CSx for inflammatory bowel disease (IBD), Crohn’s disease (CD), and ulcerative colitis (UC)21. In the Chinese population, PRS-CSx achieved liability-scale R2 values of 8.0% for CD, 6.5% for IBD, and 5.5% for UC. In contrast, PRS-CS, trained only with East Asian GWAS data, exhibited lower liability R2 values for CD (6.4%), IBD (4.7%), and UC (3.2%). Compared with PRS-CS, PRS-CSx exhibited an average enhancement of 1.5% in risk prediction for these diseases21. While various PRSs have already been attempted in East Asians, efforts to improve predictive performance through cross-population PRSs are ongoing. It is, therefore, necessary to evaluate these approaches across multiple diseases to confirm their potential in enhancing predictive performance in East Asian populations.

In this study, we assessed the predictive performance of East Asian PRSs, including single-population PRSs (PRS-CS, LDpred2, and lassosum) and cross-population PRSs (PRS-CSx and CT-SLEB), for ten common diseases in the Health Examinees (HEXA) cohort in Korea23, comprising 58,700 Koreans. We selected these ten common diseases (asthma, cataract, cholelithiasis, colon polyp, CAD, hypertension, obesity, osteoporosis, stroke, and type 2 diabetes) because each had a prevalence greater than 1% in the HEXA cohort. Additionally, we assessed East Asian PRSs through Cox regression analysis of follow-up data from the Korea Association Resource (KARE)24, comprising 8840 individuals.

Results

Basic characteristics

The baseline characteristics of participants in the HEXA cohort23 are listed in Supplementary Table 1. This study included 58,700 Korean individuals (65.43% female) with an average age of 53.80 years, a mean height of 160.72 cm, and an average BMI of 23.89 kg/m2. We selected ten common diseases with a prevalence greater than 1% in the HEXA cohort (Table 1). The prevalence rates ranged from 1.18% (stroke) to 45.91% (hypertension). Additionally, data on the risk factors for each disease were included based on the Mayo Clinic guidelines (https://www.mayoclinic.org/). These basic characteristics of participants are summarized in Table 1. For all diseases, BMI and age were higher in cases than in controls. For CAD, osteoporosis, and stroke, a family history was more frequent in cases than in controls. We also observed that the risk factors differed significantly between cases and controls in an unfavorable direction. For example, for stroke, the following risk factors differed significantly (controls vs. cases): systolic blood pressure (122.40 ± 14.77 vs. 127.21 ± 15.05), diastolic blood pressure (75.75 ± 9.73 vs. 77.03 ± 9.66), high density lipoprotein (53.80 ± 13.15 vs. 49.75 ± 12.09), coronary artery disease (2.80% vs. 7.22%), and type 2 diabetes (8.57% vs. 21.65%).

Table 1 Basic characteristics of ten common diseases in HEXA cohort

Predictive performance of PRS calculated using single-population PRS methods in the HEXA cohort

We calculated the PRSs for ten common diseases using East Asian GWASs and single-population PRS methods, namely PRS-CS, LDpred2, and lassosum. The GWAS summary statistics for East Asians were obtained from the GWAS Catalog (https://www.ebi.ac.uk/gwas/), and their sample sizes are listed in Supplementary Table 2 (refs. 13,25,26). The sample sizes of GWAS summary statistics ranged from 51,442 (CAD) to 341,204 (asthma). The majority of GWAS summary statistics, excluding those for asthma, were obtained from Japanese datasets13,26. For asthma, we used the meta-analysis GWAS summary statistics provided by the Global Biobank Meta-analysis Initiative (GBMI) (n = 341,204)25.

We assessed the performance of single-population PRSs (PRS-CS, LDpred2, and lassosum) in the HEXA cohort (n = 55,870). We used four statistical measures of predictive performance: 1) the likelihood ratio test (LRT) to assess the fit of the logistic regression model for PRSs; 2) the odds ratio per standard deviation of the PRS (perSD OR) to quantify the effect size of the PRS; 3) the net reclassification improvement (NRI) to assess the enhancement of individual classification; and 4) the area under the curve (AUC) to evaluate the discriminatory ability of the PRS in distinguishing between cases and controls. The results are presented in Table 2.

Table 2 Predictive performance metrics for single-population PRSs

In Table 2, the term “deviance” for the LRT indicates the improvement in goodness of fit obtained by comparing the models with and without the PRS. It reflects how much better the model with the PRS fits the dataset and is calculated from the difference in log-likelihood between the two models. All PRS-CSs were statistically significant (P < 5.00E-03; 0.05/10). The PRS-CS for obesity exhibited the highest deviance in the LRT (2644.60), whereas the PRS-CS for cataracts exhibited the lowest (11.85). All PRS-CSs were also statistically significant for the perSD OR (P < 5.00E-03; 0.05/10) (Table 2). The perSD OR was the highest for type 2 diabetes (2.03) and the lowest for cataracts (1.08). For the NRI, the PRS-CSs for seven diseases (asthma, cataracts, CAD, hypertension, obesity, osteoporosis, and type 2 diabetes) were statistically significant (P < 5.00E-03; 0.05/10), whereas those for the other three diseases (cholelithiasis, colon polyps, and stroke) were not significant after multiple-testing correction (P ≥ 5.00E-03) (Table 2).
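
The relationship between the LRT deviance, its P value, and the Bonferroni threshold (0.05/10) can be sketched as follows. This is a generic illustration that assumes the PRS adds a single parameter to the model (one degree of freedom); the log-likelihood values are hypothetical and were chosen only so that the deviance equals 11.85, and the function names are not from the study's code.

```python
import math


def lrt_deviance(loglik_without_prs, loglik_with_prs):
    """Deviance of the LRT: twice the gain in log-likelihood from adding the PRS."""
    return 2.0 * (loglik_with_prs - loglik_without_prs)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def per_sd_odds_ratio(beta_per_sd):
    """Odds ratio per SD, given the logistic coefficient of a standardized PRS."""
    return math.exp(beta_per_sd)


N_DISEASES = 10
BONFERRONI_ALPHA = 0.05 / N_DISEASES  # 5.00E-03

# Hypothetical log-likelihoods of the models without and with the PRS.
deviance = lrt_deviance(-1005.925, -1000.0)
p_value = chi2_sf_1df(deviance)
is_significant = p_value < BONFERRONI_ALPHA
print(round(deviance, 2), p_value, is_significant)
```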

The LDpred2 PRSs were not significant for NRI in one additional disease, coronary artery disease, besides cholelithiasis, colon polyp, and stroke. The lassosum PRSs were additionally not significant for NRI in asthma, cataract, and osteoporosis. For the lassosum PRSs, cataract, colon polyp, coronary artery disease, osteoporosis, and stroke were also not significant for the LRT and perSD OR.

We compared the AUC values of the three single-population PRSs for each disease (Fig. 1). PRS-CS demonstrated the best values in all ten diseases. In addition, we investigated the AUC difference between the full model (PRS + age + sex) and the baseline model (age + sex) (Table 2). PRS-CS provided the largest improvement in AUC for type 2 diabetes (76.68% for the full model vs. 70.07% for the baseline model).
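
For reference, the AUC can be understood as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it by direct pairwise comparison, counting ties as one half; the scores are hypothetical, and in the analysis described here the inputs would be predictions from the full or baseline logistic model rather than raw PRS values.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks for three cases and four controls.
example_auc = auc_from_scores([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
print(round(example_auc, 4))
```

The pairwise loop is quadratic in sample size, so for a cohort the size of HEXA a rank-based implementation would be preferable; the simple form is used here for clarity.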

Fig. 1: Comparison of AUCs for single-population PRS methods across ten common diseases.

Predictive performance of cross-population PRS in the HEXA cohort

We assessed the performance of cross-population PRSs for ten common diseases using PRS-CSx, which re-estimates SNP effect sizes from both East Asian and European GWASs using a Bayesian technique, and CT-SLEB, which employs an empirical Bayes approach to integrate population-specific LD patterns, enhancing the prediction accuracy across ancestries. The GWAS summary statistics for both Europeans and East Asians were obtained from the GWAS Catalog (https://www.ebi.ac.uk/gwas/) for the cross-population PRS. The East Asian summary statistics were the same as those used for the single-population PRS (Supplementary Table 2)13,25,26,27,28,29,30,31. The sample sizes of the GWAS summary statistics for Europeans ranged from 184,481 (CAD) to 1,339,889 (type 2 diabetes). As anticipated, the sample sizes of the European GWAS summary statistics were larger than those of the East Asian GWAS summary statistics for all diseases (Supplementary Table 2).

We assessed the predictive performance of PRS-CSx in the HEXA cohort (Table 3). All ten PRS-CSxs met the statistical significance of the LRT (P < 5.00E-03; 0.05/10). In the LRT, the PRS-CSx for obesity exhibited the highest deviance (2956.60), while the PRS-CSx for cataracts exhibited the lowest deviance (12.36). Additionally, all PRS-CSxs met the statistical significance of the perSD OR, with type 2 diabetes exhibiting the highest (2.10) and cataracts having the lowest values (1.08). All PRS-CSxs satisfied the statistical significance of the NRI.

Table 3 Predictive performance for the cross-population PRS

However, CT-SLEB demonstrated a lack of statistical significance across three metrics (LRT, perSD OR, and NRI) for three diseases: cataract, cholelithiasis, and osteoporosis. Additionally, it showed a lack of statistical significance for NRI in asthma (Table 3). Among the seven diseases where LRT was statistically significant, the CT-SLEB for obesity exhibited the highest deviance (2728.90), while the CT-SLEB for stroke showed the lowest deviance (10.62). Similarly, among the seven diseases where perSD OR was statistically significant, the CT-SLEB for type 2 diabetes had the highest perSD OR (2.33), whereas the CT-SLEB for colon polyp had the lowest perSD OR (1.15).

The performance comparison between two cross-population PRSs revealed that PRS-CSx showed better LRT performance in six of the seven diseases. In the comparison for perSD OR, PRS-CSx performed better in three diseases, while CT-SLEB performed better in four diseases. This trend was more pronounced in NRI, with CT-SLEB showing superior performance in five of the six diseases where NRI was statistically significant.

We compared the AUC values of two cross-population PRSs across ten diseases (Fig. 2). PRS-CSx showed higher AUC values in six diseases, including asthma, cataract, cholelithiasis, CAD, osteoporosis, and stroke. Additionally, we analyzed the difference in AUC between the full model (PRS + age + sex) and the baseline model (age + sex) (Table 3). CT-SLEB demonstrated the largest improvement in AUC for type 2 diabetes (78.04% vs. 68.80%).

Fig. 2: Comparison of AUCs for cross-population PRS methods across ten common diseases.

Comparison between the single-population and cross-population PRSs in the HEXA cohort

We evaluated whether the cross-population PRS improved the AUC value compared to the single-population PRS. For cholelithiasis, colon polyp, and stroke, the single-population PRS did not achieve statistical significance for NRI and was therefore excluded from this analysis. Consequently, a total of eight diseases were included in the comparison. To measure the improvement, we calculated the ratio of the AUC between the cross-population PRS and the single-population PRS that achieved the highest AUC value for each disease within each PRS group (Fig. 3 and Supplementary Table 3). For all eight diseases, PRS-CS was identified as having the best AUC in the single-population PRS group. However, in the cross-population PRS group, the method yielding the best AUC varied depending on the disease. In the cross-population PRS group, PRS-CSx was used for asthma, cataracts, cholelithiasis, coronary artery disease, and osteoporosis, while CT-SLEB was used for hypertension, obesity, and type 2 diabetes (Supplementary Table 3).
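
The AUC ratio described above can be sketched as follows: within each PRS group, the method with the highest AUC is selected, and the ratio of the two best AUCs is taken. The AUC values below are hypothetical and do not correspond to any disease in this study; the direction of the ratio (cross-population over single-population) is an assumption made for illustration.

```python
def best_method(auc_by_method):
    """Return (method name, AUC) for the method with the highest AUC."""
    name = max(auc_by_method, key=auc_by_method.get)
    return name, auc_by_method[name]


def auc_ratio(single_population_aucs, cross_population_aucs):
    """Ratio of the best cross-population AUC to the best single-population AUC."""
    single_name, single_auc = best_method(single_population_aucs)
    cross_name, cross_auc = best_method(cross_population_aucs)
    return single_name, cross_name, cross_auc / single_auc


# Hypothetical AUCs for one disease.
single_aucs = {"PRS-CS": 0.700, "LDpred2": 0.690, "lassosum": 0.680}
cross_aucs = {"PRS-CSx": 0.707, "CT-SLEB": 0.714}
single_name, cross_name, ratio = auc_ratio(single_aucs, cross_aucs)
print(single_name, cross_name, round(ratio, 3))
```

A ratio above 1 indicates that the best cross-population PRS discriminates better than the best single-population PRS for that disease.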

Fig. 3: The ratio of AUCs between single-population PRSs and cross-population PRSs across eight common diseases.

Figure 3 illustrates these AUC comparisons. For asthma, CAD, hypertension, obesity, and type 2 diabetes, the AUC values of the cross-population PRS were higher than those of the single-population PRS, PRS-CS. For cataract, the AUC value was identical between PRS-CS and PRS-CSx. In contrast, for cholelithiasis and osteoporosis, PRS-CS demonstrated higher AUC values than PRS-CSx. Overall, the cross-population PRS outperformed the single-population PRS in five out of eight diseases.

To assess the statistical significance of the increased predictive performance of cross-population PRS, we performed an r2redux analysis between single-population and cross-population PRS in the HEXA cohort32. This analysis calculated the variance and covariance of R2 for each PRS, thereby facilitating the estimation of the 95% confidence interval (CI) and P value for the difference between single-population and cross-population PRS. Initially, we assessed the R2 and variance of R2 using r2redux for each PRS. R2 ranged from 0.0023 (asthma) to 0.2149 (hypertension) for single-population PRS and from 0.0025 (asthma) to 0.2450 (hypertension) for cross-population PRS (Supplementary Table 4). Subsequently, we calculated the difference in R2 between the single-population and cross-population PRS using the r2redux method (Supplementary Table 5). Among the ten diseases, five exhibited statistically significant differences in R2 (P < 5.00E-03; 0.05/10), with a higher R2 for cross-population PRSs than for single-population PRSs. The largest difference in R2 was observed for hypertension (0.03010), whereas the smallest was observed for stroke (0.00033). Among the diseases with statistically significant differences, the R2 of the cross-population PRSs was on average 1.13% higher than that of the single-population PRSs.

Comparison between the cross-population PRS and European PRS

To compare the predictive performance of East Asian PRSs (single-population and cross-population PRS) with that of the European PRS, we used the polygenic score (PGS) Catalog database (https://www.pgscatalog.org/) and previous studies (Table 4 and Supplementary Table 6)8. Among the ten diseases, performance metrics of European PRSs, including the perSD OR, were available for only eight diseases in the PGS Catalog and previous studies. The perSD OR results are presented in Table 4.

Table 4 Results of comparison between the PRSs for perSD OR

Among the eight diseases, four (asthma, cataract, coronary artery disease, and stroke) had East Asian perSD ORs within the range of European PRS values. Obesity and osteoporosis did not reach the European PRS value range, whereas stroke and type 2 diabetes outperformed the European PRSs (Table 4). We assessed the relative performance of cross-population PRSs compared with European PRSs by expressing, for each disease, the perSD OR of the cross-population PRS as a percentage of the highest perSD OR observed among European PRSs. On average across the eight diseases, the perSD OR of East Asian PRSs was equivalent to 87.80% of that of European PRSs.
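
The percentage ratio can be sketched as below, under the assumption that it is taken directly on the OR scale with the highest European perSD OR as the denominator. All perSD OR values and disease labels are hypothetical placeholders and are not the values in Table 4.

```python
def relative_performance(east_asian_or, european_ors):
    """East Asian perSD OR as a percentage of the highest European perSD OR."""
    return 100.0 * east_asian_or / max(european_ors)


# Hypothetical perSD ORs for two diseases:
# (East Asian cross-population PRS, list of reported European PRSs).
example_inputs = {
    "disease_a": (1.50, [1.40, 1.60, 2.00]),
    "disease_b": (1.80, [1.50, 1.80]),
}
per_disease = {
    name: relative_performance(ea_or, eur_ors)
    for name, (ea_or, eur_ors) in example_inputs.items()
}
average_relative_performance = sum(per_disease.values()) / len(per_disease)
print(per_disease, average_relative_performance)
```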

Predictive performance of East Asian PRSs (single-population and cross-population PRS) in the follow-up data

We assessed the performance of PRSs over time using the follow-up data from the KARE cohort24. Supplementary Table 1 presents the baseline characteristics of the participants in the KARE cohort, which comprised 8,840 Koreans (52.69% female). The participants were aged 40–69 years (average, 52.22 years). Data collection in the KARE cohort commenced in 2001, and follow-up examinations were conducted every two years, totaling seven examinations over a span of 14 years23. Incident cases of each disease were identified from the biennial follow-up data (Methods and Supplementary Table 7). Because the diseases recorded in KARE differed from those recorded in HEXA, we could assess the predictive performance of PRSs for only seven diseases in the KARE follow-up data (Supplementary Table 7).

We assessed the predictive performance of the PRSs using a Cox regression model adjusted for age and sex in the follow-up data (Table 5 and Fig. 4). Both the single-population and cross-population PRSs exhibited statistical significance (P < 7.14E-03; 0.05/7) for asthma, hypertension, obesity, and type 2 diabetes. However, no distinct difference in performance was observed between the single-population and cross-population PRS groups in the follow-up data. Based on hazard ratio (HR) values, PRS-CS performed better for asthma, CT-SLEB performed better for hypertension and obesity, and both methods showed similar performance for type 2 diabetes.

Table 5 Comparative evaluation of PRSs using Cox regression analysis in KARE
Fig. 4: Comparison of hazard ratios between single-population PRSs and cross-population PRSs using follow-up data from the KARE Cohort.
HR represents the hazard ratio, and * denotes the statistical significance of the HRs.

Furthermore, we compared the performance of the East Asian and European PRSs using follow-up data. Of the four diseases with statistically significant hazard ratios (asthma, hypertension, obesity, and type 2 diabetes), European PRS performance metrics were available for only three: asthma, obesity, and type 2 diabetes. These were documented in the PGS Catalog and in previous studies. The comparison results are presented in Table 6. Specifically, the East Asian PRS for asthma exhibited superior performance, while the PRS for type 2 diabetes fell within the range of values observed for European PRSs. For obesity, the European PRS showed superior performance.

Table 6 Results of comparison between the PRSs for hazard ratios

Discussion

We assessed and compared the predictive performance of East Asian PRSs, including the cross-population PRS. Using the HEXA Korean cohort (n = 55,870), we demonstrated that cross-population PRS enhanced the predictive performance compared with single-population PRS for most diseases. This demonstrated significant improvement for LRT (1.08-fold on average), perSD OR (1.07-fold on average), NRI (1.15-fold on average), and AUC (1.01-fold on average) for seven diseases with statistical significance. Among all analyzed diseases, hypertension, obesity, and type 2 diabetes showed the most significant improvements in predictive performance. Additionally, our results showed that the performance of East Asian PRSs was similar to that of the European PRSs, achieving an average equivalence of 87.80%.

The most significant contributor to the predictive performance of PRS is the SNP heritability of traits6,15. To capture substantial heritability, a GWAS requires a large number of cases33. Although East Asian GWASs are the second most numerous after European ones12, limitations persist owing to the small sample sizes of East Asian GWASs13,14,27. By leveraging large-scale GWAS data from Europeans, a transferable PRS for East Asians has the potential to achieve higher predictive performance than a PRS based only on East Asian GWAS data16. Liu et al. demonstrated enhanced predictive performance of PRS-CSx over PRS-CS, which used only East Asian GWAS data. A modest enhancement of 1.5% on average in disease risk prediction based on the liability scale R2 was observed in the Chinese population21. Our findings also exhibited an increase, but to a lesser extent, under conditions similar to those of previous studies. On average, there was a 0.41% increase in Nagelkerke’s R2 for PRS-CSx compared with PRS-CS for ten common diseases (Supplementary Table 8). The relatively small enhancement in transferability observed in this study may be due to the difference in sample sizes of the East Asian GWASs used in the two studies. In our case, we used the GWAS generated from the BBJ cohort (>170,000), while Liu et al. utilized a GWAS performed with a larger East Asian sample size (>350,000)21. Additionally, our results indicated that cross-population PRS modestly enhanced the predictive performance in terms of LRT (1.08-fold on average), perSD OR (1.07-fold on average), NRI (1.15-fold on average), and AUC (1.02-fold on average) compared with single-population PRS. Because Liu et al. did not report these metrics, we were unable to compare our degree of enhancement with that of their study. In another study, Ge et al. calculated the PRS for type 2 diabetes using PRS-CSx in the Taiwan BioBank dataset22. The perSD ORs of the various PRSs for type 2 diabetes ranged from 2.01 to 2.19. Our study yielded comparable results, with a higher perSD OR of 2.33 for type 2 diabetes in the HEXA cohort.

The enhanced predictive performance of the East Asian cross-population PRS highlights its effectiveness in predicting the genetic risk of diseases in East Asian populations. We sought to quantify the relative performance of the East Asian cross-population PRSs by comparing them with the European PRSs, using the largest East Asian and European GWASs currently available. To this end, we compared the perSD ORs of the East Asian cross-population PRSs with the maximum perSD ORs obtained from European PRSs for each disease. The East Asian cross-population PRSs achieved, on average, 87.80% of the performance of the European PRSs across eight diseases (Tables 4 and 6). However, for hypertension, stroke, and type 2 diabetes, the cross-population PRSs outperformed the European PRSs. These findings indicate that leveraging larger European GWASs through cross-population PRS methods improves the performance of non-European PRSs only to a limited extent. Moreover, they emphasize the requirement for larger-scale East Asian GWASs to bridge the performance gap between European and East Asian PRSs.

Recently, various approaches have been explored to leverage the PRS for clinical utility9,10,34. Among these, the classification of high-risk groups using the PRS has been widely applied. Previous studies have assessed the OR between high- and normal-risk PRS groups for CAD, type 2 diabetes, obesity, and hypertension in Europeans9,10,34. One study compared disease prevalence between the high-risk (top 10% of PRS) and normal-risk (40–60% of PRS) groups34 and reported ORs for CAD (3.52), hypertension (3.28), and type 2 diabetes (4.27). Similarly, we assessed the OR between the high-risk (top 10% of cross-population PRS) and normal-risk (41–60% of cross-population PRS) groups, as summarized in Supplementary Table 9. The ORs for CAD (1.63) and hypertension (3.03) indicated less discrimination than the European PRSs, whereas the OR for type 2 diabetes (4.28) was comparable. Additionally, in a previous study, the PRS for BMI was calculated, and the OR between the high-risk group (top 10% of PRS) and the remaining group (1–90% of PRS) for extreme obesity (BMI ≥ 40) was estimated to be 4.22 (ref. 10). In our study, the ORs between the high- and normal-risk groups were 2.68 for obesity (≥25 kg/m2) and 4.00 for severe obesity (≥30 kg/m2), indicating that the BMI cross-population PRS exhibited significant discrimination of the high-risk group.
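
The high-risk versus normal-risk comparison can be sketched as an unadjusted odds ratio between two percentile bands of the PRS distribution. The cohort below is synthetic and deterministic, the helper names are hypothetical, and no covariate adjustment is applied, so this should be read as an illustration of the grouping logic rather than the analysis behind Supplementary Table 9.

```python
def percentile_band(records, lower_pct, upper_pct):
    """Records whose PRS rank lies in [lower_pct, upper_pct) of the sorted cohort."""
    ordered = sorted(records, key=lambda rec: rec[0])
    n = len(ordered)
    return ordered[n * lower_pct // 100: n * upper_pct // 100]


def odds(group):
    """Odds of disease (cases / controls) within a group of (prs, is_case) records."""
    cases = sum(1 for _, is_case in group if is_case)
    controls = len(group) - cases
    return cases / controls


def high_vs_normal_odds_ratio(records):
    """OR between the top 10% and the 40-60% band of the PRS distribution."""
    high_risk = percentile_band(records, 90, 100)
    normal_risk = percentile_band(records, 40, 60)
    return odds(high_risk) / odds(normal_risk)


# Synthetic cohort of 100 individuals with PRS 0..99: half of the top decile
# are cases, and one in five of the 40-60% band are cases.
synthetic_cohort = [
    (prs, (prs % 2 == 0) if prs >= 90 else (prs % 5 == 0)) for prs in range(100)
]
example_or = high_vs_normal_odds_ratio(synthetic_cohort)
print(example_or)
```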

Our study had several limitations. First, despite an enhancement in the predictive performance of the cross-population PRS, we did not explore the underlying reasons. Although the sample size of the GWAS was anticipated to be a primary factor for enhancement, we failed to confirm any correlation between the sample size of the European GWAS summary statistics integrated into the cross-population PRS and the increased performance metrics (Supplementary Tables 10 and 11). Future research is required to identify the factors that enhance the performance of cross-population PRS to develop a highly accurate transferable PRS. Second, the KARE follow-up data were limited by a small sample size. The largest group we analyzed was type 2 diabetes, with 693 patients and 5090 controls (5783 individuals in total). With such a small sample, the modest increase in performance metrics observed for cross-population PRS may reflect the limited data scale. Therefore, cross-population PRS should be evaluated in a larger follow-up dataset. Third, only a small number of diseases could be assessed owing to the limited disease data in the Korean cohorts, HEXA and KARE. Fourth, the predictive performance of cross-population PRS remains to be demonstrated in other East Asian countries, including Japan. Finally, we did not assess other cross-population PRS methods. Specifically, the widely recognized PolyPred method requires a minimum of 50,000 individuals for PRS training using the LD reference panel, in addition to the samples used to assess the PRS35.

In conclusion, the cross-population PRSs showed significant transferability in East Asians for ten common diseases, enhancing most predictive metrics (LRT, perSD OR, and NRI) compared to the single-population PRSs. In addition, the difference in R2 values between single-population and cross-population PRS was statistically significant for five diseases, with an average increase of 1.13%. Relative to their respective European PRSs, the East Asian PRSs achieved an average performance of 87.80% across eight diseases. Our findings indicate that while cross-population PRS enhances the performance of East Asian PRSs, large-scale East Asian GWAS data are essential to bridge the performance gap with European PRSs for effective disease prediction in East Asian populations.

Methods

HEXA (Health Examinees)

The HEXA study was initiated in 2004, and 173,357 participants aged over 40 years were recruited from 38 health examination centers and training hospitals located in eight regions of South Korea23. Of these, 58,700 individuals who had genotype data and passed the sample quality control criteria were included. The exclusion criteria for sample quality control were as follows: a history of cancer, gender inconsistencies, cryptic relatedness, low genotype call rate (<95%), and sample contamination, as previously described23. All participants were genotyped with the Korean Chip (K-CHIP), which was designed by the Center for Genome Science, Korea National Institute of Health (KNIH), based on the UK Biobank Axiom® Array, and manufactured by Affymetrix. SNP imputation was carried out using IMPUTE v2 (ref. 36) with 1000 Genomes Phase 3 data as a reference panel. Additionally, diseases within the HEXA cohort were recorded based on self-report.

KARE (Korea Association Resource)

Participants of the KARE cohort (n = 8840) were recruited from two regions in South Korea (Ansan and Ansung) from 2009 to 2012 for the Korean Genome and Epidemiology Study24. All study participants aged ≥40 years provided written informed consent, and approval was obtained from the institutional review board. The exclusion criteria were as follows: history of cancer, gender inconsistencies, cryptic relatedness, low genotype call rate (<95%), and sample contamination23,24. The KARE study utilized the Affymetrix Genome-Wide Human SNP Array GeneChip 5.0. SNP imputation was performed using IMPUTE v2 with the 1000 Genomes Project (haplotype phase 1)36. Data collection in the KARE cohort commenced in 2001, and follow-up examinations were conducted every two years, totaling seven examinations over a span of 14 years. Incident cases of each disease were identified from the biennial follow-up data. Diseases in the KARE cohort were also recorded based on self-report.

Ethics approval and consent to participate

This study was conducted with bioresources from the National Biobank of Korea, the Korea Disease Control and Prevention Agency, Republic of Korea (KBN‐2021‐051).

Disease selections

For hypertension, we selected cases meeting any of the following criteria: systolic blood pressure ≥140 mmHg, diastolic blood pressure ≥90 mmHg, use of antihypertensive medicines, diagnosis of hypertension, or undergoing treatment for hypertension. Controls were those with systolic blood pressure <120 mmHg and diastolic blood pressure <80 mmHg37.
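The hypertension definition above can be written as a small decision rule. The following is only an illustrative sketch: the function and argument names are hypothetical, and returning `None` for individuals who meet neither definition (for example, systolic blood pressure between 120 and 139 mmHg without medication, diagnosis, or treatment) reflects our reading of the criteria rather than a rule stated in the text.

```python
from typing import Optional


def classify_hypertension(
    sbp: float,
    dbp: float,
    on_medication: bool = False,
    diagnosed: bool = False,
    under_treatment: bool = False,
) -> Optional[str]:
    """Return 'case', 'control', or None (meets neither definition).

    sbp and dbp are systolic and diastolic blood pressure in mmHg.
    All names here are hypothetical and used for illustration only.
    """
    # A case meets ANY one of the listed criteria.
    if sbp >= 140 or dbp >= 90 or on_medication or diagnosed or under_treatment:
        return "case"
    # A control must satisfy BOTH blood pressure thresholds.
    if sbp < 120 and dbp < 80:
        return "control"
    # Assumption: intermediate individuals are neither cases nor controls.
    return None
```

For example, `classify_hypertension(118, 76)` yields a control, whereas the same blood pressure readings with `on_medication=True` yield a case.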

For type 2 diabetes, cases were selected if they satisfied any of the following criteria: fasting glucose level ≥126 mg/dl, 2-h oral glucose tolerance test (2-h OGTT) ≥ 200 mg/dl, receiving treatment for type 2 diabetes, or taking medication for the condition. Controls were identified as those with fasting glucose level <100 mg/dl, 2-h OGTT < 140 mg/dl, and no history of type 2 diabetes treatment or diagnosis38.

For asthma, cataract, cholelithiasis, colon polyp, and stroke, cases were chosen if they met any of these criteria: a diagnosis of each respective disease, taking medication for the same, or undergoing treatment for it. Conversely, controls were selected from those without a diagnosis of any of these diseases.

For coronary artery disease, cases were selected based on the following criteria: a diagnosis of myocardial infarction or angina pectoris, taking medication for either condition, or undergoing treatment for either condition. Controls were those without a diagnosis of either myocardial infarction or angina pectoris.

For obesity, cases were defined as having a body mass index ≥25 kg/m², and controls as having a body mass index <25 kg/m² (refs. 39,40).

For osteoporosis in HEXA, cases were selected based on these criteria: diagnosis of osteoporosis, taking medication for osteoporosis, or receiving treatment for osteoporosis. Controls were selected based on the criterion of not having a diagnosis of osteoporosis. For osteoporosis in KARE, we selected cases that met the following criteria: for females, a diagnosis of osteoporosis, taking medication for osteoporosis, undergoing treatment for osteoporosis, or having a distal radius T score < −2.6 or midshaft tibia T score < −3.0 (ref. 41); for males, a diagnosis of osteoporosis, taking medication for osteoporosis, undergoing treatment for osteoporosis, or having a distal radius T score < −2.5 or midshaft tibia T score < −2.5 (ref. 42). In contrast, controls for females were defined as having a distal radius T score greater than −1.4 and a midshaft tibia T score greater than −1.6 (ref. 41), and controls for males were defined as having distal radius and midshaft tibia T scores greater than −1.0 (ref. 42).

GWAS summary statistics

In this study, we used GWAS summary statistics from both East Asian and European populations to evaluate the predictive performance of single-population and cross-population PRSs for ten common diseases. Detailed information on the GWAS summary statistics used is presented in Supplementary Table 2, and the sources of each dataset are summarized as follows.

  1. East Asian GWAS summary statistics: GWAS data from East Asian populations were utilized to explore major genetic associations. These datasets were obtained from the following studies:

    • CAD: Data were obtained from Matsunaga et al., with a total sample size of 51,442 (15,302 cases and 36,140 controls)26. The study used data from the Biobank Japan and Osaka Acute Coronary Insufficiency Study (OACIS) cohorts.

    • Asthma: The dataset was provided by Zhou et al., comprising 341,204 samples (18,549 cases and 322,655 controls)25. This study used data from the Biobank Japan, China Kadoorie Biobank, Taiwan Biobank, UK Biobank, and UCLA Biobank.

    • Cataract, cholelithiasis, colon polyp, osteoporosis, stroke, type 2 diabetes, systolic blood pressure, and BMI: Summary statistics for these traits were provided by Sakaue et al.13. Detailed sample sizes, including case/control counts, are listed in Supplementary Table 2. All data were derived from the Biobank Japan cohort.

  2. European GWAS summary statistics: GWAS data from European populations were used to evaluate cross-population PRS methods such as PRS-CSx and CT-SLEB. These datasets were sourced from the following studies:

    • CAD: Data were obtained from Nikpay et al., with a total sample size of 184,841 (61,289 cases and 123,552 controls)28. This study utilized the Coronary Artery Disease Genome-wide Replication and Meta-analysis (CARDIoGRAM) consortium, which includes data from multiple cohorts: the Framingham Heart Study (FHS), Atherosclerosis Risk in Communities (ARIC) study, Rotterdam Study, Wellcome Trust Case Control Consortium (WTCCC), EPIC-Norfolk study, and Myocardial Infarction Genetics Consortium (MIGen).

    • Asthma: The dataset was obtained from Zhou et al., with 1,376,071 samples (121,940 cases and 1,254,131 controls)25. The study incorporated data from various biobanks, including UK Biobank.

    • Cataract, cholelithiasis, colon polyp, and osteoporosis: GWAS summary statistics for these traits were provided by Jiang et al., with a total sample size of 456,348 (ref. 27). The number of cases and controls for each disease is detailed in Supplementary Table 2. Data were sourced from the UK Biobank cohort.

    • Stroke: Data were obtained from Malik et al., with a total sample size of 446,696 (40,585 cases and 406,111 controls)43. The study incorporated data from the CARDIoGRAMplusC4D consortium, CHARGE Consortium, deCODE, EPIC-InterAct Study, and UK Biobank.

    • Type 2 diabetes: The dataset was obtained from Mahajan et al., with a total sample size of 1,339,889 (180,834 cases and 1,159,055 controls)29. This study utilized data from multiple sources, including UK Biobank, EPIC-InterAct, DIAGRAM consortium, FinnGen, GERA, and deCODE Genetics.

    • Systolic blood pressure: Data were obtained from Evangelou et al., comprising 757,601 samples (ref. 30). The study included data from the UK Biobank and the International Consortium for Blood Pressure (ICBP).

    • BMI: Data were obtained from Yengo et al., with a sample size of 681,275 (ref. 31). The study utilized data from the UK Biobank and the Genetic Investigation of Anthropometric Traits (GIANT) consortium.

All GWAS summary statistics used in this study were taken from previously published studies; sample sizes and case and control counts are provided in Supplementary Table 2. None of these GWAS datasets overlap with the HEXA or KARE cohorts.

PRS-CS

PRS-CS is a Bayesian regression framework that places continuous shrinkage priors on SNP effects to infer their posterior mean effects. It is robust to varying genetic architectures, provides substantial computational advantages, and enables multivariate modeling of local LD patterns44. We used the auto model, in which PRS-CS learns the global shrinkage parameter phi from the discovery GWAS without requiring post-hoc tuning, and kept the default settings for the other parameters. We used the 1000 Genomes reference panel provided by PRS-CS (https://github.com/getian107/PRScs).
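Once posterior mean effects are available, an individual's PRS is conventionally obtained as the weighted sum of effect-allele dosages. The sketch below illustrates this scoring step with NumPy; it is not part of the PRS-CS software, the function names and toy values are hypothetical, and the standardization step is included only because per-SD effect estimates assume a score scaled to unit variance.

```python
import numpy as np


def polygenic_score(dosages: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Weighted sum of effect-allele dosages.

    dosages: (n_individuals, n_snps) array of effect-allele counts (0 to 2).
    weights: (n_snps,) array of per-SNP effect sizes, e.g. posterior means.
    """
    return dosages @ weights


def standardize(scores: np.ndarray) -> np.ndarray:
    """Scale scores to mean 0 and sample standard deviation 1."""
    return (scores - scores.mean()) / scores.std(ddof=1)


# Toy example: two individuals, three SNPs (values are made up).
toy_dosages = np.array([[0.0, 1.0, 2.0],
                        [2.0, 0.0, 1.0]])
toy_weights = np.array([0.10, -0.20, 0.05])
toy_scores = polygenic_score(toy_dosages, toy_weights)
```

In practice the dosage matrix would be restricted to variants present in both the weight file and the target genotypes, with effect alleles aligned beforehand.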

LDpred2

LDpred2 is a computational algorithm based on a Bayesian approach that uses an LD matrix and GWAS summary statistics45. It is implemented in the R package bigsnpr. LDpred2 provides several models, including the infinitesimal model, which assumes that all genetic variants are causal. Another model, the grid model, tunes hyperparameters such as SNP heritability (h²), the proportion of causal variants (p), and optional sparsity to reweight the effect of variants on the phenotype. In this study, SNP heritability (h²) was estimated using LD Score regression, based on the LD scores derived from European-ancestry samples in the 1000 Genomes Project.

Lassosum

Lassosum is a method designed for PRS calculation that applies penalized regression (lasso) to select and estimate the effects of genetic variants on a trait46. Because it estimates all SNP effect sizes simultaneously while accounting for LD structure, it can help reduce overfitting.

In Lassosum, two important parameters are used:

  • S value (s): This tuning parameter (0 < s ≤ 1) controls how strongly the reference LD matrix is shrunk toward the identity matrix. At s = 1, correlations between SNPs are ignored and each SNP is treated independently; smaller values retain more of the estimated LD structure.

  • Lambda value (λ): This is the regularization parameter in lasso regression, determining the degree of shrinkage applied to the SNP effect sizes. A higher λ value results in greater penalization, which helps in avoiding overfitting by shrinking smaller effect sizes towards zero.
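The shrinkage role of λ can be illustrated with the soft-thresholding operator, which is the closed-form lasso solution when predictors are treated as uncorrelated. This is a simplified sketch and not the lassosum implementation, which additionally accounts for LD between SNPs; the function name and toy values are illustrative.

```python
import numpy as np


def soft_threshold(effects, lam: float) -> np.ndarray:
    """Shrink each effect toward zero by lam; effects with |value| <= lam become 0."""
    effects = np.asarray(effects, dtype=float)
    return np.sign(effects) * np.maximum(np.abs(effects) - lam, 0.0)


# Toy marginal effects (made-up values): a larger lam zeroes out more SNPs.
toy_effects = [0.30, -0.05, 0.10, -0.40]
shrunk_small = soft_threshold(toy_effects, 0.02)
shrunk_large = soft_threshold(toy_effects, 0.10)
```

With the larger penalty, the two smallest effects are set exactly to zero while the remaining effects are reduced in magnitude by λ.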

We optimized the s and λ values to obtain the best predictive performance for each disease, and these values are summarized in Supplementary Table 12.

Lassosum is implemented in the R package “lassosum” (https://github.com/tshmak/lassosum). The default parameter settings were used unless otherwise specified.

PRS-CSx

We used PRS-CSx, a recently developed Bayesian polygenic modeling method, to construct the cross-population PRS22. PRS-CSx jointly models the two GWAS summary statistics and couples genetic effects across populations using a shared continuous shrinkage prior, which enables more accurate effect size estimation by sharing information between summary statistics and leveraging LD diversity across discovery samples. The shared prior allows for correlated but varying effect size estimates across populations, retaining the flexibility of the modeling framework. In addition, PRS-CSx accounts for population-specific allele frequencies and LD patterns and inherits efficient and robust posterior inference algorithms from PRS-CS. We used pre-computed 1000 Genomes Project reference panels that matched the ancestry of each discovery GWAS, and a fully Bayesian algorithm for model fitting, which automatically learned all model parameters from the summary statistics without the need for hyper-parameter tuning. Because PRS-CSx estimates the PRS from 1,259,754 HapMap3 variants, we restricted the analysis to the HapMap3 variants available in the HEXA (~1,150,090 SNPs) and KARE (~919,166 SNPs) cohorts.

CT-SLEB

CT-SLEB is a cross-ancestry empirical Bayes method developed to improve the transferability of PRS across diverse populations20. CT-SLEB combines effect size estimates from multiple GWAS summary statistics across different ancestries, using an empirical Bayes approach to adjust for population-specific LD and allele frequency differences. By borrowing strength from larger European GWAS while maintaining fidelity to the target population-specific data, CT-SLEB is able to produce more accurate cross-population PRS estimates.

This method involves two main steps:

  1. Cross-population Effect Size Integration: It estimates the posterior mean effect sizes using an empirical Bayes method, combining GWAS summary statistics from both the target and a secondary population, such as European samples.

  2. Population-specific LD Modeling: CT-SLEB takes into account LD patterns specific to the target population, which helps enhance prediction accuracy when the GWAS discovery dataset is predominantly from a different ancestry.

In this study, we applied CT-SLEB to construct PRSs for ten common diseases, using GWAS summary statistics from both European and East Asian populations. The 1000 Genomes Project data was used as the reference LD panel for this analysis. We used the default parameter settings as recommended by the CT-SLEB developers (https://github.com/andrewhaoyu/CTSLEB). We summarized the optimized values for each disease, and these values are provided in Supplementary Table 13.

Statistical analysis

To obtain the LRT and per-SD OR, we fitted a logistic regression model in R version 4.1.0, as follows:

$$\mathrm{Disease}\ (\mathrm{coded\ as}\ 1\ \mathrm{or}\ 0) \sim \beta_{1}\,\mathrm{PRS} + \beta_{2}\,\mathrm{age} + \beta_{3}\,\mathrm{sex}$$

The model is fitted on the logit scale, so the quantity being modeled is the log odds of the binary disease outcome (coded as 0 for control and 1 for case). Age ranges from 40 to 69 years, and sex is coded as 0 for female and 1 for male. The per-SD OR was derived from this logistic regression.

To assess the significance of adding the PRS to our model, we used the LRT to compare two models, the baseline model and the PRS model. The baseline model included only the covariates age and sex, while the PRS model extended this baseline by incorporating the PRS as an additional predictor. The LRT was conducted to determine if the inclusion of PRS significantly improved model fit compared to the baseline model.
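The model comparison described above can be sketched as follows. The study itself used R; this Python version is only an illustration on simulated data, with hypothetical function names, a hand-written Newton-Raphson fit, and made-up effect sizes. It assumes the two models differ by a single parameter (the PRS), so the test statistic is referred to a chi-square distribution with 1 degree of freedom, and that the PRS is standardized so that the exponentiated coefficient is an OR per SD.

```python
import math
import numpy as np


def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Newton-Raphson logistic regression; X must include an intercept column.

    Returns (coefficients, maximized log-likelihood).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1 - 1e-12)
    loglik = float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))
    return beta, loglik


def likelihood_ratio_test(X_base, X_full, y):
    """LRT for nested models that differ by one parameter (1 df)."""
    _, ll_base = fit_logistic(X_base, y)
    beta_full, ll_full = fit_logistic(X_full, y)
    stat = 2.0 * (ll_full - ll_base)
    # Chi-square survival function with 1 df: P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value, beta_full


# Simulated toy cohort (all values are made up for illustration).
rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(40, 69, n)
sex = rng.integers(0, 2, n).astype(float)  # 0 = female, 1 = male
prs = rng.standard_normal(n)
prs = (prs - prs.mean()) / prs.std(ddof=1)  # standardized PRS
true_logit = -3.0 + 0.03 * age + 0.2 * sex + 0.4 * prs
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X_base = np.column_stack([np.ones(n), age, sex])  # baseline: age + sex
X_full = np.column_stack([X_base, prs])           # PRS model: age + sex + PRS
stat, p_value, beta_full = likelihood_ratio_test(X_base, X_full, y)
per_sd_or = math.exp(beta_full[3])                # OR per 1 SD of PRS
```

A small p-value here indicates that adding the PRS improves model fit beyond age and sex; `per_sd_or` is the corresponding odds ratio per standard deviation of the score.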

For evaluating the incremental predictive value of PRS, we used the NRI metric, which helps quantify the improvement in reclassification of individuals when adding the PRS to the baseline model.

The comparison between the baseline and PRS models allowed us to quantify how much the inclusion of PRS improves the prediction of disease status beyond what can be explained by demographic factors alone. To evaluate the NRI, we used the “PredictABEL” package in R. The formula for calculating the censored NRI when comparing the baseline model against new model 1 and 2 is as follows:

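$$\mathrm{NRI}_{i} = P(\mathrm{up}_{i} \mid \mathrm{Case}) - P(\mathrm{down}_{i} \mid \mathrm{Case}) + P(\mathrm{down}_{i} \mid \mathrm{Control}) - P(\mathrm{up}_{i} \mid \mathrm{Control}), \quad i = 1\ \mathrm{or}\ 2$$

where up<sub>i</sub> denotes new model i > baseline model and down<sub>i</sub> denotes new model i < baseline model.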

We generated NRI indices for both “baseline model vs. new model 1” and “baseline model vs. new model 2” and compared these indices to assess the relative predictive performances. The baseline model includes only age and sex as covariates, while the new models include additional PRS information. For this analysis, we randomly divided the samples into two equal halves. In one half, we generated the model, while in the other half, we estimated the NRI values, allowing us to validate the performance improvements more robustly.
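The NRI definition can be sketched directly from predicted risks. The analysis itself used the PredictABEL package in R; the Python function below is only an illustration with a hypothetical name, and it assumes that "up" and "down" refer to any increase or decrease in predicted risk under the new model, without risk categories.

```python
import numpy as np


def net_reclassification_improvement(risk_base, risk_new, is_case) -> float:
    """NRI = [P(up|case) - P(down|case)] + [P(down|control) - P(up|control)].

    risk_base, risk_new: predicted risks from the baseline and new models.
    is_case: boolean array, True for cases and False for controls.
    """
    risk_base = np.asarray(risk_base, dtype=float)
    risk_new = np.asarray(risk_new, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)

    up = risk_new > risk_base      # new model assigns higher risk
    down = risk_new < risk_base    # new model assigns lower risk

    nri_cases = up[is_case].mean() - down[is_case].mean()
    nri_controls = down[~is_case].mean() - up[~is_case].mean()
    return float(nri_cases + nri_controls)


# Toy example (made-up risks): both cases move up, controls split evenly.
toy_nri = net_reclassification_improvement(
    risk_base=[0.20, 0.20, 0.20, 0.20],
    risk_new=[0.30, 0.40, 0.10, 0.25],
    is_case=[True, True, False, False],
)
```

In a split-sample design such as the one described, the two risk vectors would come from models fitted in one half of the data and applied to the other half.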

To statistically investigate incidence data, which involves events occurring over time, we conducted Cox regression analysis using the “survival” package in R.

To investigate mean differences in quantitative variables between cases and controls, we used Student’s t-test in R version 4.1.0.

We depicted the bar plot using “ggplot2” version 3.3.6 in R.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
