Insights from the largest diverse ancestry sex-specific disease map for genetically predicted height

Introduction

Adult height is an easily measured anthropometric trait that is complex and highly heritable1,2. Both genetic and environmental factors, such as nutrition, socio-economic status, and physical activity, contribute to adult height3,4,5,6,7.

Several observational studies have been performed to better understand the association between height and disease. In individuals of European ancestry, increased height has been associated with a reduced risk of several circulatory diseases, including coronary artery disease (CAD), aortic valve stenosis (AS), heart failure (HF), hypertension and stroke6. In addition to these observational studies, increased genetically predicted height has been associated with decreased risk of hypertension, diaphragmatic hernia, and gastro-esophageal reflux disease (GERD)6. A recent study in the Million Veteran Program (MVP) used a polygenic score based on 3290 height-associated Single Nucleotide Polymorphisms (SNPs)2 to show that increased genetically predicted height is associated with an increased risk of atrial fibrillation (AF) and decreased risk of CAD, hypertension and hyperlipidemia. The authors also reported potential novel associations with peripheral neuropathy and infections of the skin and bones, in individuals of both European and African ancestry8. Furthermore, increased genetically predicted height has been associated with longer PR interval and QRS duration9, venous thromboembolism6,10, AF, intervertebral disc disorder, hip fracture, vasculitis, breast cancer6,11 and colorectal cancer6,12 in individuals of European ancestry.

The Genetic Investigation of ANthropometric Traits (GIANT) consortium has performed increasingly larger meta-analyses of genome-wide association studies (GWAS) of height over the years2,13,14. In the present study, we used a multi-ancestry polygenic score (PGS) for height in six study populations of diverse ancestries to explore its association with a comprehensive set of health-related outcomes. The PGS for height was constructed using genetic variants taken from the most recent GIANT GWAS for adult height, excluding data from 23andMe14. We employed a Phenome-Wide Association Analysis (PheWAS) approach, a hypothesis-free analysis with no prior assumptions, to detect phenotypes associated with the height PGS15,16. This was followed by meta-analysis (meta-PheWAS) of the individual PheWAS in each study population, both within and across ancestry groupings, to potentially identify new diseases associated with genetically predicted height. Sex-stratified cross-ancestry analyses were also performed.

Results

We performed a PheWAS in each cohort using the PGS of height as the exposure and tested its association with the disease outcomes available in that cohort (Methods). The sex-combined cross-ancestry meta-PheWAS in up to 839,872 participants, interrogating 1768 traits (available in at least two cohorts), yielded 254 significant associations below the Bonferroni threshold (p-value = 2.83 × 10–5) (Table 1, Fig. 1). Sixteen phecode categories harbored multiple significant associations with the tested height PGS: circulatory system (62), congenital anomalies (4), dermatologic (17), digestive (18), endocrine/metabolic (35), genitourinary (8), hematopoietic (11), infectious diseases (6), injuries & poisonings (10), mental disorders (7), musculoskeletal (29), neoplasms (16), neurological (11), respiratory (7), sense organs (6) and symptoms (7) (Supplementary Data 6).

Table 1 Sample size and number of traits in the cross-ancestry meta-PheWAS
Fig. 1: Phenome Wide Association Study (PheWAS).

Manhattan plot showing the significant phecodes per category for the sex-combined cross-ancestry PheWAS meta-analysis of European (EUR), African (AFR), East Asian (EAS) ancestries and Hispanic (HIS) population groups.

The traits that displayed the strongest associations with height PGS are shown in Table 2. The results from the PheWAS performed in each cohort, along with the full results from the meta-PheWAS, are presented in Supplementary Data 7–14 and Supplementary Data 6, respectively.

Table 2 Top 20 significant hits from the sex-combined cross-ancestry PheWAS meta-analysis of EUR, AFR, EAS, HIS

From the cross-ancestry meta-PheWAS analysis, six traits exhibited evidence of heterogeneity (defined as when the p-value of the Cochran’s heterogeneity test is below Bonferroni threshold) as shown in Table 3, Supplementary Figs. 1–7. For example, in cardiac dysrhythmias (427) the signal indicated strong evidence for association in European ancestry (p-value = 1.29 × 10–91) but not in the other ancestral groups (Supplementary Data 4). We also observed evidence of heterogeneity of effects across cohorts for Cardiac dysrhythmias (Supplementary Fig. 3). Another notable example of heterogeneity was for Chronic ulcer of skin (707) (Supplementary Fig. 5).

Table 3 Heterogeneous traits in the cross-ancestry meta-PheWAS analysis

Cross-ancestry analyses revealed 30 additional signals that were not present in the European ancestry meta-analyses (Supplementary Data 5, Supplementary Fig. 8).

We further performed sex-specific meta-PheWAS analyses in the UKB, MVP, BioVU and BioMe cohorts (males: Supplementary Data 21, 23, 25, 27; females: Supplementary Data 32, 34, 36, 38). The meta-PheWAS analysis for males, in up to 471,395 participants, interrogated 1582 traits (available in at least two cohorts) and yielded 173 statistically significant trait associations below the Bonferroni threshold (p-value = 3.16 × 10–5) (Table 1) (Supplementary Data 19). The identified categories included the circulatory system (50), congenital anomalies (4), dermatologic (14), digestive (10), endocrine/metabolic (22), genitourinary (6), hematopoietic (6), infectious diseases (5), injuries & poisonings (6), mental disorders (8), musculoskeletal (21), neoplasms (5), neurological (7), respiratory (3), sense organs (3) and symptoms (3). In total, 10 traits, spanning different categories, were significant only in the male cross-ancestry meta-PheWAS and not in the sex-combined cross-ancestry meta-PheWAS (Supplementary Data 42). For example, increased genetically predicted height was associated with decreased risk of Hyperpotassemia (276.13) (OR = 0.95, 95% CI [0.93, 0.97], p-value = 1.23 × 10–6, het p-value = 9.46 × 10–1) in males but showed a null association in females (Supplementary Data 42).

Looking at the ancestry level, 10 traits were significant (p-value < 3.16 × 10–5) only in European ancestry males but not in the male cross-ancestry analysis; 3 of them were from the mental disorders category, with increased height PGS associated with a decreased risk of Anxiety disorders (300), Posttraumatic stress disorder (300.9) and Substance addiction and disorders (316) (Supplementary Data 16) (Fig. 2). In the other populations, Drusen (degenerative) of retina (362.27) and Fracture of lower limb (800) were significantly associated with increased height PGS in the male African (decreased risk) and Hispanic (increased risk) populations, respectively, but not in the cross-ancestry analyses (Supplementary Data 17 and 18).

Fig. 2: Estimates per ancestry in the male meta-analysis of Phenome Wide Association Studies (meta-PheWAS), for signals from the mental disorders category that were identified as significant only in the European males meta PheWAS.

SD standard deviation of the mean, PGS polygenic score.

The meta-PheWAS analysis in up to 267,576 female individuals interrogated 1499 traits (available in at least two cohorts) and yielded 56 significant associations below the Bonferroni threshold (p-value = 3.34 × 10–5) (Table 1) (Supplementary Data 30). The identified categories included the circulatory system (23), dermatologic (2), digestive (7), endocrine/metabolic (6), genitourinary (1), hematopoietic (1), musculoskeletal (5), neoplasms (8) and neurological (3). Only one association was significant in the female meta-PheWAS but not in the sex-combined meta-PheWAS: Benign neoplasm of other parts of digestive system (211) (OR = 0.95, 95% CI [0.92, 0.97], p-value = 1.53 × 10–5, het p-value = 5.48 × 10–1) (Supplementary Data 41). Seven associations were significant in European ancestry but not in the cross-ancestry analysis, mainly from the musculoskeletal and infectious diseases categories, such as Osteoporosis (743.1) and Dermatophytosis / Dermatomycosis (110), respectively (Supplementary Data 29).

Comparing males to females, the meta-PheWAS yielded 126 associations that were significant only in males, primarily from the circulatory system, endocrine/metabolic and musculoskeletal categories (Supplementary Data 44). Ninety-three percent of these traits had effects that were concordant in direction between the sexes, with larger effect sizes in males. Conversely, the meta-PheWAS yielded 13 associations that were significant in females only, with the digestive and neoplasms categories including the most traits (Supplementary Data 43). Ninety-two percent of these traits were concordant in direction, with larger effect sizes in females. Examining the heterogeneity between males and females in the cross-ancestry meta-PheWAS, 7 associations were identified, 4 of them from the musculoskeletal category, such as Acquired foot deformities (735) (Supplementary Data 40).

We performed a meta-PheWAS analysis excluding UKB in the cross-ancestry sex-combined meta-PheWAS (Supplementary Data 46) and the sex-specific ones for males and females (Supplementary Data 50 and 54). For the sex-combined (Supplementary Fig. 9, Supplementary Data 46) and the males (Supplementary Fig. 10, Supplementary Data 50) meta-PheWAS the estimates are concordant as presented in the plots. In the cross-ancestry female meta-PheWAS (Supplementary Fig. 11, Supplementary Data 54) three traits were identified as discordant: Benign neoplasm of other parts of digestive system (211), Other disorders of circulatory system (459), Gastritis and duodenitis (535).

Replication analyses were performed in an independent sample of the Colorado Biobank. Comparing the European ancestry meta-PheWAS with the European PheWAS in the Colorado Biobank, we observed that the majority of the ORs were concordant; the Colorado Biobank estimates have wider error bars because its sample size is smaller than that of the meta-PheWAS analysis (Supplementary Figs. 12–14). The Colorado Biobank also provided PheWAS results using both weighted and unweighted PGS (Supplementary Figs. 15–17).

Discussion

We performed a large, ancestrally diverse meta-PheWAS for height in six cohorts including up to 840,000 individuals. Of the 1768 disease traits that were available in at least two cohorts and were meta-analysed, we identified 254 significant PGS-trait associations (p-value < 2.83 × 10–5). The largest number of associations, and the most precise ones, were observed for the circulatory system, endocrine/metabolic and musculoskeletal categories.

From the circulatory system category, increased genetically predicted height was associated with an increased risk of Chronic venous insufficiency (CVI) (456) (OR = 1.16, 95% CI [1.14, 1.18], p-value = 2.04 × 10–64) (Supplementary Data 6), with no evidence of heterogeneity across cohorts (het p-value = 6.65 × 10–2). These findings were concordant with a recent study in MVP which reported an association between increased genetically predicted height and increased risk of CVI in European American (EA) (OR = 1.366, p-value = 1.6 × 10–35) and in African American (AA) individuals (OR = 1.469, p-value = 3.1 × 10–4)8. The effect was similar in both males and females in our analyses. Failure of the femoral vein valves may lead to CVI, with severe consequences. However, for the valves to be replaced, the femoral vein diameter (FVD) must be known. A recent study by Keiler et al.17 reported that height was positively correlated with FVD; this correlation was attenuated when the sample was stratified by sex. In addition, failure of the venous valve can lead to varicose veins17. In our study, increased genetically predicted height was associated with increased risk of Varicose veins (VV) (454) (OR = 1.15, 95% CI [1.14, 1.17], p-value = 1.23 × 10–108) (Supplementary Data 6), with no evidence of heterogeneity across cohorts (het p-value = 5.85 × 10–2), again a finding in agreement with the MVP PheWAS8. Moreover, Mendelian Randomisation (MR) studies in European ancestry have supported a causal association between genetically predicted height and VV18,19.

Within the circulatory system category, the strongest association was for Atrial fibrillation and flutter (AF) (427.2) (OR = 1.16, 95% CI [1.15, 1.17], p-value = 1.08 × 10–226) (Supplementary Data 6), with no evidence of heterogeneity across cohorts (het p-value = 4.50 × 10–4), and similar effect sizes in the sex-stratified meta-PheWAS. The aforementioned MVP study similarly reported an increased risk of AF in EA (OR = 1.381, p-value = 5.70 × 10–84) and in AA (OR = 1.352, p-value = 3.3 × 10–4)8. Significant causal associations from MR analysis have been reported in two previous studies6,20.

Our study confirmed that increased genetically predicted height is inversely associated with cardiovascular diseases21,22,23. Increased genetically predicted height was associated with decreased risk of hypertension (401) (OR = 0.950, 95% CI [0.944, 0.955], p-value = 2.20 × 10–77) (Supplementary Data 6), with no evidence of heterogeneity across cohorts (het p-value = 1.31 × 10–3), and with similar effect sizes in males and females. This finding is in accordance with previous studies, although our effect sizes were slightly attenuated, possibly due to a lack of coding “hypertension” using ICD codes8,24. According to the World Health Organisation (WHO), “hypertension is diagnosed if, when it is measured on two different days, the systolic blood pressure readings on both days is ≥140 mmHg and/or the diastolic blood pressure readings on both days is ≥90 mmHg”25. A study in the Finnish population examining blood pressure (BP) found that shorter participants had higher systolic blood pressure (SBP) than taller ones, which could partially explain the inverse association between height and cardiovascular disease21. A study in the USA reported that height was inversely associated with diastolic blood pressure (DBP) in older males and females, whereas it was positively associated with SBP22. A recent systematic review concluded that there was a potentially inverse association between stature and BP26. An MR analysis conducted in European ancestry individuals showed that an increase in adult height was causally associated with a lower risk of coronary heart disease, with one potential mechanism involving BP27.

Epidemiological and genetic studies suggest that increased height is associated with decreased risk of CAD6,23,28. In a meta-analysis of European ancestry participants, genetically predicted increased height was associated with decreased risk of CAD (OR = 0.88, 95% CI [0.82, 0.95], p-value < 1.00 × 10–3)28. Similar findings were reported in several MR studies6,23. CAD is a broad category including diseases such as ischemic heart disease, myocardial infarction and coronary atherosclerosis. For instance, Ischemic heart disease (411) (OR = 0.948, 95% CI [0.942, 0.954], p-value = 1.03 × 10–56, het p-value = 6.53 × 10–3), and Myocardial infarction (MI) (411.2) (OR = 0.93, 95% CI [0.92, 0.94], p-value = 3.54 × 10–41, het p-value = 1.32 × 10–1) (Supplementary Data 6) were identified as significant among the cardiovascular diseases, with similar effects in the sex-stratified meta-PheWAS; all these have been confirmed in previous studies8,29.

In the endocrine/metabolic category, several health-related outcomes were identified. Our study identified decreased risk of Hyperlipidemia (272.1) (OR = 0.942, 95% CI [0.936, 0.947], p-value = 4.04 × 10–86, het p-value = 1.34 × 10–4) and Hypercholesterolemia (272.11) (OR = 0.946, 95% CI [0.939, 0.953], p-value = 5.67 × 10–55, het p-value = 4.17 × 10–2) (Supplementary Data 6), with similar effect at the sex-stratified meta-PheWAS. These findings have also been reported by MVP8, and in a Korean population30,31. Our meta-analysis confirmed the well-established association between 1 SD increase in genetically predicted height and decreased risk of Type 2 diabetes (T2D) (250.2) (OR = 0.98, 95% CI [0.97, 0.99], p-value = 2.27 × 10–11, het p-value = 9.11 × 10–3) (Supplementary Data 6)32,33. In addition, we observed an association between increased genetically predicted height and the increased risk of Hypothyroidism (244) (OR = 1.022, 95% CI [1.014, 1.031], p-value = 8.58 × 10–8, het p-value = 3.30 × 10–1) (Supplementary Data 6). This is an interesting insight towards the known epidemiological links between hypothalamic-pituitary-thyroid (HPT) axis dysregulation and stature34.

Several health outcomes from the musculoskeletal category were associated with genetically predicted height. Acquired foot deformities (735) (OR = 1.06, 95% CI [1.05, 1.07], p-value = 1.11 × 10–37) were associated with higher genetically predicted height, with strong evidence of heterogeneity across cohorts (het p-value = 2.62 × 10–10) (Supplementary Data 6). In the present study, EA descent individuals presented the strongest signal in MVP, followed by eMERGE and in AA only in MVP (Supplementary Fig. 6). We found this association in males only, which is supported by a previous study reporting foot deformities to be significantly more prevalent in male veterans versus male non-veterans in USA35. In contrast, Osteoarthritis; localized (740.1) (OR = 1.033, 95% CI [1.026, 1.039], p-value = 3.13 × 10–22, het p-value = 4.88 × 10–2) (Supplementary Data 6) was found to have a similar effect in both males and females. This finding is supported by the MVP PheWAS8 and is widely supported in the epidemiological literature, that taller individuals have an increased risk of knee osteoarthritis, that remained significant for both sexes, after adjusting for confounders36. A recent meta-analysis of GWAS studies for osteoarthritis, in Icelanders and European ancestry from UKB, found that a large proportion of osteoarthritis risk variants are associated with height37.

We identified several notable associations in the neoplasms category. There has been a significant body of literature studying the association between height and risk of breast cancer (BC) and the results are controversial. Several PheWAS and MR studies reported null associations between height PGS and BC38,39. In contrast, several studies, including ours, confirm the association of height and risk of BC. An observational study, using data from EPIC and the Women’s Health Initiative (WHI) in the USA, observed that for every 10 cm increase in height there was an 18% increased risk of ER + BC; null association was found for ER- BC40. Another observational study, analysing post-menopausal women from the Netherlands Cohort Study (1986-2006), observed that for every 5 cm increase in height there was a 7% increased risk of BC (95% CI: 1.01–1.13); an association that remained significant for the ER + BC but not for ER- BC41.

We observed an attenuated, non-significant association, between increased genetically predicted height and Colorectal cancer (153) (OR = 1.02, 95% CI [1.00, 1.04], p-value = 2.19 × 10–2, het p-value = 6.89 × 10–1) (Supplementary Data 6). This finding contrasts with the majority of PheWAS and MR studies that describe an association between increased adult height and increased risk of colorectal cancer11,12,42.

We identified a significant association between increased genetically predicted height and decreased risk of Hyperpotassemia (276.13) (OR = 0.95, 95% CI [0.93, 0.97], p-value = 1.23 × 10–6, het p-value = 9.46 × 10–1) (Supplementary Data 19) in males. Additionally, increased genetically predicted height was associated with 3 traits from the mental disorders category in the male meta-PheWAS: Pervasive developmental disorders (313) (OR = 1.06, 95% CI [1.03, 1.09], p-value = 6.11 × 10–6, het p-value = 5.25 × 10–3), Attention deficit hyperactivity disorder (ADHD) (313.1) (OR = 1.06, 95% CI [1.03, 1.09], p-value = 2.35 × 10–5, het p-value = 3.93 × 10–1) and Autism (313.3) (OR = 1.215, 95% CI [1.122, 1.316], p-value = 1.64 × 10–6, het p-value = 2.58 × 10–1) (Supplementary Data 19). These traits showed concordant associations in the sex-combined meta-PheWAS but null associations in the female meta-PheWAS. Previous PheWAS provided suggestive support for these findings, with the exception of autism8. A study by Yackobovitch-Gavan et al.43, employing data from Israel Clalit Health Services, reported that drug treatment for ADHD was associated with a greater decline in height z-score in boys than in girls, with 66% of the participants being boys. Additionally, two studies in the US, one in children44 and one in both children and adolescents45, reported a decline in height z-scores for patients using stimulants, a finding confirmed by a study in the Netherlands46. However, these studies examined participants who were medicated. Nevertheless, there is evidence suggesting that more males than females are diagnosed, which is in accordance with our results. Our results relating to autism are in accordance with the literature; a study in Spanish pre-school children showed that children with autism spectrum disorder (ASD) were taller than children with typical development47. Additionally, in Australia, male babies with ASD were born smaller but grew taller than children with typical development48. Therefore, for these disease traits, it seems that males drive the association.

In males, 10 phenotypes displayed significant associations with height PGS in European descent individuals only; 3 of them belong to the mental disorders category: Anxiety disorders (300) (OR = 0.98, 95% CI [0.97, 0.99], p-value = 3.21 × 10–5, het p-value = 1.40 × 10–1), Posttraumatic stress disorder (300.9) (OR = 0.97, 95% CI [0.96, 0.98], p-value = 2.08 × 10–7, het p-value = 2.86 × 10–1) and Substance addiction and disorders (316) (OR = 0.96, 95% CI [0.95, 0.98], p-value = 5.85 × 10–6, het p-value = 9.10 × 10–1) (Supplementary Data 16).

Seven traits were identified as significant in the female meta-PheWAS for European descent individuals and not in the cross-ancestry analysis. Amongst them, increased height PGS was associated with decreased risk of Osteoporosis (743.1) (OR = 0.93, 95% CI [0.90, 0.96], p-value = 6.58 × 10–6, het p-value = 9.65 × 10–1) (Supplementary Data 29). Post-menopausal European ancestry females had an increased risk of osteoporotic fractures, in contrast to African and Asian ancestries49,50.

Our study had several important limitations. Although we used the recently published cross-ancestry GWAS from GIANT, the study populations were predominantly of European ancestry. Thus, we observed a poorer prediction performance of the height PGS in our study populations that were ancestrally diverse, diminishing the power in populations with substantial non-European admixture. It is also possible that some of the signals observed may be driven by differences in phenotype prevalence across cohorts. The differences in sample size by sex and ancestry complicate interpretation of differences across these strata. This limitation is not new for genetic studies but likely limits our inference on true sex and ancestry differences in the phenotype associations with genetically predicted height at phenome-wide significance. We included all available cohort data as a discovery meta-analysis to increase power. Trait associations with genetically predicted height may be particularly influenced by indirect genetic effects and assortative mating. A recent study showed that population estimates are larger than within-sibship meta-analysis GWAS estimates for height51. The authors presented strong evidence of polygenic adaptation on taller height in European ancestry individuals, suggesting that demographic effects, such as assortative mating, could vary between populations51,52. Additionally, previous work in the UK Biobank has reported an association between stature and socio-economic status in both sexes, therefore this could serve as a mediator of the reported associations rather than the actual direct effect of height6. Lastly, we did not consider obvious reasons for differences across studies, sex, and ancestry. Social factors have a powerful influence on many of the phenotype-genetically predicted height associations described herein. 
By including data from diverse populations in future investigation of the role of genetically predicted height across the phenome, future research might be able to address the limitations of this study. It might be possible to better understand the genetic and environmental factors that affect height by more broadly interpreting the results. The study’s power would be enhanced, and more precise results would be produced by expanding the sample size and providing more in-depth information on lifestyle factors. Finding associations between height and disease using data from different ancestries would improve the generalizability of our findings and offer a more thorough understanding of the genetic and environmental factors affecting height and disease risk. Additional approaches could include carrying out population-specific studies, which would enable the investigation of height-disease relationships in particular ethnic groups. This could be accomplished by enlisting volunteers from particular ethnic groups and gathering thorough data on disease outcomes, height, and other pertinent covariates like lifestyle variables. In the process of creating new treatments and preventative measures for a variety of diseases, this could assist in the identification of novel genetic variants and pathways.

Methods

PheWAS is used to identify the effects of genetic variation already associated with a trait of interest across a larger array of phenotypes, using a hypothesis-free approach with no prior assumptions53. We employed Bonferroni correction to determine statistical significance. Although this correction is conservative, our large sample size facilitated the replication of known associations and even the discovery of new ones54.
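As a worked illustration of the Bonferroni correction, the sketch below reproduces the thresholds quoted in the Results from the numbers of traits tested. The family-wise error rate of 0.05 is an assumption inferred from those thresholds, and the function and variable names are hypothetical.

```python
# Bonferroni threshold: family-wise alpha divided by the number of tests.
ALPHA = 0.05  # assumed family-wise error rate, inferred from the reported thresholds


def bonferroni_threshold(n_tests: int, alpha: float = ALPHA) -> float:
    """Return the per-test significance threshold for n_tests tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Numbers of traits interrogated in each cross-ancestry meta-PheWAS (from the Results).
traits_tested = {"sex-combined": 1768, "males": 1582, "females": 1499}
thresholds = {name: bonferroni_threshold(n) for name, n in traits_tested.items()}

for name, threshold in thresholds.items():
    print(f"{name}: p < {threshold:.2e}")
```

Formatted to three significant figures, these values match the thresholds of 2.83 × 10–5, 3.16 × 10–5 and 3.34 × 10–5 reported for the sex-combined, male and female analyses.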

To assess the associations of the PGS with hospital-record data, we used the PheWAS library53 implemented in R55. The package converts International Classification of Diseases (ICD) codes to ‘PheWAS codes’ or phecodes, which represent 1866 phenotypes in total formed from grouped ICD codes using the “Phecode Map 1.2 ICD-10-CM” (https://phewascatalog.org/phecodes_icd10cm). For each phenotype, cases are paired with appropriate controls, meaning that participants who have a disease similar to the case phenotype are excluded from the control group. For instance, if the phenotype under investigation is T2D, then participants who have T1D are excluded from the control group. This built-in exclusion feature, which prevents contamination of the controls, is essential to preserve statistical power to identify associations53,56. The phecodes are divided into 17 distinct categories: circulatory system, endocrine/metabolic, mental disorders, neurological, respiratory, infectious diseases, neoplasms, hematopoietic, sense organs, digestive, genitourinary, pregnancy complications, dermatologic, musculoskeletal, congenital anomalies, symptoms and injuries & poisonings56. Next, binary logistic regression models are employed to examine the association of the exposure, the PGS of height (independent variable), with each phecode as the trait of interest (dependent variable). In each study population, we adjusted for age, sex and genotype batch to reduce model variability. Each study population (described in Supplementary Information) also adjusted for principal components for ancestry to control for confounding via population stratification (details per study on ancestry determination and exclusion in Supplementary Data 1A).
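The two steps described above, control exclusion followed by logistic regression on the PGS, can be sketched as follows. This is a minimal illustration and not the PheWAS R package itself: the exclusion map, the toy participants and all function names are hypothetical, and the covariates (age, sex, genotype batch, principal components) are omitted to keep the example short.

```python
import math

# Hypothetical exclusion map (illustration only): phecode -> related phecodes that
# bar a participant from serving as a control for that phenotype.
CONTROL_EXCLUSIONS = {"250.2": {"250.1"}}  # T2D controls must not have T1D


def case_control_status(person_phecodes, target):
    """Return 1 for a case, 0 for a control, None if excluded from the controls."""
    if target in person_phecodes:
        return 1
    if person_phecodes & CONTROL_EXCLUSIONS.get(target, set()):
        return None
    return 0


def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns (b1, SE of b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step with the 2x2 inverse written out explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 from the information matrix of the final iteration,
    # which is effectively the matrix at the converged estimate.
    return b1, math.sqrt(h00 / det)


# Toy data: (set of phecodes, scaled PGS) per participant.
participants = [
    ({"250.2"}, 1.2), ({"250.2"}, -0.3), ({"250.1"}, 0.5), (set(), -1.0),
    (set(), 0.4), ({"401"}, -0.6), ({"250.2"}, 0.8), (set(), 1.0),
]
statuses = [case_control_status(codes, "250.2") for codes, _ in participants]
kept = [(pgs, s) for (_, pgs), s in zip(participants, statuses) if s is not None]
beta, se = fit_logistic([pgs for pgs, _ in kept], [s for _, s in kept])
odds_ratio = math.exp(beta)  # OR per one-unit increase in the scaled PGS
print(f"OR = {odds_ratio:.2f}, SE(beta) = {se:.2f}")
```

The participant with T1D (250.1) is neither a case nor a control for T2D, so it is dropped before fitting. In a real analysis the model would also include the covariates listed above.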

Details regarding compliance with all relevant ethical regulations including the Declaration of Helsinki can be found in the information and references for each participating cohort below. The PheWAS and meta-PheWAS summary statistics results that are discussed in the manuscript are included in the Supplementary Data 3–57.

Polygenic score

We performed a conditional and joint analysis (GCTA-COJO) to select quasi-independent height-associated SNPs for the construction of the PGS57,58. A stepwise procedure was used for SNP selection and the joint effects of all selected SNPs were estimated after the model was optimized. The selected genetic variants remain genome-wide significant and independent, and they explain more variance than the lead SNP at each locus alone. This conditional analysis was performed in the recent cross-ancestry GWAS for adult height, excluding data from 23andMe14, using 50,000 unrelated and randomly sampled European participants of UKB as the LD reference panel. We used a p-value threshold of p = 5 × 10–9 to declare a genome-wide significant hit. SNPs with allele frequency differences larger than 0.2 compared with a UKB reference panel were excluded from the analysis, along with SNPs having MAF ≤ 0.001. The GCTA-COJO analysis resulted in a list of 6797 SNPs. We adjusted for age, sex and genotype batch to reduce model variability. We also adjusted for principal components for ancestry to control for confounding via population stratification (Supplementary Data 1B). The PGS of height was constructed as the unweighted sum of the height-increasing alleles within each study (Supplementary Data 1) and was afterwards scaled (using the scale function in R).
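The construction of the unweighted score can be illustrated with a short sketch: count the height-increasing alleles per participant, then centre and scale the counts. The genotype coding (alternate-allele dosages of 0, 1 or 2), the toy data and the function names are assumptions made for illustration; the standardisation uses the sample standard deviation, as R's scale function does by default.

```python
import statistics


def unweighted_pgs(genotypes, alt_increases_height):
    """Count height-increasing alleles per participant.

    genotypes: per-participant lists of alternate-allele dosages (0, 1 or 2), one per SNP.
    alt_increases_height: per-SNP flag; False means the reference allele raises height.
    """
    scores = []
    for dosages in genotypes:
        total = 0
        for dosage, alt_up in zip(dosages, alt_increases_height):
            # Flip the dosage when the reference allele is the height-increasing one.
            total += dosage if alt_up else 2 - dosage
        scores.append(total)
    return scores


def standardize(values):
    """Centre to mean 0 and scale to sample SD 1 (n - 1 denominator)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Toy example: 4 participants, 3 SNPs (hypothetical data).
genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [2, 2, 2]]
alt_increases_height = [True, False, True]

raw = unweighted_pgs(genotypes, alt_increases_height)
scaled = standardize(raw)
print(raw)
print([round(s, 3) for s in scaled])
```

Because the score is standardised within each study, a one-unit change in the scaled PGS corresponds to one standard deviation of the allele count in that study.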

All herein reported ORs are per one standard deviation increase in PGS.
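Since each OR is expressed per standard deviation of the PGS, it corresponds to the exponentiated logistic regression coefficient for the scaled score. The sketch below shows how a reported OR and its 95% CI map back to a log-odds estimate and standard error, which is the form needed for meta-analysis. It assumes a Wald-type interval that is symmetric on the log scale; the function names are hypothetical and the example values are the CVI estimate from the Discussion.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def or_ci_to_beta_se(odds_ratio, ci_low, ci_high):
    """Convert an OR and its 95% CI to a log-odds estimate and standard error."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return beta, se


def beta_se_to_or_ci(beta, se):
    """Convert a log-odds estimate and standard error back to an OR and 95% CI."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def or_for_sd_change(odds_ratio, n_sd):
    """OR for a change of n_sd standard deviations in the PGS."""
    return odds_ratio ** n_sd


# Chronic venous insufficiency: OR = 1.16, 95% CI [1.14, 1.18] per SD of the PGS.
beta, se = or_ci_to_beta_se(1.16, 1.14, 1.18)
print(f"beta = {beta:.4f}, SE = {se:.4f}")
print(f"OR for a 2 SD increase = {or_for_sd_change(1.16, 2):.3f}")
```

Because published ORs and CIs are rounded, the recovered standard error is approximate.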

Meta-analysis

Meta-analysis is a popular statistical technique used to increase the power to detect new effects by combining the information from independent studies. In addition, heterogeneity among the studies can be assessed using the beta estimates and standard errors from each study. For a small number of similar studies, the most common technique is the fixed-effect inverse variance weighted meta-analysis, which assumes that a common underlying effect exists for all studies59,60. We performed a meta-PheWAS, combining in a fixed-effect meta-analysis the UKB, MVP, BioVU, BioMe, MyCode and eMERGE cohorts, using the phecodes derived from the PheWAS in each cohort (Supplementary Data 3–14). For the sex-specific analysis, we employed data from the UKB, MVP, BioVU and BioMe (Supplementary Data 15–38). The examined groups were European, African and East Asian ancestries and Hispanic populations, and the sample sizes per ancestry and per study are included in Table 4. For more details the reader is referred to Supplementary Data 1 and 2.
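A minimal sketch of the fixed-effect inverse variance weighted estimator and Cochran's Q statistic is given below. The study used the metafor library in R; this stand-alone version only illustrates the arithmetic, and the cohort estimates in the example are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Fixed-effect inverse variance weighted meta-analysis with Cochran's Q."""
    if len(betas) != len(ses) or len(betas) < 2:
        raise ValueError("need matching beta/SE lists from at least two studies")
    weights = [1.0 / se ** 2 for se in ses]           # inverse-variance weights
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))      # two-sided normal p-value
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return {"beta": beta, "se": se, "or": math.exp(beta),
            "p": p_value, "Q": q, "df": len(betas) - 1}


# Hypothetical per-cohort log-odds estimates and standard errors for one phecode.
cohort_betas = [0.15, 0.12, 0.20]
cohort_ses = [0.02, 0.03, 0.08]

result = fixed_effect_meta(cohort_betas, cohort_ses)
print(f"pooled OR = {result['or']:.3f}, p = {result['p']:.2e}, "
      f"Q = {result['Q']:.2f} on {result['df']} df")
```

Under the hypothesis of a common effect, Q follows a chi-square distribution with k − 1 degrees of freedom for k studies; the heterogeneity p-value is left out here to keep the sketch within the standard library.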

Table 4 Sample size per ancestry and per study

The sample size and examined number of traits for the sex-combined and sex-specific cross-ancestry meta-PheWAS are detailed in Table 1, and for the specific ancestries in Supplementary Data 2. For the meta-analysis we employed the statistical software R 3.6.1 and the library metafor61.

Replication

Replication analyses were performed in an independent sample of the Colorado Biobank. We also performed a replication PheWAS in the same biobank using a score weighted for the effects of the height-associated SNPs in the GWAS meta-analysis. Details are provided in the Supplementary Material.

UK Biobank (UKB)

The UKB is a prospective cohort of 502,504 participants, aged 40–69 years old, who were recruited between 2006 and 2010. The cohort includes information regarding a variety of phenotypes like blood measurements, clinical assessments, anthropometry, cognitive function, hearing, arterial stiffness, hand grip strength, spirometry, ECG, data on cancer and death registries, health and lifestyle medical conditions, operations, mental health, sociodemographic factors, lifestyle, family history, psychosocial factors and dietary intake, described in more detail elsewhere62. Hospital episode statistics (HES) is a database containing details of all admissions at NHS hospitals in UK, which has been linked to the UKB63.

Million Veteran Program (MVP)

In 2011 the Department of Veterans Affairs (VA) created the MVP, a national cohort across the USA. This cohort was designed as a representative, national and longitudinal study of Veterans for genomic and non-genomic research, employing responses to questionnaires, blood specimens and electronic health records (EHR). The blood specimens were collected for genotyping and were linked to the EHR, in which diagnoses were coded in ICD9 and ICD10 up until September 2019. As expected, most of the participants are male, aged between 50 and 69 years at recruitment. Regarding ethnicity, European Americans and African Americans are well represented; participants of Hispanic and Asian descent are also included64.

The MVP study by Raghavan et al.8 used a different sample from the one used in the current study.

BioVU

The Vanderbilt Institutional Review Board (IRB) approved the creation of the Vanderbilt DNA databank, which collected DNA samples from 2007 until 2010. Over the years, the Vanderbilt University Medical Center has developed a comprehensive electronic medical record (EMR) system that covers all inpatient and outpatient data, such as labs, drug ordering and diagnostic imaging, and contains over 1.4 million records65. Regarding ethnicity, concordance between race assignment and genetic ancestry is high for Europeans and African Americans but lower for Hispanics, East Asians and South Asians66.

BioMe

In 2007 the Institutional Review Board of the Icahn School of Medicine at Mount Sinai approved the construction of the BioMe biobank. This EMR-linked biorepository enrolls participants non-selectively from the Mount Sinai Health System, which serves a diverse group of communities across the greater New York City area. At enrolment, participants provided informed consent to link their DNA and plasma samples to their EMR. This is further complemented by a questionnaire on demographic and lifestyle factors. At present, the cohort comprises over 60,000 participants, 58% of whom are female; participants were aged between 18 and 89+ years at recruitment. Regarding ethnicity, European Americans, African Americans and Hispanics are well represented67.

Geisinger’s MyCode Community Health Initiative Study (MyCode)

The Geisinger Health System (GHS) serves a large and stable population of participants in Pennsylvania through more than 70 care facilities. In 2007 GHS initiated the MyCode Community Health Initiative (MyCode) to create a biobank of blood, serum and DNA samples along with genotype and exome sequence data. These data were linked to the EMR data for research purposes. By 2015, MyCode reported more than 90,000 participants and an ongoing monthly enrolment of around 2000, across the age spectrum (0 to >89 years old). Regarding ethnicity, more than 95% of the population self-identify as white or European American68.

Electronic Medical Records and Genomics (eMERGE) network

The electronic MEdical Records and GEnomics (eMERGE) Network was established in 2007 by the National Human Genome Research Institute (NHGRI) to employ EHR for genomic research purposes. Today, the eMERGE Network includes nine research groups across the US that have linked DNA samples to EHR. The majority of the studied participants have European ancestry, but participants of African, Asian and Hispanic descent are also included in smaller proportions69,70.

Colorado Center for Personalized Medicine (CCPM Biobank)

The biobank at the Colorado Center for Personalized Medicine (CCPM Biobank) was jointly developed by the University of Colorado Anschutz Medical Campus and UCHealth to serve as a unique, dual-purpose research and clinical resource accelerating personalized medicine. As a resource comprising electronic health records (EHRs), genotype data, and other integrated data sources (e.g., geocoded data and survey data), the CCPM Biobank had more than 200,000 enrolled participants and 33,674 genotyped participants as of March 2022. The latter formed the freeze 2 research dataset. More details about the CCPM Biobank are described in Wiley et al. 71.
