Analysis of exonic deletions in a large population study provides novel insights into NRXN1 pathology

Introduction

Larger genomic deletions in the NRXN1 locus have been associated with a highly increased risk of mental disorders and, in particular, schizophrenia. However, the locus is known to harbour highly heterogeneous CNVs (Copy Number Variations, deletions and duplications) and, moreover, no population-based estimates of risk are available. Here, we use the iPSYCH2015 case-cohort sample to investigate the population prevalence and phenotypic consequences of specific types of deletions within the locus.

Neurexins are a family of highly conserved transmembrane proteins strongly involved in the development and function of neuronal synapses1. Like all mammals, humans possess three genes encoding different neurexin proteins (NRXN1-3)2. All three genes encode two main protein isoforms, alpha and beta1, and are almost exclusively expressed in neuronal tissue3,4. Notably, hundreds of splicing isoforms are expressed in humans and mice, many of which are specific to certain neuronal cell types1,5,6. Neurexin proteins are expressed by neurons at the presynaptic nerve terminal and their expression peaks around birth1. Among other ligands, neurexins bind to the calcium/calmodulin-dependent serine protein kinase (CASK) scaffolding molecules, contributing to the coupling of Ca2+ channels to synaptic release machinery1,7.

NRXN1 is a 1.3 Mbp gene located on the short arm of chromosome 2 (GRCh38:49,918,503–51,225,575)8. Among the three neurexin genes, NRXN1 is the most studied with respect to association with disease9,10,11. Multiple case-control studies have associated exonic deletions with increased risk of neurodevelopmental disorders, including schizophrenia (odds ratio (OR): 4.5; 95% CI: 2.0–10.9)12, autism spectrum disorder (OR: 7.2; 95% CI: 0.9–326)13, attention-deficit/hyperactivity disorder (OR: 4.68; 95% CI: 1.82–10.64)14, depression (OR: 2.01; 95% CI: 1.18–3.19)15, intellectual disability and/or developmental delay (OR: 8.14; 95% CI: 2.91–22.7)16, epilepsy (OR: 9.91; 95% CI: 1.92–51.1)17, and Tourette syndrome (OR: 20.3; 95% CI: 2.6–156)18. Deletions at the 5′ end of the gene are observed more commonly than in the rest of the gene9. To our knowledge, duplications and intronic deletions have not been strongly associated with disease risk in previous studies and, in general, appear to be far less studied than exonic deletions of NRXN1.

CNVs in the NRXN1 locus are non-recurrent, meaning that they result from unrelated de novo mutations that do not share fixed breakpoints, and their mutational mechanism differs from the non-allelic homologous recombination (NAHR) mediated by low-copy repeat (LCR) sequence elements19. One possible explanation for this genomic instability is that the NRXN1 locus, like other large genes, is a late-replicating region and therefore more prone to mutations resulting from stress-induced replication errors20.

As was also the case for rare recurrent CNVs (such as 22q11.2 deletions and 16p11.2 duplications), NRXN1 deletions were originally associated with high risk of disease in single case studies or small collections of cases21,22,23,24, followed by larger case-control studies that were also based on highly selected samples (e.g., cases with severe or long-term illness and controls screened for any family history of mental illness)11,13,16,17,18,25. However, recent research on recurrent CNV loci in larger and more population-representative study samples suggests that associations obtained using selected case-control samples tend to overestimate the disease risk, owing largely to an underestimation of the prevalence of recurrent CNVs in the general population26,27,28.

In this study, we use the unique design of the iPSYCH201529 case-cohort study to provide population-representative estimates of the prevalence of NRXN1 deletions, and the associated risk of attention-deficit/hyperactivity disorder (ADHD), major depressive disorder (MDD), schizophrenia spectrum disorder (SSD), autism spectrum disorder (ASD), and bipolar disorder (BPD). We assess the risk of any deletion in the NRXN1 locus as well as that of different subgroups (including non-exonic ones). Moreover, we show that a significant proportion of intronic deletions in the locus is segregating in the population and may be associated with an increased risk of some psychiatric disorders.

Methods

Study design, phenotypes, and genotyping

This study is based on the iPSYCH2015 case-cohort sample29, an expanded version of iPSYCH2012, which has been previously described in detail30. In brief, the base population is defined as all 1,657,449 singleton births that occurred in Denmark between May 1, 1981, and Dec 31, 2008, who were alive and residing in Denmark on their first birthday and had a mother registered in the Danish Civil Registration System31. From the base population, all persons who received a diagnosis of a major psychiatric disorder (as specified below) no later than Dec 31, 2015, were included in the case sample, N = 92,531 individuals. Then, a randomly selected population-representative cohort of N = 50,615 individuals was drawn from the base population, including 3030 who overlapped with the case sample. Individual diagnosis sample counts are as follows: SSD (ICD10 F20–F29; n = 16,008), MDD (ICD10 F32–F33 and ICD 8 296.09, 296.29, 298.09, and 300.49; n = 37,555), ASD (ICD10 F84; n = 24,975), and ADHD (ICD10 F90; n = 29,668).

We also assessed three other brain disorders with prior evidence of association with NRXN1 deletions16,17,18: intellectual disability (ID), epilepsy, and Tourette syndrome (TS). For these, we used information on hospital diagnoses that had been obtained through the Danish Psychiatric Central Research Register32 and the Danish National Patient Registry33 for other iPSYCH2015 studies. The diagnostic codes used to identify individuals with these disorders were as follows: ID (ICD10: F70-F79; ICD8: 311-315), epilepsy (ICD10: G40; ICD8: 345 (excluding 345.29)), TS (ICD10: F95.2).

Supplementary Table 2 provides carrier counts for each diagnosis, as well as a breakdown by subcohort (iPSYCH2012 or iPSYCH2015i) and gender.

Genotyping was performed using Illumina microarrays and has been described elsewhere30. Notably, the genotyping was performed on dried blood spot samples taken at birth. iPSYCH2012 and the additional extension (iPSYCH2015i) were genotyped using two different arrays, PsychArray version 1.0 and Global Screening Array version 2 (GSA), respectively. B allele frequency (BAF) and logR ratio (LRR) values were extracted using GenomeStudio and samples with a genotyping call rate below 95% were excluded.

CNV calling and pre-processing

CNVs were called using PennCNV34 as described in our previously published CNV calling and processing protocol35. All steps of the calling pipeline were run using the Singularity container provided in the protocol. In brief, the intensity files were filtered to include only biallelic autosomal SNPs mapping uniquely to the Haplotype Reference Consortium (HRC) hg19 reference map36, with a minor allele frequency of at least 0.1%, which yielded 280,700 and 509,754 probes for the PsychArray and GSA, respectively. Next, PennCNV calls were obtained with the script “detect_cnv.pl”, setting the minimum number of probes (--minsnp) to 5 and the minimum length (--minlength) to 1000 bp. We then merged adjacent calls with the PennCNV script “clean_cnv.pl” using the settings “--fraction 0.2 --bp”, whereby two calls are merged if the gap between them corresponds to less than 20% of the combined length (in base pairs) of the calls. After CNV calling, we excluded samples with high levels of noise from the analysis: samples were excluded if they had an LRR standard deviation ≥ 0.35, BAF drift ≥ 0.005 or |GCWF| ≥ 0.02.
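The sample-level noise filter described above can be sketched in a few lines. This is a minimal illustration in Python rather than the protocol's own code; the class name, field names and example values are hypothetical, and only the three thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; a sample is excluded if ANY metric reaches its limit.
MAX_LRR_SD = 0.35
MAX_BAF_DRIFT = 0.005
MAX_ABS_GCWF = 0.02


@dataclass
class SampleQC:
    sample_id: str    # hypothetical identifier
    lrr_sd: float     # standard deviation of the logR ratio
    baf_drift: float  # B allele frequency drift
    gcwf: float       # GC wave factor (signed)


def passes_qc(sample: SampleQC) -> bool:
    """Return True when all three noise metrics are strictly below their limits."""
    return (
        sample.lrr_sd < MAX_LRR_SD
        and sample.baf_drift < MAX_BAF_DRIFT
        and abs(sample.gcwf) < MAX_ABS_GCWF
    )


# Made-up example values, for illustration only.
samples = [
    SampleQC("s1", lrr_sd=0.20, baf_drift=0.001, gcwf=0.005),
    SampleQC("s2", lrr_sd=0.40, baf_drift=0.001, gcwf=0.005),   # noisy LRR
    SampleQC("s3", lrr_sd=0.20, baf_drift=0.001, gcwf=-0.030),  # strong GC wave
]
kept = [s.sample_id for s in samples if passes_qc(s)]
print(kept)
```

The GC wave factor is compared by absolute value because the text states the limit as |GCWF| ≥ 0.02.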

The locus of interest was defined as the NRXN1 gene in Ensembl8 GRCh37 (https://grch37.ensembl.org/Homo_sapiens/Info/Index) plus 0.5 Mbp upstream and downstream of the gene boundaries (chr2:49 645 643-51 759 674). Any CNV call overlapping the region by at least 0.1% of its length was selected for visual validation using the function “select_stich_calls()” from the R package QCtreeCNV35; this step also removed CNVs spanning fewer than 10 SNPs. Visual inspection was performed independently by two analysts as already described35. The boundaries of true CNVs were manually adjusted if necessary, and any call on which the analysts disagreed was re-evaluated in a final joint session.
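As a rough illustration of the selection rule, the sketch below computes the fraction of a call that falls inside the locus and keeps calls reaching the 0.1% threshold. It is not the QCtreeCNV implementation; the function names and the example calls are hypothetical, and coordinates are assumed to be inclusive base-pair positions on GRCh37.

```python
# Locus boundaries quoted in the text (chr2, GRCh37).
LOCUS_CHROM = "chr2"
LOCUS_START = 49_645_643
LOCUS_END = 51_759_674
MIN_OVERLAP_FRACTION = 0.001  # 0.1% of the call's own length


def overlap_fraction(start: int, end: int) -> float:
    """Fraction of the call [start, end] (inclusive) that lies within the locus."""
    overlap = min(end, LOCUS_END) - max(start, LOCUS_START) + 1
    length = end - start + 1
    return max(overlap, 0) / length


def select_for_validation(calls):
    """Keep (chrom, start, end) calls that overlap the locus by at least the threshold."""
    return [
        call for call in calls
        if call[0] == LOCUS_CHROM
        and overlap_fraction(call[1], call[2]) >= MIN_OVERLAP_FRACTION
    ]


# Made-up calls, for illustration only.
calls = [
    ("chr2", 50_000_000, 50_100_000),  # fully inside the locus
    ("chr2", 49_600_000, 49_700_000),  # straddles the left boundary
    ("chr2", 49_000_000, 49_600_000),  # entirely outside
    ("chr3", 50_000_000, 50_100_000),  # wrong chromosome
]
selected = select_for_validation(calls)
print(selected)
```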

CNV analysis

The genomic coordinates of NRXN1 exons and transcripts were extracted using Ensembl8 GRCh37 (https://grch37.ensembl.org/Homo_sapiens/Info/Index). We decided to focus on protein-coding transcripts only and thus selected all 9 transcripts with a protein match in UniProt37 (https://www.uniprot.org/), yielding a total of 41 unique exons.

Under the assumption that exons mapping close to each other are likely to be deleted by the same CNVs, we investigated if any larger pattern was present at the level of the whole gene. We computed a genomically ordered correlation matrix across all exons, defined as an N × N matrix where N is the number of exons and the cell xy is the number of times a CNV affecting exon x also affects exon y.
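The matrix defined above is a co-deletion count matrix, and a minimal Python sketch of its construction follows. The exon and deletion coordinates are toy values, and the function name is hypothetical; only the definition of cell xy is taken from the text.

```python
def codeletion_matrix(exons, deletions):
    """Count, for every exon pair (x, y), the deletions that affect both exons.

    exons: list of (start, end) tuples, already sorted by genomic position.
    deletions: list of (start, end) tuples.
    """
    n = len(exons)
    matrix = [[0] * n for _ in range(n)]
    for d_start, d_end in deletions:
        # Indices of exons that overlap this deletion by at least one base.
        hit = [
            i for i, (e_start, e_end) in enumerate(exons)
            if d_start <= e_end and e_start <= d_end
        ]
        for x in hit:
            for y in hit:
                matrix[x][y] += 1
    return matrix


# Toy coordinates, for illustration only.
exons = [(100, 200), (300, 400), (1000, 1100)]
deletions = [(50, 450), (350, 1200), (150, 180)]
matrix = codeletion_matrix(exons, deletions)
for row in matrix:
    print(row)
```

The diagonal holds the number of deletions affecting each exon, so blocks of high off-diagonal counts correspond to exons that tend to be deleted together.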

CNVs are not equally distributed across the locus. We explored this using an IOU matrix, defined as an N × N matrix where N is the number of CNVs (381) and the cell xy is the IOU (Intersection Over the Union) score for CNVs x and y. The IOU is 1 for two identical segments and lies between 0 and 1 for any two overlapping segments, whereas for non-overlapping segment pairs it falls below 0 and approaches an asymptote at −1 the farther apart the two segments are. We then subgrouped exons into “alpha” and “beta” regions, based on Fig. 1d and previous literature38, corresponding to exons ENSE00001682911 to ENSE00002460080 (beta), and exons ENSE00002453754 to ENSE00001547151 (alpha). For the purpose of the secondary analysis (Table 1), deletions affecting exons from both groups were assigned to “alpha”.
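One formula that reproduces the behaviour described for the IOU score (1 for identical segments, between 0 and 1 for overlapping ones, negative and tending to −1 for distant ones) is the signed intersection divided by the total span. The Python sketch below uses that formula as an assumption; the text does not give the exact expression, and the function names are hypothetical.

```python
def signed_iou(a, b):
    """Signed intersection over union for two segments given as (start, end).

    For overlapping segments this equals the usual IOU. For non-overlapping
    segments the intersection term is the negative of the gap length, so the
    score is negative and tends to -1 as the gap grows.
    """
    intersection = min(a[1], b[1]) - max(a[0], b[0])
    union_span = max(a[1], b[1]) - min(a[0], b[0])
    return intersection / union_span


def iou_matrix(segments):
    """Full N x N matrix of pairwise scores."""
    return [[signed_iou(x, y) for y in segments] for x in segments]


# Toy segments, for illustration only.
identical = signed_iou((0, 100), (0, 100))
overlapping = signed_iou((0, 100), (50, 150))
adjacent = signed_iou((0, 100), (100, 200))
distant = signed_iou((0, 100), (10_000, 10_100))
print(identical, overlapping, adjacent, distant)
```

Keeping a negative score for non-overlapping pairs preserves distance information in the heatmap, which a plain IOU (0 for every non-overlapping pair) would discard.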

Fig. 1: NRXN1 deletions similarity matrices, and NRXN1 correlation matrix. Note that the NRXN1 gene is encoded on the reverse strand, meaning the alpha promoter region (5′ of the gene) is shown on the right in this figure (see panel c for a breakdown of the gene structure).

a Similarity heatmap for all deletions in the neurexin locus. Similarity is measured as IOU (intersection over the union), as described in the methods. Each row represents a deletion. Deletions are ordered on the x-axis based on the genomic position of the respective centre. Note that the scale is not linear as CNVs are not distributed equally across the locus. b Positional similarity for intronic deletions. This makes more evident the large group of very homogeneous deletions (marked with the orange bar on the x-axis). This group is referenced as segregating in the main text. c Distribution of the centre position for all exonic deletions in the NRXN1 gene locus. A schematic of the main gene isoform is aligned below the x-axis. The green and red bars mark the two exonic groups described in (d). d Exon correlation matrix. Exons are ordered based on genomic location. Note that the scale is not linear as exons are not distributed equally across the locus (see c). The red bar marks the exons in the “alpha” group and the green in the “beta” group. e A different view on the NRXN1 gene, the top blue graph shows all exons used in the study, while the bottom shows the top isoform.

Table 1 NRXN1 deletions and associated risk of psychiatric disorders

Segregating deletion analysis

The coordinates of the segregating NRXN1 deletion found in Rujescu et al.25 were lifted over from hg18 to hg19 using the online tool LiftOver (https://genome.ucsc.edu/cgi-bin/hgLiftOver).

To identify SNPs in high linkage disequilibrium (LD) with the segregating deletion, we performed an association analysis (using the “--assoc” command in PLINK39,40 with default settings, Supplementary Fig. 2) in which we compared the 100 identified carriers with 5000 randomly drawn non-carriers, across all SNPs with MAF > 0.01 and info > 0.95 mapping to chromosome 2, using an imputed genotype dataset of the iPSYCH201541. We then clumped the resulting SNPs with the following settings: --clump-p1 0.00001 --clump-r2 0.8 --clump-kb 1000000.

The phased genotypes of the top 10 SNPs (shown in Supplementary Table 1) were imported in R. Here we constructed all possible haplotypes of length between two and five SNPs and tested their association with the deletion carriers using the R function fisher.test(). The haplotypes with an OR ≥ 2 and a p-value ≤ 0.0001 were further tested using the function roc() from the R package pROC42 to get the AUC (Area Under the Curve) value.
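The haplotype screen can be illustrated with a short Python sketch. It assumes that a candidate haplotype may be built from any subset of two to five of the ten SNPs (the text does not say whether SNPs must be adjacent), and it treats haplotype carriage as a binary predictor of deletion status, for which the AUC reduces to the mean of sensitivity and specificity. SNP names and the 2 × 2 counts are made up, and the Fisher test itself is not reimplemented here.

```python
from itertools import combinations


def candidate_snp_sets(snps, min_size=2, max_size=5):
    """All SNP subsets of the requested sizes (assumption: any subset is allowed)."""
    return [
        subset
        for size in range(min_size, max_size + 1)
        for subset in combinations(snps, size)
    ]


def odds_ratio(a, b, c, d):
    """Odds ratio of a 2 x 2 table.

    a: deletion carriers with the haplotype, b: deletion carriers without it,
    c: non-carriers with the haplotype,     d: non-carriers without it.
    """
    return (a * d) / (b * c)


def binary_auc(a, b, c, d):
    """AUC of a yes/no predictor: the mean of sensitivity and specificity."""
    sensitivity = a / (a + b)
    specificity = d / (c + d)
    return (sensitivity + specificity) / 2


snps = [f"snp{i}" for i in range(1, 11)]  # placeholder names
snp_sets = candidate_snp_sets(snps)
table = (95, 5, 100, 4900)                # made-up counts, for illustration only
print(len(snp_sets), odds_ratio(*table), binary_auc(*table))
```

Each SNP set still corresponds to several allele combinations, so the number of haplotypes actually tested is larger than the number of SNP sets.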

Statistical analysis

We derived population-based prevalence (with CI95%) for the different subgroups of NRXN1 deletions using the svydesign() and svyciprop() functions from the R package survey43, with finite population correction (FPC) to account for oversampling of cases in iPSYCH2015.

Briefly, we divided the post-QC number of cases (77,655) and of individuals from the random population subcohort (43,311) by the total number of corresponding individuals in the source population (90,218 and 1,657,449) to derive the sampled population fractions: 0.85068 (100% of cases minus those failing genotyping or excluded in QC) and 0.02613, respectively. Samples from overlapping individuals (cases-in-subcohort) were assigned the case population fraction (0.85068).
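To make the role of the sampling fractions concrete, the sketch below computes an inverse-probability-weighted point estimate of prevalence in Python. It is a simplification of what the survey package does (no confidence intervals, no finite population correction), and it assumes that every sample that is not a case is a subcohort non-case and that the "controls" in the reported carrier counts are exactly those samples. The fractions and counts are the ones quoted in the text; the function name is hypothetical.

```python
# Sampling fractions quoted in the text.
CASE_FRACTION = 0.85068
SUBCOHORT_FRACTION = 0.02613

# Post-QC sample sizes quoted in the Results.
N_CASES = 77_655
N_UNIQUE = 118_427
N_CONTROLS = N_UNIQUE - N_CASES  # assumption: all non-cases are subcohort non-cases


def weighted_prevalence_per_1000(carrier_cases: int, carrier_controls: int) -> float:
    """Weight every sample by the inverse of its sampling fraction."""
    case_weight = 1 / CASE_FRACTION
    control_weight = 1 / SUBCOHORT_FRACTION
    carriers = carrier_cases * case_weight + carrier_controls * control_weight
    population = N_CASES * case_weight + N_CONTROLS * control_weight
    return 1000 * carriers / population


# Carrier counts quoted in the Results: any deletion, then exonic deletions only.
any_deletion = weighted_prevalence_per_1000(255, 102)
exonic = weighted_prevalence_per_1000(108, 27)
print(f"any deletion: {any_deletion:.3f} per 1000")
print(f"exonic only:  {exonic:.3f} per 1000")
```

Under these assumptions the point estimates land close to the reported 2.55 and 0.70 per 1000, which illustrates why the comparatively few carriers in the random subcohort dominate the population estimate: each of them carries a weight of roughly 38, against roughly 1.2 for a case.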

We calculated the corresponding prevalence of exonic NRXN1 deletions in the UKB directly from carrier counts provided by Crawford et al.44 and derived CI95% as follows (R pseudocode): CI95%=qbeta(c(0.05/2,1-0.05/2), nCarrier + 0.5, nTotal-nCarrier + 0.5), where nCarrier and nTotal indicate the number of carriers and the total number of assessed samples (421,268), respectively.

We compared the prevalence of exonic deletions in iPSYCH2015 and UKB with Welch’s test of the difference between two means assuming unequal variance. Briefly, we defined the difference, $d = \left|\log\left(p_{\mathrm{iPSYCH}}/p_{\mathrm{UKB}}\right)\right|$; the standard error of the difference, $SE_d = \sqrt{SE_{\mathrm{iPSYCH}}^2 + SE_{\mathrm{UKB}}^2}$; and the p-value, $P = 2\left(1 - \Phi(d/SE_d)\right)$, where $\Phi$ is the standard normal cumulative distribution function (pnorm() in R), and $p_{\mathrm{iPSYCH}}$, $SE_{\mathrm{iPSYCH}}$, $p_{\mathrm{UKB}}$ and $SE_{\mathrm{UKB}}$ indicate the prevalence and standard error of prevalence in iPSYCH2015 and UKB, respectively.
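The comparison can be written out as a small Python function. The sketch assumes that both standard errors are expressed on the log scale, to match the log-ratio used for the difference; the text does not state this explicitly. The standard errors passed in the example are illustrative placeholders, not values from the study.

```python
from math import log, sqrt
from statistics import NormalDist


def compare_prevalences(p1: float, se1: float, p2: float, se2: float):
    """Two-sided normal test on the absolute log ratio of two prevalences.

    se1 and se2 are assumed to be standard errors of log(p1) and log(p2).
    Returns (d, se_d, p_value).
    """
    d = abs(log(p1 / p2))
    se_d = sqrt(se1 ** 2 + se2 ** 2)
    p_value = 2 * (1 - NormalDist().cdf(d / se_d))
    return d, se_d, p_value


# Prevalences per 1000 as quoted in the Results; the SEs are placeholders.
d, se_d, p_value = compare_prevalences(0.70, 0.17, 0.39, 0.08)
print(d, se_d, p_value)
```

Because only the ratio of the two prevalences enters the statistic, it does not matter whether they are supplied as proportions or per 1000 individuals, as long as both use the same unit.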

To estimate the risk of index psychiatric disorders associated with NRXN1 deletions, we ran a logistic regression analysis using gam() from the R package mgcv45. We used age, sex (at birth) and SNP array type as covariates, with a smoothed function to model the effect of age using the mgcv function s(). In each association, we included all cases for the phenotype of interest and all controls, defined as individuals not having any of the index diagnoses. For the later-onset disorders SSD, MDD and SCZ, we only included those controls who were at least as old as the youngest case. Multiple testing correction was applied to the table containing the results of all three analyses (Table 1) using the R function p.adjust(method = “fdr”). We then compared risk estimates with those reported in published case-control studies (in each case the study with the largest case-control sample for the respective disorder, considering only studies that controlled for genotyping array when including samples genotyped on different arrays) using a Welch’s test in a similar way as described above for the prevalence comparison. We performed two additional sensitivity analyses: we ran the first model on the phenotype schizophrenia (ICD10, F20) instead of SSD, and we ran the last model on the European unrelated subset of iPSYCH201541.
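In R, method = "fdr" in p.adjust() is an alias for the Benjamini-Hochberg procedure. A minimal Python sketch of that adjustment is given below for readers who want to see the arithmetic; the function name and the example p-values are hypothetical, and the regression models themselves are not reproduced.

```python
def fdr_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    # Walk from the largest p-value to the smallest, carrying a running minimum
    # so that adjusted values never increase as the raw p-value decreases.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order (1 = smallest)
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up p-values, for illustration only.
raw = [0.001, 0.01, 0.5]
adjusted = fdr_adjust(raw)
print(adjusted)
```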

To estimate the risk of the three other brain disorders associated with NRXN1 deletions we fitted a logistic regression model using case status for each of the four iPSYCH disorders (ADHD, ASD, MDD and SSD) as covariates in addition to age, sex (at birth) and SNP array type.

Software

All analyses were performed on an HPC cluster running CentOS Linux 7. PLINK39,40 version 1.90b6.21, R46 version 4.0.5 and VCFtools47 version 0.1.17 were installed via the conda package manager (https://anaconda.org/). PennCNV34 version 1.0.5, bcftools48 version 1.14 and htslib49 version 1.14 are part of the container we used for the CNV calling described in the previous section, available on Docker Hub (https://hub.docker.com/r/sinomem/docker_cnv_protocol). For the analysis and the figures, we used the following R packages: data.table50, pROC42, survey43, mgcv45 and ggplot251.

Ethics statement

This study is in full compliance with all relevant ethical regulations including the Declaration of Helsinki. Access to the data and its use for research purposes was granted by The Danish Scientific Ethics Committee, the Danish Health Data Authority, the Danish Data Protection Agency, and the DNSB Steering Committee. For this study, the Danish Scientific Ethics Committee has, in accordance with the Act on Research Ethics Review of Health Research Projects (in Danish: Komitéloven), waived the need for informed consent in biomedical research based on existing biobanks.

Results

Descriptive statistics and prevalences

After quality control, our sample consisted of 77,655 cases of the four disorders ascertained in iPSYCH2015 (22,167 ASD, 26,186 ADHD, 31,622 MDD, 13,126 SSD) and a population-representative random cohort of 43,311 samples, for a total of 118,427 unique samples. Given the structure of the sample, there is a small overlap between the two groups. Moreover, a given case can be diagnosed with more than one of the index disorders. We called CNVs in the larger NRXN1 locus (NRXN1 gene plus 0.5 Mbp upstream and downstream) and performed visual validation as described in the methods. In total 1387 calls were evaluated, of those 378 were deemed as true CNVs, 573 as false calls, and 436 as unknown (meaning no definitive judgement was possible, most often due to the small number of markers available). Given the small proportion of duplications (21 out of 378) and the low reliability of validating small duplications, we discarded duplications from all subsequent analyses and focused on deletions only. This resulted in a total of 357 carriers (255 cases, 102 controls) of which 135 (108 cases, 27 controls) were exonic, i.e., overlapping at least one exon.

The prevalence of NRXN1 deletions in the general Danish population is 2.55 (95% CI: 2.13–3.04) per 1000 individuals and 0.70 (95% CI: 0.50–0.98) when restricting to exonic deletions. This is almost two times higher than what was previously reported in UKB44, 0.70 vs 0.39 per 1000 individuals (p-value 0.0014, Welch’s test). Subgrouping by subcohort (iPSYCH2012 and the extension iPSYCH2015i respectively) the prevalence estimates are 2.20 (95% CI: 1.72–2.81) and 3.07 (95% CI: 2.37–3.98) for any deletion, and 0.78 (95% CI: 0.52–1.17) and 0.58 (95% CI: 0.32–1.05) for exonic deletions only. Supplementary Table 3 provides a prevalence breakdown per gender.

NRXN1 deletions subgrouping

Neither exonic nor non-exonic deletions are distributed uniformly across the locus (Supplementary Fig. 1). To disentangle the risk signal in NRXN1 CNVs beyond the exonic/non-exonic distinction, we created a set of subgroups. We used a similarity matrix of all CNV pairs (Fig. 1a, b) and a correlation matrix of the deleted exons (Fig. 1d) as described in the methods. Regarding non-exonic CNVs, we identified a clear subgroup of 100 very similar CNVs (IOU > 80%) corresponding to those between exons ENSE00003649136 and ENSE00002460080 (Fig. 1a, b, Supplementary Fig. 1d). The average boundaries of this group of deletions correspond to a deletion previously found segregating in several European populations (Chr2:50,882,153–50,945,699 in Rujescu et al. and Chr2:50,882,111–50,947,645 in this study)25. The prevalence of this segregating intronic deletion is 0.77 (95% CI: 0.55–1.06) per 1000 individuals.

Regarding exonic CNVs, the correlation plot (Fig. 1d) shows that exons are affected by deletions essentially in two blocks: exons ENSE00001682911 to ENSE00002460080 (roughly spanning from the 3′ end of the gene to the group of exons where the promoter of the beta isoform is located, referred to as the beta region from now on), and exons ENSE00002453754 to ENSE00001547151 (roughly spanning from that group of exons to the 5′ end of the gene, referred to as the alpha promoter region from now on). See also Supplementary Table 4 and Supplementary Fig. 3 for more details on exonic deletions. The number of carriers in each group was 81 and 54 for the alpha promoter and beta regions, respectively. While smaller clusters are observed within both large groups, further subgrouping of these two main clusters resulted in limited study power; thus, we only used these two main clusters for further analysis.

NRXN1 deletions and associated risk of psychiatric disorders

To estimate the association between NRXN1 deletions and the risk of the four index psychiatric disorders (ADHD, ASD, MDD, SSD) we conducted three separate analyses based on the deletion subgroups described above. As described in the methods, we used a logistic model adjusting for age, SNP array type and sex. The resulting OR estimates and carrier counts are summarised in Fig. 2 and Table 1. Overall, we see an increased risk of ADHD and ASD associated with carriage of exonic deletions, but not of SSD (also when running the analysis on the stricter schizophrenia phenotype, OR: 1.87, 95% CI: 0.81–4.33) or MDD.

Fig. 2: Forest plots showing the ORs resulting from three logistic regression analyses on four neurodevelopmental disorders.

a First model, ORs for exonic and non-exonic deletions in the NRXN1 locus. b, c Second model, exonic deletions are divided into three subgroups based on the exons they overlap (alpha promoter region, beta promoter region, at least one of both) and non-exonic deletions are divided into two subgroups (those belonging to the segregating deletion and all the rest). Note that the scale of (b) differs from the rest. d Third model, ORs for being a carrier of the segregating deletion or of the haplotype associated with the deletion but without the deletion itself.

We also attempted to replicate findings of previous studies linking exonic NRXN1 deletions to increased risk of ID16, epilepsy17 and TS18, although these disorders had not been specifically targeted by the iPSYCH case-cohort design and as a consequence our estimates are not as well powered (or population-representative) as for the four index psychiatric disorders (Supplementary Table 5). As shown in Table 2, we replicate the previous reports for ID and epilepsy, but not for TS. In all instances, (both for the four index psychiatric disorders and the three other brain disorders) our risk estimates are lower than reported in the case-control studies that we draw comparisons with, although not significantly so except for SSD and TS (Table 2). When we used the stricter SCZ diagnosis (ICD:F20) the difference with the comparison study12 was not significant (P = 0.15; Table 2).

Table 2 Comparison of effect sizes for exon-disrupting NRXN1 deletions between iPSYCH2015 and published case-control studies

When subgrouping CNVs, deletions in the alpha promoter region of the gene appear to carry the majority of the signal. This is in accordance with previous literature both based on case-control studies as well as in vitro studies5,38.

While we observed no association between exonic deletions and risk of SSD (OR = 1.40; 95% CI: 0.68–2.89), this diagnosis group was the only one where we observed a significant increase in risk associated with intronic deletions. As shown in Fig. 2c, this association seems to be driven by the segregating intronic deletion described above (OR = 2.20; 95% CI: 1.15–4.18). Since intronic deletions are usually not considered pathogenic, we hypothesised that the risk associated with the segregating deletion could be explained by another variant co-segregating with it. As described in the methods, we ran a simple association test between all SNPs on chromosome 2 and the segregating deletion. Using the 10 most associated SNPs we constructed all two-to-five-SNP haplotypes, and we identified the most characteristic haplotype with an AUC of 0.94 (rs10205006-T, rs7608415-G, rs62140665-C, rs17041353-G). We then ran a final analysis grouping samples based on whether they were carriers of this haplotype or not. The results, shown in Fig. 2d, confirm that this deletion is only associated with an increased risk of SSD and, notably, that the associated risk is confined to the deletion (n = 100) and not observed among carriers of the underlying haplotype without the deletion (n = 2341). However, we do not observe a significantly increased risk of SSD associated with this deletion when we restrict the sample to the European unrelated subset (OR: 1.8, 95% CI: 0.8–3.8). Finally, given the high number of analyses, we applied multiple testing correction (FDR; adjusted p-values are provided in Table 1). As expected, the strongest associations reported in this study, namely those of ASD and ADHD with exonic deletions in the NRXN1 locus, remain significant after the correction. However, the SSD association with the segregating intronic deletion did not remain significant after correction.

Discussion

Deletions affecting the NRXN1 gene have been investigated for associations with psychiatric and developmental disorders for almost twenty years. CNVs in the NRXN1 locus can be very heterogeneous, affecting one or more exons or falling entirely between two exons. Exonic deletions in particular have been associated with SSD12,25, ADHD14, MDD15 and ASD13. However, most of the published studies have been limited to smaller case-control samples or meta-analyses of case-control samples. Moreover, intronic deletions are usually discarded from the analysis11,25,44. In this study, we attempt to disentangle the risk profile of exonic as well as intronic deletions by defining subgroups of similar deletions. Using the population-representative case-cohort design of iPSYCH2015, we report unbiased estimates of the population prevalence and association of such subtypes of deletions with four core psychiatric disorders.

As in previous studies on the same cohort26,27,28, we find the prevalence in the general population to be higher and the risk associations to be lower than previously reported. We observe exonic deletions to be associated with ASD and ADHD. When subgrouping deletions based on location in the gene, the association is driven by deletions in the alpha promoter region of the gene, while deletions in the other half of the gene are rarer and possibly associated with a smaller increase in risk of psychiatric disorders. Notably, CNVs in the alpha promoter region are known to be more frequent and indeed are in our sample as well. The association appears robust, suggesting a biological reason for the excess risk in one portion of the gene. However, it may also be exacerbated by the difference in number of carriers. We also confirm the presence of a small segregating deletion that does not affect any exon and find it to be potentially linked to SSD. While this signal did not survive multiple testing correction, we believe it can be taken as an indication that intronic CNVs should not be discarded a priori in this kind of analysis.

Notably, we do not find exonic deletions in the NRXN1 locus to be associated with an increased risk of SSD, which at first glance seems in strong contrast with previous reports11,12,25,52,53,54,55. However, when we examine the methodology and timeline of these previous reports, a more conciliatory picture emerges. The first large-scale study of schizophrenia-associated risk with exon-disrupting NRXN1 deletions was that of Rujescu et al.25, who reported an OR of 9 in a meta-analysis of European samples including ~3000 cases and >30,000 controls. Most subsequent studies derived their risk estimates either fully11,52,55 or in part53 by merging all schizophrenia cases and controls from previously published studies and performing a simple Fisher’s exact test on the pooled sample. As a consequence, in all these studies a large fraction of the control individuals (40%–80%) are those from the original report by Rujescu et al.25, whereas most case individuals are from other studies, most often applying denser arrays than the HumanHap300 array used in Rujescu et al.25. As NRXN1 deletions vary widely in size and breakpoints, the approach taken in these studies is very vulnerable to batch effects owing to differing resolution to detect exon-disrupting deletions across different genotyping platforms.

Since the initial report of Rujescu et al. only two other large-scale studies (Rees et al.12, and Marshall et al.54) have been published that do not include the large control sample of Rujescu et al. Both these studies report slightly lower carrier rates in cases (0.15% and 0.11%) and higher carrier rates in controls (0.034% and 0.020%) than Rujescu et al. (0.24% in cases and 0.015% in controls), and when meta-analysing across genotyping platforms, both studies correspondingly report lower odds ratios (4.5 and 5.8, respectively). These estimates are still higher than we find in iPSYCH2015, as is also the case for the other three core iPSYCH2015 disorders. This could in part be due to case ascertainment; iPSYCH2015 relies on hospital-based diagnoses from national registers, without any further confirmation of case status. However, the carrier frequency among iPSYCH2015 cases is very similar to those reported by the largest previously published studies for each disorder. In contrast, the population-based prevalence of exon-disrupting NRXN1 deletions in iPSYCH2015 is twice as high as reported in UKB15,44 and the control samples used in Rees et al.12 and Girirajan et al.13, and more than three times higher than among the controls of Gudmundsson et al.14 This is in line with results of our previous CNV studies involving iPSYCH2015 and suggests that the overall tendency for lower CNV-associated risk estimates in iPSYCH2015 is in large part explained by the higher CNV prevalence in the general population compared to individuals used as controls in other studies.

The sample size is the major limitation of this study. Although NRXN1 is a hotspot for non-recurrent CNVs, such events are rare. For this reason, we lacked the power to include duplications in the study or to subgroup deletions beyond the two major groups. Also, both the relatively young age of participants and the specific focus on a limited number of psychiatric disorders in the iPSYCH case-cohort design limit our study power for the later-onset iPSYCH disorders (MDD and SSD) as well as other brain disorders not targeted by the study design (such as ID, epilepsy and TS). Some of the individuals from the random subcohort will later go on to develop MDD or SSD, which in the case of MDD, with its high lifetime prevalence of 10–15%, could have had an attenuating effect on the estimated OR, while it is unlikely to have affected the risk estimate for SSD, with its much lower lifetime prevalence (1.0–1.5%). As for the brain disorders not targeted by the case-cohort design, the case sample sizes are relatively small and enriched with individuals with comorbid ADHD, ASD, MDD and/or SSD. To account for this enrichment, while also retaining the maximum case sample size, we fitted a logistic model that included each of the four iPSYCH disorders as covariates. While maximising study power, this approach probably leads to an overestimate of case carrier frequency but at the same time an underestimate of the associated OR for these disorders.

Notwithstanding these limitations, our results add important insight into the association between NRXN1 deletions and the risk of psychiatric illness. Most importantly, we show that the risk is mainly driven by deletions disrupting exons specific to the alpha isoform of Neurexin 1. Also, we show that as with recurrent CNVs, previous case-control studies of NRXN1 deletions have likely underestimated their population prevalence and consequently overestimated their associated risk. Finally, we characterise the haplotype background of a previously reported intronic deletion segregating at ~0.1% carrier frequency in the Danish population, and while inconclusive, our results warrant further study into its possible association with psychiatric and/or other cognitive/behavioural traits.
