Germline structural variant as the cause of Lynch Syndrome in a family from Ecuador


Introduction

Colorectal cancer (CRC) is the third most common cancer in both genders worldwide1. Most CRC cases occur sporadically, but inherited mutations are responsible for 13–30% of CRC cases2. In this regard, Lynch Syndrome (LS) is the most common form of hereditary CRC. LS is an autosomal dominant disease caused by germline defects in the DNA-mismatch repair (MMR) genes, including MLH1, MSH2, MSH6, PMS2 and EPCAM. These mutations lead to microsatellite instability (MSI), a hallmark of LS-related tumors. Moreover, the tumors do not present with BRAF mutations or MLH1 hypermethylation. LS is associated with increased risk for early-onset CRC, endometrial, stomach and other cancers3. Prevalence has been estimated to be around 1/300 individuals4.

Identifying the germline causative alteration is of great importance for LS patients and their relatives, because it allows intensified surveillance, early diagnosis and cancer prevention in those carrying the inherited defect. Current approaches for LS diagnosis typically involve the combination of clinical criteria, tumor testing, and genetic testing5. Diagnostic germline testing in suspected LS families generally includes screening of the MMR genes by targeted gene-panel sequencing (for small coding or splicing-affecting genetic variants) and multiplex ligation-dependent probe amplification (MLPA) (for rearrangements). The International Society for Gastrointestinal Hereditary Tumours (InSiGHT) locus-specific database collects more than 3000 unique germline sequence variants of the four LS-associated genes, with MLH1 and MSH2 being the most commonly mutated6. It has been reported that about 50% of patients with suspected LS remain without a clear germline cause and are designated as Lynch-like syndrome (LLS) cases7. More than half of LLS tumors present two somatic mutations in the MMR genes8. However, a significant proportion of LLS patients remain genetically unresolved despite both germline and somatic testing.

In this study, we report the identification and characterization of a novel germline structural variant involving the 3’-ends of the MLH1 and LRRFIP2 genes as the cause of LS in a family from Ecuador. Whole-genome sequencing (WGS) and transcriptomics allowed the identification of the genomic rearrangement and highlighted the importance of the use of these additional approaches to achieve a comprehensive molecular diagnosis in some LS patients.

Results

Case report

In this study, we present a family of Ecuadorian origin with a history of CRC and gastric cancer fulfilling Amsterdam II criteria (Fig. 1A). This family remained unconfirmed by germline testing for more than 10 years despite being clinically diagnosed with LS. In this LLS family, a sister of the index case (III:4) was seen at the high-risk clinic for gastrointestinal cancer of our hospital in 2012 for advice concerning her familial cancer history. Other family members developed cancer at a young age, either CRC (40 y.o.) or gastric cancer (27 and 48 y.o.). No endometrial cancer was reported in any of the family members. The index case of this family was affected with CRC at age 47. His tumor showed loss of both MLH1 and PMS2 proteins and was BRAF wild type. Tumor sequencing to detect MMR double somatic events was not pursued.

Fig. 1: Family tree and genetic testing of the index case for the MMR genes.

A Pedigree of the Ecuadorian family, where the index case is marked with an arrow. Black symbols represent individuals affected by either colorectal (upper right) or stomach cancer (lower left). Age of onset is also indicated. Females and males are shown as circles and squares, respectively. Slashed symbols represent deceased individuals. The + symbol indicates carrier status for the rearrangement. The – symbol indicates non-carrier status for the rearrangement. B MLPA results showing potential alterations in MLH1 exon 19 (probemix P003), and in MLH1 exon 19 and LRRFIP2 exon 26 (probemix P248). C CGH array of the index case indicative of a genomic alteration in the 3’-ends of MLH1 and LRRFIP2 (red circle).


Genetic testing results

Germline testing for the MMR genes was performed using targeted gene-panel sequencing and revealed no potentially pathogenic genetic variants. Rearrangement analysis for the MMR genes using MLPA and comparative genomic hybridization (CGH) detected an alteration affecting the 3’-ends of the MLH1 and LRRFIP2 genes (Fig. 1B, C). Subsequently, long-range PCR amplification was attempted on germline DNA, targeting the suspected area (primers designed before MLH1 exon 18 and before LRRFIP2 exon 26), but it was unsuccessful.

Genomic analysis/Whole-genome sequencing

To better characterize this putative rearrangement, we performed WGS. The structural variant (SV) callers Manta and Delly detected a complex rearrangement with three overlapping SVs, including a small inversion in MLH1 (0.91 kb, GRCh37 NC_000003.11:g.37088346_37089273inv), a large inversion involving MLH1 and LRRFIP2 (22.8 kb, GRCh37 NC_000003.11:g.37088713_37111528inv), and a tandem duplication in LRRFIP2 (37.88 kb, GRCh37 NC_000003.11:g.37099223_37137103dup), none of them previously reported in dbVar (https://www.ncbi.nlm.nih.gov/dbvar/). The genomic rearrangement encompassed a total of 48.757 kb (GRCh37 NC_000003.11:g.37088346_37137103).
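The sizes quoted above can be recomputed from the reported GRCh37 coordinates. The following is a minimal sketch, not the authors' pipeline: the function and variable names are hypothetical, and sizes are taken as breakpoint-to-breakpoint distances (end minus start), which is an assumption about how the reported values were derived. Strict inclusive counting would add 1 bp, and computed values may differ slightly from the rounded sizes in the text.

```python
import re

# Coordinates copied from the text (GRCh37, NC_000003.11).
# The dictionary keys are illustrative labels, not official variant names.
SVS = {
    "small inversion": "NC_000003.11:g.37088346_37089273inv",
    "large inversion": "NC_000003.11:g.37088713_37111528inv",
    "tandem duplication": "NC_000003.11:g.37099223_37137103dup",
}

# Simple pattern for "REF:g.START_ENDtype" descriptions (only inv/dup/del).
HGVS_RANGE = re.compile(
    r"^(?P<ref>[^:]+):g\.(?P<start>\d+)_(?P<end>\d+)(?P<kind>inv|dup|del)$"
)


def parse_hgvs_range(hgvs):
    """Split a simple genomic range description into its parts."""
    match = HGVS_RANGE.match(hgvs)
    if match is None:
        raise ValueError(f"unsupported description: {hgvs}")
    return match["ref"], int(match["start"]), int(match["end"]), match["kind"]


def summarize(svs):
    """Return {name: (start, end, kind, distance_bp)} for each SV."""
    rows = {}
    for name, hgvs in svs.items():
        _, start, end, kind = parse_hgvs_range(hgvs)
        # Assumption: size = distance between the two breakpoints.
        rows[name] = (start, end, kind, end - start)
    return rows


def overall_span(rows):
    """Outermost start, outermost end, and the distance between them."""
    start = min(row[0] for row in rows.values())
    end = max(row[1] for row in rows.values())
    return start, end, end - start


rows = summarize(SVS)
for name, (start, end, kind, size) in rows.items():
    print(f"{name}: {start}-{end} ({kind}), {size / 1000:.3f} kb")
print("whole rearrangement: %d-%d, %.3f kb" % (
    overall_span(rows)[0], overall_span(rows)[1], overall_span(rows)[2] / 1000))
```

Under this convention the outermost breakpoints give 48.757 kb, in line with the total reported for the rearrangement.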

By examining the WGS data with the Integrative Genomics Viewer (IGV), we observed discordant read pair orientation and mapping distance, confirming the three SVs (Fig. 2A). We also noted an altered depth of coverage that extended beyond the tandem duplication to the full complex rearrangement, indicating the presence of additional duplicated regions not predicted by the SV callers. By using a simplified IGV coverage plot view, we were able to propose a fine-tuned map of the actual rearrangement, fitting the altered WGS coverage and the SV calling (Fig. 2B). The breakpoints of the three SVs were validated by Sanger sequencing at nucleotide resolution, revealing small insertions and deletions at the novel junctions. The full characterization of this complex rearrangement confirmed that the variant calling was not able to detect some duplicated areas (A and C) as well as the additional duplication of region D, which corresponds to the overlap between the large inversion and the tandem duplication.
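The six breakpoints of the three SVs divide the region into five consecutive segments. As an illustration only, the sketch below derives those segments from the reported coordinates and lists which SV calls span each one. Labeling the segments A to E in genomic order is an assumption made here, and the labeling used in the figure may differ. The helper name is hypothetical.

```python
# Breakpoint coordinates (GRCh37, chromosome 3) taken from the SV calls in the text.
SV_INTERVALS = {
    "small_inversion": (37088346, 37089273),
    "large_inversion": (37088713, 37111528),
    "tandem_duplication": (37099223, 37137103),
}


def segments_from_breakpoints(svs):
    """Cut the region at every breakpoint and annotate each piece.

    Returns a list of dicts with a letter label (assigned in genomic order,
    an assumption), the segment boundaries, its size in bp, and the names of
    the SV calls whose interval fully contains the segment.
    """
    breakpoints = sorted({pos for interval in svs.values() for pos in interval})
    labels = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    segments = []
    for label, (start, end) in zip(labels, zip(breakpoints, breakpoints[1:])):
        covering = [name for name, (s, e) in svs.items() if s <= start and end <= e]
        segments.append(
            {"label": label, "start": start, "end": end,
             "size": end - start, "svs": covering}
        )
    return segments


for seg in segments_from_breakpoints(SV_INTERVALS):
    print(seg["label"], seg["start"], seg["end"], seg["size"], ", ".join(seg["svs"]))
```

With these coordinates, the fourth segment is the only one spanned by both the large inversion and the tandem duplication, which matches the description of region D as their overlap.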

Fig. 2: Characterization of the complex rearrangement involving MLH1 and LRRFIP2.

A Visualization of WGS data with Integrative Genomics Viewer (IGV). Germline short reads from WGS allowed the detection of three SVs involving the 3’-ends of MLH1 and LRRFIP2. Representative paired reads with discordant pair orientation (RR, LL and RL) and aberrant mapping distance are depicted. A schematic map of the area of the rearrangement shows the three SVs and the six breakpoints, resulting in a five-segment map (A to E). The size of each segment in bp is indicated. The overlapped SVs are defined as a small inversion in MLH1 of 0.91 kb (fragments A and B, deep blue), a big inversion of 22.8 kb involving MLH1 and LRRFIP2 (fragments C and D, turquoise blue) and a tandem duplication of 37.88 kb in the LRRFIP2 gene (fragments D and E, green). B Characterization of the breakpoints of the SVs and proposed map of the rearrangement. IGV simplified coverage plot (IGV Count tool with an average read density window of 300 bp) allowed the characterization of the breakpoints and duplicated areas. The proposed map of the actual rearrangement fits the WGS coverage and the SV calling. Sanger sequencing profiles of the three PCRs (arrows) validated the breakpoints.


Transcriptomics

To further characterize the SV, we performed RNA-seq on peripheral blood mononuclear cells (PBMCs) from the index case. Neither relevant gross alterations nor aberrant splicing patterns were detected in the RNA-seq data for MLH1 or LRRFIP2 (Fig. S1A). Relative expression levels of both genes were also measured by quantitative PCR (qPCR). No differences in MLH1 and LRRFIP2 expression were detected when using qPCR probes outside the rearrangement, whereas downregulation of MLH1 and upregulation of LRRFIP2 were evident when using qPCR probes located in the areas affected by the rearrangement (Fig. S1B). MLH1 downregulation would support the role of this complex rearrangement in causing LS. At the same time, LRRFIP2 expression levels could be compatible with the presence of some altered transcripts produced by the complex rearrangement that are not degraded by nonsense-mediated decay. Although not detected by our analysis, it is also possible that this structural variant leads to a truncated MLH1 protein or a fusion protein incorporating part of LRRFIP2, disrupting the final protein product. A similar effect has been observed in cases where EPCAM deletions impact the MSH2 gene9. Additionally, taking into consideration the large area affected by this rearrangement (48.757 kb), it could be hypothesized that the topological architecture and the predicted topologically associating domains (TADs) in this region are also affected, according to the data available at http://3dgenome.fsm.northwestern.edu/10 (Fig. S2).

PCR tests to detect the genomic rearrangement

PCR tests designed for breakpoint validation can be used to facilitate the screening of the large rearrangement in additional members of the family and other Ecuadorian CRC patients. Carriers of the complex rearrangement display positive amplification with the primers of PCR TD, PCR INVL and PCR INVS as well as the control PCR, whereas non-carriers amplify only with the primers of the control PCR (Fig. S3). By using these discriminatory primers, the same rearrangement was detected in the asymptomatic sister of the index case (III:4) who came to the clinic seeking advice regarding her family history of cancer. The SV was also confirmed by WGS. Recently, another asymptomatic sister of the index case (III:5) was found to be a non-carrier of the rearrangement by PCR tests.
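The decision rule for these PCR tests can be written down explicitly. The sketch below is a hypothetical helper, not part of the published workflow: it encodes the rule stated above (carriers amplify all three junction PCRs plus the control, non-carriers amplify only the control) and adds two categories that are assumptions made here, one for a failed control and one for mixed junction results.

```python
# Names of the three junction-specific assays described in the text.
JUNCTION_ASSAYS = ("TD", "INVL", "INVS")


def interpret_pcr(results):
    """Classify a sample from its PCR band pattern.

    `results` maps assay names ("TD", "INVL", "INVS", "control") to True when
    a product was observed. Missing assays are treated as negative, which is
    an assumption of this sketch.
    """
    if not results.get("control", False):
        # Without a control product the DNA quality is unknown.
        return "uninterpretable (control failed)"
    junction = [results.get(assay, False) for assay in JUNCTION_ASSAYS]
    if all(junction):
        return "carrier"
    if not any(junction):
        return "non-carrier"
    # Only some junction PCRs amplified: flag for repeat testing.
    return "inconclusive"


print(interpret_pcr({"TD": True, "INVL": True, "INVS": True, "control": True}))
print(interpret_pcr({"TD": False, "INVL": False, "INVS": False, "control": True}))
```

Treating a partial pattern as inconclusive rather than as a carrier call is a conservative design choice for this illustration.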

Discussion

In this study, we report the identification with WGS of a genomic alteration involving the 3’-ends of MLH1 and LRRFIP2 as the causative mutation in an LLS family from Ecuador. Molecular screening for this alteration has been offered to the rest of the family. The reference gastroenterologist and molecular laboratory specialists in the region of Ecuador where most of this family is located have been contacted to facilitate the screening process. Additionally, this collaboration will enable screening for this alteration in additional Ecuadorian CRC patients to determine whether the genomic rearrangement represents a potential founder mutation that is more common than previously expected. It is worth noting that molecular studies for LS are scarce or nonexistent in some countries like Ecuador11. In this regard, our study enhances the molecular understanding of LS cases in this area, with the subsequent benefit for both patients and the scientific community.

From our results, it can be highlighted that for some LS patients the current molecular diagnostic techniques (germline and somatic sequencing of the coding regions of MMR genes and MLPA) are not sufficient, and additional approaches should be used to increase diagnostic yield. Molecular rearrangements involving the MMR genes and their adjacent genes should be carefully examined, alongside the screening of non-coding MMR alterations. These features can be easily missed through standard targeted panel sequencing and MLPA12.

In order to detect structural variants, which can easily span repetitive or complex regions of the genome, long-read sequencing has become a powerful tool capable of reliably sequencing longer reads (10 kb), which enhances de novo assembly and mapping of the genome13. This powerful technique has been proven to capture most structural variants in the genome, compared to the capacity of short-read sequencing. For the case presented in this study, and due to the large area affected, its implementation might be challenging.

Exonic rearrangements in the MMR genes involving several exons are already an established mutational mechanism for LS14 and are currently screened as part of the routine tests involved in the molecular diagnosis of LS. The identified complex rearrangement was first suspected by MLPA and further characterized with WGS. Similar mutational events located in the 3’-end of MLH1 have been previously reported to be involved in LS15,16,17,18, suggesting that this genomic area could be a hotspot for these kinds of rearrangements. In the study by Zhu et al.15, they detected a duplication of MLH1 exon 19 in a patient fulfilling Amsterdam II criteria using MLPA. They described an extraordinarily high peak for this area corresponding to 12 calculated copies, which could imply a more complex rearrangement rather than just a simple duplication. In the study by Morak et al.16, they reported a paracentric inversion on chromosome 3p22.2 with one breakpoint in the genomic region of MLH1 and the other breakpoint downstream of MLH1, in the region of LRRFIP2, creating two new stable fusion transcripts between MLH1 and LRRFIP2. This alteration was detected in a CRC patient of a large family fulfilling the Amsterdam II criteria and segregating with CRC and/or endometrial cancer. In the study conducted by Pinheiro et al.17, they identified a deletion comprising exons 17–19 of the MLH1 gene and exons 26–29 of the LRRFIP2 gene, which turned out to be a founder mutation present in several LS patients of Portuguese ancestry. A recent study by Witt et al.18 detected a structural MLH1 variant in an Amsterdam criteria-positive family which corresponded to a copy-neutral inversion involving MLH1 and LRRFIP2.

In conclusion, we have been able to identify the complex mutational event spanning 48.757 kb affecting MLH1 (and the contiguous gene LRRFIP2) in an LLS family from Ecuador. This finding is consistent with the tumor of the index case, which showed loss of MLH1 and PMS2 proteins by immunohistochemistry, pointing to a molecular defect in MLH1. It is evident from our findings, and from previous studies, that the area of the 3’-ends of MLH1 and LRRFIP2 seems to be particularly prone to rearrangement in some LS patients. Our case highlights the need to add further approaches, like long-read WGS and transcriptome analysis, to the currently established molecular diagnostic tests (targeted gene-panel, MLPA) for the subset of LLS patients without an identified germline or somatic alteration. These additional techniques can aid in finding the complex genomic structural variants that are present in some LS patients and remain undetected through current diagnostic techniques.

Methods

Ethics statement

Written informed consent was obtained in all cases. The present study was approved by the Institutional Ethics Committee of Hospital Clínic in Barcelona (register number HCB/2021/0189, date of approval 01/06/2021), in accordance with the Declaration of Helsinki (1975).

Germline testing: gene-panel, MLPA and CGH

Germline testing was conducted using commercial kits, including the TruSight Hereditary Cancer Panel (Illumina, San Diego, CA, USA) for targeted gene-panel sequencing, the SALSA MLPA probemixes P003 and P248 (MRC Holland, Amsterdam, Netherlands) for multiplex ligation-dependent probe amplification, and the Human SurePrint G3 CGH Microarray 180 K (Agilent, Santa Clara, CA, USA) for comparative genomic hybridization. All procedures were performed according to the manufacturer’s instructions19,20.

Whole-genome sequencing

Short-read WGS was conducted on the germline DNA of the index patient. Briefly, a short-insert paired-end library was prepared using a PCR-free protocol with the TruSeq DNA Sample Preparation Kit and the KAPA Library Preparation kit (Kapa Biosystems, USA). Sheared genomic DNA was end-repaired, adenylated, and ligated to specific indexed paired-end adaptors. The library was sequenced using a HiSeq 4000 (Illumina), in paired-end mode (2 × 150 bp) with a yield of >99 Gb and median coverage of 30x. Primary data analysis, image analysis, base calling and quality scoring of the run were performed using the manufacturer’s software, followed by generation of FASTQ files by CASAVA. Read mapping to the reference genome, alignment and variant annotation were performed using GEM19, Picard tools (http://broadinstitute.github.io/picard/), GATK, SnpEff and SnpSift21,22,23,24. Manta and Delly were used with default parameters to call structural variants on WGS data25,26.
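Once SV callers have produced VCF output, calls near a locus of interest can be pulled out with a few lines of code. The sketch below is an assumption-laden illustration rather than the authors' procedure: it reads plain-text VCF lines, relies on the standard `SVTYPE` and `END` INFO keys, and uses made-up example records (IDs `sv1` to `sv4`) built around the coordinates reported in this study. Chromosome naming ("3" rather than "chr3") is also assumed.

```python
# Region spanning the rearrangement (GRCh37 coordinates from the text).
REGION = ("3", 37088346, 37137103)


def parse_info(info):
    """Turn a VCF INFO string into a dict; flags without '=' map to True."""
    parsed = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        parsed[key] = value if value else True
    return parsed


def sv_records_in_region(vcf_lines, region):
    """Return (id, svtype, pos, end) for SV records overlapping `region`."""
    chrom, region_start, region_end = region
    hits = []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        info = parse_info(fields[7])
        pos = int(fields[1])
        end_raw = info.get("END")
        # Records without END (for example breakends) are treated as points.
        end = int(end_raw) if isinstance(end_raw, str) else pos
        if fields[0] == chrom and pos <= region_end and end >= region_start:
            hits.append((fields[2], info.get("SVTYPE", "NA"), pos, end))
    return hits


# Hypothetical records for illustration; not real caller output.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "\t".join(["3", "37088346", "sv1", "N", "<INV>", ".", "PASS", "SVTYPE=INV;END=37089273"]),
    "\t".join(["3", "37088713", "sv2", "N", "<INV>", ".", "PASS", "SVTYPE=INV;END=37111528"]),
    "\t".join(["3", "37099223", "sv3", "N", "<DUP>", ".", "PASS", "SVTYPE=DUP;END=37137103"]),
    "\t".join(["3", "52000000", "sv4", "N", "<DEL>", ".", "PASS", "SVTYPE=DEL;END=52001000"]),
]

for record in sv_records_in_region(EXAMPLE_VCF, REGION):
    print(record)
```

For compressed or indexed VCF/BCF files, a dedicated library would be a better fit than line-by-line parsing. This version only aims to show the overlap logic.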

Breakpoint PCR and Sanger sequencing

The regions flanking the approximate breakpoints of the SVs identified with WGS were confirmed by PCR amplification using the following primers:

PCR TD (tandem duplication): forward GGTTAGTCCAAATTGAGAGTTGC; reverse TTCTCGGACAGAGGAGATTTTC.

PCR INVL (large inversion): forward TTACTCTCCATCCTCACCCG; reverse TGGTTCTTAGGGCTTGGGAG.

PCR INVS (small inversion): forward AATGCAGAAACAAAGGGAAAACT; reverse TTGGATTACAGGTACCCGCC.

Control amplification (DNA quality control): forward TTCTGAGCTCAAGCAATCCA; reverse CTCGGACAGAGGAGATTTTCA.

PCR using PCR TD, PCR INVL, and PCR INVS was only successful in carriers of the rearrangement. The control amplification served as DNA quality control and amplified in all samples.
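Basic properties of the primers listed above can be checked with a short script. This is a minimal sketch with hypothetical helper names. It reports only length, GC fraction and reverse complement, and it does not attempt to predict melting temperature or specificity.

```python
# Primer sequences copied from the list above (5' to 3').
PRIMERS = {
    "TD_forward": "GGTTAGTCCAAATTGAGAGTTGC",
    "TD_reverse": "TTCTCGGACAGAGGAGATTTTC",
    "INVL_forward": "TTACTCTCCATCCTCACCCG",
    "INVL_reverse": "TGGTTCTTAGGGCTTGGGAG",
    "INVS_forward": "AATGCAGAAACAAAGGGAAAACT",
    "INVS_reverse": "TTGGATTACAGGTACCCGCC",
    "control_forward": "TTCTGAGCTCAAGCAATCCA",
    "control_reverse": "CTCGGACAGAGGAGATTTTCA",
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
          f"revcomp {reverse_complement(seq)}")
```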

RNA-seq

Blood from the index case was collected in PAXgene Blood RNA tubes, and RNA was extracted using the PAXgene Blood RNA kit (Qiagen, Hilden, Germany) as per the manufacturer’s instructions. cDNA libraries were prepared using a TruSeq Stranded Total RNA Library Prep Kit. Paired-end sequencing (2 × 100 base pairs) was performed on a HiSeq 2500 Sequencing System. Raw reads were subjected to quality control, adapter sequences and low-quality reads were removed, reads were aligned and transcripts quantified, and gene expression levels were normalized. Gene expression results were analyzed with DROP (Detection of RNA Outliers Pipeline)27.

Real-time PCR

RNA reverse transcription was performed with the High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems). Quantitative PCR (qPCR) was run on a QuantStudio 1 System (Applied Biosystems) using TaqMan® Gene Expression probes against MLH1 (Hs00979919_m1; Hs00979922_m1) and LRRFIP2 (Hs00196889_m1; Hs00992892_m1), with GAPDH-FAM (Hs03929097_g1) as the endogenous control gene for normalization. Relative quantification was performed with the –∆∆Ct method.
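As a worked illustration of ∆∆Ct-based relative quantification, the sketch below normalizes a target gene to an endogenous control in a sample and in a reference sample, then converts the difference to a fold change as 2 to the power of –∆∆Ct. It assumes roughly 100% amplification efficiency. The Ct values are invented for the example and are not data from this study, and the function names are hypothetical.

```python
def delta_delta_ct(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Return the delta-delta-Ct of a target gene.

    delta Ct = Ct(target) - Ct(endogenous control), computed separately for
    the sample of interest and for the reference (control) sample.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return delta_sample - delta_control


def fold_change(ddct):
    """Relative expression as 2^(-ddCt); assumes ~100% PCR efficiency."""
    return 2.0 ** (-ddct)


# Invented Ct values: the target crosses threshold one cycle later in the
# sample than in the reference, with identical endogenous-control Ct values.
ddct = delta_delta_ct(
    ct_target_sample=26.0, ct_ref_sample=18.0,
    ct_target_control=25.0, ct_ref_control=18.0,
)
print(ddct, fold_change(ddct))
```

In this invented example the ∆∆Ct is 1.0, corresponding to a fold change of 0.5, that is, half the relative expression of the reference sample.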
