Spotiphy enables single-cell spatial whole transcriptomics across an entire section

Main

The location of a cell within its microenvironment is a critical determinant of its function and interactions with neighboring cells¹. Cells possess unique characteristics and functions, which are finely tuned to meet the specific needs of their environments through a process known as cellular regional specification². Analyzing local cell behavior involves identifying spatial domains, regions characterized by cellular composition and their transcriptomic profiles. A cell’s response to external signals is shaped by its spatial domain, affecting tissue development, homeostasis and disease progression³. Current spatial transcriptomics (ST) platforms^4,5,6 leave an unmet need for single-cell-resolved transcriptomic profiles with genome-wide coverage. ST technologies can generally be grouped into two categories: sequencing-based approaches (for example, Visium⁷, DBiT-Seq⁸, Slide-Seq^9,10, Slide-tags¹¹ and Stereo-seq¹²) and image-based approaches (for example, SeqFISH^13,14, smFISH¹⁵, RNAScope¹⁶, MERFISH^17,18, Xenium¹⁹ and CosMx²⁰). Predefining a capture area, or ‘spot’, allows sequencing-based approaches to generate unbiased genome-wide transcriptomic profiles while minimizing lateral leaking contamination. Yet, limiting the reads to predetermined spots comes with two major drawbacks, specifically information loss in the areas between the spots (noncapture areas) and low resolution (as each spot consists of multiple, often heterogeneous, cells). Image-based ST approaches using high-resolution fluorescent images achieve single-cell resolution but are limited to preselected gene panels consisting of 300–1,000 targets. Image-based ST approaches usually capture limited counts per gene, making them better suited for validation than data-driven discovery. Initial attempts have focused on enhancing the resolution of sequencing-based ST data, with various deconvolution methods for spot-level ST data already established. These methods integrate single-cell RNA-sequencing (scRNA-seq) and ST data to parse the cellular composition of individual transcriptomic spots^{21,22,23,24,25,26,27,28,29,30,31,32,33}. Yet, none of the current methods are able to achieve single-cell resolution while maintaining the spatially varying gene expression present in the original sequencing-based ST data.

We present Spotiphy, a toolkit that resolves the gene-coverage-resolution tradeoff. Spotiphy generates inferred scRNA expression profiles (iscRNA data) from all cells (located in both capture and noncapture areas), achieving spatially resolved whole-slide transcriptomic profiling. This provides substantial benefits for downstream analyses and opportunities for biological insight. Spotiphy-derived iscRNA data reveal regional specification of astrocytes and microglia in healthy and Alzheimer’s disease (AD) mouse brains, offering a level of detail not detectable by existing scRNA-seq and ST technologies. Spotiphy identifies changes in the patterns of tumor–tumor microenvironment (TME) interactions across multiple spatial domains and delivers single-cell-resolved whole-transcriptome images of the entire section, greatly expanding the information intensity of ST data and providing a more comprehensive and detailed understanding of spatially resolved cellular states and regulatory mechanisms.

Results

Spotiphy achieves single-cell spatial whole transcriptomics

Spotiphy delivers single-cell spatial whole-transcriptome profiling via generative modeling of sequencing-based ST, scRNA-seq and histological imaging data (Fig. 1a and Methods). Specifically, Spotiphy selects the most informative genes for each cell type to generate a signature reference from scRNA-seq data. It presets five customizable hyperparameters to ensure scRNA-seq reference accuracy and robustness (Supplementary Fig. 1). Spotiphy determines the locations of nuclei by segmenting the high-resolution histological images. It integrates the above information into a probabilistic model that considers the distribution of gene contributions from each cell type. This unique feature enables the simultaneous deconvolution and decomposition of ST data, thereby generating both cell-type proportion and iscRNA data. Additionally, Spotiphy imputes cell-type proportion and iscRNA data for cells located in noncapture areas through Gaussian processes³⁴. Consequently, Spotiphy generates single-cell-resolved images equivalent to the output of image-based approaches, with whole-transcriptomic profiling across entire sections.

**Fig. 1: Schematic overview of Spotiphy and the matched datasets used for validation.**

Matched mouse brain datasets for biological validity

To evaluate Spotiphy’s performance and the biological validity of its results, we enriched rare immune cells and obtained scRNA-seq profiles for 27,836 CD45⁺CD11b⁺ cells (microglia, macrophages and neutrophils), 6,085 CD45⁺CD11b^– cells (T cells and B cells) and 29,563 CD45⁻CD11b⁻ cells (neurons, glial cells and so on) from an AD mouse model³⁵ and wild-type (WT) control mice (Fig. 1b, Supplementary Table 1 and Methods). By supplementing our data with the Allen Brain Map Atlas³⁶, we assembled a comprehensive mouse brain single-cell reference with 27 cell types, including neurons, glia and immune cells (Fig. 1c, Supplementary Fig. 2a,b and Supplementary Table 2). We then generated datasets of various ST techniques, including immunohistochemistry (IHC), Visium, Xenium and CosMx. Serial sectioning of the same sample produced nearly identical slices, allowing direct comparison between different ST approaches once the histological images from these platforms were aligned (Supplementary Fig. 2c–e). The matched datasets are a valuable resource for the spatial omics community, particularly for the development and evaluation of ST algorithms.

Spotiphy excels in rare cell type deconvolution accuracy

To evaluate the accuracy of Spotiphy’s cellular deconvolution, Xenium data generated from an AD sample were used as the ground truth, and heat maps of cell-type proportions across the section were generated by Spotiphy and 13 other benchmarking methods^{21,22,23,24,25,26,27,28,29,30,31,32,33} (Fig. 2a, Extended Data Fig. 1a, Supplementary Figs. 3–5 and Methods). For well-organized excitatory glutamatergic neurons, Spotiphy’s delineation of the cortical layers was closer to that of Xenium and had clearer boundaries than those of the other methods. Spotiphy’s distribution patterns for unevenly distributed inhibitory GABAergic interneurons were likewise closer to the ground truth. Because the Xenium panel did not include markers for immune cells, we used in situ hybridization (ISH) results from the Allen Mouse Brain Atlas (mouse.brain-map.org)³⁷ as our ground truth for neutrophils and T and B cells. Whereas most other methods being tested failed to predict the distribution of neutrophils and instead produced random results, Spotiphy identified a specific enrichment of neutrophils around the ventricle that is in line with the ISH results for neutrophil markers, including Itgb2 (Fig. 2a and Supplementary Figs. 4–6). RCTD²² appeared comparable to Spotiphy, yet it incorrectly classified some neutrophils as macrophages, leading to an apparent increase of macrophages around the ventricle that did not align with the Xenium data (Supplementary Fig. 4e). Similar misidentifications among microglia, macrophages and neutrophils were observed in the CIBERSORTx results²⁶ (Supplementary Fig. 5e). No obvious ISH signal was observed for Cd3e (T cell marker) or Cd19 (B cell marker), consistent with the fact that T and B cells are rarely found in the mouse brain. Due to the low numbers of immune cells, and varying cell counts per spot, using proportions for rare cells can lead to inaccuracies. Thus, we translated these proportions into absolute cell counts (multiplying by the total cells per spot segment) for further comparison (Supplementary Fig. 7). For B cells, most methods performed well except CytoSPACE²³, StereoScope²⁸, iStar³² and Cell2location²¹. A low rate of false-positive detection of T cells in the striatum (STR) region has been observed in the predictions of Spotiphy, CARD²⁴ and Tangram²⁵. Spotiphy’s predictions for macrophages and microglia aligned well with the ground truth. Multiple metrics were used to benchmark Spotiphy’s cellular deconvolution against other methods (Fig. 2b–d, Extended Data Fig. 1b–e, Supplementary Fig. 3 and Supplementary Table 3). Spotiphy consistently produced the highest overall Pearson’s correlation coefficient with the ground truth. Spotiphy’s cell-type proportions for each transcriptomic spot aligned more closely with the ground truth, as evidenced by higher values of correlation, the fraction of cells correctly mapped and the cosine similarity, along with lower values for the absolute error, mean square error and Jensen–Shannon divergence (JSD).

**Fig. 2: Benchmarking Spotiphy’s cellular deconvolution using matched Xenium data.**

Although the matched Xenium data were useful for evaluating biological validity, it was not an ideal choice of ground truth for benchmarking the method’s performance. This is because (1) Xenium’s predesigned gene panel has limited plexity, and (2) the data were generated from serial sections instead of an identical section. To obtain a comprehensive benchmarking analysis, we therefore constructed a series of simulated Visium datasets from scRNA-seq data with three noise levels. Additionally, we included eight paired ‘scRNA-Visium’ datasets from various human tissues³⁸ to evaluate the deconvolution performance of Spotiphy against the other 13 methods (Extended Data Fig. 2a–e, Supplementary Figs. 8–11, Supplementary Table 4 and Methods). The quantitative metrics further demonstrated that Spotiphy consistently ranked among the top in performance, if not the best. Moreover, Spotiphy’s computational time was substantially lower than that of the other methods (Extended Data Fig. 2f). In summary, with a total of 13 methods included for deconvolution benchmarking, our evaluations demonstrate Spotiphy’s superiority across different tissue types.

Spotiphy captures astrocyte regional specification

Besides surpassing its peers in deconvolution, Spotiphy is unique in its ability to decompose the transcriptomic profiles of spots to the single-cell level (iscRNA data; Extended Data Fig. 3a and Methods). When we applied unsupervised clustering³⁹ to the iscRNA data from 33,819 cells generated from mouse brain Visium data, individual astrocytes did not cluster together; instead, they formed multiple subclusters, indicating intraheterogeneity not apparent in the single-cell reference (Figs. 1c and 3a, Extended Data Fig. 3b–e and Supplementary Fig. 12). Similar phenotypes were observed in oligodendrocytes and multiple types of neurons. We subsequently color coded the astrocytes based on their respective subcluster and overlaid the same colors onto the histological images at their spots of origin (Fig. 3b). Surprisingly, the distinct spatial distribution of these subclusters corresponded perfectly with the classical histological regions of the mouse brain, in which the sagittal section is divided into six major topographic regions: cerebral cortex (CTX), hippocampus (HPF), fiber tracts (FT), thalamus (TH), hypothalamus (HY) and stratum (STR) (Extended Data Fig. 3f). Notably, the spatial distribution of astrocyte subclusters was nearly identical in the WT and AD samples; the proportions of each astrocyte subtype likewise did not differ substantially between the two samples (Fig. 3b and Extended Data Fig. 3e).

Further downstream analysis of the iscRNA data generated by Spotiphy revealed regional specification of astrocytes. To explore potential functional differences in the astrocytes of the telencephalon and diencephalon, we selected a region from each (CTX and TH, respectively) for comparison. Using NetBID2 (ref. ⁴⁰), we identified differentially expressed genes (DEGs) between CTX and TH astrocytes (Fig. 3c,d and Supplementary Table 5) and then used the top DEGs to define signature scores for each region (CTX score and TH score; Methods). As expected, each astrocyte scored high for its region of origin and low for the comparison region in both WT and AD mouse samples (Fig. 3e). To validate the accuracy of the iscRNA data and subsequent analyses, we used the CosMx data in the CTX and TH regions in the WT and AD mouse samples as the ground truth. The CosMx data also showed that astrocytes in the CTX region had a higher CTX score, and astrocytes in the TH region had a higher TH score (Fig. 3f,g). Differences in the actual transcripts (as measured by CosMx) of the astrocytes from these regions further supported this astrocyte regional specification. Gene set enrichment analysis (GSEA) demonstrated enrichment of CTX astrocytes in pathways associated with neuronal differentiation and cell fate specification; TH astrocytes, meanwhile, were found to show greater expression of genes related to neuronal synaptic plasticity. These data suggest distinct biological roles for astrocytes in different regions of the mouse brain (Extended Data Fig. 3g and Supplementary Table 6).

A recent study identified two distinct astrocyte clusters in mouse brains using single-nucleus RNA-seq (snRNA-seq) profiles of astrocytes enriched in different regions²¹. These clusters aligned with the telencephalon (cluster 1) and diencephalon (cluster 2), indicating regional expression differences in astrocytes (Extended Data Fig. 4a). When we applied the signature scores to these data, astrocytes in cluster 1 had a high CTX score and low TH score, whereas those in cluster 2 had a low CTX score and a high TH score, consistent with the astrocyte regional specification that we observed in the iscRNA data (Extended Data Fig. 4b,c).

Furthermore, we generated iscRNA data from the simulated Visium datasets discussed earlier. Unsupervised clustering of the simulated iscRNA data showed that astrocytes, oligodendrocytes and neurons were clustered together and did not form subclusters as they did in the real iscRNA data (Extended Data Fig. 4d,e). Nearly all astrocytes grouped into one single cluster, except for two minor subclusters, which exhibited random distribution and no clear regional patterns (Extended Data Fig. 4f,g). As the simulated Visium datasets were generated using scRNA-seq data, the lack of spatial variable genes and identifiable regional specification was anticipated. This result indicates that Spotiphy will not introduce any ‘variations’ stemming from its modeling process, suggesting that the regional specification observed in the real iscRNA data reflects genuine biological phenomena rather than merely an artifact of Spotiphy.

Spotiphy reveals disease-associated microglia regional specifications

To confirm that the iscRNA data generated by Spotiphy for rare cell populations are likewise biologically meaningful, we analyzed the microglial populations, which formed five subclusters corresponding to the histological regions of the mouse brain, with a single cluster (C&H) encompassing both the CTX and HPF regions (Fig. 4a,b and Extended Data Fig. 5a). Both the WT and AD brain samples showed similar distribution patterns of microglia in the FT, HY and STR regions; however, microglia in the C&H and TH regions were found to originate mainly from the AD sample (Extended Data Fig. 5b). To characterize these AD-specific microglial populations, we used NetBID2 to identify the top DEGs and found considerable overlap between those genes and disease-associated microglia (DAM) markers^41,42,43, suggesting that the microglia in these regions of the AD brain may in fact be DAM (Extended Data Fig. 5c and Supplementary Table 7). We then used commonly used microglia and DAM markers to define signature scores and applied them to the iscRNA data for the microglia from both samples (Extended Data Fig. 5d). As expected, the microglia scores for each region did not vary significantly between the WT and AD samples. The DAM score, however, was significantly higher in the AD brain, especially in the C&H and TH subclusters (Fig. 4c). These observations were validated by the CosMx data, in which the signal intensities of the DAM markers were greater for microglia located in the C&H and TH regions of the AD brain (Fig. 4d,e). Furthermore, GSEA identified a strong enrichment of immune response pathways in AD-specific microglia, consistent with previous scRNA-seq studies of DAM populations^41,42,43 (Extended Data Fig. 5e and Supplementary Table 8). Given the strong colocalization of DAMs and β-amyloid noted in previous studies^42,43,44,45, β-amyloid IHC staining was performed on both samples using the adjacent section (Extended Data Fig. 5f and Methods). β-Amyloid signals were substantially higher in the C&H, FT and TH regions of the AD brain, with the remaining regions showing no significant difference from regions without microglia (Extended Data Fig. 5g and Supplementary Table 9). Together, these results support that the AD-specific microglia in the C&H and TH regions were indeed the DAM population.

**Fig. 4: Spotiphy reveals microglia regional specification in AD mouse brain tissue.**

Unsupervised clustering of the iscRNA data for the AD sample identified C&H and TH subclusters of DAMs, indicating DAM regional specification (Extended Data Fig. 6a). This regional specification was supported by a further validation that used CosMx data as the ground truth. NetBID2 analysis identified DEGs for the C&H DAM and TH DAM subclusters (Extended Data Fig. 6b,c and Supplementary Table 10); these were then used to define signature C&H and TH scores that were applied to both the CosMx and iscRNA data. As expected, the iscRNA and CosMx data for C&H DAM showed higher C&H scores, and both datasets for TH DAM showed higher TH scores (Fig. 4f–h). GSEA showed that C&H DAM exhibit upregulation of genes related to the immune response, indicating greater immune activation in the CTX and HPF regions of the AD sample (Extended Data Fig. 6d and Supplementary Table 11). This finding was consistent with the greater relative accumulation of β-amyloid that we observed in the CTX and HPF regions of the AD sample^46,47. Additionally, the microglia in the simulated iscRNA data were clustered together and had a seemingly random distribution, confirming that the results were authentic and not artifact related (Extended Data Fig. 6e,f).

Spotiphy charts tumor–TME spatial domains in breast cancer

To evaluate Spotiphy’s ability to characterize tumors and the TME, publicly available scRNA-seq and Visium datasets^48,49 for breast tissue samples (BCCID4535, BC1160920F and BCCID44971) were applied to Spotiphy and produced iscRNA data for a total of 23,200 cells (Supplementary Fig. 13, Supplementary Table 12 and Methods). Unsupervised clustering analysis of these iscRNA data revealed multiple subclusters for luminal hormone-responsive (LumHR) cells and luminal secretory (LumSec) cells (Fig. 5a,b). To test if the iscRNA data could distinguish tumors from normal cells, we applied inferCNV analysis^{50,51,52,53,54} to both Visium and iscRNA data (Fig. 5c and Supplementary Fig. 14). Although the spot-level Visium data can reveal notable copy number variations (CNVs) in both tumor samples, obvious intraheterogeneity in the CNV patterns was also observed in BC1160920F (Supplementary Fig. 14). Due to resolution limits, we cannot determine which cells contribute to this variation. By contrast, the inferCNV results of the iscRNA data showed that LumHR cells from BCCID4535 and LumSec cells from BC1160920F possessed more CNVs, indicating their greater likelihood of being tumor cells (Fig. 5c). Both LumHR and LumSec cells from BCCID44971 exhibited fewer CNVs, indicating that they are normal cells (Fig. 5d,e and Extended Data Fig. 7a,b). This observation was consistent with the original study’s conclusion that BCCID44971 was normal breast tissue⁴⁹. Similarly, iscRNA data successfully identified epithelial cells as the primary tumor cells in the Visium datasets of lung cancer and colorectal cancer (Supplementary Fig. 15).

**Fig. 5: Spotiphy reveals tumor–TME changes across spatial domains in human breast tissue.**

We subsequently validated the tumor/nontumor identities of luminal cells (as determined by the iscRNA data) using Resolve smFISH data from tumor (P35-S1) and normal (P69-S3) breast tissue samples⁴⁸. We selected the DEGs for tumor (BC1160920F) and normal (BCCID44971) LumSec cells and then used them to define tumor LumSec (T-Sec) and normal LumSec (N-Sec) scores, which we subsequently applied to the iscRNA and smFISH datasets (Extended Data Fig. 7c,d and Supplementary Table 13). Indeed, genes that showed elevated expression in tumor LumSec cells also produced greater signal intensity in the smFISH data for the tumor sample (P35-S1); the same was observed for the N-Sec genes (Fig. 5f–h). Similarly, the top DEGs for tumor (BCCID4535) LumHR cells were also found to show elevated expression and greater signal intensity in both the iscRNA data and the smFISH data for the tumor sample (P35-S1), respectively (Extended Data Fig. 7e–i and Supplementary Table 14). Furthermore, GSEA detected elevated immune activation features in cells originating from the tumor samples (BC1160920F and BCCID4535; Extended Data Fig. 7j,k and Supplementary Tables 15 and 16), supporting Spotiphy’s accuracy in differentiating between tumor cells and normal cells.

Three LumSec tumor cell subclusters in BC1160920F exhibited unique localizations, distributed from the center to the periphery of the section (Fig. 5i and Extended Data Fig. 8a). We categorized spots into three ‘spatial domains’ based on the subcluster of their LumSec cells, termed as outer layer (OL), middle layer (ML) and inner layer (IL; Fig. 5j). We labeled cells by spatial domain and divided the iscRNA data into three domain-specific subsets. Similarly, we defined OL and IL spatial domains in BCCID4535 based on LumHR tumor cell subclusters (Extended Data Fig. 8a–c). The cell-type proportions varied across the spatial domains, with tumor cells (LumSec or LumHR cells) increasing and immune cells (T, B and myeloid cells) decreasing toward the center (Fig. 5j and Extended Data Fig. 8c). To better understand these variations, we used CellChat^55,56 to compare the tumor–TME communication patterns across these spatial domains. Dramatic differences were observed in the tumor–TME interactions in both BC1160920F and BCCID4535 (Fig. 5k,l and Extended Data Fig. 8d,e). Interestingly, the interaction strength (per CellChat) between tumor and immune cells did not always correspond to the cell-type proportions of the spatial domain. For LumSec cells in BC1160920F, the strongest immune reactions were detected in the ML domain, not the OL domain with the most immune cells (Fig. 5l). Interactions between the collagen family and SDC1/SDC4, as well as between MDK and SDC1/SDC4, are associated with cell mobility and were notably increased in the ML domains^57,58. Both pairs were upregulated in the ML domain of BC1160920F and the OL domain of BCCID4535 (Extended Data Fig. 8f,g), indicating that cell mobility may play a role in the infiltration of immune cells into tumors.

To highlight the uniqueness of Spotiphy’s decomposition function, we performed a comprehensive decomposition benchmarking of Spotiphy against Tangram²⁵, SpatialScope²⁹ and iStar³² using simulated Visium datasets of the mouse brain (Supplementary Fig. 16). Four methods delivered cell-type-level expression matrices for each spot. Spotiphy consistently achieved the highest correlation and cosine similarity and the lowest absolute error, mean square error and JSD compared to the ground truth. Spotiphy also delivered the highest Matthews correlation coefficient for each spot. In addition, Spotiphy requires significantly less computation time than its competitors. Considering that a typical Visium sample usually contains 3,000 to 4,000 spots⁷, this is a major advantage, allowing Spotiphy to deliver reliable results in a shorter time frame.

Another strategy for decomposition benchmarking involves merging the iscRNA data into pseudo-Visium data according to the spot origin of cells and then comparing it with the original inputs (Supplementary Fig. 17). Because Spotiphy’s decomposition strategy involves partitioning gene counts into different cells, merging its iscRNA data results in output that is almost identical to the Visium input, achieving the highest overall Pearson’s correlation coefficient. iStar also delivers good performance with the second highest Pearson’s correlation coefficient. Tangram and SpatialScope, however, demonstrated less than ideal results. We then examined the spatial distribution of six spatial variable genes in the pseudo-Visium data. Spotiphy’s output accurately preserved the original distribution patterns. By contrast, SpatialScope, limited to decomposing marker genes from its early phase, only provided the distribution of two genes. Although Tangram’s ‘iscRNA data’ offer whole-genome coverage, it distorted the distribution of two genes and exhibited less overall accuracy. Collectively, these results strongly illustrate Spotiphy’s advancement, accuracy and effectiveness and bring single-cell resolution to whole-transcriptomic ST data, preserving the unique distribution patterns of spatial variable genes and creating opportunities for new insights.

Spotiphy creates pseudo-single-cell whole-slide images

Spotiphy addresses the other major drawback of sequencing-based ST approaches, specifically information loss in noncapture areas (out-spot space). Visium⁷ and DBiT-Seq⁸ typically sequence about 50% of the entire section. In both mouse brain Visium datasets, only 34% of cells were within the spots (in-spot space), leading to profound information loss (Extended Data Fig. 9a). After analyzing the seamless image-based Xenium data of the mouse brain, human lung cancer and human colorectal cancer, we observed that within small scales (<100 μm), cellular proportion changes are minimal. This allows us to impute the information for noncapture areas based on the nearby regions (Supplementary Fig. 18). In fact, cells of the same type typically exhibit a gradually uniform distribution within a tissue section to maintain tissue function^59,60,61, a pattern that has also been observed in MERFISH data⁶². Most genes mirror the steady and consistent distribution, particularly those involved in maintaining basic cellular functions and tissue architecture^63,64. Consequently, Spotiphy imputes cell-type proportions of out-spot space via Gaussian processes (Extended Data Fig. 9b,c).

We then validated the imputation’s accuracy using two fields of view (FOVs) located in the CTX and HPF in matched Visium and Xenium data from AD brain samples. CytoSPACE²³ and Tangram²⁵ were used to benchmark Spotiphy’s performance (Fig. 6a and Extended Data Fig. 9d). Coloring by cell types, Spotiphy yields pseudo-single-cell-resolution images that are highly consistent with the ground truth (Fig. 6b and Extended Data Fig. 9e). All methods delivered reasonable cell-type proportions for the in-spot space, with Spotiphy’s results being closest to the ground truth (Fig. 6c,d and Supplementary Table 17). Intriguingly, the cell-type proportions for the out-spot space inferred by Spotiphy were closer to the ground truth than the Tangram- and CytoSPACE-generated data for the in-spot space. Furthermore, Spotiphy’s merged cell-type proportions for both the in-spot and out-spot spaces were consistent with the accuracy of the in-spot results. These findings support imputation via Gaussian processes as an approach to reconstructing the whole section from sequencing-based ST spot grids. Spotiphy also provides kernel density smoothing as an alternative to estimate the transcriptomic profiles of cells in noncapture areas (Methods).

**Fig. 6: Spotiphy outputs pseudo-single-cell-resolved whole-transcriptome images.**

Spotiphy overcomes in situ single-nuclei sequencing limits

Slide-tags is another sequencing-based platform that was recently introduced¹¹. Slide-tags labels individual nuclei in a whole section with unique barcodes, generating snRNA-seq data with high gene coverage along with spatial information. Although Slide-tags represents an important innovation in the ST field, its practicality is limited by the method’s substantial loss of nuclei (approximately 75%)¹¹. Similar to how Spotiphy’s imputation function recovers data from noncapture areas in spot-based ST, treating each nucleus in Slide-tags data as a ‘spot’ enables us to retrieve the lost snRNA-seq data. To pinpoint the ‘missing’ nuclei that Slide-tags has not labeled, we aligned the hematoxylin and eosin (H&E) image of mouse HPF data (839 nuclei) from the original paper with the image of the same brain region in our AD brain, wherein 3,193 nuclei were detected (Extended Data Fig. 10a–c and Methods). Spotiphy generated iscRNA data for 3,193 nuclei using Slide-tags snRNA-seq data via the imputation function. Cells from both datasets mixed well by cell type after unsupervised clustering, indicating that Spotiphy’s results were accurate and free of batch effects (Extended Data Fig. 10d,e). iscRNA data displayed a similar expression pattern of marker genes identified by Slide-tags snRNA-seq data (Extended Data Fig. 10f). The cell-type proportions obtained from Slide-tags snRNA-seq data and iscRNA data for the HPF were also concordant, further supporting the accuracy of Spotiphy’s inference (Extended Data Fig. 10g). Although no ‘new’ data was generated, Spotiphy was able to increase the number of nuclei for each cell type, providing sufficient statistical power for downstream analysis. Whether these additional pseudocells could improve the accuracy of cellular proportion analysis is worth further exploration.

Together, the results reported herein demonstrate that Spotiphy effectively addresses the key challenges of sequencing-based ST approaches, delivering accurate and biologically meaningful single-cell-resolved whole-transcriptomic ST data of all detectable cells. These data are compatible with conventional scRNA-seq analysis algorithms and tools.

Discussion

Spotiphy is an integrated method that marries the genome-wide coverage of sequencing-based ST approaches with the single-cell resolution of image-based methods. Spotiphy offers a wider array of features (Supplementary Fig. 19a). Spotiphy’s superior inference of cell-type proportions^{21,22,23,24,25} largely owes to its improved marker gene selection and generative modeling approach. Rather than simply predicting the cell-type proportions of transcriptomic spots, Spotiphy models the gene counts for each cell in each spot; in doing so, it not only greatly increases the information density but also brings single-cell resolution to sequencing-based ST data. By contrast, Cell2location²¹ and RCTD²² use probabilistic models without explicitly incorporating decomposed expression into their models. CytoSPACE²³, CARD²⁴ and Tangram²⁵ approach deconvolution from an optimization perspective, attempting to identify the combination of cells in the scRNA-seq reference whose collective expression most closely resembles that of each spot in ST data. They face two drawbacks. (1) Output accuracy heavily relies on the quality of reference data. If the reference lacks or contains only a limited number of the correct cell types, the algorithm would produce results deviating from the ground truth. This explained their poor performance for immune cell populations. (2) No additional information is gained from this mapping process, rendering them less powerful than Spotiphy. SpatialScope²⁹ uses a strategy similar to RCTD²² for cell-type proportion deconvolution and enhances spot expression assembly with pseudocells that closely resemble the scRNA-seq reference, as generated by a deep learning model. However, it confines the gene expression for each cell type to a predetermined range that aligns with the scRNA-seq reference, making it difficult to parse cellular regional specification. Spotiphy uses a Bayesian approach to dissect the expression of individual genes and assign them to specific cell types, producing iscRNA data that more accurately reflect expression fluctuations within transcriptomic spots and reveal cellular regional specifications. Capturing region-specific cell subtypes through conventional single-cell transcriptomics technologies is challenging for several reasons. (1) The single-cell sampling may not cover enough regions of tissue, where ST data usually have full coverage of the whole-tissue block. (2) The population of cells, particularly rare types of cells, is often too small to detect these intravariations for one single-cell profiling. By contrast, Spotiphy can generate an average of about 10,000 cell expression profiles from a single tissue section (whereas one scRNA-seq sample typically has around 5,000 cells on average). (3) It is impossible to determine whether subclusters of certain cell types observed in scRNA-seq data are due to regional variations without the spatial information unique to ST data. Spotiphy can recover the information loss for noncapture areas via Gaussian processes (Supplementary Fig. 20). The insights into the cellular organization, heterogeneity and function within complex biological systems provided by Spotiphy will deepen our understanding of the molecular mechanisms underlying AD pathogenesis and the pathogenesis of solid tumors.

Spotiphy’s results are reliant on the quality of input. Because the decomposition applies to every gene, the iscRNA data will retain the gene coverage of the ST data. Therefore, sequencing-based ST datasets are well suited for Spotiphy, mainly because of their whole-transcriptomics coverage. Other approaches have enhanced their resolution but at the expense of gene coverage. For example, Slide-Seq v1/Slide-Seq v2 achieve a resolution of approximately 10 µm but provide gene coverage of 10¹~10³ (refs. ^9,10); Stereo-seq offers subcellular-resolution spot size (0.2 µm) with a gene coverage of 10²~10³ (ref. ¹²). Data generated through these approaches do not stand to greatly benefit from the application of Spotiphy. At present, pairing gene coverage-prioritizing approaches (for example, Visium and DBiT-Seq) with Spotiphy may yield the best outcomes; the data discussed in the Results had a relatively high gene coverage of 10³~10⁴ at single-cell resolution (Supplementary Fig. 19b).

Spotiphy’s performance is also contingent on the quality of the scRNA-seq reference for its generative model. An scRNA-seq dataset derived from the same sample as the ST data would be ideal to avoid batch effects. However, acquiring such matched datasets in the real world is both technically challenging and financially cumbersome. To accommodate unmatched datasets, Spotiphy assigns variables (batch prior) to each gene to model the differences between ST and scRNA-seq profiles, effectively mitigating batch effects on deconvolution and decomposition. Spotiphy also presets four additional hyperparameters, including the quantile parameter, fold change threshold, nonzero count cell percentage and number of marker genes, to ensure scRNA-seq reference accuracy and robustness (Supplementary Fig. 1). In fact, the scRNA-seq reference used for mouse brains was primarily sourced from Allen Brain Map Atlas³⁶, demonstrating that Spotiphy can produce reliable deconvolution results even when the reference originates from nonmatched tissues. This flexibility greatly broadens Spotiphy’s potential applications, allowing for more uses in various research contexts without the need for matched reference datasets.

Spotiphy produces single-cell spatial whole-transcriptomic profiles, but it is not the only solution. Emerging methods like iSpatial⁶⁵ and Liger⁶⁶ aim to generate the same results from image-based ST data, which could be worth exploring, but with some caveats. The major limitation of image-based ST approaches is the low-plexity of their predesigned gene panels⁶⁷. These methods use scRNA-seq data to predict the expression of nontargeted genes, a process complicated by the different measurement techniques used by scRNA-seq and image-based ST to quantify gene expression. For scRNA-seq, the raw count is a measure of the free mRNA molecules within an individual cell⁶⁸, whereas the raw counts produced by image-based ST are a measure of the positive fluorescent signals within regions around the nuclei⁶⁹. This fundamental discrepancy greatly diminishes the usefulness of scRNA-seq data as a reference, and minimizing these ‘technique-derived batch effects’ is challenging. Additional limitations make image-based ST data less suitable for inference. First, the fluorescence signal intensity is highly influenced by probe affinity, which varies across probe sets. Second, different excitation wavelengths will exhibit different signal intensities, complicating signal quantification. Third, image-based ST data tend to be sparser due to differences in the sensitivity of different probes. Together, these issues make it more difficult to accurately compare expression levels between different genes.

Potential future directions for Spotiphy include the implementation of five improvements. First, a deep learning-based vision transformer model that correlates cell-type information with image features would improve Spotiphy’s performance by taking advantage of extra information from histological images. Currently, Spotiphy assigns a cell type and corresponding iscRNA profile to a random nucleus within each spot. Although this random assignment is confined to a small region (50~100 µm), it poses limitations in improving the accuracy of cell-type assignment. This solution, which requires the use of massive training datasets, is contingent on the increased availability of sequencing-based ST data in the future⁷⁰. It is also worth noting that for users studying cell–cell interactions at a fine scale, vision transformer modeling has the potential to increase inference accuracy. By leveraging additional image features, cell-type information and iscRNA data can be effectively assigned to the corresponding cells. A second exciting extension for Spotiphy involves reconstructing the three-dimensional (3D) structure of tissue. Recent studies have shown that profiling sequential sections with ST can reveal the real 3D organization of tissues^71,72. Yet, it is costly and not feasible for broad use, which is where Spotiphy can step in. By leveraging the similarities in image features from adjacent sections, Spotiphy could infer the cellular proportions and expression profiles of neighboring sections, ultimately reconstructing the 3D structure with single-cell-resolved profiles. STalign⁷³ would be of great assistance for ensuring alignments between adjacent sections. Another opportunity for improvement resides in Spotiphy’s inability to identify cell types that are present in the ST data but not in the scRNA-seq reference. We believe that this shortcoming owes primarily to the algorithm’s inability to discern whether uncharacterized gene expression is indicative of an as of yet undefined cell type or simply a biological change in an existing cell type. In other words, the ease and accuracy of predictions could be improved if the gene coverage of sequencing allowed for a clearer distinction between cell subtypes and even substates. We therefore recommend including as many cell types as possible in the construction of the scRNA-seq reference to optimize the precision of Spotiphy’s results. Furthermore, with the continuous advancement of single-cell sequencing technologies, we aim to enhance Spotiphy by incorporating inputs from scMultiome (scRNA and scATAC)⁷⁴ and Paired-Tag (scRNA and scChIP)⁷⁵ profiles with spatial-ATAC-seq⁷⁶ and spatial-CUT&Tag⁷⁷, which have yielded spatially resolved chromatin modification and epigenome profiles. This expansion would enable Spotiphy to integrate and analyze multimodal data, providing a more comprehensive and detailed understanding of spatially resolved cellular states and regulatory mechanisms. Last, improvement in the resolution of ST platforms holds the potential to enhance existing methods that directly use ST data for principal component analysis and clustering analysis^78,79. By leveraging the iscRNA data generated by Spotiphy as input, these approaches will achieve more accurate and comprehensive output results at both gene and cellular levels. Their outputs, including DEGs and neighborhood signatures, will provide more realistic guidance for future experimental validations.

Methods

Sample information and processing

Animals

Four-month-old 5xFAD and C57BL/6J (all Jackson Laboratory) mice were used in this study. All animals were housed within the vivarium at St. Jude Children’s Research Hospital or Thomas Jefferson University and maintained on a 12-h light/12-h dark cycle with ad libitum access to food and water. All experimental procedures using animals were performed in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals, and all protocols were approved by the St. Jude Children’s Research Hospital (protocol 542) Institutional Animal Care and Use Committee. Experiments were performed in accordance with the Code of Ethics of the World Medical Association (Declaration of Helsinki) for animal experiments.

Tissue preparation

Mice were deeply anesthetized with Avertin and perfused with ice-cold 1× PBS (pH 7.4) to flush circulating blood cells from the brain. The exsanguinated brain was quickly removed, and the left hemisphere was dissected and placed in ice-cold HBSS (Gibco, Life Technologies). The right hemisphere was embedded in OCT and kept at −80 °C until use. Embedded brains were cut from the midline to the lateral side in laterally at the sagittal plane level in 10-μm-thick sections or as required by the cryostat.

Single-nucleus isolation and multiome profiling

The left hemisphere was used for profiling using a 10x Genomics Chromium Single Cell Multiome ATAC + Gene Expression kit. A small piece of the tissue was flash-frozen and used for direct nuclei isolation by following the 10x Genomics demonstrated protocol Nuclei Isolation from Complex Tissues for Single Cell Multiome ATAC + Gene Expression Sequencing (CG000375 Rev A). Nuclei were counted by staining with Trypan Blue on a hemocytometer, and the concentration was adjusted to 1,000 nuclei per μl. The remaining larger piece of the tissue sample was dissociated to yield single cells using papain-based dissociation solution on a GentleMACS Octodissociator at 37 °C for 20 min. After filtering the dissociated sample, neuronal and immune cells were separated from unwanted myelin using density gradient centrifugation with 22% Percoll. The pellet containing microglia, lymphocytes and nonimmune neuronal cells like astrocytes, oligodendrocytes and neuronal stem cells was washed in HBSS buffer and stained with CD45 (1:100) and CD11b (1:100) to enrich immune cell populations using a FACS Aria III Cell Sorter. After isolating CD45⁺CD11b⁺ microglia, CD45⁺ other immune cells and double-negative neuronal cells were counted using Acridine Orange/propidium iodide staining on a Luna FL Cell Counter. In total, 15,000 neuronal cells, 10,000 microglia and all cells of the immune population were mixed and used for nuclei isolation using 10x Genomics Demonstrated Protocol Nuclei Isolation for Single Cell Multiome ATAC + Gene Expression Sequencing (CG000365 Rev A). Nuclei were counted by staining with Trypan Blue on a hemocytometer and adjusted to a concentration of 2,000 nuclei per μl. Nuclei isolated from both pieces were mixed such that 5,000 nuclei were targeted from nuclei isolated from enriched cell populations and 1,000 nuclei were targeted from nuclei obtained directly from frozen sections. The mixed nuclei population was tagmented and immediately loaded on to Chip J to be partitioned into GEMS using 10x Genomics User Guide Chromium Next GEM Single Cell Multiome ATAC + Gene Expression (CG000338 Rev C). Partitioned nuclei underwent barcoding, reverse transcription, cDNA amplification and sample index ligation, and libraries were made for both ATAC-seq and gene expression sequencing. Libraries were quantified using a D5000 kit on an Agilent Tapestation and Illumina MiSeq platform. Quantified libraries were then sequenced on an Illumina NovaSeq 6000.

Sequencing-based ST profiling (10x Genomics Visium)

The right hemisphere was embedded in OCT (Tissue-Tek, Sakura) after removing excess PBS. The tissue was cryosectioned as per the tissue preparation guide from Visium Spatial Gene expression (10x Genomics, CG000160). Briefly, OCT-embedded tissue blocks were sectioned (10 µm) and placed within the capture area of the Visium spatial slide (10x Genomics, PN-1000184). The slides were then permeabilized for 18 min at 37 °C as determined with the Visium Tissue optimization procedure. For Visium Spatial Gene expression, brightfield images were acquired using an AxioScan Z.1 Whole-Slide scanner and exported into JPG files. cDNA libraries were generated according to the Visium Spatial Gene Expression User Guide (10x protocol, CG000239). The libraries were loaded and sequenced on an Illumina NovaSeq 6000 following the recommendation’s of the 10x Genomics sequencing parameters. Raw data were converted into FastQ and matrices of expression using Space Range software V1.0 provided by 10x Genomics.

Image-based ST profiling (10x Genomics Xenium and NanoString CosMx)

Serial sections of mouse brain tissue were sent to 10x Genomics and NanoString for processing. A 298-plex mouse neuron panel was used for 10x Genomics Xenium profiles, and a 1,000-plex mouse panel was used for NanoString CosMx profiles (Supplementary Table 18).

β-Amyloid staining

Ten-micron serial sections of mouse brain tissue were generated for IHC staining using X-34 targeting β-amyloid plaques. High-resolution images of H&E staining were acquired using a Zeiss system. By aligning the IHC images with Visium images based on histological features, we can further quantify the fluorescence intensity in different brain regions.

Mouse brain scRNA-seq reference

scRNA-seq data from 1,228,636 cells from Yao et al.³⁶ were used to identify 22 cell types in the mouse brain. The whole dataset was downsampled to 5,000 random picked cells for cell types with numbers over 10,000. In total, 103,281 cells were selected and further merged with 49,406 cells of seven cell types from the original generated dataset.

Human breast datasets

scRNA-seq data from 126 specimens from Kumar et al.⁴⁸ were used to construct the single-cell reference with ten major cell types (Supplementary Fig. 13a–c). Three 10x Genomics Visium profiles from Wu et al.⁴⁹ were used as ST inputs. BCCID4535 and BC1160920F are labeled as tumor samples, and BCCID44971 is labeled a normal control in the original paper⁴⁹. Spotiphy was performed on three Visium samples separately (Supplementary Fig. 13d). Resolve smFISH data from samples P35-S1 (tumor) and P69-S3 (normal) from Kumar et al.⁴⁸ were used as the ground truth for downstream analysis and validation. The original data links are available in the Data availability section.

Clustering based on Seurat and scMINER

All clustering analyses of iscRNA data in this study were performed using scMINER³⁹. Clustering of the mouse brain scRNA-seq reference was performed using Seurat⁸⁰ for batch correction with harmony⁸¹, and clustering of the human breast scRNA-seq reference was performed using Seurat and the same parameters as those used in the original paper⁴⁸.

Downstream analyses and signature score definition

Differential expression analysis was performed using the scMINER getDE() function (https://jyyulab.github.io/scMINER). getDE() uses the ‘limma (v-3.60.4) R package’ as the default method. The GSEA in this study was conducted using the NetBID2 (ref. ⁴⁰) funcEnrich.Fisher() function and visualized with the draw.funcEnrich.cluster() function (https://jyyulab.github.io/NetBID). The top DEGs were selected (Supplementary Table 19), and their average values were used as a signature score through the Seurat⁸⁰ AddModuleScore function.

Tumor identification by inferCNV

The tumor–normal distinction in this study was performed using the inferCNV package^{50,51,52,53,54} with default settings. It is worth mentioning that the inferCNV package is sensitive to the choice of normal cell reference. Therefore, it is important to select the appropriate normal cell reference in the iscRNA data to ensure proper result interpretation.

Cell–cell communication

Cell–cell communication analysis and comparisons were performed using the CellChat package^55,56 with default settings.

Spotiphy pipeline

Introduction

For a more detailed description of the Spotiphy pipeline, see the Supplementary Methods. In summary, Spotiphy integrates probabilistic modeling, statistical testing and computer vision techniques to (1) estimate the proportion of cell types at each capture area (for example, circular spots) of the tissue, (2) decompose the spatial expressions of the multicell mixtures into the single-cell level and (3) generate pseudo-single-cell-resolution images. The pipeline takes three inputs. The first input is an untransformed spatial expression count matrix ({boldsymbol{X}}{{mathbb{in }}{mathbb{R}}}^{Stimes G}) of locations (sin {mathrm{1,2},cdots ,S}) and genes (gin {mathrm{1,2},cdots ,G}), which can be obtained by using sequencing-based ST technologies, such as 10x Genomics Visium and Slide-Seq. The second input is the scRNA count matrix ({boldsymbol{Y}}{{mathbb{in }}{mathbb{R}}}^{Ctimes G}) of cells (cin {mathrm{1,2},cdots ,C}) and genes (gin {mathrm{1,2},cdots ,G}), which is normalized by the counts per million method. All cell types need to be annotated, and we let (tau (c)in {mathrm{1,2},cdots ,T,}) denote the type of cell (c), where (T) is the number of cell types. The last input is the high-resolution H&E-stained image, which is used to determine the location of each nucleus and the number of nuclei in each capture area. Note that the H&E-stained image is not needed when estimating cell-type proportions.

Marker gene selection

Let ({z}_{t,g}) and ({hat{z}}_{t,g}) denote the average expression of gene (g) in type (t) cells and the corresponding estimation based on scRNA matrix ({boldsymbol{Y}}). Let ({{boldsymbol{F}}}_{g,t}=left{{hat{z}}_{t,g}/{hat{z}}_{{t}^{{prime} },g},|,{t}^{{prime} }ne tright}) denote the set of fold change values for gene g when comparing gene expression in cell type (t) with each other cell type. Let ({f}_{g,t}(v)) denote the (v) th sample quantile for the values in set ({{boldsymbol{F}}}_{g,t}). Additionally, with the scRNA matrix ({boldsymbol{Y}}), we can conduct a z test for the null hypothesis ({H}_{0}:{z}_{t,g}le {z}_{{t}^{{prime} },g}) and alternative hypothesis ({H}_{1}:{z}_{t,g} > {z}_{{t}^{{prime} },g}), where the P value is denoted as ({lambda }_{g,t,{t}^{{prime} }}). Let ({w}_{t,g}) denote the coverage rate of gene (g) in type (t) cells, that is, ({w}_{t,g}) is the percentage of cells that have nonzero expression of gene (g). With the parameter (v) and thresholds ({l}_{{rm{fold}}}), ({l}_{lambda }) and ({l}_{{rm{cover}}}), gene (g) is selected as a candidate of type (t) marker when the following three conditions are satisfied:

({f}_{g,t}(v) > {l}_{{rm{fold}}}),
(max left{{lambda }_{g,t,{t}^{{prime} }}|{t}^{{prime} }ne tright} < {l}_{lambda }) and
({w}_{t,g} > {l}_{{rm{cover}}}).

These conditions ensure that the candidate genes are effective in distinguishing type (t) cells from all other cell types. Finally, if cell type (t) has more than ({n}_{{rm{select}}}) candidate genes, we rank them based on the fold change quantile ({f}_{g,t}(v)) and only select the top ({n}_{{rm{select}}}) genes. We repeated this process for all cell types and aggregated the selected marker genes. By only keeping the marker genes in matrices ({boldsymbol{X}}) and ({boldsymbol{Y}}), we obtain matrices ({{boldsymbol{X}}}^{,({rm{m}})}{{mathbb{in }}{mathbb{R}}}^{Stimes {G}_{{rm{m}}}}) and ({{boldsymbol{Y}}}^{,({rm{m}})}{{mathbb{in }}{mathbb{R}}}^{Ctimes {G}_{{rm{m}}}}), where ({G}_{{rm{m}}}) is the total number of selected marker genes, and the superscript ‘m’ means that only the marker genes are considered.

Construction of the scRNA-seq reference matrix

We let ({{boldsymbol{Phi }}}^{({rm{m}})}in {{mathbb{R}}}^{Ttimes {G}_{{rm{m}}}}) be the scRNA reference matrix, where ({varphi }_{t,g}^{({rm{m}})}) represents the average proportion of gene (g) in expression of type (t) cells. Thus, we have (mathop{sum }nolimits_{g=1}^{{G}_{{rm{m}}}}{varphi }_{t,g}^{({rm{m}})}=1) for (t=mathrm{1,2},cdots ,T). Reference matrix ({{boldsymbol{Phi }}}^{({rm{m}})}) is estimated by maximizing the likelihood function, an approach similar to BayesPrism⁸², where the results can be expressed as

$${hat{varphi }}_{t,g}^{({rm{m}})}=frac{{sum }_{left{{c|}tau left(cright)=tright}},{y}_{c,g}^{({rm{m}})}}{{sum }_{left{{c|}tau left(cright)=tright}}{sum }_{{g}^{{prime} }=1}^{{G}_{{rm{m}}}}{y}_{c,{g}^{{prime} }}^{({rm{m}})}}.$$

Probabilistic generative model and estimation of cell-type proportions

To model the spatial expression matrix ({{boldsymbol{X}}}^{,({rm{m}})}), we let ({boldsymbol{Q}}in {{mathbb{R}}}^{Stimes T}) be the score matrix of gene contribution, where the gene contribution score ({q}_{s,t}) denotes the proportion of genes at location (s) that are contributed by type (t) cells. Thus, (mathop{sum }nolimits_{t=1}^{T}{q}_{s,t}=1) for (s=mathrm{1,2},cdots ,S). The proportion of gene (g) at location (s) can be derived as

$${rho }_{s,g}=mathop{sum }limits_{t=1}^{T}{q}_{s,t}bullet {varphi }_{t,g}^{({rm{m}})}.$$

However, because the scRNA data and ST data are obtained using different technologies, the batch effect between these two data types cannot be ignored. To quantify the batch effect, we introduced batch effect parameters ({boldsymbol{r}}={{r}_{1},{r}_{2},cdots ,{r}_{G}}) to the probabilistic generative model and adjust ({rho }_{s,g}) as

$${widetilde{rho }}_{s,g}=frac{{rho }_{s,g}bullet {2}^{{r}_{g}}}{{sum }_{{g}^{{prime} }=1}^{{G}_{{rm{m}}}}{rho }_{s,{g}^{{prime} }}bullet {2}^{{r}_{{g}^{{prime} }}}}.$$

We then assumed that conditioned on score matrix ({boldsymbol{Q}}), batch effect parameters ({boldsymbol{r}}) and the scRNA reference matrix ({{boldsymbol{Phi }}}^{({rm{m}})}), the spatial expression at location (s) follows a multinomial distribution,

$${{boldsymbol{x}}}_{s}^{({rm{m}})}left|{boldsymbol{Q}},{boldsymbol{r}},{{boldsymbol{Phi }}}^{({rm{m}})}right.sim {rm{Multinomial}}left({m}_{s}^{{prime} },left[{widetilde{rho }}_{s,1},{widetilde{rho }}_{s,2},cdots ,{widetilde{rho }}_{s,{G}_{{rm{m}}}}right]right),$$

where ({m}_{s}^{{prime} }) is the total gene count at location (s). For full details of the generative model, see the Supplementary Methods.

In Spotiphy, we applied variational inference to approximate the posterior distribution of score matrix ({boldsymbol{Q}}) and batch effect parameters ({boldsymbol{r}}). This approach was implemented using the Python package Pyro⁸³, which supports GPU acceleration. The posterior mean values were then used as the estimation of the unknown parameters.

Nucleus segmentation and inference of cell boundaries

In the Spotiphy pipeline, we adopted the pretrained deep learning model from Stardist^84,85 to segment the nuclei in H&E-stained images. With the segmentation results, we assumed that a cell was at location (s) if the center of the cell’s nucleus was inside the capture area. We let ({N}_{s}) denote the number of cells at location (s).

Decomposition of spatial expression

We first update the estimated cell-type proportions. When H&E-stained images are available, we can leverage the nucleus segmentation results. Specifically, for (t=mathrm{1,2},cdots ,T), we let ({n}_{s,t}) denote the number of type (t) cells at location (s). These values can be determined by solving the following optimization problem:

$$minmathop{sum }limits_{t=1}^{T}left|frac{{n}_{s,t}}{{N}_{s}}-{p}_{s,t}right|$$

$${rm{s}}.{rm{t}}.mathop{sum }limits_{t=1}^{T}{n}_{s,t}={N}_{s}$$

Then, the updated cell-type proportions at location (s) are ({widetilde{p}}_{s,t}={n}_{s,t}/{N}_{s}) for (t=mathrm{1,2},cdots ,T). When an H&E-stained image is not available, we define the threshold ({l}_{{rm{p}}}) and assume that if ({p}_{s,t} < {l}_{{rm{p}}}), there is no strong evidence to suggest the existence of cell-type (t) at location (s). Consequently, the proportion is updated as follows:

$${tilde{p}}_{s,t}=left{begin{array}{ll},0, & {p}_{s,t} < {l}_{{rm{p}}}\ frac{{sum }_{{t}_{1}=1}^{T}{p}_{s,{t}_{2}}cdot I(,{p}_{s,{t}_{1}}ge {l}_{{rm{p}}})}{{sum }_{{t}_{2}=1}^{T}I(,{p}_{s,{t}_{2}}ge {l}_{{rm{p}}})} & {p}_{s,t}ge {l}_{{rm{p}}}end{array},right.$$

where (I(bullet )) is the indicator function. After updating the cell-type proportions, we let (widetilde{{boldsymbol{P}}}in {{mathbb{R}}}^{Stimes T}) be the corresponding matrix where the ({st}) th entry is ({widetilde{p}}_{s,t}).

To make sure that the decomposed spatial expression can facilitate more downstream analysis, we aimed to decompose the spatial expression of all (G) genes, rather than the ({G}_{m}) marker genes. Thus, we constructed a single-cell reference matrix ({boldsymbol{Phi }}in {{mathbb{R}}}^{Ttimes G}) using the full scRNA count matrix ({boldsymbol{Y}}). In addition, we let ({boldsymbol{U}}in {{mathbb{R}}}^{Stimes Gtimes T}) denote a 3D tensor, where ({u}_{s,t,g}) is the expression of gene (g) in type (t) cells at location (s). As a result, ({{boldsymbol{u}}}_{s,g}=[{u}_{s,g,1},{u}_{s,g,2},cdots ,{u}_{s,g,T}]) is the decomposition of spatial expression ({x}_{s,g}). By applying the probabilistic model again, the conditional distribution of ({{boldsymbol{u}}}_{s,t}) is

$${{boldsymbol{u}}}_{s,g}left|{boldsymbol{X}},{boldsymbol{Phi }},widetilde{{boldsymbol{P}}}right.sim {rm{Multinomial}}left({x}_{s,t},left[{omega }_{s,g,1},{omega }_{s,g,2},cdots ,{omega }_{s,g,T}right]right),$$

where ({omega }_{s,g,t}=frac{{widetilde{p}}_{s,t}bullet {varphi }_{t,g}}{{sum }_{{t}^{{prime} }=1}^{T}{widetilde{p}}_{s,{t}^{{prime} }}bullet {varphi }_{{t}^{{prime} },g}}). Then, ({u}_{s,t,g}) can be estimated as the posterior mean: ({hat{u}}_{s,t,g}={x}_{s,t}bullet {omega }_{s,g,t}).

Cell-type annotation in the pseudo-single-cell-resolution images

Recall that we can estimate the exact numbers of each cell type within each capture area. However, we are not able to identify which specific nuclei belong to each cell type. Thus, given that there are ({n}_{s,t}) cells of type (t) at location (s), we randomly assign ({n}_{s,t}) nuclei as belonging to cell-type (t) within the capture area. For nuclei outside the capture areas, we used Gaussian processes to estimate the proportion of each cell type in the neighborhood of each nucleus outside the capture areas. In this way, we assign a cell type to each nucleus outside the capture area by randomly sampling from the estimated proportions of different cell types. By annotating the nuclei and inferring the cell boundaries, we obtain a pseudo-single-cell-resolution image that closely resembles the output of image-based ST approaches (Supplementary Fig. 20).

Evaluation of cell-type proportion estimation using Xenium, CosMx and synthetic data

Alignment of 10x Genomics Xenium and NanoString CosMx data to 10x Genomics Visium data

Alignment is necessary to make the Xenium data, CosMx data and Visium data comparable. To achieve this goal, we first calibrated the 10x Genomics Xenium nuclei (stained with DAPI) image and NanoString CosMx nuclei (stained with DAPI and 18Sr/H3) image to match the scale of the Visium H&E-stained image, ensuring that pixels in both images represent the same physical size. It is worth mentioning that due to the sensitivity differences between DAPI and H&E staining, the absolute number of nuclei on Visium (48,219) is smaller than on Xenium (104,831). We then applied affine transformation to Xenium and CosMx images to align the histological features to Visium images. Taking Xenium as an example, we first scaled the Xenium image to make sure each pixel in the image represents the same physical length as the H&E-stained image from the Visium data (0.753 μm per pixel). We then aligned the Xenium image with the H&E-stained image by rotating and translating the Xenium image to ensure that the brain tissue regions (for example, CTX, HPF, TH, HY and STR) were properly aligned. We then projected the Visium spot array onto the Xenium image. In this way, we calculated the cell-type proportion of each spot using cell annotations identified by Xenium profiles as ground truth and further evaluated the accuracy of Spotiphy outputs (Supplementary Fig. 2c). For NanoString CosMx, we obtained data from a total 20 of restricted FOVs (Supplementary Fig. 2d).

Generation of simulated ST datasets

To create synthetic ST datasets that closely resembled the actual dataset, we used the estimated number of each cell type at every location, as determined by Spotiphy, as the ground truth. Specifically, for generating the spatial expression at a given location (s), we randomly sampled ({n}_{s,t}) cells of type (t) cells from the scRNA-seq data for (t=mathrm{1,2},cdots ,T). We then merged all scRNA-seq expression data. To enhance the resemblance of the synthetic expression to real data, we introduced batch effects, artificial zero reads and random noise (Extended Data Fig. 2a). Considering the magnitude of the disturbance, we produced three synthetic datasets characterized by small, medium and large levels of noise (Supplementary Methods).

Estimation of cell-type proportions in mouse brain and human breast cancer samples using Spotiphy

When select marker genes for the 27 cell types in mouse brain data and for the 10 cell types in human breast cancer data, the parameters are selected as ({l}_{{rm{fold}}}=1.5), ({l}_{lambda }=0.1), ({l}_{{rm{cover}}}=60 %), (v=0.15) and ({n}_{{rm{select}}}=50). This is also the default parameter setting in Spotiphy. In total, 1,059 marker genes are selected. To approximate the posterior distribution of the unknown parameters in the probabilistic model, the parameters are updated for 8,000 iterations in Pyro. Plots of the loss function suggest that the estimation of the parameters has already converged.

Evaluation metrics

Here, we applied six evaluation metrics to quantify the performance of cell-type proportion estimation in each spot: absolute error, square error, JSD, cosine similarity, Pearson correlation and fraction of cells correctly mapped. For the first three metrics, a lower value indicates better performance, whereas for the remaining three metrics, a higher value signifies improved performance. We let ({p}_{s,t}) denote the estimated proportion of type (t) cells at location (s) and let ({p}_{s,t}) denote the ground truth. The computation methods of each metric for proportion estimation at location (s) are briefly described as follows:

Absolute error: (mathop{sum }nolimits_{t=1}^{T}left|{p}_{s,t}-{p}_{s,t}^{* }right|).

Square error: (mathop{sum }nolimits_{t=1}^{T}{left({p}_{s,t}-{p}_{s,t}^{* }right)}^{2}).

JSD: (sqrt{frac{1}{2}left[Dleft({{boldsymbol{p}}}_{s}parallel {boldsymbol{m}}right)+Dleft({{boldsymbol{p}}}_{s}^{* }parallel {{boldsymbol{m}}}_{s}right)right]}), where (D) is the Kullback–Leibler divergence, ({{boldsymbol{p}}}_{s}=[{p}_{s,1},{p}_{s,2},cdots ,{p}_{s,T}]), ({{boldsymbol{p}}}_{s}^{* }=[{p}_{s,1}^{* },{p}_{s,2}^{* },cdots ,{p}_{s,T}^{* }]), and ({{boldsymbol{m}}}_{s}=) ((,{{boldsymbol{p}}}_{s}{boldsymbol{+}}{{boldsymbol{p}}}_{s}^{* })/2).

Pearson correlation: (frac{mathop{sum }nolimits_{t=1}^{T}(,{p}_{s,t}-a)(,{p}_{s,t}^{* }-a)}{sqrt{mathop{sum }nolimits_{t=1}^{T}{(,{p}_{s,t}-a)}^{2}mathop{sum }nolimits_{t=1}^{T}{(,{p}_{s,t}^{* }-a)}^{2}}}), where (a=frac{1}{T}=) (frac{mathop{sum }nolimits_{t=1}^{T}{p}_{s,t}}{T}=frac{mathop{sum }nolimits_{t=1}^{T}{p}_{s,t}^{* }}{T}).

Cosine similarity: (frac{mathop{sum }nolimits_{t=1}^{T}{p}_{s,t}{p}_{s,t}^{* }}{sqrt{mathop{sum }nolimits_{t=1}^{T}{(,{p}_{s,t})}^{2}mathop{sum }nolimits_{t=1}^{T}{(,{p}_{s,t}^{* })}^{2}}}).

Fraction of cells correctly mapped: (mathop{sum }nolimits_{t=1}^{T}min (,{p}_{s,t},{p}_{s,t}^{* })).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.