Single-cell and spatial transcriptomics

Description

This cluster of papers focuses on the comprehensive integration and analysis of single-cell transcriptomic data, covering topics such as cell types, spatial profiling, lineage tracking, data integration, gene expression, and cell heterogeneity. The research explores various technologies and computational methods to study the transcriptomic landscape at the single-cell level.

Keywords

Single-Cell; Transcriptomics; RNA-Seq; Cell Types; Spatial Profiling; Lineage Tracking; Data Integration; Gene Expression; Cell Heterogeneity; Droplet-based Sequencing

The mammalian cerebral cortex supports cognitive functions such as sensorimotor integration, memory, and social behaviors. Normal brain function relies on a diverse set of differentiated cell types, including neurons, glia, and vasculature. Here, we have used large-scale single-cell RNA sequencing (RNA-seq) to classify cells in the mouse somatosensory cortex and hippocampal CA1 region. We found 47 molecularly distinct subclasses, comprising all known major cell types in the cortex. We identified numerous marker genes, which allowed alignment with known cell types, morphology, and location. We found a layer I interneuron expressing Pax6 and a distinct postmitotic oligodendrocyte subclass marked by Itpr2. Across the diversity of cortical cell types, transcription factors formed a complex, layered regulatory code, suggesting a mechanism for the maintenance of adult cell type identity.
The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers, and therefore relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes flow or mass cytometry data using a Self‐Organizing Map. Using a two‐level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor. © 2015 International Society for Advancement of Cytometry
Simultaneous measurement of more than 30 properties in individual human cells is used to characterize signaling in the immune system.
Multiplexed RNA imaging in single cells. The basis of cellular function is where and when proteins are expressed and in what quantities. Single-molecule fluorescence in situ hybridization (smFISH) experiments quantify the copy number and location of mRNA molecules; however, the number of RNA species that can be simultaneously measured by smFISH has been limited. Using combinatorial labeling with error-robust encoding schemes, Chen et al. simultaneously imaged 100 to 1000 RNA species in a single cell. Such large-scale detection allows regulatory interactions to be analyzed at the transcriptome scale. Science, this issue p. 10.1126/science.aaa6090
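The idea of an error-robust encoding can be illustrated with a minimal sketch. It assumes each RNA species is assigned a binary word (one bit per imaging round) and that codewords are separated by a Hamming distance of at least 4, so a single misread bit can be corrected and a double error can be flagged. The codebook, gene names, and function names below are hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy codebook: 4 RNA species, 6 imaging rounds (1 = signal seen).
CODEBOOK = {
    "geneA": "111000",
    "geneB": "100110",
    "geneC": "010101",
    "geneD": "001011",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_distance(codebook: dict) -> int:
    """Smallest pairwise Hamming distance between any two codewords."""
    return min(hamming(a, b) for a, b in combinations(codebook.values(), 2))


def decode(word: str, codebook: dict = CODEBOOK, max_errors: int = 1):
    """Return the species whose codeword is within max_errors bits of word.

    Returns None when no codeword, or more than one, is that close, so
    ambiguous readouts are discarded instead of being guessed.
    """
    hits = [name for name, code in codebook.items()
            if hamming(word, code) <= max_errors]
    return hits[0] if len(hits) == 1 else None
```

With this toy codebook, `decode("110000")` recovers `geneA` despite one dropped bit, while `decode("110100")` is two bits away from more than one codeword and is rejected.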
Digital PCR enables the absolute quantitation of nucleic acids in a sample. The lack of scalable and practical technologies for digital PCR implementation has hampered the widespread adoption of this inherently powerful technique. Here we describe a high-throughput droplet digital PCR (ddPCR) system that enables processing of ∼2 million PCR reactions using conventional TaqMan assays with a 96-well plate workflow. Three applications demonstrate that the massive partitioning afforded by our ddPCR system provides orders of magnitude more precision and sensitivity than real-time PCR. First, we show the accurate measurement of germline copy number variation. Second, for rare alleles, we show sensitive detection of mutant DNA in a 100 000-fold excess of wildtype background. Third, we demonstrate absolute quantitation of circulating fetal and maternal DNA from cell-free plasma. We anticipate this ddPCR system will allow researchers to explore complex genetic landscapes, discover and validate new disease associations, and define a new era of molecular diagnostics.
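Absolute quantitation in digital PCR is commonly derived from the fraction of positive partitions with a Poisson correction, since a droplet may contain more than one target molecule. The sketch below assumes that standard correction, mean copies per droplet = -ln(1 - positive/total); the function names and the droplet volume argument are illustrative assumptions, not values or an interface from the paper.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        # positive == total would give an infinite estimate (saturation).
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def concentration_per_ul(positive: int, total: int,
                         droplet_volume_nl: float) -> float:
    """Target copies per microlitre of partitioned reaction mix.

    droplet_volume_nl is an assumed, instrument-specific input that the
    caller must supply.
    """
    if droplet_volume_nl <= 0:
        raise ValueError("droplet volume must be positive")
    lam = copies_per_droplet(positive, total)
    return lam / (droplet_volume_nl * 1e-3)  # convert nL to µL
```

When half of the droplets are positive, the estimate is ln 2 (about 0.69) copies per droplet, not 0.5, which is the effect of correcting for droplets that hold several copies.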
Sequencing of RNA from thousands of individual immune cells allows unbiased identification of cellular subtypes.
Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST.
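The two quantities named above can be computed directly from an expression matrix. The following is a minimal descriptive sketch, assuming expression values where zero means "not detected"; it is not the MAST interface (MAST is an R package) and it does not fit the generalized linear model itself. Function names are hypothetical.

```python
def cellular_detection_rate(cell_values):
    """Fraction of genes with non-zero expression in one cell."""
    if not cell_values:
        raise ValueError("cell has no genes")
    return sum(1 for v in cell_values if v > 0) / len(cell_values)


def two_part_summary(gene_values):
    """Summarise one gene across cells in the two parts a hurdle model uses.

    Returns (detection fraction, mean of the non-zero values). The second
    element is None when the gene is never detected.
    """
    if not gene_values:
        raise ValueError("gene has no cells")
    positives = [v for v in gene_values if v > 0]
    fraction = len(positives) / len(gene_values)
    mean_positive = sum(positives) / len(positives) if positives else None
    return fraction, mean_positive
```

In a full two-part model, the detection fraction would be modelled with a logistic regression and the non-zero values with a separate regression, with the cellular detection rate entered as a covariate in both.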
Spatial structure of RNA expression. RNA-seq and similar methods can record gene expression within and among cells. Current methods typically lose positional information and many require arduous single-cell isolation and sequencing. Ståhl et al. have developed a way of measuring the spatial distribution of transcripts by annealing fixed brain or cancer tissue samples directly to bar-coded reverse transcriptase primers, performing reverse transcription followed by sequencing and computational reconstruction, and they can do so for multiple genes. Science, this issue p. 78
Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with Scanpy, we present AnnData, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).
Spatial positions of cells in tissues strongly influence function, yet a high-throughput, genome-wide readout of gene expression with cellular resolution is lacking. We developed Slide-seq, a method for transferring RNA from tissue sections onto a surface covered in DNA-barcoded beads with known positions, allowing the locations of the RNA to be inferred by sequencing. Using Slide-seq, we localized cell types identified by single-cell RNA sequencing datasets within the cerebellum and hippocampus, characterized spatial gene expression patterns in the Purkinje layer of mouse cerebellum, and defined the temporal evolution of cell type-specific responses in a mouse model of traumatic brain injury. These studies highlight how Slide-seq provides a scalable method for obtaining spatially resolved gene expression data at resolutions comparable to the sizes of individual cells.
Single-cell RNA-seq has enabled gene expression to be studied at an unprecedented resolution. The promise of this technology is attracting a growing user base for single-cell analysis methods. As more analysis tools are becoming available, it is becoming increasingly difficult to navigate this landscape and produce an up-to-date workflow to analyse one's data. Here, we detail the steps of a typical single-cell RNA-seq analysis, including pre-processing (quality control, normalization, data correction, feature selection, and dimensionality reduction) and cell- and gene-level downstream analysis. We formulate current best-practice recommendations for these steps based on independent comparison studies. We have integrated these best-practice recommendations into a workflow, which we apply to a public dataset to further illustrate how these steps work in practice. Our documented case study can be found at https://www.github.com/theislab/single-cell-tutorial. This review will serve as a workflow tutorial for new entrants into the field, and help established users update their analysis pipelines.
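Two of the pre-processing steps listed above, quality control and normalization, can be sketched in a few lines. This is an illustration only: it assumes simple per-cell thresholds and a common library-size scaling followed by log(1 + x), which is one widely used option and not necessarily the specific recommendation of the review. All thresholds, defaults, and function names are hypothetical.

```python
import math


def passes_qc(cell_counts, min_genes=2, min_counts=3):
    """Keep a cell only if enough genes and enough molecules were detected.

    The default thresholds are placeholders for the toy example; real
    thresholds are dataset-specific.
    """
    genes_detected = sum(1 for c in cell_counts if c > 0)
    return genes_detected >= min_genes and sum(cell_counts) >= min_counts


def normalize_log1p(cell_counts, target_sum=10_000.0):
    """Scale one cell's counts to a common total, then apply log(1 + x)."""
    total = sum(cell_counts)
    if total == 0:
        raise ValueError("cannot normalize a cell with zero counts")
    return [math.log1p(c * target_sum / total) for c in cell_counts]


def preprocess(matrix):
    """Apply QC filtering, then normalization, to a list of cells."""
    return [normalize_log1p(cell) for cell in matrix if passes_qc(cell)]
```

For example, `preprocess([[1, 1, 2], [0, 0, 1]])` keeps only the first cell, because the second has a single detected gene.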
The mammalian nervous system executes complex behaviors controlled by specialized, precisely positioned, and interacting cell types. Here, we used RNA sequencing of half a million single cells to create a detailed census of cell types in the mouse nervous system. We mapped cell types spatially and derived a hierarchical, data-driven taxonomy. Neurons were the most diverse and were grouped by developmental anatomical units and by the expression of neurotransmitters and neuropeptides. Neuronal diversity was driven by genes encoding cell identity, synaptic connectivity, neurotransmission, and membrane conductance. We discovered seven distinct, regionally restricted astrocyte types that obeyed developmental boundaries and correlated with the spatial distribution of key glutamate and glycine neurotransmitters. In contrast, oligodendrocytes showed a loss of regional identity followed by a secondary diversification. The resource presented here lays a solid foundation for understanding the molecular architecture of the mammalian nervous system and enables genetic manipulation of specific cell types.
Characterizing the transcriptome of individual cells is fundamental to understanding complex biological systems. We describe a droplet-based system that enables 3′ mRNA counting of tens of thousands of single cells per sample. Cell encapsulation, of up to 8 samples at a time, takes place in ∼6 min, with ∼50% cell capture efficiency. To demonstrate the system’s technical performance, we collected transcriptome data from ∼250k single cells across 29 samples. We validated the sensitivity of the system and its ability to detect rare populations using cell lines and synthetic RNAs. We profiled 68k peripheral blood mononuclear cells to demonstrate the system’s ability to characterize large immune populations. Finally, we used sequence variation in the transcriptome data to determine host and donor chimerism at single-cell resolution from bone marrow mononuclear cells isolated from transplant patients.
Single-cell transcriptomics allows researchers to investigate complex communities of heterogeneous cells. It can be applied to stem cells and their descendants in order to chart the progression from multipotent progenitors to fully differentiated cells. While a variety of statistical and computational methods have been proposed for inferring cell lineages, the problem of accurately characterizing multiple branching lineages remains difficult to solve. We introduce Slingshot, a novel method for inferring cell lineages and pseudotimes from single-cell gene expression data. In previously published datasets, Slingshot correctly identifies the biological signal for one to three branching trajectories. Additionally, our simulation study shows that Slingshot infers more accurate pseudotimes than other leading methods. Slingshot is a uniquely robust and flexible tool which combines the highly stable techniques necessary for noisy single-cell data with the ability to identify multiple trajectories. Accurate lineage inference is a critical step in the identification of dynamic temporal gene expression.
Tissues are complex milieus consisting of numerous cell types. Several recent methods have attempted to enumerate cell subsets from transcriptomes. However, the available methods have used limited sources for training and give only a partial portrayal of the full cellular landscape. Here we present xCell, a novel gene signature-based method, and use it to infer 64 immune and stromal cell types. We harmonized 1822 pure human cell type transcriptomes from various sources and employed a curve fitting approach for linear comparison of cell types and introduced a novel spillover compensation technique for separating them. Using extensive in silico analyses and comparison to cytometry immunophenotyping, we show that xCell outperforms other methods. xCell is available at http://xCell.ucsf.edu/.
Unique Molecular Identifiers (UMIs) are random oligonucleotide barcodes that are increasingly used in high-throughput sequencing experiments. Through a UMI, identical copies arising from distinct molecules can be distinguished from those arising through PCR amplification of the same molecule. However, bioinformatic methods to leverage the information from UMIs have yet to be formalized. In particular, sequencing errors in the UMI sequence are often ignored or else resolved in an ad hoc manner. We show that errors in the UMI sequence are common and introduce network-based methods to account for these errors when identifying PCR duplicates. Using these methods, we demonstrate improved quantification accuracy both under simulated conditions and real iCLIP and single-cell RNA-seq data sets. Reproducibility between iCLIP replicates and single-cell RNA-seq clustering are both improved using our proposed network-based method, demonstrating the value of properly accounting for errors in UMIs. These methods are implemented in the open source UMI-tools software package.
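A network-based view of UMI deduplication can be sketched as follows. The sketch assumes all UMIs come from reads mapped to the same position, links any two UMIs that differ by at most one base, and counts connected components as distinct molecules. This is a deliberately simplified variant for illustration; it ignores UMI read counts and is not the UMI-tools implementation. Function names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length UMIs."""
    if len(a) != len(b):
        raise ValueError("UMIs must have equal length")
    return sum(x != y for x, y in zip(a, b))


def count_molecules(umis, max_dist: int = 1) -> int:
    """Estimate the number of original molecules at one mapping position.

    UMIs within max_dist mismatches are linked; each connected component
    of the resulting network is counted as one molecule.
    """
    unique = sorted(set(umis))
    parent = {u: u for u in unique}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for i, a in enumerate(unique):
        for b in unique[i + 1:]:
            if hamming(a, b) <= max_dist:
                parent[find(a)] = find(b)

    return len({find(u) for u in unique})
```

Counting unique UMI sequences would report three molecules for `["ATGC", "ATGC", "ATGA", "TTTT"]`; the network view reports two, treating `ATGA` as a likely sequencing error of `ATGC`.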
Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from "regularized negative binomial regression," where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
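The Pearson residual of a negative binomial model can be written out for a single gene and cell. The sketch below assumes a log link with log10 sequencing depth as the only covariate and a negative binomial variance of mu + mu^2 / theta. The parameters beta0, beta1, and theta are taken as already estimated and regularized; the pooling step across genes is not shown. The function name and signature are hypothetical and are not the sctransform interface.

```python
import math


def nb_pearson_residual(count: float, depth: float,
                        beta0: float, beta1: float, theta: float) -> float:
    """Pearson residual of an observed UMI count under a negative binomial GLM.

    Expected count: mu = exp(beta0 + beta1 * log10(depth))
    Variance:       mu + mu**2 / theta
    """
    if depth <= 0 or theta <= 0:
        raise ValueError("depth and theta must be positive")
    mu = math.exp(beta0 + beta1 * math.log10(depth))
    variance = mu + mu ** 2 / theta
    return (count - mu) / math.sqrt(variance)
```

A residual near zero means the count is what sequencing depth alone would predict; large positive residuals mark cells where the gene is expressed above that technical expectation.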
It is known that the novel coronavirus 2019-nCoV, which is considered similar to SARS-CoV, invades human cells via the receptor angiotensin converting enzyme II (ACE2). Moreover, lung cells that have ACE2 expression may be the main target cells during 2019-nCoV infection. However, some patients also exhibit non-respiratory symptoms, such as kidney failure, implying that 2019-nCoV could also invade other organs. To construct a risk map of different human organs, we analyzed the single-cell RNA sequencing (scRNA-seq) datasets derived from major human physiological systems, including the respiratory, cardiovascular, digestive, and urinary systems. Through scRNA-seq data analyses, we identified the organs at risk, such as lung, heart, esophagus, kidney, bladder, and ileum, and located specific cell types (i.e., type II alveolar cells (AT2), myocardial cells, proximal tubule cells of the kidney, ileum and esophagus epithelial cells, and bladder urothelial cells), which are vulnerable to 2019-nCoV infection. Based on the findings, we constructed a risk map indicating the vulnerability of different organs to 2019-nCoV infection. This study may provide potential clues for further investigation of the pathogenesis and route of 2019-nCoV infection.
The simultaneous measurement of multiple modalities represents an exciting frontier for single-cell genomics and necessitates computational methods that can define cellular states based on multimodal data. Here, we introduce "weighted-nearest neighbor" analysis, an unsupervised framework to learn the relative utility of each data type in each cell, enabling an integrative analysis of multiple modalities. We apply our procedure to a CITE-seq dataset of 211,000 human peripheral blood mononuclear cells (PBMCs) with panels extending to 228 antibodies to construct a multimodal reference atlas of the circulating immune system. Multimodal analysis substantially improves our ability to resolve cell states, allowing us to identify and validate previously unreported lymphoid subpopulations. Moreover, we demonstrate how to leverage this reference to rapidly map new datasets and to interpret immune responses to vaccination and coronavirus disease 2019 (COVID-19). Our approach represents a broadly applicable strategy to analyze single-cell multimodal datasets and to look beyond the transcriptome toward a unified and multimodal definition of cellular identity.
Single-cell atlas efforts have reshaped the way we understand cells across the human body. Despite their power, they have not been effectively used to study ovarian biology in the context of other tissues, nor have they comprehensively incorporated healthy tissues from pre-menopausal donors. A focused pre-menopausal single-cell atlas could both advance our understanding of this life stage and support identification of ovarian gene targets for indications prevalent among younger demographics, such as fertility management. Here, we present an integrated resource of single-cell datasets from pre-menopausal women (PreMeno Atlas), comprising 511,365 cells from 14 tissues, including the ovary. This unified resource enables transcriptomic comparisons across cell types, tissues, and organs within the pre-menopausal context. Our analysis revealed distinct ovarian cell gene signatures, with theca and stromal cells exhibiting the most unique transcriptional profiles among ovarian cell types. We further leveraged the PreMeno Atlas to prioritize ovary-specific genes with potential druggability, particularly G-protein coupled receptors (GPCRs), identifying GPR78, ADRB3, GPR20, and GPR101 as candidate targets. Finally, we assessed species homology of ovarian cell marker genes using a harmonized mouse ovulation and spatial transcriptomics dataset. Collectively, this work establishes the PreMeno Atlas as a resource for ovarian biology research and contraceptive target discovery.
Chevreul is an open-source R Bioconductor package and interactive R Shiny app for processing and visualising single-cell RNA sequencing (scRNA-seq) data. Chevreul differs from other scRNA-seq analysis packages in its ease of use, capacity to analyze full-length RNA sequencing data for exon coverage and transcript isoform inference, and support for batch correction. Chevreul enables exploratory analyses of scRNA-seq data using Bioconductor SingleCellExperiment objects (or converted Seurat objects), including batch integration, quality control filtering, read count normalization and transformation, dimensionality reduction, clustering at a range of resolutions, and cluster marker gene identification. Processed data can be visualized in the R Shiny app. Gene or transcript expression can be visualized using PCA, tSNE, UMAP, heatmaps, or violin plots; differential expression can be evaluated with several statistical tests. Chevreul also provides accessible tools for isoform-level analyses and alternative splicing detection. Chevreul empowers researchers without programming experience to analyze full-length scRNA-seq data. Availability & implementation: Chevreul is implemented in R, and the R package and integrated Shiny application are freely available at https://github.com/cobriniklab/chevreul with constituent packages hosted on Bioconductor at https://bioconductor.org/packages/chevreulProcess, https://bioconductor.org/packages/chevreulPlot, and https://bioconductor.org/packages/chevreulShiny.
Cellular function depends on dynamic interactions and nanoscale spatial organisation of proteins. While transcriptomic and proteomic methods have enabled single-cell profiling, scalable technologies allowing high-resolution analysis of protein interactions at omics-scale are lacking. Here we present the Proximity Network Assay (PNA), a DNA-based method for constructing three-dimensional nanoscale maps of 155 proteins in single cells without the use of optics. PNA employs barcoded antibodies and in situ rolling circle amplification to generate >40,000 spatial nodes per cell, which are linked through proximity-dependent gap-fill ligation and decoded by DNA sequencing, forming single cell Proximity Networks. At an estimated spatial resolution of ~50 nm, PNA captures single-cell protein abundance, self-clustering, and colocalization, validating established cell membrane protein interactions. We illustrate how PNA can be used to gain insights into the molecular mechanisms of cell function through protein interactions in hematological oncology, CAR-T cell therapies, and autoimmune disease.
Ovulation is a spatiotemporally coordinated process that involves several tightly controlled events, including oocyte meiotic maturation, cumulus expansion, follicle wall rupture and repair, and ovarian stroma remodeling. To date, no studies have detailed the precise window of ovulation at single-cell resolution. Here, we performed parallel single-cell RNA-seq and spatial transcriptomics on paired mouse ovaries across an ovulation time course to map the spatiotemporal profile of ovarian cell types. We show that major ovarian cell types exhibit time-dependent transcriptional states enriched for distinct functions and have specific localization profiles within the ovary. We also identified gene markers for ovulation-dependent cell states and validated these using orthogonal methods. Finally, we performed cell–cell interaction analyses to identify ligand-receptor pairs that may drive ovulation, revealing previously unappreciated interactions. Taken together, our data provides a rich and comprehensive resource of murine ovulation that can be mined for discovery by the scientific community.
Cancer immunotherapy is an innovative treatment approach that leverages the immune system to combat tumors, demonstrating significant therapeutic potential. In the past few years, single‐cell RNA sequencing (scRNA‐seq) has made significant progress in the field of cancer immunotherapy, enabling us to understand the complexity of anti‐tumor immune processes with unprecedented depth and precision, thereby facilitating the design of more effective immunotherapy strategies. This review aims to summarize the recent applications of advanced scRNA‐seq technologies in tumor immunology. First, we outline the most representative scRNA‐seq technologies with different technical principles, with a particular focus on single‐cell T cell receptor sequencing. Next, we describe how scRNA‐seq technology is applied to identify the cellular composition and phenotypes within the tumor microenvironment for the construction of immune cell atlas, and uncover key cell types and molecular mechanisms underlying treatment responses for developing novel immunotherapies. Finally, we address the current challenges and future prospects of scRNA‐seq technology in tumor immunology.
This paper describes an end-to-end workflow for highly multiplexed fluorescence imaging with the Cell DIVE platform, allowing simultaneous detection of 40+ markers at single-cell resolution. Combining whole-slide multiplexed imaging with a dedicated analysis pipeline provides a powerful approach to investigate immune cell interactions with stromal and vascular networks within human tissue microenvironments. With a focus on spatial investigation of human immune niches, here we provide a complete framework for tissue preparation, autofluorescence reduction, multiplex panel design and whole-slide image analysis. For complete details on the use and execution of this protocol, please refer to Korsunsky et al. (Med, 2022) [1].
Machine learning methods, especially Transformer architectures, have been widely employed in single-cell omics studies. However, interpretability and accurate representation of out-of-distribution (OOD) cells remain challenging. Inspired by the global workspace theory in cognitive neuroscience, we introduce CellMemory, a bottlenecked Transformer with improved generalizability designed for the hierarchical interpretation of OOD cells. Without pre-training, CellMemory outperforms existing single-cell foundation models and accurately deciphers spatial transcriptomics at high resolution. Leveraging its robust representations, we further elucidate malignant cells and their founder cells across patients, providing reliable characterizations of the cellular changes caused by the disease.
The tumor microenvironment is heterogeneous, structurally complex, and continually evolving, making it difficult to fully capture. Common dissociative techniques thoroughly characterize the heterogeneity of cellular populations but lack structural context. The recent boom in spatial analyses has exponentially accelerated our understanding of the structural complexity of these cellular populations. However, to understand the dynamics of cancer pathogenesis, we must assess this heterogeneity across space and time. In this review, we provide an overview of current dissociative, spatial, and temporal analysis strategies in addition to existing and prospective spatiotemporal techniques to illustrate how understanding the tumor microenvironment, focusing on dynamic immune-cancer cell interactions, across four dimensions will advance cancer research and its diagnostic and therapeutic applications.
Single-cell RNA sequencing (scRNA-seq) has revolutionized kidney disease research by enabling high-resolution transcriptomic analysis at the cellular level. This technology can overcome the limitations of traditional bulk-sequencing; reveal disease-progression trajectories, intercellular communication networks, and cellular heterogeneity; and provide crucial insights into disease mechanisms, thereby facilitating the development of targeted therapies and personalized treatment strategies. We conducted a bibliometric analysis of publications describing the use of scRNA-seq in kidney disease research from 2015 to 2024 using the Web of Science Core Collection (WoSCC) database. Data analysis was performed using the R packages Bibliometrix, VOSviewer, and CiteSpace to systematically evaluate the research landscape and emerging trends. A total of 1,210 publications on scRNA-seq in kidney diseases were identified. China was the largest contributor among the participating countries, demonstrating consistent annual growth in publication numbers. The major research institutions were Harvard Medical School, Sun Yat-sen University, and Shanghai Jiao Tong University. Most articles in this field were published in Frontiers in Immunology. In a list of 8,984 authors, the most productive authors were B. D. Humphreys, Haojia Wu, and Matthias Kretzler. The dominant categories identified in this search were scRNA-seq, disease progression/mechanisms, and gene regulation/expression. Several budding areas of investigation were also noted, including immunotherapy and scRNA-seq innovations, which point to active evolution in the field. This bibliometric analysis revealed the rapid growth and evolving landscape of scRNA-seq applications in kidney disease research and highlighted promising opportunities for understanding disease mechanisms and developing personalized therapeutic strategies.
Recent advances in spatially resolved transcriptomics (SRT) have provided valuable avenues for identifying cell-cell interactions and their critical roles in diseases. We introduce QuadST, a novel statistical method for the robust and powerful identification of cell-cell interactions and their impacted genes in single-cell SRT. QuadST models interactions at different cell-cell distance quantile levels and innovatively contrasts signals to identify interaction-changed genes, which exhibit stronger signals at shorter distances. Unlike other methods, QuadST does not require the specification of interacting cell pairs. It is also robust against unmeasured confounding factors and measurement errors of the data. Simulation studies demonstrate that QuadST effectively controls the type I error, even in misspecified settings, and significantly improves power over existing methods. Applications of QuadST to real datasets have successfully revealed biologically significant interaction-changed genes across various cell types.
Endothelial cells (ECs) are often a minority cell type in a tissue, limiting the utility of bulk sequencing approaches. Single-cell sequencing lacks sensitivity and requires disruptive tissue digestion techniques. TRAP seq (Translating Ribosome Affinity Purification), or 'RiboTag', has been used to overcome these limitations. Multicellular co-culture systems allow primary ECs to differentiate and undergo tubular morphogenesis in cell culture; however, similar limitations exist with these in vitro assays, as ECs are under-represented by as much as a factor of 10 in many assays. We sought to use TRAP seq to better understand the gene expression landscapes that drive these morphogenic events. We found that TRAP seq selectively enriches for endothelial RNA in two distinct co-culture paradigms, the planar and the fibrin bead co-culture assays. Intriguingly, the use of this technology in vitro revealed distinct gene expression changes in blood vessel development and in the mitotic cell cycle, with genes unique to early and late phases of morphogenesis. It is widely accepted that the NOTCH signaling pathway is a regulator of angiogenesis in vivo. Correspondingly, we found that a large number of NOTCH-related genes relevant to endothelial cell biology changed across this morphogenic process, underpinning the importance and utility of this technology in 3D multicellular cultures that model in vivo environments.
Drug-induced acute kidney injury (AKI) affects about 20% of hospitalized AKI patients and is a significant contributor to morbidity and mortality. The lack of understanding of the kidney system and of how nephrotoxic drugs act contributes to hospital-acquired AKI cases. AKI is difficult to predict because of its complex injury mechanism and the numerous pathways through which it manifests. Traditional toxicity biomarkers, like elevated creatinine levels, detect AKI only after significant kidney injury has occurred. Concurrently, advancements in single-cell RNA sequencing (scRNAseq) have improved our ability to map cellular heterogeneity within tissues, potentially enabling the study of drug effects at a single-cell level. We hypothesized that only particular subtypes of kidney cells may be responsible for observed nephrotoxicity and explain prediction challenges. To test this, we generated cellular response scores for 32 kidney cell types from the Human Cell Atlas and estimated drug effects. We identified significant expression differences in 6 cell types (e.g., indistinct intercalated cell, p = 0.009; epithelial progenitor cell, p = 0.04). We also developed an ensemble model that achieved an AUROC of 0.6 across different kidney cell populations, a significant improvement over using traditional bulk RNA sequencing alone. The single-cell transcriptomic signatures we identified potentially reveal unexplained molecular mechanisms of nephrotoxicity. Author Summary The prediction and early detection of drug-induced kidney injury is a significant clinical challenge since physicians rely on biomarkers that only become elevated after substantial kidney damage has occurred, limiting opportunities for intervention and patient protection.
Our investigation utilized single-cell data and available drug toxicity information to examine how individual kidney cell populations respond to potentially harmful medications. We hypothesized that specific kidney cell subtypes are primarily responsible for observed drug toxicity, which may explain the difficulties in predicting drug-induced kidney injury. Through comprehensive analysis of 32 distinct kidney cell types, we identified six specific cellular populations that demonstrate differential responses to nephrotoxic compounds. We subsequently developed models that demonstrate superior predictive performance compared to analytical approaches using bulk RNA sequencing data. Our methodology represents a substantial advancement in precision medicine approaches to drug safety. These findings have important implications for clinical practice and patient safety. The cellular signatures we identified may enable earlier detection of kidney injury risk, potentially allowing clinicians to modify treatment regimens before irreversible damage occurs. Our work establishes a foundation for improved drug safety protocols and may contribute to reducing medication-related kidney injury in hospitalized patients.
Single-cell metagenomic sequencing (scMetaG) can provide maximum-resolution insights into complex microbial communities. However, existing bioinformatic tools are not equipped to handle the massive amounts of data generated by novel high-throughput scMetaG methods. Here, we present a bioinformatic toolkit for complete, end-to-end scMetaG analysis: (i) Bascet, a command-line suite designed to scale to massive scMetaG datasets (≥1 million cells); (ii) Zorn, an R package/workflow manager that enables reproducible scMetaG data analysis, exploration, and visualization (http://zorn.henlab.org/). Enabled by recent advances in droplet microfluidics, we use Bascet and Zorn to develop and optimize a high-throughput scMetaG method on a ten-species mock community. To showcase their utility on a real-world sample, we use Bascet and Zorn to characterize a human saliva sample, generating single-amplified genomes (SAGs) from >10k prokaryotic cells. Overall, Bascet and Zorn enable reproducible scMetaG analysis, allowing users to query microbiomes at unprecedented resolution and scale.
Human diseases are characterized by intricate cellular dynamics. Single-cell transcriptomics provides critical insights, yet a persistent gap remains in computational tools for detailed disease progression analysis and targeted in silico drug interventions. Here we introduce UNAGI, a deep generative neural network tailored to analyse time-series single-cell transcriptomic data. This tool captures the complex cellular dynamics underlying disease progression, enhancing drug perturbation modelling and screening. When applied to a dataset from patients with idiopathic pulmonary fibrosis, UNAGI learns disease-informed cell embeddings that sharpen our understanding of disease progression, leading to the identification of potential therapeutic drug candidates. Validation using proteomics reveals the accuracy of UNAGI’s cellular dynamics analysis, and the use of the fibrotic cocktail-treated human precision-cut lung slices confirms UNAGI’s predictions that nifedipine, an antihypertensive drug, may have anti-fibrotic effects on human tissues. UNAGI’s versatility extends to other diseases, including COVID, demonstrating adaptability and confirming its broader applicability in decoding complex cellular dynamics beyond idiopathic pulmonary fibrosis, amplifying its use in the quest for therapeutic solutions across diverse pathological landscapes.
Background Spatial transcriptomics technologies are revolutionizing our understanding of intra-tumor heterogeneity and the tumor microenvironment by revealing single-cell molecular profiles within their spatial tissue context. The rapid development of spatial transcriptomics methods, each with unique characteristics, makes it challenging to select the most suitable technology for specific research objectives. Here, we compare four imaging-based approaches (RNAscope HiPlex, Molecular Cartography, Merscope, and Xenium) alongside Visium, a sequencing-based method. These technologies were employed to study cryosections of medulloblastoma with extensive nodularity (MBEN), a tumor chosen for its distinct microanatomical features. Results Our analysis reveals that automated imaging-based spatial transcriptomics methods are well-suited to delineate the intricate MBEN microanatomy and capture cell-type-specific transcriptome profiles. We devise approaches to compare the sensitivity and specificity of different methods, along with their unique attributes, to guide method selection based on the research objective. Furthermore, we demonstrate how reimaging slides after the spatial transcriptomics analysis can significantly improve cell segmentation accuracy and integrate additional transcript and protein readouts, expanding the analytical possibilities and depth of insight. Conclusions This study underscores important distinctions between spatial transcriptomics technologies and offers a framework for evaluating their performance.
Our findings support informed decisions regarding methods and outline strategies to improve the resolution and scope of spatial transcriptomic analyses, ultimately advancing spatial transcriptomics applications in solid tumor research.
Single-cell transcriptomics is a high-throughput technology capable of analyzing gene expression at the individual cell level. Spatial transcriptomics, on the other hand, is a technique that simultaneously captures both gene expression profiles and the spatial location information of cells. While single-cell transcriptomics enables sequencing of the transcriptome at single-cell resolution, spatial transcriptomics provides the added dimension of spatial context alongside gene expression data. These two approaches are complementary, and their integration can facilitate a more comprehensive and in-depth investigation of biological questions.
Background Post-acute sequelae of SARS-CoV-2 infection (PASC) affects millions globally, yet the molecular mechanisms underlying acute COVID-19 and its chronic sequelae remain poorly understood. Methods We performed an integrative transcriptomic analysis of three independent RNA-seq datasets, capturing the complete COVID-19 pathophysiology from health through acute severe infection to post-acute sequelae and mortality (n=142 total samples). We implemented a containerized analytical pipeline, spanning data download, quantification, and differential gene expression analysis, to uniformly process these three RNA-seq datasets. Results Our analysis reveals striking molecular dichotomies contrasting disease phases with profound clinical implications. Acute severe/critical COVID-19 reveals predominant enrichment of TNF-α signaling via NF-κB pathways (normalized enrichment score >2.5, FDR <0.001), reflecting a cytokine storm pathophysiology characterized by rapid inflammatory developments involving IL-6, TNF-α, and anti-apoptotic responses. In contrast, PASC patients exhibit dominant enrichment of Myc Targets V1 and Oxidative Phosphorylation pathways (NES >2.2, FDR <0.005), indicating important shifts toward cellular adaptation. Pathway signature analysis identifies core differentially expressed genes that reliably distinguish disease phases, thereby offering objective biomarkers for precision diagnosis and monitoring. Conclusions These findings establish a comprehensive molecular framework distinguishing acute inflammatory from chronic metabolic COVID-19 phases, with potential clinical applicability.
TNF-α/NF-κB pathway signatures identify patients at risk for severe disease progression, while Myc/OXPHOS signatures allow objective PASC diagnosis, addressing current reliance on subjective and eliminative diagnosis. This integrative analytical framework has utility beyond COVID-19, offering an applicable approach for precision medicine implementation across other disease processes. Clinical Significance This study transforms COVID-19 from a symptom-based to a molecularly defined disease spectrum, enabling precision diagnosis, prognostic monitoring, classification, and targeted therapeutic possibilities based on pathway-specific biomarkers rather than subjective clinical assessments.
With the rapid advancement of spatial multi-omics technologies, the simultaneous analysis of molecular profiles and spatial locations has provided unprecedented insights into cellular heterogeneity and tissue microenvironments. However, data sparsity and the diversity of data distributions hinder the effective integration and analysis of spatial multi-omics data. In this study, we propose a novel ensemble learning framework based on dual-graph regularized anchor concept factorization, named SMODEL, for detecting spatial domains from spatial multi-omics data. SMODEL employs an element-wise weighted ensemble strategy to integrate multiple base clustering results, and leverages anchor concept factorization and dual-graph regularization to learn robust spatial consensus representations. We evaluated the performance of SMODEL on both real and simulated spatial multi-omics datasets, encompassing various technologies, tissue types, and species. Experimental results demonstrate that SMODEL not only outperforms existing methods in spatial domain identification but also effectively captures tissue structure, thereby enhancing the understanding of cellular heterogeneity.
The rapid advancement of spatial transcriptomics has provided a critical data foundation for the high‐resolution characterization of tissue spatial domains. Traditional methods for spatial domain identification primarily rely on gene expression data from sampled spots in low‐resolution spatial transcriptomic data, often overlooking valuable information between spots that can be crucial for domain identification. Furthermore, these methods are limited by their focus on gene expression data from neighboring spots, without fully integrating prior knowledge of cell types within the tissue's spatial structure. To address these challenges, SGCD, a novel method for tissue spatial domain identification based on data interpolation and cell type deconvolution, is proposed. SGCD utilizes interpolation techniques to estimate gene expression data for cells in the gaps between spots and applies deconvolution to extract cell type information from both spots and interstitial regions. By integrating gene expression, cell type, and spatial location data, SGCD achieves accurate delineation of complex spatial domains through graph contrastive learning. Evaluations on various publicly available datasets, including the human dorsolateral prefrontal cortex, mouse brain, pancreatic ductal adenocarcinoma, and breast cancer, demonstrate that SGCD significantly outperforms existing methods in both accuracy and detail, offering strong support for advancing the understanding of tissue functions and disease mechanisms.
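The idea of estimating expression in the gaps between spots can be illustrated with a small sketch. The abstract does not say which interpolation technique SGCD uses, so the inverse-distance weighting below is an assumption chosen for simplicity, and the function name `idw_interpolate` and its parameters `k` and `power` are hypothetical.

```python
import math


def idw_interpolate(spot_xy, spot_expr, query_xy, k=4, power=2.0):
    """Estimate expression at query_xy from the k nearest measured spots.

    spot_xy   : list of (x, y) spot coordinates
    spot_expr : list of per-spot expression vectors (one list of floats per spot)
    query_xy  : (x, y) location in the gap between spots
    Inverse-distance weighting is an illustrative assumption, not SGCD's method.
    """
    # Distance from the query location to every spot; keep the k closest.
    nearest = sorted(
        (math.dist(query_xy, xy), i) for i, xy in enumerate(spot_xy)
    )[:k]

    # A query that lands exactly on a spot returns that spot's profile.
    if nearest[0][0] == 0.0:
        return list(spot_expr[nearest[0][1]])

    # Closer spots get larger weights: w = 1 / d**power.
    weights = [(1.0 / d ** power, i) for d, i in nearest]
    total = sum(w for w, _ in weights)
    n_genes = len(spot_expr[0])
    return [
        sum(w * spot_expr[i][g] for w, i in weights) / total
        for g in range(n_genes)
    ]
```

A location midway between two spots receives the average of their profiles, and the estimate moves toward a spot's own profile as the query approaches it.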
Recent advancements in proteomics sequencing have significantly enhanced our ability to explore cell‐type‐specific signatures within complex tissues, providing critical insights into disease mechanisms. However, current proteomic technologies often suffer from low resolution, resulting in the mixing of multiple cell types during profiling. To address this limitation, cell‐type deconvolution methods have been developed to infer cellular composition from proteomic data. While most existing deconvolution methods are focused on transcriptomics, their application to proteomics is hindered by the weak correlation and divergent quantification between transcriptome and proteome data. Although a few proteomic‐specific deconvolution methods have recently emerged, they still exhibit limited capability and performance, partly because they only extract shared information from individual samples while ignoring higher‐order relationships between them. Here, GraphDEC is proposed, a novel graph neural network‐based method for deciphering cell type proportions in proteomic profiling data. GraphDEC begins by simulating bulk samples from single‐cell proteomic data to create reference data, which is then used to infer cell types in target datasets. Specifically, GraphDEC employs an autoencoder to extract low‐dimensional representations from both reference and target proteomic data, enabling the construction of similarity relationships among samples. These relationships, combined with proteomic data, are processed by a graph neural network that integrates a multi‐channel mechanism and a hybrid neighborhood‐aware approach to learn highly effective representations.
To optimize the model, GraphDEC utilizes multiple loss functions, including triplet loss, domain adaptation loss, and Mean Squared Error (MSE) loss, ensuring robust performance and mitigating batch effects. Benchmark experiments demonstrate that GraphDEC achieves state‐of‐the‐art performance across diverse synthetic proteomic datasets from different sequencing technologies and real‐world spatial proteomic datasets. Furthermore, GraphDEC exhibits strong generalization capabilities, showing high efficiency when applied to cross‐species proteomic data and even transcriptomics.
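The first step described above, simulating bulk samples from single-cell data to obtain reference samples with known composition, can be sketched as follows. This is a generic pseudo-bulk construction under stated assumptions (flat Dirichlet proportions, sampling with replacement, averaging of profiles), not GraphDEC's actual procedure; the function name `simulate_pseudobulk` and its parameters are hypothetical.

```python
import random


def simulate_pseudobulk(cells, labels, n_cells=100, seed=0):
    """Mix single-cell profiles into one pseudo-bulk sample.

    cells  : list of per-cell abundance vectors (lists of floats)
    labels : list of cell-type names, one per cell
    Returns (bulk_profile, proportions), where proportions maps each
    cell type to the fraction of sampled cells of that type.
    """
    rng = random.Random(seed)
    types = sorted(set(labels))
    by_type = {t: [c for c, l in zip(cells, labels) if l == t] for t in types}

    # Assumed design: target proportions from a flat Dirichlet,
    # drawn as normalized Gamma(1, 1) variates.
    weights = [rng.gammavariate(1.0, 1.0) for _ in types]
    total = sum(weights)
    target = [w / total for w in weights]

    # Sample cells with replacement according to the target proportions
    # and accumulate their profiles.
    counts = {t: 0 for t in types}
    bulk = [0.0] * len(cells[0])
    for t in rng.choices(types, weights=target, k=n_cells):
        profile = rng.choice(by_type[t])
        counts[t] += 1
        for j, value in enumerate(profile):
            bulk[j] += value

    # Average so the bulk profile is on the same scale as a single cell.
    bulk = [v / n_cells for v in bulk]
    proportions = {t: counts[t] / n_cells for t in types}
    return bulk, proportions
```

Calling the function repeatedly with different seeds yields (profile, proportion) pairs that could serve as labeled training data for a deconvolution model; the realized proportions are returned, rather than the Dirichlet targets, so that the labels match the cells actually mixed.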
Single-cell RNA sequencing has revolutionized cellular heterogeneity research, but analyzing the abundance of unannotated public datasets remains challenging. We present scExtract, a framework leveraging large language models to automate scRNA-seq data analysis from preprocessing to annotation and integration. scExtract extracts information from research articles to guide data processing, outperforming existing reference transfer methods in benchmarks. We introduce scanorama-prior and cellhint-prior, which incorporate prior annotation information for improved batch correction while preserving biological diversity. We demonstrate scExtract’s utility by integrating 14 datasets to create a comprehensive human skin atlas of 440,000 cells.
Flow cytometry is a powerful and widely used tool for the analysis of various cell populations, but its capabilities are severely limited by the need to correct for overlapping fluorescent signals from fluorochromes with near or similar emission spectra when analyzing multicolor panels. Spectral flow cytometry extends the capabilities of classical cytometry by reading the full fluorescence spectrum of each fluorophore and then separating the spectra. This significantly increases the number of markers analyzed in a single panel and thus allows for more in-depth studies of cell populations. In the age of big data analysis, this represents a serious advantage of spectral cytometry and can significantly increase its use in scientific and clinical practice. This review describes the principle of spectral cytometry, the advantages and limitations of the method, and summarizes the newest deep immunophenotyping panels developed and validated for spectral cytometry.
Diabetes mellitus (DM) is among the most prevalent metabolic diseases worldwide, associated with an increased risk of mortality. Although numerous studies have been conducted to uncover the cellular and molecular pathways associated with DM pathogenesis, reaching new diagnosis and treatment goals for DM requires further research. The progress in gene sequencing technologies, particularly in single‐cell RNA sequencing (scRNA‐seq), has yielded additional insights into the molecular pathways involved in the development and progression of DM. This review summarizes the latest advances and applications of RNA‐seq technologies in diabetes research, such as the characterization of single human islet and immune cells in DM, and the applications of scRNA‐seq in the treatment and early diagnosis of diabetes complications.
Flow cytometry use has increased significantly in clinical laboratories and has helped improve the diagnosis of leukemias and lymphomas and the follow-up of minimal residual disease. Mastering this technique enables the performance of multiparametric single-cell analysis and increases the odds of identifying abnormal populations. As in many fields, there is a need to improve the quality of the data generated for accuracy, reproducibility, and trueness. The implementation of solutions reducing variability is achievable and needed, as the flow cytometry workflow involves many manual steps and items susceptible to operator bias and human error. Standardization of flow cytometry assays is sought and already implemented in many clinical hematology laboratories. However, the clinical community would highly benefit from further efforts in that direction to increase the comparability of findings across laboratories. This review covers the strengths and weaknesses of flow cytometry and focuses on the standardization approaches developed, including recent advances in the field.
Spatial transcriptomics (ST) is emerging as a powerful technology that transforms our understanding of thyroid cancer by offering a spatial context of gene expression within the tumor tissue. In this review, we synthesize the recent applications of ST in thyroid cancer research, with a particular focus on the heterogeneity of the tumor microenvironment, tumor evolution, and cellular interactions. Studies have leveraged the spatial information provided by ST to map distinct cell types and expression patterns of genes and pathways across the different regions of thyroid cancer samples. The spatial context also allows a closer examination of invasion and metastasis, especially through the dysregulation at the tumor leading edge. Additionally, signaling pathways are inferred at a more accurate level through the spatial proximity of ligands and receptors. We also discuss the limitations that need to be overcome, including technical limitations like low resolution and sequencing depth, the need for high-quality samples, and complex data handling processes, and suggest future directions for a wider and more efficient application of ST in advancing personalized treatment of thyroid cancer.
Oral squamous cell carcinoma is among the most prevalent tumours of the oral and maxillofacial region. The initial symptoms are typically minor and may remain misdiagnosed until the disease advances, resulting in a significantly reduced five-year survival rate for patients. Early detection is critical, as it can improve five-year survival rates from below 50% to 70–90%. Because of their limited sensitivity and invasive nature, conventional screening methods such as serological testing and histopathological biopsies have limitations in their application. In contrast, emerging technologies including single-cell sequencing, spatial transcriptomics, nanopore sequencing, biosensor technology, and artificial intelligence, among other advanced detection methods, are redefining biomarker discovery. Scalability obstacles still exist, including clinical validation gaps, high implementation costs, and analytical complexity. To close the gap between invention and equitable implementation, future efforts should focus on multicenter validation of potential biomarkers and cost-effective integration of these technologies. This will ultimately improve patient prognosis and quality of life. This work aims to comprehensively investigate and evaluate the prospective applications and future developmental potential of these technologies while offering an extensive examination of oral squamous cell cancer biomarker research.
Spatial transcriptomics enables spatially resolved gene expression analysis, but accompanying histology images are often degraded by fiducial markers and background regions, hindering interpretation. To address this, we introduce Vispro, an end-to-end automated image processing tool optimized for 10× Visium data. Vispro includes modules for fiducial marker detection, image restoration, tissue region detection, and segmentation of disconnected tissue areas. By enhancing image quality, Vispro improves the accuracy and performance of downstream analyses, including tissue and cell segmentation, image registration, gene expression imputation guided by histological context, and spatial domain detection.
Human brain development is characterized by a complex cellular and molecular landscape, which varies both within and between individuals. Here we explore the transformative impact of single-cell omics technologies on our understanding of human neurodiversity and neurocomplexity. We trace historical progressions of cellular and molecular biology, highlighting the cell as a pivotal “place holder” for biological inquiry, as a basis to better understand the current revolution of single-cell profiling enabling the study of individual genomes and environmental interactions at unprecedented resolution. Starting from the challenges of defining cell types and states within neurodevelopment, we emphasize the significance of moving beyond categorical distinctions to understand the molecular basis of inter-individual neurodiversity, including genetics, environment, and developmental stochasticity. We introduce the concept of “in vitro epidemiology”, leveraging brain organoids and multiplexing approaches to model population-scale cohorts in vitro and thus enabling the dissection of gene-environment interactions at single-cell resolution. We further discuss technical advancements and computational methodologies that are driving the field forward, including the efforts to create comprehensive cell atlases of the human brain and the emerging challenges in data integration and analysis. Finally, we anticipate future perspectives for single-cell studies and brain organoids in advancing our understanding of neurobiology, and cell-based strategies for drug discovery and personalized treatment.
Introduction Skin cutaneous melanoma (SKCM) is a highly aggressive form of cancer with poor prognosis, characterized by significant molecular and immune heterogeneity. The activation of KRAS signaling pathways is implicated in melanoma progression, yet its role in shaping the tumor microenvironment, particularly in macrophage infiltration, remains poorly understood. Methods A comprehensive multi-platform approach was employed, analyzing gene expression data from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. Gene set enrichment analysis (GSEA) was utilized to characterize the molecular pathways associated with KRAS signaling. Single-cell RNA sequencing (scRNA-seq) was leveraged to investigate the cellular heterogeneity within the SKCM tumor microenvironment, and macrophage populations were categorized using the Monocle2 algorithm. A KRAS-Macrophage Prognostic Associated Gene (KMPAG) signature was developed by integrating these findings, followed by validation using a least absolute shrinkage and selection operator (LASSO) regression model. The prognostic value of the KMPAG signature was assessed through its correlation with clinical outcomes, immune cell infiltration patterns, response to therapy, drug sensitivity, and miRNA-gene regulatory interactions. Cell-cell communication within the SKCM microenvironment was explored using the “CellChat” tool. Experimental validation of gene expression was performed via immunohistochemistry (IHC) and functional assays in gene-modified melanoma cell lines. Results Twenty-two genes involved in KRAS signaling were identified as critical for patient survival.
Single-cell analysis revealed nine distinct cell populations within the SKCM microenvironment, leading to the construction of the KMPAG risk model, which incorporated three key genes—CLEC4A, CXCL10, and LAT2. This signature effectively reclassified macrophage subsets, offering improved diagnostic and prognostic capabilities. Furthermore, the KMPAG signature correlated with a range of clinical parameters, including immune infiltration levels, tumor stage, and therapy response. The model also provided insights into the immune landscape of SKCM, facilitating the prediction of responses to immunotherapy. Functional assays demonstrated that downregulation of CLEC4A significantly promoted melanoma cell proliferation, migration, and invasion. Conclusion This study highlights the importance of KRAS signaling and macrophage infiltration in melanoma prognosis. The KMPAG gene signature presents a novel prognostic tool, offering insights into personalized treatment strategies and predictive biomarkers for immunotherapy in SKCM. Further exploration of CLEC4A’s role in melanoma progression may provide new therapeutic avenues for targeted intervention.
Microbead (MB) aggregation-based immunoassays, which are independent of multistep bead washing, signal labeling, and even reporter eluting procedures, have emerged as a promising label-free route for protein biomarker analysis. However, their wide application is not only hindered by the low target-actuated aggregation efficiency that results from the steric hindrance and high weight of micrometer-sized beads but also suffers from the lack of a precise method to measure the uncontrollable aggregation process/state. Herein, a new mechanism of metastable DNA hybridization-accelerated programmable immuno-aggregation of MBs is proposed, which enables the facile mix-and-read and flow cytometric detection of protein biomarkers. The introduction of auxiliary metastable DNA hybridization and magnetic facilitation can remarkably boost the immunoreaction efficiency between two kinds of MBs, achieving an ∼200-fold increase in detection sensitivity. Moreover, because flow cytometry can precisely interrogate the light scattering and fluorescence information of individual events one by one, MB aggregates can be clearly discriminated from MB monomers and precisely quantified. Additionally, MBs coencoded by fluorescent color and intensity can be obtained simply by adjusting the amounts of fluorescent probes, enabling the multiplexed analysis of protein targets. With these advantages, the proposed method was successfully applied to mix-and-read protein detection, showing great potential in diverse biomedical applications.
Recent advancements in spatial transcriptomics technology have generated substantial volumes of spatial transcriptome data. However, the quality of this data is often compromised due to the limitations of current sequencing technologies. To address this issue, DiffusionST is proposed as a method for imputing spatial transcriptomics data and clustering the imputed data. The method employs a graph convolutional network (GCN) model combined with a newly designed loss function, denoising data using the zero-inflated negative binomial (ZINB) distribution, and data enhancement through a diffusion model to improve clustering accuracy. DiffusionST demonstrates superior clustering accuracy compared to six of the most popular spatial transcriptomics clustering algorithms. DiffusionST also excels in data imputation when compared to five single-cell RNA sequencing (scRNA-seq) imputation algorithms. Additionally, DiffusionST's robustness against noise is quantitatively validated by manually introducing random dropout noise into the dataset, where our model significantly enhances the quality of spatial transcriptomic data. Moreover, DiffusionST is well-suited for high-resolution spatial transcriptomics data and has been demonstrated, through survival analysis and cell-cell communication studies, to dissect spatial domains within breast cancer tissues. These findings provide strong evidence of DiffusionST's efficacy in handling spatial transcriptomic data, especially under strong noise, making it a valuable tool in this field.
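The abstract above names the zero-inflated negative binomial (ZINB) distribution as the denoising model but does not give its formulation. The following is a minimal sketch of the textbook ZINB log-probability in the mean/dispersion parameterization, not DiffusionST's actual loss; the function names and the parameter names `mu`, `theta`, and `pi` are assumptions chosen for illustration.

```python
import math


def nb_log_pmf(k, mu, theta):
    """Log-probability of count k under a negative binomial with
    mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(k + theta)
        - math.lgamma(theta)
        - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(k, mu, theta, pi):
    """Log-probability of count k under a ZINB model, where pi in [0, 1)
    is the probability of an extra (dropout) zero."""
    nb = nb_log_pmf(k, mu, theta)
    if k == 0:
        # A zero is either a dropout or a genuine negative binomial zero.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


def zinb_negative_log_likelihood(counts, mu, theta, pi):
    """Sum of negative log-probabilities over a list of observed counts."""
    return -sum(zinb_log_pmf(k, mu, theta, pi) for k in counts)


# Hypothetical counts for a single gene across a handful of spots.
example_counts = [0, 0, 3, 1, 0, 7]
example_nll = zinb_negative_log_likelihood(example_counts, mu=2.0, theta=1.5, pi=0.3)
```

In a denoising network, `mu`, `theta`, and `pi` would typically be predicted per gene and per spot, and this negative log-likelihood would be minimized; here they are fixed scalars only to keep the sketch self-contained.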
Abstract DNA binding assays, expression analyses, and binding site mutagenesis revealed that the Drosophila CrebA transcription factor (TF) boosts secretory capacity in the embryonic salivary gland (SG) through direct regulation of secretory pathway component genes (SPCGs). The mammalian orthologues of CrebA, the Creb3L-family of leucine zipper TFs, not only activate SPCG expression in a variety of mammalian tissues but can also activate SPCG expression in Drosophila embryos, suggesting a highly conserved role for this family of proteins in boosting secretory capacity. However, in vivo assays reveal that CrebA binds far more genes than it regulates, and it remains unclear what distinguishes functional binding. It is also unclear if CrebA is the major factor driving SPCG expression in all Drosophila embryonic tissues and/or if CrebA also regulates other tissue-specific functions. Thus, we performed single-cell RNA sequencing (scRNA-seq) of wild-type (WT) and CrebA null embryos to explore the relationship between CrebA binding and gene regulation. We find that CrebA binds the proximal promoters of its targets, that SPCGs are the major class of genes regulated by CrebA across tissues, and that CrebA is sufficient to activate SPCG expression even in cells that do not normally express the protein. A comparison of scRNA-seq to other methods for capturing regulated transcripts reveals that the different methodologies identify overlapping but distinct sets of CrebA targets.
Pulmonary endothelial cells (PECs) are indispensable for sustaining lung microenvironmental homeostasis and exert significant influence across a spectrum of pulmonary pathologies. Single-cell RNA sequencing (scRNA-seq) has fundamentally transformed conventional paradigms surrounding PECs, unveiling novel perspectives on their roles in both physiological and pathological lung conditions. This technology provides critical insights into the phenotypic diversity and distinct molecular signatures of PECs, underscoring their substantial heterogeneity in structure, function and gene expression, which is contingent upon their spatial localization within the lung microenvironment. The advancements in scRNA-seq have catalyzed remarkable progress in the therapeutic management of pulmonary pathophysiology, facilitating breakthroughs in the identification of cellular subpopulations, functional characterization and discovery of innovative therapeutic targets. In this review, we systematically synthesize the markers and subclusters of PECs as delineated by scRNA-seq, elucidate their applications in normal and pathological lung contexts, and propose future directions regarding molecular mechanisms and therapeutic interventions targeting PECs.
Abstract Background Gastric cancer (GC) presents challenges in predicting treatment responses due to its patient-specific heterogeneity. Recently, liquid biopsies have emerged as a valuable data modality, offering essential cellular and molecular insights while facilitating the capture of time-sensitive information. This study aimed to leverage artificial intelligence (AI) technology to analyze longitudinal liquid biopsy data. Methods We collected a dataset from longitudinal liquid biopsies of 91 patients at Peking Cancer Hospital, spanning from July 2019 to April 2022. This dataset included 1895 tumor-related cellular images and 1698 tumor marker indices. Subsequently, we introduced the Dynamic-Aware Model (DAM) to predict responses to GC treatment. DAM incorporates dynamic data through AI-engineered components, facilitating an in-depth longitudinal analysis. Results Utilizing threefold cross-validation, DAM exhibited superior performance compared to traditional cell-counting methods, achieving an AUC of 0.807 in predicting GC treatment responses. In the test set, DAM maintained stable efficacy with an AUC of 0.802. In addition, DAM showed the capability to accurately predict treatment responses based on early treatment data. Moreover, DAM’s visual analysis of attention mechanisms identified six dynamic visual features related to focus areas, which were strongly associated with treatment response. Conclusions These findings represent a pioneering effort in applying AI technology to interpret longitudinal liquid biopsy data and employ visual analytics in GC. This approach provides a promising pathway toward precise response prediction and personalized treatment strategies for patients with GC.
Abstract The tumor microenvironment (TME) is a critical focus for biomarker discovery and therapeutic targeting in cancer. However, widespread clinical adoption of TME profiling is hindered by the high cost and technical complexity of current platforms such as spatial transcriptomics and proteomics. Artificial Intelligence (AI)-based analysis of the TME from routine Hematoxylin & Eosin (H&E)-stained pathology slides presents a promising alternative. Yet, most existing deep learning approaches depend on extensive high-quality single-cell or patch-level annotations, which are labor-intensive and costly to generate. To address these limitations, we previously introduced HistoTME, a weakly supervised deep learning framework that predicts the activity of cell type-specific transcriptomic signatures directly from whole slide H&E images of non-small cell lung cancer. This enables rapid, high-throughput analysis of the TME composition from whole slide H&E images (WSI) without the need for segmenting and classifying individual cells. In this work, we present HistoTME-v2, a pan-cancer extension of HistoTME, applied across 25 solid tumor types, substantially broadening the scope of prior efforts. HistoTME-v2 demonstrates high accuracy for predicting cell type-specific transcriptomic signature activity from H&E images, achieving a median Pearson correlation of 0.61 with ground truth measurements in internal cross-validation on The Cancer Genome Atlas (TCGA), encompassing 7,586 WSIs, 6,901 patients, and 24 cancer types, and a median Pearson correlation of 0.53 on external validation datasets spanning 5,657 WSIs, 1,775 patients and 9 cancer types. 
Furthermore, HistoTME-v2 resolves the spatial distribution of key immune and stromal cell types, exhibiting strong spatial concordance with single-cell measurements derived from multiplex imaging (CODEX, IHC) as well as Visium spatial transcriptomics, spanning 259 WSI, 154 patients, and 7 cancer types. Overall, across both bulk and spatial settings, HistoTME-v2 significantly outperforms baselines, positioning it as a robust, interpretable and cost-efficient tool for TME profiling and advancing the integration of spatial biology into routine pathology workflows.
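The headline metric in the HistoTME-v2 abstract is a median Pearson correlation between predicted and measured signature activity. As a rough illustration of how such a summary can be computed, the sketch below correlates predictions with measurements per signature and takes the median across signatures. The signature names and values are hypothetical, and the authors' exact evaluation protocol (for example, how folds or cancer types are pooled) is not stated in the abstract.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def median_signature_correlation(predicted, measured):
    """Median, across signatures, of the per-signature Pearson correlation.

    Both arguments map a signature name to a list of per-sample values,
    with samples in the same order in both dictionaries.
    """
    correlations = [pearson(predicted[name], measured[name]) for name in predicted]
    return statistics.median(correlations)


# Hypothetical signature activities for three samples (illustration only).
predicted = {"sig_a": [1.0, 2.0, 3.0], "sig_b": [1.0, 2.0, 3.0], "sig_c": [1.0, 2.0, 3.0]}
measured = {"sig_a": [2.0, 4.0, 6.0], "sig_b": [3.0, 2.0, 1.0], "sig_c": [1.0, 3.0, 2.0]}
median_r = median_signature_correlation(predicted, measured)
```

Reporting the median rather than the mean keeps a few poorly predicted signatures from dominating the summary.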
ABSTRACT Among various cellular characteristics, flow cytometry can evaluate antigen expression through qualitative or quantitative approaches. For relative quantification, fluorescence intensity (FI) values are converted into quantitative measurements using appropriate reference materials. To quantitatively estimate antigen density or define ligand‐binding sites per cell, antibody binding capacity (ABC) values serve as the preferred metric. Standardizing assays through the conversion of arbitrary FI units into quantitative data is essential for consistency. However, reported ABC values for well‐characterized antigens vary across the literature. This study addresses the challenges in achieving robust and reproducible quantitative flow cytometry data, offering methodological recommendations for accurately assessing target expression. Our research includes a comprehensive investigation of multiple factors, such as conventional and full‐spectrum instruments, antibodies, reagents, matrices, cell density/confluency, cellular autofluorescence, and quantitative kits, to identify the primary sources of variation in ABC calculations. By implementing a systematic and integrated approach, we aim to ensure the generation of reliable and reproducible ABC values. Longitudinal studies provide strong evidence of assay robustness, while the established protocol further supports biomarker evaluation across different matrices and various stages of drug development.
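The conversion of arbitrary FI units into ABC values described above is commonly done with a calibration curve fitted to reference beads of known binding capacity. The sketch below assumes a log-log linear fit, which is one widespread convention and not necessarily the protocol used in the study; the bead values are hypothetical, and corrections the abstract mentions (such as autofluorescence) are omitted.

```python
import math


def fit_log_log_calibration(bead_fi, bead_abc):
    """Least-squares fit of log10(ABC) = intercept + slope * log10(FI)
    from calibration beads with known ABC values."""
    xs = [math.log10(v) for v in bead_fi]
    ys = [math.log10(v) for v in bead_abc]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fi_to_abc(fi, slope, intercept):
    """Convert a fluorescence intensity to an ABC estimate with the fitted curve."""
    return 10 ** (intercept + slope * math.log10(fi))


# Hypothetical bead populations: median FI and manufacturer-assigned ABC.
bead_fi = [100.0, 1000.0, 10000.0, 100000.0]
bead_abc = [500.0, 5000.0, 50000.0, 500000.0]

slope, intercept = fit_log_log_calibration(bead_fi, bead_abc)
sample_abc = fi_to_abc(2500.0, slope, intercept)  # ABC estimate for a sample with FI 2500
```

Because the fit is specific to one instrument setup, antibody lot, and bead kit, a new curve would normally be acquired alongside each run, which is one reason reported ABC values can differ between laboratories.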