Genomics and Rare Diseases

Description

This cluster of papers focuses on the standards, guidelines, and tools for interpreting genetic variants, particularly in the context of clinical genomics and Mendelian disorders. It includes topics such as pathogenicity prediction, functional annotations, sequence interpretation, and the use of exome sequencing for identifying disease-causing variants.

Keywords

Genetic Variants; Sequence Interpretation; Clinical Genomics; Pathogenicity Prediction; Exome Sequencing; Mendelian Disorders; Functional Annotations; Variant Databases; Phenotype Analysis; ACMG Guidelines

PolyPhen‐2 (Polymorphism Phenotyping v2), available as software and via a Web server, predicts the possible impact of amino acid substitutions on the stability and function of human proteins using structural and comparative evolutionary considerations. It performs functional annotation of single‐nucleotide polymorphisms (SNPs), maps coding SNPs to gene transcripts, extracts protein sequence annotations and structural attributes, and builds conservation profiles. It then estimates the probability of the missense mutation being damaging based on a combination of all these properties. PolyPhen‐2 features include a high‐quality multiple protein sequence alignment pipeline and a prediction method employing machine‐learning classification. The software also integrates the UCSC Genome Browser's human genome annotations and MultiZ multiple alignments of vertebrate genomes with the human genome. PolyPhen‐2 is capable of analyzing large volumes of data produced by next‐generation sequencing projects, thanks to built‐in support for high‐performance computing environments like Grid Engine and Platform LSF. Curr. Protoc. Hum. Genet. 76:7.20.1‐7.20.41. © 2013 by John Wiley & Sons, Inc.
Here, we describe an overview and update on GeneMatcher (http://www.genematcher.org), a freely accessible Web-based tool developed as part of the Baylor-Hopkins Center for Mendelian Genomics. We created GeneMatcher with the goal of identifying additional individuals with rare phenotypes who had variants in the same candidate disease gene. We also wanted to facilitate connections to basic scientists working on orthologous genes in model systems with the goal of connecting their work to human Mendelian phenotypes. Meeting these goals will enhance the identification of novel Mendelian genes. Launched in September 2013, GeneMatcher now has 2,178 candidate genes from 486 submitters spread across 38 countries entered in the database (June 1, 2015). GeneMatcher is also part of the Matchmaker Exchange (http://matchmakerexchange.org/) with an Application Programming Interface enabling submitters to query other databases of genetic variants and phenotypes without having to create accounts and data entries in multiple systems.
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
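The stepwise "variants reduction" idea described in the ANNOVAR abstract can be sketched roughly as follows. This is a minimal illustration, not ANNOVAR's own code or command-line interface: the `Variant` fields, the order of the filters, the gene labels, and the "two surviving variants per gene" recessive rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Hypothetical, simplified annotation record (not an ANNOVAR output format).
    gene: str
    consequence: str          # e.g. "missense", "synonymous", "frameshift"
    in_conserved_region: bool
    in_1000g: bool            # reported in the 1000 Genomes Project
    in_dbsnp: bool            # reported in dbSNP


PROTEIN_ALTERING = {"missense", "nonsense", "frameshift", "splicing"}

# Each step keeps only the variants that still look plausibly causal.
STEPS = [
    ("protein-altering", lambda v: v.consequence in PROTEIN_ALTERING),
    ("conserved region", lambda v: v.in_conserved_region),
    ("absent from 1000 Genomes", lambda v: not v.in_1000g),
    ("absent from dbSNP", lambda v: not v.in_dbsnp),
]


def reduce_variants(variants):
    """Apply the filters one at a time and log how many variants survive each."""
    remaining = list(variants)
    log = [("input", len(remaining))]
    for name, keep in STEPS:
        remaining = [v for v in remaining if keep(v)]
        log.append((name, len(remaining)))
    return remaining, log


def recessive_candidates(variants):
    """Genes carrying at least two surviving variants (simplified recessive model)."""
    counts = {}
    for v in variants:
        counts[v.gene] = counts.get(v.gene, 0) + 1
    return sorted(gene for gene, n in counts.items() if n >= 2)


# Toy input with placeholder gene names.
DEMO_VARIANTS = [
    Variant("GENE_A", "missense", True, False, False),
    Variant("GENE_A", "nonsense", True, False, False),
    Variant("GENE_B", "missense", True, True, True),
    Variant("GENE_C", "synonymous", True, False, False),
    Variant("GENE_D", "missense", False, False, False),
    Variant("GENE_E", "frameshift", True, False, True),
]

if __name__ == "__main__":
    survivors, log = reduce_variants(DEMO_VARIANTS)
    for step, count in log:
        print(f"{step}: {count}")
    print("candidate genes:", recessive_candidates(survivors))
```

Logging the count after every step mirrors how a reduction protocol is usually reported: it shows which filter removes the most variants and makes it easy to relax one step if the causal gene is lost.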
As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
The Human Gene Mutation Database (HGMD) constitutes a comprehensive core collection of data on germ-line mutations in nuclear genes underlying or associated with human inherited disease (www.hgmd.org). Data catalogued includes: single base-pair substitutions in coding, regulatory and splicing-relevant regions; micro-deletions and micro-insertions; indels; triplet repeat expansions as well as gross deletions; insertions; duplications; and complex rearrangements. Each mutation is entered into HGMD only once in order to avoid confusion between recurrent and identical-by-descent lesions. By March 2003, the database contained in excess of 39,415 different lesions detected in 1,516 different nuclear genes, with new entries currently accumulating at a rate exceeding 5,000 per annum. Since its inception, HGMD has been expanded to include cDNA reference sequences for more than 87% of listed genes, splice junction sequences, disease-associated and functional polymorphisms, as well as links to data present in publicly available online locus-specific mutation databases. Although HGMD has recently entered into a licensing agreement with Celera Genomics (Rockville, MD), mutation data will continue to be made freely available via the Internet.
Consistent gene mutation nomenclature is essential for efficient and accurate reporting, testing, and curation of the growing number of disease mutations and useful polymorphisms being discovered in the human genome. While a codified mutation nomenclature system for simple DNA lesions has now been adopted broadly by the medical genetics community, it is inherently difficult to represent complex mutations in a unified manner. In this article, suggestions are presented for reporting just such complex mutations.
The Human Gene Mutation Database (HGMD®) is a comprehensive collection of germline mutations in nuclear genes that underlie, or are associated with, human inherited disease. By June 2013, the database contained over 141,000 different lesions detected in over 5,700 different genes, with new mutation entries currently accumulating at a rate exceeding 10,000 per annum. HGMD was originally established in 1996 for the scientific study of mutational mechanisms in human genes. However, it has since acquired a much broader utility as a central unified disease-oriented mutation repository utilized by human molecular geneticists, genome scientists, molecular biologists, clinicians and genetic counsellors as well as by those specializing in biopharmaceuticals, bioinformatics and personalized genomics. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions/non-profit organizations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via BIOBASE GmbH.
The goal of the International HapMap Project is to determine the common patterns of DNA sequence variation in the human genome and to make this information freely available in the public domain. An international consortium is developing a map of these patterns across the genome by determining the genotypes of one million or more sequence variants, their frequencies and the degree of association between them, in DNA samples from populations with ancestry from parts of Africa, Asia and Europe. The HapMap will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance our ability to choose targets for therapeutic intervention.
Single-copy sequences can be enzymatically amplified from genomic DNA by the polymerase chain reaction. By using unequal molar amounts of the two amplification primers, it is possible in a single step to amplify a single-copy gene and produce an excess of single-stranded DNA of a chosen strand for direct sequencing or for use as a hybridization probe. Further, individual alleles in a heterozygote can be sequenced directly by using allele-specific oligonucleotides either in the amplification reaction or as sequencing primers. By using these methods, we have studied the allelic diversity at the HLA-DQA locus and its association with the serologically defined HLA-DR and -DQ types. This analysis has revealed a total of eight alleles and three additional haplotypes. This procedure has wide applications in screening for mutations in human genes and facilitates the linking of enzymatic amplification of genes to automated sequencing.
Whole-exome sequencing is a diagnostic approach for the identification of molecular defects in patients with suspected genetic disorders.
The Sorting Intolerant from Tolerant (SIFT) algorithm predicts the effect of coding variants on protein function. It was first introduced in 2001, with a corresponding website that provides users with predictions on their variants. Since its release, SIFT has become one of the standard tools for characterizing missense variation. We have updated SIFT’s genome-wide prediction tool since our last publication in 2009, and added new features to the insertion/deletion (indel) tool. We also show accuracy metrics on independent data sets. The original developers have hosted the SIFT web server at FHCRC, JCVI and the web server is currently located at BII. The URL is http://sift-dna.org (24 May 2012, date last accessed).
This article provides a classification of primary progressive aphasia (PPA) and its 3 main variants to improve the uniformity of case reporting and the reliability of research results. Criteria for the 3 variants of PPA (nonfluent/agrammatic, semantic, and logopenic) were developed by an international group of PPA investigators who convened on 3 occasions to operationalize earlier published clinical descriptions for PPA subtypes. Patients are first diagnosed with PPA and are then divided into clinical variants based on specific speech and language features characteristic of each subtype. Classification can then be further specified as "imaging-supported" if the expected pattern of atrophy is found and "with definite pathology" if pathologic or genetic data are available. The working recommendations are presented in lists of features, and suggested assessment tasks are also provided. These recommendations have been widely agreed upon by a large group of experts and should be used to ensure consistency of PPA classification in future studies. Future collaborations will collect prospective data to identify relationships between each of these syndromes and specific biomarkers for a more detailed understanding of clinicopathologic correlations.
Clinical whole-exome sequencing is increasingly used for diagnostic evaluation of patients with suspected genetic disorders. To perform clinical whole-exome sequencing and report (1) the rate of molecular diagnosis among phenotypic groups, (2) the spectrum of genetic alterations contributing to disease, and (3) the prevalence of medically actionable incidental findings such as FBN1 mutations causing Marfan syndrome. Observational study of 2000 consecutive patients with clinical whole-exome sequencing analyzed between June 2012 and August 2014. Whole-exome sequencing tests were performed at a clinical genetics laboratory in the United States. Results were reported by clinical molecular geneticists certified by the American Board of Medical Genetics and Genomics. Tests were ordered by the patient's physician. The patients were primarily pediatric (1756 [88%]; mean age, 6 years; 888 females [44%], 1101 males [55%], and 11 fetuses [1% gender unknown]), demonstrating diverse clinical manifestations most often including nervous system dysfunction such as developmental delay. Whole-exome sequencing diagnosis rate overall and by phenotypic category, mode of inheritance, spectrum of genetic events, and reporting of incidental findings. A molecular diagnosis was reported for 504 patients (25.2%) with 58% of the diagnostic mutations not previously reported. Molecular diagnosis rates for each phenotypic category were 143/526 (27.2%; 95% CI, 23.5%-31.2%) for the neurological group, 282/1147 (24.6%; 95% CI, 22.1%-27.2%) for the neurological plus other organ systems group, 30/83 (36.1%; 95% CI, 26.1%-47.5%) for the specific neurological group, and 49/244 (20.1%; 95% CI, 15.6%-25.8%) for the nonneurological group. The Mendelian disease patterns of the 527 molecular diagnoses included 280 (53.1%) autosomal dominant, 181 (34.3%) autosomal recessive (including 5 with uniparental disomy), 65 (12.3%) X-linked, and 1 (0.2%) mitochondrial. Of 504 patients with a molecular diagnosis, 23 (4.6%) had blended phenotypes resulting from 2 single gene defects. About 30% of the positive cases harbored mutations in disease genes reported since 2011. There were 95 medically actionable incidental findings in genes unrelated to the phenotype but with immediate implications for management in 92 patients (4.6%), including 59 patients (3%) with mutations in genes recommended for reporting by the American College of Medical Genetics and Genomics. Whole-exome sequencing provided a potential molecular diagnosis for 25% of a large cohort of patients referred for evaluation of suspected genetic conditions, including detection of rare genetic events and new mutations contributing to disease. The yield of whole-exome sequencing may offer advantages over traditional molecular diagnostic approaches in certain patients.
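The per-group diagnosis rates and confidence intervals quoted in the whole-exome sequencing abstract can be approximated with a short calculation. The sketch below uses the Wilson score interval, which is an assumption: the abstract does not say which interval method the authors used, so the bounds come out close to, but not necessarily identical to, the published ones. The counts are taken directly from the abstract.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width


# (diagnosed, tested) per phenotypic category, as reported in the abstract.
GROUPS = {
    "neurological": (143, 526),
    "neurological plus other organ systems": (282, 1147),
    "specific neurological": (30, 83),
    "nonneurological": (49, 244),
}

if __name__ == "__main__":
    for name, (diagnosed, tested) in GROUPS.items():
        low, high = wilson_interval(diagnosed, tested)
        rate = diagnosed / tested
        print(f"{name}: {rate:.1%} (approx. 95% CI {low:.1%} to {high:.1%})")
    total_diagnosed = sum(d for d, _ in GROUPS.values())
    total_tested = sum(t for _, t in GROUPS.values())
    print(f"overall: {total_diagnosed}/{total_tested} = {total_diagnosed / total_tested:.1%}")
```

The four groups add up to the 504 diagnosed patients and 2000 tested patients reported overall, which is a quick consistency check on the figures.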
The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at [email protected].
ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/) provides a freely available archive of reports of relationships among medically important variants and phenotypes. ClinVar accessions submissions reporting human variation, interpretations of the relationship of that variation to human health and the evidence supporting each interpretation. The database is tightly coupled with dbSNP and dbVar, which maintain information about the location of variation on human assemblies. ClinVar is also based on the phenotypic descriptions maintained in MedGen (http://www.ncbi.nlm.nih.gov/medgen). Each ClinVar record represents the submitter, the variation and the phenotype, i.e. the unit that is assigned an accession of the format SCV000000000.0. The submitter can update the submission at any time, in which case a new version is assigned. To facilitate evaluation of the medical importance of each variant, ClinVar aggregates submissions with the same variation/phenotype combination, adds value from other NCBI databases, assigns a distinct accession of the format RCV000000000.0 and reports if there are conflicting clinical interpretations. Data in ClinVar are available in multiple formats, including html, download as XML, VCF or tab-delimited subsets. Data from ClinVar are provided as annotation tracks on genomic RefSeqs and are used in tools such as Variation Reporter (http://www.ncbi.nlm.nih.gov/variation/tools/reporter), which reports what is known about variation based on user-supplied locations.
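The two accession formats described in this ClinVar abstract (SCV000000000.0 for a single submission, RCV000000000.0 for the aggregate of a variation/phenotype combination) lend themselves to a small parser. The sketch below assumes exactly the layout quoted in the abstract: a three-letter prefix, nine digits, a dot, and an integer version. The function and type names are illustrative and not part of any ClinVar tool.

```python
import re
from typing import NamedTuple

# Prefix, nine-digit number, dot, version, following the formats quoted above.
ACCESSION_RE = re.compile(r"^(SCV|RCV)(\d{9})\.(\d+)$")

KIND_BY_PREFIX = {
    "SCV": "submission",   # one submitter + variation + phenotype
    "RCV": "aggregate",    # all submissions for a variation/phenotype combination
}


class Accession(NamedTuple):
    kind: str
    number: int
    version: int


def parse_accession(text):
    """Split a ClinVar-style accession into record kind, number, and version."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not an SCV/RCV accession: {text!r}")
    prefix, number, version = match.groups()
    return Accession(KIND_BY_PREFIX[prefix], int(number), int(version))


if __name__ == "__main__":
    print(parse_accession("SCV000000000.0"))
    print(parse_accession("RCV000012345.3"))
```

Keeping the version as a separate field matters because, as the abstract notes, a submitter can update a submission at any time and each update is assigned a new version.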
The discovery of rare genetic variants is accelerating, and clear guidelines for distinguishing disease-causing sequence variants from the many potentially functional variants present in any human genome are urgently needed. Without rigorous standards we risk an acceleration of false-positive reports of causality, which would impede the translation of genomic research findings into the clinical diagnostic setting and hinder biological understanding of disease. Here we discuss the key challenges of assessing sequence variants in human disease, integrating both gene-level and variant-level support for causality. We propose guidelines for summarizing confidence in variant pathogenicity and highlight several areas that require further resource development.
The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) at the National Center for Biotechnology Information (NCBI) is a freely available archive for interpretations of clinical significance of variants for reported conditions. The database includes germline and somatic variants of any size, type or genomic location. Interpretations are submitted by clinical testing laboratories, research laboratories, locus-specific databases, OMIM®, GeneReviews™, UniProt, expert panels and practice guidelines. In NCBI's Variation submission portal, submitters upload batch submissions or use the Submission Wizard for single submissions. Each submitted interpretation is assigned an accession number prefixed with SCV. ClinVar staff review validation reports with data types such as HGVS (Human Genome Variation Society) expressions; however, clinical significance is reported directly from submitters. Interpretations are aggregated by variant-condition combination and assigned an accession number prefixed with RCV. Clinical significance is calculated for the aggregate record, indicating consensus or conflict in the submitted interpretations. ClinVar uses data standards, such as HGVS nomenclature for variants and MedGen identifiers for conditions. The data are available on the web as variant-specific views; the entire data set can be downloaded via ftp. Programmatic access for ClinVar records is available through NCBI's E-utilities. Future development includes providing a variant-centric XML archive and a web page for details of SCV submissions.
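As a rough illustration of the programmatic access mentioned in this ClinVar abstract, the sketch below only builds an E-utilities ESearch request URL for the `clinvar` database; it does not send the request. The `esearch.fcgi` endpoint and the `db`, `term`, `retmode`, and `retmax` parameters belong to NCBI's E-utilities, while the search term used here and the helper name are assumptions chosen for the example. Check NCBI's current documentation and usage policy before querying.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_clinvar_search_url(term, retmax=20):
    """Return an ESearch URL that would list ClinVar record IDs matching `term`."""
    params = {
        "db": "clinvar",     # search the ClinVar database
        "term": term,        # Entrez query string
        "retmode": "json",   # ask for a JSON response
        "retmax": retmax,    # maximum number of IDs to return
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query only; the field tag is an assumption for this example.
    print(build_clinvar_search_url("BRCA1[gene]"))
```

Building the URL separately from sending it keeps the example runnable offline; in practice the returned IDs would be passed to a follow-up E-utilities call to retrieve record summaries.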
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes. Exome sequencing data from 60,706 people of diverse geographic ancestry is presented, providing insight into genetic variation across populations, and illuminating the relationship between DNA variants and human disease. As part of the Exome Aggregation Consortium (ExAC) project, Daniel MacArthur and colleagues report on the generation and analysis of high-quality exome sequencing data from 60,706 individuals of diverse ancestry. This provides the most comprehensive catalogue of human protein-coding genetic variation to date, yielding unprecedented resolution for the analysis of very rare variants across multiple human populations. The catalogue is freely accessible and provides a critical reference panel for the clinical interpretation of genetic variants and the discovery of disease-related genes.
The consistent and unambiguous description of sequence variants is essential to report and exchange information on the analysis of a genome. In particular, DNA diagnostics critically depends on accurate and standardized description and sharing of the variants detected. The sequence variant nomenclature system proposed in 2000 by the Human Genome Variation Society has been widely adopted and has developed into an internationally accepted standard. The recommendations are currently commissioned through a Sequence Variant Description Working Group (SVD-WG) operating under the auspices of three international organizations: the Human Genome Variation Society (HGVS), the Human Variome Project (HVP), and the Human Genome Organization (HUGO). Requests for modifications and extensions go through the SVD-WG following a standard procedure including a community consultation step. Version numbers are assigned to the nomenclature system to allow users to specify the version used in their variant descriptions. Here, we present the current recommendations, HGVS version 15.11, and briefly summarize the changes that were made since the 2000 publication. Most focus has been on removing inconsistencies and tightening definitions allowing automatic data processing. An extensive version of the recommendations is available online, at http://www.HGVS.org/varnomen.
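To show what "tightening definitions allowing automatic data processing" can mean in practice, here is a deliberately narrow sketch that parses one kind of HGVS-style description: a simple single-nucleotide substitution on a versioned reference sequence, such as `NM_004006.2:c.4375C>T`. It is not a full HGVS parser. Intronic offsets, UTR positions, deletions, duplications, protein-level descriptions, and every other variant type are out of scope, and the function and field names are illustrative.

```python
import re
from typing import NamedTuple

# Versioned accession, coordinate type (c = coding DNA, g = genomic),
# integer position, reference base, '>', alternate base.
SUBSTITUTION_RE = re.compile(
    r"^(?P<reference>[A-Z]{2}_\d+\.\d+):"
    r"(?P<coord>[cg])\."
    r"(?P<position>\d+)"
    r"(?P<ref_base>[ACGT])>(?P<alt_base>[ACGT])$"
)


class Substitution(NamedTuple):
    reference: str   # e.g. "NM_004006.2"
    coord: str       # "c" or "g"
    position: int
    ref_base: str
    alt_base: str


def parse_simple_substitution(description):
    """Parse a simple substitution; raise ValueError for anything else."""
    match = SUBSTITUTION_RE.match(description.strip())
    if match is None:
        raise ValueError(f"unsupported or malformed description: {description!r}")
    fields = match.groupdict()
    if fields["ref_base"] == fields["alt_base"]:
        raise ValueError("reference and alternate bases are identical")
    return Substitution(
        fields["reference"],
        fields["coord"],
        int(fields["position"]),
        fields["ref_base"],
        fields["alt_base"],
    )


if __name__ == "__main__":
    print(parse_simple_substitution("NM_004006.2:c.4375C>T"))
```

Rejecting anything outside the supported pattern, rather than guessing, is the safer design for diagnostic data: an unparsed description can be routed to a more complete tool or to manual review.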
The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.
The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that underlie, or are closely associated with human inherited disease. At the time of writing (March 2017), the database contained in excess of 203,000 different gene lesions identified in over 8000 genes manually curated from over 2600 journals. With new mutation entries currently accumulating at a rate exceeding 17,000 per annum, HGMD represents de facto the central unified gene/disease-oriented repository of heritable mutations causing human genetic disease used worldwide by researchers, clinicians, diagnostic laboratories and genetic counsellors, and is an essential tool for the annotation of next-generation sequencing data. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions and non-profit organisations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via QIAGEN Inc.
ClinVar (https://www.ncbi.nlm.nih.gov/clinvar/) is a freely available, public archive of human genetic variants and interpretations of their significance to disease, maintained at the National Institutes of Health. Interpretations of the clinical significance of variants are submitted by clinical testing laboratories, research laboratories, expert panels and other groups. ClinVar aggregates data by variant-disease pairs, and by variant (or set of variants). Data aggregated by variant are accessible on the website, in an improved set of variant call format files and as a new comprehensive XML report. ClinVar recently started accepting submissions that are focused primarily on providing phenotypic information for individuals who have had genetic testing. Submissions may come from clinical providers providing their own interpretation of the variant ('provider interpretation') or from groups such as patient registries that primarily provide phenotypic information from patients ('phenotyping only'). ClinVar continues to make improvements to its search and retrieval functions. Several new fields are now indexed for more precise searching, and filters allow the user to narrow down a large set of search results.
Combined Annotation-Dependent Depletion (CADD) is a widely used measure of variant deleteriousness that can effectively prioritize causal variants in genetic analyses, particularly highly penetrant contributors to severe Mendelian disorders. CADD is an integrative annotation built from more than 60 genomic features, and can score human single nucleotide variants and short insertion and deletions anywhere in the reference assembly. CADD uses a machine learning model trained on a binary distinction between simulated de novo variants and variants that have arisen and become fixed in human populations since the split between humans and chimpanzees; the former are free of selective pressure and may thus include both neutral and deleterious alleles, while the latter are overwhelmingly neutral (or, at most, weakly deleterious) by virtue of having survived millions of years of purifying selection. Here we review the latest updates to CADD, including the most recent version, 1.4, which supports the human genome build GRCh38. We also present updates to our website that include simplified variant lookup, extended documentation, an Application Program Interface and improved mechanisms for integrating CADD scores into other tools or applications. CADD scores, software and documentation are available at https://cadd.gs.washington.edu.
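As a rough illustration of how a rank-based deleteriousness score can be read, the sketch below converts raw model scores into PHRED-like values, -10·log10(rank/total), so that a value of 10 marks the top 10% and 20 the top 1%. This mirrors the general idea behind CADD's scaled scores, but the function name and the toy input are assumptions made here; it is not CADD's own code and does not use the CADD API.

```python
import math


def phred_like_scaling(raw_scores):
    """Return -10*log10(rank/total) for each raw score (rank 1 = highest raw score).

    Illustrative only: ties are broken by input order, which a real
    implementation would need to handle more carefully.
    """
    total = len(raw_scores)
    # Indices sorted from most to least deleterious (highest raw score first).
    order = sorted(range(total), key=lambda i: raw_scores[i], reverse=True)
    scaled = [0.0] * total
    for rank, idx in enumerate(order, start=1):
        scaled[idx] = -10 * math.log10(rank / total)
    return scaled


# Hypothetical raw scores for 1000 variants, increasing with index.
toy_raw = [0.1 * i for i in range(1000)]
toy_scaled = phred_like_scaling(toy_raw)
```

Under this scaling the single highest-ranked variant out of 1000 receives 30, and the variant at rank 100 receives 10.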
Abstract Summary VarSome.com is a search engine, aggregator and impact analysis tool for human genetic variation and a community-driven project aiming at sharing global expertise on human variants. Availability and implementation VarSome is freely available at http://varsome.com. Supplementary information Supplementary data are available at Bioinformatics online.
Abstract One of the most pressing challenges in genomic medicine is to understand the role played by genetic variation in health and disease. Thanks to the exploration of genomic variants at large scale, hundreds of thousands of disease-associated loci have been uncovered. However, the identification of variants of clinical relevance is a significant challenge that requires comprehensive interrogation of previous knowledge and linkage to new experimental results. To assist in this complex task, we created DisGeNET (http://www.disgenet.org/), a knowledge management platform integrating and standardizing data about disease associated genes and variants from multiple sources, including the scientific literature. DisGeNET covers the full spectrum of human diseases as well as normal and abnormal traits. The current release covers more than 24 000 diseases and traits, 17 000 genes and 117 000 genomic variants. The latest developments of DisGeNET include new sources of data, novel data attributes and prioritization metrics, a redesigned web interface and recently launched APIs. Thanks to the data standardization, the combination of expert curated information with data automatically mined from the scientific literature, and a suite of tools for accessing its publicly available data, DisGeNET is an interoperable resource supporting a variety of applications in genomic medicine and drug R&D.
Abstract Genetic variants that inactivate protein-coding genes are a powerful source of information about the phenotypic consequences of gene disruption: genes that are crucial for the function of an organism will be depleted of such variants in natural populations, whereas non-essential genes will tolerate their accumulation. However, predicted loss-of-function variants are enriched for annotation errors, and tend to be found at extremely low frequencies, so their analysis requires careful variant annotation and very large sample sizes. Here we describe the aggregation of 125,748 exomes and 15,708 genomes from human sequencing studies into the Genome Aggregation Database (gnomAD). We identify 443,769 high-confidence predicted loss-of-function variants in this cohort after filtering for artefacts caused by sequencing and annotation errors. Using an improved model of human mutation rates, we classify human protein-coding genes along a spectrum that represents tolerance to inactivation, validate this classification using data from model organisms and engineered human cells, and show that it can be used to improve the power of gene discovery for both common and rare diseases.
To explore the clinical and genetic characteristics of two children diagnosed with two rare genetic diseases simultaneously. Two children with comorbidity of two genetic diseases due to dual genetic mutations, diagnosed at the Third Affiliated Hospital of Zhengzhou University in May 2022 and March 2023, respectively, were selected as the study subjects. Clinical and genetic data of the two children were retrospectively analyzed. This study was approved by the Ethics Committee of the Third Affiliated Hospital of Zhengzhou University (Ethics No. 2021-062-01). Child 1 was a 2-year-and-4-month-old boy whose clinical manifestations included facial dysmorphism, developmental delay, short stature, microcephaly, cleft palate, cryptorchidism, hypospadias, recurrent infections and immunological abnormalities. Whole exome sequencing revealed that he harbored a heterozygous c.6595delT (p.Y2199Ifs*65) variant of the KMT2D gene and a heterozygous c.1892G>A (p.R631Q) variant of the PIK3R1 gene. This led to a dual genetic diagnosis of Kabuki syndrome and PI3Kδ-related immunodeficiency type 36. Child 2 was a 15-year-old girl whose clinical manifestations included epilepsy, Albright's hereditary osteodystrophy, long body trunk, short limbs, hypocalcemia, hyperphosphatemia and hyperparathyroidism. The child also had a family history of short stature. Whole exome sequencing revealed that she harbored a heterozygous c.2T>C (p.Met1?) variant of the GNAS gene and a deletion of exons 2 to 6 of the SHOX gene. The two variants led to a dual diagnosis of pseudohypoparathyroidism and X-linked idiopathic short stature.
When the clinical phenotype of a genetic disease is complex and cannot be fully explained by a single genetic variant, multiple pathogenic variants should be considered, and this may lead to the diagnosis of co-morbid genetic diseases. Timely adoption or supplementation of appropriate genetic testing, together with re-analysis of the genetic data, may facilitate accurate diagnosis of co-morbid genetic diseases.
BACKGROUND Genetic disorders are pervasive in neonatal intensive care unit (NICU) populations, and the superiority of genomic testing to rapidly identify genetic diagnoses is established, yet patients remain untested and undiagnosed. METHODS In this single-center 19-month cohort study, outcomes before and after implementation of a clinical guideline standardizing genomic testing were evaluated in 2169 patients in the NICU (pre: 692, 31.9%; post: 1477, 68.1%). Primary outcomes were qualifying for and receipt of any genetic services and a diagnosis. Secondary outcomes included admission length and hospital charges. RESULTS The frequency of qualifying for genetic services across racial and birth weight (BW) categories differed: 643 (44.3%; 95% CI 41.8–46.9) white vs 155 (32.3%; 95% CI 28.1–36.5) Black, P < .001; and 584 (49.1%; 95% CI 46.3–52.0) normal vs 78 (23.2%; 95% CI 18.7–27.7) very and extremely low BW, P < .001. When adjusting for these differences, all populations experienced increases in genetics consultations, 177 (25.6%; 95% CI 22.3–28.8) vs 461 (31.2%; 95% CI 28.9–33.6), P = .007; completion of genomic testing, 62 (9.0%; 95% CI 6.8–11.1) vs 363 (24.6%; 95% CI 22.4–26.8), P < .001; and confirmed genetic diagnoses, 57 (8.2%; 95% CI 6.2–10.3) vs 172 (11.6%; 95% CI 10.0–13.3), P = .02. Patients receiving genomic testing experienced decreases in admission length, 46 vs 24 days, P = .008; and hospital charges, $561 536.00 vs $354 627.00, P = .03; regardless of testing outcome. CONCLUSIONS A significant number of patients in the NICU required genomic testing. However, differences existed across race and BW categories in qualifying for genomic services.
Standardizing genomic care equitably improved access to testing and genetic disorder detection and lowered the use of health care resources.
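The proportions and 95% confidence intervals quoted in the NICU abstract above can be approximated with a normal-approximation (Wald) interval. The abstract does not state which interval method the authors used, so the choice of the Wald interval is an assumption made here; the counts (177 of 692 pre-guideline patients receiving a genetics consultation) are taken from the text.

```python
import math


def wald_interval(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) confidence interval.

    z=1.96 corresponds to a two-sided 95% interval.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Pre-guideline genetics consultations reported in the abstract: 177 of 692.
p, lower, upper = wald_interval(177, 692)
print(f"{p:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

For small counts or proportions near 0 or 1, a Wilson or exact interval is usually a better choice than the Wald interval.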
| Nature Structural & Molecular Biology
Natural history studies can help inform clinician and caregiver expectations, form the basis of management guidelines, and provide a comparator for therapeutic intervention. In rare conditions, where collection of prospective longitudinal data is untimely and impractical, quasi-natural history data (from multiple individuals of different ages) provides an alternative approach. A detailed genotype-phenotype analysis of 64 individuals with pathogenic or likely pathogenic ASXL3 variants was carried out, comprising qualitative and quantitative data. The majority of data was collected through direct clinic consultation with the individual and/or caregiver(s). We report significant phenotypic variability, but improvement trends in feeding, hypotonia, verbalisation, and motor skills over time. Findings include: an increased prevalence of antenatal and neonatal structural anomalies, an emerging renal phenotype, a tendency for poor post-natal growth (with novel reports of obesity later in childhood), and a lower-than-expected prevalence of seizures (compared to the existing literature). We also provide the first qualitative descriptions of several mildly affected probands, at different ages. Our recommendations include: baseline renal imaging after diagnosis, and dental and ophthalmological follow-up for all. We describe the largest-to-date cohort of individuals with ASXL3-related disorder, including 24 novel variants, novel clinical findings, quasi-natural history trends, management insights, and recommendations.
ABSTRACT Epilepsy is a relatively common condition with genetic factors contributing significantly to its etiology. Advances in next‐generation sequencing have dramatically increased the number of known epilepsy genes, improving diagnostic capabilities and patient care. However, 50%–80% of epilepsy patients remain undiagnosed after genomic testing, which includes chromosomal microarray, multigene panels, and genome‐wide sequencing. Reanalysis of existing exome sequencing data has shown promise in increasing diagnostic yield. In this study, we reanalyzed exome sequencing data from 87 individuals with unsolved epilepsy and developmental delay or intellectual disability in Ontario, Canada. Our approach combined clinical and translational research methodologies to identify genetic variants linked to epilepsy. We obtained a diagnostic yield of 14.9%, solving 13 participants, with 11 involving known genes and two novel gene discoveries. In addition, 11 potential diagnoses were identified, suggesting that further investigation could confirm additional diagnoses. Factors such as the inclusion of additional family data, new disease‐gene associations, and technological advancements contributed to these findings. This study highlights the importance of reanalysis as a cost‐effective and timely approach to improving diagnostic yield in epilepsy associated with neurodevelopmental delay.
Abstract Background High-throughput sequencing has revolutionized genetic disorder diagnosis, but variant pathogenicity interpretation is still challenging. Even though the Human Genome Variation Society (HGVS) provides recommendations for variant nomenclature, discrepancies in annotation remain a significant hurdle. Results In this study, we evaluated the annotation concordance between three tools (ANNOVAR, SnpEff, and Variant Effect Predictor (VEP)) using 164,549 two-star variants from ClinVar. The analysis used HGVS nomenclature string-match comparisons to assess annotation consistency from each tool, corresponding coding impacts, and associated ACMG criteria inferred from the annotations. The analysis revealed variable concordance rates, with 58.52% agreement for HGVSc, 84.04% for HGVSp, and 85.58% for the coding impact. SnpEff showed the highest match for HGVSc (0.988), while VEP performed best for HGVSp (0.977). Substantial discrepancies were noted in the loss-of-function (LoF) category. Incorrect PVS1 interpretations affected the final pathogenicity and downgraded PLP variants (ANNOVAR 55.9%, SnpEff 66.5%, VEP 67.3%), risking false negatives of clinically relevant variants in reports. Conclusions These findings highlight the critical challenges in accurately interpreting variant pathogenicity due to discrepancies in annotations. To enhance the reliability of genetic variant interpretation in clinical practice, standardizing transcript sets and systematically cross-validating results across multiple annotation tools is essential.
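The string-match comparison described in the annotation-concordance abstract above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the record layout, the helper names and the two toy variants (including their HGVS strings) are hypothetical, and real comparisons would first have to fix a common transcript for each variant.

```python
TOOLS = ("annovar", "snpeff", "vep")


def record(reference, annovar, snpeff, vep):
    """Bundle (HGVSc, HGVSp) pairs from a reference source and each tool."""
    def as_fields(pair):
        return {"hgvsc": pair[0], "hgvsp": pair[1]}
    return {
        "reference": as_fields(reference),
        "annovar": as_fields(annovar),
        "snpeff": as_fields(snpeff),
        "vep": as_fields(vep),
    }


def all_tools_agree(rec, field):
    """True when every tool reports exactly the same string for the field."""
    return len({rec[tool][field] for tool in TOOLS}) == 1


def concordance_rate(records, field):
    """Fraction of variants on which all tools agree for the field."""
    if not records:
        return 0.0
    return sum(all_tools_agree(r, field) for r in records) / len(records)


def match_rate(records, tool, field):
    """Fraction of variants where one tool matches the reference string."""
    if not records:
        return 0.0
    hits = sum(r[tool][field] == r["reference"][field] for r in records)
    return hits / len(records)


# Hypothetical toy data: the second variant shows a one-base shift in HGVSc.
RECORDS = [
    record(("c.100A>G", "p.Lys34Glu"), ("c.100A>G", "p.Lys34Glu"),
           ("c.100A>G", "p.Lys34Glu"), ("c.100A>G", "p.Lys34Glu")),
    record(("c.200del", "p.Gly67fs"), ("c.199del", "p.Gly67fs"),
           ("c.200del", "p.Gly67fs"), ("c.200del", "p.Gly67fs")),
]

print(concordance_rate(RECORDS, "hgvsc"), concordance_rate(RECORDS, "hgvsp"))
```

Exact string matching is deliberately strict: equivalent descriptions written differently count as mismatches, which may be one reason agreement at the HGVSc level can be lower than at the HGVSp level.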
Introduction A trio analysis refers to the strategy of exome or genome sequencing of DNA from a patient, as well as parents, in order to identify the genetic cause of a disorder or syndrome. Methods During the last 10 years, we have successfully applied exome or genome sequencing and performed trio analysis for 1,000 patients. Results Overall, 39% of the patients were diagnosed, with the detection of causative variant(s). The variants were located in 308 different genes. Autosomal dominant de novo variants were detected in 46% of the solved cases. Detection rates were highest in patients with a syndromic neurodevelopmental disorder (46%) and in patients with known consanguinity (59%). Even for patients previously analyzed as singletons, using a pre-defined gene panel, a consecutive trio analysis resulted in the detection of a causative variant in 30%. Discussion A major advantage of trio analysis is the immediate identification of de novo variants as well as confirmation of compound heterozygosity. Additionally, inherited variants from a healthy parent can be dismissed as non-disease causing. The trio strategy enables analysis of a high number of genes, or even the whole genome, simultaneously. The strengths of a trio analysis, in combination with analysis of genome sequence data, allow for the detection of a wide range of genetic aberrations. This enables a high diagnostic yield, even in previously analyzed patients. Our current protocol for trio analysis is based on genome sequencing data, which allows for simultaneous detection of single nucleotide variants, insertion/deletions, structural variants, expanded short tandem repeats, as well as a copy number analysis corresponding to an array-CGH, and analysis regarding SMN1 gene copies.
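The two trio-specific checks mentioned in the abstract above, flagging de novo variants and confirming compound heterozygosity, can be sketched with simple genotype logic. The representation (a genotype as a pair of allele indices, 0 for reference and 1 for alternate) and the function names are assumptions made for this illustration; a real pipeline would also weigh read depth, genotype quality and sex chromosomes.

```python
def carries_alt(genotype):
    """True when at least one allele in the genotype is non-reference."""
    return any(allele != 0 for allele in genotype)


def is_de_novo(child, mother, father):
    """Child carries the variant while both parents are homozygous reference."""
    return carries_alt(child) and not carries_alt(mother) and not carries_alt(father)


def parental_origin(variant):
    """Return 'maternal' or 'paternal' for a heterozygous child variant, else None."""
    child = variant["child"]
    if sorted(child) != [0, 1]:
        return None  # only simple heterozygous calls are considered here
    from_mother = carries_alt(variant["mother"])
    from_father = carries_alt(variant["father"])
    if from_mother and not from_father:
        return "maternal"
    if from_father and not from_mother:
        return "paternal"
    return None  # ambiguous (both parents carry it) or de novo


def is_compound_het(variant_a, variant_b):
    """Two heterozygous variants in one gene, inherited from different parents."""
    origin_a = parental_origin(variant_a)
    origin_b = parental_origin(variant_b)
    return origin_a is not None and origin_b is not None and origin_a != origin_b


# Hypothetical genotypes for two variants in the same gene.
variant_1 = {"child": (0, 1), "mother": (0, 1), "father": (0, 0)}
variant_2 = {"child": (0, 1), "mother": (0, 0), "father": (0, 1)}
print(is_compound_het(variant_1, variant_2))
print(is_de_novo(child=(0, 1), mother=(0, 0), father=(0, 0)))
```

Without parental genotypes, neither check is possible from the proband alone, which is the advantage of the trio design that the abstract highlights.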
There are over 7000 rare diseases, some affecting 3500 or fewer patients in the United States. Due to clinicians' limited experience with such diseases and the heterogeneity of clinical presentations, ~70% of individuals seeking a diagnosis remain undiagnosed. Deep learning has demonstrated success in aiding the diagnosis of common diseases. However, existing approaches require labeled datasets with thousands of diagnosed patients per disease. We present SHEPHERD, a few-shot learning approach for multi-faceted rare disease diagnosis. SHEPHERD performs deep learning over a knowledge graph enriched with rare disease information and is trained on a dataset of simulated rare disease patients. We demonstrate SHEPHERD's effectiveness across diverse diagnostic tasks, performing causal gene discovery, retrieving "patients-like-me", and characterizing novel disease presentations, using real-world cohorts from the Undiagnosed Diseases Network (N = 465), MyGene2 (N = 146), and the Deciphering Developmental Disorders study (N = 1431). SHEPHERD demonstrates the potential of knowledge-grounded deep learning to accelerate rare disease diagnosis.
Pablo Enrique Guillem , Marco Zurdo-Tabernero , Noelia Egido Iglesias +5 more | Berichte aus der medizinischen Informatik und Bioinformatik/Journal of integrative bioinformatics
Abstract The rapid advancement of Next-Generation Sequencing (NGS) technologies has revolutionized the field of genomics, producing large volumes of data that necessitate sophisticated analytical techniques. This paper introduces a Deep Learning model designed to predict the pathogenicity of genetic variants, a vital component in advancing personalized medicine. The model is trained on a dataset derived from the analysis of NGS outputs, containing a combination of well-defined and ambiguous genetic variants. By employing a semi-supervised learning approach, the model efficiently utilizes both confidently labeled and less certain data. At the core of the methodology is the Feature Tokenizer Transformer architecture, which processes both numerical and categorical genomic information. The preprocessing pipeline includes key steps such as data imputation, scaling, and encoding to ensure high data quality. The results highlight the model’s impressive accuracy, particularly in detecting confidently labeled variants, while also addressing the impact of its predictions on less certain (soft-labeled) data.
Abstract Background Diagnosing rare diseases remains challenging due to their inherent complexity and limited physician knowledge. Large language models (LLMs) offer new potential to enhance diagnostic workflows. Objective This study aimed to evaluate the diagnostic accuracy of ChatGPT-4o and 4 open-source LLMs (qwen2.5:7b, Llama3.1:8b, qwen2.5:72b, and Llama3.1:70b) for rare diseases, assess the language effect on diagnostic performance, and explore retrieval augmented generation (RAG) and chain-of-thought (CoT) reasoning. Methods We extracted clinical manifestations of 121 rare diseases from China’s inaugural rare disease catalog. ChatGPT-4o generated a primary and 5 differential diagnoses, while 4 LLMs were assessed in both English and Chinese contexts. The lowest-performing model underwent RAG and CoT re-evaluation. Diagnostic accuracy was compared via the McNemar test. A survey evaluated 11 clinicians’ familiarity with rare diseases. Results ChatGPT-4o demonstrated the highest diagnostic accuracy with 90.1%. Language effects varied across models: qwen2.5:7b showed comparable performance in Chinese (51.2%) and English (47.9%; χ²₁=0.32, P=.57), whereas Llama3.1:8b exhibited significantly higher English accuracy (67.8% vs 31.4%; χ²₁=40.20, P<.001). Among larger models, qwen2.5:72b maintained cross-lingual consistency considering the odds ratio (OR; Chinese: 82.6% vs English: 83.5%; OR 0.88, 95% CI 0.27-2.76, P=1.000), contrasting with Llama3.1:70b’s language-dependent variation (Chinese: 80.2% vs English: 90.1%; OR 0.29, 95% CI 0.08-0.83, P=.02). Cross-model comparisons revealed Llama3.1:8b underperformed qwen2.5:7b in Chinese (χ²₁=13.22, P<.001) but surpassed it in English (χ²₁=13.92, P<.001).
No significant differences were observed between qwen2.5:72b and Llama3.1:70b (English: OR 0.33, P=.08; Chinese: OR 1.5, 95% CI 0.48-5.12, P=.07); qwen2.5:72b matched ChatGPT-4o’s performance in both languages (English: OR 0.33, P=.08; Chinese: OR 0.44, P=.09); Llama3.1:70b mirrored ChatGPT-4o’s English accuracy (OR 1, P=1.000) but lagged in Chinese (OR 0.33; P=.02). RAG implementation enhanced qwen2.5:7b’s accuracy to 79.3% (χ²₁=31.11, P<.001) with 85.9% retrieval precision. The distilled model Deepseek-R1:7b markedly underperformed (9.9% vs qwen2.5:7b; χ²₁=42.19, P<.001). Clinician surveys revealed significant knowledge gaps in rare disease management. Conclusions ChatGPT-4o demonstrated superior diagnostic performance for rare diseases. While Llama3.1:8b demonstrates viability for localized deployment in resource-constrained English diagnostic workflows, Chinese applications require larger models to achieve comparable diagnostic accuracy. This urgency is heightened by the release of open-source models like DeepSeek-R1, which may see rapid adoption without thorough validation. Successful clinical implementation of LLMs requires 3 core elements: model parameterization, user language, and pretraining data. The integration of RAG significantly enhanced open-source LLM accuracy for rare disease diagnosis, although caution remains warranted for low-parameter reasoning models showing substantial performance limitations. We recommend hospital IT departments and policymakers prioritize language relevance in model selection and consider integrating RAG with curated knowledge bases to enhance diagnostic utility in constrained settings, while exercising caution with low-parameter models.
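The paired accuracy comparisons in the abstract above rely on the McNemar test, which uses only the discordant pairs: cases one model (or language) diagnosed correctly and the other did not. A minimal sketch of the chi-square form with one degree of freedom follows. The discordant counts in the example are hypothetical, since the abstract reports test statistics but not the underlying 2×2 tables, and whether the authors applied a continuity correction is not stated.

```python
import math


def mcnemar_test(b, c, continuity_correction=False):
    """McNemar chi-square test (1 degree of freedom) on discordant pair counts.

    b: cases correct under condition A only
    c: cases correct under condition B only
    Returns (statistic, p_value).
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, no evidence of a difference
    diff = abs(b - c)
    if continuity_correction:
        diff = max(diff - 1, 0)
    statistic = diff ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical discordant counts among 121 cases.
statistic, p_value = mcnemar_test(b=10, c=30)
print(f"chi-square = {statistic:.2f}, P = {p_value:.4f}")
```

When the number of discordant pairs is small, an exact binomial version of the test is generally preferred over the chi-square approximation.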
ABSTRACT KDM1A‐related neurodevelopmental disorder (CPRF, OMIM #616728) is characterized by cleft palate, global developmental delay, and distinct facial gestalt, but phenotypic knowledge of this ultra‐rare autosomal dominant disorder is limited. Here, we report on a 13‐year‐old boy with a novel heterozygous, likely pathogenic germline missense variant in exon 16 of KDM1A with developmental delay, hypotonia, mild intellectual disability, and unspecific facial features but without palate abnormalities. Notably, this first reported individual without palate abnormalities highlights the variable expressivity of this feature in the KDM1A‐related phenotype. Furthermore, this case report expands knowledge on the phenotypic spectrum, including intellectual disability with global developmental delay, muscular hypotonia, and variable dysmorphic anomalies, highlighting the value of individual case studies in clinical research.
ABSTRACT Objective Dystonia is one of the most prevalent movement disorders, characterized by significant clinical and etiological heterogeneity. Despite considerable heritability (~25%), the etiology in most patients remains elusive. Moreover, understanding correlations between clinical manifestations and genetic variants has become increasingly complex. Methods Exome sequencing was conducted on 1924 genetically unsolved, mainly late‐onset isolated dystonia patients, recruited primarily from two dystonia registries (DysTract and the Dystonia Coalition). Rare variants in genes previously linked to dystonia (n = 406) were examined, confirmed via Sanger sequencing, and analyzed for segregation when possible. Results We identified 137 distinct likely pathogenic/pathogenic variants (according to ACMG criteria) across 51 genes in 163/1924 patients, including 153/1895 index patients (diagnostic yield 8.1%). The strongest predictors of a genetic diagnosis were generalized dystonia (28.6% yield) and age at onset (20.4% yield in patients with onset < 30 years). Notably, 56.2% of these variants were novel, with recurrent variants in EIF2AK2, VPS16, KCNMA1, and SLC2A1. Additionally, 321 index patients (16.9%) harbored variants of uncertain significance in 102 genes. The most frequently implicated genes included VPS16, THAP1, GCH1, SGCE, GNAL, and KMT2B. Presumably pathogenic variants in less well‐established dystonia genes were also found, including KCNMA1, KIF1A, and ZMYND11. At least six variants (in ADCY5, GNB1, IRF2BPL, KCNN2, KMT2B, and VPS16) occurred de novo, supporting pathogenicity.
Interpretation This study provides valuable insights into the genetic landscape of dystonia, underscores the utility of exome sequencing for diagnosis, substantiates several candidate genes, and expands the phenotypic spectrum of some genes to include prominent, sometimes isolated dystonia.
Abstract Background As clinical genetics evolves towards the broader field of clinical genomics, the diagnostic approach to rare diseases is undergoing a paradigm shift. This transformation has significantly impacted rare disease diagnostics, increasingly done through gene panels, whole exome and whole genome sequencing. To advance beyond genomics into precision medicine and encompass the breadth of relevant clinical scenarios, a true systems shift is required that challenges conventional barriers and enables the formation of cross-disciplinary, integrated environments. Methods The Genomic Medicine Center Karolinska Rare Diseases (GMCK-RD) has, for the past 10 years, brought together healthcare and academia to enable large-scale genome sequencing in a clinical diagnostics context. Within GMCK-RD, experts from various medical disciplines collaborate closely with clinical geneticists, bioinformaticians, and researchers to integrate genome sequencing into healthcare. Results In total, 15 644 individuals with suspected rare diseases were analyzed using clinical genome sequencing, including pediatric (48%), adult (48%) and fetal (4%) samples. The overall diagnostic yield was 22.6% providing a diagnosis for 3 538 individuals with variants in 1 570 genes. Moreover, a rare disease analysis tool suite developed and validated in house includes a bioinformatic pipeline allowing for comprehensive data analysis covering a wide range of genetic variants including SNVs, INDELs, repeat expansions, uniparental disomies, balanced and unbalanced structural variants as well as insertions of mobile elements.
Results are visualized and interpreted in custom-developed decision support systems functioning as an interpretation portal as well as a knowledge-base to capture the interpretation efforts made in a structured format allowing future secondary use. Conclusions Altogether, GMCK-RD has shifted healthcare in our region towards precision diagnostics. We emphasize the need to transition from traditional clinical genetic diagnostics to a broader clinical genomics approach. Beyond this shift, we advocate integrating genomics with specialized clinical and laboratory medicine, a concept pioneered for inborn errors of metabolism (IEM) with stepwise spread to additional disease groups. In this model, a multidisciplinary unit combines screening, targeted diagnostics, individualized treatment, and long-term patient follow-up. Here we provide a road map and guide for inspiration for centers aiming to implement genome sequencing in rare disease diagnostics.
In contrast to hereditary angioedema (HAE) due to C1-inhibitor deficiency, the detection of pathogenic variants in genes linked to HAE with normal C1 inhibitor levels (HAE-nC1INH) is required for the diagnosis of the corresponding types of the disease. The mainstreaming of genomic technology and the increasing use of next generation sequencing have increased the possibility of an unintentional detection of HAE-nC1INH pathogenic variants and allowed the incidental finding of variants of uncertain significance (VUS) in the relevant genes. Apart from F12 and PLG pathogenic variants, the current level of evidence on the prevalence and penetrance of variants associated with HAE-nC1INH does not support the reporting of their incidental finding. On the other hand, although VUS should not be used in clinical decision-making, further consideration is warranted (a) for VUS found in exon 9 of the F12 gene after a diagnostic genetic analysis of individuals either with or without personal or family history of angioedema, and (b) for VUS found in any of the other genes linked to HAE-nC1INH, after genetic analysis performed in the context of differential diagnosis of angioedema cases. Given the complexity of interpreting, reporting and communicating incidental findings, a close partnership between patients, clinicians, laboratory geneticists and genetic counsellors is essential to optimize the management of these results.
Abstract Purpose Newborn screening (NBS) is an effective measure of secondary prevention. The application of genomic sequencing in population-based screening would enable further expansions of the NBS disease panel and a genomic NBS (gNBS). The selection of NBS target diseases is still based on the Wilson and Jungner screening criteria from 1968, which are considered incomplete, making the development of new criteria necessary. Methods The present work aims to establish a multi-dimensional framework for future gNBS programs. An interdisciplinary expert panel comprising researchers from pediatric and adolescent medicine, human genetics, ethics, medical psychology, law, and patient representatives used a nominal group technique-like multi-stage consensus process to define criteria for gNBS, considering ethical, legal, and social implications, medical aspects, and patient perspectives. Results Overall, 18 criteria were developed, clustered into four subcategories: I. Clinical criteria (characteristics of the target disease); II. Diagnostic criteria (requirements of the test); III. Therapeutic-interventional criteria (prerequisites of the intervention); IV. Program management criteria (requirements of the program). Subcategories I–III define selection criteria for target diseases, subcategory IV defines criteria for how to establish and manage the program. Conclusion This multi-dimensional framework serves as a well-balanced basis for developing thoroughly revised and internationally accepted consensus screening criteria.
ABSTRACT Rare diseases affect 6% of Western societies and are a leading cause of pediatric mortality. The popularization of Next Generation Sequencing technologies, especially exome sequencing (ES), revolutionized the diagnosis of children with rare disease. Still, most patients face extensive diagnostic odysseys and remain undiagnosed. Recently, genome sequencing (GS) emerged in the hope that its broader coverage could improve diagnostic yield compared to ES. This study aims to systematically review and meta‐analyze the diagnostic power of ES versus GS in pediatric populations with rare diseases. A systematic review of PubMed, Cochrane, and Embase databases was performed on December 11, 2024, for nonrandomized studies comparing GS diagnostic yield with ES or ES reanalysis in pediatric populations with rare disease. Statistical analyses were performed in R software version 4.4.2, and the study was registered on PROSPERO (CRD42024619640). In a cohort of 1684 patients from 11 studies, GS‐specific diagnostic yield was 7.0% (95% CI: 5.1%–9.5%; p < 0.0001). Subgroup analysis indicated that ES reanalysis and GS provided statistically similar diagnostic yields in patients with prior negative ES, as their confidence intervals overlapped. The diagnostic rate of ES reanalysis was 14.2% (8.9%–21.8%; p < 0.0001), while total GS diagnostic yield in the same cohort was 24.1% (17.6%–31.9%; p < 0.0239). This meta‐analysis showed that GS could establish molecular diagnoses in 7.0% more patients after negative ES. However, similar diagnostic yields from ES reanalysis and GS emphasize the importance of periodic reanalysis and variant reinterpretation in diagnostic workflows.
Abstract Background: The utilization of genome/exome sequencing for managing cancer patients is rising. However, deciphering the genomic variations and determining their pathogenicity can be intricate. The widely accepted practice of using in silico pathogenicity predictions as evidence when interpreting genetic variants is an integral part of standard variant classification guidelines. Several algorithms have been developed and evaluated to predict deleterious variants. The objective of this study was to assess the performance of 34 pathogenicity prediction tools (such as BayesDel, CADD, ClinPred, DANN, DEOGEN2, Eigen-PC, FATHMM, GERP++, M-CAP, MetaLR, MutationAssessor, MutationTaster, MutPred, Polyphen2, PROVEAN, REVEL, and SIFT) on the latest version of the ClinVar dataset and to apply them to the exome sequence data of acute myeloid leukemia (AML) patients to assess the performance of these tools in clinical samples. Results: In predicting the pathogenicity of genetic variants, 6 in silico tools had specificity > 0.9 and 14 tools had sensitivity > 0.9 on the ClinVar dataset. Further, three tools, BayesDel, MetaRNN, and ClinPred, demonstrated the highest accuracy, achieving sensitivity of 0.9337–0.9627 and specificity of 0.9245–0.9513. Applying these 3 tools to the AML exome dataset of the present study identified 1421, 1235, and 2033 potential deleterious variants in 410 AML-associated genes, respectively. Conclusion: This comparison highlighted the in silico tools able to predict the potential pathogenicity of variants that might otherwise have been classified as variants of uncertain significance (VUS). The findings can help in genetic risk assessment and targeted therapeutic approaches in AML.
Genome sequencing is widely used today in research and in clinical practice. It generates an enormous amount of raw data which, once properly analyzed and interpreted, are communicated to the sequenced person together with genetic counseling. In the healthcare setting, the interpretive report is included in the medical record, but the raw data are archived separately. However, a growing number of people are asking sequencing institutions for access to their personal raw (uninterpreted) data, for a wide variety of reasons. This request is currently the subject of intense debate, largely because of the many possibilities for reuse. To reflect on this, the article first covers some basic concepts about the personal genome and raw genomic data. It then analyzes the factors that have contributed to the growing availability of genomic information and the possibilities of reusing the raw sequence for additional purposes: clinical, health-related, research, or even recreational. Requests for access to genomic data raise ethical, legal, and practical questions, so it is worth reviewing the current storage and access policies of European and American sequencing institutions. Building on this background, a second article (Acceso a la secuencia del genoma II Consideraciones éticas, legales y sociales) analyzes these issues specifically and in depth.
Objective: To analyze the genetic characteristics and clinical manifestations of children with KBG syndrome due to microdeletions. Methods: A retrospective case summary was conducted. Four children diagnosed with KBG syndrome due to 16q24.3 microdeletion at Children's Hospital of Zhengzhou University from July 2021 to April 2024 were enrolled. Their clinical manifestations, biochemical parameters, imaging data, whole-exome sequencing results, treatments and follow-up outcomes were reviewed. Results: The cohort included two males and two females (diagnosed at 81, 18, 26, and 56 months of age, respectively), from four unrelated families. All patients exhibited peculiar facial features (Cupid's bow-shaped lips, prominent ears, thick eyebrows), skeletal abnormalities (brachydactyly, abnormal ribs, short stature, etc.), ocular anomalies (astigmatism, strabismus, amblyopia, etc.), intrauterine growth restriction, and developmental retardation. Cases 2, 3, and 4 had cranial imaging abnormalities, including thin anterior pituitary lobes with pineal cyst, left ventricular cyst, and abnormal pituitary stalk or lateral ventricles with sinusitis, respectively. Two children had intellectual disability, two had congenital heart disease, and one had delayed bone age and hair abnormalities. Whole-exome sequencing confirmed 16q24.3 microdeletions encompassing the ANKRD11 gene in all four cases. Two children treated with recombinant human growth hormone achieved height increments of 1.5 s and 0.4 s, respectively. Conclusions: Typical features of 16q24.3 microdeletion-induced KBG syndrome include peculiar facial features, macrodontia, skeletal anomalies, neurological abnormalities, and ocular defects. Genetic testing is essential for definitive diagnosis. The treatment of KBG syndrome requires early diagnosis and multidisciplinary collaboration to implement individualized treatment for multisystem symptoms.
Abstract Structural variants (SVs) of the nebulin gene (NEB), including intragenic duplications, deletions, and copy number variation of the triplicate region, are an established cause of recessively inherited nemaline myopathies and related neuromuscular disorders. Large deletions have been shown to cause dominantly inherited distal myopathies. Here we provide an overview of 35 families with muscle disorders caused by such SVs in NEB. Using custom Comparative Genomic Hybridization arrays, exome sequencing, short-read genome sequencing, custom Droplet Digital PCR, or Sanger sequencing, we identified pathogenic SVs in 35 families with NEB-related myopathies. In 23 families, recessive intragenic deletions and duplications or pathogenic gains of the triplicate region segregating with the disease in compound heterozygous form, together with a small variant in trans, were identified. In two families the SV was, however, homozygous. Eight of these families have not been described previously. In 12 families with a distal myopathy phenotype (of which 10 are previously unpublished), eight unique, large deletions encompassing 52–97 exons in either heterozygous (n = 10) or mosaic (n = 2) state were identified. In the families where inheritance was recessive, no correlation could be made between the types of variants and the severity of the disease. In contrast, all patients with large dominant deletions in NEB had milder, predominantly distal muscle weakness. For the first time, we establish a clear and statistically significant association between large NEB deletions and a form of distal myopathy. In addition, we provide the hitherto largest overview of the spectrum of SVs in NEB.
Abstract Status epilepticus (SE) is a common, life‐threatening neurologic emergency. Understanding population‐level outcomes after SE requires a validated case definition, yet International Classification of Diseases, 10th Revision, Clinical Modification (ICD‐10‐CM) codes of SE have not been well‐validated in US populations since adoption in 2015. We aimed to determine whether the ICD‐10‐CM code‐based definitions accurately identify SE in the in‐hospital setting. The population included all ages (excluding neonates) admitted to a Mount Sinai Health System (MSHS) intensive care unit (ICU) in 2019. A data collection form was developed, tested, and used by trained reviewers. Every admission in a random month (November) was reviewed to determine whether all SE cases had at least one code for epilepsy, seizure, or convulsion followed by all charts with an ICD‐10‐CM diagnosis code for seizure/epilepsy/convulsion in 2019. Chart review data were linked to MSHS electronic medical record data. Sensitivity (Sn), specificity (Sp), negative predictive value (NPV), and positive predictive value (PPV) with 95% confidence intervals (CIs) and Youden index were calculated for ICD‐10 coding of SE (G40.xx1 or G40.xx3, as G41 was not adopted in the United States). MSHS had 13 694 ICU admissions in 2019, for which 1851 charts were reviewed and of which 173 were admissions with definite SE. The ICD‐10‐CM case definition (G40.xx1) has an Sn of 68.7% (95% CI = 61.5–75.8) and Sp of 92.6% (95% CI = 91.4–93.8). PPV was 47.4% (95% CI = 41–53.8), and NPV was 96.8% (95% CI = 95.9–97.6). Youden index was 61.3%. ICD‐10‐CM coding for SE has high specificity but limited sensitivity. 
These findings align with SE prevalence studies showing a decrease in prevalence with the change from ICD‐9‐CM to ICD‐10‐CM, which may be related to the United States' unique adoption of ICD‐10‐CM, which did not include the standalone SE code (G41). Our findings emphasize the importance of revision and improvement of coding practices to best represent the prevalence of SE, and of consideration when planning for the next iteration of ICD coding.
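The validation statistics named in the abstract above (Sn, Sp, PPV, NPV, Youden index) all follow from a 2×2 table of code-based classification against chart review. The following is a minimal sketch, not the authors' analysis code: the abstract does not report the underlying cell counts, so the counts in `example` are hypothetical, and the names `Confusion`, `diagnostic_metrics`, and `youden_index` are illustrative. Only the Sn and Sp values passed to `youden_index` come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 table: diagnosis code (test) versus chart review (reference)."""
    tp: int  # code present, SE confirmed
    fp: int  # code present, SE not confirmed
    fn: int  # code absent, SE confirmed
    tn: int  # code absent, SE not confirmed


def youden_index(sn: float, sp: float) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    return sn + sp - 1.0


def diagnostic_metrics(c: Confusion) -> dict:
    """Return Sn, Sp, PPV, NPV and Youden index as fractions."""
    sn = c.tp / (c.tp + c.fn)
    sp = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    return {"Sn": sn, "Sp": sp, "PPV": ppv, "NPV": npv,
            "Youden": youden_index(sn, sp)}


# Hypothetical counts, chosen only to exercise the formulas.
example = diagnostic_metrics(Confusion(tp=80, fp=50, fn=20, tn=850))

# Youden index recomputed from the Sn and Sp reported in the abstract.
reported_youden = youden_index(0.687, 0.926)
```

With the reported Sn of 68.7% and Sp of 92.6%, `reported_youden` evaluates to about 0.613, consistent with the 61.3% stated in the abstract.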
ABSTRACT The research of single gene‐related disorders or pathogenic copy‐number variations (CNVs) has given a significant impetus to the shift from a diagnostic work‐up focused on epileptic syndromes to genomic approaches in individuals with severe pediatric‐onset epilepsies and in developmental and epileptic encephalopathies. Genome‐wide association studies (GWAS) have identified various susceptibility loci for common epilepsies and highlighted a strong predisposing role of common variants in several genes involved in well‐known monogenic diseases. The largest GWAS identified eight major loci with stronger genome‐wide significance for epilepsy, regardless of the underlying epileptic syndrome: 2q24.3, 2p16.1, 4p15.1, 7q21.11, 8p23.1, 9q21.13, 10q24.32, and 16q12.1. Loci 2p16.1 and 2q24.3 occurred more frequently in patients with genetic generalized epilepsies. Loci 4p12, 8q23.1 and 16p11.2 achieved a high genome‐wide significance for Juvenile Myoclonic Epilepsy. Childhood Absence Epilepsy showed genome‐wide significant association with 2p16.1 and 2q22.3. The loci 3q25.31, 6q22.31 and 2q24.3 were significantly associated with non‐acquired focal epilepsies. Polygenic risk scores (PRS) quantify in a single score the cumulative effect of several common genetic variants, each of which individually contributes minimally to disease susceptibility. The impact of PRS on clinical practice might be relevant for epilepsy risk prediction in groups of patients at high risk of developing epilepsy in the near future. Elevated PRS values have been observed in genetic generalized epilepsies, particularly in familial forms, females, and patients with previous seizure events. Among comorbidities associated with epilepsy, depression, psychosis, and attention‐deficit/hyperactivity disorder (ADHD) showed significantly elevated PRS.
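The idea of a polygenic risk score described above, many small per-variant effects summed into one number, can be sketched as a weighted sum of risk-allele dosages. This is a generic illustration and not a method taken from the abstract: the variant identifiers, effect weights, and dosages below are hypothetical, and `polygenic_risk_score` is an illustrative name.

```python
def polygenic_risk_score(weights: dict, dosages: dict) -> float:
    """Sum of per-variant effect weight times risk-allele dosage (0, 1 or 2).

    `weights` maps variant id -> effect size (for example a GWAS log odds ratio).
    `dosages` maps variant id -> number of risk alleles carried.
    """
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"no genotype for variants: {sorted(missing)}")
    return sum(w * dosages[v] for v, w in weights.items())


# Hypothetical weights and genotypes, for illustration only.
weights = {"variant_A": 0.10, "variant_B": -0.05, "variant_C": 0.20}
dosages = {"variant_A": 2, "variant_B": 1, "variant_C": 0}

score = polygenic_risk_score(weights, dosages)  # 0.10*2 - 0.05*1 + 0.20*0
```

In practice a raw score like this is usually compared against the score distribution of a reference population rather than interpreted on its own.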
Introduction and Objective: Mutations in the ABCC8 gene, encoding the SUR1 subunit of the ATP-sensitive potassium channel, are classically associated with neonatal diabetes (NDM). However, some variants in this gene can result in MODY (Maturity-Onset Diabetes of the Young), with milder phenotypes and variable age of onset. This study aims to describe the phenotypic variability in a Brazilian family with MODY diabetes due to an ABCC8 gene mutation. Methods: This is a descriptive study of a Brazilian family with multiple members affected by diabetes. Next-generation sequencing was used to perform genetic sequencing and the identified pathogenic variant was confirmed through Sanger sequencing and classified according to ACMG criteria. Clinical data, including age at diagnosis, HbA1c, presence of diabetes complications, and sulfonylurea treatment, were analyzed. Results: Genetic sequencing of the proband identified the pathogenic variant c.2473C>T/p.R825W in ABCC8. This variant was also identified in four other family members (two offspring and two sisters). Despite the shared genetic etiology, significant phenotypic variability was observed. Age at diagnosis ranged from 16 to 51 years. HbA1c levels at initial evaluation varied from 5.8 to 9.2%. Two offspring and one sister were asymptomatic at diagnosis, while the other sister had pre-diabetes. Only the proband exhibited diabetes-related complications (diabetic retinopathy, myocardial dysfunction). The sulfonylurea treatment response was heterogeneous, with variable doses required for glycemic control. The proband used glibenclamide 1.25 mg, while the sisters used gliclazide 30 mg and glibenclamide 5 mg. The two offspring did not require antidiabetic medication. Conclusion: The location of the pathogenic variant within the ABCC8 gene influences the disease phenotype, resulting in either MODY or NDM. Our study highlights the phenotypic variability associated with ABCC8 mutations and underscores the importance of precise molecular diagnosis in guiding appropriate treatment strategies. Disclosure A.C. Santomauro Junior: None. A.D. Costa-Riquetto: None. T.G. Amorim: None. F.R. Barros: None. E.B. Val: None. A. Jorge: None. M.G. Teles: None. Funding FAPESP (#2013/19920-2)
Introduction and Objective: Clinical/phenotypic associations with various partitioned polygenic risk scores (pPRS) have been reported. However, there is limited data on how they aggregate into patient subgroups. Methods: EMR from T2D patients (n=12,136) were analyzed. K-means clustering was conducted on the normalized pPRS published by Suzuki et al (Nature, 2024), yielding 7 T2D patient subgroups, ranging in size from 1658 to 1788 subjects. Results: Each of the 7 subgroups presented a distinct pattern of pPRS contributions, with a different pPRS being particularly prominent in each subgroup. The subjects with T2D in subgroup 1 (driven by the beta cell with positive proinsulin cluster) were older and predominantly male (p<.002), while patients in subgroup 5 (dominated by the residual glycemic cluster) were younger, leaner, and had lower triglyceride and higher HDL-C levels (p<.0001). Subgroup 3 (driven by the obesity cluster) presented with the highest BMI (p<.0001). Subgroup 7 (led by the metabolic syndrome cluster) had higher triglyceride and lower HDL-C levels, with the lowest proportion of males (p<.002). Finally, a higher proportion of individuals in subgroups 4 and 6 (both driven by the beta cell with negative proinsulin cluster) were treated with injection therapies (mainly insulin), using subgroup 1, which had the lowest injection rate, as the reference (subgroup 4: OR 1.28, 95% CI 1.05-1.54; subgroup 6: OR 1.29, 95% CI 1.07-1.56). Conclusion: Our study indicates that T2D genetic pPRS differentially aggregate among subgroups of individuals with T2D in Taiwan Asians, supporting that genetic heterogeneity underlies the clinical heterogeneity within T2D. Disclosure W. Sheu: None. C. Lee: None. T. Hsiao: None. Y. Chen: None. J.I. Rotter: None.
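The clustering step described in the Methods above, K-means on normalized pPRS vectors, can be illustrated with a small pure-Python version of Lloyd's algorithm. This is a sketch under stated assumptions and not the study's pipeline: the abstract does not say which software or initialization was used, the two-dimensional toy points stand in for per-patient pPRS vectors that are assumed to be already normalized, and the deterministic "first k points" initialization is a simplification chosen so the example is reproducible.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: returns (labels, centroids).

    Initialization uses the first k points (a simplification for
    reproducibility; real analyses typically use random restarts or k-means++).
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so centroids are too
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical, already-normalized score vectors for six individuals.
toy_scores = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0),
              (5.1, 5.0), (0.0, 0.1), (5.0, 5.1)]
labels, centroids = kmeans(toy_scores, k=2)
```

On this toy input the three points near the origin end up in one cluster and the three points near (5, 5) in the other; the study itself reports seven subgroups from a much higher-dimensional input.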
Abstract Validation of genomic predictions or polygenic risk scores is key for model selection and evaluating the performance of the chosen prediction machinery. Non-parametric validation, such as cross-validation, is popular but does not account for population structure or for the fact that the interest could be in validating a set of individuals and not the entire population. Semi-parametric methods, such as the LR method, which also uses removed records to validate predictions, account for population structure and allow a focus on a specific set of individuals of interest. Confidence intervals are obtained using semi-parametric methods without the need for repeated cross-validation. We developed a tool within the Blupf90 software suite, called validationf90, that allows researchers to conduct semi-parametric validation from the solutions obtained from that software suite. validationf90 calculates different validation statistics and their confidence intervals for a pre-defined set of individuals of interest, reflecting the bias and accuracy of genomic predictions. The program allows for genomic predictions obtained from frequentist and Bayesian methods, as well as for categorical data. validationf90 can validate any model supported by the Blupf90 software suite and can be used with animal, plant, and human datasets. Predictions obtained with other software can be provided to validationf90 as long as the input format matches the Blupf90 format.
KBG syndrome (KBGS) is a multisystem disorder caused by mutation of the ANKRD11 gene. The main manifestations of KBGS include hearing loss, feeding difficulties, craniofacial abnormalities, tooth deformity, and developmental delay (delayed overall development, convulsions, and intellectual abnormalities). Only 10%–26% of patients with KBG syndrome have congenital heart disease, including atrial and ventricular septal defects. Here, we report a case of KBG syndrome in a preterm newborn with low birth weight, a huge ventricular septal defect, and a congenital chylothorax. Whole-exome sequencing detected an ANKRD11 gene mutation in the infant. The finding expands the understanding of the clinical and genetic phenotype. Multidisciplinary management of complex KBG syndrome, including interventional occlusion, nutritional management, and rehabilitation training, can improve the prognosis and outcome.
Introduction: In North Africa, genetic diseases are widespread, but under-studied due to limited research resources. This study used exome sequencing to identify disease-causing variants in a large series of Moroccan patients with suspected genetic diseases. Methods: A cohort of 30 patients with genetic diseases from the BRO Biobank underwent exome sequencing. Candidate variants were evaluated by segregation analysis and molecular modelling. Results: Thirty-one variants were identified in 27 known genes. Interestingly, 54.8% of these variants were novel and therefore could be specific to the Moroccan population. Pathogenic or likely pathogenic disease-causing variants were identified in 22 of 30 patients, leading to a genetic testing yield of 73.3%. Moreover, the identified variants, classified as of uncertain significance, likely benign or benign, were predicted to alter protein structure using in silico modelling of 3D protein structure. The diagnosis was changed in 23% of patients with suspected genetic syndromes, and the etiology was determined in all patients with unrecognizable genetic disorder. Conclusion: This study represents the largest biobank-based study of inherited diseases in a North African country. It illustrates the genetic variability of the Moroccan population and improves our understanding of genotype-phenotype correlations. Furthermore, the relatively high yield of genetic testing obtained in this study justifies the need to implement exome sequencing in the clinical setting in Morocco for better genetic diagnosis.
ABSTRACT Xia–Gibbs syndrome (XGS) is a rare intellectual disability (ID) syndrome caused by de novo AHDC1 pathogenic variants. We characterized clinical and molecular features of 16 Brazilian patients with XGS. Patient data were collected through semistructured interviews with family members, reanalysis of previous health and genetic assessments, and clinical reports from physicians. Genomic variants and their segregation were validated via Sanger sequencing. Statistical analyses were conducted to evaluate genotype–phenotype associations. Twelve novel AHDC1 causative variants were documented. ID, hypotonia, motor developmental delay, and varied nonspecific facial dysmorphisms were observed in all patients, while speech impairment and autism spectrum disorder were present in nearly all. Three frequent phenotypes, not previously reported, were identified: hyperphagia/food obsession, genital/gonadal alterations in males, and shortening of the Achilles tendon. Additionally, our findings provide statistically significant support for previously reported genotype–phenotype associations between pathogenic variants in the first half of the AHDC1 coding region and the occurrence of epilepsy and scoliosis. We also propose a novel association between N‐terminal variants and developmental regression. In summary, our results broaden the clinical phenotype of XGS, with musculoskeletal and genital/gonadal abnormalities highlighting the multisystem involvement in this condition, beyond neurodevelopmental deficits. Comprehensive phenotypic assessments in all identified XGS cases are recommended to accurately recognize and associate novel clinical signs with XGS.
Background: The growing prevalence of neuropsychiatric disorders is becoming a major health challenge. Traditional pharmacotherapies face limitations, making drug repurposing a valuable strategy. However, high-throughput screening approaches for these conditions are scarce. Methods: This study leveraged exposure data from the UK Biobank Neale Lab (N = 361,141) and outcome data from the FinnGen database (N = approximately 410,000) to employ Mendelian Randomization (MR) analyses and identify potential drug repurposing candidates for neuropsychiatric disorders. Sensitivity, Linkage Disequilibrium Score Correlation (LDSC), and Bayesian Colocalization (COLOC) analyses were conducted to ensure the robustness and reliability of our findings. Results: Using the IVW method, seven medications with negative causal associations with neuropsychiatric disorders were identified. Pregabalin, bumetanide, and prednisolone were associated with reduced anxiety (beta = -7.28, p = 4.00e-03; beta = -2.24, p = 6.00e-03; beta = -1.74, p = 2.84e-03). Vitamin B1 preparations showed an inverse association with dementia (beta = -2.47, p = 1.51e-03), Creon E/C granules with epilepsy (beta = -4.99, p = 3.91e-03), Pentasa SR 250 mg with multiple sclerosis (beta = -3.95, p = 3.83e-03), and zolmitriptan with stroke excluding subarachnoid hemorrhage (beta = -1.61, p = 6.00e-03). Sensitivity analyses confirmed these findings, whereas the LDSC and COLOC analyses provided additional support. Conclusion: MR-based drug repurposing is a promising approach for the treatment of neuropsychiatric disorders. Further validation is necessary to effectively integrate these medications into clinical practice.
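The IVW (inverse-variance weighted) method mentioned in the Results above combines per-variant Wald ratios into one causal estimate. The sketch below shows the standard fixed-effect form with first-order weights; it is not the study's code, the abstract does not state which software or IVW variant (fixed or random effects) was used, and the instrument summary statistics are hypothetical. `ivw_estimate` is an illustrative name.

```python
import math


def ivw_estimate(instruments):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each instrument is (beta_exposure, beta_outcome, se_outcome).
    The Wald ratio beta_outcome / beta_exposure gets the first-order weight
    beta_exposure**2 / se_outcome**2, which gives
        beta_IVW = sum(bx * by / se**2) / sum(bx**2 / se**2)
        se_IVW   = sqrt(1 / sum(bx**2 / se**2))
    """
    num = sum(bx * by / se ** 2 for bx, by, se in instruments)
    den = sum(bx ** 2 / se ** 2 for bx, _, se in instruments)
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics; every Wald ratio here equals 0.5.
instruments = [
    (0.10, 0.05, 0.01),
    (0.20, 0.10, 0.02),
]
beta_ivw, se_ivw = ivw_estimate(instruments)
```

Because both hypothetical instruments have the same Wald ratio, the pooled estimate equals that ratio; with real data the instruments disagree, and heterogeneity between them is what the sensitivity analyses probe.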