Gene Regulatory Network Analysis

Description

This cluster of papers focuses on the stochastic behavior and regulation of gene networks, covering stochastic gene expression, synthetic biology, cellular noise, network inference, genetic circuits, biochemical modeling, cell signaling dynamics, and single-cell analysis. The research examines the inherent stochasticity of gene regulatory networks and its implications for cellular function.

Keywords

Stochastic Gene Expression; Gene Regulatory Networks; Synthetic Biology; Cellular Noise; Systems Biology; Network Inference; Genetic Circuits; Biochemical Modeling; Cell Signaling Dynamics; Single-Cell Analysis

Fluctuations in rates of gene expression can produce highly erratic time patterns of protein production in individual cells and wide diversity in instantaneous protein concentrations across cell populations. When two independently produced regulatory proteins acting at low cellular concentrations competitively control a switch point in a pathway, stochastic variations in their concentrations can produce probabilistic pathway selection, so that an initially homogeneous cell population partitions into distinct phenotypic subpopulations. Many pathogenic organisms, for example, use this mechanism to randomly switch surface features to evade host responses. This coupling between molecular-level fluctuations and macroscopic phenotype selection is analyzed using the phage λ lysis-lysogeny decision circuit as a model system. The fraction of infected cells selecting the lysogenic pathway at different phage:cell ratios, predicted using a molecular-level stochastic kinetic model of the genetic regulatory circuit, is consistent with experimental observations. The kinetic model of the decision circuit uses the stochastic formulation of chemical kinetics, stochastic mechanisms of gene expression, and a statistical-thermodynamic model of promoter regulation. Conventional deterministic kinetics cannot be used to predict statistics of regulatory systems that produce probabilistic outcomes. Rather, a stochastic kinetic analysis must be used to predict statistics of regulatory outcomes for such stochastically regulated systems.
In cellular regulatory networks, genetic activity is controlled by molecular signals that determine when and how often a given gene is transcribed. In genetically controlled pathways, the protein product encoded by one gene often regulates expression of other genes. The time delay, after activation of the first promoter, to reach an effective level to control the next promoter depends on the rate of protein accumulation. We have analyzed the chemical reactions controlling transcript initiation and translation termination in a single such “genetically coupled” link as a precursor to modeling networks constructed from many such links. Simulation of the processes of gene expression shows that proteins are produced from an activated promoter in short bursts of variable numbers of proteins that occur at random time intervals. As a result, there can be large differences in the time between successive events in regulatory cascades across a cell population. In addition, the random pattern of expression of competitive effectors can produce probabilistic outcomes in switching mechanisms that select between alternative regulatory paths. The result can be a partitioning of the cell population into different phenotypes as the cells follow different paths. There are numerous unexplained examples of phenotypic variations in isogenic populations of both prokaryotic and eukaryotic cells that may be the result of these stochastic gene expression mechanisms.
To understand biology at the system level, we must examine the structure and dynamics of cellular and organismal function, rather than the characteristics of isolated parts of a cell or organism. Properties of systems, such as robustness, emerge as central issues, and understanding these properties may have an impact on the future of medicine. However, many breakthroughs in experimental devices, advanced software, and analytical methods are required before the achievements of systems biology can live up to their much-touted potential.
Cells are intrinsically noisy biochemical reactors: low reactant numbers can lead to significant statistical fluctuations in molecule numbers and reaction rates. Here we use an analytic model to investigate the emergent noise properties of genetic systems. We find for a single gene that noise is essentially determined at the translational level, and that the mean and variance of protein concentration can be independently controlled. The noise strength immediately following single gene induction is almost twice the final steady-state value. We find that fluctuations in the concentrations of a regulatory protein can propagate through a genetic cascade; translational noise control could explain the inefficient translation rates observed for genes encoding such regulatory proteins. For an autoregulatory protein, we demonstrate that negative feedback efficiently decreases system noise. The model can be used to predict the noise characteristics of networks of arbitrary connectivity. The general procedure is further illustrated for an autocatalytic protein and a bistable genetic switch. The analysis of intrinsic noise reveals biological roles of gene network structures and can lead to a deeper understanding of their evolutionary origin.
Advanced technologies and biology have extremely different physical implementations, but they are far more alike in systems-level organization than is widely appreciated. Convergent evolution in both domains produces modular architectures that are composed of elaborate hierarchies of protocols and layers of feedback regulation, are driven by demand for robustness to uncertain environments, and use often imprecise components. This complexity may be largely hidden in idealized laboratory settings and in normal operation, becoming conspicuous only when contributing to rare cascading failures. These puzzling and paradoxical features are neither accidental nor artificial, but derive from a deep and necessary interplay between complexity and robustness, modularity, feedback, and fragility. This review describes insights from engineering theory and practice that can shed some light on biological complexity.
Genetically identical cells and organisms exhibit remarkable diversity even when they have identical histories of environmental exposure. Noise, or variation, in the process of gene expression may contribute to this phenotypic variability. Recent studies suggest that this noise has multiple sources, including the stochastic or inherently random nature of the biochemical reactions of gene expression. In this review, we summarize noise terminology and comment on recent investigations into the sources, consequences, and control of noise in gene expression.
The stochastic dynamical behavior of a well-stirred mixture of N molecular species that chemically interact through M reaction channels is accurately described by the chemical master equation. It is shown here that, whenever two explicit dynamical conditions are satisfied, the microphysical premise from which the chemical master equation is derived leads directly to an approximate time-evolution equation of the Langevin type. This chemical Langevin equation is the same as one studied earlier by Kurtz, in contradistinction to some other earlier proposed forms that assume a deterministic macroscopic evolution law. The novel aspect of the present analysis is that it shows that the accuracy of the equation depends on the satisfaction of certain specific conditions that can change from moment to moment, rather than on a static system size parameter. The derivation affords a new perspective on the origin and magnitude of noise in a chemically reacting system. It also clarifies the connection between the stochastically correct chemical master equation, and the deterministic but often satisfactory reaction rate equation.
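For orientation, the chemical Langevin equation named in the abstract above is usually written in the form below. The symbols are assumptions, since the abstract gives none: X_i is the copy number of species i, a_j is the propensity function of reaction channel j, ν_ji is the change in species i caused by one firing of channel j, and the W_j are independent Wiener processes.

```latex
dX_i(t) = \sum_{j=1}^{M} \nu_{ji}\, a_j\big(X(t)\big)\, dt
        + \sum_{j=1}^{M} \nu_{ji}\, \sqrt{a_j\big(X(t)\big)}\; dW_j(t),
\qquad i = 1, \dots, N
```

Dropping the second (noise) sum leaves the deterministic reaction rate equation, which is the connection the abstract refers to.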
Protein and messenger RNA (mRNA) copy numbers vary from cell to cell in isogenic bacterial populations. However, these molecules often exist in low copy numbers and are difficult to detect in single cells. We carried out quantitative system-wide analyses of protein and mRNA expression in individual cells with single-molecule sensitivity using a newly constructed yellow fluorescent protein fusion library for Escherichia coli. We found that almost all protein number distributions can be described by the gamma distribution with two fitting parameters which, at low expression levels, have clear physical interpretations as the transcription rate and protein burst size. At high expression levels, the distributions are dominated by extrinsic noise. We found that a single cell's protein and mRNA copy numbers for any given gene are uncorrelated.
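The two-parameter gamma description above can be illustrated with a minimal Python sketch. It assumes a method-of-moments fit, which is a simplification chosen here and not necessarily the fitting procedure used in the paper. The names `fit_gamma_by_moments`, `a` (shape, read as bursts per cell cycle) and `b` (scale, read as mean burst size) are hypothetical, and the sample data are synthetic.

```python
import random
from statistics import fmean, pvariance


def fit_gamma_by_moments(counts):
    """Estimate gamma shape a and scale b from sample moments.

    For a gamma distribution, mean = a * b and variance = a * b**2,
    so a = mean**2 / variance and b = variance / mean (the Fano factor).
    """
    mean = fmean(counts)
    var = pvariance(counts, mu=mean)
    return mean * mean / var, var / mean


# Synthetic "protein counts" drawn with shape 4 and scale 10 (illustrative values).
rng = random.Random(0)
samples = [rng.gammavariate(4.0, 10.0) for _ in range(20000)]
a_hat, b_hat = fit_gamma_by_moments(samples)
print(f"shape a ~ {a_hat:.2f}, scale b ~ {b_hat:.2f}")
```

Under this reading, the squared coefficient of variation equals 1/a, so noise at low expression is set by burst frequency and not by burst size.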
In order to understand the functioning of organisms on the molecular level, we need to know which genes are expressed, when and where in the organism, and to what extent. The regulation of gene expression is achieved through genetic regulatory systems structured by networks of interactions between DNA, RNA, proteins, and small molecules. As most genetic regulatory networks of interest involve many components connected through interlocking positive and negative feedback loops, an intuitive understanding of their dynamics is hard to obtain. As a consequence, formal methods and computer tools for the modeling and simulation of genetic regulatory networks will be indispensable. This paper reviews formalisms that have been employed in mathematical biology and bioinformatics to describe genetic regulatory systems, in particular directed graphs, Bayesian networks, Boolean networks and their generalizations, ordinary and partial differential equations, qualitative differential equations, stochastic equations, and rule-based formalisms. In addition, the paper discusses how these formalisms have been used in the simulation of the behavior of actual regulatory systems.
The mitogen-activated protein kinase (MAPK) cascade is a highly conserved series of three protein kinases implicated in diverse biological processes. Here we demonstrate that the cascade arrangement has unexpected consequences for the dynamics of MAPK signaling. We solved the rate equations for the cascade numerically and found that MAPK is predicted to behave like a highly cooperative enzyme, even though it was not assumed that any of the enzymes in the cascade were regulated cooperatively. Measurements of MAPK activation in Xenopus oocyte extracts confirmed this prediction. The stimulus/response curve of the MAPK was found to be as steep as that of a cooperative enzyme with a Hill coefficient of 4-5, well in excess of that of the classical allosteric protein hemoglobin. The shape of the MAPK stimulus/response curve may make the cascade particularly appropriate for mediating processes like mitogenesis, cell fate induction, and oocyte maturation, where a cell switches from one discrete state to another.
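To make the "Hill coefficient of 4-5" concrete, the sketch below shows one common way to quantify the steepness of a stimulus/response curve: the effective Hill coefficient computed from the stimulus levels giving 10% and 90% of the maximal response. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def hill(x, k, n):
    """Fractional response of a Hill curve with half-maximal stimulus k."""
    return x**n / (k**n + x**n)


def ec(fraction, k, n):
    """Stimulus at which a Hill curve reaches the given fraction of maximum."""
    return k * (fraction / (1.0 - fraction)) ** (1.0 / n)


def effective_hill(ec10, ec90):
    """Effective Hill coefficient from the 10% and 90% response points.

    A non-cooperative (n = 1) response needs an 81-fold stimulus change
    to go from 10% to 90%, hence the log(81) in the numerator.
    """
    return math.log(81.0) / math.log(ec90 / ec10)


for n in (1, 5):
    fold = ec(0.9, 1.0, n) / ec(0.1, 1.0, n)
    print(f"n = {n}: {fold:.2f}-fold stimulus change spans 10% to 90% response")
```

With n = 5 the same 10% to 90% transition needs only about a 2.4-fold change in stimulus, which is the switch-like behavior the abstract describes.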
Machine learning was applied for the automated derivation of causal influences in cellular signaling networks. This derivation relied on the simultaneous measurement of multiple phosphorylated protein and phospholipid components in thousands of individual primary human immune system cells. Perturbing these cells with molecular interventions drove the ordering of connections between pathway components, wherein Bayesian network computational methods automatically elucidated most of the traditionally reported signaling relationships and predicted novel interpathway network causalities, which we verified experimentally. Reconstruction of network models from physiologically relevant primary single cells might be applied to understanding native-state tissue signaling biology, complex drug actions, and dysfunctional signaling in diseased cells.
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. It does not make any assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
There are two fundamental ways to view coupled systems of chemical equations: as continuous, represented by differential equations whose variables are concentrations, or as discrete, represented by stochastic processes whose variables are numbers of molecules. Although the former is by far more common, systems with very small numbers of molecules are important in some applications (e.g., in small biological cells or in surface processes). In both views, most complicated systems with multiple reaction channels and multiple chemical species cannot be solved analytically. There are exact numerical simulation methods to simulate trajectories of discrete, stochastic systems (methods that are rigorously equivalent to the Master Equation approach), but these do not scale well to systems with many reaction pathways. This paper presents the Next Reaction Method, an exact algorithm to simulate coupled chemical reactions that is also efficient: it (a) uses only a single random number per simulation event, and (b) takes time proportional to the logarithm of the number of reactions, not to the number of reactions itself. The Next Reaction Method is extended to include time-dependent rate constants and non-Markov processes and is applied to a sample application in biology (the lysis/lysogeny decision circuit of lambda phage). The performance of the Next Reaction Method on this application is compared with one standard method and an optimized version of that standard method.
The stochastic simulation algorithm (SSA) is an essentially exact procedure for numerically simulating the time evolution of a well-stirred chemically reacting system. Despite recent major improvements in the efficiency of the SSA, its drawback remains the great amount of computer time that is often required to simulate a desired amount of system time. Presented here is the “τ-leap” method, an approximate procedure that in some circumstances can produce significant gains in simulation speed with acceptable losses in accuracy. Some primitive strategies for control parameter selection and error mitigation for the τ-leap method are described, and simulation results for two simple model systems are exhibited. With further refinement, the τ-leap method should provide a viable way of segueing from the exact SSA to the approximate chemical Langevin equation, and thence to the conventional deterministic reaction rate equation, as the system size becomes larger.
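The core idea of τ-leaping can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from the paper: it uses a fixed leap size `tau` with no step-size control, a toy birth-death model (production at rate `k_prod`, degradation at rate `k_deg * x`), a crude clamp at zero instead of a proper negativity guard, and a hand-written Poisson sampler because Python's standard `random` module has none. All names and parameter values are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's multiplication method; only suitable for small lam."""
    if lam <= 0.0:
        return 0
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def tau_leap_birth_death(k_prod, k_deg, x0, tau, t_end, seed=0):
    """Fixed-step tau-leap for the reactions 0 -> X and X -> 0."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    traj = [(t, x)]
    while t < t_end:
        # Each channel fires a Poisson number of times during the leap,
        # with propensities frozen at their values at the start of the leap.
        births = poisson(rng, k_prod * tau)
        deaths = poisson(rng, k_deg * x * tau)
        x = max(x + births - deaths, 0)  # crude guard against negative counts
        t += tau
        traj.append((t, x))
    return traj


traj = tau_leap_birth_death(k_prod=10.0, k_deg=0.1, x0=0, tau=0.1, t_end=2000.0)
late = [x for t, x in traj if t >= 1000.0]
print("late-time mean copy number:", sum(late) / len(late))
```

The late-time mean should fluctuate around k_prod / k_deg = 100. Shrinking `tau` moves the method toward the exact SSA at the cost of more steps.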
Motivation: Simulation and modeling is becoming a standard approach to understand complex biochemical processes. Therefore, there is a big need for software tools that allow access to diverse simulation and modeling methods as well as support for the usage of these methods. Results: Here, we present COPASI, a platform-independent and user-friendly biochemical simulator that offers several unique features. We discuss numerical issues with these features; in particular, the criteria to switch between stochastic and deterministic simulation methods, hybrid deterministic–stochastic methods, and the importance of random number generator numerical resolution in stochastic simulation. Availability: The complete software is available in binary (executable) for MS Windows, OS X, Linux (Intel) and Sun Solaris (SPARC), as well as the full source code under an open source license from .
Systems biology studies biological systems by systematically perturbing them (biologically, genetically, or chemically); monitoring the gene, protein, and informational pathway responses; integrating these data; and ultimately, formulating mathematical models that describe the structure of the system and its response to individual perturbations. The emergence of systems biology is described, as are several examples of specific systems approaches.
Engineered systems are often built of recurring circuit modules that carry out key functions. Transcription networks that regulate the responses of living cells were recently found to obey similar principles: they contain several biochemical wiring patterns, termed network motifs, which recur throughout the network. One of these motifs is the feed-forward loop (FFL). The FFL, a three-gene pattern, is composed of two input transcription factors, one of which regulates the other, both jointly regulating a target gene. The FFL has eight possible structural types, because each of the three interactions in the FFL can be activating or repressing. Here, we theoretically analyze the functions of these eight structural types. We find that four of the FFL types, termed incoherent FFLs, act as sign-sensitive accelerators: they speed up the response time of the target gene expression following stimulus steps in one direction (e.g., off to on) but not in the other direction (on to off). The other four types, coherent FFLs, act as sign-sensitive delays. We find that some FFL types appear in transcription network databases much more frequently than others. In some cases, the rare FFL types have reduced functionality (responding to only one of their two input stimuli), which may partially explain why they are selected against. Additional features, such as pulse generation and cooperativity, are discussed. This study defines the function of one of the most significant recurring circuit elements in transcription networks.
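The count of eight structural types, four coherent and four incoherent, can be checked by enumeration. The sketch below assumes the usual convention that an FFL is coherent when the sign of the direct path X → Z matches the overall sign of the indirect path X → Y → Z. The gene labels X, Y, Z and the function name are illustrative.

```python
from itertools import product


def classify_ffl(x_to_y, y_to_z, x_to_z):
    """Classify an FFL from its three interaction signs.

    Each sign is +1 for activation or -1 for repression. The indirect
    path X -> Y -> Z has the product of its two signs.
    """
    indirect = x_to_y * y_to_z
    return "coherent" if indirect == x_to_z else "incoherent"


# Enumerate all 2**3 sign assignments for (X->Y, Y->Z, X->Z).
ffl_types = {signs: classify_ffl(*signs) for signs in product((+1, -1), repeat=3)}
for signs, kind in ffl_types.items():
    print(signs, kind)
```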
High-throughput genome-wide molecular assays, which probe cellular networks from different perspectives, have become central to molecular biology. Probabilistic graphical models are useful for extracting meaningful biological insights from the resulting data sets. These models provide a concise representation of complex cellular networks by composing simpler submodels. Procedures based on well-understood principles for inferring such models from data facilitate a model-based methodology for analysis and discovery. This methodology and its capabilities are illustrated by several recent applications to gene expression data.
Many distinct signaling pathways allow the cell to receive, process, and respond to information. Often, components of different pathways interact, resulting in signaling networks. Biochemical signaling networks were constructed with experimentally obtained constants and analyzed by computational methods to understand their role in complex biological processes. These networks exhibit emergent properties such as integration of signals across multiple time scales, generation of distinct outputs depending on input strength and duration, and self-sustaining feedback loops. Feedback can result in bistable behavior with discrete steady-state activities, well-defined input thresholds for transition between states and prolonged signal output, and signal modulation in response to transient stimuli. These properties of signaling networks raise the possibility that information for “learned behavior” of biological systems may be stored within intracellular biochemical reactions that comprise signaling pathways.
Gene expression is a stochastic, or “noisy,” process. This noise comes about in two ways. The inherent stochasticity of biochemical processes such as transcription and translation generates “intrinsic” noise. In addition, fluctuations in the amounts or states of other cellular components lead indirectly to variation in the expression of a particular gene and thus represent “extrinsic” noise. Here, we show how the total variation in the level of expression of a given gene can be decomposed into its intrinsic and extrinsic components. We demonstrate theoretically that simultaneous measurement of two identical genes per cell enables discrimination of these two types of noise. Analytic expressions for intrinsic noise are given for a model that involves all the major steps in transcription and translation. These expressions give the sensitivity to various parameters, quantify the deviation from Poisson statistics, and provide a way of fitting experiment. Transcription dominates the intrinsic noise when the average number of proteins made per mRNA transcript is greater than ≃2. Below this number, translational effects also become important. Gene replication and cell division, included in the model, cause protein numbers to tend to a limit cycle. We calculate a general form for the extrinsic noise and illustrate it with the particular case of a single fluctuating extrinsic variable: a repressor protein, which acts on the gene of interest. All results are confirmed by stochastic simulation using plausible parameters for Escherichia coli.
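The two-reporter decomposition described above can be sketched with the estimators commonly used for dual-reporter data. The abstract does not state these formulas, so their exact form here is an assumption. `c1` and `c2` are per-cell expression levels of the two identical reporter genes, the function name is hypothetical, the example values are made up, and all returned quantities are squared noise values.

```python
from statistics import fmean


def noise_decomposition(c1, c2):
    """Return (intrinsic, extrinsic, total) squared noise from paired reporters.

    intrinsic: uncorrelated differences between the two reporters in a cell
    extrinsic: normalized covariance of the two reporters across cells
    total:     intrinsic + extrinsic (holds identically for these estimators)
    """
    norm = fmean(c1) * fmean(c2)
    intrinsic = fmean((a - b) ** 2 for a, b in zip(c1, c2)) / (2.0 * norm)
    extrinsic = (fmean(a * b for a, b in zip(c1, c2)) - norm) / norm
    total = (fmean(a * a + b * b for a, b in zip(c1, c2)) - 2.0 * norm) / (2.0 * norm)
    return intrinsic, extrinsic, total


# Made-up paired measurements for four cells (illustrative only).
reporter_1 = [10.0, 20.0, 30.0, 40.0]
reporter_2 = [12.0, 18.0, 33.0, 37.0]
eta_int, eta_ext, eta_tot = noise_decomposition(reporter_1, reporter_2)
print(eta_int, eta_ext, eta_tot)
```

If the two reporters were perfectly correlated in every cell, the intrinsic term would vanish and all variation would be attributed to extrinsic sources.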
Individual cells in genetically homogeneous populations have been found to express different numbers of molecules of specific proteins. We investigated the origins of these variations in mammalian cells by counting individual molecules of mRNA produced from a reporter gene that was stably integrated into the cell's genome. We found that there are massive variations in the number of mRNA molecules present in each cell. These variations occur because mRNAs are synthesized in short but intense bursts of transcription beginning when the gene transitions from an inactive to an active state and ending when it transitions back to the inactive state. We show that these transitions are intrinsically random and not due to global, extrinsic factors such as the levels of transcriptional activators. Moreover, the gene activation causes burst-like expression of all genes within a wider genomic locus. We further found that bursts are also exhibited in the synthesis of natural genes. The bursts of mRNA expression can be buffered at the protein level by slow protein degradation rates. A stochastic model of gene activation and inactivation was developed to explain the statistical properties of the bursts. The model showed that increasing the level of transcription factors increases the average size of the bursts rather than their frequency. These results demonstrate that gene expression in mammalian cells is subject to large, intrinsically random fluctuations and raise questions about how cells are able to function in the face of such noise.
Motivation: Our goal is to construct a model for genetic regulatory networks such that the model class: (i) incorporates rule-based dependencies between genes; (ii) allows the systematic study of global network dynamics; (iii) is able to cope with uncertainty, both in the data and the model selection; and (iv) permits the quantification of the relative influence and sensitivity of genes in their interactions with other genes. Results: We introduce Probabilistic Boolean Networks (PBN) that share the appealing rule-based properties of Boolean networks, but are robust in the face of uncertainty. We show how the dynamics of these networks can be studied in the probabilistic context of Markov chains, with standard Boolean networks being special cases. Then, we discuss the relationship between PBNs and Bayesian networks, a family of graphical models that explicitly represent probabilistic relationships between variables. We show how probabilistic dependencies between a gene and its parent genes, constituting the basic building blocks of Bayesian networks, can be obtained from PBNs. Finally, we present methods for quantifying the influence of genes on other genes, within the context of PBNs. Examples illustrating the above concepts are presented throughout the paper.
Stochastic chemical kinetics describes the time evolution of a well-stirred chemically reacting system in a way that takes into account the fact that molecules come in whole numbers and exhibit some degree of randomness in their dynamical behavior. Researchers are increasingly using this approach to chemical kinetics in the analysis of cellular systems in biology, where the small molecular populations of only a few reactant species can lead to deviations from the predictions of the deterministic differential equations of classical chemical kinetics. After reviewing the supporting theory of stochastic chemical kinetics, I discuss some recent advances in methods for using that theory to make numerical simulations. These include improvements to the exact stochastic simulation algorithm (SSA) and the approximate explicit tau-leaping procedure, as well as the development of two approximate strategies for simulating systems that are dynamically stiff: implicit tau-leaping and the slow-scale SSA.
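As a concrete reference point for the exact SSA mentioned above, here is a minimal Python sketch of the direct method on a toy birth-death model (production at constant rate `k_prod`, degradation at rate `k_deg * x`). The model, function name, and parameter values are illustrative assumptions and do not come from the review.

```python
import random


def ssa_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Direct-method SSA for the reactions 0 -> X and X -> 0."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_birth = k_prod          # propensity of production
        a_death = k_deg * x       # propensity of degradation
        a_total = a_birth + a_death
        if a_total == 0.0:
            break                 # no reaction can fire
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = ssa_birth_death(k_prod=10.0, k_deg=0.1, x0=100, t_end=2000.0)
durations = [t1 - t0 for t0, t1 in zip(times, times[1:])]
time_avg = sum(c * d for c, d in zip(counts, durations)) / sum(durations)
print("time-averaged copy number:", time_avg)
```

The time-averaged copy number should fluctuate around k_prod / k_deg = 100. Because every reaction event is simulated individually, run time grows with the number of events, which is the cost that tau-leaping and the slow-scale SSA aim to reduce.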
Motivation: Mathematical description of biological reaction networks by differential equations leads to large models whose parameters are calibrated in order to optimally explain experimental data. Often only parts of the model can be observed directly. Given a model that sufficiently describes the measured data, it is important to infer how well model parameters are determined by the amount and quality of experimental data. This knowledge is essential for further investigation of model predictions. For this reason a major topic in modeling is identifiability analysis. Results: We suggest an approach that exploits the profile likelihood. It enables the detection of structural non-identifiabilities, which manifest in functionally related model parameters. Furthermore, practical non-identifiabilities, which might arise due to limited amount and quality of experimental data, are detected. Last but not least, confidence intervals can be derived. The results are easy to interpret and can be used for experimental planning and for model reduction. Availability: An implementation is freely available for MATLAB and the PottersWheel modeling toolbox at http://web.me.com/andreas.raue/profile/software.html. Supplementary information: Supplementary data are available at Bioinformatics online.
Clonal populations of cells exhibit substantial phenotypic variation. Such heterogeneity can be essential for many biological processes and is conjectured to arise from stochasticity, or noise, in gene expression. We constructed strains of Escherichia coli that enable detection of noise and discrimination between the two mechanisms by which it is generated. Both stochasticity inherent in the biochemical process of gene expression (intrinsic noise) and fluctuations in other cellular components (extrinsic noise) contribute substantially to overall variation. Transcription rate, regulatory dynamics, and genetic factors control the amplitude of noise. These results establish a quantitative foundation for modeling noise in genetic networks and reveal how low intracellular copy numbers of molecules can fundamentally limit the precision of gene regulation.
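The split into intrinsic and extrinsic noise described in the abstract above is commonly computed from two-reporter measurements, where each cell carries two equivalent reporters. The sketch below uses the standard dual-reporter definitions; the function and variable names are assumptions, and the abstract itself does not spell out the formulas.

```python
def dual_reporter_noise(c, y):
    """Squared intrinsic, extrinsic and total noise from paired per-cell reporter levels.

    c[i] and y[i] are the levels of the two reporters measured in the same cell i.
    """
    n = len(c)
    mean_c = sum(c) / n
    mean_y = sum(y) / n
    norm = mean_c * mean_y
    # Intrinsic: how much the two reporters differ within the same cell.
    intrinsic_sq = sum((ci - yi) ** 2 for ci, yi in zip(c, y)) / n / (2.0 * norm)
    # Extrinsic: how strongly the two reporters co-vary across cells.
    extrinsic_sq = (sum(ci * yi for ci, yi in zip(c, y)) / n - norm) / norm
    # Total: combined variability of both reporters.
    total_sq = (sum(ci ** 2 + yi ** 2 for ci, yi in zip(c, y)) / n - 2.0 * norm) / (2.0 * norm)
    return intrinsic_sq, extrinsic_sq, total_sq
```

With these definitions the intrinsic and extrinsic terms add up exactly to the total, which is a useful sanity check on any measured data set.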
Noise, or random fluctuations, in gene expression may produce variability in cellular behavior. To measure the noise intrinsic to eukaryotic gene expression, we quantified the differences in expression of two alleles in a diploid cell. We found that such noise is gene-specific and not dependent on the regulatory pathway or absolute rate of expression. We propose a model in which the balance between promoter activation and transcription influences the variability in messenger RNA levels. To confirm the predictions of our model, we identified both cis- and trans-acting mutations that alter the noise of gene expression. These mutations suggest that noise is an evolvable trait that can be optimized to balance fidelity and diversity in eukaryotic gene expression.
Elucidating gene regulatory networks is crucial for understanding normal cell physiology and complex pathologic phenotypes. Existing computational methods for the genome-wide "reverse engineering" of such networks have been successful only for lower eukaryotes with simple genomes. Here we present ARACNE, a novel algorithm, using microarray expression profiles, specifically designed to scale up to the complexity of regulatory networks in mammalian cells, yet general enough to address a wider range of network deconvolution problems. This method uses an information theoretic approach to eliminate the majority of indirect interactions inferred by co-expression methods. We prove that ARACNE reconstructs the network exactly (asymptotically) if the effect of loops in the network topology is negligible, and we show that the algorithm works well in practice, even in the presence of numerous loops and complex topologies. We assess ARACNE's ability to reconstruct transcriptional regulatory networks using both a realistic synthetic dataset and a microarray dataset from human B cells. On synthetic datasets ARACNE achieves very low error rates and outperforms established methods, such as Relevance Networks and Bayesian Networks. Application to the deconvolution of genetic networks in human B cells demonstrates ARACNE's ability to infer validated transcriptional targets of the cMYC proto-oncogene. We also study the effects of misestimation of mutual information on network reconstruction, and show that algorithms based on mutual information ranking are more resilient to estimation errors.
ARACNE shows promise in identifying direct transcriptional interactions in mammalian cellular networks, a problem that has challenged existing reverse engineering algorithms. This approach should enhance our ability to use microarray data to elucidate functional mechanisms that underlie cellular processes and to identify molecular targets of pharmacological compounds in mammalian cellular networks.
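The information-theoretic pruning step that removes indirect interactions can be sketched as follows. ARACNE does this with the data processing inequality: within every fully connected gene triplet, the edge with the lowest mutual information is treated as indirect. The function name, the `threshold` parameter, and the plain nested-list input are assumptions for this sketch; mutual information estimation and the tolerance parameter of the published algorithm are omitted.

```python
from itertools import combinations


def dpi_prune(mi, threshold=0.0):
    """Prune a symmetric mutual-information matrix with the data processing inequality.

    mi is a square nested list; returns the set of surviving edges as (i, j) with i < j.
    """
    n = len(mi)
    # Start from all gene pairs whose mutual information exceeds the threshold.
    edges = {(i, j) for i, j in combinations(range(n), 2) if mi[i][j] > threshold}
    to_remove = set()
    for i, j, k in combinations(range(n), 3):
        trio = [(i, j), (i, k), (j, k)]
        if all(e in edges for e in trio):
            # In a closed triangle, the weakest edge is assumed to be indirect.
            to_remove.add(min(trio, key=lambda e: mi[e[0]][e[1]]))
    # Edges are marked against the original graph and removed only at the end,
    # so the result does not depend on the order in which triplets are visited.
    return edges - to_remove
```

For a regulatory chain such as gene 0 acting on gene 1 acting on gene 2, the weak 0-2 association is dropped while the two direct edges survive.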
Markov chain perturbation theory is a rapidly developing subfield of the theory of stochastic processes. This review outlines emerging applications of this theory in the analysis of stochastic models of chemical reactions, with a particular focus on biochemistry and molecular biology. We begin by discussing the general problem of approximate modeling in stochastic chemical kinetics. We then briefly review some essential mathematical results pertaining to perturbation bounds for continuous-time Markov chains, emphasizing the relationship between robustness under perturbations and the rate of exponential convergence to the stationary distribution. We illustrate the use of these results to analyze stochastic models of biochemical reactions by providing concrete examples. Particular attention is given to fundamental problems related to approximation accuracy in model reduction. These include the partial thermodynamic limit, the irreversible-reaction limit, parametric uncertainty analysis, and model reduction for linear reaction networks. We conclude by discussing generalizations and future developments of these methodologies, such as the need for time-inhomogeneous Markov models.
Stem cells are capable of self-renewal and differentiation into various cell types, showing significant potential for cellular therapies and regenerative medicine, particularly in cardiovascular diseases. The differentiation to cardiomyocytes replicates the embryonic heart development, potentially supporting cardiac regeneration. Cardiomyogenesis is controlled by complex post-transcriptional regulation that affects the construction of gene regulatory networks (GRNs), such as alternative polyadenylation (APA), length changes in untranslated regulatory regions (3'UTRs), and microRNA (miRNA) regulation. To deepen our understanding of the cardiomyogenesis process, we have modeled a GRN for each day of cardiomyocyte differentiation. Then, each GRN was automatically transformed by four transformation rules to a Petri net and simulated using the software VANESA. The Petri nets highlighted the relationship between genes and alternative isoforms, emphasizing the inhibition of miRNA on APA isoforms with varying 3'UTR lengths. Moreover, in silico simulation of miRNA knockout enabled the visualization of the consequential effects on isoform expression. Our Petri net models provide a resourceful tool and holistic perspective to investigate the functional orchestra of transcript regulation that differentiates hESCs to cardiomyocytes. Additionally, the models can be adapted to investigate post-transcriptional GRNs in other biological contexts.
Abstract Cell growth rates exhibit cell-intrinsic cell-to-cell variability, which influences cell fitness and size homeostasis from bacteria to cancer. Whether this variability arises from noise in cell growth or cell division processes, or originates from cell-size-dependent growth rates, remains unclear. To separate these potential sources of growth variability, single-cell growth rates need to be examined across different timescales. Here, we study cell-intrinsic size and growth regulation by tracking lymphocytic leukemia cell mass accumulation with high precision and minute-scale temporal resolution along long ancestral lineages. We first show that cell-size-dependent growth regulation and asymmetric division of cell size do not explain cell-to-cell growth variability. We then isolate growth fluctuations from overlapping cell-cycle-dependent growth using a Gaussian process regression analysis. We find that these growth fluctuations drive cell-to-cell growth variability within ancestral lineages despite being independent of cell divisions, cell cycle, and cell size. Overall, our results indicate that cell-intrinsic long-term patterns in cell growth are a byproduct of short-term growth fluctuations.

Chance

2025-06-17
| Cambridge University Press eBooks
H. Hammouri | ESAIM Control Optimisation and Calculus of Variations
By normal form of observability (or canonical form), we mean a controlled dynamical system with a certain triangular structure that respects the input-output map of the system. The problem of transforming a single output controlled dynamical system using a local diffeomorphism (local coordinate change) was solved in the 1980s and 1990s. Under the assumption of uniform observability, it was shown that the system can be locally transformed almost everywhere into the so-called normal form. The global aspect consists of finding an injective transformation that sends the initial system into normal form, while preserving the input-output map. To get around the problem of the singularities involved in constructing this transformation, we have proposed a purely analytic condition which, combined with an observability condition (differential observability), solves this problem.
In structure-based drug discovery, reliable structural models of ligands bound to their target receptors are critical for establishing the structure-activity relationship of the congeneric series. In such a series, substitutions on a common scaffold core might lead to different binding modes, ranging from slight changes of orientations to flipping or inversion of the core structure. Moreover, molecular docking might lead to alternative orientations within the top-ranked poses without being able to discriminate which is most likely. To determine the relative binding affinities between two alternative ligand poses, we propose a methodology based on relative binding free energy calculations using the λ-dynamics method. We used a dual-topology approach with distance-restraining schemes. We introduced a novel strategy using a one-step perturbation to calculate the contributions of the applied restraints. While using FEP/MBAR instead for that purpose led to smaller uncertainties, it suffered from convergence issues. We tested the validity and predictive power of our approach using two pharmaceutically relevant targets and eight small-molecule inhibitors from the experimentally characterized congeneric series. For each target, our approach correctly ranks the known X-ray poses as more favorable than alternative flipped poses. The proposed methodology can be easily extended to rank more than two poses and should also be applicable to the evaluation of alternative rotamers of target amino acids.
Ultrasensitive transcriptional switches are essential for converting gradual molecular inputs into decisive gene expression responses, enabling critical behaviors such as bistability and oscillations. While cooperative binding, relying on direct repressor-DNA binding, has been classically regarded as a key ultrasensitivity mechanism, recent theoretical works have demonstrated that combinations of indirect repression mechanisms (sequestration, blocking, and displacement) can also achieve ultrasensitive switches with greater robustness to transcriptional noise. However, these previous works have neglected key biological constraints such as DNA binding kinetics and the limited availability of transcriptional activators, raising the question of whether ultrasensitivity and noise robustness can be sustained under biologically realistic conditions. Here, we systematically assess the impact of these factors on ultrasensitivity and noise robustness under physiologically plausible conditions. We show that while various repression combinations can reduce noise, only the full combination of all three indirect mechanisms consistently maintains low noise and high ultrasensitivity. As a result, biological oscillators employing this triple repression architecture retain precise rhythmic switching even under high noise, and even when activators are shared across thousands of target genes. Our findings offer a mechanistic explanation for the frequent co-occurrence of these repression mechanisms in natural gene regulatory systems.
Cellular memory is the competence of cells to preserve information from past experiences and respond aptly. This memory is maintained and controlled by gene regulatory networks (GRNs). GRNs are crucial for understanding why some cells are resistant to treatment, particularly for cancer. In our study, we created a new mathematical model to understand how "noise" affects cellular memory in GRNs, focusing on a "double positive feedback loop". Our theoretical perspective article uses mathematical modeling to show how noise and feedback loops interact in GRNs. It also proposes a potential theoretical avenue for future therapy. By targeting the mechanisms that maintain drug resistance in cells, we aim to develop therapies that can restore the sensitivity of cancer cells to treatment.
Reconstructing genome-scale gene regulatory networks (GRNs) remains a difficult problem in systems biology, and many experimental and computational methods have been developed to address this problem. Recent computational methods have aimed to more accurately model GRNs by estimating the hidden Transcription Factor Activity (TFA), from prior knowledge of TF target regulatory connections, encoded as an input directed graph, to relax the assumption that mRNA level of the regulator correlates with the protein activity of the regulator. However, the noise in the prior knowledge can adversely affect the estimated TFA levels and the quality of the downstream inferred GRNs. Here, we present a new approach, MERLIN+P+TFA, that uses prior knowledge-guided sparsity regularization to robustly and accurately estimate TFA and downstream GRNs. We apply our method to simulated and real expression data in yeast and mammalian systems and show improved quality of inferred GRNs for both bulk and single-cell datasets. Regularized TFA offers benefits to a variety of other GRN inference algorithms, including those that have traditionally been used with expression alone, in both bulk and scRNA-seq settings. We used the inferred GRN to prioritize key regulators for the mouse Embryonic Stem Cell (mESC) state and validate 58 regulators experimentally. We identify both known and novel regulators of the mESC state and further validate the targets of 4 known and novel regulators. Our validation experiments suggest that computationally inferred networks can capture functional targets of TFs with higher precision than estimated in current benchmarks; however, it is important to generate context-specific gold standards.
During neuronal differentiation, gene transcription patterns change in response to both intrinsic and extrinsic cues. Chromatin regulation at regulatory elements plays a key role in this process. However, how chromatin accessibility evolves in vivo in cortical neurons remains unclear. Here, we established a method for labeling differentiating neurons with specific birthdates. Using this method, we traced the four-day differentiation process of in vivo deep-layer excitatory neurons in the mouse embryonic cortex and examined changes in the genome-wide transcription pattern and chromatin accessibility using RNA-sequencing and DNase-sequencing, respectively. We found that genomic regions of genes linked to mature neuronal functions, including deep-layer-specific and stimulus-responsive genes, became accessible even at the embryonic stage. Additionally, our results indicated the involvement of bivalent marks in neural precursor/stem cells and Dmrt3 and Dmrta2 in regulating chromatin accessibility during neuronal differentiation. These findings highlight the importance of chromatin regulation in embryonic neurons, enabling the timely activation of neuronal genes during maturation.
[This corrects the article DOI: 10.1371/journal.pcbi.1011530.].
In the preimplantation mammalian embryo, stochastic cell-to-cell expression heterogeneity is followed by signal reinforcement to initiate the specification of Inner Cell Mass (ICM) cells into Epiblast (Epi). The expression of NANOG, the key transcription factor for the Epi fate, is necessary but not sufficient: coincident expression of other factors is required. To identify possible Nanog-helper genes, we analyzed gene expression variability in five time-stamped single-cell transcriptomic datasets using differential entropy, a quantitative measure of cell-to-cell heterogeneity. The entropy of Nanog displays a peak-shaped temporal pattern from the 16-cell to the 64-cell stage, consistent with its key role in Epi specification. By estimating the entropy profiles of the 21 genes common to all five datasets, we identified three genes - Pecam1, Sox2, and Hnf4a - whose variability in expression patterns mirrors that of Nanog. We further performed gene regulatory network inference using CARDAMOM, an algorithm that exploits temporal dynamics and transcriptional bursting. The results revealed that these three genes exhibit reciprocal activation with Nanog at the 32-cell stage. This regulatory motif reinforces fate-switching decisions and co-expression states. Our innovative analysis of single-cell transcriptomic data thus uncovers a likely role for Pecam1, Sox2, and Hnf4a as key genes that, when coincidentally expressed with Nanog, initiate ICM differentiation.
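As a rough illustration of using differential entropy as a heterogeneity score, the sketch below computes the closed-form entropy of a Gaussian fitted to one gene's expression values at one time point. The Gaussian assumption and the function name are mine; the abstract does not say which entropy estimator the authors used, so this is only one possible stand-in.

```python
import math
import statistics


def gaussian_differential_entropy(values):
    """Differential entropy (in nats) of a Gaussian with the sample's population variance.

    Assumes the expression values are roughly Gaussian (for example after a log
    transform); this is a simplifying assumption, not the source's estimator.
    """
    variance = statistics.pvariance(values)
    if variance <= 0.0:
        raise ValueError("entropy is undefined for a sample with zero variance")
    # h = 0.5 * ln(2 * pi * e * sigma^2)
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)
```

Computing this score per gene at each developmental stage would give an entropy profile over time; a peak-shaped profile then marks a transient rise in cell-to-cell variability of the kind the abstract reports for Nanog.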
ABSTRACT The central dogma indicates the basic direction of gene expression pathways. For activated gene expression, the quantitative relationship between various links from the binding of transcription factors (TFs) to DNA to protein synthesis remains unclear and debated. There is consensus that at a steady state, protein levels are largely determined by the mRNA level. How can we find this steady state? Taking p53 as an example, based on the previously discovered Hill‐type equation that characterizes mRNA expression under p53 pulsing, I proved that the same equation can be used to describe the average steady state of target protein expression. Therefore, at steady state, the average fold changes in mRNA and protein expression under TF pulsing were the same. This consensus has been successfully demonstrated. For the p53 target gene BAX, the observed fold changes in mRNA and protein expression were 1.40 and 1.28, respectively; the fold changes in mRNA and protein expression calculated using the Hill‐type equation were both 1.35. Therefore, using this equation, we can not only fine‐tune gene expression, but also predict the proteome from the transcriptome. Furthermore, by introducing two quantitative indicators, we can determine the degree of accumulation and stability of protein expression.