Physics and Astronomy: Statistical and Nonlinear Physics

Model Reduction and Neural Networks

Description

This cluster of papers focuses on the development and application of physics-informed neural networks for scientific computing, particularly for solving partial differential equations and for model reduction, fluid dynamics, dynamic mode decomposition, and nonlinear systems. The research explores the integration of deep learning techniques with traditional numerical methods to address complex problems in physics-based modeling and simulation.

Keywords

Deep Learning; Partial Differential Equations; Model Reduction; Fluid Dynamics; Dynamic Mode Decomposition; Nonlinear Systems; Machine Learning; Data-Driven Modeling; Numerical Computing; Inverse Problems

We develop a new method which extends dynamic mode decomposition (DMD) to incorporate the effect of control to extract low-order models from high-dimensional, complex systems. DMD finds spatial-temporal coherent modes, connects local-linear analysis to nonlinear operator theory, and provides an equation-free architecture which is compatible with compressive sensing. In actuated systems, DMD is incapable of producing an input-output model; moreover, the dynamics and the modes will be corrupted by external forcing. Our new method, dynamic mode decomposition with control (DMDc), capitalizes on all of the advantages of DMD and provides the additional innovation of being able to disambiguate between the underlying dynamics and the effects of actuation, resulting in accurate input-output models. The method is data-driven in that it does not require knowledge of the underlying governing equations---only snapshots in time of observables and actuation data from historical, experimental, or black-box simulations. We demonstrate the method on high-dimensional dynamical systems, including a model with relevance to the analysis of infectious disease data with mass vaccination (actuation).
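The DMDc regression described above amounts to a least-squares fit of stacked state and actuation snapshot matrices. The sketch below is a minimal NumPy illustration of that fit (snapshots as columns; all names are ours, not the authors' reference implementation).

```python
import numpy as np

def dmdc(X, Xp, Upsilon, rank=None):
    """Sketch of DMD with control (DMDc).

    X, Xp : state snapshots at times 1..m-1 and 2..m (columns are snapshots)
    Upsilon : actuation snapshots aligned with X
    Returns matrices (A, B) such that Xp ~= A X + B Upsilon in a least-squares sense.
    """
    Omega = np.vstack([X, Upsilon])                         # stacked state/input data
    U, s, Vh = np.linalg.svd(Omega, full_matrices=False)
    if rank is not None:                                    # optional truncation
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]
    G = Xp @ Vh.conj().T @ np.diag(1.0 / s) @ U.conj().T    # [A B] = Xp * pinv(Omega)
    n = X.shape[0]
    return G[:, :n], G[:, n:]

# toy usage: recover a known linear system with actuation
rng = np.random.default_rng(0)
A_true = 0.9 * np.eye(3); B_true = rng.standard_normal((3, 1))
X = rng.standard_normal((3, 50)); U_in = rng.standard_normal((1, 50))
Xp = A_true @ X + B_true @ U_in
A_est, B_est = dmdc(X, Xp, U_in)
print(np.allclose(A_est, A_true), np.allclose(B_est, B_true))
```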
There are two widely known issues with properly training Recurrent Neural Networks, the vanishing and the exploding gradient problems detailed in Bengio et al. (1994). In this paper we attempt to improve the understanding of the underlying issues by exploring these problems from an analytical, a geometric and a dynamical systems perspective. Our analysis is used to justify a simple yet effective solution. We propose a gradient norm clipping strategy to deal with exploding gradients and a soft constraint for the vanishing gradients problem. We validate empirically our hypothesis and proposed solutions in the experimental section.
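The exploding-gradient remedy described above is a global rescaling of the gradient whenever its norm exceeds a threshold. A minimal PyTorch-style sketch follows; PyTorch also ships torch.nn.utils.clip_grad_norm_, which performs the same operation, and the training-loop names in the comments are placeholders.

```python
import torch

def clip_gradient_norm(parameters, threshold):
    """Rescale gradients in place so their global L2 norm does not exceed `threshold`."""
    grads = [p.grad for p in parameters if p.grad is not None]
    total_norm = torch.sqrt(sum((g ** 2).sum() for g in grads))
    if total_norm > threshold:
        scale = threshold / (total_norm + 1e-6)
        for g in grads:
            g.mul_(scale)
    return total_norm

# usage inside a training loop (illustrative names):
#   loss.backward()
#   clip_gradient_norm(model.parameters(), threshold=1.0)
#   optimizer.step()
```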
The problem of approximating a multivariable transfer function G(s) of McMillan degree n, by Ĝ(s) of McMillan degree k is considered. A complete characterization of all approximations that minimize the Hankel-norm is derived. The solution involves a characterization of all rational functions Ĝ(s) + F(s) that minimize the approximation error, where Ĝ(s) has McMillan degree k and F(s) is anticausal. The solution to the latter problem is via results on balanced realizations, all-pass functions and the inertia of matrices, all in terms of the solutions to Lyapunov equations. It is then shown that the minimal Hankel-norm error equals σ_{k+1}(G(s)), where σ_{k+1}(G(s)) is the (k+1)st Hankel singular value of G(s), and that this bound is attained for one class of optimal Hankel-norm approximations. The method is not computationally demanding and is applied to a 12-state model.
The description of coherent features of fluid flow is essential to our understanding of fluid-dynamical and transport processes. A method is introduced that is able to extract dynamic information from flow fields that are either generated by a (direct) numerical simulation or visualized/measured in a physical experiment. The extracted dynamic modes, which can be interpreted as a generalization of global stability modes, can be used to describe the underlying physical mechanisms captured in the data sequence or to project large-scale problems onto a dynamical system of significantly fewer degrees of freedom. The concentration on subdomains of the flow field where relevant dynamics is expected allows the dissection of a complex flow into regions of localized instability phenomena and further illustrates the flexibility of the method, as does the description of the dynamics within a spatial framework. Demonstrations of the method are presented consisting of a plane channel flow, flow over a two-dimensional cavity, wake flow behind a flexible membrane and a jet passing between two cylinders.
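The DMD procedure summarized above is usually implemented through an SVD of the snapshot sequence. A minimal exact-DMD sketch in NumPy follows, under the assumption that snapshots are columns of X; it is illustrative rather than the paper's original Arnoldi-based formulation.

```python
import numpy as np

def dmd(X, r=10):
    """Minimal exact-DMD sketch: X holds snapshots x_1..x_m as columns.
    Returns eigenvalues and modes of the best-fit linear map x_{k+1} ~= A x_k."""
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r, :]                       # rank-r truncation
    Atilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / s)   # projected operator
    eigvals, W = np.linalg.eig(Atilde)
    modes = X2 @ Vh.conj().T @ np.diag(1.0 / s) @ W             # exact DMD modes
    return eigvals, modes
```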
An LQG (linear quadratic Gaussian) control-design problem involving a constraint on H∞ disturbance attenuation is considered. The H∞ performance constraint is embedded within the optimization process by replacing the covariance Lyapunov equation by a Riccati equation whose solution leads to an upper bound on L2 performance. In contrast to the pair of separated Riccati equations of standard LQG theory, the H∞-constrained gains are given by a coupled system of three modified Riccati equations. The coupling illustrates the breakdown of the separation principle for the H∞-constrained problem. Both full- and reduced-order design problems are considered with an H∞ attenuation constraint involving both state and control variables. An algorithm is developed for the full-order design problem and illustrative numerical results are given.
A dimension reduction method called discrete empirical interpolation is proposed and shown to dramatically reduce the computational complexity of the popular proper orthogonal decomposition (POD) method for constructing reduced-order models for time dependent and/or parametrized nonlinear partial differential equations (PDEs). In the presence of a general nonlinearity, the standard POD-Galerkin technique reduces dimension in the sense that far fewer variables are present, but the complexity of evaluating the nonlinear term remains that of the original problem. The original empirical interpolation method (EIM) is a modification of POD that reduces the complexity of evaluating the nonlinear term of the reduced model to a cost proportional to the number of reduced variables obtained by POD. We propose a discrete empirical interpolation method (DEIM), a variant that is suitable for reducing the dimension of systems of ordinary differential equations (ODEs) of a certain type. As presented here, it is applicable to ODEs arising from finite difference discretization of time dependent PDEs and/or parametrically dependent steady state problems. However, the approach extends to arbitrary systems of nonlinear ODEs with minor modification. Our contribution is a greatly simplified description of the EIM in a finite-dimensional setting that possesses an error bound on the quality of approximation. An application of DEIM to a finite difference discretization of the one-dimensional FitzHugh–Nagumo equations is shown to reduce the dimension from 1024 to order 5 variables with negligible error over a long-time integration that fully captures nonlinear limit cycle behavior. We also demonstrate applicability in higher spatial dimensions with similar state space dimension reduction and accuracy results.
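DEIM's core step is a greedy selection of interpolation indices from a basis of the nonlinear term. A short NumPy sketch of that index selection follows, assuming the basis columns come from a POD of nonlinear-term snapshots; it is a simplified illustration, not the paper's full reduced-order pipeline.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM index selection for a nonlinear-term basis U of shape (n, m)."""
    n, m = U.shape
    idx = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        Uj = U[:, :j]
        c = np.linalg.solve(Uj[idx, :], U[idx, j])   # interpolation coefficients
        r = U[:, j] - Uj @ c                         # residual at the current indices
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

# With indices P, the nonlinearity f(x) is approximated by
#   U @ solve(U[P, :], f(x)[P]),
# so only len(P) entries of f need to be evaluated in the reduced model.
```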
CONTENTS Introduction § 1. Results § 2. Preliminary results from mechanics § 3. Preliminary results from mathematics § 4. The simplest problem of stability § 5. Contents of the paper Chapter I. Theory of perturbations § 1. Integrable and non-integrable problems of dynamics § 2. The classical theory of perturbations § 3. Small denominators § 4. Newton's method § 5. Proper degeneracy § 6. Remark 1 § 7. Remark 2 § 8. Application to the problem of proper degeneracy § 9. Limiting degeneracy. Birkhoff's transformation § 10. Stability of positions of equilibrium of Hamiltonian systems Chapter II. Adiabatic invariants § 1. The concept of an adiabatic invariant § 2. Perpetual adiabatic invariance of action with a slow periodic variation of the Hamiltonian § 3. Adiabatic invariants of conservative systems § 4. Magnetic traps § 5. The many-dimensional case Chapter III. The stability of planetary motions § 1. Picture of the motion § 2. Jacobi, Delaunay and Poincaré variables § 3. Birkhoff's transformation § 4. Calculation of the asymptotic behaviour of the coefficients in the expansion § 5. The many-body problem Chapter IV. The fundamental theorem § 1. Fundamental theorem § 2. Inductive theorem § 3. Inductive lemma § 4. Fundamental lemma § 5. Lemma on averaging over rapid variables § 6. Proof of the fundamental lemma § 7. Proof of the inductive lemma § 8. Proof of the inductive theorem § 9. Lemma on the non-degeneracy of diffeomorphisms § 10. Averaging over rapid variables § 11. Polar coordinates § 12. The applicability of the inductive theorem § 13. Passage to the limit § 14. Proof of the fundamental theorem Chapter V. Technical lemmas § 1. Domains of type D § 2. Arithmetic lemmas § 3. Analytic lemmas § 4. Geometric lemmas § 5. Convergence lemmas § 6. Notation Chapter VI. Appendix § 1. Integrable systems § 2. Unsolved problems § 3. Neighbourhood of an invariant manifold § 4. Intermixing § 5. Smoothing techniques References
Many of the tools of dynamical systems and control theory have gone largely unused for fluids, because the governing equations are so dynamically complex, both high-dimensional and nonlinear. Model reduction involves finding low-dimensional models that approximate the full high-dimensional dynamics. This paper compares three different methods of model reduction: proper orthogonal decomposition (POD), balanced truncation, and a method called balanced POD. Balanced truncation produces better reduced-order models than POD, but is not computationally tractable for very large systems. Balanced POD is a tractable method for computing approximate balanced truncations, that has computational cost similar to that of POD. The method presented here is a variation of existing methods using empirical Gramians, and the main contributions of the present paper are a version of the method of snapshots that allows one to compute balancing transformations directly, without separate reduction of the Gramians; and an output projection method, which allows tractable computation even when the number of outputs is large. The output projection method requires minimal additional computation, and has a priori error bounds that can guide the choice of rank of the projection. Connections between POD and balanced truncation are also illuminated: in particular, balanced truncation may be viewed as POD of a particular dataset, using the observability Gramian as an inner product. The three methods are illustrated on a numerical example, the linearized flow in a plane channel.
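Balanced POD via the method of snapshots reduces to an SVD of the small correlation matrix between primal and adjoint snapshots. The sketch below is a hedged illustration under that assumed notation; it omits the paper's output-projection variant.

```python
import numpy as np

def balanced_pod(X, Y, r):
    """Method-of-snapshots sketch of balanced POD (notation assumed):
    columns of X are primal (impulse-response) snapshots, columns of Y are adjoint
    snapshots. Returns r balancing and adjoint modes Phi, Psi with Psi^H Phi ~= I."""
    M = Y.conj().T @ X                                  # small correlation matrix
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    Sr_inv_sqrt = np.diag(1.0 / np.sqrt(s[:r]))
    Phi = X @ Vh[:r, :].conj().T @ Sr_inv_sqrt          # balancing (direct) modes
    Psi = Y @ U[:, :r] @ Sr_inv_sqrt                    # adjoint modes
    return Phi, Psi
```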
A model reduction procedure, based on balanced state space representations, is studied in this paper. The reduced order model is examined from the point of view of stability, controllability, and observability. Both continuous time and discrete time systems are considered.
A new method for performing a balanced reduction of a high-order linear system is presented. The technique combines the proper orthogonal decomposition and concepts from balanced realization theory. The method of snapshots is used to obtain low-rank, reduced-range approximations to the system controllability and observability grammians in either the time or frequency domain. The approximations are then used to obtain a balanced reduced-order model. The method is particularly effective when a small number of outputs is of interest. It is demonstrated for a linearized high-order system that models unsteady motion of a two-dimensional airfoil. Computation of the exact grammians would be impractical for such a large system. For this problem, very accurate reduced-order models are obtained that capture the required dynamics with just three states. The new models exhibit far superior performance than those derived using a conventional proper orthogonal decomposition. Although further development is necessary, the concept also extends to nonlinear systems.
Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse. We attempt to bridge the gap between the theory and practice of deep learning by systematically analyzing learning dynamics for the restricted case of deep linear neural networks. Despite the linearity of their input-output map, such networks have nonlinear gradient descent dynamics on weights that change with the addition of each new hidden layer. We show that deep linear networks exhibit nonlinear learning phenomena similar to those seen in simulations of nonlinear networks, including long plateaus followed by rapid transitions to lower error solutions, and faster convergence from greedy unsupervised pretraining initial conditions than from random initial conditions. We provide an analytical description of these phenomena by finding new exact solutions to the nonlinear dynamics of deep learning. Our theoretical analysis also reveals the surprising finding that as the depth of a network approaches infinity, learning speed can nevertheless remain finite: for a special class of initial conditions on the weights, very deep networks incur only a finite, depth independent, delay in learning speed relative to shallow networks. We show that, under certain conditions on the training data, unsupervised pretraining can find this special class of initial conditions, while scaled random Gaussian initializations cannot. We further exhibit a new class of random orthogonal initial conditions on weights that, like unsupervised pre-training, enjoys depth independent learning times. We further show that these initial conditions also lead to faithful propagation of gradients even in deep nonlinear networks, as long as they operate in a special regime known as the edge of chaos.
This article reviews theory and applications of Koopman modes in fluid mechanics. Koopman mode decomposition is based on the surprising fact, discovered in Mezić (2005), that normal modes of linear oscillations have their natural analogs—Koopman modes—in the context of nonlinear dynamics. To pursue this analogy, one must change the representation of the system from the state-space representation to the dynamics governed by the linear Koopman operator on an infinite-dimensional space of observables. Whereas Koopman in his original paper dealt only with measure-preserving transformations, the discussion here is predominantly on dissipative systems arising from Navier-Stokes evolution. The analysis is based on spectral properties of the Koopman operator. Aspects of point and continuous parts of the spectrum are discussed. The point spectrum corresponds to isolated frequencies of oscillation present in the fluid flow, and also to growth rates of stable and unstable modes. The continuous part of the spectrum corresponds to chaotic motion on the attractor. A method of computation of the spectrum and the associated Koopman modes is discussed in terms of generalized Laplace analysis. When applied to a generic observable, this method uncovers the full point spectrum. A computational alternative is given by Arnoldi-type methods, leading to so-called dynamic mode decomposition, and I discuss the connection and differences between these two methods. A number of applications are reviewed in which decompositions of this type have been pursued. Koopman mode theory unifies and provides a rigorous background for a number of different concepts that have been advanced in fluid mechanics, including global mode analysis, triple decomposition, and dynamic mode decomposition.
The use of natural symmetries (mirror images) in a well-defined family of patterns (human faces) is discussed within the framework of the Karhunen-Loeve expansion. This results in an extension of the data and imposes even and odd symmetry on the eigenfunctions of the covariance matrix, without increasing the complexity of the calculation. The resulting approximation of faces projected from outside of the data set onto this optimal basis is improved on average.
Nonlinear dynamical systems have been used in many disciplines to model complex behaviors, including biological motor control, robotics, perception, economics, traffic prediction, and neuroscience. While often the unexpected emergent behavior of nonlinear systems is the focus of investigations, it is of equal importance to create goal-directed behavior (e.g., stable locomotion from a system of coupled oscillators under perceptual guidance). Modeling goal-directed behavior with nonlinear systems is, however, rather difficult due to the parameter sensitivity of these systems, their complex phase transitions in response to subtle parameter changes, and the difficulty of analyzing and predicting their long-term behavior; intuition and time-consuming parameter tuning play a major role. This letter presents and reviews dynamical movement primitives, a line of research for modeling attractor behaviors of autonomous nonlinear dynamical systems with the help of statistical learning techniques. The essence of our approach is to start with a simple dynamical system, such as a set of linear differential equations, and transform those into a weakly nonlinear system with prescribed attractor dynamics by means of a learnable autonomous forcing term. Both point attractors and limit cycle attractors of almost arbitrary complexity can be generated. We explain the design principle of our approach and evaluate its properties in several example applications in motor control and robotics.
Numerical simulation of large-scale dynamical systems plays a fundamental role in studying a wide range of complex physical phenomena; however, the inherent large-scale nature of the models often leads to unmanageable demands on computational resources. Model reduction aims to reduce this computational burden by generating reduced models that are faster and cheaper to simulate, yet accurately represent the original large-scale system behavior. Model reduction of linear, nonparametric dynamical systems has reached a considerable level of maturity, as reflected by several survey papers and books. However, parametric model reduction has emerged only more recently as an important and vibrant research area, with several recent advances making a survey paper timely. Thus, this paper aims to provide a resource that draws together recent contributions in different communities to survey the state of the art in parametric model reduction methods. Parametric model reduction targets the broad class of problems for which the equations governing the system behavior depend on a set of parameters. Examples include parameterized partial differential equations and large-scale systems of parameterized ordinary differential equations. The goal of parametric model reduction is to generate low-cost but accurate models that characterize system response for different values of the parameters. This paper surveys state-of-the-art methods in projection-based parametric model reduction, describing the different approaches within each class of methods for handling parametric variation and providing a comparative discussion that lends insights to potential advantages and disadvantages in applying each of the methods. We highlight the important role played by parametric model reduction in design, control, optimization, and uncertainty quantification---settings that require repeated model evaluations over different parameter values.
In this paper a new algorithm for solving algebraic Riccati equations (both continuous-time and discrete-time versions) is presented. The method studied is a variant of the classical eigenvector approach and uses instead an appropriate set of Schur vectors, thereby gaining substantial numerical advantages. Considerable discussion is devoted to a number of numerical issues. The method is apparently quite numerically stable and performs reliably on systems with dense matrices up to order 100 or so, storage being the main limiting factor.
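The Schur-vector idea can be illustrated for the continuous-time equation: build the Hamiltonian matrix, order its real Schur form so the stable eigenvalues come first, and recover the stabilizing solution from the corresponding invariant subspace. The sketch below uses SciPy's ordered Schur factorization and is a simplified illustration, not the paper's algorithm verbatim.

```python
import numpy as np
from scipy.linalg import schur

def care_schur(A, B, Q, R):
    """Sketch of the Schur-vector approach to the continuous-time algebraic Riccati
    equation A'X + XA - X B R^{-1} B' X + Q = 0."""
    n = A.shape[0]
    H = np.block([[A, -B @ np.linalg.solve(R, B.T)],
                  [-Q, -A.T]])                       # Hamiltonian matrix
    # ordered real Schur form: left-half-plane (stable) eigenvalues first
    T, U, sdim = schur(H, sort='lhp')
    U11, U21 = U[:n, :n], U[n:, :n]                  # basis of the stable invariant subspace
    return np.linalg.solve(U11.T, U21.T).T           # X = U21 @ inv(U11)

# For a sanity check, the result can be compared with scipy.linalg.solve_continuous_are.
```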
We present a technique for describing the global behaviour of complex nonlinear flows by decomposing the flow into modes determined from spectral analysis of the Koopman operator, an infinite-dimensional linear operator associated with the full nonlinear system. These modes, referred to as Koopman modes, are associated with a particular observable, and may be determined directly from data (either numerical or experimental) using a variant of a standard Arnoldi method. They have an associated temporal frequency and growth rate and may be viewed as a nonlinear generalization of global eigenmodes of a linearized system. They provide an alternative to proper orthogonal decomposition, and in the case of periodic data the Koopman modes reduce to a discrete temporal Fourier transform. The Arnoldi method used for computations is identical to the dynamic mode decomposition recently proposed by Schmid & Sesterhenn (Sixty-First Annual Meeting of the APS Division of Fluid Dynamics, 2008), so dynamic mode decomposition can be thought of as an algorithm for finding Koopman modes. We illustrate the method on an example of a jet in crossflow, and show that the method captures the dominant frequencies and elucidates the associated spatial structures.
A method, called the Eigensystem Realization Algorithm (ERA), is developed for modal parameter identification and model reduction of dynamic systems from test data. A new approach is introduced in conjunction with the singular value decomposition technique to derive the basic formulation of minimum order realization which is an extended version of the Ho-Kalman algorithm. The basic formulation is then transformed into modal space for modal parameter identification. Two accuracy indicators are developed to quantitatively identify the system modes and noise modes. For illustration of the algorithm, examples are shown using simulation data and experimental data for a rectangular grid structure.
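ERA builds block-Hankel matrices from impulse-response (Markov) parameters and truncates their SVD to obtain a minimal realization. A compact NumPy sketch follows; the indexing conventions and rank choice are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def era(markov_params, r, s=None):
    """Minimal ERA sketch: markov_params[k] is the Markov parameter C A^k B as a
    (p x m) array. Returns a discrete-time realization (A, B, C) of order r."""
    s = s or (len(markov_params) - 1) // 2
    H0 = np.block([[markov_params[i + j] for j in range(s)] for i in range(s)])
    H1 = np.block([[markov_params[i + j + 1] for j in range(s)] for i in range(s)])
    U, sig, Vh = np.linalg.svd(H0, full_matrices=False)
    U, sig, Vh = U[:, :r], sig[:r], Vh[:r, :]
    S_half = np.diag(np.sqrt(sig))
    S_half_inv = np.diag(1.0 / np.sqrt(sig))
    A = S_half_inv @ U.T @ H1 @ Vh.T @ S_half_inv       # shifted-Hankel step
    p, m = markov_params[0].shape
    B = (S_half @ Vh)[:, :m]                             # first block column
    C = (U @ S_half)[:p, :]                              # first block row
    return A, B, C
```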
The ability to discover physical laws and governing equations from data is one of humankind's greatest intellectual achievements. A quantitative understanding of dynamic constraints and balances in nature has facilitated rapid development of knowledge and enabled advanced technological achievements, including aircraft, combustion engines, satellites, and electrical power. In this work, we combine sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing physical equations from measurement data. The only assumption about the structure of the model is that there are only a few important terms that govern the dynamics, so that the equations are sparse in the space of possible functions; this assumption holds for many physical systems. In particular, we use sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data. The resulting models are parsimonious, balancing model complexity with descriptive ability while avoiding overfitting. We demonstrate the algorithm on a wide range of problems, from simple canonical systems, including linear and nonlinear oscillators and the chaotic Lorenz system, to the fluid vortex shedding behind an obstacle. The fluid example illustrates the ability of this method to discover the underlying dynamics of a system that took experts in the community nearly 30 years to resolve. We also show that this method generalizes to parameterized, time-varying, or externally forced systems.
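The sparse-regression step can be illustrated with a polynomial candidate library and sequential thresholded least squares, the simplest variant of the approach described above; the library choice and threshold below are illustrative assumptions.

```python
import numpy as np

def sindy(X, Xdot, threshold=0.1, n_iter=10):
    """Sketch of sparse identification of nonlinear dynamics (SINDy).
    X: (n_samples, n_states) states, Xdot: matching time derivatives."""
    def library(X):
        cols = [np.ones((X.shape[0], 1)), X]                    # constant + linear terms
        n = X.shape[1]
        cols += [(X[:, i] * X[:, j])[:, None]                   # quadratic terms
                 for i in range(n) for j in range(i, n)]
        return np.hstack(cols)

    Theta = library(X)
    Xi = np.linalg.lstsq(Theta, Xdot, rcond=None)[0]            # initial dense fit
    for _ in range(n_iter):                                     # sequential thresholding
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(Xdot.shape[1]):                          # refit surviving terms
            big = ~small[:, k]
            if big.any():
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], Xdot[:, k], rcond=None)[0]
    return Xi   # sparse coefficients: Xdot ~= Theta(X) @ Xi
```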
Researchers propose sparse regression for identifying governing partial differential equations for spatiotemporal systems.
Deep neural networks have enabled progress in a wide variety of applications. Growing the size of the neural network typically results in improved accuracy. As model sizes grow, the memory and compute requirements for training these models also increases. We introduce a technique to train deep neural networks using half precision floating point numbers. In our technique, weights, activations and gradients are stored in IEEE half-precision format. Half-precision floating numbers have limited numerical range compared to single-precision numbers. We propose two techniques to handle this loss of information. Firstly, we recommend maintaining a single-precision copy of the weights that accumulates the gradients after each optimizer step. This single-precision copy is rounded to half-precision format during training. Secondly, we propose scaling the loss appropriately to handle the loss of information with half-precision gradients. We demonstrate that this approach works for a wide variety of models including convolution neural networks, recurrent neural networks and generative adversarial networks. This technique works for large scale models with more than 100 million parameters trained on large datasets. Using this approach, we can reduce the memory consumption of deep learning models by nearly 2x. In future processors, we can also expect a significant computation speedup using half-precision hardware units.
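In PyTorch, the ingredients described above (half-precision forward/backward passes, a float32 master copy of the weights, and loss scaling) are packaged in the AMP utilities. The sketch below uses those utilities rather than reproducing the paper's exact recipe, assumes a CUDA device, and uses placeholder model and data.

```python
import torch

model = torch.nn.Linear(512, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scaler = torch.cuda.amp.GradScaler()              # dynamic loss scaling

for x, y in [(torch.randn(32, 512).cuda(), torch.randint(0, 10, (32,)).cuda())]:
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():               # forward pass in reduced precision
        loss = torch.nn.functional.cross_entropy(model(x), y)
    scaler.scale(loss).backward()                 # scale loss before backward
    scaler.step(optimizer)                        # unscale and apply fp32 update
    scaler.update()
```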
We introduce physics informed neural networks -- neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this two part treatise, we present our developments in the context of solving two main classes of problems: data-driven solution and data-driven discovery of partial differential equations. Depending on the nature and arrangement of the available data, we devise two distinct classes of algorithms, namely continuous time and discrete time models. The resulting neural networks form a new class of data-efficient universal function approximators that naturally encode any underlying physical laws as prior information. In this first part, we demonstrate how these networks can be used to infer solutions to partial differential equations, and obtain physics-informed surrogate models that are fully differentiable with respect to all input coordinates and free parameters.
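The essential mechanism of a PINN is a loss built from the differential-equation residual, evaluated by automatic differentiation, plus boundary or initial terms. The sketch below illustrates this on a deliberately trivial one-dimensional problem (u'(x) = cos(x) with u(0) = 0), which is our stand-in rather than an example from the paper.

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x = torch.linspace(0.0, 3.14, 100).reshape(-1, 1).requires_grad_(True)  # collocation points
x0 = torch.zeros(1, 1)                                                  # boundary point u(0) = 0

for step in range(2000):
    opt.zero_grad()
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    residual = du - torch.cos(x)                            # equation residual via autodiff
    loss = (residual ** 2).mean() + (net(x0) ** 2).mean()   # physics loss + boundary loss
    loss.backward()
    opt.step()
```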
Significance: Partial differential equations (PDEs) are among the most ubiquitous tools used in modeling problems in nature. However, solving high-dimensional PDEs has been notoriously difficult due to the “curse of dimensionality.” This paper introduces a practical algorithm for solving nonlinear PDEs in very high (hundreds and potentially thousands of) dimensions. Numerical results suggest that the proposed algorithm is quite effective for a wide variety of problems, in terms of both accuracy and speed. We believe that this opens up a host of possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their interrelationships.
The choice of approximate posterior distribution is one of the core problems in variational inference. Most applications of variational inference employ simple families of posterior approximations in order to allow for efficient inference, focusing on mean-field or other simple structured approximations. This restriction has a significant impact on the quality of inferences made using variational methods. We introduce a new approach for specifying flexible, arbitrarily complex and scalable approximate posterior distributions. Our approximations are distributions constructed through a normalizing flow, whereby a simple initial density is transformed into a more complex one by applying a sequence of invertible transformations until a desired level of complexity is attained. We use this view of normalizing flows to develop categories of finite and infinitesimal flows and provide a unified view of approaches for constructing rich posterior approximations. We demonstrate that the theoretical advantages of having posteriors that better match the true posterior, combined with the scalability of amortized variational approaches, provides a clear improvement in performance and applicability of variational inference.
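A single planar-flow layer is enough to show the change-of-variables bookkeeping behind a normalizing flow: transform the sample and accumulate the log-determinant of the Jacobian. The sketch below omits the constraint on u that guarantees invertibility and is illustrative only.

```python
import torch

class PlanarFlow(torch.nn.Module):
    """One planar-flow layer f(z) = z + u * tanh(w^T z + b)."""
    def __init__(self, dim):
        super().__init__()
        self.u = torch.nn.Parameter(torch.randn(dim) * 0.1)
        self.w = torch.nn.Parameter(torch.randn(dim) * 0.1)
        self.b = torch.nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w + self.b                              # (batch,)
        f = z + self.u * torch.tanh(lin).unsqueeze(-1)         # transformed samples
        # log|det df/dz| = log|1 + u^T psi| with psi = (1 - tanh^2) w
        psi = (1 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f, log_det

z = torch.randn(5, 2)                # samples from a simple base density
z_new, log_det = PlanarFlow(2)(z)    # transformed samples and Jacobian correction
```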
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
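A continuous-depth layer can be sketched by integrating a learned vector field with any ODE solver. The fixed-step RK4 below simply backpropagates through the solver operations, whereas the paper's contribution is the memory-efficient adjoint method; dimensions and step counts are illustrative assumptions.

```python
import torch

class ODEFunc(torch.nn.Module):
    """Parameterizes the hidden-state derivative dh/dt = f(h, t; theta)."""
    def __init__(self, dim):
        super().__init__()
        self.net = torch.nn.Sequential(torch.nn.Linear(dim, 64), torch.nn.Tanh(),
                                       torch.nn.Linear(64, dim))
    def forward(self, t, h):
        return self.net(h)

def rk4_integrate(f, h0, t0, t1, steps=20):
    """Fixed-step RK4 solver; gradients flow through the solver operations."""
    h, t = h0, t0
    dt = (t1 - t0) / steps
    for _ in range(steps):
        k1 = f(t, h)
        k2 = f(t + dt / 2, h + dt * k1 / 2)
        k3 = f(t + dt / 2, h + dt * k2 / 2)
        k4 = f(t + dt, h + dt * k3)
        h = h + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t = t + dt
    return h

func = ODEFunc(dim=2)
h1 = rk4_integrate(func, torch.randn(8, 2), 0.0, 1.0)   # "continuous-depth" forward pass
```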
For centuries, flow visualization has been the art of making fluid motion visible in physical and biological systems. Although such flow patterns can be, in principle, described by the Navier-Stokes equations, extracting the velocity and pressure fields directly from the images is challenging. We addressed this problem by developing hidden fluid mechanics (HFM), a physics-informed deep-learning framework capable of encoding the Navier-Stokes equations into the neural networks while being agnostic to the geometry or the initial and boundary conditions. We demonstrate HFM for several physical and biomedical problems by extracting quantitative information for which direct measurements may not be possible. HFM is robust to low resolution and substantial noise in the observation data, which is important for potential applications.
Deep learning has achieved remarkable success in diverse applications; however, its use in solving partial differential equations (PDEs) has emerged only recently. Here, we present an overview of physics-informed neural networks (PINNs), which embed a PDE into the loss of the neural network using automatic differentiation. The PINN algorithm is simple, and it can be applied to different types of PDEs, including integro-differential equations, fractional PDEs, and stochastic PDEs. Moreover, from an implementation point of view, PINNs solve inverse problems as easily as forward problems. We propose a new residual-based adaptive refinement (RAR) method to improve the training efficiency of PINNs. For pedagogical reasons, we compare the PINN algorithm to a standard finite element method. We also present a Python library for PINNs, DeepXDE, which is designed to serve both as an educational tool to be used in the classroom as well as a research tool for solving problems in computational science and engineering. Specifically, DeepXDE can solve forward problems given initial and boundary conditions, as well as inverse problems given some extra measurements. DeepXDE supports complex-geometry domains based on the technique of constructive solid geometry and enables the user code to be compact, resembling closely the mathematical formulation. We introduce the usage of DeepXDE and its customizability, and we also demonstrate the capability of PINNs and the user-friendliness of DeepXDE for five different examples. More broadly, DeepXDE contributes to the more rapid development of the emerging scientific machine learning field.
We consider the frequency domain form of proper orthogonal decomposition (POD), called spectral proper orthogonal decomposition (SPOD). Spectral POD is derived from a space–time POD problem for statistically stationary flows and leads to modes that each oscillate at a single frequency. This form of POD goes back to the original work of Lumley (Stochastic Tools in Turbulence, Academic Press, 1970), but has been overshadowed by a space-only form of POD since the 1990s. We clarify the relationship between these two forms of POD and show that SPOD modes represent structures that evolve coherently in space and time, while space-only POD modes in general do not. We also establish a relationship between SPOD and dynamic mode decomposition (DMD); we show that SPOD modes are in fact optimally averaged DMD modes obtained from an ensemble DMD problem for stationary flows. Accordingly, SPOD modes represent structures that are dynamic in the same sense as DMD modes but also optimally account for the statistical variability of turbulent flows. Finally, we establish a connection between SPOD and resolvent analysis. The key observation is that the resolvent-mode expansion coefficients must be regarded as statistical quantities to ensure convergent approximations of the flow statistics. When the expansion coefficients are uncorrelated, we show that SPOD and resolvent modes are identical. Our theoretical results and the overall utility of SPOD are demonstrated using two example problems: the complex Ginzburg–Landau equation and a turbulent jet.
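SPOD is typically estimated Welch-style: segment the time series into windowed, overlapping blocks, Fourier-transform each block, and solve a small POD eigenproblem at every frequency. The block bookkeeping in the sketch below is approximate and purely illustrative, not the algorithm from the paper.

```python
import numpy as np

def spod(Q, n_blocks=8, dt=1.0):
    """Rough SPOD sketch: Q is (n_space, n_time); returns frequencies, modes and
    energies per frequency from a Welch-style blocked estimate."""
    n_space, n_time = Q.shape
    nfft = n_time // (n_blocks // 2 + 1)                 # block length, ~50% overlap
    window = np.hanning(nfft)
    starts = np.arange(0, n_time - nfft + 1, nfft // 2)
    Qhat = np.stack([np.fft.rfft(Q[:, s:s + nfft] * window, axis=1)
                     for s in starts])                   # (blocks, space, freq)
    freqs = np.fft.rfftfreq(nfft, d=dt)
    modes, energies = [], []
    for k in range(len(freqs)):
        Qk = Qhat[:, :, k].T / np.sqrt(len(starts))      # (space, blocks) at one frequency
        M = Qk.conj().T @ Qk                             # small cross-spectral matrix
        lam, theta = np.linalg.eigh(M)
        order = np.argsort(lam)[::-1]
        Phi = Qk @ theta[:, order] / np.sqrt(np.maximum(lam[order], 1e-12))
        modes.append(Phi); energies.append(lam[order])
    return freqs, modes, energies
```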
Originally introduced in the fluid mechanics community, dynamic mode decomposition (DMD) has emerged as a powerful tool for analyzing the dynamics of nonlinear systems. However, existing DMD theory deals primarily with sequential time series for which the measurement dimension is much larger than the number of measurements taken. We present a theoretical framework in which we define DMD as the eigendecomposition of an approximating linear operator. This generalizes DMD to a larger class of datasets, including nonsequential time series. We demonstrate the utility of this approach by presenting novel sampling strategies that increase computational efficiency and mitigate the effects of noise, respectively. We also introduce the concept of linear consistency, which helps explain the potential pitfalls of applying DMD to rank-deficient datasets, illustrating with examples. Such computations are not considered in the existing literature, but can be understood using our more general framework. In addition, we show that our theory strengthens the connections between DMD and Koopman operator theory. It also establishes connections between DMD and other techniques, including the eigensystem realization algorithm (ERA), a system identification method, and linear inverse modeling (LIM), a method from climate science. We show that under certain conditions, DMD is equivalent to LIM.
Article metadata only (abstract unavailable). Keywords: deep learning, differential equations, optimization, stiff dynamics, computational physics. Published by the Society for Industrial and Applied Mathematics; submitted 12 February 2020, published online 9 September 2021.
Physics-Informed Neural Networks (PINN) are neural networks (NNs) that encode model equations, like Partial Differential Equations (PDE), as a component of the neural network itself. PINNs are nowadays used to solve PDEs, fractional equations, integral-differential equations, and stochastic PDEs. This novel methodology has arisen as a multi-task learning framework in which a NN must fit observed data while reducing a PDE residual. This article provides a comprehensive review of the literature on PINNs; the primary goal of the study is to characterize these networks and their related advantages and disadvantages. The review also attempts to incorporate publications on a broader range of collocation-based physics-informed neural networks, which start from the vanilla PINN, as well as many other variants, such as physics-constrained neural networks (PCNN), variational hp-VPINN, and conservative PINN (CPINN). The study indicates that most research has focused on customizing the PINN through different activation functions, gradient optimization techniques, neural network structures, and loss function structures. Despite the wide range of applications for which PINNs have been used, by demonstrating their ability to be more feasible in some contexts than classical numerical techniques like the Finite Element Method (FEM), advancements are still possible, most notably theoretical issues that remain unresolved.
Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
The accurate quantification of wall-shear stress dynamics is of substantial importance for various applications in fundamental and applied research, spanning areas from human health to aircraft design and optimization. Despite significant progress in experimental measurement techniques and postprocessing algorithms, temporally resolved wall-shear stress fields with adequate spatial resolution and within a suitable spatial domain remain an elusive goal. Furthermore, there is a systematic lack of universal models that can accurately replicate the instantaneous wall-shear stress dynamics in numerical simulations of multiscale systems where direct numerical simulations (DNSs) are prohibitively expensive. To address these gaps, we introduce a deep learning architecture that ingests wall-parallel streamwise velocity fields at $y^+ \approx 3.9 \sqrt {Re_\tau }$ of turbulent wall-bounded flows and outputs the corresponding two-dimensional streamwise wall-shear stress fields with identical spatial resolution and domain size. From a physical perspective, our framework acts as a surrogate model encapsulating the various mechanisms through which highly energetic outer-layer flow structures influence the governing wall-shear stress dynamics. The network is trained in a supervised fashion on a unified dataset comprising DNSs of statistically one-dimensional turbulent channel and spatially developing turbulent boundary layer flows at friction Reynolds numbers ranging from $390$ to $1500$. We demonstrate a zero-shot applicability to experimental velocity fields obtained from particle image velocimetry measurements and verify the physical accuracy of the wall-shear stress estimates with synchronized wall-shear stress measurements using the micro-pillar shear-stress sensor for Reynolds numbers up to $2000$. In summary, the presented framework lays the groundwork for extracting inaccessible experimental wall-shear stress information from readily available velocity measurements and thus, facilitates advancements in a variety of experimental applications.
Model predictions are important to assess the subsurface state distributions (such as the stress), which are essential for, for instance, determining the location of potential nuclear waste disposal sites. Providing these predictions with quantified uncertainties often requires a large number of simulations, which is difficult due to the high CPU time needed. One possibility for addressing the computational burden is to use surrogate models. Purely data-driven approaches face challenges when operating in data-sparse application fields such as geomechanical modeling or producing interpretable models. The latter aspect is critical for applications such as nuclear waste disposal, where it is essential to provide trustworthy predictions. To overcome the challenge of trustworthiness, we propose the usage of a novel hybrid machine learning method, namely the non-intrusive reduced-basis method, as a surrogate model. This method resolves both of the above challenges while being orders of magnitude faster than classical finite element simulations. In the paper, we demonstrate the usage of the non-intrusive reduced-basis method for 3-D geomechanical–numerical modeling with a comprehensive sensitivity assessment. The usage of these surrogate geomechanical models yields a speed-up of 6 orders of magnitude while maintaining global errors in the range of less than 0.01 %. Because of this enormous reduction in computation time, computationally demanding methods such as global sensitivity analyses, which provide valuable information about the contribution of the various model parameters to stress variability, become feasible. The opportunities of these added benefits are demonstrated with a benchmark example and a simplified study for a siting region for a potential nuclear waste repository in Nördlich Lägern (Switzerland).
Diffusion probabilistic models (DPMs) have achieved impressive success in high-resolution image synthesis, especially in recent large-scale text-to-image generation applications. An essential technique for improving the sample quality of DPMs is guided sampling, which usually needs a large guidance scale to obtain the best sample quality. The commonly-used fast sampler for guided sampling is denoising diffusion implicit models (DDIM), a first-order diffusion ordinary differential equation (ODE) solver that generally needs 100 to 250 steps for high-quality samples. Although recent works propose dedicated high-order solvers and achieve a further speedup for sampling without guidance, their effectiveness for guided sampling has not been well-tested before. In this work, we demonstrate that previous high-order fast samplers suffer from instability issues, and they even become slower than DDIM when the guidance scale grows larger. To further speed up guided sampling, we propose DPM-Solver++, a high-order solver for the guided sampling of DPMs. DPM-Solver++ solves the diffusion ODE with the data prediction model and adopts thresholding methods to keep the solution matched to the training data distribution. We further propose a multistep variant of DPM-Solver++ to address the instability issue by reducing the effective step size. Experiments show that DPM-Solver++ can generate high-quality samples within only 15 to 20 steps for guided sampling by pixel-space and latent-space DPMs.
This paper presents a novel machine learning framework for reconstructing low-order gust-encounter flow field and lift coefficients from sparse, noisy surface pressure measurements. Our study thoroughly investigates the time-varying response of sensors to gust–airfoil interactions, uncovering valuable insights into optimal sensor placement. To address uncertainties in deep learning predictions, we implement probabilistic regression strategies to model both epistemic and aleatoric uncertainties. Epistemic uncertainty, reflecting the model’s confidence in its predictions, is modelled using Monte Carlo dropout – as an approximation to the variational inference in the Bayesian framework – treating the neural network as a stochastic entity. On the other hand, aleatoric uncertainty, arising from noisy input measurements, is captured via learned statistical parameters, propagating measurement noise through the network into the final predictions. Our results showcase the efficacy of this dual uncertainty quantification strategy in accurately predicting aerodynamic behaviour under extreme conditions while maintaining computational efficiency, underscoring its potential to improve online sensor-based flow estimation in real-world applications.
This study is a collaborative effort within the NATO Science & Technology Organization, bringing together multiple institutions to advance reduced-order modeling. Aerodynamic reduced-order models were developed using two pseudorandom binary sequence (PRBS) training maneuvers, where the angle of attack and pitch rate varied in a periodic, deterministic manner with white-noise-like properties. The first maneuver maintained a constant Mach number of 0.85, while the second varied Mach from 0.1 to 0.9. The test case involved a generic triple-delta wing, simulated using the DoD HPCMP CREATE™-AV/Kestrel tools. Prescribed-body motion was used to vary input parameters under given freestream conditions. The resulting models predicted static and stability derivatives across different angles of attack and Mach numbers. They were also used to predict aerodynamic responses to arbitrary motions, including sinusoidal, chirp, Schroeder, and step inputs, showing good agreement with full-order data. Additionally, models predicting surface pressure accurately captured upper surface pressures across different spanwise and chordwise locations for both static and dynamic conditions.


2025-06-23
Benjamin Boutin , Pierre Le Barbenchon , Nicolas Seguin | Annales de la faculté des sciences de Toulouse Mathématiques
In this paper, we present a numerical strategy to check the strong stability (or GKS-stability) of one-step explicit finite difference schemes for the one-dimensional advection equation with an inflow boundary condition. The strong stability is studied using the Kreiss–Lopatinskii theory. We introduce a new tool, the intrinsic Kreiss–Lopatinskii determinant, which possesses the same regularity as the vector bundle of discrete stable solutions. By applying standard results of complex analysis to this determinant, we are able to relate the strong stability of numerical schemes to the computation of a winding number, which is robust and cheap. The study is illustrated with the O3 scheme and the fifth-order Lax–Wendroff (LW5) scheme together with a reconstruction procedure at the boundary.
Leonardo Lucio Custode , Giovanni Iacca , Ivanoe De Falco +2 more | ACM Transactions on Evolutionary Learning and Optimization
In the past few years, Federated Learning (FL) has emerged as an effective approach for training Neural Networks (NNs) over a computing network while preserving data privacy. Most existing FL approaches require defining a priori 1) a predefined structure for all the NNs running on the clients and 2) an explicit aggregation procedure. These can be limiting factors in cases where pre-defining such algorithmic details is difficult. Recently, NEvoFed was proposed, an FL method that leverages Neuroevolution running on the clients, in which the NN structures are heterogeneous and the aggregation is implicitly accomplished on the client side. Here, we propose MFC-NEvoFed, a novel approach to FL that does not require learning models, i.e., neural network parameters, to be distributed over the networks, thus taking a step towards security improvement. The only information exchanged in client/server communication is the performance of each model on local data, allowing the emergence of optimal NN architectures without needing any kind of model aggregation. Another appealing feature of our framework is that it can be used with any Machine Learning algorithm provided that, during the learning phase, the model updates do not depend on the input data. To assess the validity of MFC-NEvoFed, we test it on four datasets, showing that very compact NNs can be obtained without drops in performance compared to canonical FL. Finally, such compact structures allow for a step towards explainability, which is highly desirable in domains such as digital health, from which the tested datasets come.
Seong-Joon Yoo , Moon Gi Kang , Heonjun Yoon +1 more | International Journal of Precision Engineering and Manufacturing
A good choice of hyperparameters is crucial for the successful application of Deep Learning (DL) networks in order to find accurate solutions or the best parameter in solving Partial Differential Equations (PDEs), which are sensitive to errors in coefficient estimation. For this purpose, Hyperparameter Optimization of Multi-Output Physics-Informed Neural Networks (HOMO-PINNs) is based on the optimal search of PINN hyperparameters for solving PDEs with uncertain coefficients in the Uncertainty Quantification (UQ) field. By testing this novel methodology on different PDEs, the relationship between activation functions, the number of output neurons, and the degree of coefficient uncertainty can be observed. The experimental results show that adding output neurons to the Neural Network (NN), even if a theoretically incorrect activation function is chosen, keeps the predicted solution accurate.
Solving constrained optimal control problems (OCPs) is essential to ensure safety in real-world scenarios. Recent machine learning techniques have shown promise in addressing OCPs. This paper introduces a novel methodology for solving OCPs with path constraints using physics-informed neural networks. Specifically, Pontryagin neural networks (PoNNs), which solve the boundary value problem arising from the indirect method and the Pontryagin minimum principle, are extended to handle path constraints. In this new formulation, path constraints are incorporated into the Hamiltonian through additional Lagrange multipliers, which are treated as optimization variables. The complementary slackness conditions are enforced by driving the Fischer–Burmeister function to zero within the loss functions to be minimized. This approach adds minimal complexity to the original PoNN framework, as it avoids the need for continuation methods, penalty functions, or additional differential equations, which are often required in traditional methods to solve path-constrained OCPs via the indirect method. Numerical results for a fixed-time energy-optimal rendezvous with various path constraints and a constrained optimal rocket ascent demonstrate the effectiveness of the proposed method in solving path-constrained OCPs.
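The complementarity conditions g(x) <= 0, mu >= 0, mu * g(x) = 0 can be encoded with the Fischer–Burmeister function phi(a, b) = a + b - sqrt(a^2 + b^2), which vanishes exactly when a >= 0, b >= 0, and a*b = 0. A minimal sketch of the resulting residual term is shown below; how this residual is weighted inside the PoNN loss is a detail of the paper that the sketch does not attempt to reproduce.

```python
import numpy as np

def fischer_burmeister(a, b):
    """phi(a, b) = a + b - sqrt(a^2 + b^2); zero iff a >= 0, b >= 0 and a*b = 0."""
    return a + b - np.sqrt(a**2 + b**2)

def complementarity_residual(mu, g):
    """Residual enforcing mu >= 0, g <= 0 and mu * g = 0 for a path constraint
    g(x(t)) <= 0 with multiplier mu(t), both sampled on the time grid."""
    # Complementarity between the multiplier mu and the slack -g.
    return np.sum(fischer_burmeister(mu, -g) ** 2)
```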
Neural Ordinary Differential Equations (ODEs) represent a significant advancement at the intersection of machine learning and dynamical systems, offering a continuous-time analog to discrete neural networks. Despite their promise, deploying neural ODEs in practical applications often encounters the challenge of stiffness, a condition where rapid variations in some components of the solution demand prohibitively small time steps for explicit solvers. This work addresses the stiffness issue when employing neural ODEs for model order reduction by introducing a suitable reparametrization in time. The considered map is data-driven, and it is induced by the adaptive time-stepping of an implicit solver on a reference solution. We show that the map produces a non-stiff system that can be cheaply solved with an explicit time integration scheme. The original, stiff, time dynamic is recovered by means of a map learnt by a neural network that connects the state space to the time reparametrization. We validate our method through extensive experiments, demonstrating improvements in efficiency for the neural ODE inference while maintaining robustness and accuracy when compared to an implicit solver applied to the stiff system with the original right-hand side.
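One way to realize such a data-driven reparametrization is to let the accepted steps of an implicit reference solve define the new clock: the pseudo-time advances by one unit per accepted step, so regions where the solver takes tiny steps are stretched out. The sketch below illustrates this under that assumption, using SciPy's BDF integrator on a classic stiff test problem; the paper's actual construction and the neural network that learns the state-to-time map are not reproduced here.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import interp1d

def van_der_pol(t, y, mu=1000.0):
    """Stiff Van der Pol oscillator used as a reference problem."""
    return [y[1], mu * (1.0 - y[0] ** 2) * y[1] - y[0]]

# Reference solve with an adaptive implicit method (BDF); the accepted time
# points sol.t cluster exactly where the dynamics are stiff.
sol = solve_ivp(van_der_pol, (0.0, 3000.0), [2.0, 0.0], method="BDF", rtol=1e-6)

# Pseudo-time tau: one unit per accepted step, i.e. tau_i = i.
tau = np.arange(sol.t.size, dtype=float)

# Monotone maps t(tau) and tau(t); in the reparametrized variable the
# trajectory varies on an O(1) scale and can be integrated explicitly.
t_of_tau = interp1d(tau, sol.t)
tau_of_t = interp1d(sol.t, tau)
```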
Proper orthogonal decomposition (POD) is a widely used linear dimensionality reduction technique, but it often fails to capture critical features in complex nonlinear flows. In contrast, clustering methods are effective for nonlinear feature extraction, yet their use for dimensionality reduction is hindered by unstable cluster initialization and inefficient mode sorting. To address these issues, we propose a clustering-based dimensionality reduction method guided by POD structures (C-POD), which uses POD preprocessing to stabilize the selection of cluster centers. Additionally, we introduce an entropy-controlled Euclidean-to-probability mapping (ECEPM) method to improve modal sorting and assess mode importance. The C-POD approach is evaluated using the one-dimensional Burgers’ equation and a two-dimensional cylinder wake flow. Results show that C-POD achieves higher accuracy in dimensionality reduction than POD. Its dominant modes capture more temporal dynamics, while higher-order modes offer better physical interpretability. When solving an inverse problem using sparse sensor data, the Gappy C-POD method improves reconstruction accuracy by 19.75% and raises the lower bound of reconstruction capability by 13.4% compared to Gappy POD. Overall, C-POD demonstrates strong potential for modeling and reconstructing complex nonlinear flow fields, providing a valuable tool for dimensionality reduction in fluid dynamics.
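The stabilization idea can be sketched as follows: compute POD modes from the snapshot matrix by SVD, then seed the clustering with centers derived from the leading POD structures instead of random snapshots. The sketch below uses scikit-learn's KMeans with explicit initial centers and a simple deterministic seeding rule assumed for illustration; the paper's exact seeding, the entropy-controlled Euclidean-to-probability mapping, and the Gappy reconstruction are not reproduced.

```python
import numpy as np
from sklearn.cluster import KMeans

def pod_guided_clustering(snapshots, n_clusters):
    """snapshots: array of shape (n_dof, n_snapshots).

    POD modes come from the thin SVD of the mean-subtracted snapshot matrix;
    initial cluster centers are low-rank POD reconstructions of the snapshots
    where the leading temporal coefficients peak."""
    mean = snapshots.mean(axis=1, keepdims=True)
    X = snapshots - mean
    U, s, Vt = np.linalg.svd(X, full_matrices=False)   # columns of U are POD modes
    coeffs = s[:, None] * Vt                           # temporal coefficients
    # For each of the first n_clusters modes, take the snapshot where that
    # mode's coefficient is largest, and use its rank-n_clusters reconstruction.
    seed_ids = [int(np.argmax(np.abs(coeffs[k]))) for k in range(n_clusters)]
    centers = (U[:, :n_clusters] @ coeffs[:n_clusters, seed_ids]).T + mean.T
    km = KMeans(n_clusters=n_clusters, init=centers, n_init=1).fit(snapshots.T)
    return U, km
```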
The incorporation of machine learning (ML) in aircraft engineering has transformed the design, analysis, and operation of intricate aerospace systems. This study examines the present and developing applications of machine learning techniques in critical domains like aircraft design optimisation, defect detection and diagnostics, flight control systems, and predictive maintenance. Utilising extensive information from simulations, sensors, and real-time operations, machine learning models facilitate more efficient decision-making, improved system reliability, and decreased operational costs. Moreover, progress in deep learning, reinforcement learning, and neural networks is being progressively utilised for applications ranging from aerodynamic modelling to autonomous flight control. This study emphasises the difficulties related to data quality, interpretability, and model validation in safety-critical aircraft contexts.
The integration of machine learning (ML) with computational fluid dynamics (CFD) marks a significant advancement in the simulation and analysis of fluid flows. This chapter explores the synergistic role of machine learning in enhancing CFD methodologies, focusing on applications in modeling, optimization, and real-time analysis. Machine learning algorithms, particularly deep learning, offer powerful tools for identifying patterns and correlations within large datasets generated by CFD simulations. These algorithms can be trained to predict fluid behavior, accelerate simulation processes, and improve the accuracy of models by learning from empirical data. In modeling, ML techniques reduce the reliance on traditional empirical models, offering more precise and computationally efficient alternatives. Furthermore, ML-driven optimization techniques enhance the design process of fluid systems by enabling rapid evaluation of multiple design variables. Real-time data processing and analysis facilitated by ML also support adaptive control and decision-making in dynamic fluid environments.
A structure-preserving kernel ridge regression method is presented that allows the recovery of nonlinear Hamiltonian functions from datasets made of noisy observations of Hamiltonian vector fields. The method admits a closed-form solution and yields excellent numerical performance, surpassing other techniques proposed in the literature for this setup. From the methodological point of view, the paper extends kernel regression methods to problems in which loss functions involving linear functions of gradients are required; in particular, a differential reproducing property and a Representer Theorem are proved in this context. The relation between the structure-preserving kernel estimator and the Gaussian posterior mean estimator is analyzed. A full error analysis is conducted that provides convergence rates using fixed and adaptive regularization parameters. The good performance of the proposed estimator, together with the convergence rates, is illustrated with various numerical experiments.
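The basic mechanism can be illustrated with a naive sketch: model H as a kernel expansion H(x) = sum_j alpha_j k(x_j, x), convert the observed Hamiltonian vector field X_H = J grad H back into gradient targets, and fit alpha by ridge-regularized least squares on those gradients. The code below uses a Gaussian kernel and a plain normal-equations solve; it shows the idea of regressing on linear functions of gradients, not the paper's structure-preserving estimator or its error analysis.

```python
import numpy as np

def gaussian_kernel_grad(x, xj, sigma=1.0):
    """Gradient w.r.t. x of k(xj, x) = exp(-||x - xj||^2 / (2 sigma^2))."""
    diff = x - xj
    return -diff / sigma**2 * np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))

def fit_hamiltonian(points, vector_fields, sigma=1.0, lam=1e-6):
    """points: (n, d) samples with d even (canonical (q, p) coordinates);
    vector_fields: (n, d) noisy observations of X_H at those points.

    Returns alpha with H(x) ~ sum_j alpha_j k(points[j], x) up to a constant."""
    n, d = points.shape
    # Recover gradient targets: X_H = J grad H  =>  grad H = -J X_H, since J^{-1} = -J.
    J = np.block([[np.zeros((d // 2, d // 2)), np.eye(d // 2)],
                  [-np.eye(d // 2), np.zeros((d // 2, d // 2))]])
    grads = (-J @ vector_fields.T).T
    # Feature matrix G: rows i*d..(i+1)*d-1 hold grad_x k(points[j], x) at x = points[i].
    G = np.zeros((n * d, n))
    for i in range(n):
        for j in range(n):
            G[i * d:(i + 1) * d, j] = gaussian_kernel_grad(points[i], points[j], sigma)
    y = grads.reshape(-1)
    alpha = np.linalg.solve(G.T @ G + lam * np.eye(n), G.T @ y)
    return alpha
```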
In modeling signal transduction networks, it is common to manually integrate experimental evidence through a process that involves trial and error constrained by domain knowledge. We implement a genetic algorithm-based workflow (boolmore) to streamline Boolean model refinement. Boolmore adjusts the functions of the model to enhance agreement with a corpus of curated perturbation-observation pairs. It leverages existing mechanistic knowledge to automatically limit the search space to biologically plausible models. We demonstrate boolmore's effectiveness in a published plant signaling model that exemplifies the challenges of manual model construction and refinement. The refined models surpass the accuracy gain achieved over two years of manual revision and yield new, testable predictions. By automating the laborious task of model validation and refinement, this workflow is a step towards fast, fully automated, and reliable model construction.
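The core loop is a genetic algorithm whose fitness is agreement with curated perturbation-observation pairs. A toy sketch of that fitness evaluation is shown below; the Boolean update scheme, the encoding of candidate rules, and the biological-plausibility filters used by boolmore are simplified away.

```python
def simulate_boolean(rules, state, steps=50):
    """Synchronously update a Boolean network; rules maps node -> function(state) -> 0/1."""
    for _ in range(steps):
        state = {node: fn(state) for node, fn in rules.items()}
    return state

def clamp(rules, perturbation):
    """Hold perturbed nodes fixed at their experimental value (e.g. knock-out = 0)."""
    clamped = dict(rules)
    for node, value in perturbation.items():
        clamped[node] = lambda state, v=value: v
    return clamped

def fitness(rules, experiments):
    """Fraction of perturbation-observation pairs the candidate model reproduces.

    Each experiment is (perturbation, observation): dicts fixing some nodes and
    giving the expected final values of others."""
    hits = total = 0
    for perturbation, observation in experiments:
        final = simulate_boolean(clamp(rules, perturbation), {n: 0 for n in rules})
        for node, expected in observation.items():
            total += 1
            hits += int(final[node] == expected)
    return hits / max(total, 1)
```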
Hei Yin Lam, Gianluca Ceruti, Daniel Kressner | SIAM Journal on Matrix Analysis and Applications
This study proposes an index-based quantum neural network (QNN) model, built upon a variational quantum circuit (VQC), as a surrogate framework for the static analysis of truss structures. Unlike coordinate-based models, the proposed QNN uses discrete member and node indices as inputs, and it adopts a separate-domain strategy that partitions the structure for parallel training. This architecture reflects the way nature organizes and optimizes complex systems, thereby enhancing both flexibility and scalability. Independent quantum circuits are assigned to each separate domain, and a mechanics-informed loss function based on the force method is formulated within a Lagrangian dual framework to embed physical constraints directly into the training process. As a result, the model achieves high prediction accuracy and fast convergence, even under complex structural conditions with relatively few parameters. Numerical experiments on 2D and 3D truss structures show that the QNN reduces the number of parameters by up to 64% compared to conventional neural networks, while achieving higher accuracy. Even within the same QNN architecture, the separate-domain approach outperforms the single-domain model with a 6.25% reduction in parameters. The proposed index-based QNN model has demonstrated practical applicability for structural analysis and shows strong potential as a quantum-based numerical analysis tool for future applications in building structure optimization and broader engineering domains.
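To make the index-based input concrete, the sketch below encodes a (member, node) index pair into rotation angles of a small variational circuit whose expectation value serves as the surrogate output. It is written with PennyLane purely for illustration; the encoding rule is an assumption, and the separate-domain partitioning, force-method loss, and Lagrangian dual terms from the paper are not included.

```python
import pennylane as qml
from pennylane import numpy as np

n_qubits, n_layers = 4, 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def surrogate(index_features, weights):
    """index_features: rotation angles derived from discrete member/node indices."""
    qml.AngleEmbedding(index_features, wires=range(n_qubits))
    qml.StronglyEntanglingLayers(weights, wires=range(n_qubits))
    return qml.expval(qml.PauliZ(0))   # scalar surrogate output, e.g. a member force

def encode(member_id, node_id, n_members, n_nodes):
    """Map integer indices into [0, pi] rotation angles (one simple choice)."""
    return np.array([np.pi * member_id / n_members,
                     np.pi * node_id / n_nodes,
                     np.pi * member_id / n_members,
                     np.pi * node_id / n_nodes])

weights = np.random.uniform(0, 2 * np.pi,
                            qml.StronglyEntanglingLayers.shape(n_layers, n_qubits))
print(surrogate(encode(3, 7, n_members=10, n_nodes=8), weights))
```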