Convergence Properties of the Nelder--Mead Simplex Method in Low Dimensions

Type: Article
Publication Date: 1998-01-01
Citations: 7118
DOI: https://doi.org/10.1137/s1052623496303470

Abstract

The Nelder--Mead simplex algorithm, first published in 1965, is an enormously popular direct search method for multidimensional unconstrained minimization. Despite its widespread use, essentially no theoretical results have been proved explicitly for the Nelder--Mead algorithm. This paper presents convergence properties of the Nelder--Mead algorithm applied to strictly convex functions in dimensions 1 and 2. We prove convergence to a minimizer for dimension 1, and various limited convergence results for dimension 2. A counterexample of McKinnon gives a family of strictly convex functions in two dimensions and a set of initial conditions for which the Nelder--Mead algorithm converges to a nonminimizer. It is not yet known whether the Nelder--Mead method can be proved to converge to a minimizer for a more specialized class of convex functions in two dimensions.

Locations

  • SIAM Journal on Optimization
The Nelder-Mead algorithm, a longstanding direct search method for unconstrained optimization published in 1965, is designed to minimize a scalar-valued function f of n real variables using only function values, … The Nelder-Mead algorithm, a longstanding direct search method for unconstrained optimization published in 1965, is designed to minimize a scalar-valued function f of n real variables using only function values, without any derivative information. Each Nelder-Mead iteration is associated with a nondegenerate simplex defined by n+1 vertices and their function values; a typical iteration produces a new simplex by replacing the worst vertex by a new point. Despite the method's widespread use, theoretical results have been limited: for strictly convex objective functions of one variable with bounded level sets, the algorithm always converges to the minimizer; for such functions of two variables, the diameter of the simplex converges to zero, but examples constructed by McKinnon show that the algorithm may converge to a nonminimizing point. This paper considers the restricted Nelder-Mead algorithm, a variant that does not allow expansion steps. In two dimensions we show that, for any nondegenerate starting simplex and any twice-continuously differentiable function with positive definite Hessian and bounded level sets, the algorithm always converges to the minimizer. The proof is based on treating the method as a discrete dynamical system, and relies on several techniques that are non-standard in convergence proofs for unconstrained optimization.
The Nelder-Mead algorithm, a longstanding direct search method for unconstrained optimization published in 1965, is designed to minimize a scalar-valued function f of n real variables using only function values, … The Nelder-Mead algorithm, a longstanding direct search method for unconstrained optimization published in 1965, is designed to minimize a scalar-valued function f of n real variables using only function values, without any derivative information. Each Nelder-Mead iteration is asso- ciated with a nondegenerate simplex defined by n + 1 vertices and their function values; a typical iteration produces a new simplex by replacing the worst vertex by a new point. Despite the method's widespread use, theoretical results have been limited: for strictly convex objective functions of one variable with bounded level sets, the algorithm always converges to the minimizer; for such func- tions of two variables, the diameter of the simplex converges to zero but examples constructed by McKinnon show that the algorithm may converge to a nonminimizing point. This paper considers the restricted Nelder-Mead algorithm, a variant that does not allow expansion steps. In two dimen- sions we show that for any nondegenerate starting simplex and any twice-continuously differentiable function with positive definite Hessian and bounded level sets, the algorithm always converges to the minimizer. The proof is based on treating the method as a discrete dynamical system and relies on several techniques that are nonstandard in convergence proofs for unconstrained optimization.
Nelder-Mead method (NM) for solving continuous non-linear optimization problem is probably the most cited and the most used method in the optimization literature and in practical applications, too. It belongs … Nelder-Mead method (NM) for solving continuous non-linear optimization problem is probably the most cited and the most used method in the optimization literature and in practical applications, too. It belongs to the direct search methods, those which do not use the first and the second order derivatives. The popularity of NM is based on its simplicity. In this paper we propose even more simple algorithm for larger instances that follows NM idea. We call it Simplified NM (SNM): instead of generating all n + 1 simplex points in Rn, we perform search using just q + 1 vertices, where q is usually much smaller than n. Though the results cannot be better than after performing calculations in n+1 points as in NM, significant speed-up allows to run many times SNM from different starting solutions, usually getting better results than those obtained by NM within the same cpu time. Computational analysis is performed on 10 classical convex and non-convex instances, where the number of variables n can be arbitrarily large. The obtained results show that SNM is more effective than the original NM, confirming that LIMA yields good results when solving a continuous optimization problem.
We investigate and compare two versions of the Nelder–Mead simplex algorithm for function minimization. Two types of convergence are studied: the convergence of function values at the simplex vertices and … We investigate and compare two versions of the Nelder–Mead simplex algorithm for function minimization. Two types of convergence are studied: the convergence of function values at the simplex vertices and convergence of the simplex sequence. For the first type of convergence, we generalize the main result of Lagarias, Reeds, Wright and Wright (1998). For the second type of convergence, we also improve recent results which indicate that the Lagarias et al.’s version of the Nelder–Mead algorithm has better convergence properties than the original Nelder–Mead method. This paper concludes with some open questions.
The Nelder--Mead algorithm can stagnate and converge to a nonoptimal point, even for very simple problems. In this note we propose a test for sufficient decrease which, if passed for … The Nelder--Mead algorithm can stagnate and converge to a nonoptimal point, even for very simple problems. In this note we propose a test for sufficient decrease which, if passed for all iterations, will guarantee convergence of the Nelder--Mead iteration to a stationary point if the objective function is smooth and the diameters of the Nelder--Mead simplices converge to zero. Failure of this condition is an indicator of potential stagnation. As a remedy we propose a new step, which we call an oriented restart, that reinitializes the simplex to a smaller one with orthogonal edges whose orientation is determined by an approximate descent direction from the current best point. We also give results that apply when the objective function is a low-amplitude perturbation of a smooth function. We illustrate our results with some numerical examples.
The Nelder-Mead simplex method is a direct search algorithm that has found widespread use in the optimization of nonlinear functions. Originally designed for deterministic optimization, the method is robust with … The Nelder-Mead simplex method is a direct search algorithm that has found widespread use in the optimization of nonlinear functions. Originally designed for deterministic optimization, the method is robust with respect to small perturbations in the function's values; and therefore, this method has been used for optimizing stochastic functions as well. However, if the random perturbations in the function's values are large enough the method may terminate before reaching the optimizer of the expected function. We prove convergence of the simplex to a point with probability 1 for constant functions with additive noise for 1- and 2-dimensional functions and provide empirical evidence for the same result in higher dimensions. This result implies that as the amount of noise increases, differences between the expected function's values at the vertices of the simplex become obscured and the probability of terminating early is increased. Also, we demonstrate empirically and analytically the probability of early termination on an unbounded univariate linear function with additive noise. We propose a new modification for the Nelder-Mead simplex method for use in stochastic optimization. This new method reduces to the original method when the noise level is negligible or nonexistent; and therefore, it is as efficient as the Nelder-Mead method when the noise level is low. And in Monte Carlo experiments, our proposed method continued to reduce the expected function for as long as the simulations were run; whereas the original method and the previously best know modified simplex methods terminated early.
Abstract The Nelder–Mead or simplex search algorithm is one of the best known algorithms for unconstrained optimization of non–smooth functions. Even though the basic algorithm is quite simple, it is … Abstract The Nelder–Mead or simplex search algorithm is one of the best known algorithms for unconstrained optimization of non–smooth functions. Even though the basic algorithm is quite simple, it is implemented in many different ways. Apart from some minor computational details, the main difference between various implementations lies in the selection of convergence (or termination) tests, which are used to break the iteration process. A fairly simple efficiency analysis of each iteration step reveals a potential computational bottleneck in the domain convergence test. To be efficient, such a test has to be sublinear in the number of vertices of the working simplex. We have tested some of the most common implementations of the Nelder–Mead algorithm, and none of them is efficient in this sense. Therefore, we propose a simple and efficient domain convergence test and discuss some of its properties. This test is based on tracking the volume of the working simplex throughout the iterations. Similar termination tests can also be applied in some other simplex–based direct search methods. (© 2004 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
Black-box optimization (BBO) is a widely used technique for solving a variety of optimization problems including real-world applications with a high-dimensional expensive objective function. Among BBO methods, the Nelder–Mead (NM) … Black-box optimization (BBO) is a widely used technique for solving a variety of optimization problems including real-world applications with a high-dimensional expensive objective function. Among BBO methods, the Nelder–Mead (NM) method, which is a local search heuristic using a simplex, has been successful due to its simplicity and practical performance on low-dimensional problems. However, the NM method requires <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$n+1$</tex> and <tex xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">$n$</tex> evaluations to perform its initialization and Shrinkage operations respectively to optimize an n-dimensional objective. This is problematic when the objective is computationally and/or financially expensive because, in such a situation, we usually have a limited evaluation budget but those operations consume most of the entire budget. In this study, to address this drawback, we propose a simple but practical modification of the NM method that efficiently works for high-dimensional low-budget optimization. Our numerical results demonstrate that the proposed approach outperforms the original NM method and the random search baselines on BBO benchmark problems.
Abstract We develop a matrix form of the Nelder-Mead simplex method and show that its convergence is related to the convergence of infinite matrix products. We then characterize the spectra … Abstract We develop a matrix form of the Nelder-Mead simplex method and show that its convergence is related to the convergence of infinite matrix products. We then characterize the spectra of the involved matrices necessary for the study of convergence. Using these results, we discuss several examples of possible convergence or failure modes. Then, we prove a general convergence theorem for the simplex sequences generated by the method. The key assumption of the convergence theorem is proved in low-dimensional spaces up to 8 dimensions.
Abstract : In this paper we describe the Nelder-Mead simplex method for obtaining the minimizer of a function. The Nelder-Mead algorithm has several properties that make it a natural choice … Abstract : In this paper we describe the Nelder-Mead simplex method for obtaining the minimizer of a function. The Nelder-Mead algorithm has several properties that make it a natural choice for implementation and utilization on microcomputers. Stopping criteria for the method are presented as well as a brief discussion of the convergence properties of the method. An algorithmic statement of the method is included as an appendix.
This chapter provides a brief introduction to mathematical optimization. It discusses sets and functions concepts in mathematics. The chapter presents the information on norms that include Euclidean norm, Manhattan distance, … This chapter provides a brief introduction to mathematical optimization. It discusses sets and functions concepts in mathematics. The chapter presents the information on norms that include Euclidean norm, Manhattan distance, and infinity-norm or uniform norm. It covers the example of global and local optimum of the mathematical optimization problems. The chapter provides the maximum and minimum values of continuous functions. It helps the students to solve unconstrained optimization problems, using the gradient method implemented in Python. The chapter shows how to find the optimum using the gradient, and provides a complete review of Lagrange multipliers. It also helps the students to solve equality-constrained optimization problems, using Newton's method implemented in Python. The chapter is also an excuse for presenting Python's features as programming and modeling language.
Despite the lack of theoretical and practical convergence support, the Nelder–Mead (NM) algorithm is widely used to solve unconstrained optimization problems. It is a derivative-free algorithm, that attempts iteratively to … Despite the lack of theoretical and practical convergence support, the Nelder–Mead (NM) algorithm is widely used to solve unconstrained optimization problems. It is a derivative-free algorithm, that attempts iteratively to replace the worst point of a simplex by a better one. The present paper proposes a way to extend the NM algorithm to inequality constrained optimization. This is done through a search step of the mesh adaptive direct search (Mads) algorithm, inspired by the NM algorithm. The proposed algorithm does not suffer from the NM lack of convergence, but instead inherits from the totality of the Mads convergence analysis. Numerical experiments show an important improvement in the quality of the solutions produced using this search step.
In spite of being one of the most popular optimization methods, Nelder–Mead's simplex search algorithm with the default choice of parameters performs poorly on high-dimensional problems. The work presented here … In spite of being one of the most popular optimization methods, Nelder–Mead's simplex search algorithm with the default choice of parameters performs poorly on high-dimensional problems. The work presented here concerns such values of the Nelder–Mead algorithm's parameters that help improve the convergence and success rate of the algorithm in high dimensions. In this work, a novel way of assigning parameters to the Nelder–Mead simplex search algorithm is proposed. The proposed scheme is based on Chebyshev spacing points and adapts itself to the dimension of the problem. The numerical experiments conducted for this study show that the proposed scheme is better not just in comparison with the original Nelder–Mead algorithm but it outperforms the other existing adaptive schemes as well.
We give a sufficient condition for a certain type of convergence behavior of the Nelder-Mead simplex method and apply this result to several examples.We also give two related examples for … We give a sufficient condition for a certain type of convergence behavior of the Nelder-Mead simplex method and apply this result to several examples.We also give two related examples for the case of repeated shrinking which indicates a kind of local character of the method.
Nonlinear minimization and solving nonlinear systems of equations often involve the computation of the Newton step. For a nonlinear system, i.e., find x ∊ ℛn such that F(x) = 0, … Nonlinear minimization and solving nonlinear systems of equations often involve the computation of the Newton step. For a nonlinear system, i.e., find x ∊ ℛn such that F(x) = 0, where F : ℛn → ℛn and continuously differentiable, the determination of the Newton step typically involves two major steps:
Yield curve in bond investment will provide visualization of yields of bond as the function of the time to maturities. A normal yield curve shows bond yields increasing steadily with … Yield curve in bond investment will provide visualization of yields of bond as the function of the time to maturities. A normal yield curve shows bond yields increasing steadily with the length of time until they mature but flattening a little for the longest terms. The bond market can help predict the direction of the economy. In this study, we analyse the progress of Indonesian Government Yield Curve (IGYSC) during the pandemic Covid-19. Here, we especially analyse the connection between the progress of the pandemic waves and the development of IGYSC. We apply the Nelson Siegel and Nelson-Siegel-Svensson model to obtain the IGYSC and the empirical study has been done using FR (Fixed Rate) bonds of Indonesian Bond Market and HIMDASUN data. All the computation are done using R software. Based on yield curve analysis, it was shown that the economic only slowed down at the beginning of the pandemic, and has been continuing to improve by the time of pandemic evolves.
This paper deals with the design of an optimally performed proportional–integral–derivative (PID) controller utilized for speed control of a direct current (DC) motor. To do so, a novel hybrid algorithm … This paper deals with the design of an optimally performed proportional–integral–derivative (PID) controller utilized for speed control of a direct current (DC) motor. To do so, a novel hybrid algorithm was proposed which employs a recent metaheuristic approach, named Lévy flight distribution (LFD) algorithm, and a simplex search method known as Nelder–Mead (NM) algorithm. The proposed algorithm (LFDNM) combines both LFD and NM algorithms in such a way that the good explorative behaviour of LFD and excellent local search capability of NM help to form a novel hybridized version that is well balanced in terms of exploration and exploitation. The promise of the proposed structure was observed through employment of a DC motor with PID controller. Optimum values for PID gains were obtained with the aid of an integral of time multiplied absolute error objective function. To verify the effectiveness of the proposed algorithm, comparative simulations were carried out using cuckoo search algorithm, genetic algorithm and original LFD algorithm. The system behaviour was assessed through analysing the results for statistical and non-parametric tests, transient and frequency responses, robustness, load disturbance, energy and maximum control signals. The respective evaluations showed better performance of the proposed approach. In addition, the better performance of the proposed approach was also demonstrated through experimental verification. Further evaluation to demonstrate better capability was performed by comparing the LFDNM-based PID controller with other state-of-the-art algorithms-based PID controllers with the same system parameters, which have also confirmed the superiority of the proposed approach.
Nested Sampling is a powerful algorithm for fitting models to data in the Bayesian setting, introduced by Skilling [1]. The nested sampling algorithm proceeds by carrying out a series of … Nested Sampling is a powerful algorithm for fitting models to data in the Bayesian setting, introduced by Skilling [1]. The nested sampling algorithm proceeds by carrying out a series of compressive steps, involving successively nested iso-likelihood boundaries, starting with the full prior distribution of the problem parameters. The "central problem" of nested sampling is to draw at each step a sample from the prior distribution whose likelihood is greater than the current likelihood threshold, i.e., a sample falling inside the current likelihood-restricted region. For both flat and informative priors this ultimately requires uniform sampling restricted to the likelihood-restricted region. We present two new methods of carrying out this sampling step, and illustrate their use with the lighthouse problem [2], a bivariate likelihood used by Gregory [3] and a trivariate Gaussian mixture likelihood. All the algorithm development and testing reported here has been done with Mathematica® [4].
The method of common lines is a well-established reconstruction technique in cryogenic electron microscopy (cryo-EM), which can be used to extract the relative orientations of an object given tomographic projection … The method of common lines is a well-established reconstruction technique in cryogenic electron microscopy (cryo-EM), which can be used to extract the relative orientations of an object given tomographic projection images from different directions. In this paper, we deal with an analogous problem in optical diffraction tomography. Based on the Fourier diffraction theorem, we show that rigid motions of the object, i.e., rotations and translations, can be determined by detecting common circles in the Fourier-transformed data. We introduce two methods to identify common circles. The first one is motivated by the common line approach for projection images and detects the relative orientation by parameterizing the common circles in the two images. The second one assumes a smooth motion over time and calculates the angular velocity of the rotational motion via an infinitesimal version of the common circle method. Interestingly, using the stereographic projection, both methods can be reformulated as common line methods, but these lines are, in contrast to those used in cryo-EM, not confined to pass through the origin and allow for a full reconstruction of the relative orientations. Numerical proof-of-the-concept examples demonstrate the performance of our reconstruction methods.
Quantifying the impact of parametric and model-form uncertainty on the predictions of stochastic models is a key challenge in many applications. Previous work has shown that the relative entropy rate … Quantifying the impact of parametric and model-form uncertainty on the predictions of stochastic models is a key challenge in many applications. Previous work has shown that the relative entropy rate is an effective tool for deriving path-space uncertainty quantification (UQ) bounds on ergodic averages. In this work we identify appropriate information-theoretic objects for a wider range of quantities of interest on path-space, such as hitting times and exponentially discounted observables, and develop the corresponding UQ bounds. In addition, our method yields tighter UQ bounds, even in cases where previous relative-entropy-based methods also apply, e.g. , for ergodic averages. We illustrate these results with examples from option pricing, non-reversible diffusion processes, stochastic control, semi-Markov queueing models, and expectations and distributions of hitting times.
&lt;p style='text-indent:20px;'&gt;An interior point modified Nelder Mead method for nonlinearly constrained optimization is described. This method neither uses nor estimates objective function or constraint gradients. A modified logarithmic barrier function … &lt;p style='text-indent:20px;'&gt;An interior point modified Nelder Mead method for nonlinearly constrained optimization is described. This method neither uses nor estimates objective function or constraint gradients. A modified logarithmic barrier function is used. The method generates a sequence of points which converges to KKT point(s) under mild conditions including existence of a Slater point. Numerical results are presented that show the algorithm performs well in practice.&lt;/p&gt;
The psychometric function describes how an experimental variable, such as stimulus strength, influences the behaviour of an observer. Estimation of psychometric functions from experimental data plays a central role in … The psychometric function describes how an experimental variable, such as stimulus strength, influences the behaviour of an observer. Estimation of psychometric functions from experimental data plays a central role in fields such as psychophysics, experimental psychology and in the behavioural neurosciences. Experimental data may exhibit substantial overdispersion, which may result from non-stationarity in the behaviour of observers. Here we extend the standard binomial model which is typically used for psychometric function estimation to a beta-binomial model. We show that the use of the beta-binomial model makes it possible to determine accurate credible intervals even in data which exhibit substantial overdispersion. This goes beyond classical measures for overdispersion—goodness-of-fit—which can detect overdispersion but provide no method to do correct inference for overdispersed data. We use Bayesian inference methods for estimating the posterior distribution of the parameters of the psychometric function. Unlike previous Bayesian psychometric inference methods our software implementation—psignifit 4—performs numerical integration of the posterior within automatically determined bounds. This avoids the use of Markov chain Monte Carlo (MCMC) methods typically requiring expert knowledge. Extensive numerical tests show the validity of the approach and we discuss implications of overdispersion for experimental design. A comprehensive MATLAB toolbox implementing the method is freely available; a python implementation providing the basic capabilities is also available.
This paper describes the development of an unstructured hybrid finite element Navier–Stokes (NS)–direct simulation Monte Carlo (DSMC) framework for hypersonic flows. State-based coupling is employed and simulations of varying thermochemical … This paper describes the development of an unstructured hybrid finite element Navier–Stokes (NS)–direct simulation Monte Carlo (DSMC) framework for hypersonic flows. State-based coupling is employed and simulations of varying thermochemical complexity demonstrate the accuracy, robustness, and computational efficiency of the hybrid all-Mach algorithm. An automatic mesh optimization process using a posteriori error estimates based on the Hessian of the solution goes much further than traditional mesh adaptation processes by equidistributing the error estimator and producing a “single optimal hybrid mesh” with no increase in mesh size and with much higher accuracy. The DSMC region cells of the resulting optimal mesh are smaller than in NS regions and are sized to the local mean free path. Mesh optimization is also shown to greatly improve the quality of the hybrid interfaces from those of the initial mesh. Unstructured meshes are found to represent the hybrid interfaces smoothly, while structured meshes showcase a castellated pattern in the interfaces. The optimal hybrid meshes are found to be statistically similar to optimal full DSMC meshes, thus highlighting the solver independence of the optimizer. Such a coupled hybrid mesh optimization strategy can therefore tackle hypersonic flows with multiscale flow features at any degree of rarefaction.
We consider gas flow in pipeline networks governed by the isothermal Euler equations and introduce a new modeling of compressors in gas networks. Compressor units are modeled as pipe–to–pipe intersections … We consider gas flow in pipeline networks governed by the isothermal Euler equations and introduce a new modeling of compressors in gas networks. Compressor units are modeled as pipe–to–pipe intersections with additional algebraic coupling conditions for the compressor behavior. We prove existence and uniqueness of solutions with respect to these conditions and use the results for numerical simulation and optimization of gas networks.
We study extremal properties of the determinant of the Laplacian in the Bergman metric on the moduli space of compact genus two Riemann surfaces. By a combination of analytical and … We study extremal properties of the determinant of the Laplacian in the Bergman metric on the moduli space of compact genus two Riemann surfaces. By a combination of analytical and numerical methods we identify four non-degenerate critical points of this function and compute the signature of the Hessian at these points. The curve with the maximal number of automorphisms (the Burnside curve) turns out to be the point of the absolute maximum. Our results agree with the mass formula for virtual Euler characteristics of the moduli space. A similar analysis is performed for Bolza's strata of symmetric Riemann surfaces of genus two.
Abstract Background Mathematical modeling constitutes an important tool for planning robust responses to epidemics. This study was conducted to guide the Qatari national response to the severe acute respiratory syndrome … Abstract Background Mathematical modeling constitutes an important tool for planning robust responses to epidemics. This study was conducted to guide the Qatari national response to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic. The study investigated the time course of the epidemic, forecasted healthcare needs, predicted the impact of social and physical distancing restrictions, and rationalized and justified easing of restrictions. Methods An age-structured deterministic model was constructed to describe SARS-CoV-2 transmission dynamics and disease progression throughout the population. Results The enforced social and physical distancing interventions flattened the epidemic curve, reducing the peaks for incidence, prevalence, acute-care hospitalization, and intensive care unit (ICU) hospitalizations by 87%, 86%, 76%, and 78%, respectively. The daily number of new infections was predicted to peak at 12,750 on May 23, and active-infection prevalence was predicted to peak at 3.2% on May 25. Daily acute-care and ICU-care hospital admissions and occupancy were forecast accurately and precisely. By October 15, 2020, the basic reproduction number R 0 had varied between 1.07-2.78, and 50.8% of the population were estimated to have been infected (1.43 million infections). The proportion of actual infections diagnosed was estimated at 11.6%. Applying the concept of R t tuning, gradual easing of restrictions was rationalized and justified to start on June 15, 2020, when R t declined to 0.7, to buffer the increased interpersonal contact with easing of restrictions and to minimize the risk of a second wave. No second wave has materialized as of October 15, 2020, five months after the epidemic peak. Conclusions Use of modeling and forecasting to guide the national response proved to be a successful strategy, reducing the toll of the epidemic to a manageable level for the healthcare system.
Objective functions in large-scale machine-learning and artificial intelligence applications often live in high dimensions with strong non-convexity and massive local minima. First-order methods, such as the stochastic gradient method and … Objective functions in large-scale machine-learning and artificial intelligence applications often live in high dimensions with strong non-convexity and massive local minima. First-order methods, such as the stochastic gradient method and Adam, are often used to find global minima. Recently, the consensus-based optimization (CBO) method has been introduced as one of the gradient-free optimization methods and its convergence is proven with dimension-dependent parameters, which may suffer from the curse of dimensionality. By replacing the isotropic geometric Brownian motion with the component-wise one, the latest improvement of the CBO method is guaranteed to converge to the global minimizer with dimension-independent parameters, although the initial data need to be well-chosen. In this paper, based on the CBO method and Adam, we propose a consensus-based global optimization method with adaptive momentum estimation (Adam-CBO). Advantages of the Adam-CBO method include: (1) capable of finding global minima of non-convex objective functions with high success rates and low costs; (2) can handle non-differentiable activation functions and thus approximate low-regularity functions with better accuracy. The former is verified by approximating the $1000$ dimensional Rastrigin function with $100\%$ success rate at a cost only growing linearly with respect to the dimensionality. The latter is confirmed by solving a machine learning task for partial differential equations with low-regularity solutions where the Adam-CBO method provides better results than the state-of-the-art method Adam. A linear stability analysis is provided to understand the asymptotic behavior of the Adam-CBO method.
Abstract A detailed numerical study of solutions to the Serre–Green–Naghdi (SGN) equations in 2D with vanishing curl of the velocity field is presented. The transverse stability of line solitary waves, … Abstract A detailed numerical study of solutions to the Serre–Green–Naghdi (SGN) equations in 2D with vanishing curl of the velocity field is presented. The transverse stability of line solitary waves, 1D solitary waves being exact solutions of the 2D equations independent of the second variable, is established numerically. The study of localized initial data as well as crossing 1D solitary waves does not give an indication of existence of stable structures in SGN solutions localized in two spatial dimensions. For the numerical experiments, an approach based on a Fourier spectral method with a Krylov subspace technique is applied.
Abstract The time dependence of the apparent diffusion tensor of ex vivo calf heart and tongue was measured for diffusion times (τ d ) between 32 and 810 ms. The … Abstract The time dependence of the apparent diffusion tensor of ex vivo calf heart and tongue was measured for diffusion times (τ d ) between 32 and 810 ms. The results showed evidence of restricted diffusion in the muscle tissues of both organs. In regions where the myofibers are parallel, the largest eigenvalue (λ 1 ) of the diffusion tensor remained the same for all diffusion times measured, while the other eigenvalues (λ 2 , λ 3 ) decreased by 29–36% between τ d = 32 ms and τ d = 400 ms. In regions where the fibers cross, the λ 1 also changed, decreasing by 17% between τ d = 32 ms and τ d = 400 ms. The restricting compartment size and volume fraction were effectively estimated by fitting the time courses of the eigenvalues to a model consisting of a nonrestricted compartment and a cylindrically restricted compartment. To our knowledge, this study is the first demonstrating diffusion time dependence of measured water diffusion tensor in muscular tissue. With improvement in scanning technology, future studies may permit noninvasive, in vivo detection of changes in muscle myoarchitecture due to disease, treatment, and exercise. Magn Reson Med, 2005. Published 2005 Wiley‐Liss, Inc.
A novel hybrid simplex method and particle swarm optimization (HSMPSO) algorithm is presented in this article. Computational experiments on variety of benchmark functions indicate this SM-PSO hybrid is a promising … A novel hybrid simplex method and particle swarm optimization (HSMPSO) algorithm is presented in this article. Computational experiments on variety of benchmark functions indicate this SM-PSO hybrid is a promising way for locating global optima of continuous multimodal functions. Although very easy to be implemented, the hybrid method yields competitive results in both reliability and efficiency compared to other published algorithms. We provide an extensive analysis of the impact of the parameters of our hybrid algorithm on its performance as well.
Abstract In this paper, we investigate a joint modeling method for hard failures where both degradation signals and time‐to‐event data are available. The mixed‐effects model is used to model degradation … Abstract In this paper, we investigate a joint modeling method for hard failures where both degradation signals and time‐to‐event data are available. The mixed‐effects model is used to model degradation signals, and extended hazard model is used for the time‐to‐event data. The extended hazard is a general model which includes two well‐known hazard rate models, the Cox proportional hazards model and accelerated failure time model, as special cases. A two‐stage estimation approach is used to obtain model parameters, based on which remaining useful life for the in‐service unit can be predicted. The performance of the method is demonstrated through both simulation studies and a real case study.
The problem of estimating the width of a symmetric uniform distribution on the line together with the error variance, when data are measured with normal additive error, is considered. The … The problem of estimating the width of a symmetric uniform distribution on the line together with the error variance, when data are measured with normal additive error, is considered. The main purpose is to analyse the maximum-likelihood (ML) estimator and to compare it with the moment-method estimator. It is shown that this two-parameter model is regular so that the ML estimator is asymptotically efficient. Necessary and sufficient conditions are given for the existence of the ML estimator. As numerical problems are known to frequently occur while computing the ML estimator in this model, useful suggestions for computing the ML estimator are also given.
In this paper a new method for parameter estimation for elliptic partial differential equations is introduced. Parameter estimation includes minimizing an objective function, which is a measure for the difference … In this paper a new method for parameter estimation for elliptic partial differential equations is introduced. Parameter estimation includes minimizing an objective function, which is a measure for the difference between the parameter-dependent solution of the differential equation and some given data. We assume, that the given data results in a good approximation of the state of the system.In order to evaluate the objective function the solution of a differential equation has to be computed and hence, a large system of linear equations has to be solved. Minimization methods involve many evaluations of the objective function and therefore, the differential equation has to be solved multiple times. Thus, the computing time for parameter estimation can be large. Model order reduction was developed in order to reduce the computational effort of solving these differential equations multiple times. We use the given approximation of the state of the system as reduced basis and omit computing any snapshots. Therefore, our approach decreases the effort of the offline phase drastically. Furthermore, the dimension of the reduced system is one and thus, is much smaller than the dimension of other approaches. However, the obtained reduced model is a good approximation only close to the given data. Hence, the reduced system can lead to large errors for parameter sets, which correspond to solutions far away from the given approximation of the state of the system. In order to prevent convergence of the parameter estimator to such a local minimizer we penalise the approximation error between the full and the reduced system.
Left-censored data with one or more detection limits (DLs) often arise in environmental contexts. The computational procedure for the calculation of maximum likelihood estimators of the parameter for Type I … Left-censored data with one or more detection limits (DLs) often arise in environmental contexts. The computational procedure for the calculation of maximum likelihood estimators of the parameter for Type I multiply left-censored data from underlying exponential distribution is suggested and used considering various numbers of DLs. The expected Fisher information measure (FIM) is analytically determined and its performance is compared with sample (observed) FIM using simulations. Simulations are focused primarily on the properties of estimators for small sample sizes. Moreover, the simulations follow the possible applications of the results in the statistical analysis of real chemical data.
Many types of event sequence data exhibit triggering and clustering properties in space and time. Point processes are widely used in modeling such event data with applications such as predictive … Many types of event sequence data exhibit triggering and clustering properties in space and time. Point processes are widely used in modeling such event data with applications such as predictive policing and disaster event forecasting. Although current algorithms can achieve significant event prediction accuracy, the historic data or the self-excitation property can introduce biased prediction. For example, hotspots ranked by event hazard rates can make the visibility of a disadvantaged group (e.g., racial minorities or the communities of lower social economic status) more apparent. Existing methods have explored ways to achieve parity between the groups by penalizing the objective function with several group fairness metrics. However, these metrics fail to measure the fairness on every prefix of the ranking. In this paper, we propose a novel list-wise fairness criterion for point processes, which can efficiently evaluate the ranking fairness in event prediction. We also present a strict definition of the unfairness consistency property of a fairness metric and prove that our list-wise fairness criterion satisfies this property. Experiments on several real-world spatial-temporal sequence datasets demonstrate the effectiveness of our list-wise fairness criterion.
In the paper the aggregation algorithm for Wiener system modelling is proposed. The method uses exponential weights to combine (aggregate) models from a given collection of D λ linear dynamic … In the paper the aggregation algorithm for Wiener system modelling is proposed. The method uses exponential weights to combine (aggregate) models from a given collection of D λ linear dynamic and D m nonlinear static models of the genuine system blocks (that can be a priori hypotheses, other system identification outcomes, etc.). The resulting model has a noteworthy property: it is almost as accurate as the best hypothetical Wiener model built from a given collection. This is caused by the fact that the error introduced by the aggregation routine decreases (in the mean squared sense) with growing number of measurements N as fast as CqN −1ln (D m D λ ), where C and q are known constants. Simulations confirm theoretical findings and demonstrate applicability of the algorithm in engineering problems for small and moderate measurement data-sets.
We present an efficient high-precision numerical approach for Davey–Stewartson (DS) II type equa- tions, treating initial data from the Schwartz class of smooth, rapidly decreasing functions. As with previous approaches, … We present an efficient high-precision numerical approach for Davey–Stewartson (DS) II type equa- tions, treating initial data from the Schwartz class of smooth, rapidly decreasing functions. As with previous approaches, the presented code uses discrete Fourier transforms for the spatial dependence and Driscoll’s composite Runge–Kutta method for the time dependence. Since DS equations are non-local, nonlinear Schrödinger equations with a singular symbol for the non-locality, standard Fourier methods in practice only reach accuracy of the order of 10 −6 or less for typical examples. This was previously demonstrated for the defocusing integrable case by comparison with a numerical approach for DS II via inverse scattering. By applying a regularization to the singular symbol, originally developed for D-bar problems, the presented code is shown to reach machine precision. The code can treat integrable and non-integrable DS II equations. Moreover, it has the same numerical complexity as existing codes for DS II. Several examples for the integrable defocusing DS II equation are discussed as test cases. In an appendix by C. Kalla, a doubly periodic solution to the defocusing DS II equation is presented, providing a test for direct DS codes based on Fourier methods.
This article provides a new tool for examining the efficiency and robustness of derivative-free optimization algorithms based on high-dimensional normalized data profiles that test a variety of performance metrics. Unlike … This article provides a new tool for examining the efficiency and robustness of derivative-free optimization algorithms based on high-dimensional normalized data profiles that test a variety of performance metrics. Unlike the traditional data profiles that examine a single dimension, the proposed data profiles require several dimensions in order to analyze the relative performance of different optimization solutions. To design a use case, we utilize five sequences (solvers) of trigonometric simplex designs that extract different features of non-isometric reflections, as an example to show how various metrics (dimensions) are essential to provide a comprehensive evaluation about a particular solver relative to others. In addition, each designed sequence can rotate the starting simplex through an angle to designate the direction of the simplex. This type of features extraction is applied to each sequence of the triangular simplexes to determine a global minimum for a mathematical problem. To allocate an optimal sequence of trigonometric simplex designs, a linear model is used with the proposed data profiles to examine the convergence rate of the five simplexes. Furthermore, we compare the proposed five simplexes to an optimized version of the Nelder-Mead algorithm known as the Genetic Nelder-Mead algorithm. The experimental results demonstrate that the proposed data profiles lead to a better examination of the reliability and robustness for the considered solvers from a more comprehensive perspective than the existing data profiles. Finally, the high-dimensional data profiles reveal that the proposed solvers outperform the genetic solvers for all accuracy tests.
Fractional-order calculus can obtain better results than the integer-order in control theory, so it has become a research hotspot in recent years. However, the structure of the irrational fractional-order system … Fractional-order calculus can obtain better results than the integer-order in control theory, so it has become a research hotspot in recent years. However, the structure of the irrational fractional-order system is complex, so its theoretical analysis and controller design are more difficult. In this paper, a method based on convolution integral is proposed to obtain the frequency domain response of the irrational model. Combined with the optimization algorithm, the model parameters are identified. Moreover, the rationalization of the irrational model is realized, which facilitates the analysis and application design of this kind models. Finally, two examples are given to illustrate the effectiveness and feasibility of the method by identifying parameters and rationalization.
Modern probabilistic learning systems mainly assume symmetric distributions, however, real-world data typically obey skewed distributions and are thus not adequately modeled through symmetric distributions. To address this issue, a generalization … Modern probabilistic learning systems mainly assume symmetric distributions, however, real-world data typically obey skewed distributions and are thus not adequately modeled through symmetric distributions. To address this issue, a generalization of symmetric distributions called elliptical distributions are increasingly used, together with further improvements based on skewed elliptical distributions. However, existing approaches are either hard to estimate or have complicated and abstract representations. To this end, we propose a novel approach based on the von-Mises-Fisher (vMF) distribution to obtain an explicit and simple probability representation of skewed elliptical distributions. The analysis shows that this not only allows us to design and implement nonsymmetric learning systems but also provides a physically meaningful and intuitive way of generalizing skewed distributions. For rigor, the proposed framework is proven to share important and desirable properties with its symmetric counterpart. The proposed vMF distribution is demonstrated to be easy to generate and stable to estimate, both theoretically and through examples.
On December 7, 2022, the Chinese government optimized the current epidemic prevention and control policy, and no longer adopted the zero-COVID policy and mandatory quarantine measures. Based on the above … On December 7, 2022, the Chinese government optimized the current epidemic prevention and control policy, and no longer adopted the zero-COVID policy and mandatory quarantine measures. Based on the above policy changes, this paper establishes a compartment dynamics model considering age distribution, home isolation and vaccinations. Parameter estimation was performed using improved least squares and Nelder-Mead simplex algorithms combined with modified case data. Then, using the estimated parameter values to predict a second wave of the outbreak, the peak of severe cases will reach on 8 May 2023, the number of severe cases will reach 206,000. Next, it is proposed that with the extension of the effective time of antibodies obtained after infection, the peak of severe cases in the second wave of the epidemic will be delayed, and the final scale of the disease will be reduced. When the effectiveness of antibodies is 6 months, the severe cases of the second wave will peak on July 5, 2023, the number of severe cases is 194,000. Finally, the importance of vaccination rates is demonstrated, when the vaccination rate of susceptible people under 60 years old reaches 98%, and the vaccination rate of susceptible people over 60 years old reaches 96%, the peak of severe cases in the second wave of the epidemic will be reached on 13 July 2023, when the number of severe cases is 166,000.
The paper proposes a new simplex-based direct search algorithm for box-constrained nonlinear programming (NLP) problems. As is typical of such methods, the optimal solution is obtained by iteratively transforming a … The paper proposes a new simplex-based direct search algorithm for box-constrained nonlinear programming (NLP) problems. As is typical of such methods, the optimal solution is obtained by iteratively transforming a multi-dimensional simplex and no gradient information is calculated. In contrast to unconstrained optimization problems, where the decision about way of transforming an iterate simplex is based solely on the consideration of optimality, in the proposed algorithm, both optimality and constraint satisfaction affect the iterations. First novel feature of the proposed algorithm is a fast way of checking if a planned iteration of current simplex violates any of the bounds and, if yes, a systematic way of choosing an alternative without compromising on the optimality considerations. The proposed procedure, in fact, guarantees that each iterate simplex lies entirely within the feasible region. Second novel feature of the algorithm is the use of so-called free and bounded variables that allow for different dimensions of simplices in different iterations. This feature helps convergence of the algorithm when one or more of the decision variables lie on bounds. The performance of the proposed algorithm is demonstrated on some well known test problems.
A novel, tuning-free, population-based simplex method for continuous function optimization is proposed. The proposed method, called Adaptive Population-based Simplex (APS), uses a population from which different simplexes are selected. In … A novel, tuning-free, population-based simplex method for continuous function optimization is proposed. The proposed method, called Adaptive Population-based Simplex (APS), uses a population from which different simplexes are selected. In addition, a local search is performed using a hyper-sphere generated around the best individual in a simplex. The approach is easy to code and easy to understand. APS is compared with four state-of-the-art approaches on five real-world problems. The experimental results show that APS generally performs better than the other methods on the test problems.
The outbreak of novel coronavirus disease (COVID-19) has spread around the world since it was detected in December 2019. The Chinese government executed a series of interventions to curb the … The outbreak of novel coronavirus disease (COVID-19) has spread around the world since it was detected in December 2019. The Chinese government executed a series of interventions to curb the pandemic. The “battle” against COVID-19 in Shenzhen, China is valuable because populated industrial cities are the epic centres of COVID-19 in many regions. We made use of synthetic control methods to create a reference population matching specific characteristics of Shenzhen. With both the synthetic and observed data, we introduced an epidemic compartmental model to compare the spread of COVID-19 between Shenzhen and its counterpart regions in the United States that didn’t implement interventions for policy evaluation. Once the effects of policy interventions adopted in Shenzhen were estimated, the delay effects of those interventions were referred to provide the further control degree of interventions. Thus, the hypothetical epidemic situations in Shenzhen were inferred by using time-varying reproduction numbers in the proposed SIHR (Susceptible, Infectious, Hospitalized, Removed) model and considering if the interventions were delayed by 0 day to 5 days. The expected cumulative confirmed cases would be 1546, which is 5.75 times of the observed cumulative confirmed cases of 269 in Shenzhen on February 3, 2020, based on the data from the counterpart counties (mainly from Broward, New York, Santa Clara, Pinellas, and Westchester) in the United States. If the interventions were delayed by 5 days from the day when the interventions started, the expected cumulative confirmed cases of COVID-19 in Shenzhen on February 3, 2020 would be 676 with 95% credible interval (303,1959). Early implementation of mild interventions can subdue the epidemic of COVID-19. The later the interventions were implemented, the more severe the epidemic was in the hard-hit areas. Mild interventions are less damaging to the society but can be effective when implemented early. AMS 2000 subject classifications : Primary 00K00, 00K01; secondary 00K02.
Abstract The effect of dimensionality on the widely used Nelder–Mead simplex method for unconstrained optimization is investigated. It is shown that by using the quadratic function f(x)=x T x, the … Abstract The effect of dimensionality on the widely used Nelder–Mead simplex method for unconstrained optimization is investigated. It is shown that by using the quadratic function f(x)=x T x, the Nelder–Mead simplex method deteriorates as the dimension increases. Keywords: Nelder–Mead methodSimplexEffect of dimensionalityConvergenceOptimization Acknowledgements We are very grateful to the referees for their insightful and constructive comments and suggestions, which helped us to improve the content and presentation of the paper. We also thank David Byatt for providing us his thesis. Lixing Han's research was supported in part by a UM-Flint 2002 summer research fellowship. The work of this Michael Neumann was supported in part by NSF Grant No. DMS9973247.
Within the realm of business context, process duration signifies time spent by customers between successive activities. This temporal perspective offers important insight to customer behavior, highlighting potential bottlenecks, and influencing … Within the realm of business context, process duration signifies time spent by customers between successive activities. This temporal perspective offers important insight to customer behavior, highlighting potential bottlenecks, and influencing business management decisions. The distribution of these process duration often changes over time due to factors such as seasonality, emerging legislation, changes to supply chains, and customer demand. Referred to as concept drift, these variations pose challenges for robust process modeling, understanding, and refinement. Subsequently, gamma mixture models are widely employed to model durations. These source data can, however, become left-truncated and right-censored within any specific observation window thereby necessitating a (well-known) modification to the likelihood function. The approach reported in this article leveraged this adapted likelihood across a series of observation windows, applying the likelihood ratio test to identify duration changes/concept drift. Due to its flexibility in modelling any duration distribution, the gamma mixture model was used with Nelder–Mead optimized likelihood for the left-truncated and right-censored data. The number of gamma components was determined by the Bayesian information criterion. The proposed framework underwent validation through simulated exponential samples, leading to recommendations for its practical application. Subsequently, we applied the methodology to three real-life event logs exhibiting diverse characteristics. Experimental results showcase the effectiveness of our approach in terms of data fitting, as compared to Kaplan–Meier curves, and in detecting instances of drift. This comprehensive validation underscores the practical utility and reliability of our framework for dynamic business scenarios.
Sea clutter is the backscattered returns from a patch of the sea surface illuminated by a radar pulse. It is of great importance to develop statistical models that properly characterize … Sea clutter is the backscattered returns from a patch of the sea surface illuminated by a radar pulse. It is of great importance to develop statistical models that properly characterize radar clutter returns. A lot of effort has been made to fit various distributions to the observed amplitude data of sea clutter. However, the fitting of those distributions to real sea clutter data is not excellent, and using parameters estimated from those distributions is not very effective for detecting targets within sea clutter. This may be due to the fact that sea clutter data is highly non-stationary. This non-stationarity motivates us to perform distributional analysis on the data obtained by differencing amplitude data of sea clutter. By systematically analyzing differenced data of 280 sea clutter time series measured under various sea and weather conditions, we show that differenced data of sea clutter are well fitted by the Tsallis distribution. The latter is obtained by maximizing the Tsallis entropy, a generalization of the Shannon (or Boltzman-Gibbs) entropy. Interestingly, we find that the nonextensitivity parameters estimated from the Tsallis distribution may be used to help detect low observable targets within sea clutter.
We portray the evolution of the Covid-19 epidemic during the crisis of March–April 2020 in the Paris area, by analyzing the medical emergency calls received by the EMS of the … We portray the evolution of the Covid-19 epidemic during the crisis of March–April 2020 in the Paris area, by analyzing the medical emergency calls received by the EMS of the four central departments of this area (Centre 15 of SAMU 75, 92, 93 and 94). Our study reveals strong dissimilarities between these departments. We show that the logarithm of each epidemic observable can be approximated by a piecewise linear function of time. This allows us to distinguish the different phases of the epidemic, and to identify the delay between sanitary measures and their influence on the load of EMS. This also leads to an algorithm, allowing one to detect epidemic resurgences. We rely on a transport PDE epidemiological model, and we use methods from Perron–Frobenius theory and tropical geometry.
Abstract We address the calibration of a computationally expensive nuclear physics model for which derivative information with respect to the fit parameters is not readily available. Of particular interest is … Abstract We address the calibration of a computationally expensive nuclear physics model for which derivative information with respect to the fit parameters is not readily available. Of particular interest is the performance of optimization-based training algorithms when dozens, rather than millions or more, of training data are available and when the expense of the model places limitations on the number of concurrent model evaluations that can be performed. As a case study, we consider the Fayans energy density functional model, which has characteristics similar to many model fitting and calibration problems in nuclear physics. We analyze hyperparameter tuning considerations and variability associated with stochastic optimization algorithms and illustrate considerations for tuning in different computational settings.
This book unites the study of dynamical systems and numerical solution of differential equations. The first three chapters contain the elements of the theory of dynamical systems and the numerical … This book unites the study of dynamical systems and numerical solution of differential equations. The first three chapters contain the elements of the theory of dynamical systems and the numerical solution of initial-value problems. In the remaining chapters, numerical methods are formulted as dynamical systems and the convergence and stability properties of the methods are examined. Topics studied include the stability of numerical methods for contractive, dissipative, gradient and Hamiltonian systems together with the convergence properties of equilibria, periodic solutions and strage attractors under numerical approximation. This book will be an invaluable tool for graduate students and researchers in the fields of numerical analysis and dynamical systems.
This paper analyzes the behavior of the Nelder--Mead simplex method for a family of examples which cause the method to converge to a nonstationary point. All the examples use continuous … This paper analyzes the behavior of the Nelder--Mead simplex method for a family of examples which cause the method to converge to a nonstationary point. All the examples use continuous functions of two variables. The family of functions contains strictly convex functions with up to three continuous derivatives. In all the examples the method repeatedly applies the inside contraction step with the best vertex remaining fixed. The simplices tend to a straight line which is orthogonal to the steepest descent direction. It is shown that this behavior cannot occur for functions with more than three continuous derivatives. The stability of the examples is analyzed.
The Nelder--Mead algorithm can stagnate and converge to a nonoptimal point, even for very simple problems. In this note we propose a test for sufficient decrease which, if passed for … The Nelder--Mead algorithm can stagnate and converge to a nonoptimal point, even for very simple problems. In this note we propose a test for sufficient decrease which, if passed for all iterations, will guarantee convergence of the Nelder--Mead iteration to a stationary point if the objective function is smooth and the diameters of the Nelder--Mead simplices converge to zero. Failure of this condition is an indicator of potential stagnation. As a remedy we propose a new step, which we call an oriented restart, that reinitializes the simplex to a smaller one with orthogonal edges whose orientation is determined by an approximate descent direction from the current best point. We also give results that apply when the objective function is a low-amplitude perturbation of a smooth function. We illustrate our results with some numerical examples.
We propose a new simplex-based direct search method for unconstrained minimization of a real-valued function f of n variables. As in other methods of this kind, the intent is to … We propose a new simplex-based direct search method for unconstrained minimization of a real-valued function f of n variables. As in other methods of this kind, the intent is to iteratively improve an n-dimensional simplex through certain reflection/expansion/contraction steps. The method has three novel features. First, a user-chosen integer $\bar m_k$ specifies the number of "good" vertices to be retained in constructing the initial trial simplices---reflected, then either expanded or contracted---at iteration k. Second, a trial simplex is accepted only when it satisfies the criteria of fortified descent, which are stronger than the criterion of strict descent used in most direct search methods. Third, the number of additional function evaluations needed to check a trial reflected/expanded simplex for fortified descent can be controlled. If one of the initial trial simplices satisfies the fortified-descent criteria, it is accepted as the new simplex; otherwise, the simplex is shrunk a fraction of the way toward a best vertex and the process is restarted, etc., until either a trial simplex is accepted or the simplex effectively has shrunk to a single point. We prove several theoretical properties of the new method. If f is continuously differentiable, bounded below, and uniformly continuous on its lower level set and we choose $\bar m_k$ with the same value at all iterations k, then every cluster point of the generated sequence of iterates is a stationary point. The same conclusion holds if the function is continuously differentiable, bounded below, and we choose $\bar m_k=1$ at all iterations k.
We introduce an abstract definition of pattern search methods for solving nonlinear unconstrained optimization problems. Our definition unifies an important collection of optimization methods that neither compute nor explicitly approximate … We introduce an abstract definition of pattern search methods for solving nonlinear unconstrained optimization problems. Our definition unifies an important collection of optimization methods that neither compute nor explicitly approximate derivatives. We exploit our characterization of pattern search methods to establish a global convergence theory that does not enforce a notion of sufficient decrease. Our analysis is possible because the iterates of a pattern search method lie on a scaled, translated integer lattice. This allows us to relax the classical requirements on the acceptance of the step, at the expense of stronger conditions on the form of the step, and still guarantee global convergence.
Direct search methods are methods designed to solve unconstrained minimization problems of the form: $$\mathop {\min }\limits_{x \in {\mathbb{R}^n}} f(x),$$ where f: ℝn → ℝ. These methods are distinguished by … Direct search methods are methods designed to solve unconstrained minimization problems of the form: $$\mathop {\min }\limits_{x \in {\mathbb{R}^n}} f(x),$$ where f: ℝn → ℝ. These methods are distinguished by the fact that they neither use nor require explicit derivative information; the search for a local minimizer is driven solely by function information. Popular methods in this class include the factorial design algorithm of Box [1], the pattern search algorithm of Hooke and Jeeves [4], and the simplex method of Neider and Mead [5].
A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general … A method is described for the minimization of a function of n variables, which depends on the comparison of function values at the (n + 1) vertices of a general simplex, followed by the replacement of the vertex with the highest value by another point. The simplex adapts itself to the local landscape, and contracts on to the final minimum. The method is shown to be effective and computationally compact. A procedure is given for the estimation of the Hessian matrix in the neighbourhood of the minimum, needed in statistical estimation problems.
This paper describes an approach to constructing derivative-free algorithms for unconstrained optimization that are easy to implement on parallel machines. A special feature of this approach is the ease with … This paper describes an approach to constructing derivative-free algorithms for unconstrained optimization that are easy to implement on parallel machines. A special feature of this approach is the ease with which algorithms can be generated to take advantage of any number of processors and to adapt to any cost ratio of communication to function evaluation. Numerical tests show speed-ups on two fronts. The cost of synchronization being minimal, the speed-up is almost linear with the addition of more processors, i.e., given a problem and a search strategy, the decrease in execution time is proportional to the number of processors added. Even more encouraging, however, is that different search strategies, devised to take advantage of additional (or more powerful) processors, may actually lead to dramatic improvements in the performance of the basic algorithm. Thus search strategies intended for many processors actually may generate algorithms that are better even when implemented sequentially. The key difference is that the additional processors are not used simply to enhance the performance of an inherently sequential algorithm; they are used to spur the design of ever more ambitious—and effective—search strategies. The algorithms given here are supported by a strong convergence theorem, promising computational results on a variety of problems, and an intuitively appealing interpretation as multidirectional line search methods.