Engineering Safety, Risk, Reliability and Quality

Reliability and Maintenance Optimization

Description

This cluster of papers focuses on reliability engineering and maintenance optimization, covering topics such as degradation modeling, condition-based maintenance, multi-state systems, prognostic models, accelerated degradation tests, risk-based maintenance, stochastic modeling, and system reliability. The papers explore various methods and strategies for optimizing maintenance policies and improving the reliability of deteriorating systems.

Keywords

Reliability Engineering; Maintenance Optimization; Degradation Modeling; Condition-Based Maintenance; Multi-State Systems; Prognostic Models; Accelerated Degradation Tests; Risk-Based Maintenance; Stochastic Modeling; System Reliability

The main concern of this text is the application of stochastic models to practical situations involving uncertainty and dynamism. A unique feature is the integrated treatment of models and computational methods for stochastic design and stochastic optimization problems. The book uses realistic examples to explore a wide variety of applications, such as inventory and production control, reliability, maintenance, queueing, and computer and communication systems. Exercises and suggestions for further reading are provided at the end of each chapter. The book was written with advanced students in mind; however, as it contains a wealth of material not found in other texts, it will also be of considerable interest to practitioners and researchers in operations research, statistics, computer science and engineering.
Degradation models have become an important analytic tool for complex systems. During the last two decades, a number of degradation models have been developed to capture the degradation dynamics of a system and aid subsequent decision-making. This paper aims to summarize the state of the art in the field and to discuss some further research issues from both analytical and practical points of view. In this paper, degradation models are classified into three classes, that is, stochastic process models, general path models, and other models beyond these two classes. A review of the three classes is given with emphasis on the class of stochastic process models. A comprehensive comparison between stochastic process models and general path models is given to expound the pros and cons of these two methods. Applications of degradation models in degradation test planning and burn-in modelling are also discussed. Copyright © 2014 John Wiley & Sons, Ltd.
Several enumeration and reliability problems are shown to be #P-complete, and hence, at least as hard as NP-complete problems. Included are important problems in network reliability analysis, namely, computing the probability that a graph is connected and counting the number of minimum cardinality $(s,t)$-cuts or directed network cuts. Also shown to be #P-complete are counting vertex covers in a bipartite graph, counting antichains in a partial order, and approximating the probability that a graph is connected and the probability that a pair of vertices is connected.
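To make the connectivity-probability problem concrete, the following is a minimal brute-force sketch, not the paper's construction. It assumes each edge works independently with the same probability `p`; the function names and the triangle example are hypothetical. The loop visits all 2^m edge states, which hints at why the problem becomes intractable as the graph grows.

```python
from itertools import product


def is_connected(n_nodes, edges):
    """Return True if the undirected graph on nodes 0..n_nodes-1 is connected."""
    adjacency = {v: [] for v in range(n_nodes)}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen = {0}
    stack = [0]
    while stack:  # depth-first search from node 0
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == n_nodes


def all_terminal_reliability(n_nodes, edges, p):
    """Probability that the graph is connected when every edge works
    independently with probability p (assumption made for this sketch)."""
    total = 0.0
    # Enumerate every up/down pattern of the edges: 2**len(edges) cases.
    for states in product([0, 1], repeat=len(edges)):
        working = [edge for edge, up in zip(edges, states) if up]
        if is_connected(n_nodes, working):
            k = len(working)
            total += p**k * (1 - p) ** (len(edges) - k)
    return total


# Hypothetical example: a triangle stays connected if at least two edges work.
triangle = [(0, 1), (1, 2), (0, 2)]
reliability = all_terminal_reliability(3, triangle, 0.9)
```

For the triangle, the result should agree with the closed form p^3 + 3p^2(1 - p).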
The remaining useful life (RUL) prediction of rolling element bearings has attracted substantial attention recently due to its importance for bearing health management. The exponential model is one of the most widely used methods for RUL prediction of rolling element bearings. However, two shortcomings exist in the exponential model: 1) the first predicting time (FPT) is selected subjectively; and 2) random errors of the stochastic process decrease the prediction accuracy. To deal with these two shortcomings, an improved exponential model is proposed in this paper. In the improved model, an adaptive FPT selection approach is established based on the 3σ interval, and particle filtering is utilized to reduce random errors of the stochastic process. In order to demonstrate the effectiveness of the improved model, a simulation and four tests of bearing degradation processes are utilized for the RUL prediction. The results show that the improved model is able to select an appropriate FPT and reduce random errors of the stochastic process. Consequently, it performs better in the RUL prediction of rolling element bearings than the original exponential model.
"Probability and Random Processes for Electrical Engineering." Technometrics, 33(3), pp. 372–373
"System Reliability Theory: Models, Statistical Methods, and Applications." Technometrics, 46(4), pp. 495–496
Several different and complex electromechanical and mechanical systems are shown to have remarkably similar rates of reliability improvement during system development. These similarities provide the basis for a learning curve which can be used to monitor development progress, predict growth patterns, and plan programs for reliability improvement.
Some life tests result in few or no failures. In such cases, it is difficult to assess reliability with traditional life tests that record only time to failure. For some devices, it is possible to obtain degradation measurements over time, and these measurements may contain useful information about product reliability. Even with little or no censoring, there may be important practical advantages to analyzing degradation data. If failure is defined in terms of a specified level of degradation, a degradation model defines a particular time-to-failure distribution. Generally it is not possible to obtain a closed-form expression for this distribution. The purpose of this work is to develop statistical methods for using degradation measures to estimate a time-to-failure distribution for a broad class of degradation models. We use a nonlinear mixed-effects model and develop methods based on Monte Carlo simulation to obtain point estimates and confidence intervals for reliability assessment. KEY WORDS: First crossing time; Nonlinear estimation; Random effect; Reliability
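The idea that a degradation model plus a failure threshold induces a time-to-failure distribution can be illustrated with a small simulation. This sketch is much simpler than the nonlinear mixed-effects approach described in the abstract. It assumes a linear degradation path with a lognormally distributed unit-to-unit rate, and all names and parameter values are hypothetical.

```python
import random


def simulate_failure_times(n_units, threshold, mu_log_rate, sigma_log_rate, seed=0):
    """Draw first-crossing times for the assumed path y(t) = rate * t.

    Each unit gets its own random degradation rate (the random effect);
    the unit fails when its path first reaches `threshold`.
    """
    rng = random.Random(seed)
    times = []
    for _ in range(n_units):
        rate = rng.lognormvariate(mu_log_rate, sigma_log_rate)
        times.append(threshold / rate)  # time at which rate * t == threshold
    return times


def empirical_cdf(times, t):
    """Monte Carlo estimate of F(t) = P(failure time <= t)."""
    return sum(1 for x in times if x <= t) / len(times)


# Hypothetical setting: threshold 10, median rate exp(0) = 1 per time unit.
failure_times = simulate_failure_times(20000, 10.0, 0.0, 0.5)
fraction_failed_by_10 = empirical_cdf(failure_times, 10.0)
```

With a median rate of 1, the median failure time under these assumptions is the threshold itself, so the estimated fraction failed by t = 10 should be close to one half.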
A survey of the research done on preventive maintenance is presented. The scope of the present survey is on the research published after the 1976 paper by Pierskalla and Voelker [98]. This article includes optimization models for repair, replacement, and inspection of systems subject to stochastic deterioration. A classification scheme is used that categorizes recent research into inspection models, minimal repair models, shock models, or miscellaneous replacement models.
The literature on maintenance models is surveyed. The focus is on work appearing since the 1965 survey, “Maintenance Policies for Stochastically Failing Equipment: A Survey” by John McCall and the 1965 book, The Mathematical Theory of Reliability, by Richard Barlow and Frank Proschan. The survey includes models which involve an optimal decision to procure, inspect, and repair and/or replace a unit subject to deterioration in service.
Remaining useful life estimation is central to the prognostics and health management of systems, particularly for safety-critical systems, and systems that are very expensive. We present a non-linear model to estimate the remaining useful life of a system based on monitored degradation signals. A diffusion process with a nonlinear drift coefficient with a constant threshold was transformed to a linear model with a variable threshold to characterize the dynamics and nonlinearity of the degradation process. This new diffusion process contrasts sharply with existing models that use a linear drift, and also with models that use a linear drift based on transformed data that were originally nonlinear. Both existing models are based on a constant threshold. To estimate the remaining useful life, an analytical approximation to the distribution of the first hitting time of the diffusion process crossing a threshold level is obtained in a closed form by a time-space transformation under a mild assumption. The unknown parameters in the established model are estimated using the maximum likelihood estimation approach, and goodness of fit measures are applied. The usefulness of the proposed model is demonstrated by several real-world examples. The results reveal that considering nonlinearity in the degradation process can significantly improve the accuracy of remaining useful life estimation.
This paper provides: an overview of the methods that have been developed since 1977 for solving various reliability optimization problems; applications of these methods to various types of design problems; and heuristics, metaheuristic algorithms, exact methods, reliability-redundancy allocation, multi-objective optimization and assignment of interchangeable components in reliability systems. Like other applications, exact solutions for reliability optimization problems are not necessarily desirable because exact solutions are difficult to obtain, and even when they are available, their utility is marginal. A majority of the work in this area is devoted to developing heuristic and metaheuristic algorithms for solving optimal redundancy-allocation problems.
Real-time condition monitoring is becoming an important tool in maintenance decision-making. Condition monitoring is the process of collecting real-time sensor information from a functioning device in order to reason about the health of the device. To make effective use of condition information, it is useful to characterize a device degradation signal, a quantity computed from condition information that captures the current state of the device and provides information on how that condition is likely to evolve in the future. If properly modeled, the degradation signal can be used to compute a residual-life distribution for the device being monitored, which can then be used in decision models. In this work, we develop Bayesian updating methods that use real-time condition monitoring information to update the stochastic parameters of exponential degradation models. We use these degradation models to develop a closed-form residual-life distribution for the monitored device. Finally, we apply these degradation and residual-life models to degradation signals obtained through the accelerated testing of bearings.
A problem-specific genetic algorithm (GA) is developed and demonstrated to analyze series-parallel systems and to determine the optimal design configuration when there are multiple component choices available for each of several k-out-of-n:G subsystems. The problem is to select components and redundancy-levels to optimize some objective function, given system-level constraints on reliability, cost, and/or weight. Previous formulations of the problem have implicit restrictions concerning the type of redundancy allowed, the number of available component choices, and whether mixing of components is allowed. GA is a robust evolutionary optimization search technique with very few restrictions concerning the type or size of the design problem. The solution approach was to solve the dual of a nonlinear optimization problem by using a dynamic penalty function. GA performs very well on two types of problems: (1) redundancy allocation originally proposed by Fyffe, Hines, and Lee, and (2) randomly generated problems with more complex k-out-of-n:G configurations.
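As background for the k-out-of-n:G subsystems mentioned above, the following sketch evaluates the reliability of a series arrangement of such subsystems. It covers only the objective-evaluation step, not the genetic algorithm itself. It assumes identical, independent components within each subsystem (no component mixing), and the function names and example values are hypothetical.

```python
from math import comb


def k_out_of_n_reliability(k, n, p):
    """Probability that at least k of n independent components work,
    each with reliability p (identical components assumed)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def series_system_reliability(subsystems):
    """Reliability of subsystems connected in series.

    `subsystems` is a list of (k, n, p) tuples; the system works only if
    every k-out-of-n:G subsystem works.
    """
    result = 1.0
    for k, n, p in subsystems:
        result *= k_out_of_n_reliability(k, n, p)
    return result


# Hypothetical design: a 1-out-of-2 subsystem followed by a 2-out-of-3 subsystem.
design = [(1, 2, 0.9), (2, 3, 0.9)]
system_reliability = series_system_reliability(design)
```

An optimizer such as a GA would call an evaluation like this repeatedly while searching over component choices and redundancy levels subject to cost and weight constraints.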
This is the first of two books on the statistical theory of reliability and life testing. The present book concentrates on probabilistic aspects of reliability theory, while the forthcoming book will focus on inferential aspects of reliability and life testing, applying the probabilistic tools developed in this volume. This book emphasizes the newer, research aspects of reliability theory. The concept of a coherent system serves as a unifying theme for much of the book. A number of new classes of life distributions arising naturally in reliability models are treated systematically: the increasing failure rate average, new better than used, decreasing mean residual life, and other classes of distributions. As the names would seem to indicate, each such class of life distributions provides a realistic probabilistic description of a physical property occurring in the reliability context. Also various types of positive dependence among random variables are considered, thus permitting more realistic modeling of commonly occurring reliability situations.
Partial table of contents: Reliability Concepts and Reliability Data. Nonparametric Estimation. Other Parametric Distributions. Probability Plotting. Bootstrap Confidence Intervals. Planning Life Tests. Degradation Data, Models, and Data Analysis. Introduction to the Use of Bayesian Methods for Reliability Data. Failure-Time Regression Analysis. Accelerated Test Models. Accelerated Life Tests. Case Studies and Further Applications. Epilogue. Appendices. References. Indexes.
This paper presents statistical models and methods for analyzing accelerated life-test data from step-stress tests. Maximum likelihood methods provide estimates of the parameters of such models, the life distribution under constant stress, and other information. While the methods are applied to the Weibull distribution and inverse power law, they apply to many other accelerated life test models. These methods are illustrated with step-stress data on time to breakdown of an electrical insulation.
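For orientation, one common way to write a Weibull life distribution whose scale follows an inverse power law under a constant stress V is sketched below. The symbols K, p, and β are notation chosen here for illustration and are not taken from the abstract; the step-stress analysis itself requires an additional model for how damage accumulates across stress levels, which this sketch does not cover.

```latex
F(t; V) = 1 - \exp\!\left[-\left(\frac{t}{\alpha(V)}\right)^{\beta}\right],
\qquad
\alpha(V) = \frac{K}{V^{p}},
\qquad K > 0,\ p > 0,\ \beta > 0 .
```

Under this form, raising the stress V shortens the characteristic life α(V) while leaving the shape parameter β unchanged.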
Many real systems are composed of multi-state components with different performance levels and several failure modes. These affect the whole system's performance. Most books on reliability theory cover binary models that allow a system only to function perfectly or fail completely. The Universal Generating Function in Reliability Analysis and Optimization is the first book that gives a comprehensive description of the universal generating function technique and its applications in binary and multi-state system reliability analysis. Features: an introduction to basic tools of multi-state system reliability and optimization; applications of the universal generating function in widely used multi-state systems; and examples of the adaptation of the universal generating function to different systems in mechanical, industrial and software engineering. This monograph will be of value to anyone interested in system reliability, performance analysis and optimization in industrial, electrical and nuclear engineering.
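The universal generating function idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not code from the book. It represents each component's performance distribution as a dictionary mapping performance level to probability, assumes the components are statistically independent, and uses hypothetical component data with flow-type composition (capacities add in parallel, and the minimum governs in series).

```python
from collections import defaultdict


def compose(u1, u2, operator):
    """Combine two u-functions, given as {performance: probability} dicts,
    using a structure operator applied to every pair of performance levels."""
    result = defaultdict(float)
    for g1, p1 in u1.items():
        for g2, p2 in u2.items():
            result[operator(g1, g2)] += p1 * p2  # independence assumed
    return dict(result)


def availability(u, demand):
    """Probability that system performance meets or exceeds the demand."""
    return sum(p for g, p in u.items() if g >= demand)


# Hypothetical multi-state components (performance = flow capacity).
pump_a = {0: 0.1, 50: 0.9}
pump_b = {0: 0.2, 30: 0.3, 60: 0.5}
pipe = {0: 0.05, 80: 0.95}

# Parallel pumps add their capacities; the pipe in series caps the flow.
parallel_pumps = compose(pump_a, pump_b, lambda a, b: a + b)
system = compose(parallel_pumps, pipe, min)
system_availability = availability(system, 60)
```

Collecting like terms after each composition keeps the representation compact, which is the main computational appeal of the technique for larger systems.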
Machining centers are complex systems that consist of multiple subsystems. When maintaining these subsystems, considering opportunistic maintenance can prevent frequent shutdowns during the machining process and reduce costs. This paper proposes an opportunistic maintenance strategy for machining centers. First, the reliability of each machining center subsystem is modeled, which serves as the basis for determining when to repair a subsystem. In this process, an improved average rank method is employed, which considers the time correlation of subsystem failures and can achieve better model-fitting results. In the opportunistic maintenance strategy, imperfect maintenance is considered. Additionally, the strategy includes direct maintenance costs, downtime costs, failure risk costs, and penalty costs for incomplete utilization of subsystems. The opportunistic maintenance threshold helps determine whether other subsystems need to be repaired during a given maintenance opportunity. The optimization objective is to minimize the total cost within the specified operating time. By modeling the reliability of subsystems using the failure data collected from five machining centers, the opportunistic maintenance strategy can reduce downtime by 10 times, preventive downtime by 29%, and cost by 7%. The results indicate that for machining centers or other complex systems, the opportunistic maintenance strategy proposed in this article can lead to good results.
Timely diagnosis and prognosis based on degradation symptoms are essential steps for condition-based maintenance (CBM) to guarantee industrial safety and productivity. Most industrial machines operate under variable operating conditions. These time-varying operating conditions can accelerate the machinery’s degradation process. They may also have a massive influence on the data and impede the diagnosis and prognosis of the machinery. To address these problems, this paper introduces an approach for modelling non-stationary long-term condition monitoring data. The procedure includes separating random and deterministic parts and identifying possible autodependence hidden in the random sequence, as well as potential time-dependent variance. To achieve these objectives, we employ a time-varying coefficient autoregressive (TVC-AR) model within a Bayesian framework. Due to the limited availability of diverse run-to-failure data sets, we validate the proposed procedure using a simulated degradation model and two widely recognized benchmark data sets (FEMTO and wind turbine drive), which demonstrate the model’s effectiveness in capturing complex non-stationary degradation characteristics.
Manufacturing system degradation can damage its reliability, resulting in decreased product quality and delayed deliveries. These challenges are characteristic of imperfect manufacturing systems. Moreover, the propagation of delay time across task periods may reduce the operational availability in future periods. Condition-based maintenance is an effective method for mitigating system degradation and enhancing reliability. However, existing condition-based maintenance studies often overlook the impact of delay propagation on operational availability. To address this issue, this paper proposes a condition-based maintenance model based on a Markov decision process. By introducing delay time as a state variable to capture changes in operational availability and incorporating it into the reward model, the proposed strategy aims to maximize enterprise profit. A case study and comparative analysis using data from a manufacturing enterprise validate the effectiveness and superiority of the proposed model in improving economic performance.
Over the past thirty years, reliability engineering has significantly evolved beyond its conventional focus on system reliability indices, profit evaluations, and cost-benefit analyses. With the advent of smart manufacturing, the field now integrates sophisticated stochastic modeling, multi-objective optimization, and AI-powered predictive maintenance. This review highlights key developments, including improvements in the reliability of single-unit, dual-unit, and multi-unit industrial systems, applications in various industries, the incorporation of renewable energy, and AI-driven monitoring and analysis. Furthermore, it identifies current research gaps and presents potential avenues for further innovation in reliability assessment.
Maintaining high-complexity aircraft requires resilient and data-driven maintenance planning. This article presents the efficient task allocation and packing problem solver (ETTAPS), a novel framework that integrates predictive analytics and optimisation models to generate adaptive maintenance schedules. ETTAPS employs a trial-and-error approach to optimise maintenance intervals, leveraging a branch-and-cut solver combined with first-fit decreasing (FFD) task grouping to minimise costs and enhance aircraft availability. Additionally, a random forest model, retrained using a rolling 24-month data window, continuously refines predictions, leading to progressive cost reductions and improved system reliability over multiple maintenance cycles. Our results demonstrate that ETTAPS significantly reduces maintenance costs and increases aircraft availability by efficiently grouping tasks and incorporating real-world constraints, such as mechanic skill levels, task dependencies and resource limitations. The framework addresses key gaps in MSG-3 and certification analysis, improving task scheduling efficiency and ensuring long-term operational resilience. Furthermore, ETTAPS lays the groundwork for integration with digital twins, real-time anomaly detection and flight planning systems, supporting a more intelligent and proactive approach to aircraft maintenance. This research advances resilience and sustainable aviation maintenance planning by optimising costs, reducing downtime and proactively adapting to operational demands. By aligning with Industry 4.0 and aviation sustainability goals for 2050, ETTAPS contributes to the next generation of intelligent maintenance systems.
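The first-fit decreasing heuristic named in the abstract can be sketched generically as follows. This is the textbook bin-packing version, not the ETTAPS implementation: it ignores skill levels, task dependencies, and the branch-and-cut step, and the task durations and capacity are hypothetical.

```python
def first_fit_decreasing(durations, capacity):
    """Group task durations into work packages of limited capacity.

    Tasks are sorted from longest to shortest, and each task is placed in
    the first package that still has room; a new package is opened if none fits.
    """
    packages = []
    for duration in sorted(durations, reverse=True):
        if duration > capacity:
            raise ValueError(f"task of length {duration} exceeds capacity {capacity}")
        for package in packages:
            if sum(package) + duration <= capacity:
                package.append(duration)
                break
        else:  # no existing package had room
            packages.append([duration])
    return packages


# Hypothetical task durations (hours) and a 10-hour maintenance slot.
packages = first_fit_decreasing([4, 8, 1, 4, 2, 1], 10)
```

Sorting first tends to place the awkward long tasks early, leaving short tasks to fill the remaining gaps, which is why the heuristic usually needs few packages despite being greedy.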
Switching time redundancy is crucial for optimizing the performance of standby systems, where a redundancy period commences upon the failure of the online operating unit, requiring the standby unit to activate within a designated random time interval to ensure operational continuity. This paper examines a two-unit warm standby system that integrates switching time redundancy. System failure can occur due to simultaneous failures of both units or failure of the standby unit's activation within the allowed redundancy period. The proposed warm standby system model with switching time redundancy is especially pertinent in critical applications, such as hospitals, where backup generators are vital for maintaining a continuous power supply to life-support systems and other essential functions. Our primary aim is to identify the optimal switching timing for the standby unit's transition to the online mode, striking a balance between key trade-offs: early activation may lead to excessive wear and increased operational costs, whereas delayed activation could result in detrimental downtime. We propose switching strategies aimed at (a) maximizing expected system lifetime and (b) maximizing expected operational profit. Numerical examples illustrate the practical applicability of these strategies and offer valuable guidance for effective operations management within warm standby systems.
Xiang Jia | Proceedings of the Institution of Mechanical Engineers Part O Journal of Risk and Reliability
The redundancy optimization problem is widely studied because redundancy is a useful technique for improving system reliability. In this paper, progress on this problem during the past two decades is reviewed. First, a framework is proposed that decomposes the problem into four main modules: component characteristics, redundancy strategy, system reliability modeling, and optimization algorithm. Next, the existing literature is classified into common and extended studies through this framework to identify the details and methods associated with each module thoroughly. The conclusions demonstrate that the framework is general enough to contain all the existing studies. Further, the widely used benchmarks are collected and grouped for ease of reference. Finally, directions for future study are identified from the gaps between the current studies and the framework, including missing studies on different combinations of elements, the expansion of component characteristics, more general redundancy structures, the exploitation of knowledge-inspired optimization algorithms, integration with other reliability-improving techniques, and a focus on real engineering systems.
It is challenging for the power industry to conduct reliability demonstration tests (RDTs) on smart electricity meters (SEMs) because such tests are time-consuming and sample sizes are limited. Accelerated reliability demonstration tests (ARDTs) have attracted widespread attention because they reduce testing duration. A novel RDT practice is carried out for SEMs in operating fields under extreme natural environments, which makes it difficult to design tests under uncontrolled accelerated stresses and to determine the acceleration factor (AF) using the current testing samples. Therefore, historical operating data are collected from similar SEMs operated in the field under extreme natural environments and normal conditions, and are then employed to estimate the SEM reliability and AFs based on the Lomax distribution. The accelerated acceptance sampling plans for SEMs are constructed by a combination of the operating characteristic functions and AFs. The ARDTs against reliable life, failure rate, and mean time between failures are developed. A case study verifies the effectiveness and validity of the methods. The methods developed in this research can be applied to ARDTs for products tested in the operating field under harsher-than-normal stresses.
Wang Deng-Long , Yang Yu , Yonghua Li +2 more | Proceedings of the Institution of Mechanical Engineers Part O Journal of Risk and Reliability
To address the issues of resource wastage and excessive maintenance caused by traditional preventive maintenance, which overlooks the actual operating conditions of mechanical equipment, a condition-based maintenance decision-making method for rolling bearings based on the Weibull proportional hazards model is proposed. First, to address the complex calculation and poor convergence of the traditional parameter estimation method for the Weibull proportional hazards model, a step-by-step parameter estimation method combined with an improved genetic algorithm is proposed. Second, by integrating covariates representing the actual operating conditions of the bearing, a Weibull proportional hazards model is developed to reflect the degradation state of the bearing. Finally, considering the issue of delayed maintenance that arises from using a single failure threshold model, the traditional maintenance threshold is optimized. Using bearing operational data as a case study, an optimized condition-based maintenance decision-making strategy, aimed at maximizing availability, is proposed. The results show that the proposed method can comprehensively account for the actual operating conditions, develop appropriate maintenance strategies, and enable condition-based maintenance decisions that balance operational safety and resource efficiency. Moreover, this method can be further extended to related fields, providing valuable reference for research on maintenance decision-making methods for other mechanical equipment.
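A Weibull proportional hazards model in its standard textbook form can be sketched as below. This is not the paper's estimation procedure or threshold optimization. It assumes a Weibull baseline hazard scaled by an exponential function of the covariates, treats the covariates as constant over time in the reliability function, and uses hypothetical parameter names and values.

```python
import math


def covariate_factor(gamma, z):
    """exp(gamma . z): multiplicative effect of the condition covariates."""
    return math.exp(sum(g * x for g, x in zip(gamma, z)))


def weibull_ph_hazard(t, beta, eta, gamma, z):
    """Hazard rate h(t | z) = (beta/eta) * (t/eta)**(beta - 1) * exp(gamma . z)."""
    baseline = (beta / eta) * (t / eta) ** (beta - 1)
    return baseline * covariate_factor(gamma, z)


def weibull_ph_reliability(t, beta, eta, gamma, z):
    """Reliability R(t | z) = exp(-(t/eta)**beta * exp(gamma . z)),
    valid when the covariates z stay constant over [0, t] (assumption)."""
    return math.exp(-((t / eta) ** beta) * covariate_factor(gamma, z))


# Hypothetical parameters: shape 2, scale 100 hours, two vibration covariates.
beta, eta, gamma = 2.0, 100.0, [0.8, 0.3]
hazard_healthy = weibull_ph_hazard(100.0, beta, eta, gamma, [0.0, 0.0])
hazard_degraded = weibull_ph_hazard(100.0, beta, eta, gamma, [1.0, 0.5])
```

A condition-based rule could then compare the current hazard against a maintenance threshold; how that threshold is chosen is the optimization question the abstract addresses.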
In recent years, the civil aviation industry has been booming at an unprecedented pace. Civil aircraft have become an indispensable means of transportation in modern society. However, with the frequent accidents of aircraft manufactured by Boeing in recent years, the public’s concern about the safety of civil aircraft has also been increasing. The high reliability of avionics equipment is of utmost importance for ensuring the safe flight of airliners. As the main DC power supply of general civil aircraft, the transformer rectifier unit (TRU) must be reliable, because it contributes greatly to flight safety. Taking the TRU of a certain civil aircraft as an example, this paper explores practical methods of reliability design for civil avionics equipment. It elaborates on the thermal design, stress analysis, redundancy design, and online replaceable module design of the TRU, aiming to provide strategic guidelines for enhancing the reliability of civil avionics electronic equipment and to support the stable and sustainable development of the aviation industry.
IEEE Reliability Magazine
In some test phases of equipment, the small sample size of test data and the absence of some maintenance operations may lead to a multi-peak phenomenon in data distribution, which is a challenge for Bayesian information fusion based on maintainability assessment. In this paper, prior information at two levels, the system level and the maintenance operation level, is integrated with the field test data via the Bayesian melding method (BMM). Mixture priors are used to avoid prior-data conflicts in the Bayesian framework, and a Bayesian posterior distribution is used to estimate system maintainability. Adaptive sampling importance resampling (ASIR) is used to overcome computational difficulties in simulation algorithms. Compared to the other methods, the proposed method provides more information sources for maintainability estimation, whose estimation effect is shown to be satisfactory based on two validation cases.