A uniform design (UD) seeks design points that are uniformly scattered on the domain. It has been popular since 1980. A survey of UD is given in the first portion: the fundamental idea and construction methods are presented and discussed, and examples are given for illustration. It is shown that UDs have many desirable properties for a wide variety of applications. Furthermore, we use the global optimization algorithm, threshold accepting, to generate UDs with low discrepancy. The relationship between uniformity and orthogonality is investigated. It turns out that most UDs obtained here are indeed orthogonal.
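
As a rough illustration of the construction idea (not the authors' actual implementation), the sketch below evaluates the centered L2-discrepancy of a U-type design and improves it by threshold accepting; the design size, threshold schedule, and swap move are arbitrary choices made for this example.

```python
import numpy as np

def centered_l2_discrepancy(X):
    """Squared centered L2-discrepancy of an n x d design with points in [0, 1]."""
    n, d = X.shape
    a = np.abs(X - 0.5)
    term1 = (13.0 / 12.0) ** d
    term2 = (2.0 / n) * np.sum(np.prod(1 + 0.5 * a - 0.5 * a ** 2, axis=1))
    prod = np.ones((n, n))                       # pairwise product term
    for k in range(d):
        xi = X[:, k][:, None]
        xj = X[:, k][None, :]
        prod *= 1 + 0.5 * np.abs(xi - 0.5) + 0.5 * np.abs(xj - 0.5) - 0.5 * np.abs(xi - xj)
    term3 = np.sum(prod) / n ** 2
    return term1 - term2 + term3

def threshold_accepting(levels, n, d, iters=5000, seed=0):
    """Search U-type designs (each column is a permutation of equally spaced levels)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([rng.permutation(levels) for _ in range(d)])
    best, best_cd = X.copy(), centered_l2_discrepancy(X)
    cd = best_cd
    for t in range(iters):
        threshold = 1e-3 * (1 - t / iters)       # shrinking acceptance threshold
        Y = X.copy()
        col = rng.integers(d)
        i, j = rng.choice(n, size=2, replace=False)
        Y[[i, j], col] = Y[[j, i], col]          # swap two entries in one column
        new_cd = centered_l2_discrepancy(Y)
        if new_cd - cd < threshold:              # accept small deteriorations too
            X, cd = Y, new_cd
            if cd < best_cd:
                best, best_cd = X.copy(), cd
    return best, best_cd

n, d = 12, 3
levels = (np.arange(n) + 0.5) / n                # n equally spaced levels in (0, 1)
design, cd2 = threshold_accepting(levels, n, d)
print("squared centered L2-discrepancy:", round(cd2, 6))
```
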
The projection properties of the $2_R^{q-p}$ fractional factorials are well known and have been used effectively in a number of published examples of experimental investigations. The Plackett and Burman designs also have interesting projective properties, knowledge of which allows the experimenter to follow up an initial Plackett and Burman design with runs that increase the initial resolution for the factors that appear to matter and thus permit efficient separation of effects of interest. Projections of designs into 2–5 dimensions are discussed, and the 12-run case is given in detail. A numerical example illustrates the practical uses of these projections.
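
For a sense of what such a projection looks like, the sketch below builds the 12-run Plackett–Burman design from one commonly tabulated generating row (an assumption of the example; other equivalent rows exist) and projects it onto three factors, counting how often each of the eight sign combinations occurs.

```python
import numpy as np
from itertools import product
from collections import Counter

# One commonly cited generating row for the 12-run Plackett-Burman design (11 factors).
gen = np.array([+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1])

# Cyclic shifts of the generating row, plus a final row of all -1's.
rows = [np.roll(gen, k) for k in range(11)]
rows.append(-np.ones(11, dtype=int))
D = np.array(rows)                      # 12 runs x 11 factors

# Project onto three arbitrary factors and tabulate the sign combinations that occur.
cols = [0, 3, 7]                        # any three columns would do
proj = D[:, cols]
counts = Counter(map(tuple, proj.tolist()))
for combo in product([-1, 1], repeat=3):
    print(combo, counts.get(combo, 0))
```
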
In this paper, we review multivariate control charts designed for monitoring changes in a covariance matrix that have been developed in the last 15 years. The focus is on control charts developed for multivariate normal processes, assuming that independent subgroups of observations or independent individual observations are sampled as process monitoring proceeds. Control charts developed between 1990 and 2005 are reviewed according to the type of control chart: multivariate Shewhart chart, multivariate CUSUM chart, and multivariate EWMA chart. In addition to these developments, a new multivariate EWMA control chart is proposed. We also discuss comparisons of chart performance that have been carried out in the literature, as well as the issue of diagnostics. Some potential future research ideas are also given.
Analysis of massive data sets is challenging owing to limitations of computer primary memory. In this paper, we propose an approach to estimate population parameters from a massive data set. The proposed approach significantly reduces the required amount of primary memory, and the resulting estimate is as efficient as if the entire data set were analyzed simultaneously. Asymptotic properties of the resulting estimate are studied, and the asymptotic normality of the resulting estimator is established. A standard error formula for the resulting estimate is proposed and empirically tested, so that statistical inference for the parameters of interest can be performed. The effectiveness of the proposed approach is illustrated using simulation studies and an Internet traffic data example.
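
The abstract does not spell out the estimator, but a minimal block-wise (divide-and-combine) sketch conveys the memory-saving idea: process the data one block at a time, keep only per-block sufficient summaries, and combine them at the end. The block size and the least-squares setting are assumptions of this example, not necessarily the paper's procedure.

```python
import numpy as np

def blockwise_ols(file_blocks):
    """Accumulate X'X and X'y block by block, so only one block is in memory at a time."""
    XtX, Xty, n = None, None, 0
    for Xb, yb in file_blocks:                 # each block is an (X, y) pair
        if XtX is None:
            p = Xb.shape[1]
            XtX, Xty = np.zeros((p, p)), np.zeros(p)
        XtX += Xb.T @ Xb
        Xty += Xb.T @ yb
        n += len(yb)
    beta = np.linalg.solve(XtX, Xty)           # identical to full-data OLS
    return beta, n

# Simulated "massive" data, streamed in blocks of 10,000 rows.
rng = np.random.default_rng(1)
true_beta = np.array([2.0, -1.0, 0.5])

def blocks(n_blocks=20, block_size=10_000):
    for _ in range(n_blocks):
        X = rng.normal(size=(block_size, 3))
        y = X @ true_beta + rng.normal(size=block_size)
        yield X, y

beta_hat, n = blockwise_ols(blocks())
print(n, beta_hat.round(3))
```
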
A commonly used follow-up experiment strategy involves the use of a foldover design obtained by reversing the signs of one or more columns of the initial design. Defining a foldover plan as the collection of columns whose signs are to be reversed in the foldover design, this article answers the following question: given a $2^{k-p}$ design with k factors and p generators, what is its optimal foldover plan? We obtain optimal foldover plans for 16 and 32 runs and tabulate the results for practical use. Most of these plans differ from traditional foldover plans that involve reversing the signs of one or all columns. There are several equivalent ways to generate a particular foldover design. We demonstrate that any foldover plan of a $2^{k-p}$ fractional factorial design is equivalent to a core foldover plan consisting of only p of the k factors. Furthermore, we prove that there are exactly $2^{k-p}$ foldover plans that are equivalent to any core foldover plan of a $2^{k-p}$ design and demonstrate how these foldover plans can be constructed. A new class of designs called combined-optimal designs is introduced. An n-run combined-optimal $2^{k-p}$ design is one for which the combined $2^{k-p+1}$ design, consisting of the initial design and its optimal foldover, has minimum aberration among all $2^{k-p}$ designs.
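
As a toy illustration of a foldover plan (not the article's tabulated optimal plans), the sketch below builds a $2^{4-1}$ design with generator D = ABC, reverses the signs of a chosen set of columns, and stacks the two halves into the combined design.

```python
import numpy as np
from itertools import product

# Initial 2^(4-1) design: full factorial in A, B, C with generator D = ABC.
base = np.array(list(product([-1, 1], repeat=3)))     # 8 runs x (A, B, C)
D = (base[:, 0] * base[:, 1] * base[:, 2]).reshape(-1, 1)
initial = np.hstack([base, D])                        # columns A, B, C, D

# A foldover plan is the set of columns whose signs are reversed.
plan = [3]                                            # e.g., reverse only column D
foldover = initial.copy()
foldover[:, plan] *= -1

combined = np.vstack([initial, foldover])             # 16-run combined design
print(combined)
```
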
This paper introduces a new multivariate exponentially weighted moving average (EWMA) control chart. The proposed control chart, called an EWMA V-chart, is designed to detect small changes in the variability of correlated multivariate quality characteristics. Through examples and simulations, it is demonstrated that the EWMA V-chart is superior to the |S|-chart in detecting small changes in process variability. Furthermore, a counterpart of the EWMA V-chart for monitoring the process mean, called the EWMA M-chart, is proposed. In detecting small changes in process variability, the combination of the EWMA M-chart and the EWMA V-chart is a better alternative to the combination of the MEWMA control chart (Lowry et al., 1992) and the |S|-chart. Furthermore, the EWMA M-chart and V-chart can be plotted in a single figure. For monitoring both the process mean and process variability, the combined MEWMA and EWMA V-charts provide the best control procedure.
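
The V-chart and M-chart statistics themselves are not given in the abstract; as background, here is a minimal sketch of the MEWMA statistic of Lowry et al. (1992) that the paper uses as a comparison benchmark, with the smoothing constant, in-control parameters, and shift size chosen arbitrarily for the example.

```python
import numpy as np

def mewma_statistics(X, mu0, sigma0, lam=0.2):
    """MEWMA T^2 statistics (Lowry et al., 1992) for an n x p observation matrix X."""
    n, p = X.shape
    z = np.zeros(p)
    stats = []
    for i, x in enumerate(X, start=1):
        z = lam * (x - mu0) + (1 - lam) * z
        # exact covariance of the EWMA vector at step i
        cov_z = (lam / (2 - lam)) * (1 - (1 - lam) ** (2 * i)) * sigma0
        stats.append(z @ np.linalg.solve(cov_z, z))
    return np.array(stats)

rng = np.random.default_rng(0)
p = 3
sigma0 = np.eye(p)
in_control = rng.multivariate_normal(np.zeros(p), sigma0, size=50)
shifted = rng.multivariate_normal(np.full(p, 0.5), sigma0, size=50)   # small mean shift
T2 = mewma_statistics(np.vstack([in_control, shifted]), np.zeros(p), sigma0)
print("max T^2 before shift:", T2[:50].max().round(2), "after:", T2[50:].max().round(2))
```
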
Estimating component and system reliabilities frequently requires using data from the system level. Because of cost and time constraints, however, the exact cause of system failure may be unknown. Instead, it may only be ascertained that the cause of system failure is due to a component in a subset of components. This paper develops methods for analysing such masked data from a Bayesian perspective. This work was motivated by a data set on a system unit of a particular type of IBM PS/2 computer. This data set is discussed and our methods are applied to it.
The minimum aberration criterion has been frequently used in the selection of fractional factorial designs with nominal factors. For designs with quantitative factors, however, level permutation of factors could alter their geometrical structures and statistical properties. In this paper, uniformity is used, in addition to the minimum aberration criterion, to further distinguish between fractional factorial designs. We show that minimum aberration designs have low discrepancies on average. An efficient method for constructing uniform minimum aberration designs is proposed, and optimal designs with 27 and 81 runs are obtained for practical use. These designs have good uniformity and are effective for studying quantitative factors.
This work estimates component reliability from masked series-system life data, viz., data where the exact component causing system failure might be unknown. The authors extend the results of Usher and Hodgson (1988) by deriving exact maximum likelihood estimators (MLE) for the general case of a series system of three exponential components with independent masking. Their previous work shows that closed-form MLE are intractable, and they propose an iterative method for the solution of a system of three nonlinear likelihood equations.
This paper estimates component reliability from masked series-system life data, viz., data where the exact component causing system failure might be unknown. It focuses on a Bayes approach which considers prior information on the component reliabilities. In most practical settings, prior engineering knowledge on component reliabilities is extensive. Engineers routinely use prior knowledge and judgment in a variety of ways. The Bayes methodology proposed here provides a formal, realistic means of incorporating such subjective knowledge into the estimation process. In the event that little prior knowledge is available, conservative or even noninformative priors can be selected. The model is illustrated for a 2-component series system of exponential components. In particular, it uses discrete-step priors because of their ease of development and interpretation. By taking advantage of the prior information, the Bayes point estimates consistently perform well, i.e., are close to the MLE. While the approach is computationally intensive, the calculations can be easily computerized.
In an order-of-addition experiment, each treatment is a permutation of $m$ components. It is often unaffordable to test all the $m!$ possible treatments, and thus the design problem arises. We consider a flexible model that incorporates the order of each pair of components and can also account for the distance between the two components in every such pair. Under this model, the optimality of the uniform design measure is established, via the approximate theory, for a broad range of criteria. Coupled with an eigenanalysis, this result serves as a benchmark that paves the way for assessing the efficiency and robustness of any exact design. The closed-form construction of a class of robust optimal fractional designs that can also facilitate model selection is explored and illustrated.
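
A common way to encode the order of each pair of components is the pairwise-order (PWO) coding; the sketch below builds such a model matrix for all permutations of $m$ components and, as a rough nod to the distance idea in the abstract, also records the positional gap of each pair. The coding conventions are assumptions of this example, not necessarily the authors' exact parametrization.

```python
import numpy as np
from itertools import permutations, combinations

def pwo_matrix(m):
    """Pairwise-order coding: column (j, k) is +1 if component j precedes k, else -1.
    Also return the signed positional distance between j and k in each permutation."""
    perms = list(permutations(range(m)))
    pairs = list(combinations(range(m), 2))
    Z = np.zeros((len(perms), len(pairs)), dtype=int)
    gap = np.zeros_like(Z)
    for r, perm in enumerate(perms):
        pos = {c: i for i, c in enumerate(perm)}          # position of each component
        for c, (j, k) in enumerate(pairs):
            Z[r, c] = 1 if pos[j] < pos[k] else -1
            gap[r, c] = pos[k] - pos[j]                   # signed distance between the pair
    return np.array(perms), Z, gap, pairs

perms, Z, gap, pairs = pwo_matrix(4)                      # 4! = 24 possible treatments
print("pairs:", pairs)
print("first permutation:", perms[0], "PWO row:", Z[0], "gaps:", gap[0])
```
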
Motivated by a vertical density profile problem, in this article we focus on monitoring the slopes of linear profiles. A Shewhart-type control chart for monitoring slopes of linear profiles is proposed. Both Phase I and Phase II applications are discussed. The performance of the proposed chart in Phase I applications is demonstrated using both real-life data in an illustrative example and simulated data in a probability-of-signal study. For Phase II applications, it is shown that the average run length (ARL) of the proposed control chart depends only on the shifts of the slopes, whereas the ARL of the multivariate $T^2$ chart depends on both the shifts of the slopes and the correlation between the estimated slope and the intercept. When such a correlation is low, the proposed control chart has better ARL performance than the $T^2$ chart.
A statistical process control chart called the cumulative probability control chart (CPC-chart) is proposed. The CPC-chart is motivated by two existing statistical control charts, the cumulative count control chart (CCC-chart) and the cumulative quantity control chart (CQC-chart). The CCC- and CQC-charts are effective in monitoring production processes when the defect rate is low and the traditional p- and c-charts do not perform well. In a CPC-chart, the cumulative probability of the geometric or exponential random variable is plotted against the sample number, and hence the actual cumulative probability is indicated on the chart. Apart from maintaining all the favourable features of the CCC- and CQC-charts, the CPC-chart is more flexible and resolves a technical plotting inconvenience of the CCC- and CQC-charts.
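
A minimal sketch of the plotting idea, assuming exponentially distributed quantities between defects and an illustrative false-alarm rate; the plotted statistic is simply the fitted exponential CDF evaluated at each observed quantity between events, and the probability-scale limits shown are an assumption of the example rather than the paper's recommended limits.

```python
import numpy as np

rng = np.random.default_rng(7)
lam0 = 0.01                                   # assumed in-control defect rate per unit quantity
times = rng.exponential(1 / lam0, size=30)    # observed quantities between successive defects

# CPC-chart statistic: cumulative probability of the exponential variable at each observation.
probs = 1 - np.exp(-lam0 * times)

alpha = 0.0027                                # illustrative overall false-alarm rate
lcl, ucl = alpha / 2, 1 - alpha / 2           # limits expressed directly on the probability scale
for i, p in enumerate(probs, 1):
    flag = "signal" if (p < lcl or p > ucl) else ""
    print(f"sample {i:2d}: cumulative probability = {p:.4f} {flag}")
```
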
Dimensional analysis (DA) is a well-developed, widely employed methodology in the physical and engineering sciences. The application of dimensional analysis in statistics leads to three advantages: (1) the reduction of the number of potential causal factors that we need to consider, (2) the analytical insights into the relations among variables that it generates, and (3) the scalability of results. The formalization of the dimensional-analysis method in statistical design and analysis gives a clear view of its generality and overlooked significance. In this paper, we first provide general procedures for dimensional analysis prior to statistical design and analysis. We illustrate the use of dimensional analysis with three practical examples. In the first example, we demonstrate the basic dimensional-analysis process in connection with a study of factors that affect vehicle stopping distance. The second example integrates dimensional analysis into the regression analysis of the pine tree data. In our third example, we show how dimensional analysis can be used to develop a superior experimental design for the well-known paper helicopter experiment. In the regression example and in the paper helicopter experiment, we compare results obtained via the dimensional-analysis approach to those obtained via conventional approaches. From those, we demonstrate the general properties of dimensional analysis from a statistical perspective and recommend its usage based on its favorable performance.
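
To make the reduction-of-factors advantage concrete, here is a textbook-style Buckingham-$\pi$ sketch for a stopping-distance study; the variable list (distance $d$, speed $v$, gravitational acceleration $g$, friction coefficient $\mu$) is an assumption of this illustration, not necessarily the variable set used in the paper. With four quantities and two base dimensions (length and time), the $\pi$ theorem leaves $4 - 2 = 2$ dimensionless groups, for example $\pi_1 = dg/v^2$ and $\pi_2 = \mu$, so the physical relation $d = f(v, g, \mu)$ collapses to
$$ \frac{d\,g}{v^{2}} = \phi(\mu), $$
and an experiment need vary only one dimensionless factor instead of three raw factors.
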
When studying both location and dispersion effects in unreplicated fractional factorial designs, a "standard" procedure is to identify location effects using ordinary least squares analysis, fit a model, and then identify dispersion effects by analyzing the residuals. In this paper, we show that if the model in the above procedure does not include all active location effects, then null dispersion effects may be mistakenly identified as active. We derive an exact relationship between location and dispersion effects, and we show that, without information in addition to the unreplicated fractional factorial (such as replication), we cannot determine whether a dispersion effect or two location effects are active.
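
As a sketch of the "standard" residual-based procedure that the paper critiques (a generic Box–Meyer-style statistic; the design, the fitted location model, and the simulated effects are assumptions of this example), one fits a location model and then compares residual spread between the two levels of each column.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(3)

# Unreplicated 2^4 full factorial in coded units.
X = np.array(list(product([-1, 1], repeat=4)), dtype=float)
y = 5 + 2 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=np.where(X[:, 2] > 0, 2.0, 0.5))

# Step 1: fit a location model (here: main effects only) by ordinary least squares.
A = np.hstack([np.ones((16, 1)), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

# Step 2: for each column, compare residual variability between its two levels.
for j in range(4):
    s_plus = resid[X[:, j] > 0].std(ddof=1)
    s_minus = resid[X[:, j] < 0].std(ddof=1)
    d_stat = np.log(s_plus**2 / s_minus**2)       # Box-Meyer-style dispersion statistic
    print(f"column {j + 1}: dispersion statistic = {d_stat: .2f}")
```
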
In unreplicated $2^{k-p}$ designs, the assumption of constant variance is commonly made. When the variance of the response differs between the two levels of a column in the effect matrix, that column produces a dispersion effect. In this article we show that two active dispersion effects may create a spurious dispersion effect in their interaction column. Most existing methods for dispersion-effect testing in unreplicated fractional factorial designs are subject to these spurious effects. We propose a method of dispersion-effect testing based on geometric means of residual sample variances. We show through examples from the literature and simulations that the proposed test has many desirable properties that are lacking in other tests.
In this paper, we propose a new variables control chart, called the box-chart, to simultaneously monitor, on a single chart, the process mean and process variability for multivariate processes. The box-chart uses a probability integral transformation to obtain two independent, identically distributed uniform random variables. Therefore, a box-shaped (thus the name), two-dimensional control chart can be constructed. We discuss in detail how to construct the box-chart. The proposed chart is applied to two real-life examples. The performance of the box-chart is also compared to that of the traditional $T^2$- and |S|-charts.
When a problem occurs in a system, its causes should be identified for the problem to be fixed. Ishikawa Cause and Effect (CE) diagrams are popular tools to investigate and identify numerous different causes of a problem. A CE diagram can be used as a guideline to allocate resources and make necessary investments to fix the problem. Although important decisions are based on CE diagrams, there is a scarcity of analytical methodology that supports the construction of these diagrams. We propose a methodology based on capture-recapture analysis to analytically estimate the number of causes of a problem and build CE diagrams. An estimate of the number of causes can be used to determine whether the CE study should be terminated or additional iterations are required. It is shown that integration of capture-recapture analysis concepts into CE diagrams enables users to evaluate the progress of CE sessions.
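
As a rough illustration of the capture-recapture idea applied to cause counting, here is the two-session Lincoln–Petersen/Chapman estimator; the session data are made up, and the paper's actual estimator may differ.

```python
# Two independent CE brainstorming sessions list candidate causes of a problem.
session1 = {"operator fatigue", "tool wear", "raw material", "calibration", "humidity"}
session2 = {"tool wear", "calibration", "humidity", "software bug", "fixture slip", "training"}

n1, n2 = len(session1), len(session2)
m = len(session1 & session2)                     # causes captured in both sessions

# Chapman's (nearly unbiased) version of the Lincoln-Petersen estimator.
n_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
observed = len(session1 | session2)

print(f"distinct causes observed so far: {observed}")
print(f"estimated total number of causes: {n_hat:.1f}")
print(f"estimated causes still unlisted: {n_hat - observed:.1f}")
```
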
Monitoring several correlated quality characteristics of a process is common in modern manufacturing and service industries. Although a lot of attention has been paid to monitoring the multivariate process mean, not many control charts are available for monitoring the covariance matrix. This paper presents a comprehensive overview of the literature on control charts for monitoring the covariance matrix in a multivariate statistical process monitoring (MSPM) framework. It classifies the research that has previously appeared in the literature. We highlight the challenging areas for research and provide some directions for future research.
Supersaturated designs (SSDs) have received much recent interest because of their potential in factor screening experiments. In this paper, we provide equivalent conditions for two columns to be fully aliased and consequently propose methods for constructing $E(f_{NOD})$- and $\chi^2$-optimal mixed-level SSDs without fully aliased columns, via equidistant designs and difference matrices. The methods can be easily performed, and many new optimal mixed-level SSDs have been obtained. Furthermore, it is proved that the nonorthogonality between columns of the resulting design is well controlled by the source designs. A rather complete list of newly generated optimal mixed-level SSDs is tabulated for practical use.
Response Surface Methodology is concerned with fitting a surface to a typically small set of observations with the purpose of determining what levels of the independent variables maximize the response. This usually entails fitting a quadratic regression function to the available data and calculating the function's derivatives. Artificial Neural Networks are information-processing paradigms inspired by the way the human brain processes information. They are known to be universal function approximators under certain general conditions. This ability to approximate functions to any desired degree of accuracy makes them an attractive tool for use in a Response Surface analysis. This paper presents Artificial Neural Networks as a tool for Response Surface Methodology and demonstrates their use empirically.
The bias problem of the maximum-likelihood estimate (MLE) of the common shape parameter of several Weibull populations is examined in detail. A modified MLE (MMLE) approach is proposed. In the case of complete and Type II censored data, the bias of the MLE can be substantial. This is noticeable even when the sample size is large. Such bias increases rapidly as the degree of censorship increases and as more populations are involved. The proposed MMLE, however, is nearly unbiased and much more efficient than the MLE, irrespective of the degree of censorship, the sample sizes, and the number of populations involved.
Interaction is very common in reality but has received little attention in the logistic regression literature. This is especially true for higher-order interactions. In conventional logistic regression, interactions are typically ignored. We propose a model selection procedure that implements an association rules analysis. We do this by (1) exploring the combinations of input variables which have significant impacts on the response (via association rules analysis); (2) selecting the potential (low- and high-order) interactions; (3) converting these potential interactions into new dummy variables; and (4) performing variable selection among all the input variables and the newly created dummy variables (interactions) to build the optimal logistic regression model. Our model selection procedure establishes the optimal combination of main effects and potential interactions. Comparisons are made through thorough simulations. It is shown that the proposed method outperforms the existing methods in all cases. A real-life example is discussed in detail to demonstrate the proposed method.
The combination generator, first proposed by Wichmann and Hill (1982), is constructed by taking the fractional part of the sum of several random number generators. It is probably one of the most popular random number generators in use. Its empirical performance is superior to that of the classical Lehmer congruential generator. However, its theoretical justification is somewhat primitive. In this paper, we give some theoretical support for this important generator from a statistical-theory viewpoint. Specifically, we prove that the combination generator is superior to each component random number generator in terms of (1) uniformity and (2) independence.
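
The construction is easy to state in code; below is a minimal sketch of the classic Wichmann–Hill combination generator using its published constants (the seeds are arbitrary), returning the fractional part of the sum of three small Lehmer-type generators.

```python
def wichmann_hill(seed1=1, seed2=2, seed3=3):
    """Yield uniform(0, 1) variates from the Wichmann-Hill (1982) combination generator."""
    s1, s2, s3 = seed1, seed2, seed3
    while True:
        s1 = (171 * s1) % 30269
        s2 = (172 * s2) % 30307
        s3 = (170 * s3) % 30323
        # fractional part of the sum of the three component generators
        yield (s1 / 30269.0 + s2 / 30307.0 + s3 / 30323.0) % 1.0

gen = wichmann_hill()
sample = [next(gen) for _ in range(5)]
print([round(u, 6) for u in sample])
```
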
Dimensional analysis (DA) is a methodology widely used in physics and engineering. The main idea is to extract key variables based on physical dimensions. Its overlooked importance in statistics has been recognized recently. However, most literature treats DA as merely a preprocessing tool, leading to multiple statistical issues. In particular, there are three critical aspects: (a) the nonunique choice of basis quantities and dimensionless variables; (b) the statistical representation and testing of DA constraints; and (c) the spurious correlations between post-DA variables. There is an immediate need for an appropriate statistical methodology that integrates DA and quantitative modeling. In this article, we propose a power-law type of "DA conjugate" model that is useful for incorporating dimensional information and analyzing post-DA variables. Adapting the idea of "conjugacy" from Bayesian analysis, we show that the proposed modeling technique not only produces flexible and effective results, but also provides good solutions to the above three issues. A modified projection pursuit regression analysis is implemented to fit the additive power-law model. A numerical study on ocean wave speed is discussed in detail to illustrate and evaluate the advantages of the proposed procedure. Supplementary materials for this article are available online.
Classical statistical process control often relies on univariate characteristics. In many contemporary applications, however, the quality of products must be characterized by some functional relation between a response variable and its explanatory variables. Monitoring such functional profiles has been a rapidly growing field due to increasing demands. This paper develops a novel nonparametric $L_1$ location-scale model to screen the shapes of profiles. The model is built on three basic elements: location shifts, local shape distortions, and overall shape deviations, which are quantified by three individual metrics. The proposed approach is applied to the previously analyzed vertical density profile data, leading to some interesting insights.
A consistent product/process will have little variability, i.e., dispersion. The widely used unreplicated two-level fractional factorial designs can play an important role in detecting dispersion effects with a minimum expenditure of resources. In this paper we develop a nonparametric dispersion test for unreplicated two-level fractional factorial designs. The test statistic is defined, critical values are provided, and large-sample approximations are given. Through simulations and examples from the literature, the test is compared to general nonparametric dispersion tests and a parametric test based on a normality assumption. These comparisons show the test to be the most robust of those studied and even superior to the normality-based test under normality in some situations. An example is given where this new test is the only one of those studied that does not incorrectly detect a spurious dispersion effect.
We propose a low-storage, single-pass, sequential method for the execution of convex hull peeling for massive datasets. The method is shown to vastly reduce the computation time relative to the existing convex hull peeling algorithm, from $O(n^2)$ to $O(n)$. Furthermore, the proposed method has significantly smaller storage requirements than the existing method. We present algorithms for low-storage, sequential computation of both the convex hull peeling multivariate median and the convex hull peeling pth depth contour, where 0 < p < 1. We demonstrate the accuracy and reduced computation time of the proposed method by comparing it to the existing convex hull peeling method through simulation studies.
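
For context, the sketch below implements the conventional (multi-pass) convex hull peeling that the paper speeds up: repeatedly strip the vertices of the current convex hull and take the centroid of the innermost remaining points as the peeling median. The stopping rule and the simulated data are assumptions of this example.

```python
import numpy as np
from scipy.spatial import ConvexHull

def convex_hull_peeling_median(points):
    """Peel convex hull layers until too few points remain; return centroid of the core."""
    remaining = np.asarray(points, dtype=float)
    while len(remaining) > 3:                      # need at least 3 points for a 2-D hull
        hull = ConvexHull(remaining)
        mask = np.ones(len(remaining), dtype=bool)
        mask[hull.vertices] = False                # drop the outermost layer
        if not mask.any():                         # everything was on the hull: stop
            break
        remaining = remaining[mask]
    return remaining.mean(axis=0)

rng = np.random.default_rng(42)
data = rng.normal(size=(500, 2))
print("convex hull peeling median:", convex_hull_peeling_median(data).round(3))
```
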
The D-optimal minimax criterion is proposed to construct fractional factorial designs. The resulting designs are very efficient and robust against misspecification of the effects in the linear model. The criterion was first proposed by Wilmut & Zhou (2011); their work is limited to two-level factorial designs, however. In this paper we extend this criterion to designs with factors having any number of levels (including mixed levels) and explore several important properties of this criterion. Theoretical results are obtained for the construction of fractional factorial designs in general. This minimax criterion is not only scale invariant, but also invariant under level permutations. Moreover, it can be applied to any run size. This is an advantage over some other existing criteria.
Process capability analysis is an important aspect of quality control. Various process capability indices have been proposed when the distribution is normal. For non-normal cases, percentile- and yield-based indices have been introduced. These two methods use partial features of a process distribution, such as key percentiles and the proportion of non-conforming (PNC) items, to estimate the process capability. However, these local features may not reflect the uniformity of a process appropriately when the distribution is non-normal. In this paper, the continuous ranked probability score (CRPS) is introduced to process capability analysis and a CRPS-based approach is proposed. This method can assess the dispersion of process variation across the overall distribution and is applicable to any continuous distribution. An example and simulations show that CRPS-based indices are more stable and accurate indicators of process capability than the existing indices in reflecting the degree of process fluctuation.
Profile data emerge when the quality of a product or process is characterized by a functional relationship among (input and output) variables. In this paper, we focus on the case where each profile has one response variable Y, one explanatory variable x, and the functional relationship between these two variables can be rather arbitrary. The basic concept can be applied to a much wider class of cases, however. We propose a general method based on the Generalized Likelihood Ratio Test (GLRT) for monitoring profile data. The proposed method uses nonparametric regression to estimate the on-line profiles and thus does not require any functional form for the profiles. Both Shewhart-type and EWMA-type control charts are considered. The average run length (ARL) performance of the proposed method is studied. It is shown that the proposed GLRT-based control chart can efficiently detect both location and dispersion shifts of the on-line profiles from the baseline profile. An upper control limit (UCL) corresponding to a desired in-control ARL value is constructed.
It is sometimes favorable to conduct experiments in a systematic run order rather than the conventional random run order. In this article, we propose an algorithm to construct the optimal run order. The algorithm is very flexible: it is applicable to any design and works whenever the optimality criterion can be represented as a distance between any two experimental runs. Specifically, the proposed method first formulates the run-order problem as a graph and then makes use of an existing traveling salesman problem solver to obtain the optimal run order. It can always reach the optimal result in an efficient manner. A special case where level change is used as the distance criterion is investigated thoroughly. The optimal run orders for popular two-level designs are obtained and tabulated for practical use. For higher- or mixed-level designs a generic table is not possible, although the proposed algorithm still works well for finding the optimal run order. Some supporting theoretical results are derived.
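
A minimal sketch of the graph formulation, assuming the number of factor-level changes between runs as the distance and using brute-force enumeration in place of a real traveling salesman solver (feasible only for tiny designs such as the one below):

```python
import numpy as np
from itertools import product, permutations

# A small design: the 2^3 full factorial.
design = np.array(list(product([-1, 1], repeat=3)))
n = len(design)

# Distance between two runs = number of factor-level changes between them.
dist = (design[:, None, :] != design[None, :, :]).sum(axis=2)

def total_changes(order):
    return sum(dist[order[i], order[i + 1]] for i in range(len(order) - 1))

# Brute force over run orders starting from run 0 (a TSP-path solver would replace this).
best_order = min((list((0,) + p) for p in permutations(range(1, n))), key=total_changes)
print("run order:", best_order, "total level changes:", total_changes(best_order))
```
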
Since the beginning of 2020, the coronavirus disease 2019 (COVID-19) has spread rapidly in the city of Wuhan, P.R. China, and subsequently across the world. The swift spread of the virus is largely attributed to its stealth transmissions, in which infected patients may be asymptomatic. Undetected transmissions present a remarkable challenge for the containment of the virus and pose an appalling threat to the public. An urgent question that has been asked by the public is "Should I be tested for COVID-19 if I am sick?" While different regions established their own criteria for screening infected cases, the screening criteria have been modified based on new evidence and understanding of the virus as well as the availability of resources. The shortage of test kits and medical personnel has considerably limited our ability to do as many tests as possible. Public health officials and clinicians are facing a dilemma of balancing limited resources and unlimited demands. On one hand, they are striving to achieve the best outcome by optimizing the usage of the scant resources. On the other hand, they are challenged by patients' frustrations and anxieties, stemming from concerns about not being tested for COVID-19 because of not meeting the definition of a PUI (person under investigation). In this paper, we evaluate the situation from the statistical viewpoint by factoring in the uncertainty and inaccuracy of the test, an issue that is often overlooked by the general public. We aim to shed light on the tough situation by providing evidence-based reasoning from the statistical angle, and we expect this examination will help the general public understand and assess the situation rationally. Most importantly, the development offers recommendations for physicians to make sensible evaluations to optimally use the limited resources for the best medical outcome.
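
The test-inaccuracy point can be made concrete with Bayes' rule: the sketch below computes the post-test probability of infection from assumed sensitivity, specificity, and pre-test prevalence values (the numbers are illustrative, not those used in the paper).

```python
def post_test_probability(prevalence, sensitivity, specificity):
    """Probability of infection given a positive test (positive predictive value)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Illustrative values only: a moderately accurate test at two pre-test prevalences.
for prev in (0.01, 0.30):
    ppv = post_test_probability(prevalence=prev, sensitivity=0.90, specificity=0.95)
    print(f"pre-test prevalence {prev:.0%} -> probability infected given positive: {ppv:.1%}")
```
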
Computational capability often falls short when confronted with massive data, posing a common challenge in establishing statistical models or inference methods for big data. While subsampling techniques have been extensively developed to downsize the data volume, there is a notable gap in addressing the unique challenge of handling extensive reliability data, in which a large proportion of the data is commonly censored. In this article, we propose an efficient subsampling method for reliability analysis in the presence of censored data, with the aim of estimating the parameters of the lifetime distribution. Moreover, a novel method is proposed for subsampling from severely censored data, i.e., data in which only a tiny proportion of observations are complete. The subsampling-based estimators are given, and their asymptotic properties are derived. The optimal subsampling probabilities are derived through the L-optimality criterion, which minimizes the trace of the product of the asymptotic covariance matrix and a constant matrix. Efficient algorithms are proposed to implement the proposed subsampling methods, addressing the challenge that the optimal subsampling strategy depends on parameter estimates obtained from the full data. A real-world hard-drive dataset and simulation studies are employed to demonstrate the superior performance of the proposed methods.
It is difficult to handle the extraordinary data volume generated in many fields with current computational resources and techniques. This is very challenging when applying conventional statistical methods to big data. A common approach is to partition the full data into smaller subdata for purposes such as training, testing, and validation. The primary purpose of the training data is to represent the full data. To achieve this goal, the selection of training subdata becomes pivotal in retaining essential characteristics of the full data. Recently, several procedures have been proposed to select "optimal design points" as training subdata under pre-specified models, such as linear regression and logistic regression. However, these subdata will not be "optimal" if the assumed model is not appropriate. Furthermore, such subdata cannot be used to build alternative models because they are not an appropriate representative sample of the full data. In this article, we propose a novel algorithm for better model building and prediction via a process of selecting a "good" training sample. The proposed subdata retain most characteristics of the original big data. They are also more robust, in that one can fit various response models and select the optimal one. Supplementary materials for this article are available online.
Run length distributions are generally used to characterize the performance of a control chart in signaling alarms when a process is out of control. Since it is usually difficult to directly compare distributions, statistics of the run length distribution are commonly adopted as performance criteria in practice. In particular, the average run length (ARL) and its extended versions play a dominant role. However, due to the skewness of the run length distribution, the ARL cannot accurately reflect the central tendency and may be misleading in some cases. In order to comprehensively summarize the information in the run length distribution, a novel criterion is proposed based on the continuous ranked probability score (CRPS). The CRPS-based criterion measures the difference between the run length distribution and the ideal constant value 0 for the run length. It has the advantages of easy computation and good interpretability. Furthermore, theoretical properties and a geometric representation guarantee that the CRPS-based criterion is statistically consistent, informative about both the first and second moments of the run length distribution, and robust to extreme values. Results of numerical experiments show that the proposed criterion favors control charts with a higher probability of detecting outliers earlier, and is a superior metric for characterizing the run length distribution.
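
A minimal sketch of how such a criterion can be computed, using the standard sample (energy) form of the CRPS against the ideal point value 0 for simulated run lengths of two hypothetical charts; the geometric run-length model and the chart parameters are assumptions of this example, not the paper's exact formulation.

```python
import numpy as np

def crps_against_zero(run_lengths):
    """Sample CRPS between the empirical run-length distribution and the ideal value 0:
    CRPS = E|RL - 0| - 0.5 * E|RL - RL'| (energy form)."""
    x = np.asarray(run_lengths, dtype=float)
    term1 = np.mean(np.abs(x))                       # = E[RL], since run lengths are positive
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return term1 - term2

rng = np.random.default_rng(0)
# Hypothetical out-of-control run lengths from two charts, modeled as geometric variables.
chart_a = rng.geometric(p=0.10, size=2000)           # detects shifts faster on average
chart_b = rng.geometric(p=0.05, size=2000)

for name, rl in [("chart A", chart_a), ("chart B", chart_b)]:
    print(f"{name}: ARL = {rl.mean():5.1f}, CRPS vs 0 = {crps_against_zero(rl):5.1f}")
```
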
Comments on Xiao-Li Meng's "Double Your Variance, Dirtify Your Bayes, Devour Your Pufferfish, and Draw Your Kidstogram," The New England Journal of Statistics in Data Science (New England Statistical Society), by Dennis K.J. Lin.
Sequential Latin hypercube designs have recently received great attention for computer experiments. Much of the existing work has been restricted to invariant spaces; the related systematic construction methods are inflexible, while algorithmic methods are ineffective for large designs. For such designs in space contraction, systematic construction methods have not yet been investigated. This paper proposes a new method for constructing sequential Latin hypercube designs via good lattice point sets in a variety of experimental spaces. These designs are called sequential good lattice point sets. Moreover, we provide fast and efficient approaches for identifying the (nearly) optimal sequential good lattice point sets under a given criterion. Combining with the linear level permutation technique, we obtain a class of asymptotically optimal sequential Latin hypercube designs in invariant spaces, where the $L_1$-distance in each stage is either optimal or asymptotically optimal. Numerical results demonstrate that the sequential good lattice point set has a better space-filling property than existing sequential Latin hypercube designs in the invariant space. It is also shown that the sequential good lattice point sets have less computational complexity and more adaptability.
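
For reference, here is a minimal sketch of the classical (non-sequential) good lattice point construction that these designs build on: with run size n and a generating vector of integers coprime to n, the design points are the scaled modular multiples. The generating vector below is an arbitrary choice for illustration, not an optimized one.

```python
import numpy as np
from math import gcd

def good_lattice_points(n, generators):
    """Good lattice point set: x_ij = ((i * h_j) mod n) / n, i = 1..n, for generators h_j."""
    assert all(gcd(h, n) == 1 for h in generators), "generators must be coprime to n"
    i = np.arange(1, n + 1)[:, None]
    h = np.array(generators)[None, :]
    return ((i * h) % n) / n            # n runs, one column per generator

D = good_lattice_points(17, generators=[1, 3, 5])   # 17 runs in 3 dimensions
print(D[:5].round(3))
# Each column is a permutation of {0, 1/n, ..., (n-1)/n}: the Latin hypercube property.
```
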
Interval data are widely used in many fields, notably in economics, industry, and health areas. Analogous to the scatterplot for single-valued data, the rectangle plot and cross plot are the conventional visualization methods for the relationship between two variables in interval form. These methods do not provide much information for assessing complicated relationships, however. In this article, we propose two visualization methods: segment and dandelion plots. They offer much more information than the existing visualization methods and allow us to gain a much better understanding of the relationship between two variables in interval form. A general guide for reading these plots is provided. Relevant theoretical support is developed. Both empirical and real-data examples are provided to demonstrate the advantages of the proposed visualization methods. Supplementary materials for this article are available online.
The detection performance of a conventional control chart is usually degraded by a large sample size, as noted in Wang and Tsung. This paper proposes a new control chart for data-rich environments. The proposed chart is based on the continuous ranked probability score and aims to simultaneously monitor the location and scale parameters of any continuous process. We simulate different monitoring schemes with various shift patterns to examine the chart's performance. Both in-control and out-of-control performances are studied through simulation studies in terms of the mean, the standard deviation, the median, and some percentiles of the run length distribution. Simulation results show that the proposed chart keeps a high sensitivity to shifts in location and/or scale without any distributional assumptions, and the outperformance improves as the sample size becomes larger. Examples are given for illustration.
In an order-of-addition experiment, each treatment is a permutation of m components. It is often unaffordable to test all the m! treatments, and the design problem arises. We consider a model that incorporates the order of each pair of components and can also account for the distance between the two components in every such pair. Under this model, the optimality of the uniform design measure is established, via the approximate theory, for a broad range of criteria. Coupled with an eigen-analysis, this result serves as a benchmark that paves the way for assessing the efficiency and robustness of any exact design. The closed-form construction of a class of robust optimal fractional designs is then explored and illustrated.
A regularized artificial neural network (RANN) is proposed for interval-valued data prediction. The ANN model is selected due to its powerful capability in fitting linear and nonlinear functions. To meet the mathematical coherence requirement for an interval (i.e., the predicted lower bounds should not cross over their upper bounds), a soft non-crossing regularizer is introduced into the interval-valued ANN model. We conduct extensive experiments based on both simulation datasets and real-life datasets, and compare the proposed RANN method with multiple traditional models, including the linear constrained center and range method (CCRM), the least absolute shrinkage and selection operator-based interval-valued regression method (Lasso-IR), the nonlinear interval kernel regression (IKR), the interval multi-layer perceptron (iMLP) and the multi-output support vector regression (MSVR). Experimental results show that the proposed RANN model is an effective tool for interval-valued prediction tasks with high prediction accuracy.
Dimensional analysis (DA) is a methodology widely used in physics and engineering. The main idea is to extract key variables based on physical dimensions. Its overlooked importance in statistics has been recognized recently. However, most literature treats DA as merely a preprocessing tool, leading to multiple statistical issues. In particular, there are three critical aspects: (a) the nonunique choice of basis quantities and dimensionless variables; (b) the statistical representation and testing of DA constraints; (c) the spurious correlations between post-DA variables. There is an immediate need for an appropriate statistical methodology that integrates DA and quantitative modeling. In this article, we propose a power-law type of "DA conjugate" model that is useful for incorporating dimensional information and analyzing post-DA variables. Adapting the idea of "conjugacy" in Bayesian analysis, we show that the proposed modeling technique not only produces flexible and effective results, but also provides good solutions to the above three issues. A modified projection pursuit regression analysis is implemented to fit the additive power-law model. A numerical study on ocean wave speed is discussed in detail to illustrate and evaluate the advantages of the proposed procedure. Supplementary materials for this article are available online.
We introduce systematic methods to create optimal designs for order-of-addition (OofA) experiments, those that study the order in which $m$ components are applied---for example, the order in which chemicals are added to a reaction or layers are added to a film. Full designs require $m!$ runs, so we investigate design fractions. Balance criteria for creating such designs employ an extension of orthogonal arrays (OA's) to OofA-OA's. A connection is made between $D$-efficient and OofA-OA designs. Necessary conditions are found for the number of runs needed to create OofA-OA's of strengths 2 and 3. We create a number of new, optimal designs: 12-run OofA-OA's in 4 and 5 components, 24-run OofA-OA's in 5 and 6 components, and near OofA-OA's in 7 components. We extend these designs to include (a) process factors, and (b) the common case in which component orderings are restricted. We also suggest how such designs may be analyzed.
Process capability analysis is an important aspect of quality control. Various process capability indices have been proposed when the distribution is normal. For non-normal cases, percentile and yield-based indices have been introduced. These two methods use partial features of a process distribution, such as key percentiles and the proportion of non-conforming (PNC), to estimate the process capability. However, these local features may not reflect the uniformity of a process appropriately when the distribution is non-normal. In this paper, the continuous ranked probability score (CRPS) is introduced to process capability analysis and a CRPS-based approach is proposed. This method can assess the dispersion of process variation across the overall distribution and is applicable to any continuous distribution. An example and simulations show that CRPS-based indices are more stable and accurate indicators of process capability than the existing indices in reflecting the degree of process fluctuation. Copyright © 2016 John Wiley & Sons, Ltd.
Abstract In this article, we discuss two multivariate control charts for monitoring changes in a covariance matrix. The |S| chart is applicable when the subgroup size n is larger than the dimensionality p. The multivariate exponentially weighted moving squared-deviation (MEWMS) chart is specifically designed for the case when n = 1. Examples are given and discussed to demonstrate how each chart can be used in practice. Related issues and potential future research are also discussed.
Dimensional analysis (DA) is a well-developed, widely employed methodology in the physical and engineering sciences. The application of dimensional analysis in statistics leads to three advantages: (1) the reduction of the number of potential causal factors that we need to consider, (2) the analytical insights into the relations among variables that it generates, and (3) the scalability of results. The formalization of the dimensional-analysis method in statistical design and analysis gives a clear view of its generality and overlooked significance. In this paper, we first provide general procedures for dimensional analysis prior to statistical design and analysis. We illustrate the use of dimensional analysis with three practical examples. In the first example, we demonstrate the basic dimensional-analysis process in connection with a study of factors that affect vehicle stopping distance. The second example integrates dimensional analysis into the regression analysis of the pine tree data. In our third example, we show how dimensional analysis can be used to develop a superior experimental design for the well-known paper helicopter experiment. In the regression example and in the paper helicopter experiment, we compare results obtained via the dimensional-analysis approach to those obtained via conventional approaches. From these comparisons, we demonstrate the general properties of dimensional analysis from a statistical perspective and recommend its usage based on its favorable performance.
We investigate asymptotic properties of least-absolute-deviation or median quantile estimates of the location and scale functions in nonparametric regression models with dependent data from multiple subjects. Under a general dependence structure that allows for longitudinal data and some spatially correlated data, we establish uniform Bahadur representations for the proposed median quantile estimates. The obtained Bahadur representations provide deep insights into the asymptotic behavior of the estimates. Our main theoretical development is based on studying the modulus of continuity of the kernel-weighted empirical process through a coupling argument. Progesterone data are used for an illustration.
Abstract The D-optimal minimax criterion is proposed to construct fractional factorial designs. The resulting designs are very efficient and robust against misspecification of the effects in the linear model. The criterion was first proposed by Wilmut & Zhou (2011); their work is limited to two-level factorial designs, however. In this paper we extend this criterion to designs with factors having any number of levels (including mixed levels) and explore several important properties of this criterion. Theoretical results are obtained for construction of fractional factorial designs in general. This minimax criterion is not only scale invariant, but also invariant under level permutations. Moreover, it can be applied to any run size. This is an advantage over some other existing criteria. The Canadian Journal of Statistics 41: 325–340; 2013 © 2013 Statistical Society of Canada
Analysis of massive data sets is challenging owing to limitations of computer primary memory. In this paper, we propose an approach to estimate population parameters from a massive data set. The proposed approach significantly reduces the required amount of primary memory, and the resulting estimate is as efficient as if the entire data set were analyzed simultaneously. Asymptotic properties of the resulting estimate are studied, and the asymptotic normality of the resulting estimator is established. The standard error formula for the resulting estimate is proposed and empirically tested; thus, statistical inference for parameters of interest can be performed. The effectiveness of the proposed approach is illustrated using simulation studies and an Internet traffic data example. Copyright © 2012 John Wiley & Sons, Ltd.
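A minimal sketch of the block-wise idea behind approaches of this kind: read the data in chunks that fit in memory, estimate the parameter on each chunk, and combine the chunk estimates. The equal-weight average and the naive standard error below are only illustrative assumptions; the paper derives its own combination rule and standard error formula.

```python
import numpy as np

def chunked_mean_estimate(chunks):
    """Combine per-chunk sample means (equal chunk sizes assumed) into one estimate,
    with a crude standard error based on the between-chunk spread."""
    estimates = [np.mean(c) for c in chunks]
    k = len(estimates)
    return float(np.mean(estimates)), float(np.std(estimates, ddof=1) / np.sqrt(k))

rng = np.random.default_rng(1)
data_stream = (rng.normal(5.0, 2.0, size=10_000) for _ in range(100))  # 100 chunks
est, se = chunked_mean_estimate(data_stream)
print(est, se)
```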
The minimum aberration criterion has been frequently used in the selection of fractional factorial designs with nominal factors. For designs with quantitative factors, however, level permutation of factors could alter their geometrical structures and statistical properties. In this paper, uniformity is used to further distinguish fractional factorial designs, besides the minimum aberration criterion. We show that minimum aberration designs have low discrepancies on average. An efficient method for constructing uniform minimum aberration designs is proposed and optimal designs with 27 and 81 runs are obtained for practical use. These designs have good uniformity and are effective for studying quantitative factors.
Classical statistical process control often relies on univariate characteristics. In many contemporary applications, however, the quality of products must be characterized by some functional relation between a response variable and its explanatory variables. Monitoring such functional profiles has been a rapidly growing field due to increasing demands. This paper develops a novel nonparametric $L_1$ location-scale model to screen the shapes of profiles. The model is built on three basic elements: location shifts, local shape distortions, and overall shape deviations, which are quantified by three individual metrics. The proposed approach is applied to the previously analyzed vertical density profile data, leading to some interesting insights.
Abstract Interaction is very common in reality, but has received little attention in the logistic regression literature. This is especially true for higher-order interactions. In conventional logistic regression, interactions are typically ignored. We propose a model selection procedure by implementing an association rules analysis. We do this by (1) exploring the combinations of input variables which have significant impacts on the response (via association rules analysis); (2) selecting the potential (low- and high-order) interactions; (3) converting these potential interactions into new dummy variables; and (4) performing variable selection among all the input variables and the newly created dummy variables (interactions) to build up the optimal logistic regression model. Our model selection procedure establishes the optimal combination of main effects and potential interactions. The comparisons are made through thorough simulations. It is shown that the proposed method outperforms the existing methods in all cases. A real-life example is discussed in detail to demonstrate the proposed method. Keywords: association rules analysis; interaction effects; logistic regression models; model selection
Abstract When a problem occurs in a system, its causes should be identified for the problem to be fixed. Ishikawa Cause and Effect (CE) diagrams are popular tools to investigate and identify numerous different causes of a problem. A CE diagram can be used as a guideline to allocate resources and make necessary investments to fix the problem. Although important decisions are based on CE diagrams, there is a scarcity of analytical methodology that supports the construction of these diagrams. We propose a methodology based on capture-recapture analysis to analytically estimate the causes of a problem and build CE diagrams. An estimate of the number of causes can be used to determine whether the CE study should be terminated or additional iterations are required. It is shown that integration of capture-recapture analysis concepts into CE diagrams enables the users to evaluate the progress of CE sessions. Key Words: Capture-Recapture analysis; CE diagram construction methodology; Ishikawa CE diagrams
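A hypothetical illustration of the capture-recapture idea mentioned above: two independent brainstorming sessions list candidate causes for a CE diagram, and the overlap between the two lists is used to estimate how many causes exist in total. This is the textbook two-sample (Lincoln-Petersen) estimator, shown only to convey the concept; the paper's methodology and estimators may differ.

```python
def lincoln_petersen(n1, n2, overlap):
    """Estimate the total number of causes from two 'capture' sessions."""
    if overlap == 0:
        raise ValueError("No overlap between sessions: estimator undefined.")
    return n1 * n2 / overlap

# Hypothetical cause lists from two CE sessions.
session1 = {"calibration", "operator error", "material lot", "humidity"}
session2 = {"operator error", "material lot", "tool wear", "humidity", "setup"}
overlap = len(session1 & session2)
print(lincoln_petersen(len(session1), len(session2), overlap))  # estimated total causes
```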
Supersaturated design (SSD) has received much recent interest because of its potential in factor screening experiments. In this paper, we provide equivalent conditions for two columns to be fully aliased and consequently propose methods for constructing E(fNOD)- and χ2-optimal mixed-level SSDs without fully aliased columns, via equidistant designs and difference matrices. The methods can be easily performed and many new optimal mixed-level SSDs have been obtained. Furthermore, it is proved that the nonorthogonality between columns of the resulting design is well controlled by the source designs. A rather complete list of newly generated optimal mixed-level SSDs is tabulated for practical use.
Abstract Many industries have incorporated design of experiments (DoE) as an important tool for product and process improvement. DoE allows researchers to vary multiple factors at a time in a systematic fashion, in order to obtain more information for the same amount of resources. This article discusses the DoEs that are relevant to the pharmaceutical industry.
ABSTRACT Motivated by a vertical density profile problem, in this article we focus on monitoring the slopes of linear profiles. A Shewhart-type control chart for monitoring slopes of linear profiles is proposed. Both Phase I and Phase II applications are discussed. The performance of the proposed chart in Phase I applications is demonstrated using both real-life data in an illustrative example and simulated data in a probability of signal study. For Phase II applications, it is shown that the average run length (ARL) of the proposed control chart depends only on the shifts of slopes, whereas the ARL of the multivariate T2 chart depends on both the shifts of slopes and the correlation between the estimated slope and the intercept. When such a correlation is low, the proposed control chart has a better ARL performance than the T2 chart.
The first step in many applications of response surface methodology is typically the screening process. Variable selection plays an important role in screening experiments when a large number of potential factors are introduced in a preliminary study. Traditional approaches, such as best subset variable selection and stepwise deletion, may not be appropriate in this situation. In this paper we introduce a variable selection procedure via penalized least squares with the SCAD penalty. An algorithm to find the penalized least squares solution is suggested, and a standard error formula for the penalized least squares estimate is derived. With a proper choice of the regularization parameter, it is shown that the resulting estimate is root-n consistent and possesses an oracle property; namely, it works as well as if the correct submodel were known. An automatic and data-driven approach is proposed to select the regularization parameter. Examples are used to illustrate the effectiveness of the newly proposed approach. The computer codes (written in MATLAB) to perform all calculations are available through the authors for an automatic data-driven variable selection procedure.
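For reference, the SCAD penalty named above is the piecewise function of Fan and Li (2001); a small sketch of the penalty alone is given below (the penalized-least-squares algorithm and the tuning-parameter selection discussed in the abstract are not reproduced):

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lambda(|theta|), evaluated element-wise: linear near zero
    (like the lasso), quadratic taper in the middle, constant for large values,
    which is what avoids over-shrinking large coefficients."""
    t = np.abs(np.asarray(theta, dtype=float))
    return np.where(
        t <= lam,
        lam * t,
        np.where(
            t <= a * lam,
            (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
            lam**2 * (a + 1) / 2,
        ),
    )

print(scad_penalty([0.1, 1.0, 5.0], lam=0.5))
```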
Dear ASMBI readers, We would like to inform you of the new and exciting changes going on with the journal. First of all, you might have noticed some changes on the cover page. ASMBI is now the Official Journal of the International Society for Business and Industrial Statistics (ISBIS), complete with a new Editorial Board. ISBIS was formed as an international society in April 2005, as one of the youngest sections of the International Statistical Institute (ISI). As described in its web site (www.isbis.org), ISBIS seeks to 'promote the advancement and exchange of knowledge in business, financial and industrial statistics'. In addition to running biennial international conferences, regional conferences in collaboration with other societies (especially in developing countries) and ISI satellite conferences, ISBIS has now assumed editorial responsibility for a well-established journal, ASMBI. By combining the strong scientific expertise of ISBIS members with the outstanding publishing record of Wiley-Blackwell, we will strive to enhance the role played by ASMBI in promoting top-quality papers on stochastic/statistical modeling in industrial and business applications. An ISBIS taskforce chaired by Vijay Nair initiated creation of the new Editorial Board. Of us, Fabrizio Ruggeri (CNR IMATI, Italy) is the new Editor-in-Chief, while Nalini Ravishanker (University of Connecticut, U.S.A.) is Editor for Theory and Methods and Dennis Lin (Penn State University, U.S.A.) is Editor for Case Studies and Applications. The three of us as Editors have selected 32 Associate Editors who, between them, represent diverse working experiences in academe and industry, a broad range of expertise in several scientific areas, and considerable geographic diversity. Although this has been a lengthy process, we deeply appreciate the enthusiasm for involvement with ASMBI shown by the people we contacted. The new Editorial Board retains a few people who have been involved with ASMBI in previous years, including the Editor-in-Chief, who served as Editor for four years. Close cooperation with the outgoing Editor-in-Chief, Jef Teugels, has been very beneficial for the incoming Editor-in-Chief. We aspire to continue maintaining the high standard set by Jef Teugels during his years as Editor-in-Chief. We are indebted to him for his work and support and we are sure he will follow our progress closely. We are extremely grateful to Sir David Cox, Nick Fisher, Vijay Nair, and Jef Teugels, who have graciously accepted our invitation to serve as Honorary Editors of ASMBI. We look forward to their valuable guidance and support. You will not see changes in the next few issues, since ASMBI is a healthy journal with a long backlog of excellent papers still in the pipeline. In the future, the first major change will be the introduction of discussion/review papers. We plan to invite leading experts in 'hot' fields to contribute highly innovative and in-depth review papers, and we will invite key researchers to discuss them. The first such paper is, quite naturally, about The Future of Business and Industrial Statistics and it will be written by Nick Fisher (ISBIS President) and Vijay Nair. Another important innovation is that from now on, the entire editorial process, from submission of a paper up to its publication decision, will be done via the Web.
Achieving this has been a lengthy process for us and our partners at Wiley, but we now have a structure that equips us to tackle the challenging job we have ahead of us. The two Editors have differing areas of responsibility: Nalini Ravishanker for Theory and Methods and Dennis Lin for Case Studies and Applications. We are particularly interested in promoting the latter area. Specifically, we are interested in using Case Studies to demonstrate how sound statistical methodology and practice can have a significant beneficial impact on the prosperity of business and industrial enterprises. This issue is being sent to all ISBIS members and we kindly invite them not just to read the journal but also (especially!) to send us their best papers. We would like to make ASMBI a key resource in the field of business and industrial statistics. As you can see, we are planning some innovations in the content of the journal and we have appointed a diverse and authoritative Editorial Board. We hope you will join us in this adventure and be an integral part of it! Best regards
Abstract In this short exposition, we provide an overview of the aliasing (or confounding) among effects that is caused by studying fewer treatment combinations than required in a full factorial design. We show, by example, how to determine the alias structure of the regular two-level fractional factorial ($2^{k-p}$) designs, nonregular two-level designs, and the regular three-level fractional factorial ($3^{k-p}$) designs.
Abstract The bias problem of the maximum-likelihood estimate (MLE) of the common shape parameter of several Weibull populations is examined in detail. A modified MLE (MMLE) approach is proposed. In the case of complete and Type II censored data, the bias of the MLE can be substantial. This is noticeable even when the sample size is large. Such a bias increases rapidly as the degree of censoring increases and as more populations are involved. The proposed MMLE, however, is nearly unbiased and much more efficient than the MLE, irrespective of the degree of censoring, the sample sizes, and the number of populations involved. Copyright © 2007 John Wiley & Sons, Ltd.
We propose a low-storage, single-pass, sequential method for the execution of convex hull peeling for massive datasets. The method is shown to vastly reduce the computation time required by the existing convex hull peeling algorithm, from $O(n^2)$ to $O(n)$. Furthermore, the proposed method has significantly smaller storage requirements compared to the existing method. We present algorithms for low-storage, sequential computation of both the convex hull peeling multivariate median and the convex hull peeling $p$th depth contour, where 0 < p < 1. We demonstrate the accuracy and reduced computation time of the proposed method by comparing it to the existing convex hull peeling method through simulation studies.
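For orientation, here is a sketch of the baseline in-memory peeling procedure that the abstract's low-storage method improves upon: repeatedly strip the vertices of the current convex hull until only the innermost points remain. This is the standard peeling idea, not the authors' sequential algorithm.

```python
import numpy as np
from scipy.spatial import ConvexHull

def peeling_median(points):
    """Return the innermost points left after repeated convex hull peeling."""
    pts = np.asarray(points, dtype=float)
    d = pts.shape[1]
    while len(pts) > d + 1:                 # need at least d+1 points for a hull
        try:
            hull = ConvexHull(pts)
        except Exception:                    # degenerate configuration: stop peeling
            break
        keep = np.ones(len(pts), dtype=bool)
        keep[hull.vertices] = False          # drop the current hull's vertices
        if not keep.any():
            break
        pts = pts[keep]
    return pts

rng = np.random.default_rng(2)
cloud = rng.normal(size=(500, 2))
print(peeling_median(cloud))                 # approximate multivariate median
```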
Abstract Massive data sets are becoming common in this information era. Due to the limitations of computer memory space and computing time, kernel density estimation for massive data sets, although in strong demand, is rather challenging. In this paper, we propose a quick algorithm for multivariate density estimation which is suitable for massive data sets. The term quick refers to its computational ease. Theoretical properties of the proposed algorithm are developed. Its empirical performance is demonstrated through a credit card example and numerous simulation studies. It is shown that in addition to its computational ease, the proposed algorithm is as good as the traditional methods (for the situations where these traditional methods are feasible). Copyright © 2006 John Wiley & Sons, Ltd.
In this paper, we review multivariate control charts designed for monitoring changes in a covariance matrix that have been developed in the last 15 years. The focus is on control charts developed for multivariate normal processes, assuming that independent subgroups of observations or independent individual observations are sampled as process monitoring proceeds. Control charts developed between 1990 and 2005 are reviewed according to the types of the control chart: multivariate Shewhart chart, multivariate CUSUM chart and multivariate EWMA chart. In addition to these developments, a new multivariate EWMA control chart is proposed. We also discuss comparisons of chart performance that have been carried out in the literature, as well as the issue of diagnostics. Some potential future research ideas are also given.
Abstract Ridge analysis in response surface methodology has received extensive discussion in the literature, while little is known about ridge analysis in the multi-response case. In this paper, the ridge path is investigated for multi-response surfaces and a large-sample simultaneous confidence interval (confidence band) for the ridge path is developed. Copyright © 2005 John Wiley & Sons, Ltd.
THE DESIGN OF OPTIMUM MULTIFACTORIAL EXPERIMENTS. R. L. Plackett and J. P. Burman. Biometrika, Volume 33, Issue 4, June 1946, Pages 305–325. https://doi.org/10.1093/biomet/33.4.305
Fractional factorial designs, especially the two-level designs, are useful in a variety of experimental situations, for example, (i) screening studies in which only a subset of the variables is expected to be important, (ii) research investigations in which certain interactions are expected to be negligible and (iii) experimental programs in which groups of runs are to be performed sequentially, ambiguities being resolved as the investigation evolves (see Box, Hunter and Hunter, 1978). The literature on fractional factorial designs is extensive. For references before 1969, see the comprehensive bibliography of Herzberg and Cox (1969). For more recent references, see Daniel (1976) and Joiner (1975-79). A useful concept associated with $2^{k-p}$ fractional factorial designs is that of resolution (Box and Hunter, 1961). A design is of resolution R if no $c$-factor effect is confounded with any other effect containing fewer than $R - c$ factors. For example, a design of resolution III does not confound main effects with one another but does confound main effects with two-factor interactions, and a design of resolution IV does not confound main effects with two-factor interactions but does confound two-factor interactions with one another. The resolution of a two-level fractional factorial design is the length of the shortest word in the defining relation. Usually an experimenter will prefer to use a design which has the highest resolution possible.
A uniform design (UD) seeks design points that are uniformly scattered on the domain. It has been popular since 1980. A survey of UD is given in the first portion: The fundamental idea and construction method are presented and discussed and examples are given for illustration. It is shown that UD's have many desirable properties for a wide variety of applications. Furthermore, we use the global optimization algorithm, threshold accepting, to generate UD's with low discrepancy. The relationship between uniformity and orthogonality is investigated. It turns out that most UD's obtained here are indeed orthogonal.
A review of the literature on control charts for multivariate quality control (MQC) is given, with a concentration on developments occurring since the mid-1980s. Multivariate cumulative sum (CUSUM) control procedures and a multivariate exponentially weighted moving average (EWMA) control chart are reviewed and recommendations are made regarding their use. Several recent articles that give methods for interpreting an out-of-control signal on a multivariate control chart are analyzed and discussed. Other topics such as the use of principal components and regression adjustment of variables in MQC, as well as frequently used approximations in MQC, are discussed.
Plotting the empirical cumulative distribution of the usual set of orthogonal contrasts computed from a $2^p$ experiment on a special grid may aid in its criticism and interpretation. Bad values, heteroscedasticity, dependence of variance on mean, and some types of defective randomization all leave characteristic stigmata. The half-normal plot can be used to estimate the error standard deviation and to make judgments about the reality of the observed effects. An accompanying paper by A. Birnbaum gives some operating characteristics of these judgments. Examples are given of the use of half-normal plots in each of these ways.
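A minimal sketch of the plot described above: the ordered absolute contrasts are plotted against half-normal quantiles, and effects that fall off the near-zero line are judged real. The plotting positions below are one common convention; the estimation of the error standard deviation discussed in the paper is not reproduced.

```python
import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt

def half_normal_plot(contrasts):
    """Plot |contrast| (sorted) against half-normal quantiles."""
    absc = np.sort(np.abs(np.asarray(contrasts, dtype=float)))
    m = len(absc)
    # Half-normal plotting positions: Phi^{-1}(0.5 + 0.5 * (i - 0.5) / m)
    q = norm.ppf(0.5 + 0.5 * (np.arange(1, m + 1) - 0.5) / m)
    plt.scatter(q, absc)
    plt.xlabel("half-normal quantile")
    plt.ylabel("|contrast|")
    plt.show()

half_normal_plot([0.2, -0.4, 0.1, 5.3, -0.3, 2.8, 0.05])  # two apparently real effects
```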
Parametric and nonparametric estimation methods are proposed for reliability characteristics of one failure mode under competing risks with incomplete data; some of the failure times are observed without observing the cause of failure. The efficiencies of these methods are discussed.
Life data from multicomponent systems are often analyzed to estimate the reliability of each system component. Due to cost and diagnostic constraints, however, the exact cause of system failure might be unknown. Referring to such situations as being masked, the authors use a likelihood approach to exploit all the available information. They focus on a series system of three components, each with a constant failure rate, and propose a single numerical procedure for obtaining maximum-likelihood estimates (MLEs) in the general case. It is shown that, under certain assumptions, closed-form solutions for the MLEs can be obtained. The authors consider that the cause of system failure can be isolated to some subset of components, which allows them to consider the full range of possible information on the cause of system failure. The likelihood, while presented for complete data, can be extended to censoring. The general likelihood expressions can be used with various component life distributions, e.g., Weibull, lognormal. However, closed-form MLEs would most certainly be intractable and numerical methods would be required.
Most of the existing control charts for monitoring multivariate process variability are based on subgroup sizes greater than one. In many practical applications, however, only individual observations are available and the usual control charts are not applicable in these cases. In this paper, two new control charts are proposed to monitor multivariate process variability for individual observations. The proposed control charts are constructed based on the traces of the estimated covariance matrices derived from the individual observations. When there is only one quality characteristic, these two charts respectively reduce to the exponentially weighted mean squared deviation and exponentially weighted moving variance charts. It is shown, based on simulation studies, that the proposed charts are superior to the existing ones in detecting increases in variance and changes in correlation. An example from the semiconductor industry is also presented to illustrate the applicability of the proposed charts.
Box and Meyer (1986) introduced a method for assessing the sizes of contrasts in unreplicated factorial and fractional factorial designs. This is a useful technique, and an associated graphical display popularly known as a Bayes plot makes it even more effective. This article presents a competing technique that is also effective and is computationally simple. An advantage of the new method is that the results are given in terms of the original units of measurement. This direct association with the data may make the analysis easier to explain.
When studying both location and dispersion effects in unreplicated fractional factorial designs, a "standard" procedure is to identify location effects using ordinary least squares analysis, fit a model, and then identify dispersion effects by analyzing the residuals. In this paper, we show that if the model in the above procedure does not include all active location effects, then null dispersion effects may be mistakenly identified as active. We derive an exact relationship between location and dispersion effects, and we show that without information in addition to the unreplicated fractional factorial (such as replication) we cannot determine whether a dispersion effect or two location effects are active.
Quality improvement requires reduction of undesired variation. Design of experiments has become a useful tool for discovering such minimum-variance conditions, but most existing techniques require a large experimental effort involving replicated experiments. In this article we develop a method for significance testing of dispersion effects from unreplicated two-level fractional factorial experiments using principles close to those for identifying location effects. The method provides an F-distributed test statistic and appears to be useful for identification of dispersion effects as early as the screening stage, when fractionated unreplicated experiments are frequently employed.
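As a rough illustration of the general idea behind dispersion-effect screening (not the F statistic of the article above): after a location model has been fitted, the spread of the residuals at the two levels of a design column is compared; a column whose two levels show very different residual spread is a candidate dispersion effect.

```python
import numpy as np

def dispersion_statistic(residuals, column):
    """Log ratio of residual sample variances at the +1 and -1 levels of a column."""
    r = np.asarray(residuals, dtype=float)
    c = np.asarray(column)
    s2_plus = np.var(r[c == 1], ddof=1)
    s2_minus = np.var(r[c == -1], ddof=1)
    return float(np.log(s2_plus / s2_minus))

# Toy example: residuals are visibly more spread out at the +1 level.
column = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
residuals = np.array([0.10, 0.90, -0.20, -1.10, 0.05, 1.30, -0.15, -0.80])
print(dispersion_statistic(residuals, column))  # large |value| suggests a dispersion effect
```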
Traditionally, Plackett-Burman (PB) designs have been used in screening experiments for identifying important main effects. The PB designs whose run sizes are not a power of two have been criticized for their complex aliasing patterns, which, according to conventional wisdom, give confusing results. This paper goes beyond the traditional approach by proposing an analysis strategy that entertains interactions in addition to main effects. Based on the precepts of effect sparsity and effect heredity, the proposed procedure exploits the designs' complex aliasing patterns, thereby turning their "liability" into an advantage. Demonstration of the procedure on three real experiments shows the potential for extracting important information available in the data that has, until now, been missed. Some limitations are discussed, and extensions to overcome them are given. The proposed procedure also applies to more general mixed-level designs that have become increasingly popular.
An error bound for multidimensional quadrature is derived that includes the Koksma-Hlawka inequality as a special case. This error bound takes the form of a product of two terms. One term, which depends only on the integrand, is defined as a generalized variation. The other term, which depends only on the quadrature rule, is defined as a generalized discrepancy. The generalized discrepancy is a figure of merit for quadrature rules and includes as special cases the $\mathcal{L}^p$-star discrepancy and the quantity $P_\alpha$ that arises in the study of lattice rules.
Hotelling's T2 is customarily used as the control chart for multivariate SPC analysis. This chart responds to changes in both the mean values and the covariance matrix of the responses. In this article, we propose the use of a chart that concentrates on changes in the covariance matrix. The use of this covariance chart in concert with the T2 chart enables the user to better determine whether T2 points out of control are due to changes in mean values or due to changes in the covariance matrix. Using this chart in conjunction with T2 thus furnishes a suite of tools similar to the x-bar and standard deviation charts for univariate processes.
Let $X_1, X_2, \ldots, X_p$ be random variables with cumulative distribution functions (C.D.F.) $F_i(x)$, $i = 1, 2, \ldots, p$. We assume that the $X_i$'s are not observable but $U = \min(X_1, \ldots, X_p)$ or $V = \max(X_1, \ldots, X_p)$ is. We would like to determine uniquely the marginal C.D.F.'s, the $F_i$'s, from that of $U$ in the competing risks problem or from that of $V$ in the complementary risks problem. We would also consider related inference problems. As examples of the concepts, consider the following: (a) Let $X_i$ be the time to death (failure) from cause $C_i$ (of component $C_i$). Here the $X_i$'s are not observable but we observe a death time $U$ (or a time to …
Summary This paper presents some techniques for monitoring and controlling the dispersion of multivariate normal processes based on subgroup data. The procedures involve the use of independent statistics resulting from the decomposition of the covariance matrix. Those that do not depend on prior estimates of the process covariance matrix are particularly attractive for short-run or low-volume manufacturing environments.
In unreplicated $2^{k-p}$ designs, the assumption of constant variance is commonly made. When the variance of the response differs between the two levels of a column in the effect matrix, that column produces a dispersion effect. In this article we show that two active dispersion effects may create a spurious dispersion effect in their interaction column. Most existing methods for dispersion-effect testing in unreplicated fractional factorial designs are subject to these spurious effects. We propose a method of dispersion-effect testing based on geometric means of residual sample variances. We show through examples from the literature and simulations that the proposed test has many desirable properties that are lacking in other tests.
SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
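A compact, self-contained sketch of the lasso in its equivalent penalized form (residual sum of squares plus a multiple of the coefficients' absolute values), solved by cyclic coordinate descent with soft-thresholding; this is a standard solver for illustration, not the computational approach of the summary above:

```python
import numpy as np

def soft_threshold(z, gamma):
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/2)||y - Xb||^2 + lam * ||b||_1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r_j = y - X @ b + X[:, j] * b[j]            # partial residual excluding j
            b[j] = soft_threshold(X[:, j] @ r_j, lam) / col_ss[j]
    return b

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=100)
print(np.round(lasso_cd(X, y, lam=20.0), 2))             # most coefficients shrink to exactly 0
```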
Abstract Statistical process control (SPC) methods are widely used to monitor and improve manufacturing processes and service operations. Disputes over the theory and application of these methods are frequent and often very intense. Some of the controversies and issues discussed are the relationship between hypothesis testing and control charting, the role of theory and the modeling of control chart performance, the relative merits of competing methods, the relevance of research on SPC and even the relevance of SPC itself. One purpose of the paper is to offer a resolution of some of these disagreements in order to improve the communication between practitioners and researchers. Keywords: average run length; control charts; cumulative sum control charts; exponentially weighted moving average control charts
Abstract Life data from systems of components are often analysed to estimate the reliability of the individual components. These estimates are useful since they reflect the reliability of the components under actual operating conditions. However, owing to the cost or time involved with failure analysis, the exact component causing system failure may be unknown or 'masked'. That is, the cause may only be isolated to some subset of the system's components. We present an iterative approach for obtaining component reliability estimates from such data for series systems. The approach is analogous to traditional probability plotting. That is, it involves the fitting of a parametric reliability function to a set of nonparametric reliability estimates (plotting points). We present a numerical example assuming Weibull component life distributions and a two-component series system. In this example we find estimates with only 4 per cent of the computation time required to find comparable MLEs.
An overview is given of current research on control charting methods for process monitoring and improvement. The discussion includes a historical perspective along with ideas for future research. Research topics include, for example, variable sample size and sampling interval methods, economic designs, attribute data methods, charts based on autocorrelated observations, multivariate methods, and nonparametric methods. Recommendations and references are provided to those interested in pursuing research ideas in statistical process control (SPC). Some issues regarding the relevance of SPC research are also discussed.
Abstract This article uses the concept of data depth to introduce several new control charts for monitoring processes of multivariate quality measurements. For any dimension of the measurements, these charts are in the form of two-dimensional graphs that can be visualized and interpreted just as easily as the well-known univariate X̄, X, and CUSUM charts. Moreover, they have several significant advantages. First, they can detect simultaneously the location shift and scale increase of the process, unlike the existing methods, which can detect only the location shift. Second, their construction is completely nonparametric; in particular, it does not require the assumption of normality for the quality distribution, which is needed in standard approaches such as the χ2 and Hotelling's T2 charts. Thus these new charts generalize the principle of control charts to multivariate settings and apply to a much broader class of quality distributions.
A control chart is proposed to effectively monitor changes in the population variance-covariance matrix of a multivariate normal process when individual observations are collected. The proposed control chart is constructed based on first taking the exponentially weighted moving average of the product of each observation and its transpose. Appropriate statistics, which are based on squared distances between estimators and true parameters, are then developed to detect changes in the variances and covariances of the variance-covariance matrix. The simulation studies show that the proposed control chart outperforms existing procedures in cases where either the variances or correlations increase or both increase. The improvement in performance of the proposed control chart is particularly notable when variables are strongly positively correlated. The proposed control chart is applied to a real-life example taken from the semiconductor industry.
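A minimal sketch of the main ingredient described above, assuming a known target mean: an exponentially weighted moving estimate of the covariance matrix built from the outer product of each individual observation's deviation from target. The squared-distance monitoring statistics and the control limits of the article are not reproduced here.

```python
import numpy as np

def ewma_covariance(observations, target, lam=0.1):
    """Yield the EWMA-smoothed covariance estimate after each individual observation."""
    target = np.asarray(target, dtype=float)
    S = np.zeros((len(target), len(target)))
    for x in observations:
        d = np.asarray(x, dtype=float) - target
        S = (1.0 - lam) * S + lam * np.outer(d, d)
        yield S

rng = np.random.default_rng(4)
obs = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0.8], [0.8, 1]], size=50)
for S in ewma_covariance(obs, target=[0, 0]):
    pass
print(np.round(S, 2))  # smoothed covariance estimate after the last observation
```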
Control charts monitor processes where performance is measured by one or multiple quality characteristics. Some processes, however, are characterized by a profile or a function. Here we focus on monitoring a process in semiconductor manufacturing that is characterized by a linear function. While the linear function is the simplest, it occurs frequently, for example in calibration studies. Two monitoring approaches are proposed: (1) monitor the parameters (slope and intercept) with a multivariate T2 chart, and (2) monitor the average residuals between sample and reference lines with EWMA and R charts. Simulation studies indicate that both methods work well. Both methods are extendable to complex functions.
By studying treatment contrasts and ANOVA models, we propose a generalized minimum aberration criterion for comparing asymmetrical fractional factorial designs. The criterion is independent of the choice of treatment contrasts and thus model-free. It works for symmetrical and asymmetrical designs, regular and nonregular designs. In particular, it reduces to the minimum aberration criterion for regular designs and the minimum $G_2$-aberration criterion for two-level nonregular designs. In addition, by exploring the connection between factorial design theory and coding theory, we develop a complementary design theory for general symmetrical designs, which covers many existing results as special cases.
This paper introduces a new multivariate exponentially weighted moving average (EWMA) control chart. The proposed control chart, called an EWMA V-chart, is designed to detect small changes in the variability of correlated multivariate quality characteristics. Through examples and simulations, it is demonstrated that the EWMA V-chart is superior to the |S|-chart in detecting small changes in process variability. Furthermore, a counterpart of the EWMA V-chart for monitoring the process mean, called the EWMA M-chart, is proposed. In detecting small changes in process variability, the combination of the EWMA M-chart and EWMA V-chart is a better alternative to the combination of the MEWMA control chart (Lowry et al., 1992) and the |S|-chart. Moreover, the EWMA M-chart and V-chart can be plotted in a single figure. For monitoring both the process mean and process variability, the combined MEWMA and EWMA V-charts provide the best control procedure.
Preface to the Third Edition. Preface to the Second Edition. Preface to the First Edition. 1. Introduction. 2. The Multivariate Normal Distribution. 3. Estimation of the Mean Vector and the Covariance Matrix. 4. The Distributions and Uses of Sample Correlation Coefficients. 5. The Generalized T²-Statistic. 6. Classification of Observations. 7. The Distribution of the Sample Covariance Matrix and the Sample Generalized Variance. 8. Testing the General Linear Hypothesis: Multivariate Analysis of Variance. 9. Testing Independence of Sets of Variates. 10. Testing Hypotheses of Equality of Covariance Matrices and Equality of Mean Vectors and Covariance Matrices. 11. Principal Components. 12. Canonical Correlations and Canonical Variables. 13. The Distributions of Characteristic Roots and Vectors. 14. Factor Analysis. 15. Patterns of Dependence; Graphical Models. Appendix A: Matrix Theory. Appendix B: Tables. References. Index.
Summary Tang & Barnett (1996) present some techniques for monitoring and controlling the dispersion of multivariate normal processes based on subgroup data. The current paper compares the proposed techniques and various competing procedures. Simulation results indicate that the proposed techniques are superior to existing procedures.
Multivariate control charts are considered for the simultaneous monitoring of the mean vector and covariance matrix when the joint distribution of process variables is multivariate normal. Emphasis is on the use of combinations of multivariate exponentially weighted moving average (MEWMA) control charts based on sample means and on the sum of the squared deviations from target. The performance of these combinations is compared with the performance of standard multivariate Shewhart charts and with combinations of univariate EWMA charts applied to each of the variables. The performance of these control charts with and without the use of Hawkins' (1991) method of regression adjustment of the variables is investigated. The performance of many of the control charts depends on the direction of the shift in the mean vector or covariance matrix, so performance is investigated for specific shift directions and also for averages over all directions. The best overall performance is achieved using a combination of MEWMA charts based on the sample means and on the sum of squared regression-adjusted deviations from target.
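For reference, a minimal sketch of the MEWMA statistic for the mean (in the spirit of Lowry et al., 1992) that such combinations build on; the regression-adjustment step and the squared-deviation chart are omitted, and the smoothing constant and parameters are placeholders.

```python
import numpy as np

def mewma_statistics(X, mu0, sigma0, lam=0.2):
    """Multivariate EWMA of the observations and the usual quadratic-form charting statistic
    T_t^2 = z_t' Cov(z_t)^{-1} z_t with the asymptotic Cov(z_t) = lam/(2-lam) * Sigma."""
    z = np.zeros_like(mu0, dtype=float)
    cov_z_inv = np.linalg.inv(lam / (2 - lam) * sigma0)
    out = []
    for x in X:
        z = (1 - lam) * z + lam * (x - mu0)
        out.append(z @ cov_z_inv @ z)
    return np.array(out)

rng = np.random.default_rng(3)
mu0, sigma0 = np.zeros(2), np.array([[1.0, 0.7], [0.7, 1.0]])
X = rng.multivariate_normal(mu0 + [0.5, 0.5], sigma0, size=40)   # small sustained mean shift
print(mewma_statistics(X, mu0, sigma0)[-5:])   # compared against an ARL-calibrated limit in practice
```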
Abstract Cumulative sum (CUSUM) procedures are among the most powerful tools for detecting a shift from a good quality distribution to a bad quality distribution. This article discusses the natural application of CUSUM procedures to the multivariate normal distribution. It discusses two cases: detecting a shift in the mean vector and detecting a shift in the covariance matrix. As an example, the procedure is applied to measurements taken on optical fibers. Key words: Cumulative sum; Quality control
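As a hedged illustration of the mean-shift case, the sketch below assumes the shift direction is known in advance, projects each observation onto that (standardized) direction, and runs an ordinary one-sided univariate CUSUM on the projections; the reference value `k` and decision interval `h` are arbitrary illustrative choices, not the article's recommendations.

```python
import numpy as np

def directional_cusum(X, mu0, sigma0, shift, k=0.5, h=5.0):
    """Project observations onto a hypothesised shift direction (scaled so the projected score has
    unit variance) and apply a one-sided univariate CUSUM; returns the CUSUM path and the first
    signalling index (or None)."""
    sigma_inv = np.linalg.inv(sigma0)
    a = sigma_inv @ shift
    a = a / np.sqrt(shift @ sigma_inv @ shift)
    scores = (X - mu0) @ a
    c, path = 0.0, []
    for s in scores:
        c = max(0.0, c + s - k)
        path.append(c)
    path = np.array(path)
    above = np.nonzero(path > h)[0]
    return path, (int(above[0]) if above.size else None)

rng = np.random.default_rng(4)
mu0, sigma0 = np.zeros(2), np.eye(2)
shift = np.array([1.0, 1.0])
X = rng.multivariate_normal(mu0 + 0.6 * shift / np.linalg.norm(shift), sigma0, size=60)
path, first = directional_cusum(X, mu0, sigma0, shift)
print("first signal at observation:", first)
```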
Abstract We propose control chart methods for process monitoring when the quality of a process or product is characterized by a linear function. In the historical analysis of Phase I data, we recommend methods including the use of a bivariate T² chart to check for stability of the regression coefficients in conjunction with a univariate Shewhart chart to check for stability of the variation about the regression line. We recommend the use of three univariate control charts in Phase II. These three charts are used to monitor the Y-intercept, the slope, and the variance of the deviations about the regression line, respectively. A simulation study shows that this type of Phase II method can detect sustained shifts in the parameters better than competing methods in terms of average run length performance. We also relate the monitoring of linear profiles to the control charting of regression-adjusted variables and other methods. Keywords: Calibration; Exponentially Weighted Moving Average Control Chart; Multivariate T² Control Charts; Statistical Process Control
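A minimal sketch of the Phase II idea, assuming a common set of design points for every profile: each incoming profile is reduced to an estimated intercept, slope, and residual mean square, and each of the three series is smoothed with a simple univariate EWMA. The control limits, which in practice come from the in-control model, are omitted, and the chart details differ from the paper's.

```python
import numpy as np

def profile_phase2_stats(x, Y):
    """Per-profile OLS intercept, slope, and residual mean square -- the three quantities that the
    Phase II scheme described above tracks on separate univariate charts."""
    A = np.column_stack([np.ones_like(x), x])
    coefs, *_ = np.linalg.lstsq(A, Y.T, rcond=None)        # (2, n_profiles)
    resid = Y.T - A @ coefs
    mse = (resid ** 2).sum(axis=0) / (len(x) - 2)
    return coefs[0], coefs[1], mse

def ewma(series, lam=0.2, start=None):
    """Simple univariate EWMA smoother used as the charting statistic for each series."""
    z = series.mean() if start is None else start
    out = []
    for s in series:
        z = (1 - lam) * z + lam * s
        out.append(z)
    return np.array(out)

rng = np.random.default_rng(9)
x = np.linspace(0, 4, 10)
Y = 2.0 + 1.0 * x + rng.normal(0, 0.3, size=(25, x.size))   # 25 profiles from an in-control model
b0, b1, mse = profile_phase2_stats(x, Y)
print(ewma(b0)[-3:], ewma(b1)[-3:], ewma(mse)[-3:])
```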
In this paper, we propose a new variables control chart, called the box-chart, to simultaneously monitor, on a single chart, the process mean and process variability for multivariate processes. The box-chart uses a probability integral transformation to obtain two independent and identically distributed uniform statistics. Therefore, a box-shaped (thus the name), two-dimensional control chart can be constructed. We discuss in detail how to construct the box-chart. The proposed chart is applied to two real-life examples. The performance of the box-chart is also compared to that of the traditional T²- and |S|-charts.
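A hedged sketch of the probability-integral-transformation idea, not the paper's exact transforms: assuming known in-control parameters, a subgroup T² statistic is mapped to an approximately uniform coordinate through its χ² distribution, while the generalized variance |S| is mapped through a simulated reference distribution; the resulting pair can be plotted as a point in the unit square with box-shaped limits.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
p, n = 2, 10                                   # dimension and subgroup size (illustrative)
mu0, sigma0 = np.zeros(p), np.eye(p)           # assumed known in-control parameters

def subgroup_stats(sample):
    xbar = sample.mean(axis=0)
    t2 = n * (xbar - mu0) @ np.linalg.inv(sigma0) @ (xbar - mu0)
    det_s = np.linalg.det(np.cov(sample, rowvar=False))
    return t2, det_s

# Reference distribution of |S| under in-control conditions, obtained here by simulation.
ref_det = np.array([subgroup_stats(rng.multivariate_normal(mu0, sigma0, size=n))[1]
                    for _ in range(5000)])

def box_chart_point(sample):
    """Map a subgroup to a point (u1, u2) in the unit square via probability integral transforms."""
    t2, det_s = subgroup_stats(sample)
    u1 = stats.chi2.cdf(t2, df=p)              # exact when mu0 and sigma0 are known
    u2 = (ref_det <= det_s).mean()             # empirical transform for |S|
    return u1, u2

new_subgroup = rng.multivariate_normal(mu0 + 1.0, 1.5 * sigma0, size=n)   # shifted subgroup
print(box_chart_point(new_subgroup))   # points near the square's edges suggest an out-of-control state
```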
Loss of markets to Japan has recently caused attention to return to the enormous potential that experimental design possesses for the improvement of product design, for the improvement of the manufacturing process, and hence for improvement of overall product quality. In the screening stage of industrial experimentation it is frequently true that the "Pareto Principle" applies; that is, a large proportion of process variation is associated with a small proportion of the process variables. In such circumstances of "factor sparsity," unreplicated fractional designs and other orthogonal arrays have frequently been effective when used as a screen for isolating preponderant factors. A useful graphical analysis due to Daniel (1959) employs normal probability plotting. A more formal analysis is presented here, which may be used to supplement such plots and hence to facilitate the use of these unreplicated experimental arrangements.
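The informal Daniel-style screening mentioned above can be sketched as follows (this is not the paper's formal analysis): effects of an unreplicated two-level design are estimated from contrasts, and their absolute values are matched to half-normal plotting positions so that unusually large effects stand out. The design, response, and factor labels are made up for illustration.

```python
import numpy as np
from itertools import combinations
from scipy import stats

# Illustrative unreplicated 2^3 factorial in standard order; columns are factors A, B, C.
design = np.array([[a, b, c] for c in (-1, 1) for b in (-1, 1) for a in (-1, 1)], dtype=float)
rng = np.random.default_rng(6)
y = 10 + 3.0 * design[:, 0] + 0.5 * rng.normal(size=8)   # only factor A is truly active here

# Build all main-effect and interaction contrast columns, then estimate effects (2 * x'y / n).
base = dict(zip("ABC", design.T))
labels = ["".join(c) for r in (1, 2, 3) for c in combinations("ABC", r)]
X = np.column_stack([np.prod([base[f] for f in lab], axis=0) for lab in labels])
effects = 2.0 * X.T @ y / len(y)

# Daniel-style screening: order the absolute effects against half-normal plotting positions.
order = np.argsort(np.abs(effects))
m = len(effects)
quantiles = stats.halfnorm.ppf((np.arange(1, m + 1) - 0.5) / m)
for q, idx in zip(quantiles, order):
    print(f"{labels[idx]:>3}: |effect| = {abs(effects[idx]):5.2f}  vs  half-normal quantile {q:4.2f}")
```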
In most statistical process control (SPC) applications, it is assumed that the quality of a process or product can be adequately represented by the distribution of a univariate quality characteristic or by the general multivariate distribution of a vector consisting of several correlated quality characteristics. In many practical situations, however, the quality of a process or product is better characterized and summarized by a relationship between a response variable and one or more explanatory variables. Thus, at each sampling stage, one observes a collection of data points that can be represented by a curve (or profile). In some calibration applications, the profile can be represented adequately by a simple straight-line model, while in other applications, more complicated models are needed. In this expository paper, we discuss some of the general issues involved in using control charts to monitor such process- and product-quality profiles and review the SPC literature on the topic. We relate this application to functional data analysis and review applications involving linear profiles, nonlinear profiles, and the use of splines and wavelets. We strongly encourage research in profile monitoring and provide some research ideas.
Innovation in the design and manufacture of processes and products usually comes about as a result of careful investigation—a directed process of sequential learning. Many practitioners, although familiar with "one-shot" statistical procedures, have little knowledge of the power of statistical techniques designed to catalyze investigation itself. A simple means of demonstrating and experiencing this learning process is illustrated using response surface methods to find an improved design for a paper helicopter.
Abstract In this article, control charts for bivariate as well as for multivariate normal data are proposed to detect a shift in the process variability. Methods of obtaining design parameters and procedures for implementing the proposed charts are discussed. Performance of the proposed charts is compared with some existing control charts. It is verified that the proposed charts significantly reduce the out-of-control average run length (ARL) as compared to other charts considered in the study. Also, when the process variability decreases (process improvement), it is verified that the ARL of the proposed multivariate control chart increases as compared to other charts considered in the study. Keywords: Average run length; Average time to signal; Conforming run length; Determinant ratio; Sample generalized variance; Steady state ARL; Synthetic chart; Transition probability matrix; Zero state ARL. Mathematics Subject Classification: 62P30
Deng and Tang proposed generalized resolution and minimum aberration criteria for comparing and assessing nonregular fractional factorials, of which Plackett–Burman designs are special cases. A relaxed variant of generalized aberration is proposed and studied in this paper. We show that a best design according to this criterion minimizes the contamination of nonnegligible interactions on the estimation of main effects in the order of importance given by the hierarchical assumption. The new criterion is defined through a set of $B$ values, a generalization of word length pattern. We derive some theoretical results that relate the $B$ values of a nonregular fractional factorial and those of its complementary design. Application of this theory to the construction of the best designs according to the new aberration criterion is discussed. The results in this paper generalize those in Tang and Wu, which characterize a minimum aberration (regular) $2^{m-k}$ design through its complementary design.
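Assuming the usual J-characteristic definition from this line of work, the $B$ values of a two-level design can be computed directly: $B_k = \sum_{|s|=k} (J_k(s)/n)^2$, where $J_k(s)$ is the absolute column sum of the run-wise product of the columns in $s$. The sketch below does so for a regular $2^{4-1}$ design with $I = ABCD$, chosen purely as a small illustration.

```python
import numpy as np
from itertools import combinations

def b_values(D):
    """B_k values of a two-level (+/-1) design: B_k = sum over k-column subsets s of (J_k(s)/n)^2,
    where J_k(s) = |sum over runs of the product of the columns in s|."""
    n, m = D.shape
    B = []
    for k in range(1, m + 1):
        total = 0.0
        for s in combinations(range(m), k):
            j = abs(np.prod(D[:, list(s)], axis=1).sum())
            total += (j / n) ** 2
        B.append(total)
    return np.array(B)

# Regular 2^(4-1) design with defining relation I = ABCD (used here purely for illustration).
full = np.array([[a, b, c] for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)], dtype=float)
D = np.column_stack([full, full[:, 0] * full[:, 1] * full[:, 2]])   # fourth column D = ABC
print(b_values(D))   # expect B_1 = B_2 = B_3 = 0 and B_4 = 1 for this resolution IV design
```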
A data depth can be used to measure the "depth" or "outlyingness" of a given multivariate sample with respect to its underlying distribution. This leads to a natural center-outward ordering of the sample points. Based on this ordering, quantitative and graphical methods are introduced for analyzing multivariate distributional characteristics such as location, scale, bias, skewness and kurtosis, as well as for comparing inference methods. All graphs are one-dimensional curves in the plane and can be easily visualized and interpreted. A "sunburst plot" is presented as a bivariate generalization of the box-plot. DD- (depth versus depth) plots are proposed and examined as graphical inference tools. Some new diagnostic tools for checking multivariate normality are introduced. One of them monitors the exact rate of growth of the maximum deviation from the mean, while the others examine the ratio of the overall dispersion to the dispersion of a certain central region. The affine invariance property of a data depth also leads to appropriate invariance properties for the proposed statistics and methods.
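A minimal sketch of the DD-plot construction, using Mahalanobis depth as one convenient (if simple) choice of depth function: each observation in the combined sample gets a pair of depths, one with respect to each sample, and departures from the diagonal suggest a distributional difference.

```python
import numpy as np

def mahalanobis_depth(points, sample):
    """Depth of each point with respect to a sample: 1 / (1 + squared Mahalanobis distance)."""
    mu = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample, rowvar=False))
    d = points - mu
    return 1.0 / (1.0 + np.einsum("ij,jk,ik->i", d, cov_inv, d))

def dd_plot_points(X, Y):
    """DD-plot coordinates: each combined observation's depth w.r.t. X versus its depth w.r.t. Y."""
    Z = np.vstack([X, Y])
    return np.column_stack([mahalanobis_depth(Z, X), mahalanobis_depth(Z, Y)])

rng = np.random.default_rng(7)
X = rng.multivariate_normal([0, 0], np.eye(2), size=200)
Y = rng.multivariate_normal([1, 0], np.eye(2), size=200)     # location-shifted second sample
pts = dd_plot_points(X, Y)
# A marked departure from the 45-degree line indicates a difference (here, a location shift).
print(np.abs(pts[:, 0] - pts[:, 1]).mean())
```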
Multivariate control charts are widely used in various industries to monitor shifts in the process mean and process variability. In Phase I monitoring, control limits are computed using historical data, and control charts based on classical estimators (the sample mean and the sample covariance) are highly sensitive to outliers in the data. We propose robust control charts with high-breakdown robust estimators based on the re-weighted minimum covariance determinant and the re-weighted minimum volume ellipsoid to monitor the process variability of multivariate individual observations in Phase I data under multivariate exponentially weighted mean square error and multivariate exponentially weighted moving variance schemes. The control limits are computed empirically, and the performance of the proposed charts is assessed with Monte Carlo simulations by considering different data scenarios. The proposed robust control charts are shown to perform better than charts based on classical estimators. Copyright © 2013 John Wiley & Sons, Ltd.
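Not the paper's full charting scheme, but a sketch of the robust-estimation step it relies on: scikit-learn's `MinCovDet` (a re-weighted minimum covariance determinant estimator) supplies robust location and scatter, and robust Mahalanobis distances are screened against an approximate χ² cutoff to flag Phase I outliers. The cutoff and the simulated data are illustrative.

```python
import numpy as np
from scipy import stats
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(8)
p = 3
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=100)
X[:5] += 6.0                                   # plant a few gross outliers in the Phase I data

mcd = MinCovDet(random_state=0).fit(X)         # re-weighted MCD location and scatter
d2 = mcd.mahalanobis(X)                        # squared robust Mahalanobis distances
cutoff = stats.chi2.ppf(0.975, df=p)           # common (approximate) screening cutoff
print("flagged Phase I observations:", np.where(d2 > cutoff)[0])
```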
Abstract Recent research has shown that control statistics based on the squared deviation of observations from target have the ability to monitor variability in both univariate and multivariate processes. In the current research, the properties of the control statistic $S_t$ proposed by Huwang et al. (J. Quality Technology 2007; 39:258–278) are first reviewed and three new $S_t$-based multivariate schemes are then presented. Extensive simulation experiments are performed to compare the performance of the proposed schemes with those of the multivariate exponentially weighted mean squared deviation (MEWMS) chart and the chart based on the $L_1$-norm distance of the MEWMS deviation from its expected value (MEWMSL1). The results show that one of the proposed schemes outperforms the others in detecting shifts in correlation coefficients, and another has the best general performance among the compared charts in detecting shifts in which at least one of the variances changes. Copyright © 2011 John Wiley & Sons, Ltd.