The extreme-value index is an important parameter in extreme-value theory since it controls the first order behavior of the distribution tail. In the literature, numerous estimators of this parameter have been proposed, especially in the case of heavy-tailed distributions, which is the situation considered here. Most of these estimators depend on the k largest observations of the underlying sample. Their bias is controlled by the second order parameter. In order to reduce the bias of extreme-value index estimators or to select the best number k of observations to use, knowledge of the second order parameter is essential. In this paper, we propose a simple approach to estimate the second order parameter, leading to both existing and new estimators. We establish a general result that can be used to easily prove the asymptotic normality of a large number of estimators proposed in the literature or to compare different estimators within a given family. Some illustrations on simulations are also provided.
Let $X_{1}, X_{2}, \ldots$ be a sequence of independent copies (s.i.c.) of a real random variable (r.v.) $X \geq 1$ with distribution function (df) $F(x) = \mathbb{P}(X \leq x)$, and let $X_{1,n} \leq X_{2,n} \leq \cdots \leq X_{n,n}$ be the order statistics based on the first $n \geq 1$ of these observations. The continuous generalized Hill process $T_{n}(\tau) = k^{-\tau}\sum_{j=1}^{k} j^{\tau}(\log X_{n-j+1,n} - \log X_{n-j,n})$, $\tau > 0$, $1 \leq k \leq n$, has been introduced as a continuous family of estimators of the extreme value index, and has been widely studied for statistical purposes with asymptotic normality results restricted to $\tau > 1/2$. We extend those results to $0 < \tau \leq 1/2$ and show that asymptotic normality still holds for $\tau = 1/2$. For $0 < \tau < 1/2$, we get non-Gaussian asymptotic laws which are closely related to the Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$, $s > 1$.
The overlapping coefficient is a direct measure of similarity between two distributions that has recently become very useful. This paper investigates estimation for some well-known measures of overlap, namely Matusita's measure $\rho$, Weitzman's measure $\Delta$ and $\Lambda$ based on Kullback-Leibler. The two estimation methods considered in this study are point estimation and the Bayesian approach. Two inverse Lomax populations with different shape parameters are considered. The bias and mean square error properties of the estimators are studied through a simulation study and a real data example.
Nonparametric density estimation, based on kernel-type estimators, is a very popular method in statistical research, especially when we want to model the probabilistic or stochastic structure of a data set. In this paper, we investigate asymptotic confidence bands for the distribution, using kernel estimators for some types of divergence measures (Rényi-α and Tsallis-α divergence). Our aim is to use a method based on empirical process techniques in order to derive some asymptotic results. Under different assumptions, we establish a variety of fundamental theoretical properties, such as the uniform-in-bandwidth strong consistency of the divergence estimators. We further apply these results to simulated examples, including the kernel-type estimators for the Hellinger, Bhattacharyya and Kullback-Leibler divergences, to illustrate this approach, and we show that the method performs competitively.
Climate change studies require comprehensive databases to analyze the climate signal, to monitor its evolution, and to predict future changes more accurately. Since complete observation of any continuous process is almost impossible, missing information in meteorological databases is inevitable. The aim of this work is to evaluate the performance of five imputation methods: missForest, $k$-nn, ppca, mice and imputeTS. The results show that missForest is the best performing method for handling missing temperature data. In the case of precipitation data, the imputeTS method is the preferred one.
We are concerned in this paper with the functional asymptotic behaviour of the sequence of stochastic processes $T_{n}(f)=\sum_{j=1}^{k}f(j)(\log X_{n-j+1,n}-\log X_{n-j,n})$, indexed by some classes $\mathcal{F}$ of functions $f:\mathbb{N} \setminus \{0\} \longmapsto \mathbb{R}_{+}$, where $k=k(n)$ satisfies $1\leq k\leq n$ and $k/n\rightarrow 0$ as $n\rightarrow \infty$. This is a functional generalized Hill process that yields many new estimators of the extreme value index when $F$ is in the extreme value domain. We focus in this paper on its functional and uniform asymptotic law in the new setting of weak convergence in the space of bounded real functions. The results are then particularized for explicit examples of classes $\mathcal{F}$.
We are concerned in this paper with the functional asymptotic behavior of the sequence of stochastic processes $$T_{n}(f)=\sum_{j=1}^{k}f(j)\left( \log X_{n-j+1,n}-\log X_{n-j,n}\right),\eqno(0.1)$$ indexed by some classes $\mathcal{F}$ of functions $f:\mathbb{N} \setminus \{0\} \longmapsto \mathbb{R}_{+}$, where $k=k(n)$ satisfies $1\leq k\leq n$ and $k/n\rightarrow 0$ as $n\rightarrow \infty$. This is a functional generalized Hill process that yields many new estimators of the extreme value index when $F$ is in the extreme value domain. We focus in this paper on its functional and uniform asymptotic law in the new setting of weak convergence in the space of bounded real functions. The results are then particularized for explicit examples of classes $\mathcal{F}$.
In this paper, we investigate the problem of estimating the probability density function. Although bias-reduced kernel density estimation is nowadays a standard technique in exploratory data analysis, there is still considerable dispute about how to assess the quality of the estimate and which choice of bandwidth is optimal. This work examines the most important bandwidth selection methods for kernel density estimation in the context of bias reduction. Normal reference, least squares cross-validation, biased cross-validation and β-divergence loss methods are described and their expressions are presented. In order to assess the performance of the various bandwidth selectors, numerical simulations and an application to environmental data are carried out.
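To make the normal reference selector mentioned above concrete, here is a minimal sketch in plain Python. It uses an ordinary Gaussian kernel estimator, not the bias-reduced estimator studied in the paper, and the function names `normal_reference_bandwidth` and `gaussian_kde` are hypothetical. The constant 1.06 is the usual rounding of $(4/3)^{1/5}$ for a Gaussian kernel.

```python
import math
import statistics


def normal_reference_bandwidth(data):
    """Normal reference (rule-of-thumb) bandwidth for a Gaussian kernel.

    h = 1.06 * sigma_hat * n^(-1/5), with sigma_hat the sample standard
    deviation. This is the textbook rule, assumed here for illustration.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    sigma_hat = statistics.stdev(data)
    return 1.06 * sigma_hat * n ** (-1.0 / 5.0)


def gaussian_kde(data, h):
    """Return the classical kernel density estimate x -> f_hat(x).

    f_hat(x) = (1 / (n h)) * sum_i K((x - X_i) / h), K = standard normal pdf.
    No bias reduction is applied in this sketch.
    """
    if h <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(data)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


if __name__ == "__main__":
    sample = [0.0, 1.0, 2.0, 3.0, 4.0]
    h = normal_reference_bandwidth(sample)
    f_hat = gaussian_kde(sample, h)
    print("bandwidth:", h)
    print("density at the sample mean:", f_hat(2.0))
```

The cross-validation and β-divergence selectors listed in the abstract would replace `normal_reference_bandwidth` while leaving `gaussian_kde` unchanged.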
In this paper, we propose the realized Hyperbolic GARCH model for the joint dynamics of low-frequency returns and realized measures, which generalizes the realized GARCH model of Hansen et al. (2012) as well as the FLoGARCH model introduced by Vander Elst (2015). This model is sufficiently flexible to capture both long memory and asymmetries related to leverage effects. In addition, we study the strict and weak stationarity conditions of the model. To evaluate its performance, Monte Carlo simulations are carried out to forecast the Value at Risk (VaR) and the Expected Shortfall (ES). These simulation studies show that, for ES and VaR forecasting, the realized Hyperbolic GARCH model with Gaussian-Gaussian errors (RHYGARCH-GG) provides more adequate estimates than the realized Hyperbolic GARCH model with Student-Gaussian errors.
Binary responses are often present in medical studies. When the dependent variable Y represents a rare event, the logistic regression model shows relevant drawbacks. To overcome these drawbacks, we propose the quantile function of the generalized extreme value distribution as a link function and focus our attention on values close to one. One problem arising in the presence of a cure fraction is that it is usually unknown who the cured and the susceptible subjects are, unless the outcome of interest has been observed. In these settings, a logistic regression analysis is no longer straightforward. We develop a maximum likelihood estimation procedure based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model and establish its asymptotic properties. We conduct a simulation study to investigate its finite-sample behaviour and apply it to real data.
We propose nonparametric estimation of divergence measures between continuous distributions. Our approach is based on plug-in kernel-type estimators of density functions. We give the uniform-in-bandwidth consistency of the proposed estimators. As a consequence, their asymptotic 100% confidence intervals are also provided.
We introduce kernel-type estimators of the ( ) φ /, v -divergence for continuous distributions. We discuss this approach as a goodness-of-fit test for a model selection criterion relative to these divergence measures. Our interest is in the problem of testing to choose between two models using some informational-type statistics (on a random walk and an autoregressive AR(1) model). The limit laws of the estimates and test statistics are given under both the null and the alternative hypotheses. We also describe how to apply the estimators and illustrate their efficiency through numerical experiments.
We introduce a location-scale model for conditional heavy-tailed distributions when the covariate is deterministic. First, nonparametric estimators of the location and scale functions are introduced. Second, an estimator of the conditional extreme-value index is derived. The asymptotic properties of the estimators are established under mild assumptions and their finite sample properties are illustrated both on simulated and real data.
Let $X_{1},X_{2},\ldots$ be a sequence of independent copies (s.i.c.) of a real random variable (r.v.) $X\geq 1$ with distribution function (df) $F(x)=\mathbb{P}(X\leq x)$, and let $X_{1,n}\leq X_{2,n} \leq \cdots \leq X_{n,n}$ be the order statistics based on the first $n\geq 1$ of these observations. The continuous generalized Hill process $$T_{n}(\tau)=k^{-\tau}\sum_{j=1}^{k}j^{\tau}(\log X_{n-j+1,n}-\log X_{n-j,n}),$$ $\tau >0$, $1\leq k \leq n$, has been introduced as a continuous family of estimators of the extreme value index, and has been widely studied for statistical purposes with asymptotic normality results restricted to $\tau > 1/2$. We extend those results to $0 < \tau \leq 1/2$ and show that asymptotic normality still holds for $\tau=1/2$. For $0 < \tau <1/2$, we get non-Gaussian asymptotic laws which are closely related to the Riemann zeta function $\zeta(s)=\sum_{n=1}^{\infty} n^{-s}$, $s>1$.
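The process $T_{n}(\tau)$ defined above can be computed directly from the sorted sample. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and it restricts to $k < n$ so that $X_{n-k,n}$ exists (the abstract allows $k \leq n$ without saying how $X_{0,n}$ is defined). It also includes the classical Hill estimator, which coincides with $T_{n}(1)$ by summation by parts.

```python
import math
import random


def generalized_hill(sample, k, tau):
    """T_n(tau) = k^(-tau) * sum_{j=1}^{k} j^tau * (log X_{n-j+1,n} - log X_{n-j,n}).

    Assumes positive observations and 1 <= k < n (an assumption of this sketch).
    """
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    if xs[n - k - 1] <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    total = 0.0
    for j in range(1, k + 1):
        # With 0-based indexing, X_{n-j+1,n} is xs[n-j] and X_{n-j,n} is xs[n-j-1].
        total += j ** tau * (math.log(xs[n - j]) - math.log(xs[n - j - 1]))
    return total / k ** tau


def hill(sample, k):
    """Classical Hill estimator: mean log-excess over the threshold X_{n-k,n}."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    top = sum(math.log(xs[n - i]) for i in range(1, k + 1)) / k
    return top - math.log(xs[n - k - 1])


if __name__ == "__main__":
    random.seed(0)
    # Pareto sample with shape 2, i.e. extreme value index 1/2 and X >= 1.
    data = [random.paretovariate(2.0) for _ in range(2000)]
    for tau in (0.25, 0.5, 1.0, 2.0):
        print(tau, generalized_hill(data, 200, tau))
    print("Hill:", hill(data, 200))
```

For $\tau \neq 1$ the raw value of $T_{n}(\tau)$ is not itself centred on the extreme value index; the normalisation needed to turn it into an estimator is part of the paper's results and is not reproduced here.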
This thesis is divided into five chapters, together with an introduction and a conclusion. In the first chapter, we recall some basic notions of extreme value theory. In the second chapter, we consider a statistical process depending on a continuous parameter $\tau$, each margin of which can be regarded as a generalized Hill estimator. This statistical process makes it possible to discriminate fully between the extreme value domains of attraction. The asymptotic normality of this process had been given only for $\tau > 1/2$. We complete this study for $0 < \tau < 1/2$ by giving an approximation for the Gumbel and Fréchet domains. Simulation studies carried out with the R software show the performance of these estimators. As an illustration, we apply our methodology to hydraulic data. In the third chapter, we extend the study of this statistical process to a functional framework. We thus propose a stochastic process depending on a positive functional in order to obtain a large class of estimators of the extreme value index, each of which is a margin of a single stochastic process. Our theoretical study of these stochastic processes is based on the modern theory of functional weak convergence, which makes it possible to handle more complex estimators in the form of stochastic processes. We give the functional asymptotic distributions of these processes and show that, for some classes of functions, the asymptotic behavior is non-Gaussian and is fully characterized. In the fourth chapter, we are interested in the estimation of the second order parameter. This parameter plays a very important role in the adaptive choice of the optimal number of extreme values used when estimating the extreme value index. Its estimation is also used to reduce the bias of tail index estimators and has received much attention in the extreme value literature. We propose a simple and general approach to estimating the second order parameter that brings together a large number of estimators. It is shown that the estimators mentioned above can be seen as particular cases of our approach. We also take advantage of our formalism to propose new asymptotically Gaussian estimators of the second order parameter. Finally, some estimators are compared both asymptotically and in terms of finite-sample performance. As an illustration, we propose an application to insurance data. In the last chapter, we are interested in actuarial risk measures for phenomena capable of generating very large financial losses (extreme phenomena, that is, risks for which it is not known whether the insurance system will be able to bear them). Many risk measures or premium calculation principles have been proposed in the actuarial literature. We focus on the risk-adjusted premium. Jones and Zitikis (2003) gave an estimator of this premium based on the empirical distribution and established its asymptotic normality under suitable conditions, which are often not fulfilled in the case of heavy-tailed distributions. We therefore examine this setting and consider a family of estimators of the risk-adjusted premium based on the extreme value theory approach. We establish their asymptotic normality and also propose a bias reduction approach for these estimators. Simulation studies allow us to assess the quality of our estimators. As an illustration, we propose an application to insurance data.
An important parameter in extreme value theory is the extreme value index $\gamma$. It controls the first order behavior of the distribution tail. In the literature, numerous estimators of this parameter have been proposed, especially in the case of heavy-tailed distributions (which is the situation considered here). The best-known estimator was proposed by [2]. It depends on the $k$ largest observations of the underlying sample. The bias of the tail index estimator is controlled by the second order parameter $\rho$. In order to reduce the bias of estimators of $\gamma$ or to select the best number $k$ of observations to use, knowledge of $\rho$ is essential. Some estimators of $\rho$ can be found in the literature; see for example [1, 2, 3]. We propose a semiparametric family of estimators for $\rho$ that encompasses the three previously mentioned estimators. The asymptotic normality of these estimators is then proved in a unified way. New estimators of $\rho$ are also introduced.
We are interested in a location-scale model for heavy-tailed distributions where the covariate is deterministic. We first address the nonparametric estimation of the location and scale functions and derive an estimator of the conditional extreme-value index. Second, new estimators of the extreme conditional quantiles are introduced. The asymptotic properties of the estimators are established under mild assumptions.
In this paper, we propose an estimator of the Foster, Greer and Thorbecke class of measures $\displaystyle P(z,\alpha) = \int_0^{z}\Big(\frac{z-x}{z}\Big)^{\alpha}f(x)\, dx$, where $z>0$ is the poverty line, $f$ is the probability density function of the income distribution and $\alpha$ is the so-called poverty aversion. The estimator is constructed with a bias-reduced kernel estimator. Uniform almost sure consistency and uniform mean square consistency are established. A simulation study indicates that our new estimator performs well.
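For orientation, $P(z,\alpha)$ can be read as the expectation of $((z-X)/z)^{\alpha}$ over incomes below the poverty line, which suggests the simple empirical plug-in sketched below. This is the standard sample-average version, not the bias-reduced kernel estimator proposed in the paper; the function name `fgt_empirical` and the toy incomes are assumptions made for illustration.

```python
def fgt_empirical(incomes, z, alpha):
    """Empirical Foster-Greer-Thorbecke measure.

    P_hat(z, alpha) = (1/n) * sum over incomes x_i < z of ((z - x_i) / z) ** alpha.
    alpha = 0 gives the headcount ratio, alpha = 1 the average poverty gap.
    """
    if z <= 0:
        raise ValueError("the poverty line z must be positive")
    if alpha < 0:
        raise ValueError("the poverty aversion alpha must be non-negative")
    if not incomes:
        raise ValueError("incomes must be non-empty")
    # Only incomes strictly below the poverty line contribute to the sum.
    total = sum(((z - x) / z) ** alpha for x in incomes if 0 <= x < z)
    return total / len(incomes)


if __name__ == "__main__":
    toy_incomes = [0.5, 1.0, 1.5, 3.0]  # illustrative values, not real data
    for a in (0, 1, 2):
        print(a, fgt_empirical(toy_incomes, z=2.0, alpha=a))
```

The kernel approach of the paper would instead plug a density estimate of $f$ into the integral.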
The overlapping coefficient is a direct measure of similarity between two distributions that has recently become very useful. This paper investigates estimation for some well-known measures of overlap, namely Matusita's measure $\rho$, Weitzman's measure $\Delta$ and $\Lambda$ based on Kullback-Leibler. The two estimation methods considered in this study are point estimation and the Bayesian approach. Two inverse Lomax populations with different shape parameters are considered. The bias and mean square error properties of the estimators are studied through a simulation study and a real data example.
The extreme-value index is an important parameter in extreme-value theory since it controls the first order behavior of the distribution tail. Numerous estimators of this parameter have been proposed, especially in the case of heavy-tailed distributions, which is the situation considered here. Most of these estimators depend on the largest observations of the underlying sample. Their bias is controlled by the second order parameter. In order to reduce the bias of extreme-value index estimators or to select the best number of observations to use, knowledge of the second order parameter is essential. We propose a simple approach to estimate the second order parameter, leading to both existing and new estimators. We establish a general result that can be used to easily prove the asymptotic normality of a large number of estimators proposed in the literature or to compare different estimators within a given family. Some illustrations on simulations are also provided.
In this paper, we propose an Adaptive Realized Hyperbolic GARCH (A-Realized HYGARCH) process to model the long memory of high-frequency time series with possible structural breaks. The structural change is modeled by allowing the intercept to follow the smooth and flexible functional form introduced by Gallant (1984). In addition, stability conditions of the process are investigated. A Monte Carlo study is conducted to illustrate the performance of the A-Realized HYGARCH process compared to the Realized HYGARCH with or without structural change.
The logistic regression model is widely used in many studies to investigate the relationship between a binary response variable $Y$ and a set of potential predictors $\mathbf X$. The binary response may represent, for example, the occurrence of some outcome of interest ($Y=1$ if the outcome occurred and $Y=0$ otherwise). When the dependent variable $Y$ represents a rare event, the logistic regression model shows relevant drawbacks. In order to overcome these drawbacks, we propose the Generalized Extreme Value (GEV) regression model. In particular, we suggest the quantile function of the GEV distribution as the link function, so our attention is focused on the tail of the response curve for values close to one. A sample of observations is said to contain a cure fraction when a proportion of the study subjects (the so-called cured individuals, as opposed to the susceptibles) cannot experience the outcome of interest. One problem arising then is that it is usually unknown who the cured and the susceptible subjects are, unless the outcome of interest has been observed. In these settings, a logistic regression analysis of the relationship between $\mathbf X$ and $Y$ among the susceptibles is no longer straightforward. We develop a maximum likelihood estimation procedure for this problem, based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model. Then, we conduct a simulation study to investigate its finite-sample behavior and apply it to real data.
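To illustrate what using the GEV quantile function as a link means, here is a minimal sketch of the standard GEV distribution function and its inverse. The abstract does not state the parametrization, so the sketch assumes location 0, scale 1 and shape `xi`; the function names are hypothetical. Under that assumption, a linear predictor $\eta$ would map to a response probability `gev_cdf(eta, xi)`, and the link sends a probability back to $\eta$ through `gev_quantile`.

```python
import math


def gev_cdf(x, xi):
    """Standard GEV distribution function with shape xi (location 0, scale 1).

    G(x) = exp(-(1 + xi * x) ** (-1 / xi)) where 1 + xi * x > 0,
    with the Gumbel limit exp(-exp(-x)) when xi == 0.
    """
    if xi == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + xi * x
    if t <= 0.0:
        # Outside the support: below the lower endpoint if xi > 0,
        # above the upper endpoint if xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_quantile(p, xi):
    """Inverse of gev_cdf on (0, 1): the candidate link function."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if xi == 0.0:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-xi) - 1.0) / xi


if __name__ == "__main__":
    for xi in (-0.3, 0.0, 0.5):
        eta = gev_quantile(0.95, xi)
        print(xi, eta, gev_cdf(eta, xi))
```

Unlike the logistic curve, this response curve is asymmetric, which is the usual motivation for such links when one outcome is rare.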
Generalized extreme value (GEV) regression is often better suited when we investigate the relationship between a binary response variable $Y$ that represents a rare event and potential predictors $\mathbf{X}$. In particular, we use the quantile function of the GEV distribution as the link function. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, hypothesis tests) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods. Bootstrapping estimates the properties of an estimator by measuring those properties when sampling from an approximating distribution. In this paper, we fit the generalized extreme value regression model, then apply a parametric bootstrap to test hypotheses and to estimate confidence intervals for its parameters, and present a real data application.
In this paper, we propose an Adaptive Realized Hyperbolic GARCH (A-Realized HYGARCH) process to model the long memory of high-frequency time series with possible structural breaks. The structural change is modeled by allowing the intercept to follow the smooth and flexible functional form introduced by Gallant in [20]. In addition, stability conditions of the process are investigated. A Monte Carlo study is conducted to illustrate the performance of the A-Realized HYGARCH process compared to the Realized HYGARCH with or without structural change.
The modeling of extreme events arises in many fields such as finance, insurance or environmental science. A recurrent statistical problem is then the estimation of extreme quantiles associated with a random variable $Y$ recorded simultaneously with a multidimensional covariate $x \in \mathbb{R}^d$, the goal being to describe how tail characteristics such as extreme quantiles or small exceedance probabilities of the response variable $Y$ may depend on the explanatory variable $x$. Here, we focus on the challenging situation where $Y$ given $x$ is heavy-tailed. Without additional assumptions on the pair $(Y,x)$, the estimation of extreme conditional quantiles is addressed using a semi-parametric method. More specifically, we assume that the response variable and the deterministic covariate are linked by a location-dispersion regression model $Y=a(x)+b(x)Z$, where $Z$ is a heavy-tailed random variable. This model is flexible since (i) no parametric assumptions are made on $a(\cdot)$, $b(\cdot)$ and $Z$, and (ii) it allows for heteroscedasticity via the function $b(\cdot)$. Moreover, another feature of this model is that $Y$ inherits its tail behaviour from $Z$, which thus does not depend on the covariate $x$. We propose to take advantage of this important property to decouple the estimation of the nonparametric and extreme structures. First, nonparametric estimators of the regression function $a(\cdot)$ and the dispersion function $b(\cdot)$ are introduced. This makes it possible, in a second step, to derive an estimator of the conditional extreme-value index computed on the residuals. A plug-in estimator of extreme conditional quantiles is then built using these two preliminary steps. We show that the resulting semi-parametric estimator is asymptotically Gaussian and may benefit from the same rate of convergence as in the unconditional situation. Its finite sample properties are illustrated both on simulated and real tsunami data.
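The plug-in idea for the location-dispersion model can be seen from a short worked step. It assumes that the dispersion function satisfies $b(x) > 0$ and that the law of $Z$ does not depend on $x$; the notation $F_Z$, $q_Z$ and $q_Y$ is introduced here for illustration and is not taken from the abstract.

```latex
\begin{align*}
\mathbb{P}(Y \le y \mid x)
  &= \mathbb{P}\bigl(a(x) + b(x) Z \le y\bigr)
   = F_Z\!\left(\frac{y - a(x)}{b(x)}\right), \\
q_Y(\alpha \mid x)
  &= a(x) + b(x)\, q_Z(\alpha), \qquad \alpha \in (0,1),
\end{align*}
```

where $q_Z(\alpha) = \inf\{z : F_Z(z) \ge \alpha\}$. Replacing $a$, $b$ and the extreme quantile of $Z$ by estimates then gives an estimate of the conditional quantile, which is consistent with the two-step construction described above.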
In this paper, we establish the asymptotic normality of the recursive estimator of the density function for the right-censored random model when the data present some form of dependence. It is assumed that the survival and censoring times form a stationary \(\beta\)-mixing sequence. This paper is thus part of a broader project aimed at extending results obtained for independent variables to the dependent case.
In this paper, we introduce a class of semi-parametric estimators of the distortion risk premiums for dependent insurance losses with heavy-tailed marginals. Our approach is based on the kernel estimation of the tail index and extreme quantiles under first- and second-order regular variation assumptions for stationary insured risks with heavy-tailed distributions under serial dependence. Moreover, we illustrate the behaviour of our proposed estimator and compare it with the classical one in terms of the absolute bias and the root median squared error.
Binary responses are often present in medical studies. When the dependent variable $Y$ represents a rare event, the logistic regression model shows relevant drawbacks. To overcome these drawbacks, we propose the quantile function of the generalized extreme value regression distribution as a link function and focus our attention on values close to one. One problem arising in the presence of a cure fraction is that it is usually unknown who the cured and the susceptible subjects are, unless the outcome of interest has been observed. In these settings, a logistic regression analysis is no longer straightforward. We develop a maximum likelihood estimation procedure based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model and establish its asymptotic properties. We conduct a simulation study to investigate its finite-sample behaviour and apply it to real data.
Climate change studies require comprehensive databases to analyze the climate signal, to monitor its evolution, and to predict future changes more accurately. Since complete observation of any continuous process is almost impossible, missing information in meteorological databases is inevitable. The aim of this work is to evaluate the performance of five imputation methods: missForest, $k$-nn, ppca, mice and imputeTS. The results show that missForest is the best performing method to handle missing temperature data. In the case of precipitation data, the imputeTS method is the preferred one.
Overlapping coefficient is a direct measure of similarity between two distributions which is recently becoming very useful. This paper investigates estimation for some well-known measures of overlap, namely Matusita's measure ρ, Weitzman's measure ∆ and Λ based on Kullback-Leibler. Two estimation methods considered in this study are point estimation and Bayesian approach. Two inverse Lomax populations with different shape parameters are considered. The bias and mean square error properties of the estimators are studied through a simulation study and a real data example.
The logistic regression model is widely used in many studies to investigate the relationship between a binary response variable $Y$ and a set of potential predictors $\mathbf X$. The binary response may represent, for example, the occurrence of some outcome of interest ($Y=1$ if the outcome occurred and $Y=0$ otherwise). When the dependent variable $Y$ represents a rare event, the logistic regression model shows relevant drawbacks. In order to overcome these drawbacks, we propose the Generalized Extreme Value (GEV) regression model. In particular, we suggest the quantile function of the GEV distribution as the link function, so that our attention is focused on the tail of the response curve for values close to one. A sample of observations is said to contain a cure fraction when a proportion of the study subjects (the so-called cured individuals, as opposed to the susceptibles) cannot experience the outcome of interest. One problem arising then is that it is usually unknown who the cured and the susceptible subjects are, unless the outcome of interest has been observed. In these settings, a logistic regression analysis of the relationship between $\mathbf X$ and $Y$ among the susceptibles is no longer straightforward. We develop a maximum likelihood estimation procedure for this problem, based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model. Then, we conduct a simulation study to investigate its finite-sample behavior and apply it to real data.
Generalized extreme value (GEV) regression is often more suitable when we investigate the relationship between a binary response variable $Y$ that represents a rare event and potential predictors $\mathbf{X}$. In particular, we use the quantile function of the GEV distribution as the link function. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, hypothesis tests) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods. Bootstrapping estimates the properties of an estimator by measuring those properties when sampling from an approximating distribution. In this paper, we fit the generalized extreme value regression model and then apply a parametric bootstrap method for hypothesis testing and for estimating confidence intervals of the model parameters, with an application to real data.
In this paper, we propose the realized Hyperbolic GARCH model for the joint dynamics of low-frequency returns and realized measures, which generalizes the realized GARCH model of Hansen et al. (2012) as well as the FLoGARCH model introduced by Vander Elst (2015). This model is sufficiently flexible to capture both long memory and asymmetries related to leverage effects. In addition, we study the strict and weak stationarity conditions of the model. To evaluate its performance, Monte Carlo simulations are carried out to forecast the Value at Risk (VaR) and the Expected Shortfall (ES). These simulation studies show that, for ES and VaR forecasting, the realized Hyperbolic GARCH model with Gaussian-Gaussian errors (RHYGARCH-GG) provides more adequate estimates than the realized Hyperbolic GARCH model with Student-Gaussian errors.
In this paper, we propose an Adaptive Realized Hyperbolic GARCH (A-Realized HYGARCH) process to model the long memory of high-frequency time series with possible structural breaks. The structural change is modeled by allowing the intercept to follow the smooth and flexible function form introduced by Gallant in [20]. In addition, stability conditions of the process are investigated. A Monte Carlo study is investigated in order to illustrate the performance of the A-Realized HYGARCH process compared to the Realized HYGARCH with or without structural change.
In this paper, we investigate the problem of estimating the probability density function. Although bias-reduced kernel density estimation is nowadays a standard technique in exploratory data analysis, there is still considerable debate on how to assess the quality of the estimate and which choice of bandwidth is optimal. This work examines the most important bandwidth selection methods for kernel density estimation with bias reduction. Normal reference, least squares cross-validation, biased cross-validation and β-divergence loss methods are described and their expressions are presented. To assess the performance of the various bandwidth selectors, numerical simulations and an application to environmental data are carried out.
In this paper, we propose an estimator of the Foster, Greer and Thorbecke class of measures $\displaystyle P(z,\alpha) = \int_0^{z}\Big(\frac{z-x}{z}\Big)^{\alpha}f(x)\, dx$, where $z>0$ is the poverty line, $f$ is the probability density function of the income distribution and $\alpha$ is the so-called poverty aversion. The estimator is constructed with a bias-reduced kernel estimator. Uniform almost sure consistency and uniform mean square consistency are established. A simulation study indicates that our new estimator performs well.
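To make the definition of $P(z,\alpha)$ concrete, here is a minimal sketch of its plain empirical counterpart, obtained by replacing $f(x)\,dx$ with the empirical distribution of an income sample. This is only the naive sample analogue, not the bias-reduced kernel estimator proposed in the paper, and the function name `fgt_empirical` and the sample values are hypothetical.

```python
def fgt_empirical(incomes, z, alpha):
    """Sample analogue of the FGT measure P(z, alpha).

    Averages ((z - x) / z) ** alpha over incomes x below the poverty
    line z, and counts 0 for every income at or above z.
    Incomes are assumed to be nonnegative.
    """
    if z <= 0:
        raise ValueError("the poverty line z must be positive")
    if alpha < 0:
        raise ValueError("the poverty aversion alpha must be nonnegative")
    if not incomes:
        raise ValueError("the income sample must not be empty")
    total = 0.0
    for x in incomes:
        if x < z:
            total += ((z - x) / z) ** alpha
    return total / len(incomes)


# Hypothetical incomes, with the poverty line set at 2.5.
sample = [1.0, 2.0, 3.0, 4.0]
headcount = fgt_empirical(sample, z=2.5, alpha=0)     # share of poor
poverty_gap = fgt_empirical(sample, z=2.5, alpha=1)   # mean relative shortfall
print(headcount, poverty_gap)
```

With $\alpha=0$ the measure reduces to the headcount ratio and with $\alpha=1$ to the poverty gap index. Larger values of $\alpha$ give more weight to the poorest individuals.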
We introduce a location-scale model for conditional heavy-tailed distributions when the covariate is deterministic. First, nonparametric estimators of the location and scale functions are introduced. Second, an estimator of the conditional extreme-value index is derived. The asymptotic properties of the estimators are established under mild assumptions and their finite sample properties are illustrated both on simulated and real data.
We are interested in a location-scale model for heavy-tailed distributions where the covariate is deterministic. We first address the nonparametric estimation of the location and scale functions and derive an estimator of the conditional extreme-value index. Second, new estimators of the extreme conditional quantiles are introduced. The asymptotic properties of the estimators are established under mild assumptions.
We introduce kernel-type estimators of ( ) φ /, v -divergence for continuous distributions. We discuss this approach as a goodness-of-fit test for a model selection criterion relative to these divergence measures. Our interest is in the problem of testing for the choice between two models using some informational type statistics (on random walk and autoregressive AR(1) models). The limit laws of the estimates and test statistics are given under both the null and the alternative hypotheses. We also describe how to apply the estimators and illustrate their efficiency through numerical experiments.
Nonparametric density estimation, based on kernel-type estimators, is a very popular method in statistical research, especially when we want to model the probabilistic or stochastic structure of a data set. In this paper, we investigate asymptotic confidence bands for the distribution with kernel estimators for some types of divergence measures (Rényi-α and Tsallis-α divergences). Our aim is to use a method based on empirical process techniques in order to derive some asymptotic results. Under different assumptions, we establish a variety of fundamental and theoretical properties, such as the strong uniform-in-bandwidth consistency of the divergence estimators. We further apply these results to simulated examples, including the kernel-type estimators of the Hellinger, Bhattacharyya and Kullback-Leibler divergences, to illustrate this approach, and we show that the method performs competitively.
We propose nonparametric estimation of divergence measures between continuous distributions. Our approach is based on plug-in kernel-type estimators of density functions. We give the uniform-in-bandwidth consistency for the proposed estimators. As a consequence, their asymptotic 100% confidence intervals are also provided.
The extreme-value index is an important parameter in extreme-value theory since it controls the first order behavior of the distribution tail. Numerous estimators of this parameter have been proposed especially in the case of heavy-tailed distributions, which is the situation considered here. Most of these estimators depend on the largest observations of the underlying sample. Their bias is controlled by the second order parameter. In order to reduce the bias of extreme-value index estimators or to select the best number of observations to use, the knowledge of the second order parameter is essential. We propose a simple approach to estimate the second order parameter leading to both existing and new estimators. We establish a general result that can be used to easily prove the asymptotic normality of a large number of estimators proposed in the literature or to compare different estimators within a given family. Some illustrations on simulations are also provided.
This thesis is divided into five chapters, together with an introduction and a conclusion. In the first chapter, we recall some basic notions of extreme value theory. In the second chapter, we consider a statistical process depending on a continuous parameter tau, each margin of which can be regarded as a generalized Hill estimator. This statistical process makes it possible to fully discriminate between the extreme value domains of attraction. The asymptotic normality of this process had only been established for tau > 1/2. We complete this study for 0 < tau < 1/2 by giving an approximation for the Gumbel and Fréchet domains. Simulation studies carried out with the R software show the performance of these estimators. As an illustration, we propose an application of our methodology to hydraulic data. In the third chapter, we extend the study of the previous statistical process to a functional framework. We thus propose a stochastic process depending on a positive functional, which yields a large class of estimators of the extreme value index, each estimator being a margin of a single stochastic process. Our theoretical study of these stochastic processes is based on the modern theory of functional weak convergence, which makes it possible to handle more complex estimators in the form of stochastic processes. We give the functional asymptotic distributions of these processes and show that, for certain classes of functions, the asymptotic behavior is non-Gaussian and is fully characterized. In the fourth chapter, we are interested in estimating the second order parameter. This parameter plays a very important role in the adaptive choice of the optimal number of extreme values used when estimating the extreme value index. The estimation of this parameter is also used to reduce the bias of tail index estimators and has received much attention in the extreme value literature. We propose a simple and general approach to estimate the second order parameter, which brings together a large number of estimators. It is shown that the previously mentioned estimators can be seen as particular cases of our approach. We also take advantage of our formalism to propose new asymptotically Gaussian estimators of the second order parameter. Finally, some estimators are compared both asymptotically and in terms of finite-sample performance. As an illustration, we propose an application to insurance data. In the last chapter, we are interested in actuarial risk measures for phenomena capable of generating very large financial losses (extreme phenomena, that is, risks for which it is not known whether the insurance system will be able to bear them). Many risk measures or premium calculation principles have been proposed in the actuarial literature. We focus on the risk-adjusted premium. Jones and Zitikis (2003) gave an estimator of the latter based on the empirical distribution and established its asymptotic normality under suitable conditions, which are often not fulfilled in the case of heavy-tailed distributions. We therefore consider this setting and a family of estimators of the risk-adjusted premium based on the extreme value theory approach. We establish their asymptotic normality and also propose a bias reduction approach for these estimators. Simulation studies allow us to assess the quality of our estimators. As an illustration, we propose an application to insurance data.
We are concerned in this paper with the functional asymptotic behavior of the sequence of stochastic processes $$T_{n}(f)=\sum_{j=1}^{k}f(j)\left( \log X_{n-j+1,n}-\log X_{n-j,n}\right),\eqno(0.1)$$ indexed by some classes $\mathcal{F}$ of functions $f:\mathbb{N} \backslash \{0\} \longmapsto \mathbb{R}_{+}$ and where $k=k(n)$ satisfies \begin{equation*}1\leq k\leq n,\quad k/n\rightarrow 0\text{ as }n\rightarrow \infty .\end{equation*} This is a functional generalized Hill process that includes many new estimators of the extreme value index when $F$ is in the extreme value domain. We focus in this paper on its functional and uniform asymptotic law in the new setting of weak convergence in the space of bounded real functions. The results are next particularized for explicit examples of classes $\mathcal{F}$.
Let $X_{1},X_{2},\ldots$ be a sequence of independent copies (s.i.c) of a real random variable (r.v.) $X\geq 1$, with distribution function (df) $F(x)=\mathbb{P}(X\leq x)$, and let $X_{1,n}\leq X_{2,n} \leq \cdots \leq X_{n,n}$ be the order statistics based on the $n\geq 1$ first of these observations. The following continuous generalized Hill process \begin{equation*} T_{n}(\tau)=k^{-\tau}\sum_{j=1}^{k}j^{\tau}(\log X_{n-j+1,n}-\log X_{n-j,n}), \end{equation*} $\tau >0$, $1\leq k \leq n$, has been introduced as a continuous family of estimators of the extreme value index, and largely studied for statistical purposes with asymptotic normality results restricted to $\tau > 1/2$. We extend those results to $0<\tau\leq 1/2$ and show that asymptotic normality is still valid for $\tau=1/2$. For $0<\tau<1/2$, we get non-Gaussian asymptotic laws which are closely related to the Riemann function $\zeta(s)=\sum_{n=1}^{\infty}n^{-s}$, $s>1$.
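The definition of $T_{n}(\tau)$ can be sketched directly in code. The example below is a minimal illustration and not the authors' implementation: the function names, the simulated Pareto sample and the restriction to $k<n$ (so that $X_{n-k,n}$ exists) are assumptions of this sketch. It also computes the classical Hill estimator, since summation by parts shows that $T_{n}(1)$ coincides with it.

```python
import math
import random


def generalized_hill(sample, k, tau):
    """T_n(tau) = k^(-tau) * sum_{j=1}^{k} j^tau * (log X_{n-j+1,n} - log X_{n-j,n}).

    The sample must be strictly positive. This sketch requires k < n
    so that the order statistic X_{n-k,n} is defined.
    """
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    x = sorted(sample)  # x[i - 1] holds the order statistic X_{i,n}
    total = 0.0
    for j in range(1, k + 1):
        log_spacing = math.log(x[n - j]) - math.log(x[n - j - 1])
        total += j ** tau * log_spacing
    return total / k ** tau


def hill(sample, k):
    """Classical Hill estimator based on the k largest observations."""
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    x = sorted(sample)
    mean_log_top = sum(math.log(x[n - j]) for j in range(1, k + 1)) / k
    return mean_log_top - math.log(x[n - k - 1])


# Simulated Pareto sample with extreme value index 0.5 (illustrative choice):
# if U is uniform on (0, 1], then U ** (-0.5) has survival function x ** (-2).
random.seed(0)
data = [(1.0 - random.random()) ** -0.5 for _ in range(2000)]
for tau in (0.25, 0.5, 1.0, 2.0):
    print(tau, generalized_hill(data, k=200, tau=tau))
print("Hill:", hill(data, k=200))
```

Note that for $\tau\neq 1$ the raw value of $T_{n}(\tau)$ is not itself centred at the extreme value index and would need a normalization depending on $\tau$. The sketch only evaluates the process as defined above.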
We are concerned in this paper with the functional asymptotic behaviour of the sequence of stochastic processes $T_{n}(f)=\sum_{j=1}^{k}f(j)(\log X_{n-j+1,n}-\log X_{n-j,n})$, indexed by some classes $\mathcal{F}$ of functions $f:\mathbb{N} \backslash \{0\} \longmapsto \mathbb{R}_{+}$ and where $k=k(n)$ satisfies $1\leq k\leq n$, $k/n\rightarrow 0$ as $n\rightarrow \infty$. This is a functional generalized Hill process that includes many new estimators of the extremal index when $F$ is in the extremal domain. We focus in this paper on its functional and uniform asymptotic law in the new setting of weak convergence in the space of bounded real functions. The results are next particularized for explicit examples of classes $\mathcal{F}$.
An important parameter in extreme value theory is the extreme value index $\gamma$. It controls the first order behavior of the distribution tail. In the literature, numerous estimators of this parameter have been proposed, especially in the case of heavy tailed distributions (which is the situation considered here). The best-known estimator was proposed by [2]. It depends on the $k$ largest observations of the underlying sample. The bias of the tail index estimator is controlled by the second order parameter $\rho$. In order to reduce the bias of estimators of $\gamma$ or to select the best number $k$ of observations to use, knowledge of $\rho$ is essential. Some estimators of $\rho$ can be found in the literature, see for example [1, 2, 3]. We propose a semiparametric family of estimators for $\rho$ that encompasses the three previously mentioned estimators. The asymptotic normality of these estimators is then proved in a unified way. New estimators of $\rho$ are also introduced.
Let $X_{1},X_{2},...$ be a sequence of independent copies (s.i.c) of a real random variable (r.v.) $X\geq 1$, with distribution function (df) $F(x)=\mathbb{P}(X\leq x)$, and let $X_{1,n}\leq X_{2,n} \leq ... \leq X_{n,n}$ be the order statistics based on the $n\geq 1$ first of these observations. The following continuous generalized Hill process \begin{equation*} T_{n}(\tau)=k^{-\tau}\sum_{j=1}^{j=k}j^{\tau}(\log X_{n-j+1,n}-\log X_{n-j,n}), \end{equation*} $\tau >0$, $1\leq k \leq n$, has been introduced as a continuous family of estimators of the extreme value index, and largely studied for statistical purposes with asymptotic normality results restricted to $\tau > 1/2$. We extend those results to $0 < \tau \leq 1/2$ and show that asymptotic normality is still valid for $\tau=1/2$. For $0 < \tau <1/2$, we get non Gaussian asymptotic laws which are closely related to the Riemann function $\zeta(s)=\sum_{n=1}^{\infty} n^{-s}$, $s>1$.
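The generalized Hill process $T_{n}(\tau)$ defined above can be evaluated directly from a sample. The following is a minimal sketch, not the authors' implementation: the function names `generalized_hill` and `hill` are illustrative, and the code assumes $1\leq k<n$ so that $X_{n-k,n}$ exists. For $\tau=1$ the weighted sum of log-spacings telescopes to the classical Hill estimator, which the sketch uses as a sanity check.

```python
import numpy as np


def generalized_hill(sample, k, tau):
    """Evaluate T_n(tau) = k^(-tau) * sum_{j=1}^{k} j^tau * (log X_{n-j+1,n} - log X_{n-j,n}).

    Assumption (not from the abstract): 1 <= k < n, so that X_{n-k,n} is defined.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    if tau <= 0:
        raise ValueError("tau must be positive")
    log_x = np.log(x)
    j = np.arange(1, k + 1)
    # With 0-based indexing, X_{n-j+1,n} is x[n-j] and X_{n-j,n} is x[n-j-1].
    spacings = log_x[n - j] - log_x[n - j - 1]
    return float(k ** (-float(tau)) * np.sum(j ** float(tau) * spacings))


def hill(sample, k):
    """Classical Hill estimator based on the k largest observations."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    return float(np.mean(np.log(x[n - k:])) - np.log(x[n - k - 1]))


# Illustrative data: a Pareto-type sample with values >= 1.
rng = np.random.default_rng(0)
demo_sample = rng.pareto(2.0, size=500) + 1.0
demo_tau_one = generalized_hill(demo_sample, k=50, tau=1.0)
demo_hill = hill(demo_sample, k=50)
```

Here `demo_tau_one` and `demo_hill` agree up to floating-point error, while other values of `tau` reweight the log-spacings.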
We introduce a new Brownian bridge approximation to weighted empirical and quantile processes with rates in probability. This approximation leads to a number of general invariance theorems for empirical and quantile processes indexed by functions. Improved versions of the Chibisov-O'Reilly theorems, the Eicker-Jaeschke theorems for standardized empirical and quantile processes, the normal convergence criterion, and various other old and new asymptotic results on empirical and quantile processes are presented as consequences of our general theorems. In the process, we provide a new characterization of Erdős-Feller-Kolmogorov-Petrovski upper-class functions for the Brownian motion in an improved form.
We introduce a new estimate of the exponent of a distribution whose tail varies regularly at infinity. This estimate is expressed as the convolution of a kernel with the logarithm of the quantile function, and includes as particular cases the estimates introduced by Hill and by De Haan. Under very weak conditions, we prove asymptotic normality and consistency, and discuss the optimal choices of the kernel and of the bandwidth parameter.
We extend Hill's well-known estimator for the index of a distribution function with regularly varying tail to an estimate for the index of an extreme-value distribution. Consistency and asymptotic normality are proved. The estimator is used for high quantile and endpoint estimation.
A simple general approach to inference about the tail behavior of a distribution is proposed. It is not required to assume any global form for the distribution function, but merely the form of behavior in the tail where it is desired to draw inference. Results are particularly simple for distributions of the Zipf type, i.e., where $G(y) = 1 - Cy^{-\alpha}$ for large $y$. The methods of inference are based upon an evaluation of the conditional likelihood for the parameters describing the tail behavior, given the values of the extreme order statistics, and can be implemented from both Bayesian and frequentist viewpoints.
Abstract. Let $X_j$ denote the life length of the $j$th component of a machine. In reliability theory, one is interested in the life length $Z_n$ of the machine, where $n$ signifies its number of components. Evidently, $Z_n = \min(X_j : 1 \leq j \leq n)$. Another important problem, which is extensively discussed in the literature, is the service time $W_n$ of a machine with $n$ components. If $Y_j$ is the time period required for servicing the $j$th component, then $W_n = \max(Y_j : 1 \leq j \leq n)$. In the early investigations, it was usually assumed that the $X$'s or $Y$'s are stochastically independent and identically distributed random variables. If $n$ is large, then asymptotic theory is used for describing $Z_n$ or $W_n$. Classical theory thus gives that the (asymptotic) distribution of these extremes ($Z_n$ or $W_n$) is of Weibull type. While the independence assumptions are practically never satisfied, data usually fits well the assumed Weibull distribution. This contradictory situation leads to the following mathematical problems: (i) What type of dependence property of the $X$'s (or the $Y$'s) will result in a Weibull distribution as the asymptotic law of $Z_n$ (or $W_n$)? (ii) Given the dependence structure of the $X$'s (or $Y$'s), what type of new asymptotic laws can be obtained for $Z_n$ (or $W_n$)? The aim of the present paper is to analyze the recent development of the (mathematical) theory of the asymptotic distribution of extremes in the light of the questions (i) and (ii). Several dependence concepts will be introduced, each of which leads to a solution of (i). In regard to (ii), the following result holds: the class of limit laws of extremes for exchangeable variables is identical to the class of limit laws of extremes for arbitrary random variables. One can therefore limit attention to exchangeable variables.
The basic references to this paper are the author's recent papers in Duke Math. J. 40 (1973), 581–586, J. Appl. Probability 10 (1973), 122–129 and 11 (1974), 219–222 and Zeitschrift für Wahrscheinlichkeitstheorie 32 (1975), 197–207. For multivariate extensions see H. A. David and the author, J. Appl. Probability 11 (1974), 762–770 and the author's paper in J. Amer. Statist. Assoc. 70 (1975), 674–680. Finally, we shall point out the difficulty of distinguishing between several distributions based on data. Hence, only a combination of theoretical results and experimentation can be used as conclusive evidence on the laws governing the behavior of extremes.
We establish uniform-in-bandwidth consistency for kernel-type estimators of the differential entropy. We consider two kernel-type estimators of Shannon's entropy. As a consequence, an asymptotic 100% confidence interval of entropy is provided.
Summary. A simple asymptotic estimate is constructed for the index of a stable distribution based on order statistics from a distribution in its domain of attraction. The asymptotic distribution of the estimate is then found in case the order statistics are taken from the stable distribution itself.
This note discusses some aspects of the estimation of the density function of a univariate probability distribution. All estimates of the density function satisfying relatively mild conditions are shown to be biased. The asymptotic mean square error of a particular class of estimates is evaluated.
In information systems research, a question of particular interest is to interpret and to predict the probability that a firm adopts a new technology, so that market promotions are targeted only at those firms that are more likely to adopt the technology. Typically, there exists a significant difference between the observed number of "adopters" and "nonadopters," which is usually coded as a binary response. A critical issue involved in modeling such binary response data is the appropriate choice of link functions in a regression model. In this paper we introduce a new flexible skewed link function for modeling binary response data based on the generalized extreme value (GEV) distribution. We show how the proposed GEV links provide more flexible and improved skewed link regression models than the existing skewed links, especially when dealing with imbalance between the observed number of 0's and 1's in a data set. The flexibility of the proposed model is illustrated through simulated data sets and a billing data set of the electronic payments system adoption from a Fortune 100 company in 2005.
Abstract. We consider an estimation problem when only the k largest observations of a sample of size n are available. It is assumed that the underlying distribution function F belongs to the domain of attraction of a known extreme-value distribution and that k remains fixed as n → ∞. We present estimators for the location and scale parameters and for p-quantiles of F, where p is of the form 1 − c/n (c fixed). These estimators are either asymptotically maximum likelihood or minimum variance.
We establish uniform limit laws for kernel density estimators with minimal assumptions upon the kernel and the density.
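For readers unfamiliar with the object studied in the kernel-density entries of this list, the usual univariate kernel density estimator is $f_n(x) = (nh)^{-1}\sum_{i=1}^{n} K((x - X_i)/h)$. The sketch below is only an illustration under assumptions that are not taken from any of the abstracts: a Gaussian kernel, a fixed bandwidth `bandwidth`, and the hypothetical function name `gaussian_kde_estimate`.

```python
import numpy as np


def gaussian_kde_estimate(sample, grid, bandwidth):
    """Evaluate f_n(x) = (1 / (n h)) * sum_i K((x - X_i) / h) on a grid.

    Assumption: K is the standard Gaussian density and h = bandwidth is fixed.
    """
    sample = np.asarray(sample, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Rows index grid points, columns index observations.
    u = (grid[:, None] - sample[None, :]) / bandwidth
    kernel_values = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    return kernel_values.mean(axis=1) / bandwidth


# Illustrative data: standard normal observations.
rng = np.random.default_rng(1)
demo_data = rng.normal(size=200)
demo_grid = np.linspace(-8.0, 8.0, 2001)
demo_density = gaussian_kde_estimate(demo_data, demo_grid, bandwidth=0.4)
# Riemann-sum approximation of the total mass of the estimate.
demo_mass = float(np.sum(demo_density) * (demo_grid[1] - demo_grid[0]))
```

The estimate is nonnegative and integrates to approximately one on a grid that covers the data; the uniform limit laws in the abstract concern how the supremum of its error behaves as the bandwidth shrinks.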
The generalized Poisson regression model has been used to model dispersed count data. It is a good competitor to the negative binomial regression model when the count data is over-dispersed. Zero-inflated Poisson and zero-inflated negative binomial regression models have been proposed for situations where the data generating process results in too many zeros. In this paper, we propose a zero-inflated generalized Poisson (ZIGP) regression model to model domestic violence data with too many zeros. Estimation of the model parameters using the method of maximum likelihood is provided. A score test is presented to test whether the number of zeros is too large for the generalized Poisson model to adequately fit the domestic violence data.
Kumaraswamy [Generalized probability density-function for double-bounded random-processes, J. Hydrol. 462 (1980), pp. 79–88] introduced a distribution for double-bounded random processes with hydrological applications. For the first time, based on this distribution, we describe a new family of generalized distributions (denoted with the prefix 'Kw') to extend the normal, Weibull, gamma, Gumbel, inverse Gaussian distributions, among several well-known distributions. Some special distributions in the new family such as the Kw-normal, Kw-Weibull, Kw-gamma, Kw-Gumbel and Kw-inverse Gaussian distribution are discussed. We express the ordinary moments of any Kw generalized distribution as linear functions of probability weighted moments (PWMs) of the parent distribution. We also obtain the ordinary moments of order statistics as functions of PWMs of the baseline distribution. We use the method of maximum likelihood to fit the distributions in the new class and illustrate the potentiality of the new model with an application to real data.
Classical extreme value index estimators are known to be quite sensitive to the number k of top order statistics used in the estimation. The recently developed second order reduced-bias estimators show much less sensitivity to changes in k. Here, we are interested in the improvement of the performance of reduced-bias extreme value index estimators based on an exponential second order regression model applied to the scaled log-spacings of the top k order statistics. In order to achieve that improvement, the estimation of a “scale” and a “shape” second order parameters in the bias is performed at a level k1 of a larger order than that of the level k at which we compute the extreme value index estimators. This enables us to keep the asymptotic variance of the new estimators of a positive extreme value index γ equal to the asymptotic variance of the Hill estimator, the maximum likelihood estimator of γ, under a strict Pareto model. These new estimators are then alternatives to the classical estimators, not only around optimal and/or large levels k, but for other levels too. To enhance the interesting performance of this type of estimators, we also consider the estimation of the “scale” second order parameter only, at the same level k used for the extreme value index estimation. The asymptotic distributional properties of the proposed class of γ-estimators are derived and the estimators are compared with other similar alternative estimators of γ recently introduced in the literature, not only asymptotically, but also for finite samples through Monte Carlo techniques. Case-studies in the fields of finance and insurance will illustrate the performance of the new second order reduced-bias extreme value index estimators.
Abstract. A simple statistic is proposed to test the hypothesis that a sample comes from a distribution in the domain of attraction of the Gumbel distribution. It is based on the top k order statistics and is a generalization of the Shapiro–Wilk goodness-of-fit statistic. The critical region of the test and its power against the alternative that the sample comes from a distribution in another domain of attraction are studied theoretically and by simulation. The power turns out to be superior to that of other tests previously proposed.
SUMMARY. We discuss the analysis of the extremes of data by modelling the sizes and occurrence of exceedances over high thresholds. The natural distribution for such exceedances, the generalized Pareto distribution, is described and its properties elucidated. Estimation and model-checking procedures for univariate and regression data are developed, and the influence of and information contained in the most extreme observations in a sample are studied. Models for seasonality and serial dependence in the point process of exceedances are described. Sets of data on river flows and wave heights are discussed, and an application to the siting of nuclear installations is described.
This paper shows how to use realised kernels to carry out efficient feasible inference on the expost variation of underlying equity prices in the presence of simple models of market frictions. The issue is subtle with only estimators which have symmetric weights delivering consistent estimators with mixed Gaussian limit theorems. The weights can be chosen to achieve the best possible rate of convergence and to have an asymptotic variance which is close to that of the maximum likelihood estimator in the parametric version of this problem. Realised kernels can also be selected to (i) be analysed using endogenously spaced data such as that in databases on transactions, (ii) allow for market frictions which are endogenous, (iii) allow for temporally dependent noise. The finite sample performance of our estimators is studied using simulation, while empirical work illustrates their use in practice.
Abstract. In this note we characterize those sequences $k_n$ such that the Hill estimator of the tail index based on the $k_n$ upper order statistics of a sample of size $n$ from a Pareto-type distribution is strongly consistent.
Concrete decision rules are given for the problem of goodness of fit and the problem of two samples with a risk smaller than any preassigned value. The problem of estimation is also treated.
In a 1992 Technometrics paper, Lambert (1992, 34, 1-14) described zero-inflated Poisson (ZIP) regression, a class of models for count data with excess zeros. In a ZIP model, a count response variable is assumed to be distributed as a mixture of a Poisson(lambda) distribution and a distribution with point mass of one at zero, with mixing probability p. Both p and lambda are allowed to depend on covariates through canonical link generalized linear models. In this paper, we adapt Lambert's methodology to an upper bounded count situation, thereby obtaining a zero-inflated binomial (ZIB) model. In addition, we add to the flexibility of these fixed effects models by incorporating random effects so that, e.g., the within-subject correlation and between-subject heterogeneity typical of repeated measures data can be accommodated. We motivate, develop, and illustrate the methods described here with an example from horticulture, where both upper bounded count (binomial-type) and unbounded count (Poisson-type) data with excess zeros were collected in a repeated measures designed experiment.
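The ZIP mixture described in this abstract has a simple probability mass function, sketched below without covariates. The function name `zip_pmf` is illustrative, and the code assumes that p is the weight of the point mass at zero (so the Poisson component has weight 1 - p); the abstract itself only says "mixing probability p".

```python
import math


def zip_pmf(y, lam, p):
    """P(Y = y) for a zero-inflated Poisson variable.

    Assumption: p is the weight of the point mass at zero, so
    P(Y = 0) = p + (1 - p) * exp(-lam) and, for y >= 1,
    P(Y = y) = (1 - p) * exp(-lam) * lam**y / y!.
    """
    if y < 0 or int(y) != y:
        raise ValueError("y must be a nonnegative integer")
    if lam <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need lam > 0 and 0 <= p <= 1")
    y = int(y)
    poisson_part = math.exp(-lam) * lam ** y / math.factorial(y)
    point_mass = p if y == 0 else 0.0
    return point_mass + (1.0 - p) * poisson_part


# Illustrative parameter values, not taken from the paper.
demo_zero_prob = zip_pmf(0, lam=2.0, p=0.3)
demo_total = sum(zip_pmf(y, lam=2.0, p=0.3) for y in range(60))
```

In the regression setting of the abstract, `lam` and `p` would each be functions of covariates through a link function; the sketch holds them fixed.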
Abstract. Given a sequence of non-negative independent and identically distributed random variables, we determine conditions on the common distribution such that the sum of appropriately normalized and centred upper $k_n$ extreme values based on the first $n$ random variables converges in distribution to a normal random variable, where $k_n \to \infty$ and $k_n/n \to 0$ as $n \to \infty$. The probabilistic problem is motivated by recent statistical work on the estimation of the exponent of a regularly varying distribution function. Our main tool is a new Brownian bridge approximation to the uniform empirical and quantile processes in weighted supremum norms.
Logistic regression is widely used in medical studies to investigate the relationship between a binary response variable Y and a set of potential predictors X. The binary response may represent, for example, the occurrence of some outcome of interest (Y=1 if the outcome occurred and Y=0 otherwise). In this paper, we consider the problem of estimating the logistic regression model with a cure fraction. A sample of observations is said to contain a cure fraction when a proportion of the study subjects (the so-called cured individuals, as opposed to the susceptibles) cannot experience the outcome of interest. One problem arising then is that it is usually unknown who are the cured and the susceptible subjects, unless the outcome of interest has been observed. In this setting, a logistic regression analysis of the relationship between X and Y among the susceptibles is no longer straightforward. We develop a maximum likelihood estimation procedure for this problem, based on the joint modeling of the binary response of interest and the cure status. We investigate the identifiability of the resulting model. Then, we establish the consistency and asymptotic normality of the proposed estimator, and we conduct a simulation study to investigate its finite-sample behavior.
Let $f_n$ denote the usual kernel density estimator in several dimensions. It is shown that if $\{a_n\}$ is a regular band sequence, $K$ is a bounded square integrable kernel of several variables, satisfying some additional mild conditions ($(K_1)$ below), and if the data consist of an i.i.d. sample from a distribution possessing a bounded density $f$ with respect to Lebesgue measure on $\mathbb{R}^d$, then lim sup […] for some absolute constant $C$ that depends only on $d$. With some additional but still weak conditions, it is proved that the above sequence of normalized suprema converges a.s. to […]. Convergence of the moment generating functions is also proved. Neither of these results require $f$ to be strictly positive. These results improve upon, and extend to several dimensions, results by Silverman [13] for univariate densities.
ABSTRACT. Applications of zero‐inflated count data models have proliferated in health economics. However, zero‐inflated Poisson or zero‐inflated negative binomial maximum likelihood estimators are not robust to misspecification. This article proposes Poisson quasi‐likelihood estimators as an alternative. These estimators are consistent in the presence of excess zeros without having to specify the full distribution. The advantages of the Poisson quasi‐likelihood approach are illustrated in a series of Monte Carlo simulations and in an application to the demand for health services. Copyright © 2012 John Wiley & Sons, Ltd.
Abstract. We consider large sample inference in a semiparametric logistic/proportional-hazards mixture model. This model has been proposed to model survival data where there exists a positive portion of subjects in the population who are not susceptible to the event under consideration. Previous studies of the logistic/proportional-hazards mixture model have focused on developing point estimation procedures for the unknown parameters. This paper studies large sample inferences based on the semiparametric maximum likelihood estimator. Specifically, we establish existence, consistency and asymptotic normality results for the semiparametric maximum likelihood estimator. We also derive consistent variance estimates for both the parametric and non-parametric components. The results provide a theoretical foundation for making large sample inference under the logistic/proportional-hazards mixture model.
The extreme-value index is an important parameter in extreme-value theory since it controls the first order behavior of the distribution tail. In the literature, numerous estimators of this parameter have been proposed, especially in the case of heavy-tailed distributions, which is the situation considered here. Most of these estimators depend on the k largest observations of the underlying sample. Their bias is controlled by the second order parameter. In order to reduce the bias of extreme-value index estimators or to select the best number k of observations to use, knowledge of the second order parameter is essential. In this paper, we propose a simple approach to estimate the second order parameter leading to both existing and new estimators. We establish a general result that can be used to easily prove the asymptotic normality of a large number of estimators proposed in the literature or to compare different estimators within a given family. Some illustrations on simulations are also provided.
We prove functional laws of the iterated logarithm for empirical processes based upon censored data in the neighborhood of a fixed point. We apply these results to obtain strong laws for estimators of local functionals of the lifetime distribution. In particular, we describe the pointwise strong limiting behavior of the kernel density estimator based upon the Kaplan-Meier product-limit estimator.
We introduce a general method to prove uniform in bandwidth consistency of kernel-type function estimators. Examples include the kernel density estimator, the Nadaraya–Watson regression estimator and the conditional empirical process. Our results may be useful to establish uniform consistency of data-driven bandwidth kernel-type function estimators.
Abstract. In this paper, we are mainly interested in estimating the reliability R=P(X>Y) in the Marshall–Olkin extended Lomax distribution, recently proposed by Ghitany et al. [Marshall–Olkin extended Lomax distribution and its application, Commun. Statist. Theory Methods 36 (2007), pp. 1855–1866]. The model arises as a proportional odds model where the covariate effect is replaced by an additional parameter. Maximum likelihood estimators of the parameters are developed and an asymptotic confidence interval for R is obtained. Extensive simulation studies are carried out to investigate the performance of these intervals. Using real data we illustrate the procedure. Keywords: proportional odds model; asymptotic confidence interval; strength–stress model; simulation studies. Acknowledgements: The authors are thankful to the referee for making some useful comments in enhancing the presentation.
We propose new nonparametric, consistent Rényi-α and Tsallis-α divergence estimators for continuous distributions. Given two independent and identically distributed samples, a “naive” approach would be to simply estimate the underlying densities and plug the estimated densities into the corresponding formulas. Our proposed estimators, in contrast, avoid density estimation completely, estimating the divergences directly using only simple k-nearest-neighbor statistics. We are nonetheless able to prove that the estimators are consistent under certain conditions. We also describe how to apply these estimators to mutual information and demonstrate their efficiency via numerical experiments.
It is shown that Hill's estimator (1975) for the exponent of regular variation is asymptotically normal if the number $k_n$ of extreme order statistics used to construct it tends to infinity appropriately with the sample size $n.$ As our main result, we derive a general condition which can be used to determine the optimal $k_n$ explicitly, provided that some prior knowledge is available on the underlying distribution function with regularly varying upper tail. This condition is simplified under appropriate assumptions and then applied to several examples.
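To make the role of $k_n$ in the abstract above concrete, here is a minimal sketch of the classical Hill estimator, $H_{k,n} = k^{-1} \sum_{i=1}^{k} \log X_{n-i+1,n} - \log X_{n-k,n}$, which estimates $\gamma = 1/\alpha$ when the upper tail is regularly varying with index $-\alpha$. The function names `hill_estimator` and `pareto_sample` are illustrative and not taken from the paper, and the sketch does not implement the paper's condition for choosing the optimal $k_n$.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of gamma = 1/alpha based on the k largest observations."""
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(sample)            # order statistics X_{1,n} <= ... <= X_{n,n}
    threshold = xs[n - k - 1]      # X_{n-k,n}, the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the threshold X_{n-k,n} must be positive")
    top = xs[n - k:]               # the k largest observations
    # Average log-excess of the k largest values over the threshold.
    return sum(math.log(x / threshold) for x in top) / k


def pareto_sample(alpha, n, seed=0):
    """Hypothetical helper: n draws from a standard Pareto law, P(X > x) = x^(-alpha), x >= 1."""
    rng = random.Random(seed)
    # Inverse-transform sampling; 1 - U lies in (0, 1], so the power is well defined.
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    data = pareto_sample(alpha=2.0, n=5000, seed=1)
    for k in (50, 200, 500):
        print(k, hill_estimator(data, k))
```

For an exact Pareto sample the estimate fluctuates around $1/\alpha$ for every $k$; for other heavy-tailed laws the choice of $k$ trades bias against variance, which is the issue the abstract addresses.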
The two-group t test is generalized here to produce a hypothesis testing procedure on the overlapping coefficient of two normally distributed populations with common variance, assuming that the researcher knows the direction of the population means. Confidence intervals are constructed on the overlapping coefficient. An illustrative example is given using the proposed procedures.
The overlapping coefficient, defined as the common area under two probability density curves, is used as a measure of agreement between two distributions. It has recently been proposed as a measure of bioequivalence under the name proportion of similar responses. Confidence intervals for this measure have been considered for the special case of two normal distributions with equal variances. We review and compare two procedures for this confidence interval based on the non-central t- and F-distributions. Our comparison is based on both theoretical considerations and a simulation study. Data on a marker from a study of recurrence of breast cancer are used to illustrate the methodology.
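As a small illustration of the definition in the abstract above, the sketch below computes the overlapping coefficient of two normal densities with a common standard deviation in two ways: by the closed form $\mathrm{OVL} = 2\Phi(-|\mu_1 - \mu_2| / (2\sigma))$, and by numerically integrating the pointwise minimum of the two densities. It works with known population parameters only; it does not reproduce the confidence-interval procedures compared in the paper, and all function names are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def overlap_equal_variance(mu1, mu2, sigma):
    """Closed-form overlapping coefficient of N(mu1, sigma^2) and N(mu2, sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 2.0 * normal_cdf(-abs(mu1 - mu2) / (2.0 * sigma))


def overlap_numeric(pdf1, pdf2, lo, hi, steps=20000):
    """Common area under two densities on [lo, hi], by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += min(pdf1(x), pdf2(x))
    return h * total


if __name__ == "__main__":
    closed = overlap_equal_variance(0.0, 2.0, 1.0)
    numeric = overlap_numeric(
        lambda x: normal_pdf(x, 0.0, 1.0),
        lambda x: normal_pdf(x, 2.0, 1.0),
        lo=-10.0,
        hi=12.0,
    )
    print(closed, numeric)
```

The coefficient equals 1 when the two means coincide and decreases toward 0 as the standardized distance $|\mu_1 - \mu_2| / \sigma$ grows.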
Let $X_1, X_2, \cdots$ be a sequence of nonnegative i.i.d. random variables with common distribution $F$, and for each $n \geq 1$ let $X_{1n} \leq \cdots \leq X_{nn}$ denote the order statistics based on $X_1, \cdots, X_n$. Necessary and sufficient conditions are obtained for averages of the extreme values $X_{n+1-i, n}$, $i = 1, \cdots, k_n + 1$, of the form $k^{-1}_n \sum^{k_n}_{i = 1} (X_{n+1-i, n} - X_{n-k_n,n})$, where $k_n \rightarrow\infty$ and $n^{-1}k_n \rightarrow 0$, to converge in probability or almost surely to a finite positive constant. In the process, characterizations are given of the classes of distributions with regularly varying upper tails and of distributions with "exponential-like" upper tails.
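The average in the abstract above, $k_n^{-1} \sum_{i=1}^{k_n} (X_{n+1-i,n} - X_{n-k_n,n})$, can be computed directly from a sorted sample. The sketch below is only an illustration under assumed names (`mean_excess_top_k` is not from the paper). For a standard exponential sample the average should be close to 1, since by the memoryless property the excesses over the threshold behave like a fresh exponential sample.

```python
import random


def mean_excess_top_k(sample, k):
    """Average excess of the k largest observations over the threshold X_{n-k,n}."""
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(sample)            # order statistics X_{1,n} <= ... <= X_{n,n}
    threshold = xs[n - k - 1]      # X_{n-k,n}
    return sum(x - threshold for x in xs[n - k:]) / k


if __name__ == "__main__":
    rng = random.Random(2)
    data = [rng.expovariate(1.0) for _ in range(10000)]
    for k in (100, 400, 1000):
        print(k, mean_excess_top_k(data, k))
```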
The overlapping coefficient is defined as a measure of the agreement between two probability distributions. Its relationship to the dissimilarity index and its properties are described. An extensive treatment of maximum-likelihood estimation of the overlap between two normal distributions is presented as an example of estimating the overlapping coefficient from sample data.