Let $X_1, \cdots, X_n$ be $n$ independent random vectors, $X_\nu = (X^{(1)}_\nu, \cdots, X^{(r)}_\nu)$, and $\Phi(x_1, \cdots, x_m)$ a function of $m (\leq n)$ vectors $x_\nu = (x^{(1)}_\nu, \cdots, x^{(r)}_\nu)$. A statistic of the form $U = \sum'' \Phi(X_{\alpha_1}, \cdots, X_{\alpha_m})/n(n - 1) \cdots (n - m + 1)$, where the sum $\sum''$ is extended over all permutations $(\alpha_1, \cdots, \alpha_m)$ of $m$ different integers, $1 \leq \alpha_i \leq n$, is called a $U$-statistic. If $X_1, \cdots, X_n$ have the same (cumulative) distribution function (d.f.) $F(x)$, $U$ is an unbiased estimate of the population characteristic $\theta(F) = \int \cdots \int \Phi(x_1, \cdots, x_m)\, dF(x_1) \cdots dF(x_m)$. $\theta(F)$ is called a regular functional of the d.f. $F(x)$. Certain optimal properties of $U$-statistics as unbiased estimates of regular functionals have been established by Halmos [9] (cf. Section 4). The variance of a $U$-statistic as a function of the sample size $n$ and of certain population characteristics is studied in Section 5. It is shown that if $X_1, \cdots, X_n$ have the same distribution and $\Phi(x_1, \cdots, x_m)$ is independent of $n$, the d.f. of $\sqrt n(U - \theta)$ tends to a normal d.f. as $n \rightarrow \infty$ under the sole condition of the existence of $E\Phi^2(X_1, \cdots, X_m)$. Similar results hold for the joint distribution of several $U$-statistics (Theorems 7.1 and 7.2), for statistics $U'$ which, in a certain sense, are asymptotically equivalent to $U$ (Theorems 7.3 and 7.4), for certain functions of statistics $U$ or $U'$ (Theorem 7.5) and, under certain additional assumptions, for the case of the $X_\nu$'s having different distributions (Theorems 8.1 and 8.2).
Results of a similar character, though under different assumptions, are contained in a recent paper by von Mises [18] (cf. Section 7). Examples of statistics of the form $U$ or $U'$ are the moments, Fisher's $k$-statistics, Gini's mean difference, and several rank correlation statistics such as Spearman's rank correlation and the difference sign correlation (cf. Section 9). Asymptotic power functions for the non-parametric tests of independence based on these rank statistics are obtained. They show that these tests are not unbiased in the limit (Section 9f). The asymptotic distribution of the coefficient of partial difference sign correlation suggested by Kendall is also obtained (Section 9h).
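The definition of a $U$-statistic is easy to exercise numerically. The following sketch (not from the paper; the function name and data are illustrative) averages a kernel $\Phi$ over all ordered $m$-tuples of distinct indices, and applies it to two of the kernels mentioned above: Gini's mean difference, $\Phi(x_1, x_2) = |x_1 - x_2|$, and the kernel $\Phi(x_1, x_2) = (x_1 - x_2)^2/2$, whose $U$-statistic is the unbiased sample variance with divisor $n - 1$.

```python
from itertools import permutations

def u_statistic(data, phi, m):
    """Average the kernel phi over all ordered m-tuples of distinct indices:
    U = sum'' phi(X_a1, ..., X_am) / (n (n-1) ... (n-m+1))."""
    n = len(data)
    tuples = list(permutations(range(n), m))
    return sum(phi(*(data[i] for i in t)) for t in tuples) / len(tuples)

data = [1.0, 3.0, 4.0, 8.0]
# Gini's mean difference: kernel |x1 - x2|, m = 2
gini = u_statistic(data, lambda x, y: abs(x - y), 2)
# Unbiased sample variance: kernel (x1 - x2)^2 / 2, m = 2
s2 = u_statistic(data, lambda x, y: (x - y) ** 2 / 2, 2)
```

Enumerating all $n(n-1)\cdots(n-m+1)$ tuples is exponential in $m$ and only suitable for tiny samples, but it makes the unbiasedness of $U$ transparent: each tuple contributes an unbiased evaluation of $\theta(F)$.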
A test is proposed for the independence of two random variables with continuous distribution function (d.f.). The test is consistent with respect to the class $\Omega''$ of d.f.'s with continuous joint and marginal probability densities (p.d.). The test statistic $D$ depends only on the rank order of the observations. The mean and variance of $D$ are given and $\sqrt n(D - ED)$ is shown to have a normal limiting distribution for any parent distribution. In the case of independence this limiting distribution is degenerate, and $nD$ has a non-normal limiting distribution whose characteristic function and cumulants are given. The exact distribution of $D$ in the case of independence for samples of size $n = 5, 6, 7$ is tabulated. In the Appendix it is shown that there do not exist tests of independence based on ranks which are unbiased on any significance level with respect to the class $\Omega''$. It is also shown that if the parent distribution belongs to $\Omega''$ and for some $n \geq 5$ the probabilities of the $n!$ rank permutations are equal, the random variables are independent.
The central limit theorem has been extended to the case of dependent random variables by several authors (Bruns, Markoff, S. Bernstein, P. Lévy, Loève). The conditions under which these theorems are stated either are very restrictive or involve conditional distributions, which makes them difficult to apply. In the present paper we prove central limit theorems for sequences of dependent random variables of a certain special type which occurs frequently in mathematical statistics. The hypotheses do not involve conditional distributions.
Let $S$ be the number of successes in $n$ independent trials, and let $p_j$ denote the probability of success in the $j$th trial, $j = 1, 2, \cdots, n$ (Poisson trials). We consider the problem of finding the maximum and the minimum of $Eg(S),$ the expected value of a given real-valued function of $S,$ when $ES = np$ is fixed. It is well known that the maximum of the variance of $S$ is attained when $p_1 = p_2 = \cdots = p_n = p.$ This can be interpreted as showing that the variability in the number of successes is highest when the successes are equally probable (Bernoulli trials). This interpretation is further supported by the following two theorems, proved in this paper. If $b$ and $c$ are two integers, $0 \leqq b \leqq np \leqq c \leqq n,$ the probability $P(b \leqq S \leqq c)$ attains its minimum if and only if $p_1 = p_2 = \cdots = p_n = p,$ unless $b = 0$ and $c = n$ (Theorem 5, a corollary of Theorem 4, which gives the maximum and the minimum of $P(S \leqq c)$). If $g$ is a strictly convex function, $Eg(S)$ attains its maximum if and only if $p_1 = p_2 = \cdots = p_n = p$ (Theorem 3). These results are obtained with the help of two theorems concerning the extrema of the expected value of an arbitrary function $g(S)$ under the condition $ES = np.$ Theorem 1 gives necessary conditions for the maximum and the minimum of $Eg(S).$ Theorem 2 gives a partial characterization of the set of points at which an extremum is attained. Corollary 2.1 states that the maximum and the minimum are attained when $p_1, p_2, \cdots, p_n$ take on, at most, three different values, only one of which is distinct from 0 and 1. Applications of Theorems 3 and 5 to problems of estimation and testing are pointed out in Section 5.
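The conclusion of Theorem 5 can be checked numerically for a single configuration. The sketch below (illustrative, not from the paper) computes the exact distribution of $S$ by convolving the trials one at a time, then compares $P(1 \leqq S \leqq 3)$ for equal and unequal success probabilities with the same mean $np = 2$.

```python
def success_dist(ps):
    """Exact distribution of the number of successes in independent trials
    with success probabilities ps (Poisson trials), built by convolution."""
    dist = [1.0]
    for p in ps:
        new = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new[k] += q * (1 - p)      # trial fails
            new[k + 1] += q * p        # trial succeeds
        dist = new
    return dist

# Same mean ES = np = 2: Bernoulli trials vs. one unequal assignment
equal = success_dist([0.5] * 4)
unequal = success_dist([0.2, 0.4, 0.6, 0.8])
# Theorem 5 (with b = 1, c = 3): P(b <= S <= c) is minimized at equal p's
p_equal = sum(equal[1:4])      # 0.875
p_unequal = sum(unequal[1:4])  # larger than 0.875
```

Repeating the comparison for other unequal assignments with $\sum p_j = 2$ always leaves `p_equal` as the smallest value, in line with the theorem.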
The paper investigates the power of a family of nonparametric tests which includes those known as tests based on permutations of observations. Under general conditions the tests are found to be asymptotically (as the sample size tends to $\infty$) as powerful as certain related standard parametric tests. The results are based on a study of the convergence in probability of certain random distribution functions. A more detailed summary will be found at the end of the Introduction.
Tests of simple and composite hypotheses for multinomial distributions are considered. It is assumed that the size $\alpha_N$ of a test tends to 0 as the sample size $N$ increases. The main concern of this paper is to substantiate the following proposition: If a given test of size $\alpha_N$ is "sufficiently different" from a likelihood ratio test then there is a likelihood ratio test of size $\leqq\alpha_N$ which is considerably more powerful than the given test at "most" points in the set of alternatives when $N$ is large enough, provided that $\alpha_N \rightarrow 0$ at a suitable rate. In particular, it is shown that chi-square tests of simple and of some composite hypotheses are inferior, in the sense described, to the corresponding likelihood ratio tests. Certain Bayes tests are shown to share the above-mentioned property of a likelihood ratio test.
Let $(Y_{n1}, \cdots, Y_{nn})$ be a random vector which takes on the $n!$ permutations of $(1, \cdots, n)$ with equal probabilities. Let $c_n(i, j), i,j = 1, \cdots, n,$ be $n^2$ real numbers. Sufficient conditions for the asymptotic normality of $S_n = \sum^n_{i=1} c_n(i, Y_{ni})$ are given (Theorem 3). For the special case $c_n(i,j) = a_n(i)b_n(j)$ a stronger version of a theorem of Wald, Wolfowitz and Noether is obtained (Theorem 4). A condition of Noether is simplified (Theorem 1).
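The mean of $S_n$ has a simple closed form: since each $Y_{ni}$ is marginally uniform on $\{1, \cdots, n\}$, $ES_n = n^{-1}\sum_{i,j} c_n(i,j)$. A small sketch (illustrative names and data, not from the paper) verifies this by brute-force enumeration over all permutations:

```python
from itertools import permutations

def mean_Sn(c):
    """Exact mean of S_n = sum_i c(i, Y_i) over all equally likely
    permutations Y of (0, ..., n-1), by enumeration."""
    n = len(c)
    perms = list(permutations(range(n)))
    return sum(sum(c[i][y[i]] for i in range(n)) for y in perms) / len(perms)

c = [[1, 2, 0],
     [0, 3, 1],
     [2, 2, 4]]
exact = mean_Sn(c)
# Each Y_i is marginally uniform, so E S_n = (1/n) * sum of all entries of c
closed_form = sum(map(sum, c)) / len(c)
```

The variance has no such one-line form (the $Y_{ni}$ are dependent), which is where the conditions of Theorems 1-4 come in.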
Let $X_1, X_2, \cdots, X_n$ be independent with a common distribution function $F(x)$ which has a finite mean, and let $Z_{n1} \leqq Z_{n2} \leqq \cdots \leqq Z_{nn}$ be the ordered values of $X_1, \cdots, X_n$. The distribution of the $n$ values $EZ_{n1}, \cdots, EZ_{nn}$ on the real line is studied for large $n$. In particular, it is shown that as $n \rightarrow \infty$, the corresponding distribution function converges to $F(x)$ and any moment of that distribution converges to the corresponding moment of $F(x)$ if the latter exists. The distribution of the values $Ef(Z_{nm})$ for certain functions $f(x)$ is also considered.
Sections 1-6 are concerned with lower bounds for the expected sample size, $E_0(N)$, of an arbitrary sequential test whose error probabilities at two parameter points, $\theta_1$ and $\theta_2$, do not exceed given numbers, $\alpha_1$ and $\alpha_2$, where $E_0(N)$ is evaluated at a third parameter point, $\theta_0$. The bounds in (1.3) and (1.4) are shown to be attainable or nearly attainable in certain cases where $\theta_0$ lies between $\theta_1$ and $\theta_2$. In Section 7 lower bounds for the average risk of a general sequential procedure are obtained. In Section 8 these bounds are used to derive further lower bounds for $E_0(N)$ which in general are better than (1.3).
Hajek (1968) proved that under weak conditions the distribution of a simple linear rank statistic $S$ is asymptotically normal, centered at the mean $ES$. He left open the question whether under the same conditions the centering constant $ES$ may be replaced by a simpler constant $\mu$, as was found to be true in the two-sample case and under different conditions by Chernoff and Savage (1958) and Govindarajulu, LeCam and Raghavachari (1966). In this paper it is shown that the replacement of $ES$ by $\mu$ is permissible if one of Hajek's conditions is slightly strengthened.
Procedures are exhibited and analyzed for converting a sequence of i.i.d. Bernoulli variables with unknown mean $p$ into a Bernoulli variable with mean $\frac{1}{2}$. The efficiency of several procedures is studied.
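The simplest such conversion, usually attributed to von Neumann, draws the biased bits in pairs and keeps only unequal pairs; since $(1,0)$ and $(0,1)$ each occur with probability $p(1-p)$, the output is exactly fair for any $p \in (0,1)$. A minimal sketch (illustrative, not the paper's procedure):

```python
def fair_bit(next_bit):
    """Von Neumann's pairing procedure: draw biased bits in pairs, discard
    equal pairs, and return the first bit of the first unequal pair.
    (1,0) and (0,1) are equally likely, so the output has mean 1/2."""
    while True:
        a, b = next_bit(), next_bit()
        if a != b:
            return a
```

The procedure consumes $1/(p(1-p))$ input bits per output bit on average, which is what motivates the search for more efficient procedures in the paper.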
Let $\xi = \eta + \zeta$, where $\eta$ and $\zeta$ are independent random variables, $\eta$ has the probability density (7) and ${\bf E}\exp ({\zeta / 2}) = K < \infty $. It is shown that formula (10) is true if $m \geqq 1$, or if $0 < m < 1$ and condition (11), which is implied by (12), is satisfied. If ${\bf P}\{ {\zeta < 0} \} = 0$, inequality (13) holds for $m \geqq 1$. Formula (14) is true if conditions (15) and (in the case $r > m - 1$) (16) are satisfied. An application to the random variable (1), a weighted sum of independent $\chi ^2 $ random variables, implies a result of V. M. Zolotarev [1].
The efficiency of a family of tests is defined. Methods for evaluating the efficiency are discussed. The asymptotic efficiency is obtained for certain families of tests under assumptions which imply that the sample size is large.
Consider estimating the value of a real-valued function $f(p)$, $p = (p_0, p_1, \cdots, p_r)$, on the basis of an observation of the random vector $X = (X_0, X_1, \cdots, X_r)$ whose distribution is multinomial $(n, p)$. It is known that an unbiased estimator exists if and only if $f$ is a polynomial of degree at most $n$, in which case the unbiased estimator of $f(p)$ is unique. In general, however, this estimator has the serious fault of not being range preserving; that is, its value may fall outside the range of $f(p)$. In this article, a condition on $f$ is derived that is necessary for the unbiased estimator to be range preserving and that is sufficient when $n$ is large enough.
Let $\mathscr{P}$ be a family of distributions on a measurable space such that $(\dagger) \int u_i dP = c_i, i = 1, \cdots, k$, for all $P\in\mathscr{P}$, and which is sufficiently rich; for example, $\mathscr{P}$ consists of all distributions dominated by a $\sigma$-finite measure and satisfying $(\dagger)$. It is known that when conditions $(\dagger)$ are not present, no nontrivial symmetric unbiased estimator of zero (s.u.e.z.) based on a random sample of any size $n$ exists. Here it is shown that (I) if $g(x_1, \cdots, x_n)$ is a s.u.e.z. then there exist symmetric functions $h_i(x_1, \cdots, x_{n - 1}), i = 1, \cdots, k$, such that $g(x_1, \cdots, x_n) = \sum^k_{i = 1} \sum^n_{j = 1} \{u_i(x_j) - c_i\}h_i(x_1, \cdots, x_{j - 1}, x_{j + 1}, \cdots, x_n);$ and (II) if every nontrivial linear combination of $u_1, \cdots, u_k$ is unbounded then no bounded nontrivial s.u.e.z. exists. Applications to unbiased estimation and similar tests are discussed.
Let $F(P)$ be a real-valued function defined on a subset $\mathscr{D}$ of the set $\mathscr{D}^\ast$ of all probability distributions on the real line. A function $f$ of $n$ real variables is an unbiased estimate of $F$ if for every system, $X_1, \cdots, X_n$, of independent random variables with the common distribution $P$, the expectation of $f(X_1, \cdots, X_n)$ exists and equals $F(P)$, for all $P$ in $\mathscr{D}$. A necessary and sufficient condition for the existence of an unbiased estimate is given (Theorem 1), and the way in which this condition applies to the moments of a distribution is described (Theorem 2). Under the assumptions that this condition is satisfied and that $\mathscr{D}$ contains all purely discontinuous distributions it is shown that there is a unique symmetric unbiased estimate (Theorem 3); the most general (non-symmetric) unbiased estimates are described (Theorem 4); and it is proved that among them the symmetric one is best in the sense of having the least variance (Theorem 5). Thus the classical estimates of the mean and the variance are justified from a new point of view, and also, from the theory, computable estimates of all higher moments are easily derived. It is interesting to note that for $n$ greater than 3 neither the sample $n$th moment about the sample mean nor any constant multiple thereof is an unbiased estimate of the $n$th moment about the mean. Attention is called to a paradoxical situation arising in estimating such non-linear functions as the square of the first moment.
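The classical variance estimate with divisor $n - 1$ is the symmetric unbiased estimate in question, and its unbiasedness can be verified exactly for a small discrete distribution by enumerating all samples. A sketch (illustrative names and data, not from the paper):

```python
from itertools import product

def expected_variance_estimate(support, probs, n):
    """Exact expectation of s^2 = sum (x_i - xbar)^2 / (n - 1) over an
    i.i.d. sample of size n from a discrete distribution, by enumeration."""
    total = 0.0
    for sample in product(range(len(support)), repeat=n):
        p = 1.0
        for i in sample:
            p *= probs[i]
        xs = [support[i] for i in sample]
        xbar = sum(xs) / n
        total += p * sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return total

support, probs = [0.0, 1.0, 3.0], [0.5, 0.3, 0.2]
mean = sum(x * p for x, p in zip(support, probs))                 # 0.9
pop_var = sum(p * (x - mean) ** 2 for x, p in zip(support, probs))  # 1.29
# expected_variance_estimate(support, probs, 3) equals pop_var exactly
```

By contrast, repeating the computation with divisor $n$ gives $(n-1)/n$ times the population variance, illustrating why no constant multiple of the biased form works uniformly for higher moments either.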
Results of Chernoff-Savage (1958) and Govindarajulu-LeCam-Raghavachari (1966) are extended from the two-sample case to the general regression case and, simultaneously, the conditions on the scores-generating function are relaxed. The main results are stated in Section 2 and their proofs are given in Section 5. Sections 3 and 4 contain auxiliary propositions, on which the methods of the present paper are based. Section 6 includes a counterexample, showing that the theorem cannot be extended to discontinuous scores-generating functions.
For two types of non-parametric hypotheses optimum tests are derived against certain classes of alternatives. The two kinds of hypotheses are related and may be illustrated by the following example: (1) The joint distribution of the variables $X_1, \cdots, X_m, Y_1, \cdots, Y_n$ is invariant under all permutations of the variables; (2) the variables are independently and identically distributed. It is shown that the theory of optimum tests for hypotheses of the first kind is the same as that of optimum similar tests for hypotheses of the second kind. Most powerful tests are obtained against arbitrary simple alternatives, and in a number of important cases most stringent tests are derived against certain composite alternatives. For the example (1), if the distributions are restricted to probability densities, Pitman's test based on $\bar y - \bar x$ is most powerful against the alternatives that the $X$'s and $Y$'s are independently normally distributed with common variance, and that $E(X_i) = \xi, E(Y_i) = \eta$ where $\eta > \xi$. If $\eta - \xi$ may be positive or negative the test based on $|\bar y - \bar x|$ is most stringent. The definitions are sufficiently general that the theory applies to both continuous and discrete problems, and that tied observations present no difficulties. It is shown that continuous and discrete problems may be combined. Pitman's test for example, when applied to certain discrete problems, coincides with Fisher's exact test, and when $m = n$ the test based on $|\bar y - \bar x|$ is most stringent for hypothesis (1) against a broad class of alternatives which includes both discrete and absolutely continuous distributions.
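Pitman's test admits a compact exact implementation: under hypothesis (1), every split of the pooled sample into groups of sizes $m$ and $n$ is equally likely, so the one-sided $p$-value is the fraction of splits whose $\bar y - \bar x$ is at least the observed value. A sketch (illustrative, feasible only for small $m + n$):

```python
from itertools import combinations

def permutation_test_pvalue(x, y):
    """Exact one-sided Pitman permutation test based on ybar - xbar:
    p-value = fraction of equally likely splits of the pooled sample
    into groups of sizes m and n with statistic >= the observed one."""
    pooled = x + y
    m, n = len(x), len(y)
    total = sum(pooled)
    observed = sum(y) / n - sum(x) / m
    splits = list(combinations(range(m + n), m))
    count = 0
    for idx in splits:
        xs = sum(pooled[i] for i in idx)
        stat = (total - xs) / n - xs / m
        if stat >= observed - 1e-12:   # tolerance for float ties
            count += 1
    return count / len(splits)
```

For the two-sided version one replaces the statistic by $|\bar y - \bar x|$; on 0/1 data the same enumeration reproduces Fisher's exact test, matching the remark above.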
A NEW MEASURE OF RANK CORRELATION. M. G. Kendall. Biometrika, Volume 30, Issue 1-2, June 1938, Pages 81–93. https://doi.org/10.1093/biomet/30.1-2.81
This is a straightforward continuation of Hajek (1968). We provide a further extension of the Chernoff-Savage (1958) limit theorem. The requirements concerning the scores-generating function are relaxed to a minimum: we assume that this function is a difference of two non-decreasing and square integrable functions. Thus, in contradistinction to Hajek (1968), we dropped the assumption of absolute continuity. The main results are accumulated in Section 2 without proofs. The proofs are given in Sections 4 through 7. Section 3 contains auxiliary results.
Let $X_1, \cdots, X_m$ and $Y_1, \cdots, Y_n$ be ordered observations from the absolutely continuous cumulative distribution functions $F(x)$ and $G(x)$ respectively. If $z_{Ni} = 1$ when the $i$th smallest of $N = m + n$ observations is an $X$ and $z_{Ni} = 0$ otherwise, then many nonparametric test statistics are of the form $$mT_N = \sum^N_{i = 1} E_{Ni}z_{Ni}.$$ Theorems of Wald and Wolfowitz, Noether, Hoeffding, Lehmann, Madow, and Dwass have given sufficient conditions for the asymptotic normality of $T_N$. In this paper we extend some of these results to cover more situations with $F \neq G$. In particular it is shown for all alternative hypotheses that the Fisher-Yates-Terry-Hoeffding $c_1$-statistic is asymptotically normal and the test for translation based on it is at least as efficient as the $t$-test.
The paper investigates certain asymptotic properties of the test of randomness based on the statistic $R_h = \sum^n_{i=1} x_ix_{i+h}$ proposed by Wald and Wolfowitz. It is shown that the conditions given in the original paper for asymptotic normality of $R_h$ when the null hypothesis of randomness is true can be weakened considerably. Conditions are given for the consistency of the test when under the alternative hypothesis consecutive observations are drawn independently from changing populations with continuous cumulative distribution functions. In particular a downward (upward) trend and a regular cyclical movement are considered. For the special case of a regular cyclical movement of known length the asymptotic relative efficiency of the test based on ranks with respect to the test based on original observations is found. A simple condition for the asymptotic normality of $R_h$ for ranks under the alternative hypothesis is given. This asymptotic normality is used to compare the asymptotic power of the $R_h$-test with that of the Mann $T$-test in the case of a downward trend.
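The statistic $R_h$ can be sketched in a few lines; we use the circular convention $x_{n+j} = x_j$ of the Wald-Wolfowitz serial correlation setup (the paper's exact indexing convention should be checked against the original):

```python
def serial_product_statistic(x, h):
    """R_h = sum_{i=1}^n x_i * x_{i+h}, with indices taken cyclically
    (x_{n+j} = x_j) -- an illustrative convention, not necessarily the
    paper's exact one."""
    n = len(x)
    return sum(x[i] * x[(i + h) % n] for i in range(n))
```

Applying it to the ranks of the observations rather than the observations themselves gives the rank-based version of the test discussed in the abstract.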
This chapter presents the basic concepts and results of the theory of testing statistical hypotheses. The generalized likelihood ratio tests that are discussed can be applied to testing in the presence of nuisance parameters. Besides the likelihood ratio tests, for testing in the presence of nuisance parameters one can use conditional tests. The chapter also presents the motivation for steps of the proof of the randomization principle theorem. It considers the case of a single observation, but the extension to the case of $n$ observations will be obvious. The chapter presents an approach that requires unbiasedness and explains how the theory of testing statistical hypotheses is related to the theory of confidence intervals. It reviews the major testing procedures for parameters of normal distributions and is intended as a convenient reference for users rather than an exposition of new concepts or results.
Let $X_1, X_2, \cdots, X_n$ be a sample of a one-dimensional random variable $X$; let the order statistic $T(X_1, X_2, \cdots, X_n)$ be defined in such a manner that $T(x_1, x_2, \cdots, x_n) = (x^{(1)}, x^{(2)}, \cdots, x^{(n)})$ where $x^{(1)} \leqq x^{(2)} \leqq \cdots \leqq x^{(n)}$ denote the ordered $x$'s; and let $\Omega$ be a class of one-dimensional cpf's, i.e., cumulative probability functions. The order statistic, $T$, is said to be a complete statistic with respect to the class, $\{P^{(n)} \mid P \epsilon \Omega\}$, of $n$-fold power probability distributions if $E_P^{(n)}\{h\lbrack T(X_1, \cdots, X_n)\rbrack\} = 0$ for all $P \epsilon \Omega$ implies $h\lbrack T(x_1, \cdots, x_n)\rbrack = 0$, a.e. $P^{(n)}$, for all $P \epsilon \Omega$. The class $\Omega$ is said to be symmetrically complete whenever the latter condition holds. Since the completeness of the order statistic plays an essential role in nonparametric estimation and hypothesis testing, e.g., Fraser [2] and Bell [1], it is of interest to determine those classes of cpf's for which the order statistic is complete. Many of the traditionally studied classes of cpf's on the real line are known to be symmetrically complete, e.g., all continuous cpf's ([4], pp. 131-134, 152-153); all cpf's absolutely continuous with respect to Lebesgue measure ([3], pp. 23-31); and all exponentials of a certain form ([4], pp. 131-134). The object of this note is to present a demonstration of the symmetric completeness of the class of all continuous cpf's different from that of ([4], pp. 131-134, 152-153), and to extend this and other known completeness results to probability spaces other than the real line, e.g., Fraser [2], and Lehmann and Scheffe [5], [6]. The paper is divided into four sections. Section 1 contains the introduction and summary.
In Section 2 the notation and terminology are introduced. The main theorem is presented in Section 3, and some consequences of the proof of the main theorem and known results are indicated in Section 4.
Tests of simple and composite hypotheses for multinomial distributions are considered. It is assumed that the size $\alpha_N$ of a test tends to 0 as the sample size $N$ increases. The main concern of this paper is to substantiate the following proposition: If a given test of size $\alpha_N$ is "sufficiently different" from a likelihood ratio test then there is a likelihood ratio test of size $\leqq\alpha_N$ which is considerably more powerful than the given test at "most" points in the set of alternatives when $N$ is large enough, provided that $\alpha_N \rightarrow 0$ at a suitable rate. In particular, it is shown that chi-square tests of simple and of some composite hypotheses are inferior, in the sense described, to the corresponding likelihood ratio tests. Certain Bayes tests are shown to share the above-mentioned property of a likelihood ratio test.
Let $(Y_{n1}, \cdots, Y_{nn})$ be a random vector which takes on the $n!$ permutations of $(1, \cdots, n)$ with equal probabilities. Let $c_n(i, j), i,j = 1, \cdots, n,$ be $n^2$ real numbers. Sufficient conditions for the asymptotic normality of $S_n = \sum^n_{i=1} c_n(i, Y_{ni})$ are given (Theorem 3). For the special case $c_n(i,j) = a_n(i)b_n(j)$ a stronger version of a theorem of Wald, Wolfowitz and Noether is obtained (Theorem 4). A condition of Noether is simplified (Theorem 1).
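The statistic $S_n = \sum_i c_n(i, Y_{ni})$ and its exact mean under a uniform random permutation can be sketched as follows (an illustration with our own helper names; since each $Y_{ni}$ is marginally uniform on $1, \cdots, n$, the mean is $(1/n)\sum_{i,j} c_n(i,j)$):

```python
def permutation_statistic(c, perm):
    """S_n = sum_i c[i][perm[i]], where perm is a permutation of 0..n-1
    and c is an n-by-n array of real numbers."""
    return sum(c[i][perm[i]] for i in range(len(perm)))

def exact_mean(c):
    """E[S_n] under a uniform random permutation: each perm[i] is
    marginally uniform over 0..n-1, so E[S_n] = (1/n) * sum_{i,j} c[i][j]."""
    n = len(c)
    return sum(map(sum, c)) / n

# For c_n(i, j) = a(i) * b(j) the statistic reduces to the
# Wald-Wolfowitz-Noether form mentioned in the abstract.
```

Averaging `permutation_statistic` over all $n!$ permutations reproduces `exact_mean` exactly, which makes a convenient sanity check for small $n$.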
1. Introduction. Under the non-parametric assumption that a set of observations is a sample from an absolutely continuous distribution, the order statistics are known to form a complete sufficient statistic. It is proved in this note that it suffices to have the class of uniform distributions over finite numbers of intervals or the class of uniform distributions over sets of a ring which is a basis for the σ-algebra of Borel sets. This result is derived as a particular case of that of several samples from more general distributions.
An early extension of Lindeberg's central limit theorem was Bernstein's (1939) discovery of necessary and sufficient conditions for the convergence of moments in the central limit theorem. Von Bahr (1965) made a study of some asymptotic expansions in the central limit theorem, and obtained rates of convergence for moments. However, his results do not in general imply that the moments converge. Some better rates have been obtained by Bhattacharya and Rao for moments between the second and third. In this paper we give improved rates of convergence for absolute moments between the third and fourth.
Assuming only the existence of the third absolute moment we prove that $\sup_x |P(\sigma_n^{-1} U_n \leqq x) - \Phi(x)| \leqq C\nu_3\sigma_g^{-3}n^{-\frac{1}{2}}$, where $U_n$ is a $U$-statistic. This concludes a series of investigations on the Berry-Esseen theorem for $U$-statistics by Grams and Serfling, Bickel, and Chan and Wierman.
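For reference, a $U$-statistic as in the bound above can be computed by averaging a symmetric kernel over all $m$-element subsets of the sample (for a symmetric kernel this coincides with the permutation average in the definition). A minimal sketch, with the standard variance kernel $h(x, y) = (x - y)^2/2$ as the worked example:

```python
from itertools import combinations

def u_statistic(sample, kernel, m=2):
    """U = average of the kernel over all m-element subsets of the sample.

    For a symmetric kernel this equals the average over ordered m-tuples
    of distinct indices used in the usual definition of a U-statistic.
    """
    subsets = list(combinations(sample, m))
    return sum(kernel(*s) for s in subsets) / len(subsets)

# The kernel h(x, y) = (x - y)**2 / 2 makes U the unbiased sample variance.
```

This brute-force form costs $\binom{n}{m}$ kernel evaluations; it is meant only to make the definition concrete, not as an efficient implementation.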
The existence of unbiased nonnegative definite quadratic estimates for linear combinations of variance covariance components is characterized by means of the natural parameter set in a residual model. In the presence of a quadratic subspace condition the following disjunction for nonnegative estimability is derived: either standard methods suffice, or the concepts of unbiasedness and nonnegative definiteness are incompatible. For the case of a single variance component it is shown that unbiasedness and nonnegative definiteness always entail a reduction to a trivial model in which the variance component under investigation is the sole remaining parameter. Several examples illustrate these results.
Generalized sequential probability ratio tests (hereafter abbreviated GSPRT's) for testing between two simple hypotheses have been defined in [1]. The present paper, divided into four sections, discusses certain properties of GSPRT's. In Section 1 it is shown that under certain conditions the distributions of the sample size under the two hypotheses uniquely determine a GSPRT. In the second section, the admissibility of GSPRT's is discussed, admissibility being defined in terms of the probabilities of the two types of error and the distributions of the sample size required to come to a decision; in particular, notwithstanding the result of Section 1, many GSPRT's are inadmissible. In Section 3 it is shown that, under certain monotonicity assumptions on the probability ratios, the GSPRT's are a complete class with respect to the probabilities of the two types of error and the average distribution of the sample size over a finite set of other distributions. In Section 4, finer characterizations are given of GSPRT's which minimize the expected sample size under a third distribution satisfying certain monotonicity properties relative to the other two distributions; these characterizations give monotonicity properties of the decision bounds.
In a general variance component model, nonnegative quadratic estimators of the components of variance are considered which are invariant with respect to mean value translations and have minimum bias, analogously to estimation theory of mean value parameters. Here the minimum is taken over an appropriate cone of positive semidefinite matrices, after having made a reduction by invariance. Among these estimators, which always exist, the one of minimum norm is characterized. This characterization is achieved by systems of necessary and sufficient conditions, and by a nonlinear cone-restricted pseudoinverse. A representation of this pseudoinverse is given, that allows computation without consideration of the boundary. In models where the decomposing covariance matrices span a commutative quadratic subspace, a representation of the considered estimator is derived that requires merely to solve an ordinary convex quadratic optimization problem. As an example, we present the two-way nested classification random model. In the case that unbiased nonnegative quadratic estimation is possible, this estimator automatically becomes the "nonnegative MINQUE". Besides this, a general representation of the MINQUE is given, that involves just one matrix pseudoinversion in the reduced model.
Let us consider a sequence of processes $\xi_n(t)$ such that the multivariate distribution of $\xi_n(t_1), \xi_n(t_2), \cdots, \xi_n(t_k)$ tends to the multivariate distribution of $\xi_0(t_1), \xi_0(t_2), \cdots, \xi_0(t_k)$ for all $k$ and $t_1, t_2, \cdots, t_k$. Let $f$ be a functional such that $f(\xi_n(t))$ is defined with probability 1 and is a random variable (i.e., has a probability distribution). This paper contains several sufficient conditions under which the distribution of $f(\xi_n(t))$ tends to the distribution of $f(\xi_0(t))$ as $n \to \infty$. Let $K$ be the space of all functions having no discontinuities other than simple jumps, and assume that $\xi_n(t)$ lies in $K$ with probability 1. Several topologies in $K$ are defined. Necessary and sufficient conditions are found, for all functionals $f$ continuous in these topologies, under which the distribution of $f(\xi_n(t))$ tends to the distribution of $f(\xi_0(t))$. The results are demonstrated for the topology ${\bf J}_1$, which is defined as follows. The sequence $x_n(t)$ tends to $x_0(t)$ in the topology ${\bf J}_1$ if there exists a sequence of monotonic continuous functions $\lambda_n(t)$ for which $$\lambda_n(0) = 0, \quad \lambda_n(1) = 1, \quad \lim_{n \to \infty} \sup_t |\lambda_n(t) - t| = 0, \quad \lim_{n \to \infty} \sup_t |x_n(\lambda_n(t)) - x_0(t)| = 0.$$ Theorem.
The distribution of $f(\xi_n(t))$ tends to the distribution of $f(\xi_0(t))$ for all $f$ that are continuous in the topology ${\bf J}_1$ if and only if a) the multivariate distribution of $\xi_n(t_1), \cdots, \xi_n(t_k)$ tends to the multivariate distribution of $\xi_0(t_1), \cdots, \xi_0(t_k)$ for all $k$ and $t_1, t_2, \cdots, t_k$ from some set $N$ that is dense in $[0,1]$; and b) for all $\varepsilon > 0$ \[ \lim_{c \to 0} \mathop{\overline{\lim}}\limits_{n \to \infty} P\left\{ \sup_{t - c < t_1 < t < t_2 < t + c} \min\left[ |\xi_n(t_1) - \xi_n(t)|; |\xi_n(t) - \xi_n(t_2)| \right] > \varepsilon \right\} = 0. \]
Introduction. A function $F(x)$, defined for all real $x$, will be called a "law of probability" if the following conditions are satisfied: (i) $F(x)$ is monotone non-decreasing in $(-\infty, \infty)$ and continuous to the left, (ii) $F(-\infty) = 0$, $F(\infty) = 1$. A particular case is represented by $dF(x) = f(x)\,dx$, where $f(x)$, summable and $\geqq 0$, is the "probability density" or "law of distribution" for $x$. The expression $\int_{-\infty}^{\infty} x^s\,dF(x)$ is called the "$s$th moment" of the distribution, $s$ taking the values $0, 1, 2, \cdots$. The Second Limit-Theorem, which was the starting point of this paper, can be stated, with A. Markoff, as follows: If a sequence of laws of probability $F_k(x)$ $(k = 1, 2, \cdots)$ is such that they admit moments of all orders, and if $$\lim_{k \to \infty} \int_{-\infty}^{\infty} x^s\,dF_k(x) = \pi^{-1/2} \int_{-\infty}^{\infty} x^s e^{-x^2}\,dx \quad (s = 0, 1, \cdots),$$ then, for all $x$, $$\lim_{k \to \infty} \int_{-\infty}^{x} dF_k(x) = \pi^{-1/2} \int_{-\infty}^{x} e^{-x^2}\,dx.$$ Markoff's proof is rather complicated, being based on the distribution of roots and other properties of Hermite polynomials, and also on the so-called Tchebycheff inequalities in the theory of algebraic continued fractions. He points out that the theorem still holds if we replace the law of probability $\pi^{-1/2}\int_{-\infty}^{x} e^{-x^2}\,dx$ by a more general one, $\int_{-\infty}^{x} f(x)\,dx$ (in which case, however, his considerations need many supplements). * Presented to the Society, April 18
Asymptotic normality is established for multivariate linear rank statistics of general type in the non-i.i.d. case covering null hypotheses as well as almost arbitrary alternatives. The functions generating the regression constants and the scores are allowed to have a finite number of discontinuities of the first kind, and to tend to infinity near 0 and 1. The proof is based on properties of empirical df's in the non-i.i.d. case and is patterned on the 1958 Chernoff-Savage method. As special cases e.g. rank statistics used for testing against regression and rank statistics for testing independence are included.
Linear combinations of variance components for which there exist unbiased, non-negative quadratic estimators are characterized. It is shown that the 'error' component in ANOVA models is the only single component which can be so estimated.
PARTIAL RANK CORRELATION. M. G. Kendall. Biometrika, Volume 32, Issue 3-4, April 1942, Pages 277–283. https://doi.org/10.1093/biomet/32.3-4.277
Let $X_1, X_2, \cdots, X_n$ be a sequence of independent random variables (r.v.'s) with zero mean and finite standard deviation $\sigma_i, 1 \leqq i \leqq n$. According to the central limit theorem, the normed sum $Y_n = (1/s_n) \sum^n_{i=1} X_i$, where $s_n^2 = \sum^n_{i=1} \sigma^2_i$, is under certain additional conditions approximately normally distributed. We will here examine the convergence of the moments and the absolute moments of $Y_n$ towards the corresponding moments of the normal distribution. The results in this general case are stated in Theorem 3 and Theorem 4, but, in order to avoid repetition and unnecessary complication, explicit proofs will only be given in the case of equally distributed random variables (Theorem 1 and Theorem 2).
* Research for this paper was supported by the United States Air Force under Contract No. AF18(600-685) monitored by the Office of Scientific Research. 1 A discussion of the problem along with the necessary references will be found in Harry Pollard, The Harmonic Analysis of Bounded Functions, Duke Math. J., 20, 499-512, 1953. 2 See ibid. 3 See S. Bochner, Fouriersche Integrale (Leipzig, 1932), p. 33. 4 Levitan polynomial. See N. I. Achieser, Approximationstheorie (Berlin, 1953), p. 146.
THE RELATION BETWEEN MEASURES OF CORRELATION IN THE UNIVERSE OF SAMPLE PERMUTATIONS. H. E. Daniels, Wool Industries Research Association. Biometrika, Volume 33, Issue 2, August 1944, Pages 129–135. https://doi.org/10.1093/biomet/33.2.129
Let $(R_{\nu 1}, \cdots, R_{\nu N_\nu})$ be a random vector which takes on the $N_\nu!$ permutations of $(1, \cdots, N_\nu)$ with equal probabilities. Let $\{b_{\nu i}, 1 \leqq i \leqq N_\nu, \nu \geqq 1\}$ and $\{a_{\nu i}, 1 \leqq i \leqq N_\nu, \nu \geqq 1\}$ be double sequences of real numbers. Put \begin{equation*}\tag{1.1}S_\nu = \sum^{N_\nu}_{i = 1} b_{\nu i}a_{\nu R_{\nu i}}.\end{equation*} We shall prove that the necessary and sufficient condition for asymptotic $(N_\nu \rightarrow \infty)$ normality of $S_\nu$ is of Lindeberg type. This result generalizes previous results by Wald-Wolfowitz [1], Noether [3], Hoeffding [4], Dwass [6], [7] and Motoo [8]. With respect to Motoo [8] we show, in fact, that his condition, applied to our case, is not only sufficient but also necessary. Cases encountered in rank-test theory are studied in more detail in Section 6 by means of the theory of martingales. The method of this paper consists in proving asymptotic equivalence in the mean of (1.1) to a sum of infinitesimal independent components.
A theorem by Pitman on the asymptotic relative efficiency of two tests is extended and some of its properties are discussed.
The sum of finitely many variates possesses, under familiar conditions, an almost Gaussian probability distribution. This already much discussed central limit theorem in the theory of probability is the object of further investigation in the present paper. The cases of Liapounoff, Lindeberg, and Feller will be reviewed. Numerical estimates for the degrees of approximation attained in these cases will be presented in the three theorems of §4. Theorem 3, the arithmetical refinement of the general theorem of Feller, constitutes our principal result. As the foregoing implies, we require throughout the paper that the given variates be totally independent, and we consider only one-dimensional variates. The first three sections of the paper are devoted to the preparatory Theorem 1, in which the variates meet the further condition of possessing finite third order absolute moments. Let $X_1, X_2, \cdots, X_n$ be the given variates. For each $k$ $(k = 1, 2, \cdots, n)$ let $\mu_2(X_k)$ and $\mu_3(X_k)$ denote, respectively, the second and third order absolute moments of $X_k$ about its mean (expected) value $a_k$. These moments are either both zero or both positive. The former case arises only when $X_k$ is essentially constant, i.e., differs from its mean value at most in cases of total probability zero. To avoid trivialities we suppose that $\mu_2(X_k) > 0$ for at least one $k$ $(k = 1, 2, \cdots, n)$. The non-negative square root of $\mu_2(X_k)$ is the standard deviation of $X_k$ and will be denoted by $\sigma_k$.
The convergence of stochastic processes is defined in terms of the so-called "weak convergence" (w. c.) of probability measures in appropriate functional spaces (c. s. m. s.). Chapter 1. Let $\Re$ be the c.s.m.s. and $v$ a set of all finite measures on $\Re$. The distance $L(\mu_1, \mu_2)$ (analogous to the Lévy distance) is introduced, and the equivalence of $L$-convergence and w. c. is proved. It is shown that $V\Re = (v, L)$ is a c. s. m. s. Then, the necessary and sufficient conditions for compactness in $V\Re$ are given. In Section 1.6 the concept of "characteristic functionals" is applied to the study of w. c. of measures in Hilbert space. Chapter 2. On the basis of the above results the necessary and sufficient compactness conditions for families of probability measures in the spaces $C[0,1]$ and $D[0,1]$ (the space of functions that are continuous in $[0,1]$ except for jumps) are formulated. Chapter 3. The general form of the "invariance principle" for sums of independent random variables is developed. Chapter 4. An estimate of the remainder term in the well-known Kolmogorov theorem is given (cf. [3.1]).
Let $n$ successive independent observations be made on the same chance variable whose distribution function $f(x, \theta)$ depends on a single parameter $\theta$. The number $n$ is a chance variable which depends upon the outcomes of successive observations; it is precisely defined in the text below. Let $\theta^\ast(x_1, \cdots, x_n)$ be an estimate of $\theta$ whose bias is $b(\theta)$. Subject to certain regularity conditions stated below, it is proved that $\sigma^2(\theta^\ast) \geq \big(1 + \frac{db}{d\theta}\big)^2\big\lbrack EnE\big(\frac{\partial\log f}{\partial\theta}\big)^2\big\rbrack^{-1}.$ When $f(x, \theta)$ is the binomial distribution and $\theta^\ast$ is unbiased the lower bound given here specializes to one first announced by Girshick [3], obtained under no doubt different conditions of regularity. When the chance variable $n$ is a constant the lower bound given above is the same as that obtained in [2], page 480, under different conditions of regularity. Let the parameter $\theta$ consist of $l$ components $\theta_1, \cdots, \theta_l$ for which there are given the respective unbiased estimates $\theta^\ast_1(x_1, \cdots, x_n), \cdots, \theta^\ast_l(x_1, \cdots, x_n)$. Let $\|\lambda_{ij}\|$ be the non-singular covariance matrix of the latter, and $\|\lambda^{ij}\|$ its inverse. The concentration ellipsoid in the space of $(k_1, \cdots, k_l)$ is defined as $\sum_{i,j} \lambda^{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2.$ (This valuable concept is due to Cramer.) If a unit mass be uniformly distributed over the concentration ellipsoid, the matrix of its products of inertia will coincide with the covariance matrix $\|\lambda_{ij}\|$.
In [4] Cramer proves that no matter what the unbiased estimates $\theta^\ast_1, \cdots, \theta^\ast_l$ (provided that certain regularity conditions are fulfilled), when $n$ is constant their concentration ellipsoid always contains within itself the ellipsoid $\sum_{i,j} \mu_{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2$ where $\mu_{ij} = nE\big(\frac{\partial\log f}{\partial\theta_i}\frac{\partial\log f}{\partial\theta_j}\big).$ Consider now the sequential procedure of this paper. Let $\theta^\ast_1, \cdots, \theta^\ast_l$ be, as before, unbiased estimates of $\theta_1, \cdots, \theta_l$, respectively, recalling, however, that the number $n$ of observations is a chance variable. It is proved that the concentration ellipsoid of $\theta^\ast_1, \cdots, \theta^\ast_l$ always contains within itself the ellipsoid $\sum_{i,j} \mu'_{ij}(k_i - \theta_i)(k_j - \theta_j) = l + 2$ where $\mu'_{ij} = EnE\big(\frac{\partial\log f}{\partial\theta_i}\frac{\partial\log f}{\partial\theta_j}\big).$ When $n$ is a constant this becomes Cramer's result (under different conditions of regularity). In Section 7 a number of results are presented related to the equation $EZ_n = EnEX$, which is due to Wald [6] and is fundamental for sequential analysis.
Asymptotic theorems on the difference between the (empirical) distribution function calculated from a sample and the true distribution function governing the sampling process are well known. Simple proofs of an elementary nature have been obtained for the basic theorems of Kolmogorov and Smirnov by Feller, but even these proofs conceal to some extent, in their emphasis on elementary methodology, the naturalness of the results (qualitatively at least), and their mutual relations. Feller suggested that the author publish his own approach (which had also been used by Kac), which does not have these disadvantages, although rather deep analysis would be necessary for its rigorous justification. The approach is therefore presented (at one critical point) as heuristic reasoning which leads to results in investigations of this kind, even though the easiest proofs may use entirely different methods. No calculations are required to obtain the qualitative results, that is, the existence of limiting distributions for large samples of various measures of the discrepancy between empirical and true distribution functions. The numerical evaluation of these limiting distributions requires certain results concerning the Brownian movement stochastic process and its relation to other Gaussian processes which will be derived in the Appendix.
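One such measure of discrepancy, the Kolmogorov statistic $\sup_x |F_n(x) - F(x)|$, admits a minimal sketch (an illustration in our own notation, not the paper's): since the empirical distribution function $F_n$ jumps only at the sample points, the supremum can be found by checking both sides of each jump.

```python
def ks_statistic(sample, cdf):
    """sup_x |F_n(x) - F(x)| for a continuous true distribution function F.

    F_n jumps only at the order statistics, so it suffices to compare
    F against i/n and (i-1)/n at each ordered sample point.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        d = max(d, abs(i / n - fx), abs((i - 1) / n - fx))
    return d
```

The limiting distribution of $\sqrt{n}$ times this quantity is exactly what the heuristic Brownian-bridge reasoning in the abstract is used to evaluate.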