Type: Article
Publication Date: 1964-06-01
Citations: 81
DOI: https://doi.org/10.1214/aoms/1177703562
or less by accident. In this paper we shall be concerned with experiments where variables are missing not by accident, but by design. As an example encountered frequently in psychological research, consider the construction of standardized tests. One phase in the standardization of such tests is the estimation of correlations between parallel forms. If three or more such forms are required, as is frequently the case for tests to be applied on the national level, estimation of correlation coefficients would necessitate the application of all forms to a representative standardization group. The application of more than two forms to the same student may, however, introduce errors, for recall, learning, or fatigue may seriously influence the results. A given student in the standardization group may therefore receive only two tests, and symmetry suggests that an equal number of students be tested on each pair of examinations. To facilitate the handling of rather general situations, we shall assume a modification of the general linear model for multivariate analysis, $E(YM) = A\xi M$, where $Y$ $(N \times p)$ is a matrix which contains all observations, $A$ $(N \times m)$ is the design matrix, and $\xi$ $(m \times p)$ is a matrix of parameters. The matrix $M$, of order $(p \times u)$, was introduced by Roy [8] to allow given linear combinations of variables in the model. It is particularly useful in the present case since, by a suitable array of ones and zeros in the matrix $M$, we can indicate whether or not a particular variable is observed in a given group of subjects. It will be recalled that models for simple and multiple regression and analysis of variance and covariance are special cases of this general linear model. In accordance with customary assumptions made in this model, we shall