Marginal asymptotics for the “large $p$, small $n$” paradigm: With applications to microarray data

Type: Article

Publication Date: 2007-08-01

Citations: 80

DOI: https://doi.org/10.1214/009053606000001433

Abstract

The “large $p$, small $n$” paradigm arises in microarray studies, image analysis, high throughput molecular screening, astronomy, and in many other high dimensional applications. False discovery rate (FDR) methods are useful for resolving the accompanying multiple testing problems. In cDNA microarray studies, for example, $p$-values may be computed for each of $p$ genes using data from $n$ arrays, where typically $p$ is in the thousands and $n$ is less than 30. For FDR methods to be valid in identifying differentially expressed genes, the $p$-values for the nondifferentially expressed genes must simultaneously have uniform distributions marginally. While feasible for permutation $p$-values, this uniformity is problematic for asymptotic based $p$-values since the number of $p$-values involved goes to infinity and intuition suggests that at least some of the $p$-values should behave erratically. We examine this neglected issue when $n$ is moderately large but $p$ is almost exponentially large relative to $n$. We show the somewhat surprising result that, under very general dependence structures and for both mean and median tests, the $p$-values are simultaneously valid. A small simulation study and data analysis are used for illustration.
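The setting in the abstract can be illustrated with a minimal sketch, under stated assumptions: each of $p$ genes gets a two-sided $p$-value from a one-sample mean test on $n$ arrays using the normal approximation, and the Benjamini-Hochberg step-up rule is then applied as one common FDR method. The function names, the simulation settings, and the choice of Benjamini-Hochberg are illustrative assumptions, not the authors' procedure.

```python
import math
import random


def asymptotic_p_value(sample):
    """Two-sided p-value for H0: mean = 0 via the normal approximation.

    Assumption for illustration: the studentized mean is compared with a
    standard normal reference, i.e. an asymptotic-based p-value.
    """
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    t = mean / math.sqrt(var / n)
    # P(|Z| > |t|) for a standard normal Z
    return math.erfc(abs(t) / math.sqrt(2.0))


def benjamini_hochberg(p_values, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose ordered p-value is below its threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])


def simulate(p=2000, n=20, n_signal=100, shift=2.0, seed=0):
    """Hypothetical data: p genes, n arrays, the first n_signal genes shifted."""
    rng = random.Random(seed)
    p_values = []
    for j in range(p):
        mu = shift if j < n_signal else 0.0
        sample = [rng.gauss(mu, 1.0) for _ in range(n)]
        p_values.append(asymptotic_p_value(sample))
    return p_values


if __name__ == "__main__":
    pvals = simulate()
    rejected = benjamini_hochberg(pvals, q=0.05)
    print(len(rejected), "genes flagged at FDR level 0.05")
```

With $n = 20$ the normal reference is only approximate for the null genes, which is the kind of marginal validity question the abstract raises when $p$ is much larger than $n$.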

Locations

  • The Annals of Statistics

Similar Works

  • Marginal asymptotics for the "large p, small n" paradigm: with applications to microarray data (2005). Michael R. Kosorok, Shuangge Ma
  • Comparison of Methods for Estimating the Proportion of Null Hypotheses π0 in High Dimensional Data When the Test Statistics is Continuous (2017). Isaac Dialsingh, Sherwin P Cedeno
  • Empirical Bayes screening of many p-values with applications to microarray studies (2005). Somnath Datta, Susmita Datta
  • Estimating the Proportion of True Null Hypotheses for Multiple Comparisons (2008). Hongmei Jiang, R. W. Doerge
  • HighProbability determines which alternative hypotheses are sufficiently probable: Genomic applications include detection of differential gene expression (2004). David R. Bickel
  • Significance levels for studies with correlated test statistics (2007). Jianxin Shi, Douglas F. Levinson, Alice S. Whittemore
  • Estimating the proportion of true null hypotheses for multiple comparisons. (2008). Hongmei Jiang, R. W. Doerge
  • A geometric interpretation of the permutation p-value and its application in eQTL studies (2010). Wei Sun, Fred A. Wright
  • A two-sample test for the equality of univariate marginal distributions for high-dimensional data (2019). Marta Cousido‐Rocha, Jacobo de Uña‐Álvarez, Jeffrey D. Hart
  • Estimating the proportion of true null hypotheses with application in microarray data (2019). Aniket Biswas
  • Approximate Sample Size Calculations with Microarray Data: An Illustration (2006). José A. Ferreira, Aeilko H. Zwinderman
  • Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (2005). Katherine S. Pollard, Merrill D. Birkner, Mark J. van der Laan, Sandrine Dudoit
  • Inference on the Limiting False Discovery Rate and the P-value Threshold Parameter Assuming Weak Dependence between Gene Expression Levels within Subject (2007). Glenn Heller, Jing Qin
  • Direction-Projection-Permutation for High-Dimensional Hypothesis Tests (2015). Susan Wei, Chihoon Lee, Lindsay Wichers, J. S. Marron
  • SPARTA: super-fast permutation approach to approximate extremely low p-values (2018). Sangseob Leem, Dae Ho Lee, Taesung Park
  • Practical FDR-based sample size calculations in microarray experiments (2005). Jianhua Hu, Fei Zou, Fred A. Wright
  • A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data (2005). Yang Xie, Wei Pan, Arkady Khodursky
  • Quick calculation for sample size while controlling false discovery rate with application to microarray analysis (2007). Peng Liu, J. T. Gene Hwang