A tool for comparing different statistical methods on identifying differentially expressed genes

Fogel, Paul; Liu, Li; Dumas, Bruno; Ge, Nanxiang

doi:10.1186/gb-2004-6-1-p2

Deposited research article
Published: 08 December 2004

Genome Biology volume 6, Article number: P2 (2005)

Abstract

Background

Many statistical methods have been developed for two-group comparison microarray experiments. The choice of method often determines whether a substantial number of genes are selected, so practical guidance on applying these methods is required. We developed a bootstrap-based procedure and a criterion for viewing and quantifying the differences between method-dependent selections. We applied this procedure to three datasets covering a range of possible sample sizes to compare three well-known methods: the t-test, LPE and SAM.

Results

Our visualization method and the associated variability conformation rate (VCR) criterion show that the standard t-test is appropriate for large sample sizes, which allow accurate variance estimates. LPE borrows strength from neighboring genes to estimate the variances and is therefore more appropriate for small sample sizes, provided that gene variances are similar at similar gene intensity levels. SAM combines two advantages: it considers gene-specific variance, like the t-test, and it adjusts for multiple testing through a permutation-based false discovery rate. However, for small sample sizes and in cases of numerous expressed genes, the distribution based on permuted datasets may not approximate the null distribution well, resulting in an inaccurate false discovery rate. Moreover, genes with low variances may be filtered out because of the fudge factor.
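The effect of the fudge factor on low-variance genes can be illustrated with a minimal sketch. It assumes the commonly described SAM-style statistic, in which a constant s0 is added to the pooled standard error in the denominator. The expression values and the value of `S0` below are hypothetical, and s0 is fixed by hand here rather than estimated from the data as SAM does.

```python
import math
from statistics import mean


def pooled_se(a, b):
    """Pooled standard error of the difference in means (equal-variance form)."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    return math.sqrt((1 / na + 1 / nb) * ss / (na + nb - 2))


def t_stat(a, b):
    """Ordinary two-sample t-statistic using the gene-specific variance."""
    return (mean(a) - mean(b)) / pooled_se(a, b)


def sam_like_stat(a, b, s0):
    """SAM-style statistic: the fudge factor s0 inflates the denominator."""
    return (mean(a) - mean(b)) / (pooled_se(a, b) + s0)


# Hypothetical log-intensities (treated, control) for two genes.
low_var_gene = ([10.10, 10.11, 10.12], [10.00, 10.01, 10.02])
high_var_gene = ([12.0, 13.0, 14.0], [9.0, 10.0, 11.0])
S0 = 0.5  # assumed fudge factor, chosen only for illustration

for name, (treated, control) in [("low variance", low_var_gene),
                                 ("high variance", high_var_gene)]:
    print(name,
          "t =", round(t_stat(treated, control), 2),
          "d =", round(sam_like_stat(treated, control, S0), 2))
```

With these numbers the t-statistic ranks the low-variance gene first, whereas the SAM-style statistic ranks it well below the high-variance gene, which is the filtering behavior described above.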

Conclusions

We propose using VCR to assess the statistical methods available for analyzing microarray data. The criterion is based on a bootstrap method that we developed to estimate the 2-d distribution of treated vs. control gene intensity levels under the null hypothesis that there is no difference between the treatment and control groups. Biological evaluation of the genes selected by each method confirmed that this criterion is appropriate for helping to identify the most suitable method.
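The abstract does not spell out the bootstrap procedure, so the following is only a generic sketch of the idea and not the authors' algorithm. It assumes that, for a single gene, pooling the control and treated replicates and resampling both groups from that pool gives (control mean, treated mean) pairs under the null hypothesis of no difference. The function names, the replicate values and the empirical p-value are illustrative assumptions.

```python
import random
from statistics import mean


def null_bootstrap_pairs(control, treated, n_boot=1000, seed=0):
    """Resample both groups from the pooled replicates (null: no difference).

    Returns a list of (control_mean, treated_mean) pairs, i.e. a sample from
    an assumed 2-d null distribution for one gene.
    """
    rng = random.Random(seed)
    pooled = list(control) + list(treated)
    pairs = []
    for _ in range(n_boot):
        c = [rng.choice(pooled) for _ in control]
        t = [rng.choice(pooled) for _ in treated]
        pairs.append((mean(c), mean(t)))
    return pairs


def empirical_p(control, treated, n_boot=1000, seed=0):
    """Share of null pairs at least as far apart as the observed means."""
    observed = abs(mean(treated) - mean(control))
    pairs = null_bootstrap_pairs(control, treated, n_boot, seed)
    hits = sum(1 for c, t in pairs if abs(t - c) >= observed)
    return (hits + 1) / (n_boot + 1)


# Hypothetical log-intensities for one gene.
control = [8.0, 8.1, 7.9, 8.05]
treated = [10.0, 10.2, 9.9, 10.1]
print(empirical_p(control, treated))
```

Plotting the returned pairs with control means on one axis and treated means on the other gives a null cloud against which the observed point for a gene can be compared.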

Additional data files

The following additional data files are provided with this article: Additional data file 1, depicting a table showing the overlap among different methods for the yeast data; Additional data file 2, showing additional Figure 1; Additional data file 3, showing additional Figure 2; Additional data file 4, showing additional Figure 3; Additional data file 5, showing additional Figure 4.

Author information

Authors and Affiliations

Paul Fogel: Paul Fogel Consultant, 4 rue Le Goff, 75005, Paris, France
Li Liu & Nanxiang Ge: Biometrics and Data Management, Sanofi-Aventis, Mail Stop B-203A, 1041 Route 202-206, PO Box 6800, Bridgewater, NJ, 08873, USA
Bruno Dumas: Yeast Genomics, Functional Genomics, Sanofi-Aventis, 13 Quai Jules Guesde, 94403, Vitry sur Seine Cedex, France

Corresponding author

Correspondence to Li Liu.

Additional information

Paul Fogel and Li Liu contributed equally to this work.

Electronic supplementary material

Additional data file 1: A table showing the overlap among different methods for the yeast data (PDF 106 KB)

Additional data file 2: Additional figure 1 (PNG 18 KB)

Additional data file 3: Additional figure 2 (PNG 16 KB)

Additional data file 4: Additional figure 3 (PNG 16 KB)

Additional data file 5: Additional figure 4 (PNG 17 KB)

About this article

Cite this article

Fogel, P., Liu, L., Dumas, B. et al. A tool for comparing different statistical methods on identifying differentially expressed genes. Genome Biol 6, P2 (2005). https://doi.org/10.1186/gb-2004-6-1-p2

Received: 07 December 2004
Published: 08 December 2004
DOI: https://doi.org/10.1186/gb-2004-6-1-p2