Improving identification of differentially expressed genes in microarray studies using information from public databases
1 Harvard-Partners Center for Genetics and Genomics, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
2 Children's Hospital Informatics Program, 300 Longwood Ave, Boston, MA 02115, USA
Genome Biology 2004, 5:R70 doi:10.1186/gb-2004-5-9-r70
Published: 26 August 2004
We demonstrate that the identification of differentially expressed genes in microarray studies with small sample sizes can be substantially improved by extracting information from the large number of datasets accumulated in public databases. The improvement comes from more reliable estimates of gene-specific variances obtained from other datasets. For a two-group comparison with two arrays in each group, for example, the result of our method was comparable to that of a t-test analysis with five samples in each group or to that of a regularized t-test analysis with three samples in each group. The results improve further when our approach is combined with the regularized t-test in a weighted hybrid method.
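The idea of borrowing gene-specific variances can be illustrated with a minimal sketch. It is not the authors' exact procedure: the abstract does not say how variances are pooled or how the hybrid weights are chosen, so the degrees-of-freedom pooling, the function names, the `weight` parameter, and the toy expression values below are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_reference_variance(reference_datasets):
    """Pool one gene's sample variances across external datasets,
    weighting each dataset by its degrees of freedom (n - 1).
    Assumed pooling rule; the abstract does not specify one."""
    weighted_sum = 0.0
    dof = 0
    for values in reference_datasets:
        if len(values) < 2:
            continue  # a single array carries no variance information
        weighted_sum += (len(values) - 1) * variance(values)
        dof += len(values) - 1
    if dof == 0:
        raise ValueError("need a reference dataset with at least two arrays")
    return weighted_sum / dof


def borrowed_variance_statistic(group_a, group_b, ref_var):
    """t-like statistic: difference of group means scaled by a variance
    taken from reference datasets instead of from the small study itself."""
    se = sqrt(ref_var * (1.0 / len(group_a) + 1.0 / len(group_b)))
    return (mean(group_a) - mean(group_b)) / se


def hybrid_statistic(borrowed_stat, regularized_stat, weight=0.5):
    """Weighted combination of two statistics; `weight` is a hypothetical
    tuning parameter, not a value taken from the paper."""
    return weight * borrowed_stat + (1.0 - weight) * regularized_stat


# Toy log-expression values for a single gene (invented for illustration).
reference = [[7.9, 8.1, 8.0, 8.2], [8.4, 8.6, 8.5]]
ref_var = pooled_reference_variance(reference)

# Two arrays per group, as in the small-sample setting described above.
stat = borrowed_variance_statistic([8.0, 8.2], [9.0, 9.2], ref_var)
```

With only two arrays per group, a within-study variance estimate has very few degrees of freedom, so a denominator built from many external arrays is far more stable. That stability is the intuition behind the improvement the abstract reports.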