Open Access

Method

Improving identification of differentially expressed genes in microarray studies using information from public databases

Richard D Kim (1) and Peter J Park (1,2)*

Author Affiliations

1 Harvard-Partners Center for Genetics and Genomics, 77 Avenue Louis Pasteur, Boston, MA 02115, USA

2 Children's Hospital Informatics Program, 300 Longwood Ave, Boston, MA 02115, USA

Genome Biology 2004, 5:R70  doi:10.1186/gb-2004-5-9-r70

Published: 26 August 2004


We demonstrate that the process of identifying differentially expressed genes in microarray studies with small sample sizes can be substantially improved by extracting information from a large number of datasets accumulated in public databases. The improvement comes from more reliable estimates of gene-specific variances based on other datasets. For a two-group comparison with two arrays in each group, for example, the result of our method was comparable to that of a t-test analysis with five samples in each group or to that of a regularized t-test analysis with three samples in each group. Our results are further improved by weighting the results of our approach with the regularized t-test results in a hybrid method.
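
The core idea, replacing a noisy within-experiment variance with a gene-specific variance estimated from other datasets, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `external_variance_stat`, the assumption that a per-gene variance `external_var` has already been computed from public data, and the use of the standard two-sample standard-error formula are all illustrative assumptions.

```python
import numpy as np


def external_variance_stat(group_a, group_b, external_var):
    """t-like statistic whose denominator uses an externally estimated variance.

    group_a, group_b: expression values, shape (genes, samples per group).
    external_var: per-gene variance, shape (genes,), assumed to have been
        estimated beforehand from other datasets (hypothetical input).
    """
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)
    external_var = np.asarray(external_var, dtype=float)

    n_a = group_a.shape[1]
    n_b = group_b.shape[1]

    # Difference in group means comes from the small experiment itself.
    mean_diff = group_a.mean(axis=1) - group_b.mean(axis=1)

    # Standard error uses the external variance instead of the variance
    # computed from the two or three arrays available per group.
    std_err = np.sqrt(external_var * (1.0 / n_a + 1.0 / n_b))
    return mean_diff / std_err


# Toy example: one gene, two arrays per group, external variance of 2.0.
stat = external_variance_stat([[1.0, 3.0]], [[5.0, 7.0]], [2.0])
```

With only two arrays per group, a variance computed from the experiment alone rests on very few degrees of freedom, so substituting a variance estimated from many other arrays is what stabilizes the statistic in this sketch.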