The LeFE algorithm: embracing the complexity of gene expression in the interpretation of microarray data
1 Genomics and Bioinformatics Groups, Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
2 Bioinformatics Program, Boston University, Cummington St, Boston, Massachusetts 02215, USA
3 Virginia Commonwealth University, Biostatistics Department, E Marshall St, Richmond, Virginia 23284, USA
4 SRA International, Fair Lakes Court, Fairfax, Virginia 22033, USA
Genome Biology 2007, 8:R187 doi:10.1186/gb-2007-8-9-r187
Published: 10 September 2007
Interpretation of microarray data remains a challenge, and most methods fail to consider the complex, nonlinear regulation of gene expression. To address that limitation, we introduce Learner of Functional Enrichment (LeFE), a statistical/machine learning algorithm based on Random Forest, and demonstrate it on several diverse datasets: smoker/never smoker, breast cancer classification, and cancer drug sensitivity. We also compare it with previously published algorithms, including Gene Set Enrichment Analysis. LeFE regularly identifies statistically significant functional themes consistent with known biology.