ROC curves for all expression summary datasets. The curves are color-coded to highlight how the ability to detect differential expression is dependent on the different options at each step of analysis, using the CyberT regularized t-statistic metric. (a) All 152 expression summary datasets are represented here, with the different colors depicting whether the second loess normalization step at the probe set level was performed. In general, the second loess normalization (blue) improves the detection of true DEGs. (b-f)To decrease clutter, only the 76 expression summary datasets involving the second normalization step are shown. (b) When comparing the two background correction methods, the MAS algorithm is superior to the RMA algorithm. (c) The various probe-level normalization methods do not show great differences between each other. (d) Among the different PM-correction options, using the method in MAS 5.0 clearly is the most successful. (e) Various robust estimators were examined, revealing that the median polish method is the most sensitive (with MAS 5.0's Tukey Biweight a close second). (f) Depiction (in blue and orange) of the 10 datasets which maximize detection of truly differentially expressed genes, while minimizing false positives. These datasets are generated using the options circled in Figure 3. MAS 5.0, with the inclusion of the second loess normalization step, falls within these top 10.
Choe et al. Genome Biology 2005 6:R16 doi:10.1186/gb-2005-6-2-r16