Improving RNA-Seq expression estimates by correcting for fragment bias
1 Department of Computer Science, 387 Soda Hall, UC Berkeley, Berkeley, CA 94720, USA
2 Broad Institute, 7 Cambridge Center, Cambridge, MA 02142, USA
3 Department of Stem Cell and Regenerative Biology, 7 Divinity Avenue, Harvard University, Cambridge, MA 02138, USA
4 Departments of Mathematics and Molecular & Cell Biology, 970 Evans Hall, UC Berkeley, Berkeley, CA 94720, USA
Genome Biology 2011, 12:R22. doi:10.1186/gb-2011-12-3-r22. Published: 16 March 2011
The biochemistry of RNA-Seq library preparation results in cDNA fragments that are not uniformly distributed within the transcripts they represent. This non-uniformity must be accounted for when estimating expression levels, and we show how to perform the needed corrections using a likelihood-based approach. We find improvements in expression estimates, as measured by correlation with independently performed qRT-PCR, and show that correction of bias leads to improved replicability of results across libraries and sequencing technologies.