Fusion gene identification by paired-end RNA-sequencing. (a) Identification of fusion gene candidates through selection of paired-end reads, the ends of which align to two different and non-adjacent genes. (b) Identification of the exact fusion junction by aligning non-mapped short reads against a computer-generated database of all possible exon-exon junctions between the two partner genes. Separation of true fusions (left) from false positives (right) by examining the pattern of short-read alignments across exon-exon junctions. Genuine fusion junctions are characterized by a stacked, ladder-like pattern of short reads across the fusion point. False positives lack this pattern; instead, all junction-matching short reads align to the exact same position or are shifted by only one to two base pairs. Furthermore, these reads align mostly to only one of the two exons.
Edgren et al. Genome Biology 2011 12:R6 doi:10.1186/gb-2011-12-1-r6
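
The junction filter described in the caption can be illustrated with a minimal sketch. This is not the authors' implementation: the names `JunctionRead` and `looks_like_true_fusion` and the thresholds `max_shift` and `min_anchor` are hypothetical and chosen only for illustration. It assumes each read is given by its 0-based start on the candidate junction sequence and its length, and that `junction_pos` is the first base of the downstream exon.

```python
from typing import List, NamedTuple


class JunctionRead(NamedTuple):
    start: int   # 0-based start of the read on the junction sequence
    length: int  # read length in bases


def looks_like_true_fusion(
    reads: List[JunctionRead],
    junction_pos: int,
    max_shift: int = 2,   # assumed: start spread at or below this counts as "same position"
    min_anchor: int = 5,  # assumed: minimum bases a read must cover on one exon
) -> bool:
    """Heuristic check for a staggered (ladder-like) read pattern across a junction."""
    # Keep only reads that actually cross the junction.
    spanning = [r for r in reads if r.start < junction_pos < r.start + r.length]
    if not spanning:
        return False

    # False positives: all reads start at the same position or are shifted by 1-2 bp.
    starts = [r.start for r in spanning]
    if max(starts) - min(starts) <= max_shift:
        return False

    # False positives align mostly to one exon: require support anchored on both sides.
    left_anchored = any(junction_pos - r.start >= min_anchor for r in spanning)
    right_anchored = any(r.start + r.length - junction_pos >= min_anchor for r in spanning)
    return left_anchored and right_anchored


# Staggered starts across the junction versus reads piled at nearly one position.
ladder = [JunctionRead(s, 36) for s in (0, 6, 12, 18, 24)]
piled = [JunctionRead(s, 36) for s in (0, 0, 1, 2, 0)]
ladder_result = looks_like_true_fusion(ladder, junction_pos=30)
piled_result = looks_like_true_fusion(piled, junction_pos=30)
```

In practice the thresholds would need to be tuned to read length and sequencing depth; the values here are placeholders.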