Outline of the miRTRAP program, Ciona abundance versus conservation, neighbor window. (a) Schematic illustration of the miRTRAP program. The algorithm first identifies read regions that do not overlap repeats or tRNAs. The genomic region extending up to 150 nucleotides around each read is folded using RNAfold. All read products within the hairpin window are then classified as 5p-miR/3p-miR, 5p-moR/3p-moR or loop according to their positions relative to the hairpin and loop. Each read region is then evaluated by a set of filters to remove those incompatible with the biochemical rules of miR biogenesis. The rejected read regions are then used to filter the initial set of candidate loci, producing a list of positive predictions. (b) The average antisense product displacement (AAPD) score distribution from the Ciona dataset shows that the majority of known miRs have an AAPD score of zero, whereas non-miR loci have a broad distribution with peaks at 8 and 10. (c) The difference between the non-miR neighbor counts within windows centered at known miR loci and at non-miR loci in Ciona. Whereas non-miR neighbor counts around non-miR loci increase sharply as the window size expands, all known miR loci have non-miR neighbor counts of 10 or fewer.
Hendrix et al. Genome Biology 2010 11:R39 doi:10.1186/gb-2010-11-4-r39
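The neighbor count in panel (c) can be illustrated with a minimal Python sketch. It is not the miRTRAP implementation. It assumes that each locus is reduced to a single coordinate on one chromosome, that a window of a given size is centered on the locus of interest, and that a locus is never counted as its own neighbor. The function name count_non_mir_neighbors and the coordinates below are hypothetical.

```python
from bisect import bisect_left, bisect_right


def count_non_mir_neighbors(center, non_mir_positions, window_size):
    """Count non-miR loci inside a window of window_size nt centered at center.

    Assumptions (not taken from the paper): non_mir_positions is a sorted
    list of single-coordinate loci on one chromosome, the window spans
    center - window_size // 2 to center + window_size // 2 inclusive, and
    any locus located exactly at center is excluded from the count.
    """
    half = window_size // 2
    lo = bisect_left(non_mir_positions, center - half)
    hi = bisect_right(non_mir_positions, center + half)
    # Loci located exactly at the center are the query locus itself.
    at_center = (bisect_right(non_mir_positions, center)
                 - bisect_left(non_mir_positions, center))
    return (hi - lo) - at_center


# Hypothetical, made-up coordinates for illustration only.
non_mir = [100, 900, 1500, 4000, 4200, 9000]

# Neighbor counts around position 1000 for increasing window sizes.
counts = [count_non_mir_neighbors(1000, non_mir, w) for w in (500, 2000, 10000)]
print(counts)
```

Evaluating the count over a range of window sizes, once for known miR loci and once for non-miR loci, would give the two groups of curves that panel (c) compares.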