Hierarchical clustering algorithm and TSS identification. ESTs were hierarchically clustered in four main steps. 1) ESTs were mapped to the 5' ends of genes. 2) Large initial clusters were formed by grouping adjacent ESTs that were less than 100 bp apart. 3) Clusters were broken into smaller (sub-)clusters, each with a standard deviation of less than 10. 4) (Sub-)clusters with fewer than three ESTs were removed. Then, 5) the most highly utilized location in each (sub-)cluster was selected as the TSS, and 6) TSSs within 100 bp of each other were grouped into broad TSS cluster groups.
Rach et al. Genome Biology 2009 10:R73 doi:10.1186/gb-2009-10-7-r73
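The six steps in the caption can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation. It assumes that step 1 has already been done, so the input is a list of integer EST 5' end coordinates on a single strand of a single sequence. The caption does not say how clusters were broken up in step 3, so the sketch assumes a recursive split at the widest gap. It also assumes a population standard deviation, that ties for the most used position go to the smallest coordinate, and that "within 100 bp" in step 6 means a gap of less than 100 bp. The function names (`group_by_gap`, `split_by_sd`, `call_tss`) are hypothetical.

```python
from collections import Counter
from statistics import pstdev


def group_by_gap(positions, max_gap):
    """Group sorted positions; a gap of max_gap or more starts a new group."""
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] < max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


def split_by_sd(cluster, max_sd):
    """Split a sorted cluster at its widest gap until every piece has SD < max_sd.

    Assumption: the splitting rule is not given in the caption; this is one
    simple way to satisfy the standard deviation criterion.
    """
    if len(cluster) < 2 or pstdev(cluster) < max_sd:
        return [cluster]
    gaps = [b - a for a, b in zip(cluster, cluster[1:])]
    cut = gaps.index(max(gaps)) + 1
    return split_by_sd(cluster[:cut], max_sd) + split_by_sd(cluster[cut:], max_sd)


def call_tss(est_positions, max_gap=100, max_sd=10, min_ests=3):
    """Return broad TSS cluster groups as lists of TSS coordinates."""
    tss = []
    # Step 2: initial clusters of ESTs less than max_gap bp apart.
    for cluster in group_by_gap(est_positions, max_gap):
        # Step 3: break into (sub-)clusters with SD below max_sd.
        for sub in split_by_sd(cluster, max_sd):
            # Step 4: drop (sub-)clusters with fewer than min_ests ESTs.
            if len(sub) < min_ests:
                continue
            # Step 5: most used position; ties go to the smallest coordinate.
            counts = Counter(sub)
            best = max(counts.values())
            tss.append(min(p for p, c in counts.items() if c == best))
    # Step 6: group TSSs less than max_gap bp apart (assumed reading of "within").
    return group_by_gap(tss, max_gap)


# Made-up coordinates, for illustration only.
positions = [100, 100, 101, 102, 150, 150, 150, 151, 500, 500, 501, 900]
print(call_tss(positions))  # [[100, 150], [500]]
```

In this toy input, the first initial cluster is split into two (sub-)clusters with TSSs at 100 and 150, which end up in the same broad group because they are 50 bp apart. The singleton at 900 is discarded in step 4.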