Validation of ArchAlign. (a) The location of the TSSs was randomized by 50 to 250 bp for 200 structurally similar TSSs from S. cerevisiae, and the nucleosome occupancy data, centered at the randomized TSS coordinates, were aligned with ArchAlign. A total of 1.5 kb surrounding the TSSs is shown, with -750 referring to 750 bp upstream and 750 referring to 750 bp downstream. (b) The TSS data were randomized by shifting the center upstream or downstream from the actual location by a maximum distance of 50 to 250 bp, increasing in 25-bp intervals. The randomized TSS coordinates were then used to generate 1.5-kb regions of nucleosome occupancy at 10-bp resolution surrounding the coordinates (blue line). (c,e) Variability for the alignment of TSSs or origins was calculated for each region as the root of the sum of squares relative to the mean profile. Variability for the entire alignment was estimated as the average over all regions. The graphs show the mean and standard error of the overall variability across the ten alignments of each genomic feature at each level of randomization. (d) The origins of replication were randomized as described above for TSSs, and the randomized coordinates were then used to generate 2-kb regions of nucleosome occupancy at 10-bp resolution surrounding the coordinates (blue line). Randomized nucleosome occupancy regions were then entered into ArchAlign using both the single-best-pair (red line) and seed sampling (green line) approaches. Each interval of randomization was repeated 10 times for a total of 90 randomized datasets for each genomic feature. ArchAlign's output was then tested for similarity to the original data by determining the percentage of aligned TSSs and origins that were within 40 bp upstream or downstream of their original positions.
Lai and Buck Genome Biology 2010 11:R126 doi:10.1186/gb-2010-11-12-r126
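The validation procedure described in the caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names (`randomize_positions`, `percent_recovered`, `alignment_variability`) are hypothetical, shifts are assumed to be drawn uniformly within the maximum distance, and the variability measure is one reading of "average root of sum of squares compared to the mean profile". ArchAlign itself is not called here.

```python
import math
import random

# Maximum shift distances from the caption: 50 to 250 bp in 25-bp steps.
SHIFT_LEVELS = list(range(50, 251, 25))
REPEATS_PER_LEVEL = 10  # each level repeated 10 times


def randomize_positions(positions, max_shift, rng):
    """Shift each coordinate upstream or downstream by at most max_shift bp.

    Assumption: shifts are uniform integers in [-max_shift, max_shift].
    """
    return [p + rng.randint(-max_shift, max_shift) for p in positions]


def percent_recovered(original, aligned, tolerance=40):
    """Percentage of aligned positions within `tolerance` bp of the originals."""
    hits = sum(1 for o, a in zip(original, aligned) if abs(a - o) <= tolerance)
    return 100.0 * hits / len(original)


def alignment_variability(profiles):
    """Average over regions of the root of summed squared deviations
    from the mean occupancy profile (assumed reading of the caption)."""
    n = len(profiles)
    width = len(profiles[0])
    mean_profile = [sum(p[i] for p in profiles) / n for i in range(width)]
    per_region = [
        math.sqrt(sum((p[i] - mean_profile[i]) ** 2 for i in range(width)))
        for p in profiles
    ]
    return sum(per_region) / n


if __name__ == "__main__":
    rng = random.Random(0)
    true_tss = [1000 * (i + 1) for i in range(200)]  # made-up coordinates
    shifted = randomize_positions(true_tss, 250, rng)
    # Before any realignment, only part of the shifted set lies within 40 bp.
    print(percent_recovered(true_tss, shifted))
    print(len(SHIFT_LEVELS) * REPEATS_PER_LEVEL)  # datasets per genomic feature
```

Nine shift levels repeated ten times give the 90 randomized datasets per genomic feature mentioned in the caption. In a real run, the coordinates passed to `percent_recovered` as `aligned` would be the positions reported by ArchAlign rather than the raw shifted ones.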