The neighbor effect and calculation of P values. (a) After ChIP, the purified DNA fragments bound by the protein of interest are of various lengths. (b) Actual log2 ratios reported by arrayed elements for Rap1p binding to the promoter region of RPL1B (arrayed element 'A'), taken from the Rap1p binding dataset reported by Lieb and coworkers. Arrayed element 'A' contains the actual site of protein-DNA interaction, and so this spot has the highest ratio (red = high positive ratio; yellow = low ratio; green = negative ratio). Arrayed elements 'B' (RPL1B open reading frame [ORF]) and 'C' (MRM2 ORF), which are within about 1 kilobase (kb) of the binding site, are also enriched above noise. Arrayed element 'B' has a higher ratio than element 'C' because the binding site is located closer to element 'B'. Arrayed elements 'D', 'E', 'F', and 'G' are too far from the binding site to be enriched. (c) The average log2 ratio of each window, computed using a 1 kb window with a 0.25 kb step, is plotted. The location of each window is defined by its central coordinate. (d) The P value of each window is plotted. The Bonferroni-corrected P values were calculated from the observed data, which had a log2 background standard deviation of 0.32, with 21,208 comparisons. Note that the window with the smallest P value (about 10^-30) does not correspond to the window with the highest average. This is because the most significant window contains three arrayed elements (A, B, and C), whereas the windows with the highest average contain only two elements (A and B). In this case, the center of the window with the smallest P value is located about 80 bases from the actual binding site.
Buck et al. Genome Biology 2005 6:R97 doi:10.1186/gb-2005-6-11-r97
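The relationship between panels (c) and (d) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each arrayed element is represented by a single coordinate, that the window average is tested against a normal background with a one-sided test, and that the Bonferroni correction multiplies the P value by the number of comparisons. The function name `window_p_values` and the element positions and ratios in `demo` are hypothetical; only the background standard deviation (0.32), the number of comparisons (21,208), and the window and step sizes come from the caption.

```python
import math

def window_p_values(elements, sd, n_comparisons, window=1000, step=250):
    """Slide a window over (position_bp, log2_ratio) pairs.

    Returns a list of (center, mean, n_elements, corrected_p) tuples,
    one per window that contains at least one element.
    Assumption: one-sided normal test on the window mean.
    """
    positions = [pos for pos, _ in elements]
    half = window / 2
    rows = []
    for center in range(min(positions), max(positions) + 1, step):
        # Collect the ratios of all elements that fall inside this window.
        ratios = [r for pos, r in elements if abs(pos - center) <= half]
        if not ratios:
            continue
        n = len(ratios)
        mean = sum(ratios) / n
        # Standard error of the mean shrinks with sqrt(n), so a window
        # with more elements can be more significant at a lower mean.
        z = mean * math.sqrt(n) / sd
        p = 0.5 * math.erfc(z / math.sqrt(2))   # upper-tail normal probability
        corrected = min(1.0, p * n_comparisons)  # Bonferroni correction
        rows.append((center, mean, n, corrected))
    return rows

# Hypothetical positions (bp) and log2 ratios for elements C, A, and B.
demo = [(0, 1.5), (500, 3.0), (900, 2.0)]
rows = window_p_values(demo, sd=0.32, n_comparisons=21208)

for center, mean, n, p in rows:
    print(f"center={center:5d}  mean={mean:.2f}  n={n}  corrected P={p:.2e}")
```

With these hypothetical values, the window with the highest average holds two elements, while the window with the smallest corrected P value holds three, which mirrors the behavior described for panels (c) and (d).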