Soft clustering method. (a) Standard clustering based on expression only: two sets of orthologs are depicted (color represents orthology, shape represents species), with the orthologs split between clusters 1 and 2. For illustrative purposes, only two time points (t and t + 1) are shown. (b) Soft clustering based on expression and orthology: dashed circles denote regions where orthologs will be co-clustered. Since the purple square has no orthologs in cluster 1, it remains assigned to cluster 2. (c) Effect of the number of clusters k and the orthology weight W on GO term enrichment. (d) The number of enriched GO terms, variance, and fraction of co-clustered orthologs for k = 17 as a function of W, compared with randomized paralogs/orthologs. Randomization was performed as described in Additional file 1: Randomizing the Orthology Mapping. (e) Because k-means is non-deterministic, we performed 50 runs of the algorithm to ensure robustness, recording the fraction of runs in which each gene pair was co-clustered (including all genes from all species). The resulting matrix was hierarchically clustered.
Kuo et al. Genome Biology 2010 11:R77 doi:10.1186/gb-2010-11-7-r77
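The robustness procedure in panel (e) can be illustrated with a minimal sketch. It assumes plain Lloyd's k-means on expression vectors alone and does not implement the orthology-weighted soft clustering of panel (b). The function names `kmeans_labels` and `coclustering_fraction`, the iteration count, and the toy data are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means (an assumption, not the paper's soft variant).

    Returns one cluster label per point.
    """
    centers = rng.sample(points, k)  # random initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def coclustering_fraction(points, k, n_runs=50, seed=0):
    """Fraction of k-means runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels = kmeans_labels(points, k, rng)
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    # A point always co-clusters with itself, so the diagonal is 1.0.
    return [[1.0 if i == j else counts[i][j] / n_runs for j in range(n)]
            for i in range(n)]


# Toy "expression profiles" at two time points (hypothetical data):
# three genes near the origin and three genes far away.
genes = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0),
         (100.0, 100.0), (101.0, 100.5), (100.5, 101.0)]
fractions = coclustering_fraction(genes, k=2, n_runs=50)
```

The resulting matrix could then be passed to a hierarchical clustering routine, for example by treating one minus the co-clustering fraction as a distance between genes.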