Graphical illustration of the heuristic fuzzy partition algorithm. (a) Hypothetically, each element (gene) can be positioned in a virtual two-dimensional space, based on its characteristics (annotation terms). The distance represents the degree of relationship (kappa score) among the genes. (b) Any gene has a chance as a medoid to form an initial seeding group. Only the initial groups with enough closely related members (for example, members >3 and kappa score ≥0.4) are qualified (solid-line circle). Conversely, unqualified ones are shown as dashed-line circles. (c) Every qualified initial seeding group is iteratively merged with each other to form a larger group based on the multi-linkage rule, that is, sharing 50% or more of memberships, until all secondary clusters (thicker oval) are stable. Importantly, the genes not covered by any qualified initial seeding group are considered as outliers (in gray). (d) Finally, three final groups (thicker ovals) are formed because they can no longer be merged with any other group. One gene (in red) belonging to two groups represents the fuzziness capability of the algorithm. And outliers (in gray in (c)) are removed for clearer presentation. A step-by-step example can be found in Additional data file 13.
Huang et al. Genome Biology 2007 8:R183 doi:10.1186/gb-2007-8-9-r183