An illustration of rare variant association tests. Cases and controls from a hypothetical complex disease exome sequencing project are depicted. The horizontal bars indicate aligned exome sequences for individuals; stars indicate the presence of a non-reference allele. Variants 1 and 4 represent low-frequency variants found predominantly in cases, Variant 2 represents a singleton, Variant 3 represents a common variant, and Variant 5 represents a low-frequency variant exclusive to controls. For simplicity, these variants are displayed with similar frequency, although very rare variants represent the majority of variation in real sequencing studies. As illustrated, the specific genetic architecture underlying the complex phenotype of interest is expected to strongly influence which test is most powerful for detecting an association. Collapsing methods may be best if a burden of rare variants drives the phenotype, whereas aggregation methods may be more powerful if the full allelic spectrum is contributory. Finally, for genes harboring both risk and protective alleles, bidirectional tests may be most appropriate. See Additional file 1 for examples of methods of each type. MAF, minor allele frequency.
Stitziel et al. Genome Biology 2011 12:227 doi:10.1186/gb-2011-12-9-227
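The collapsing strategy described in the caption can be made concrete with a minimal sketch. It is not any specific method listed in Additional file 1. It assumes presence/absence genotypes (one star per carrier, as in the figure), uses carrier frequency in the pooled sample as a stand-in for MAF, and picks an arbitrary threshold of 0.2. The function names, the threshold, and the toy data are all hypothetical and chosen only to mirror the five variants in the figure.

```python
from math import comb

def carrier_frequencies(genotypes):
    """Fraction of individuals carrying a non-reference allele at each variant."""
    n = len(genotypes)
    return [sum(row[j] for row in genotypes) / n for j in range(len(genotypes[0]))]

def collapse_carriers(genotypes, rare_idx):
    """Collapse: 1 if an individual carries any of the selected rare variants."""
    return [int(any(row[j] for j in rare_idx)) for row in genotypes]

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def collapsing_test(cases, controls, max_freq):
    """Compare rare-variant carrier counts between cases and controls."""
    freqs = carrier_frequencies(cases + controls)
    rare_idx = [j for j, f in enumerate(freqs) if f <= max_freq]
    case_carriers = sum(collapse_carriers(cases, rare_idx))
    ctrl_carriers = sum(collapse_carriers(controls, rare_idx))
    table = (case_carriers, len(cases) - case_carriers,
             ctrl_carriers, len(controls) - ctrl_carriers)
    return table, fisher_exact_two_sided(*table)

# Toy data (hypothetical). Columns are Variants 1-5 from the figure:
# 1 and 4 low-frequency in cases, 2 a singleton, 3 common, 5 only in controls.
cases = [
    [1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
controls = [
    [0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]

table, p_value = collapsing_test(cases, controls, max_freq=0.2)
print(table, round(p_value, 4))
```

In this toy data the common Variant 3 is filtered out, while the control-exclusive Variant 5 still counts toward the control carrier total. A one-directional collapse therefore loses signal when a gene harbors both risk and protective alleles, which is the situation the caption says may favor bidirectional tests.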