Open Access Research

Mixture modeling of transcript abundance classes in natural populations

Wen-Ping Hsieh1,2, Gisele Passador-Gurgel1, Eric A Stone3 and Greg Gibson1*

Author Affiliations

1 Department of Genetics, Gardner Hall, North Carolina State University, Raleigh, North Carolina 27695-7614, USA

2 Department of Statistics, 825 General Building III, National Tsing Hua University, Kuang-Fu Road, Hsinchu, 30013, Taiwan

3 Department of Statistics, and Bioinformatics Research Center, 1500 Partners II Building, 840 Main Campus Drive, North Carolina State University, Raleigh, North Carolina 27695, USA

Genome Biology 2007, 8:R98  doi:10.1186/gb-2007-8-6-r98

Published: 4 June 2007



Populations diverge in genotype and phenotype under the influence of such evolutionary processes as genetic drift, mutation accumulation, and natural selection. Because genotype maps onto phenotype by way of transcription, it is of interest to evaluate how these evolutionary factors influence the structure of variation at the level of transcription. Here, we explore the distributions of cis-acting and trans-acting factors and their relative contributions to expression of transcripts that exhibit two or more classes of abundance among individuals within populations.


Expression profiling using cDNA microarrays was conducted in Drosophila melanogaster adult female heads for 58 nearly isogenic lines from a North Carolina population and 50 from a California population. Using a mixture modeling approach, transcripts were identified that exhibit more than one mode of transcript abundance across the samples. Power studies indicate that sample sizes of 50 individuals will generally be sufficient to detect divergent transcript abundance classes. The distribution of transcript abundance classes is skewed toward low frequency minor classes, which is reminiscent of the typical skew in genotype frequencies. Similar results are observed in reported data on gene expression in human lymphoblast cell lines, in which analysis of association with linked polymorphisms implies that cis-acting single nucleotide polymorphisms make only a modest contribution to bimodal distributions of transcript abundance.
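The idea of detecting more than one mode of transcript abundance can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the mixture model or the selection criterion, so the Gaussian components, the expectation-maximization (EM) fit, the median-split initialization, and the use of the Bayesian information criterion (BIC) are all assumptions made here for illustration, as are the function names `fit_one_class` and `fit_two_classes`. For a single transcript, the sketch compares a one-class model with a two-class model across individuals and reports the estimated class frequencies.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and s.d. sigma."""
    return (-0.5 * math.log(2.0 * math.pi * sigma * sigma)
            - (x - mu) ** 2 / (2.0 * sigma * sigma))


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a preferred model."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def fit_one_class(xs, min_sigma=1e-3):
    """Fit a single Gaussian (one abundance class) by maximum likelihood."""
    n = len(xs)
    mu = sum(xs) / n
    sigma = max(math.sqrt(sum((x - mu) ** 2 for x in xs) / n), min_sigma)
    loglik = sum(normal_logpdf(x, mu, sigma) for x in xs)
    # Two free parameters: one mean and one standard deviation.
    return {"mu": [mu], "sigma": [sigma], "weights": [1.0],
            "bic": bic(loglik, 2, n)}


def fit_two_classes(xs, n_iter=200, min_sigma=1e-3):
    """Fit a two-component Gaussian mixture with EM (illustrative assumption)."""
    n = len(xs)
    ordered = sorted(xs)
    lower, upper = ordered[: n // 2], ordered[n // 2:]
    # Assumed initialization: split at the median, share the overall s.d.
    mu = [sum(lower) / len(lower), sum(upper) / len(upper)]
    grand_mean = sum(xs) / n
    sd = max(math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n), min_sigma)
    sigma = [sd, sd]
    weights = [0.5, 0.5]

    def component_logs(x):
        return [math.log(weights[k]) + normal_logpdf(x, mu[k], sigma[k])
                for k in range(2)]

    for _ in range(n_iter):
        # E-step: posterior probability that each individual is in each class.
        resp = []
        for x in xs:
            logs = component_logs(x)
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: update class frequencies, means, and standard deviations.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), min_sigma)

    loglik = 0.0
    for x in xs:
        logs = component_logs(x)
        top = max(logs)
        loglik += top + math.log(sum(math.exp(v - top) for v in logs))
    # Five free parameters: two means, two s.d.s, one mixing proportion.
    return {"mu": mu, "sigma": sigma, "weights": weights,
            "bic": bic(loglik, 5, n)}
```

Under these assumptions, a transcript would be flagged as having two abundance classes when `fit_two_classes(values)["bic"]` is lower than `fit_one_class(values)["bic"]`, and the smaller entry of `weights` gives the estimated frequency of the minor class. Repeating the comparison on simulated data of different sample sizes is one hypothetical way to approximate the kind of power study mentioned above.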


Population surveys of gene expression may complement genetical genomics as a general approach to quantifying sources of transcriptional variation. Differential expression of transcripts among individuals is due to a complex interplay of cis-acting and trans-acting factors.