Variation in tissue-specific gene expression among natural populations

Abstract

Background

Variation in gene expression is extensive among tissues, individuals, strains, populations and species. The interactions among these sources of variation are relevant for physiological studies such as disease or toxic stress; for example, it is common for pathologies such as cancer, heart failure and metabolic disease to be associated with changes in tissue-specific gene expression or changes in metabolic gene expression. But how conserved these differences are among outbred individuals and among populations has not been well documented. To address this we examined the expression of a selected suite of 192 metabolic genes in brain, heart and liver in three populations of the teleost fish Fundulus heteroclitus using a highly replicated experimental design.

Results

Nearly half of the genes (48%) were differentially expressed among individuals within a population-tissue group and 76% were differentially expressed among tissues. Differences among tissues reflected well established tissue-specific metabolic requirements, suggesting that these measures of gene expression accurately reflect changes in proteins and their phenotypic effects. Remarkably, only a small subset (31%) of tissue-specific differences was consistent in all three populations.

Conclusions

These data indicate that many tissue-specific differences in gene expression are unique to one population and thus are unlikely to contribute to fundamental differences between tissue types. We suggest that those subsets of treatment-specific gene expression patterns that are conserved between taxa are most likely to be functionally related to the physiological state in question.

Background

The regulation of gene expression varies extensively among tissues, individuals, strains, populations and species [1–6] and variation in gene expression has a genetic basis [7, 8]. Despite such biological variance, differences in gene expression are used to describe cancers [9–12], heart failure [13, 14] and metabolic diseases [15]. It is common for these pathologies to be associated with changes in tissue-specific gene expression or changes in metabolic gene expression. For example, many different cancers have unique tissue-specific patterns of gene expression [16], and thyroid cancers are associated with increases in aerobic metabolic gene expression [17].

Although tissue-specific gene expression patterns are often used as a method to identify functionally relevant genes, how conserved these differences are among outbred individuals and among populations has not been well documented. It is possible that many of these changes represent polymorphism among individuals or populations and are not specifically associated with disease. To address this we used a well established system (tissue-specific gene expression) and genes with well defined function and tissue-specific distributions (metabolic genes).

Given the high variance in gene expression among individuals and populations, our goal was to examine the conservation of tissue-specific gene expression among populations of the same species. Specifically, we assessed the among-population variance of tissue-specific patterns of gene expression (in brain, heart and liver) in the teleost fish Fundulus heteroclitus. A cDNA microarray was used to measure levels of expression in normal healthy male fish for 192 genes involved in central metabolic pathways. We used this compact array in order to impose a high degree of technical and biological replication (24 replicates for each of three tissues from nine individuals with two samples per array). Also, this array was used because metabolic genes are essential, are known to have tissue-specific expression, especially in fish, and are often misused as controls with little characterization of variation in expression among individuals or tissues. Analysis of variance (ANOVA) was used as a statistical test to determine which genes were differentially expressed among tissues and populations. Tissue-specific patterns of gene expression were compared among populations. As expected, we detected extensive variation in gene expression among tissues. Unexpectedly, only a fraction (31%) of tissue-specific differences was conserved between all populations.

Results

Variation among individuals

Variation among individuals within groups was high (groups included the nine tissue-by-population groupings; Figure 1). Nearly half of genes (92 genes, 48%) were differentially expressed (p < 0.05) among individuals within populations and tissues (Figure 1), and inter-individual differences ranged over fivefold.

Figure 1

Variation within individuals (technical variance) and among individuals within populations and tissues (biological variance) for each of 192 genes indicated by the mean square error (MS) of measurements. Points above the dashed line indicate genes with greater variance among individuals than within. F-crit is the critical value of the F-statistic (F = MSamong/MSwithin, with 12 and 27 degrees of freedom and α = 0.05) for testing significant differences in gene expression between individuals. For 48% of genes, MSamong/MSwithin > F-crit (solid red line). These genes are therefore differentially expressed among individuals within treatments.
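The F-ratio described in this caption can be illustrated with a minimal Python sketch. It assumes a simplified one-way layout (individuals as groups, replicate measures within each individual) rather than the split-plot model used in the study; the function name `f_ratio` and the example values are hypothetical.

```python
from statistics import mean

def f_ratio(groups):
    """Return F = MS_among / MS_within for a list of groups of replicate measures.

    Simplified one-way ANOVA; this is an illustration, not the study's split-plot model.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Variation of group means around the grand mean (among individuals)
    ss_among = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of replicate measures around their own group mean (within individuals)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_among / ms_within

# Hypothetical log2 expression values: three individuals, four replicate measures each
individuals = [
    [10.0, 10.1, 9.9, 10.0],
    [10.5, 10.6, 10.4, 10.5],
    [10.0, 9.9, 10.1, 10.0],
]
f_value = f_ratio(individuals)
```

A gene would be called differentially expressed among individuals when its F value exceeds the critical value for the chosen α and the model's degrees of freedom.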

Variation among tissues

Although variation among individuals was high, added variation due to tissues was significant. Considering 192 genes and a p-value threshold of 5%, one would expect fewer than 10 false-positive differences among tissues under the null hypothesis. We detected 76% of genes (146 of 192 genes) differentially expressed among brains, hearts and livers (ANOVA, p < 0.05). Selecting the α level at which differences between treatments are considered significant is problematic because of the large number of comparisons performed. As such, we present a volcano plot to illustrate the range of expression differences between tissues and associated p-values (Figure 2). When α is set at 0.01, 0.001 or at the Bonferroni-corrected value (2.6 × 10⁻⁴), the proportion of significant genes is 67% (129 genes), 50% (96) and 39% (75), respectively. Significant differences in expression ranged from less than 1.2-fold to nearly 16-fold (Figure 2). The predominant pattern of tissue-specific expression can be described by expression significantly different in the liver compared to the other two tissues (Figure 3).
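The multiple-testing arithmetic in this paragraph can be checked with a short Python sketch. The gene counts and thresholds come from the text; the variable names and the helper `percent` are illustrative only.

```python
n_genes = 192
alpha = 0.05

# Expected number of false positives if no gene truly differed among tissues
expected_false_positives = n_genes * alpha

# Bonferroni-corrected per-gene threshold
bonferroni_alpha = alpha / n_genes

def percent(count, total=n_genes):
    """Percentage of genes on the array, rounded to the nearest whole number."""
    return round(100 * count / total)

# Counts of significant genes reported in the text at each threshold
significant_counts = {"0.05": 146, "0.01": 129, "0.001": 96, "Bonferroni": 75}
proportions = {level: percent(n) for level, n in significant_counts.items()}
```

Even at the most conservative threshold, the number of significant genes far exceeds the number expected by chance alone.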

Figure 2

Volcano plot of differences between tissues and corresponding p-values. The difference in expression for each gene is the log2 ratio of tissue mean expression minus grand mean; a twofold difference in expression between tissues is indicated by one unit separation along the x-axis. p-values for differences in gene expression among tissues were calculated using ANOVA, and illustrated as -log10(p). A p-value of 10⁻⁴ is expressed as 4 on the y-axis, and the α = 0.05 threshold is indicated by the red dashed line (-log10(0.05) = 1.3).
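As a minimal sketch of how each point on such a plot could be computed, assuming expression values are already log2 transformed and that the logarithm on the y-axis is base 10 (the function name `volcano_point` and the example values are hypothetical):

```python
import math

def volcano_point(tissue_mean_log2, grand_mean_log2, p_value):
    """Return (x, y) plot coordinates for one gene in one tissue."""
    x = tissue_mean_log2 - grand_mean_log2   # one unit corresponds to a twofold difference
    y = -math.log10(p_value)                 # larger values are more significant
    return x, y

# Hypothetical gene: tissue mean one log2 unit above the grand mean, p = 1e-4
x, y = volcano_point(11.0, 10.0, 1e-4)

# Height of the alpha = 0.05 threshold line on the y-axis
threshold_y = -math.log10(0.05)
```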

Figure 3

Dendrogram of gene expression patterns across samples for genes significantly different between tissues (ANOVA, p < 0.05). Clustering indicates similar expression patterns among samples (top axis) and among genes (left axis). Samples cluster as livers (yellow), hearts (pink) and brains (blue). Genes involved in oxidative phosphorylation are highlighted in green, and expression patterns that are consistent across all three populations are highlighted with a blue triangle.

Many expected tissue-specific patterns emerged. For example, the brain-specific fatty-acid-binding protein was typically more highly expressed in the brain than in other tissues (p = 0.005), hepatocyte nuclear factor 4-alpha (a transcription factor) was more highly expressed in liver than in other tissues (p < 0.001), and two genes involved in glycerolipid metabolism (lipoprotein lipase and phospholipase XIII A2) were more highly expressed in liver than in other tissues (p < 0.001 for both genes).

Liver-specific expression accounted for 61% of the expression differences among tissues (Figure 4). Heart-specific and brain-specific expression accounted for 24% and 15% of differences among tissues, respectively. Regardless of population, expression patterns were typically most similar between heart and brain, and least similar between liver and heart (Figure 5). There were 67 genes printed on the array that code for proteins involved in oxidative phosphorylation, and 88% (59 genes) were differentially expressed between tissues (genes highlighted in green, Figure 3). Of differentially expressed oxidative phosphorylation genes, only 10% (six genes) were expressed more highly in the liver than in other tissues, whereas the remaining 90% (53 genes) had lower expression in the liver compared to brain or heart.

Figure 4

Number of genes differentially expressed among tissue groups for each population. Tissue-specific genes are those that are expressed more highly in a tissue than in the other tissues (for example, L > H, B) or lower in a tissue than in the other tissues (for example, L < H, B).

Figure 5

Similarity of expression patterns among tissues. (a) Proportion of 192 genes that are similarly expressed between heart and brain (black bar), brain and liver (gray bar) and liver and heart (white bar), for each population including Maine (ME), New Jersey (NJ) and Georgia (GA). (b) Neighbor-joining trees of global similarity of expression patterns among samples (L, liver; H, heart; B, brain) for each population. Distance between samples is the sum of differences of log2 expression values over all genes.
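The distance measure in panel (b) can be sketched in Python as follows. The caption does not say whether signed or absolute differences were summed; this sketch assumes absolute differences, and the function name and expression values are hypothetical.

```python
def expression_distance(sample_a, sample_b):
    """Sum of absolute differences of log2 expression values over all genes (assumed form)."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must cover the same genes")
    return sum(abs(a - b) for a, b in zip(sample_a, sample_b))

# Hypothetical log2 expression for three genes in each tissue
liver = [10.0, 8.0, 12.0]
heart = [11.0, 8.5, 10.0]
brain = [11.0, 8.0, 10.5]

distances = {
    ("liver", "heart"): expression_distance(liver, heart),
    ("heart", "brain"): expression_distance(heart, brain),
    ("liver", "brain"): expression_distance(liver, brain),
}
```

A pairwise distance table of this kind is the input a neighbor-joining procedure would use to build the tree.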

Variation among taxa

A small proportion of genes (six genes, 3%) differed in expression among populations (p < 0.05). However, it should be noted that although the split-plot design is powerful for detecting differences between split-plot factors (tissues), it is considered to have low power for detecting differences between blocks (populations) [18]. As such, it is likely that 3% is an underestimate of true among-population differences in gene expression. Indeed, two-way ANOVA (data not shown), which has higher power for detecting population differences but is less valid than the split-plot model for testing individual and tissue differences, detected among-population differences in expression for 18% of genes at p < 0.05, or 6.3% of genes at p < 0.01. Each tissue contributed a similar number of genes differentially expressed among populations.

Surprisingly, differences among tissues in gene expression were not consistent across all three populations. More than one-third (37%) of the genes differentially expressed between tissues were significant in only one of the three populations (Figure 6). Population-specific differences were distributed among the three populations; Georgia had 40% of the population-specific genes, and New Jersey and Maine had 34% and 26%, respectively. A proportion of these inconsistencies could be due to false-positive or false-negative differences between tissues in individual populations. However, statistically significant interaction between tissue and population was detected for many (30%) of these inconsistencies (see Additional data file 1).

Figure 6

Shared expression patterns among populations.

A relatively small proportion of tissue-specific genes (31%) have consistent expression patterns in all three populations (Figure 6; also see Additional data file 1 for details). This subset of genes also reflects the different metabolic status of brain, heart and liver; most of the genes involved in oxidative phosphorylation were more highly expressed in brain and heart than in liver (Figure 7a, Table 1), and most of the genes involved in fatty-acid metabolism, glycerolipid metabolism, steroid metabolism and detoxification were more highly expressed in liver. The majority of the tissue-specific genes were not consistent among populations (a subset of these genes are illustrated in Figure 7b, Table 1).

Figure 7

Gene expression in liver, brain and heart (three symbols for each line) for the three different populations (three lines per gene). Each letter represents a gene, expression values are log2 transformed and are indicated for liver, brain and heart (left to right) in each of Maine (circles), New Jersey (triangles) and Georgia (squares) populations. (a) Genes consistently different among tissues in all three populations are grouped as those involved in oxidative phosphorylation (upper panel) and those involved in other metabolic pathways (lower panel). (b) A representative subset of genes not consistently different among tissues in all populations. Gene names associated with letters are provided in Table 1 and Additional data file 1.

Table 1 Identity of tissue-specific genes with expression patterns consistent in all three populations, and those inconsistent in all three populations

Quality control

Variation among technical replicates was low, and permutation tests indicated that the ANOVA model was robust. Sample coefficients of variation (CV = (standard deviation/mean) × 100), which estimate technical variance due to replicate spots (six spots per hybridization), repeated measures (two hybridizations per dye), and dye (two dyes per sample), were calculated for each gene of each of the 27 samples. CVs were less than 5% for 95% of sample-gene combinations. Of the many comparisons performed (differences among tissues, populations, interaction), permutation test results agreed with ANOVA results (the same comparisons identified as significant or not significant) for 99.1% of comparisons, suggesting that our ANOVA model was robust.
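The coefficient of variation defined above can be sketched in Python. The use of the sample (n - 1) standard deviation is an assumption, as are the function name and the example signal values.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV as a percentage: (standard deviation / mean) x 100, using the sample standard deviation."""
    return stdev(values) / mean(values) * 100

# Hypothetical replicate signals for one gene in one sample
replicates = [100.0, 102.0, 98.0, 100.0]
cv = coefficient_of_variation(replicates)

# Fraction of a hypothetical set of CVs falling below the 5% cut-off
cvs = [1.2, 3.4, 4.9, 6.1]
fraction_below_5 = sum(c < 5 for c in cvs) / len(cvs)
```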

Discussion

Considerable variation occurs among the 27 samples (three tissues from each of three individuals from three populations) used to measure inter-individual and tissue-specific variation in gene expression. We are able to precisely describe the patterns of gene expression for 192 metabolic genes because of the low experimental variation; for 95% of the replicate measures of gene expression the standard deviation is less than 5% of the mean. Notably, gene expression is statistically different for many genes among individuals within a population for a tissue (48%), between tissues (76%), and between populations (3%). For genes with tissue-specific expression, only a fraction (31%) had expression patterns consistent across all three populations. These data do not specifically identify tissue-specific differences that are inconsistent across populations, but rather emphasize that tissue-specific differences detected can vary from one population to another. When measured from a single population, highly significant differences in tissue-specific expression do not necessarily represent genes relevant to general functional or morphological differences between tissues.

Variation among individuals

Variation in gene expression among healthy male individuals raised under controlled laboratory conditions was high. Nearly half of the metabolic genes (48%) were differentially expressed among individuals within a population for any one tissue (Figure 1), with fold differences ranging from 1.2- to 5-fold and p-values ranging down to 10⁻⁷. Differences in gene expression among individuals are unlikely to be due to common reversible environmental factors that affect physiological performance (acclimation effects) since all individuals used in this study were housed in a common environment and fed the same food for at least two months. However, the differences could be due to irreversible developmental effects or genetic variations that affect gene expression. Regardless of whether these differences are heritable or due to developmental plasticity, they represent variation one would expect to find among outbred organisms, including humans.

Other studies that have measured inter-individual differences in gene expression have also detected high levels of variation in a variety of taxa. Among crosses of different yeast strains a large number of differences in expression (6% of genes varying more than twofold) were detected between morphotypes [1]. A previous study of the same Maine and Georgia Fundulus populations assayed here detected 18% of genes differentially expressed among healthy individuals [3]. Although inter-individual variance in gene expression seems prevalent, our observation that 48% of genes are differentially expressed among individuals is high. This may reflect the greater precision of these measurements as a result of extensive technical replication (24 replicate measures per sample), as coefficients of variation for technical replicates were less than 5% for 95% of the genes. Indeed, using similar methods and tools, a concurrent study assessing variation in Fundulus also detected a very high proportion of genes (94%) differentially expressed among individuals [19]. Alternatively, since our array is heavily biased toward metabolic genes, detected variance may also reflect a greater variation in metabolic gene expression. We could speculate that the high variation in metabolic genes reflects a greater allowable variation. That is, there may be less selective pressure to constrain metabolic variation either because varying the amount of an enzyme does not affect metabolism or because variation in metabolism is phenotypically acceptable. One could test this by using an array with more comprehensive representation of the genome and comparing variances of different gene classes defined by function.

Considering the high inter-individual variation detected, the data presented here underscore the importance of including biological replicates within treatment groups in order to ascribe differences in expression to treatment rather than to inter-individual variation. Statistically, an analysis of variance can be used to examine the effects of technical and biological variation, and these tests have proved powerful for detecting significant differences in gene expression [3, 4], even differences as small as 1.2-fold. The cost of resources in microarray experiments should no longer excuse lack of biological and technical replication. Often, microarray experiments pool individual samples within treatment groups to capture biological variation. However, this approach only estimates an average level of expression and fails to estimate biological variation. When only small quantities of RNA can be extracted from samples, one can estimate biological variation by pooling multiple independent samples [20].

A variety of factors can contribute to differences in gene expression among individuals. Pritchard et al. [21] proposed that differences in immune status may explain the 3.3% difference in gene expression among genetically identical mice. Sex explained a large portion of among-individual variation in gene expression in Drosophila, whereas genotype was less of an influence, and the influence of age was weak [4]. Furthermore, this type of variation can be biologically relevant. For example, recent work in Fundulus indicates that most inter-individual variation in metabolism can be accounted for by differences in metabolic gene expression [19].

Variation among tissues

Another important source of biological variation in gene expression is differences in expression among different tissues; 76% of genes were differentially expressed between brain, heart and liver, and expression in the liver was the most distinct compared to heart and brain. In this study, genes printed on our array are primarily enzymes functional in central metabolic pathways such as fatty-acid metabolism, glycolysis and oxidative phosphorylation. Of the oxidative phosphorylation genes differentially expressed between tissues, 92% were more highly expressed in heart or brain than in liver (Figure 3). The primary purpose of the heart is to act as a pump, and contraction is highly dependent on oxidative metabolism [22]. The metabolic rate in the brain is 7.5 times the average rate in the rest of the body [23]. High metabolic demand in the brain supports pumping of ions across neuronal membranes during action potentials and metabolism is primarily oxidative. Mitochondria are the principal sites for oxidative phosphorylation, and are most numerous in heart, brain and skeletal muscle cells. The liver, in contrast, is much more functionally diverse, as it is involved in carbohydrate storage, synthesis of proteins, glucose, fatty acids, cholesterol and lipids, and metabolism of xenobiotics and endogenous compounds, and has a relatively low respiration rate. Accordingly, transcripts of genes functional in oxidative phosphorylation appear to represent a much smaller portion of the cell's RNA transcripts in liver tissues than in the heart or brain. In addition, genes involved in fatty acid and phospholipid synthesis were more highly expressed in liver than the other tissues. Differences in expression among tissues detected using our array appear to reflect differences in the metabolic status of brain, heart, and liver. Because data presented here support well established patterns of metabolism, they suggest that measuring mRNA expression using microarrays accurately reflects changes in proteins and their phenotypic effect.

Many microarray studies have used expression levels of 'housekeeping' genes as an internal control for comparisons among arrays, individuals and treatments. Housekeeping genes may be defined as those that are involved in routine cellular metabolism and always expressed in all cells. Accordingly, many, if not most, of the genes studied here could be considered housekeeping genes. Nearly half of these genes were expressed at different levels between individuals, with fold differences ranging from 1.2- to 5-fold and p-values ranging down to 10⁻⁷. Lee et al. [24] applied ANOVA to screen four previously published datasets for housekeeping genes across a variety of biological contexts. They found that all genes that are commonly used as controls had fold changes ranging from greater than 2.0 to more than 300 within at least one dataset, and coefficients of variation were concordantly high, reflecting high variance in expression of these genes. It appears that upon application of ANOVA, statistically significant differences in expression of housekeeping genes can be detected among individuals and across different biological contexts, and scaling for differences among arrays using expression levels of these genes ought to be approached with caution.

Although genes differentially expressed among tissues reflect their different metabolic requirements, it should be noted that the purpose of the current study was not to comprehensively identify suites of genes responsible for functional differences between tissues. The relatively small number of printed probes was useful for a high degree of technical replication, and obviously represents a small portion of the expressed genes. However, this approach shows that highly significant differences in gene expression among tissues may be apparent but not consistent among closely related taxa. Therefore, highly significant differences in gene expression found only within a single population may not necessarily represent genes relevant to general functional or morphological differences between tissues.

Variation among taxa

Although the pattern of metabolic gene expression among tissues reflects established patterns of tissue-specific metabolism, there is additional variation due to population. It should be noted that the split-plot statistical design is not as powerful for detecting among-block differences (among populations) as for detecting differences among split-plot factors [18]. We detected 3% of genes (6 of 192) differentially expressed among populations. This proportion is similar to that detected in a previous study [3] in which 2.6% of genes were differentially expressed between Maine and Georgia Fundulus hearts. Similarly, approximately 1% of genes were differentially expressed in brain tissue among inbred strains of mice [2]. Differences in gene expression are to be expected among taxa (phylogenetically distinct groups of organisms which may include strains, populations or species), with the majority of differences most likely to be attributable to random genetic drift. For more distantly related groups, one would expect expression patterns to be more divergent than for closely related groups. Indeed, expression patterns between humans and chimpanzees are more similar than those between humans and orangutans, and similar results were obtained from comparisons among three mouse species [5, 6].

An unexpected finding is that the tissue-specific differences depend on which population was assayed. Differences in gene expression are expected between tissues because of functional divergence and between populations because of neutral genetic divergence. In addition, one might expect that the number of genes significantly different between populations would depend on the tissue. One might also expect tissue-specific differences to be consistent in all taxa. Yet our data indicate that tissue-specific expression patterns are not fixed within a species. The genes for which expression is significantly different between tissues are not all the same in all three populations. Of the 128 genes that have tissue-specific patterns of expression in any population, 37% are tissue-specific in only one of the three populations and 32% are found in only two of the three populations. Overall, it would appear that only 31% of tissue-specific differences in gene expression are consistent among all populations of F. heteroclitus. One needs to be careful about this interpretation, however. Our emphasis was not to specifically identify genes that have significant interaction between tissue and population. Rather, we emphasize that genes detected as tissue specific will vary from one population to another, and most microarray studies measure treatment-specific expression patterns in only one population of test organism. Because inter-individual variation is high, it is probable that inclusion of more replicate individuals in each group would increase the sensitivity of ANOVA, and the number of genes that distinguish tissues consistently in all populations may change.

The consistent tissue-specific differences still support expectations based on the metabolic requirements of each tissue (for example, genes involved in oxidative phosphorylation were more highly expressed in heart and brain, and those involved in fatty-acid and lipid metabolism were more highly expressed in the liver; Figure 7a). Accordingly, those differences in expression that are consistent across several groups of organisms are most likely to account for functional and morphological differences among tissues, emphasizing that this type of comparative approach may be powerful for testing the biological relevance of other functional traits. For example, expression differences between diseased and non-diseased tissues may vary among mouse strains, so that the subset of differences that are consistent across strains are more likely to be functionally related to the diseased state.

Our data suggest that many of the differences in gene expression detected between experimental groups may be of little functional importance because they vary among taxa. We suggest that patterns of expression that are consistent in different populations are more likely to be functionally important. Elucidation of adaptively important variation, such as variation related to antibiotics, pesticides or temperature adaptation, may also benefit from such a comparative approach that screens for conserved patterns. However, there is the possibility that partitioning of genetic polymorphisms among populations may allow distinct groups of organisms to reach different physiological or biochemical solutions to the same biological challenges. For example, patterns of polymorphism in a gene that regulates coat color in mammals indicated recent directional selection and were associated with coat color in one pocket mouse population, but not in a second population [25]. Other loci were probably responsible for adaptive variation in coat color in the second population.

Conclusions

First, these data indicate high variation in metabolic gene expression among individuals, and thus expression of these housekeeping genes is unreliable as an internal control or as a method of normalization across samples. Second, concordance between tissue-specific expression patterns and established metabolic functions of brain, heart and liver indicates that measuring mRNA levels accurately reflects physiological status. Furthermore, since many metabolic genes differ in expression among brain, heart and liver, those studies using whole organisms need to rule out the possibility that changes in expression reflect differences in the proportions of various tissues among samples. Finally, studies seeking to identify patterns of gene expression related to physiological states, such as disease or toxic stress, must consider both variation between individuals and differences between populations. Because of this biological variation, not all differences between treatments in any one population of test organism are likely to be generally relevant. We suggest that conserved patterns of treatment-specific gene expression among taxa are most likely to be functionally related to the physiological state in question.

Methods and materials

Animals and maintenance

Teleost fish Fundulus heteroclitus were collected from the field by seine and minnow trap in June 2003, transported to the University of Miami RSMAS laboratory under controlled temperature and aeration conditions, and acclimated to common conditions (20°C, 15 parts per thousand salinity) in recirculating 100-gallon tanks for at least two months before experiments. Fish were sacrificed by cervical dislocation and tissues were excised and stored in RNAlater (Ambion) at -20°C. Fish were collected at Wiscasset, Maine; Stone Harbor, New Jersey, and Sapelo Island, Georgia. Only healthy male fish were used for the following experiments.

Microarrays

Microarrays were printed using 192 cDNAs from a F. heteroclitus cardiac library encoding essential proteins for cellular metabolism [26]. These cDNAs were a subset of over 40,000 expressed sequences in our online database Funnybase [27]. These 192 cDNAs were amplified with amine-linked primers and printed on 3-D Link Activated slides (Surmodics) using a SpotArray Enterprise piezoelectric microarray printer (PerkinElmer Life Sciences) at Louisiana State University. Slides were blocked following slide manufacturer protocols. The suite of 192 amplified cDNAs was printed as a group in six spatially separated replicates. Four hybridization zones of these six replicate arrays were printed per slide, with each zone set separated by a hydrophobic barrier.

Hybridization experimental design

Microarray analyses were applied to three tissues (brain, heart and liver) from three individuals collected from three populations of F. heteroclitus. Each of these 27 samples was measured four times, twice with Cy3 and twice with Cy5 (Figure 8). In addition, since a hybridization zone covered six replicate printed arrays, total experimental replication per sample per gene was 24-fold. A total of 108 hybridizations were performed (27 × 4), and Cy3-Cy5 hybridizations were balanced (although incompletely) among tissues and populations in a sheet-loop design (Figure 8).
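The replication arithmetic in this paragraph can be verified with a few lines of Python. All counts are taken from the text; the variable names are illustrative.

```python
# Design figures stated in the text
tissues = 3
populations = 3
individuals_per_population = 3
dyes = 2                      # Cy3 and Cy5
hybridizations_per_dye = 2    # each sample measured twice with each dye
spots_per_zone = 6            # replicate printed arrays under one hybridization zone

samples = tissues * populations * individuals_per_population
measurements_per_sample = dyes * hybridizations_per_dye
replication_per_gene = measurements_per_sample * spots_per_zone
sample_measurements = samples * measurements_per_sample
```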

Figure 8

Experimental design for hybridizations. Each arrow represents an array hybridization, with the samples at arrow base and head labeled with Cy3 and Cy5, respectively. Liver, heart and brain samples are indicated as purple, red and blue circles, respectively. Three individuals were assayed per tissue and from each of three populations. ME, Maine; NJ, New Jersey; GA, Georgia.

Sample preparation

RNA was extracted from tissue homogenate in a chaotropic buffer using phenol/chloroform/isoamyl alcohol. All reagents were from Sigma unless otherwise noted. Tissues were removed from RNAlater, blotted dry, and homogenized using an electric homogenizer in 400 μl chaotropic buffer (4.5 M guanidinium thiocyanate, 2% N-lauroylsarcosine, 50 mM EDTA pH 8.0, 25 mM Tris-HCl pH 7.5, 0.1 M β-mercaptoethanol, 2% antifoam A). An equal volume of 2 M sodium acetate (pH 4.0) was added to the homogenate, followed by 400 μl acidic phenol (pH 4.4), and 120 μl chloroform/isoamyl alcohol (23:1). The mixture was kept at 4°C for 10 min then centrifuged at 4°C at 16,000g for 20 min. Supernatant was removed and combined with 400 μl isopropanol, stored at -20°C for 30 min, then centrifuged at 4°C at 16,000g for 30 min. The remaining RNA pellet was rinsed twice with 400 μl of 70% ethanol, then further purified using the Qiagen RNeasy Mini kit (Qiagen) following the manufacturer's protocols. Purified RNA was quantified spectrophotometrically, and RNA quality was assessed using the Agilent 2100 Bioanalyzer. RNA was stored in 1/10 volumes 3 M sodium acetate and 2.5 volumes 100% ethanol at -20°C.

RNA for hybridization was prepared by amplification using a modified Eberwine protocol [28]. The Ambion Amino Allyl MessageAmp aRNA Kit was used (according to the manufacturer's protocols) to copy template RNA by T7 amplification following incorporation of a T7 promoter, resulting in amplified template in the form of antisense RNA (aRNA). Amino-allyl UTP was incorporated into targets during T7 transcription, and the resulting amino-allyl aRNA was coupled to Cy3 and Cy5 dyes (Amersham Biosciences).

Hybridization

Labeled aRNA aliquots of the two individual samples for each hybridization (18 pmol each of Cy3 and Cy5) were vacuum dried together and resuspended in 12 μl hybridization buffer (final concentration of each labeled sample = 1.5 pmol/μl). Hybridization buffer consisted of 5 × SSPE, 1% SDS, 50% formamide, 1 mg/ml poly(A), 1 mg/ml sheared herring sperm carrier DNA, and 1 mg/ml BSA. Slides were washed in sodium borohydride solution according to Raghavachari et al. [29] to reduce autofluorescence. Following rinsing, slides were boiled for 2 min and spin-dried in a centrifuge at 800 rpm for 3 min. Samples (12 μl) were heated to 90°C for 2 min, quick cooled to 42°C, applied to the slide (hybridization zone area was 350 mm²), and covered with a coverslip. Slides were placed in an airtight chamber humidified with paper soaked in 1 × SSC buffer and incubated for 12-18 h at 42°C. Following hybridization, slides were scanned using the Packard Bioscience ScanArray Express microarray scanner (PerkinElmer Life Sciences). The resulting .tiff images were imported into spot grids built in ImaGene (Biodiscovery) for each array, and spot signals were collected as fluorescence intensities for each dye channel.

Data processing and statistical analysis

Raw data were first sum normalized [30], which involves scaling the signal from each replicate array so that its total sums to the same value. Spatial bias on each array was then smoothed using a lowess transformation in MAANOVA Version 0.93-2 for R [31]. Other methods of normalization have also been proposed [32–34]. Log2 values of lowess-transformed sum-normalized data were used for all subsequent statistical analyses. MIAME-compliant data [35] have been submitted to the Gene Expression Omnibus as accession number GPL1224. Data were analyzed in a split-plot ANOVA design with populations as blocks and tissues as split-plot factors using scripts written in MATLAB Version 6 (The MathWorks). MATLAB code is available upon request from the authors. Technical replicates were nested within tissue-by-population samples: replicate spots within hybridization (six), replicate hybridizations per labeling (two) and replicate labelings per sample (two; Cy3 and Cy5) represent the three levels of technical variance. The ANOVA structure is presented in Figure 9 and Table 2, and the model can be written as:

Figure 9

Split-plot ANOVA statistical design. Populations (ME, Maine; NJ, New Jersey; GA, Georgia) are treated as blocks, replicate individuals within each population (1, 2 and 3) as plots, and tissue (L, liver; H, heart; B, brain) within an individual as the split-plot factor. Nested within each tissue-by-individual sample are technical replicates including two dyes (Cy3 and Cy5) within each sample, two replicate hybridizations (A and B) per dye, and six replicate spots per hybridization. GM, grand mean.

Table 2 Sources of variance and calculation of variables for the split-plot ANOVA statistical design [18]

y = grand mean + population + tissue + population-tissue interaction + individual in population + tissue-by-individual within population + dye within individual + hybridization within dye + spot within hybridization

where y is the normalized log2 expression and individual in population and tissue-by-individual within population are random effects. To test for differences among multiple means (for example, among population and tissue groups), and to correct for multiple comparisons, the T-method [36] was applied. The T-method calculates the minimum significant range defined as

$$\mathrm{MSR} = Q_{\alpha[k,v]} \times \mathrm{SE}$$

where the critical value $Q_{\alpha[k,v]}$ is the studentized range [37], k is the number of groups in the comparison (for example, if comparisons are among tissues then k = 3), v is the degrees of freedom of the mean square (MS) for tissue-by-individual within population, and SE is the standard error among tissue-by-individual samples within populations. Two group means that differ by more than the MSR are considered significantly different. The T-method following ANOVA was used to identify genes differentially expressed among tissues in each population. These data were then used to contrast tissue-specific and population-specific expression patterns. Robustness of the ANOVA results was tested using a permutation test: means for the 27 biological samples were randomly permuted 1,000 times between population and tissue, and test statistics were recalculated for differences among populations, tissues and tissue-by-population interaction. Agreement between ANOVA and permutation test results would indicate the robustness of the ANOVA model. Finally, in order to graphically illustrate expression similarity among tissues, expression distance between samples was calculated as the sum of differences of log2 expression values over all genes, and neighbor-joining trees of global similarity of expression patterns among tissues (L, liver; H, heart; B, brain) were constructed [38] for each population.
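The T-method comparison and the permutation check described above can be illustrated with a minimal Python sketch. It is not the MATLAB code used for the analysis, and it makes several assumptions that are not in the text: the standard error is taken as the square root of the error mean square divided by the number of observations per group, the critical value is supplied by the user from a studentized range table, and the permutation test uses a simple one-way F statistic for a single factor in place of the full split-plot model. All function names and numerical inputs are hypothetical.

```python
import math
import random
from itertools import combinations


def minimum_significant_range(q_crit, ms_error, n_per_group):
    """MSR = Q * SE, assuming SE = sqrt(MS_error / n_per_group)."""
    return q_crit * math.sqrt(ms_error / n_per_group)


def significant_pairs(group_means, msr):
    """Return the pairs of groups whose means differ by more than the MSR."""
    return [
        (a, b)
        for a, b in combinations(sorted(group_means), 2)
        if abs(group_means[a] - group_means[b]) > msr
    ]


def f_statistic(values, labels):
    """One-way ANOVA F statistic (a simplification of the split-plot model)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(values, labels, n_perm=1000, seed=0):
    """Fraction of label permutations with F at least as large as observed."""
    rng = random.Random(seed)
    observed = f_statistic(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_statistic(values, shuffled) >= observed:
            hits += 1
    return hits / n_perm


if __name__ == "__main__":
    # Hypothetical log2 tissue means for one gene; q_crit is an illustrative
    # tabled value supplied by the user, not a number from the paper.
    means = {"brain": 10.0, "heart": 11.5, "liver": 10.2}
    msr = minimum_significant_range(q_crit=3.77, ms_error=0.25, n_per_group=9)
    print(round(msr, 3), significant_pairs(means, msr))
```

Passing the critical value in as an argument mirrors the use of printed statistical tables [37]. A gene whose permutation p-value agrees with its ANOVA p-value would be treated as robustly tested.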

Additional data files

The following additional data are available with the online version of this paper. Additional data file 1 lists the results from statistical analyses for all genes. Listed for each gene are p-values associated with statistical tests for differences in expression between populations, tissues, tissue-by-population interaction, and among individuals within populations. Also listed are mean expression for each sample, and columns comparing differences in expression between tissues within each population. Final columns tabulate whether a tissue difference was detected for each comparison, whether this difference was consistent between populations, and whether significant interaction was detected for that gene.

References

  1. Cavalieri D, Townsend JP, Hartl DL: Manifold anomalies in gene expression in a vineyard isolate of Saccharomyces cerevisiae revealed by DNA microarray analysis. Proc Natl Acad Sci USA. 2000, 97: 12369-12374. 10.1073/pnas.210395297.

  2. Sandberg R, Yasuda R, Pankratz DG, Carter TA, Del Rio JA, Wodicka L, Mayford M, Lockhart DJ, Barlow C: Regional and strain-specific gene expression mapping in the adult mouse brain. Proc Natl Acad Sci USA. 2000, 97: 11038-11043. 10.1073/pnas.97.20.11038.

  3. Oleksiak MF, Churchill GA, Crawford DL: Variation in gene expression within and among natural populations. Nat Genet. 2002, 32: 261-266. 10.1038/ng983.

  4. Jin W, Riley RM, Wolfinger RD, White KP, Passador-Gurgel G, Gibson G: The contributions of sex, genotype and age to transcriptional variance in Drosophila melanogaster. Nat Genet. 2001, 29: 389-395. 10.1038/ng766.

  5. Enard W, Khaitovich P, Klose J, Zoellner S, Heissig F, Giavalisco P, Nieselt-Struwe K, Muchmore E, Varki A, Ravid R, et al: Intra- and interspecific variation in primate gene expression patterns. Science. 2002, 296: 340-343. 10.1126/science.1068996.

  6. Hsieh WP, Chu TM, Wolfinger RD, Gibson G: Mixed-model reanalysis of primate data suggests tissue and species biases in oligonucleotide-based gene expression profiles. Genetics. 2003, 165: 747-757.

  7. Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen KY, Morley M, Spielman RS: Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet. 2003, 33: 422-425. 10.1038/ng1094.

  8. Brem RB, Yvert G, Clinton R, Kruglyak L: Genetic dissection of transcriptional regulation in budding yeast. Science. 2002, 296: 752-755. 10.1126/science.1069516.

  9. Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B, Kinzler KW: Gene expression profiles in normal and cancer cells. Science. 1997, 276: 1268-1272. 10.1126/science.276.5316.1268.

  10. Elek J, Park KH, Narayanan R: Microarray-based expression profiling in prostate tumors. In Vivo. 2000, 14: 173-182.

  11. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JG, Sabet H, Tran T, Yu X, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000, 403: 503-511. 10.1038/35000501.

  12. Paez JG, Janne PA, Lee JC, Tracy S, Greulich H, Gabriel S, Herman P, Kaye FJ, Lindeman N, Boggon TJ, et al: EGFR mutations in lung cancer: correlation with clinical response to Gefitinib therapy. Science. 2004, 304: 1497-1500. 10.1126/science.1099314.

  13. Archacki SR, Angheloiu G, Tian X-L, Tan FL, DiPaola N, Shen G-Q, Moravec C, Ellis S, Topol EJ, Wang Q: Identification of new genes differentially expressed in coronary artery disease by expression profiling. Physiol Genomics. 2003, 15: 65-74.

  14. Iemitsu M, Miyauchi T, Maeda S, Sakai S, Fujii N, Miyazaki H, Kakinuma Y, Matsuda M, Yamaguchi I: Cardiac hypertrophy by hypertension and exercise training exhibits different gene expression of enzymes in energy metabolism. Hypertens Res. 2003, 26: 829-837. 10.1291/hypres.26.829.

  15. Kunz WS: Different metabolic properties of mitochondrial oxidative phosphorylation in different cell types - important implications for mitochondrial cytopathies. Exp Physiol. 2003, 88: 149-154. 10.1113/eph8802512.

  16. Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang CH, Angelo M, Ladd C, Reich M, Latulippe E, Mesirov JP, et al: Multiclass cancer diagnosis using tumor gene expression signatures. Proc Natl Acad Sci USA. 2001, 98: 15149-15154. 10.1073/pnas.211566398.

  17. Baris O, Savagner F, Nasser V, Loriod B, Granjeaud S, Guyetant S, Franc B, Rodien P, Rohmer V, Bertucci F, et al: Transcriptional profiling reveals coordinated up-regulation of oxidative metabolism genes in thyroid oncocytic tumors. J Clin Endocrinol Metab. 2004, 89: 994-1005. 10.1210/jc.2003-031238.

  18. Steel RGD, Torrie JH: Principles and Procedures of Statistics. 1980, New York, NY: McGraw-Hill, 2nd edition.

  19. Oleksiak MF, Roach JL, Crawford DL: Natural variation in cardiac metabolism and gene expression in Fundulus heteroclitus. Nat Genet. 2005, 37: 67-72.

  20. Kendziorski CM, Zhang Y, Lan H, Attie AD: The efficiency of pooling mRNA in microarray experiments. Biostatistics. 2003, 4: 465-477. 10.1093/biostatistics/4.3.465.

  21. Pritchard CC, Hsu L, Delrow J, Nelson PS: Project normal: defining normal variance in mouse gene expression. Proc Natl Acad Sci USA. 2001, 98: 13266-13271. 10.1073/pnas.221465998.

  22. Weiss L (Ed): Cell and Tissue Biology: A Textbook of Histology. 1983, Baltimore, MD: Urban and Schwarzenberg, 6th edition.

  23. Guyton AC: Textbook of Medical Physiology. 1991, Philadelphia: W.B. Saunders Company, 8th edition.

  24. Lee PD, Sladek R, Greenwood CMT, Hudson TJ: Control genes and variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies. Genome Res. 2002, 12: 292-297. 10.1101/gr.217802.

  25. Nachman MW, Hoekstra HE, D'Agostino SL: The genetic basis of adaptive melanism in pocket mice. Proc Natl Acad Sci USA. 2003, 100: 5268-5273. 10.1073/pnas.0431157100.

  26. Oleksiak MF, Kolell KJ, Crawford DL: Utility of natural populations for microarray analyses: isolation of genes necessary for functional genomic studies. Mar Biotechnol (NY). 2001, 3 (Supplement 1): S203-S211. 10.1007/s10126-001-0043-0.

  27. FunnyBase gene expression database. [http://genomics.rsmas.miami.edu/funnybase/super_craw4/]

  28. Van Gelder RN, Von Zastrow ME, Yool A, Dement WC, Barchas JD, Eberwine JH: Amplified RNA synthesized from limited quantities of heterogeneous complementary DNA. Proc Natl Acad Sci USA. 1990, 87: 1663-1667.

  29. Raghavachari N, Bao YP, Li G, Xie X, Muller UR: Reduction of autofluorescence on DNA microarrays and slide surfaces by treatment with sodium borohydride. Anal Biochem. 2003, 312: 101-105. 10.1016/S0003-2697(02)00440-2.

  30. Quackenbush J: Microarray data normalization and transformation. Nat Genet. 2002, 32 Suppl: 496-501. 10.1038/ng1032.

  31. Wu H, Kerr K, Cui X, Churchill GA: MAANOVA: a software package for the analysis of spotted cDNA microarray experiments. The Analysis of Gene Expression Data: Methods and Software. 2003, New York: Springer

  32. Chu TM, Weir B, Wolfinger R: A systematic statistical linear modeling approach to oligonucleotide array experiments. Math Biosci. 2002, 176: 35-51. 10.1016/S0025-5564(01)00107-9.

  33. Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. J Comput Biol. 2000, 7: 819-837. 10.1089/10665270050514954.

  34. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002, 30: e15. 10.1093/nar/30.4.e15.

  35. Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, et al: Minimum information about a microarray experiment (MIAME): toward standards for microarray data. Nat Genet. 2001, 29: 365-371. 10.1038/ng1201-365.

  36. Sokal RR, Rohlf FJ: Biometry. 2001, New York: W.H. Freeman, 3rd edition.

  37. Rohlf FJ, Sokal RR: Statistical Tables. 2002, New York: W.H. Freeman, 3rd edition.

  38. Kumar S, Tamura K, Jakobsen IB, Nei M: MEGA: Molecular Evolutionary Genetics Analysis, version 2.1. [http://www.megasoftware.net]

Acknowledgements

Much credit is due to Marjorie Oleksiak for construction of the expressed sequence tag (EST) library, array printing, and constructive criticisms. We also thank Steve Hand at Louisiana State University for the use of their facilities and assistance in printing the microarray. We thank Gary Churchill for statistical advice and Justin Paschall for EST database management and bioinformatics. Valuable assistance was provided by Jen Roach and Jeff VanWye, and Jen Roach provided helpful comments on the manuscript. We especially thank two reviewers for insightful criticisms and comments on the manuscript. This project was supported by a National Science Foundation OCE grant 0221879 to D.L.C. and a National Institutes of Health grant NHLBI R01 HL65470 to D.L.C.

Author information

Corresponding author

Correspondence to Andrew Whitehead.

About this article

Cite this article

Whitehead, A., Crawford, D.L. Variation in tissue-specific gene expression among natural populations. Genome Biol 6, R13 (2005). https://doi.org/10.1186/gb-2005-6-2-r13
