A comprehensive transcript index of the human genome generated using microarrays and computational approaches
- Equal contributors
1 Rosetta Inpharmatics LLC, 12040 115th Avenue NE, Kirkland, WA 98034, USA
2 Merck Research Laboratories, W42-213 Sumneytown Pike, POB 4, Westpoint, PA 19846, USA
3 Rally Scientific, 41 Fayette Street, Suite 1, Watertown, MA 02472, USA
4 Amgen Inc, 1201 Amgen Court W, Seattle, WA 98119, USA
5 The Scripps Research Institute, Jupiter, FL 33458, USA
Genome Biology 2004, 5:R73 doi:10.1186/gb-2004-5-10-r73
Published: 23 September 2004
Computational and microarray-based experimental approaches were used to generate a comprehensive transcript index for the human genome. Oligonucleotide probes designed from approximately 50,000 known and predicted transcript sequences from the human genome were used to survey transcription from a diverse set of 60 tissues and cell lines using ink-jet microarrays. Further, expression activity over at least six conditions was more generally assessed using genomic tiling arrays consisting of probes tiled through a repeat-masked version of the genomic sequence making up chromosomes 20 and 22.
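As a rough illustration of what tiling probes through a repeat-masked sequence can look like, the sketch below slides a fixed-length window along a masked sequence and keeps only windows free of masked bases. It is not the authors' probe-design pipeline. The function name `tile_probes`, the probe length, the step size, and the masking convention (masked bases written as `N` or in lowercase) are all assumptions made for the example.

```python
def tile_probes(masked_seq, probe_len=60, step=30):
    """Return (start, probe) pairs for windows that contain no masked bases.

    Assumptions (not taken from the paper): masked bases appear as 'N'
    or as lowercase letters, and probes are placed every `step` bases.
    """
    probes = []
    for start in range(0, len(masked_seq) - probe_len + 1, step):
        window = masked_seq[start:start + probe_len]
        # Skip any window that overlaps a masked (repeat) region.
        if "N" in window or any(base.islower() for base in window):
            continue
        probes.append((start, window))
    return probes


# Toy sequence: 20 unmasked bases, 4 masked bases, 20 unmasked bases.
toy = "ACGT" * 5 + "NNNN" + "ACGT" * 5
probes = tile_probes(toy, probe_len=8, step=4)
starts = [start for start, _ in probes]
```

With these toy settings, the windows starting at positions 16 and 20 overlap the masked stretch and are dropped, so no probe covers the masked region. A real design would also screen probes for properties such as melting temperature and cross-hybridization, which this sketch omits.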
The combination of microarray data with extensive genome annotations resulted in a set of 28,456 experimentally supported transcripts. This set of high-confidence transcripts represents the first experimentally driven annotation of the human genome. In addition, the results from genomic tiling suggest that a large amount of transcription exists outside of annotated regions of the genome and serve as an example of how this activity could be measured on a genome-wide scale.
These data represent one of the most comprehensive assessments of transcriptional activity in the human genome and provide an atlas of human gene expression over a unique set of gene predictions. Before the annotation of the human genome is considered complete, however, the previously unannotated transcriptional activity throughout the genome must be fully characterized.