Significance and context
The race between the private and publicly funded projects to sequence the entire human genome is now in full swing. The publication of the sequence of chromosome 21 represents a second milestone for the public Human Genome Project and attests to the strength of the map-directed sequencing approach and of international collaboration (this study involves researchers from six different countries). Chromosome 21 has been a focus of genome mapping efforts for at least two reasons: it is the smallest human autosome, and trisomy 21 causes Down syndrome, the most common form of mental retardation, which affects one in 700 births. Chromosome 21 also harbors more than 20 other disease-associated loci.
Hattori et al. report the isolation of 518 large-insert bacterial clones which span the whole of the long arm of human chromosome 21. They assembled these clones into four large contigs, with just three gaps of less than 40 kb each. They sequenced 33,546,361 bp of DNA, achieving 99.7% coverage (at 99.995% accuracy) and leaving only seven small sequencing gaps. Comparison of multiple sequences indicated that there was one sequence difference for each 787 bp. The researchers applied a number of sequence comparison and gene prediction programs in an attempt to catalog all the genes on chromosome 21. They divided this catalog into five categories: known human genes (127 genes); novel genes with extensive similarities (13 genes); novel genes with regional similarities (17 genes); novel anonymous genes defined by gene prediction (68 genes); and pseudogenes (59). Hence, approximately 41% of the predicted genes have no known function. The authors also report extensive statistical analysis, including characterization of chromosomal duplications and repeats, and variations in gene sizes and gene densities. For example, they confirm that gene-poor regions have few Alu sequences and low GC content, and that many genes are associated with CpG islands.
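The catalog arithmetic above can be checked with a short tally. This is only an illustrative sketch: the category counts are taken from the summary, while the variable names and the per-category percentages are introduced here for illustration and are not figures reported by the authors. In particular, the 41% figure quoted above rests on the authors' own functional criteria and is not derived by this tally.

```python
# Gene catalog categories and counts as given in the summary above.
# The dictionary keys are shorthand labels chosen for this sketch.
catalog = {
    "known human genes": 127,
    "novel genes, extensive similarities": 13,
    "novel genes, regional similarities": 17,
    "novel anonymous genes (prediction only)": 68,
}
pseudogenes = 59  # listed separately; not counted as genes here

# Sum of the four gene categories.
total_genes = sum(catalog.values())

# Share of each category as a percentage of the gene total.
shares = {name: 100 * count / total_genes for name, count in catalog.items()}

for name, count in catalog.items():
    print(f"{name}: {count} ({shares[name]:.1f}%)")
print(f"total genes: {total_genes}; pseudogenes: {pseudogenes}")
```

The four gene categories sum to 225, the gene total quoted for chromosome 21; the 59 pseudogenes are additional to that figure.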
Information on mapping and sequencing chromosome 21 is available from the various centers involved: the Human chromosome 21 project at the Max-Planck-Institut für Molekulare Genetik, Berlin; the Human Genome Research Group at the RIKEN Genomics Center, Japan; the Genome Sequencing Center at the Institute of Molecular Biology, Jena; Advanced Lifescience Information Systems (ALIS) at the Japan Science and Technology Corporation (JST); and the Department of Genome Analysis at the Gesellschaft für Biotechnologische Forschung (GBF), Germany.
The authors note that chromosome 21 is relatively poor in genes. They identified only 225 genes, compared with almost twice that number reported for human chromosome 22. As these two chromosomes together represent approximately 2% of the genome, the authors predict that the total number of human genes might be around 40,000, considerably lower than previous estimates. Hattori et al. also discuss the limitations of current gene prediction methodology. Finally, they predict that the chromosome 21 sequence and gene catalog will be valuable for hypothesis-driven selection of candidate genes that might explain the effects of trisomy 21 and other diseases associated with chromosome 21.
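The extrapolation behind the genome-wide estimate is a simple proportional scaling, sketched below. The function name and parameters are hypothetical. The chromosome 22 count of 545 is an assumption brought in for illustration; it is not stated in this report, which gives only an approximate comparison. The sketch also assumes that gene density on these two chromosomes is representative of the whole genome, which is exactly the caveat the estimate carries.

```python
def extrapolate_gene_count(sample_gene_counts, genome_fraction):
    """Scale the genes counted in a sample of the genome up to the whole genome.

    sample_gene_counts: gene counts for the sampled chromosomes.
    genome_fraction: fraction of the genome those chromosomes cover (0 to 1).
    """
    if not 0 < genome_fraction <= 1:
        raise ValueError("genome_fraction must be in the interval (0, 1]")
    return round(sum(sample_gene_counts) / genome_fraction)


genes_chr21 = 225  # from the report
genes_chr22 = 545  # ASSUMPTION: illustrative figure, not given in this report

estimate = extrapolate_gene_count([genes_chr21, genes_chr22], genome_fraction=0.02)
print(f"estimated genes in the genome: {estimate}")
```

Under these assumptions the estimate lands in the neighborhood of the 40,000 quoted above. A different chromosome 22 count or genome fraction shifts the result proportionally.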
This study has generated the longest continuous stretch of DNA sequence to date (25,491,867 bp) and represents a significant achievement. But the results also highlight the limits of our ability to predict accurately both the presence of genes within this sequence and the functions of those genes. Interpreting this information is critical if we are to understand how gene dosage effects lead to the symptoms of Down syndrome. Finally, it is only a matter of time before we discover the extent to which chromosomes 21 and 22 represent 'typical' chromosomes with respect to gene density and chromosomal architecture.