Using comparative genomics to reorder the human genome sequence into a virtual sheep genome
1 CSIRO Livestock Industries, Carmody Road, St Lucia, Queensland 4067, Australia
2 SheepGenomics, L1, Walker Street, North Sydney, New South Wales 2060, Australia
3 The Institute for Genomic Research, Rockville, Maryland 20850, USA
4 BACPAC Resources, Children's Hospital Oakland Research Institute (CHORI), Oakland, California 94609, USA
5 Department of Veterinary Science, The University of Melbourne, Parkville, Victoria 3010, Australia
6 Centre for Advanced Technologies in Animal Genetics and Reproduction (ReproGen), University of Sydney, Werombi Road, Camden, New South Wales 2570, Australia
7 AgResearch, Invermay Agricultural Centre, Puddle Alley, Private Bag 50034, Mosgiel 9053, New Zealand
8 US Department of Agriculture, Agricultural Research Service, Northern Plains Area, Roman L Hruska US Meat Animal Research, P.O. Box 166, Clay Center, Nebraska 68933, USA
9 Meat and Livestock Australia, 165 Walker Street, North Sydney, New South Wales 2059, Australia
10 University of New England, Armidale, New South Wales 2351, Australia
11 Utah State University, Logan, Utah 84322-4800, USA
Genome Biology 2007, 8:R152 doi:10.1186/gb-2007-8-7-r152
Published: 30 July 2007
Is it possible to construct an accurate and detailed subgene-level map of a genome using bacterial artificial chromosome (BAC) end sequences, a sparse marker map, and the sequences of other genomes?
A sheep BAC library, CHORI-243, was constructed and the BAC end sequences were determined and mapped with high sensitivity and low specificity onto the frameworks of the human, dog, and cow genomes. To maximize genome coverage, the coordinates of all BAC end sequence hits to the cow and dog genomes were also converted to the equivalent human genome coordinates. The 84,624 sheep BACs (about 5.4-fold genome coverage) with paired ends in the correct orientation (tail-to-tail) and spacing, combined with information from sheep BAC comparative genome contigs (CGCs) built separately on the dog and cow genomes, were used to construct 1,172 sheep BAC-CGCs, covering 91.2% of the human genome. Clustered non-tail-to-tail and outsize BACs located close to the ends of many BAC-CGCs linked each of these BAC-CGCs, which together cover about 70% of the genome, to at least one other BAC-CGC on the same chromosome. Using the BAC-CGCs, the intrachromosomal and interchromosomal BAC-CGC linkage information, human/cow and vertebrate synteny, and the sheep marker map, a virtual sheep genome was constructed. To identify BACs potentially located in gaps between BAC-CGCs, an additional set of 55,668 sheep BACs was positioned on the sheep genome with lower confidence. A coordinate conversion process allowed us to transfer human genes and other genome features to the virtual sheep genome for display on a sheep genome browser.
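The distinction drawn above between BACs whose paired ends have the correct orientation and spacing and those that are non-tail-to-tail or outsize can be illustrated with a minimal sketch. This is not the authors' pipeline: the `EndHit` type, the `classify_bac` function, the reading of tail-to-tail as a plus-strand hit followed by a minus-strand hit, and the `min_span`/`max_span` thresholds are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndHit:
    """Hypothetical record for one BAC end sequence hit on a reference genome."""
    chrom: str
    pos: int      # start coordinate of the hit on the reference
    strand: str   # "+" or "-"


def classify_bac(a: EndHit, b: EndHit,
                 min_span: int = 50_000, max_span: int = 350_000) -> str:
    """Classify a BAC from its two end hits.

    The span thresholds are placeholder values, not taken from the paper.
    """
    if a.chrom != b.chrom:
        return "interchromosomal"
    # Order the two hits by coordinate so orientation can be checked.
    left, right = sorted((a, b), key=lambda h: h.pos)
    # Assumed meaning of tail-to-tail: the ends read inward toward each other.
    if not (left.strand == "+" and right.strand == "-"):
        return "non-tail-to-tail"
    span = right.pos - left.pos
    if min_span <= span <= max_span:
        return "consistent"
    # Correct orientation, but spacing outside the accepted range.
    return "outsize"


if __name__ == "__main__":
    print(classify_bac(EndHit("chr1", 1_000_000, "+"),
                       EndHit("chr1", 1_180_000, "-")))
```

Under these assumptions, only BACs labelled `consistent` would feed contig construction, while clusters of `non-tail-to-tail` and `outsize` BACs near contig ends would be kept as candidate evidence for links between contigs.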
We demonstrate that limited sequencing of BACs, combined with positioning on a well-assembled genome and integration of locations from other, less well-assembled genomes, can yield extensive, detailed subgene-level maps of mammalian genomes for which genomic resources are currently limited.