This article has not been peer reviewed. Deposited research article
Assembling and gap filling of unordered genome sequences through gene checking
1 Laboratoire de Génétique Moléculaire Humaine, Université Claude Bernard Lyon 1, 8 av Rockefeller, F-Lyon cedex 08, France
2 Present address and correspondence: Division of Medical Genetics, University of Geneva Medical School, 1 rue Michel Servet, CH-1211 Geneva, Switzerland
Genome Biology 2001, 2:preprint0008-preprint0008.11 doi:10.1186/gb-2001-2-9-preprint0008
This is the first version of this article to be made available publicly, and no other version is available at present. The article was submitted to Genome Biology for peer review. Published: 7 August 2001
The first draft of the human genome sequence is complete. A large amount of DNA sequence is already available in the databases, but it is neither ordered nor assembled. In many cases the entries consist of short sequences (ranging from 10 kb to 100 kb) separated by runs of "NNNNNN". A considerable number of gaps also remain to be filled in the coming years. Even once the raw data have been generated, producing properly ordered, finished sequence is an enormous task that is expected to take another 2 years.
Here we describe a simple way to order random genomic sequences and to locate gaps, which can then be filled by subsequent hybridization and sequencing. The method has three steps. 1) Selection of large cDNAs from the databases (from lower organisms to human). 2) BLAST searching of the unordered human genomic sequences (raw BAC DNA sequences or large DNA fragments) with these large cDNAs. 3) Ordering of the BAC DNA sequences or large DNA fragments according to their homology with the cDNA sequences, so that the continuity of the exonic sequence is maintained. When a cDNA from an organism other than human is used for the BLAST search, homologous exons can also be taken into account on the basis of evolutionary conservation. Any discontinuity in the exonic sequence denotes a possible gap between two BACs or two sequences.
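The ordering and gap-tracing logic of step 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the BLAST results for one cDNA have already been reduced to one record per aligned region, giving the fragment identifier and the 1-based, inclusive cDNA coordinates covered by that fragment. The names Hit and order_fragments, and the example coordinates, are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One aligned region between a cDNA and a genomic fragment (hypothetical record)."""
    fragment_id: str  # BAC or large genomic fragment identifier
    cdna_start: int   # 1-based, inclusive start of the aligned region on the cDNA
    cdna_end: int     # 1-based, inclusive end of the aligned region on the cDNA


def order_fragments(hits: List[Hit], cdna_length: int) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Order fragments along one cDNA and list cDNA intervals covered by no fragment."""
    # Sort aligned regions by their position on the cDNA.
    ordered = sorted(hits, key=lambda h: (h.cdna_start, h.cdna_end))

    # Fragment order = order of first appearance along the cDNA.
    order: List[str] = []
    for h in ordered:
        if h.fragment_id not in order:
            order.append(h.fragment_id)

    # Sweep along the cDNA; any uncovered stretch is a possible gap.
    gaps: List[Tuple[int, int]] = []
    covered_to = 0  # last cDNA position covered so far
    for h in ordered:
        if h.cdna_start > covered_to + 1:
            gaps.append((covered_to + 1, h.cdna_start - 1))
        covered_to = max(covered_to, h.cdna_end)
    if covered_to < cdna_length:
        gaps.append((covered_to + 1, cdna_length))
    return order, gaps


# Made-up example: three fragments aligned to a 1600-base cDNA.
hits = [Hit("BAC_B", 401, 900), Hit("BAC_A", 1, 350), Hit("BAC_C", 901, 1500)]
order, gaps = order_fragments(hits, cdna_length=1600)
print(order)  # ['BAC_A', 'BAC_B', 'BAC_C']
print(gaps)   # [(351, 400), (1501, 1600)]
```

In this sketch an uncovered cDNA interval is reported only as a candidate gap. The sketch does not determine fragment orientation, and it does not distinguish a gap between two fragments from exonic sequence that diverges too much to align, which matters most when the cDNA comes from another organism.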
In this way, a large number of BACs can be ordered. Gaps can subsequently be located and filled by further hybridization and sequencing.