This article has not been peer reviewed.
Deposited research article
Universality in large-scale structure of complete genomes
1 Department of Physics, National Central University, Chungli, Taiwan 320
2 Department of Life Sciences, National Central University, Chungli, Taiwan 320
3 Center for Complex Systems, National Central University, Chungli, Taiwan 320
Genome Biology 2004, 5:P7 doi:10.1186/gb-2004-5-3-p7
This is the first version of this article to be made available publicly.
Published: 28 January 2004
The abundance of duplications in genomes, in the form of paralogs, pseudogenes and a variety of repeats, suggests that genomes may have used duplication as one mode of growth. However, systematic knowledge of all possible duplications in whole genomes is still lacking. This paper reports the results of a detailed study of the occurrence frequencies of short oligonucleotides in all extant complete genomes. We found a systematic pattern of repeats of short oligonucleotides that places all the complete genomes except Plasmodium in a single universality class expressed by an extremely simple formula. Our analysis of the data, combined with computer simulation of genome growth models, suggests a simple coarse-grained representation of genome growth: the ancestors of the genomes began to grow when they were no greater than 300 b in length, via a mechanism whose main components were neutral stochastic segmental replicative translocations and random small mutations.
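The two ingredients named in the abstract, counting occurrence frequencies of short oligonucleotides and growing a short ancestor by segmental duplication plus point mutation, can be illustrated with a minimal sketch. This is not the authors' model or their formula, neither of which is specified in the abstract. The function names, the uniform segment-length distribution, the `mutation_prob` parameter and the variance-to-mean ratio used as a spread statistic are all assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers in seq; k-mers that never occur get count 0."""
    seen = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {"".join(p): seen.get("".join(p), 0)
            for p in product("ACGT", repeat=k)}


def dispersion(counts):
    """Variance-to-mean ratio of the counts (an assumed summary statistic)."""
    values = list(counts.values())
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return var / mean


def grow_genome(seed_len, target_len, max_segment, mutation_prob, rng):
    """Toy growth model (hypothetical parameters, not taken from the paper).

    Starts from a random seed of seed_len bases. Each event is either a
    point mutation (with probability mutation_prob) or a copy of a random
    segment inserted at a random position. Requires mutation_prob < 1.
    """
    genome = [rng.choice("ACGT") for _ in range(seed_len)]
    while len(genome) < target_len:
        if rng.random() < mutation_prob:
            # Random small mutation: overwrite one base.
            genome[rng.randrange(len(genome))] = rng.choice("ACGT")
        else:
            # Segmental duplication: copy a segment and insert the copy.
            seg_len = rng.randint(1, min(max_segment, len(genome)))
            start = rng.randrange(len(genome) - seg_len + 1)
            segment = genome[start:start + seg_len]
            insert_at = rng.randrange(len(genome) + 1)
            genome[insert_at:insert_at] = segment
    # The last insertion may overshoot, so trim to the target length.
    return "".join(genome[:target_len])


if __name__ == "__main__":
    rng = random.Random(0)
    grown = grow_genome(300, 20000, 1000, 0.5, rng)
    shuffled = "".join(rng.choice("ACGT") for _ in range(20000))
    for name, seq in (("grown", grown), ("random", shuffled)):
        print(name, dispersion(kmer_counts(seq, 6)))
```

Comparing the spread of k-mer counts in a duplication-grown sequence with that of a purely random sequence of the same length is one simple way to see whether repeats leave a statistical signature. The seed length of 300 echoes the abstract; every other value in the usage block is arbitrary.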