
Small variable segments constitute a major type of diversity of bacterial genomes at the species level

Fabrice Touzain1, Erick Denamur2, Claudine Médigue3, Valérie Barbe4, Meriem El Karoui1 and Marie-Agnès Petit1*

Author Affiliations

1 INRA, UMR1319, Micalis, Bat 222, Jouy en Josas, 78350, France

2 INSERM U722 and Université Paris 7, Faculté de Médecine, Site Xavier Bichat, Paris, 75018, France

3 CNRS-UMR 8030 & CEA/IG/Genoscope, Laboratoire d'Analyses Bioinformatiques en Génomique et Métabolisme (LABGeM), rue Gaston Crémieux, Evry, 91057, France

4 CEA, Institut de Génomique, Genoscope, rue Gaston Crémieux, Evry, 91057, France

Genome Biology 2010, 11:R45  doi:10.1186/gb-2010-11-4-r45

Published: 30 April 2010



Analysis of large-scale diversity in bacterial genomes has mainly focused on elements such as pathogenicity islands or, more generally, genomic islands. These comprise numerous genes and confer important phenotypes, and they are present or absent depending on the strain. We report that, despite this widely accepted notion, most diversity at the species level is composed of much smaller DNA segments, 20 to 500 bp in size, which we call microdiversity.


We performed a systematic analysis of the variable segments detected by multiple whole-genome alignments at the DNA level on the three species for which the greatest number of genomes have been sequenced: Escherichia coli, Staphylococcus aureus, and Streptococcus pyogenes. Among the numerous sites of variability, 62 to 73% were loci of microdiversity, many of which were located within genes. They contribute to phenotypic variation, as 3 to 6% of all genes harbor microdiversity, and 1 to 9% of all genes are located downstream from a microdiversity locus. Microdiversity loci are particularly abundant in genes encoding membrane proteins. In-depth analysis of the E. coli alignments shows that most of the diversity does not correspond to known mobile or repeated elements, and that it was likely generated by illegitimate recombination. An intriguing class of microdiversity includes small blocks of highly diverged sequences, whose origin is discussed.
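
To make the size criterion concrete, the following is a minimal Python sketch of how variable segments could be sorted by length and the share of microdiversity loci computed. Only the 20 to 500 bp bounds come from the text above. The `VariableSegment` type, the function names, the coordinate convention, and treating both bounds as inclusive are assumptions made for illustration, and this is not the pipeline used in the study.

```python
from dataclasses import dataclass
from typing import Sequence

# Size bounds for microdiversity as stated in the abstract (20 to 500 bp).
# Treating both bounds as inclusive is an assumption of this sketch.
MICRO_MIN_BP = 20
MICRO_MAX_BP = 500


@dataclass(frozen=True)
class VariableSegment:
    """Hypothetical record for one variable segment found in an alignment."""
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # 0-based, exclusive (assumed convention)

    @property
    def length(self) -> int:
        return self.end - self.start


def is_microdiversity(segment: VariableSegment) -> bool:
    """Return True if the segment length falls in the 20 to 500 bp range."""
    return MICRO_MIN_BP <= segment.length <= MICRO_MAX_BP


def microdiversity_fraction(segments: Sequence[VariableSegment]) -> float:
    """Fraction of variable segments that qualify as microdiversity loci."""
    if not segments:
        return 0.0
    hits = sum(1 for seg in segments if is_microdiversity(seg))
    return hits / len(segments)
```

Under these assumptions, a segment spanning positions 100 to 350 (250 bp) would count as a microdiversity locus, whereas a 3,000 bp island would not.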


This analysis uncovers the importance of this small-sized genome diversity, which we expect to be present in a wide range of bacteria, and possibly also in many eukaryotic genomes.