The ESAT-6 gene cluster of Mycobacterium tuberculosis and other high G+C Gram-positive bacteria

Nico C Gey van Pittius1*, Junaid Gamieldien2, Winston Hide2, Gordon D Brown3, Roland J Siezen4 and Albert D Beyers1

Author Affiliations

1 US/MRC Centre for Molecular and Cellular Biology, Department of Medical Biochemistry, University of Stellenbosch, Tygerberg, 7505, South Africa

2 South African National Bioinformatics Institute (SANBI), University of the Western Cape, Bellville, 7535, South Africa

3 Sir William Dunn School of Pathology, University of Oxford, South Parks Road, Oxford OX1 3RE, UK

4 Center for Molecular and Biomolecular Informatics, University of Nijmegen, 6525ED Nijmegen, The Netherlands

Genome Biology 2001, 2:research0044-research0044.18  doi:10.1186/gb-2001-2-10-research0044

Published: 19 September 2001



The genome of Mycobacterium tuberculosis H37Rv has five copies of a cluster of genes known as the ESAT-6 loci. These clusters contain members of the CFP-10 (lhp) and ESAT-6 (esat-6) gene families (encoding secreted T-cell antigens that lack detectable secretion signals) as well as genes encoding secreted, cell-wall-associated subtilisin-like serine proteases, putative ABC transporters, ATP-binding proteins and other membrane-associated proteins. These membrane-associated and energy-providing proteins may function to secrete members of the ESAT-6 and CFP-10 protein families, and the proteases may be involved in processing the secreted peptides.


Finished and unfinished genome sequencing data from 98 publicly available microbial genomes were analyzed for the presence of orthologs of the ESAT-6 loci. The multiple duplicates of the ESAT-6 gene cluster found in the genome of M. tuberculosis H37Rv are also conserved in the genomes of other mycobacteria, for example M. tuberculosis CDC1551, M. tuberculosis 210, M. bovis, M. leprae, M. avium, and the avirulent strain M. smegmatis. Phylogenetic analyses of the resulting sequences have established the duplication order of the gene clusters and demonstrated that the gene cluster known as region 4 (Rv3444c-3450c) is ancestral. Region 4 is also the only region for which an ortholog could be found in the genomes of Corynebacterium diphtheriae and Streptomyces coelicolor.


Comparative genomic analysis revealed that the presence of the ESAT-6 gene cluster is a feature of some high-G+C Gram-positive bacteria. Multiple duplications of this cluster have occurred and are maintained only within the genomes of members of the genus Mycobacterium.