Open Access Research

Characterization of probiotic Escherichia coli isolates with a novel pan-genome microarray

Hanni Willenbrock1,2*, Peter F Hallin1, Trudy M Wassenaar1,3 and David W Ussery1

Author Affiliations

1 Center for Biological Sequence Analysis, Technical University of Denmark, 2800, Lyngby, Denmark

2 Exiqon A/S, 2950 Vedbæk, Denmark

3 Molecular Microbiology and Genomics Consultants, Tannenstrasse, 55576 Zotzenheim, Germany


Genome Biology 2007, 8:R267 doi:10.1186/gb-2007-8-12-r267

Published: 18 December 2007



Microarrays have recently emerged as a novel procedure to evaluate the genetic content of bacterial species. So far, microarrays have mostly covered a single strain or a few strains of the same species. However, as cheaper high-throughput sequencing techniques emerge, genome sequences of multiple strains of the same species are rapidly becoming available, allowing a whole species to be defined and characterized as a population of genomes: the 'pan-genome'.


Using 32 Escherichia coli and Shigella genome sequences, we estimated the pan- and core genome of the species. We designed a high-density microarray to provide a tool for characterizing the E. coli pan-genome. The technical performance of this pan-genome microarray, assessed with control strain samples (E. coli K-12 and O157:H7), demonstrated a high sensitivity and a relatively low false positive rate. A single-channel analysis approach proved robust while allowing presence/absence predictions to be derived for any gene included on our pan-genome microarray. Moreover, the array was sufficient to investigate the gene content of non-pathogenic isolates, despite the strong bias towards pathogenic strains among the E. coli genomes sequenced so far.
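As an illustration of the pan- and core genome concepts used above, the following minimal sketch treats each genome as a set of gene-family identifiers: the pan-genome is the union of all sets and the core genome is their intersection. The function name `pan_and_core`, the strain labels and the gene identifiers are hypothetical, and the toy data are invented for the example; this is not the estimation procedure used in the study, which is not described in this abstract.

```python
from typing import Dict, Set, Tuple


def pan_and_core(genomes: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan, core) gene-family sets for a collection of genomes.

    pan  = gene families found in at least one genome (set union)
    core = gene families found in every genome (set intersection)
    """
    if not genomes:
        return set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core


# Toy, made-up input: strain name -> gene families present in that strain.
toy_genomes = {
    "strain_A": {"g1", "g2", "g3"},
    "strain_B": {"g1", "g2", "g4"},
    "strain_C": {"g1", "g3", "g5"},
}

pan, core = pan_and_core(toy_genomes)
print(len(pan), len(core))  # prints: 5 1
```

In this simplified view, adding a genome can only enlarge the pan-genome and shrink the core genome, which is why both estimates depend on how many strains, and which ones, have been sequenced.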


This high-density microarray provides an excellent tool for characterizing the genetic makeup of unknown E. coli strains and can also deliver insights into phylogenetic relationships. Its design poses a considerably larger challenge than, and involves different considerations from, the design of single-strain microarrays. Here we discuss lessons learned and future directions for optimizing the design of microarrays that target entire pan-genomes.