Human genomic regions with exceptionally high levels of population differentiation identified from 911 whole-genome sequences

Vincenza Colonna1,2*, Qasim Ayub1, Yuan Chen1, Luca Pagani1, Pierre Luisi3, Marc Pybus3, Erik Garrison4, Yali Xue1, Chris Tyler-Smith1 and The 1000 Genomes Project Consortium

Author Affiliations

1 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA, UK

2 Institute of Genetics and Biophysics ‘A. Buzzati-Traverso’, National Research Council (CNR), Naples 80131, Italy

3 Institute of Evolutionary Biology (Universitat Pompeu Fabra-CSIC), CEXS-UPF-PRBB, Barcelona, Catalonia 08003, Spain

4 Department of Biology, Boston College, Chestnut Hill, MA 02467, USA

Genome Biology 2014, 15:R88  doi:10.1186/gb-2014-15-6-r88

Published: 30 June 2014



Population differentiation has proved effective for identifying loci under geographically localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing large differences in allele frequency between populations. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selection or from other processes.


We demonstrate that, while sites with low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events, and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, in which we identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively.


We identify known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.