Research

Novel origins of copy number variation in the dog genome

Jonas Berglund1, Elisa M Nevalainen2, Anna-Maja Molin3, Michele Perloski4, The LUPA Consortium5, Catherine André6, Michael C Zody4, Ted Sharpe4, Christophe Hitte6, Kerstin Lindblad-Toh1,4, Hannes Lohi2 and Matthew T Webster1*

Author Affiliations

1 Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Box 582, SE-751 23, Uppsala, Sweden

2 Department of Basic Veterinary Sciences, Department of Medical Genetics, Program in Molecular Medicine, Folkhälsan Institute of Genetics, Biomedicum Helsinki, University of Helsinki, PO Box 63, 00014 Helsinki, Finland

3 Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Box 597, SE-751 24 Uppsala, Sweden

4 Broad Institute of Harvard and Massachusetts Institute of Technology, 7 Cambridge Center, Cambridge, Massachusetts 02142, USA


6 Institut de Génétique et Développement de Rennes, CNRS-UMR6290, Université de Rennes 1, Rennes, France


Genome Biology 2012, 13:R73  doi:10.1186/gb-2012-13-8-r73

Published: 23 August 2012



Background

Copy number variants (CNVs) account for substantial variation between genomes and are a major source of normal and pathogenic phenotypic differences. The dog is an ideal model to investigate mutational mechanisms that generate CNVs as its genome lacks a functional ortholog of the PRDM9 gene implicated in recombination and CNV formation in humans. Here we comprehensively assay CNVs using high-density array comparative genomic hybridization in 50 dogs from 17 dog breeds and 3 gray wolves.


Results

We use a stringent new method to identify a total of 430 high-confidence CNV loci, which range in size from 9 kb to 1.6 Mb and span 26.4 Mb, or 1.08%, of the assayed dog genome, overlapping 413 annotated genes. Of CNVs observed in each breed, 98% are also observed in multiple breeds. CNVs predicted to disrupt gene function are significantly less common than expected by chance. We identify a significant overrepresentation of peaks of GC content, previously shown to be enriched in dog recombination hotspots, in the vicinity of CNV breakpoints.
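The span and percentage figures above can be illustrated with a short Python sketch. It is not the authors' CNV-calling method: it only shows how overlapping calls might be collapsed into non-overlapping loci and how a total span translates into a percentage of an assayed genome. The function names, the toy coordinates, and the half-open interval convention are assumptions made for illustration. The assayed genome size is not stated in the text; the value computed at the end is simply what the reported 26.4 Mb and 1.08% imply.

```python
def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) calls into non-overlapping loci.

    Coordinates are assumed to be half-open: start inclusive, end exclusive.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def span_bp(loci):
    """Total number of base pairs covered by non-overlapping loci."""
    return sum(end - start for _, start, end in loci)


def percent_of_genome(span, assayed):
    """Span as a percentage of the assayed genome (same units for both)."""
    return 100.0 * span / assayed


# Hypothetical toy calls, not data from the study.
calls = [
    ("chr1", 100_000, 150_000),
    ("chr1", 140_000, 200_000),  # overlaps the previous call
    ("chr2", 5_000, 14_000),     # a 9 kb locus
]
loci = merge_intervals(calls)
toy_span = span_bp(loci)

# Figures reported in the text: 26.4 Mb is 1.08% of the assayed genome.
reported_span_mb = 26.4
reported_percent = 1.08
implied_assayed_mb = reported_span_mb / (reported_percent / 100)

print(loci)
print(toy_span)
print(round(implied_assayed_mb), "Mb assayed (implied)")
```

Merging before summing matters because overlapping calls from different individuals would otherwise be counted more than once in the total span.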


Conclusions

A number of the CNVs identified by this study are candidates for generating breed-specific phenotypes. Purifying selection seems to be a major factor shaping structural variation in the dog genome, suggesting that many CNVs are deleterious. Localized peaks of GC content appear to be novel sites of CNV formation in the dog genome by non-allelic homologous recombination, potentially activated by the loss of PRDM9. These sequence features may have driven genome instability and chromosomal rearrangements throughout canid evolution.
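To make the notion of a localized GC peak near a CNV breakpoint concrete, the following Python sketch computes GC content in fixed windows, flags windows above a threshold, and reports which flagged windows lie close to a breakpoint. The window size, threshold, distance cutoff, sequence, and function names are all hypothetical; the text does not give the definition of a GC peak used in the study, so this is one plausible reading rather than the authors' procedure.

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq, window, step):
    """Return (window_start, gc_fraction) for each full window."""
    return [
        (i, gc_fraction(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]


def gc_peaks(profile, threshold):
    """Start positions of windows whose GC fraction meets the threshold."""
    return [pos for pos, gc in profile if gc >= threshold]


def peaks_near(peaks, breakpoint, max_distance):
    """Peaks whose window start lies within max_distance of a breakpoint."""
    return [p for p in peaks if abs(p - breakpoint) <= max_distance]


# Hypothetical 60 bp toy sequence: AT-rich flanks around a GC-rich core.
toy_seq = "AT" * 10 + "GC" * 10 + "AT" * 10
profile = windowed_gc(toy_seq, window=10, step=10)
peaks = gc_peaks(profile, threshold=0.8)
nearby = peaks_near(peaks, breakpoint=38, max_distance=10)

print(profile)
print(peaks)
print(nearby)
```

A real analysis would also need a null expectation, for example from randomly placed breakpoints, before any overrepresentation could be called significant.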