By catalyzing the joining of breaks in the phosphodiester backbone of duplex DNA, DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. Three related classes of ATP-dependent DNA ligase are readily apparent in eukaryotic cells. Enzymes of each class comprise catalytic and non-catalytic domains together with additional domains of varying function. DNA ligase I is required for the ligation of Okazaki fragments during lagging-strand DNA synthesis, as well as for several DNA-repair pathways; these functions are mediated, at least in part, by interactions between DNA ligase I and the sliding-clamp protein PCNA. DNA ligase III, which is unique to vertebrates, functions both in the nucleus and in mitochondria. Two distinct isoforms of this enzyme, differing in their carboxy-terminal sequences, are produced by alternative splicing: DNA ligase IIIα has a carboxy-terminal BRCT domain that interacts with the mammalian DNA-repair factor Xrcc1, but both α and β isoforms have an amino-terminal zinc-finger motif that appears to play a role in the recognition of DNA secondary structures that resemble intermediates in DNA metabolism. DNA ligase IV is required for DNA non-homologous end joining pathways, including recombination of the V(D)J immunoglobulin gene segments in cells of the mammalian immune system. DNA ligase IV forms a tight complex with Xrcc4 through an interaction motif located between a pair of carboxy-terminal BRCT domains in the ligase. Recent structural studies have shed light on the catalytic function of DNA ligases, as well as illuminating protein-protein interactions involving DNA ligases IIIα and IV.
Protein family review
DNA ligases are a large family of evolutionarily related proteins that play important roles in a wide range of DNA transactions, including chromosomal DNA replication, DNA repair and recombination, in all three kingdoms of life . Cofactor preferences divide the ligases into two sub-families. Most eubacterial enzymes utilize NAD+ as a cofactor; these enzymes fall outside the scope of this article but have recently been reviewed elsewhere . In contrast, most eukaryotic DNA ligases, together with archaeal and bacteriophage enzymes, fall into the second sub-family; these enzymes utilize ATP as a cofactor. Here we review the current state of knowledge of the cellular ATP-dependent DNA ligase enzymes in eukaryotic cells. Discussion of the function of related enzymes encoded by eukaryotic viruses can be found elsewhere .
Gene organization and evolutionary history
Vertebrate cells encode three well-characterized DNA ligases - DNA ligases I, III and IV - that appear to be descended from a common ancestral nucleotidyltransferase enzyme . DNA ligase I is probably conserved in all eukaryotes: orthologs have been identified and characterized in organisms as diverse as yeast and mammals, and have been shown to play important roles in nuclear DNA replication, repair and recombination. In budding yeast a form of DNA ligase I also functions in mitochondrial DNA replication and repair, a role that in higher eukaryotes is taken by DNA ligase III. This latter enzyme, which to date has been identified only in vertebrates, is also present in the nucleus, where it functions in DNA repair and perhaps also in meiotic recombination. Like DNA ligase I, ligase IV is also likely to be conserved in all eukaryotes: to date, orthologs of DNA ligase IV have been identified and characterized in yeast, higher plants and vertebrates. These studies have identified a vital role for this enzyme in nuclear DNA repair.
Characteristic structural features
Consistent with their descent from a common ancestor, all the eukaryotic ATP-dependent DNA ligases are related in sequence and structure. Figure 1 shows a schematic representation of the domain structures of DNA ligases I, III and IV from eukaryotic cells alongside other family members. With the exception of the atypically small PBCV-1 viral enzyme, two protein domains are common to all members of the family. The catalytic domain (CD) comprises six conserved sequence motifs (I, III, IIIa, IV, V-VI) that define a family of related nucleotidyltransferases including eukaryotic GTP-dependent mRNA-capping enzymes as well as eubacterial NAD+-dependent ligases . Motif I includes the lysine residue that is adenylated in the first step of the ligation reaction. Many of the enzymes shown in Figure 1 also contain a non-catalytic domain (NCD) that is conserved, albeit weakly, between different family members. The function of this domain is unknown.
Figure 1. Domain structures of ATP-dependent ligases. Schematic representation of the domain structures of DNA ligases I, IIIα, IIIβ and IV, together with ATP-dependent ligases from poxviruses (vaccinia, variola, fowlpox, and so on), the Chlorella virus of Paramecium bursaria PBCV-1, and archaea. Abbreviations: CD, catalytic domain; NCD, conserved non-catalytic domain; PBM, PCNA binding motif; NLS, nuclear localization signal; MTS, mitochondrial targeting sequence; ZnF, putative zinc finger; BRCT, BRCA carboxy-terminal-related domain. The red-boxed regions have had their structures solved crystallographically; the blue-boxed regions are found only in proteins targeted to mitochondria.
In addition to the CD and NCD domains, nuclear DNA ligase I proteins from different species also have an amino-terminal domain of variable length and low sequence conservation that includes a nuclear localization sequence (NLS) and, at the extreme amino terminus, a conserved PCNA-binding motif (PBM) of the type first identified in the mammalian DNA replication inhibitor p21Cip1 . PCNA (proliferating cell nuclear antigen) is best known as a DNA polymerase processivity factor, but there is increasing evidence that it plays an important role in coordinating protein-protein interactions on DNA. The PBM is found at the amino terminus of nuclear DNA ligase I enzymes from yeast and vertebrates, as well as in a number of other DNA replication and repair factors, such as the large subunit of the 'clamp loader' replication factor C (RF-C), which loads PCNA onto DNA, and the nuclease FEN1 .
In budding yeast, the use of different start codons results in the translation of distinct nuclear and mitochondrial isoforms of the DNA ligase I protein Cdc9. Translation from the first AUG gives rise to a pre-protein with an amino-terminal mitochondrial targeting sequence (MTS). This pre-protein is localized to the mitochondria, whereupon the MTS is cleaved by a mitochondrial peptidase. The nuclear form of the protein, which lacks the MTS, is translated from an internal in-frame AUG.
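The start-codon choice described above can be illustrated with a minimal sketch. It assumes a toy transcript invented purely for illustration (it is not the CDC9 sequence), and the function names `in_frame_starts` and `orf_codons` are hypothetical helpers, not part of any published tool. The sketch only shows that two in-frame AUGs yield a longer product carrying extra amino-terminal codons and a shorter product that lacks them; it does not model targeting or peptidase cleavage.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def in_frame_starts(mrna: str) -> list[int]:
    """Return 0-based positions of AUG codons in the frame of the first AUG,
    scanning up to the first in-frame stop codon."""
    first = mrna.find("AUG")
    if first == -1:
        return []
    starts = []
    for i in range(first, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        if codon == "AUG":
            starts.append(i)
    return starts


def orf_codons(mrna: str, start: int) -> list[str]:
    """Return codons from `start` up to, but excluding, the first in-frame stop."""
    codons = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
    return codons


# Hypothetical toy transcript, NOT a real gene sequence.
TOY_MRNA = "GG" + "AUG" + "CUUCGU" + "AUG" + "GCUAAAGGU" + "UAA" + "CC"

upstream, internal = in_frame_starts(TOY_MRNA)
long_form = orf_codons(TOY_MRNA, upstream)   # begins at the first AUG
short_form = orf_codons(TOY_MRNA, internal)  # begins at the internal in-frame AUG
```

In this toy case the short form is simply the long form minus its first three codons, which is the relationship between the nuclear form and the MTS-bearing pre-protein described in the text.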
The DNA ligase III gene uses a similar mechanism to produce nuclear and mitochondrial proteins [7,8]. In addition, alternative pre-mRNA splicing results in the production of isoforms (DNA ligases IIIα and IIIβ) with different carboxy-terminal sequences. DNA ligase IIIα is the longer of the two isoforms: at its carboxyl terminus it has a BRCT domain (BRCA carboxy-terminal-related domain), an autonomously folding protein module of about 95 amino acids that was first identified in the carboxy-terminal region of the BRCA1 tumour suppressor protein but which has since been found in a range of proteins implicated in DNA replication, DNA repair and checkpoint functions. This domain is absent from DNA ligase IIIβ, expression of which is confined to germline tissues [7,8,9]. Both DNA ligase III isoforms include a putative zinc-finger motif (ZnF) located amino-terminal to the NCD and CD domains. The ZnF motif has extensive sequence similarity to zinc fingers present in the DNA-damage response factor poly(ADP-ribose) polymerase and may facilitate binding to DNA secondary-structure elements, such as may be found at sites of DNA damage or as intermediates in DNA metabolism [11,12].
DNA ligase IV enzymes are characterized by a lengthy carboxy-terminal extension comprising two BRCT domains [13,14]. The BRCT domains are separated by a short linker sequence, of around 100 amino acids, that contains a conserved binding site for the DNA ligase IV binding protein Xrcc4 [15,16].
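The domain architectures described in the preceding paragraphs can be summarized in a small lookup table. This is a simplified sketch read off the text only: it lists the nuclear forms, omits the mitochondrial targeting sequences, and the label "Xrcc4-binding linker" and the helper `ligases_with` are names chosen here for illustration.

```python
# Domain order from amino to carboxy terminus, as described in the text.
# Simplification: nuclear forms only; mitochondrial targeting sequences omitted.
DOMAINS = {
    "DNA ligase I": ["PBM", "NLS", "NCD", "CD"],
    "DNA ligase IIIα": ["ZnF", "NCD", "CD", "BRCT"],
    "DNA ligase IIIβ": ["ZnF", "NCD", "CD"],
    "DNA ligase IV": ["NCD", "CD", "BRCT", "Xrcc4-binding linker", "BRCT"],
}


def ligases_with(domain: str) -> list[str]:
    """Return the ligases whose architecture includes the given domain label."""
    return [name for name, domains in DOMAINS.items() if domain in domains]


brct_ligases = ligases_with("BRCT")
pbm_ligases = ligases_with("PBM")
```

Querying for "BRCT" picks out DNA ligases IIIα and IV, the two enzymes whose carboxy-terminal regions mediate the Xrcc1 and Xrcc4 interactions, while "PBM" picks out DNA ligase I alone.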
Although the three-dimensional structure of only one eukaryotic DNA ligase is known, that encoded by the virus PBCV-1, the structure of bacteriophage T7 ligase has also been solved, making it possible to compare the two. Perhaps unsurprisingly, the structures share a high degree of similarity despite their low level of primary sequence similarity (Figure 2a,2b). Each protein comprises two distinct sub-domains: a large amino-terminal sub-domain ('domain 1') and a smaller carboxy-terminal sub-domain ('domain 2'). The ATP-binding site of the enzyme lies in the cleft between the two sub-domains. The structure of the catalytic core is similar to that of the eubacterial NAD+-dependent ligases and the eukaryotic GTP-dependent mRNA capping enzymes, reflecting their shared evolutionary history. As can be seen in Figure 2b, domain 1 consists of two antiparallel β sheets flanked by α helices, whereas domain 2 consists of a five-stranded β barrel and a single α helix and exemplifies the OB (oligonucleotide binding) fold found in a wide variety of nucleic-acid-binding proteins such as the eukaryotic single-stranded DNA binding factor RPA.
Figure 2. Structures of ATP-dependent DNA ligases. (a,b) Three-dimensional structures of (a) bacteriophage T7 DNA ligase complexed with ATP and (b) the Chlorella virus of Paramecium bursaria (PBCV-1) DNA ligase enzyme-adenylate complex, determined by X-ray crystallography. For the PBCV-1 enzyme, domains 1 and 2 (an OB fold) are indicated. (c,d) Structures of the BRCT domains from (c) the human DNA-repair factor Xrcc1 and (d) DNA ligase IIIα, determined by X-ray crystallography and NMR (nuclear magnetic resonance), respectively. In each case, four short β strands form the core of the BRCT structure. The Xrcc1 core is flanked by three α helices (α1, α2 and α3) whereas that of DNA ligase IIIα is flanked by two only (α1 and α2). The interaction between the Xrcc1 and DNA ligase IIIα proteins in vivo is mediated by these BRCT domains. (e) Structure of a homodimer of Xrcc4 bound to a short peptide corresponding to amino acids 748-784 of human DNA ligase IV (shown in green). (f) Close-up view of the DNA ligase IV peptide bound to the helical tails of the Xrcc4 dimer. The peptide comprises a β hairpin followed by an α helix and lies asymmetrically across both Xrcc4 monomers. See text for details and references.
Two sets of studies have shed light on the structure and function of the non-catalytic regions of the eukaryotic ligases. In the first of these, the solution structure of the carboxy-terminal BRCT domain of DNA ligase IIIα was solved by NMR. This region of the protein is involved in binding to the DNA repair factor Xrcc1. The structure (shown in Figure 2d) comprises a sheet of four parallel β strands with a two-α-helix bundle and displays significant similarity to other BRCT domains, such as that of Xrcc1 itself, although the latter has an additional α helix located on the opposite side of the β sheet (α2 in Figure 2c).
More recently, the structure of the Xrcc4-interacting region of DNA ligase IV has been determined, in a complex with an Xrcc4 homodimer . The Xrcc4 protein itself has a globular amino-terminal head domain followed by a long helical tail (Figure 2e). In the crystal structure, a single polypeptide derived from DNA ligase IV (corresponding to 36 amino acids located between the carboxy-terminal BRCT domains) interacts simultaneously with the helical tails of both monomers but in an asymmetric manner. The peptide folds into a slab-like motif - comprising a β hairpin adjacent to a short α helix (Figure 2f) - that lies across the surfaces of the adjacent Xrcc4 monomer tails.
Localization and function
The ATP-dependent DNA ligases catalyze the joining of single-stranded breaks (nicks) in the phosphodiester backbone of double-stranded DNA in a three-step mechanism. The first step in the ligation reaction is the formation of a covalent enzyme-AMP complex. The cofactor ATP is cleaved to pyrophosphate and AMP, with the AMP being covalently joined to a highly conserved lysine residue in the active site of the ligase. The activated AMP residue is then transferred to the 5' phosphate of the nick, before the nick is sealed by phosphodiester-bond formation and AMP elimination. The reaction catalyzed by the eubacterial NAD+-dependent ligases is essentially identical, except that formation of the enzyme-AMP intermediate consumes NAD+ and releases nicotinamide mononucleotide (NMN) rather than pyrophosphate. Although these two groups of enzymes belong to the same family of nucleotidyl transferases, they share almost no protein sequence similarity outside the catalytic core.
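The three steps can be written out as a reaction scheme. The notation here is a shorthand chosen for illustration: E-Lys denotes the enzyme with its active-site lysine, PP_i is pyrophosphate, and the DNA species are labelled by the chemical group at the nick.

```latex
\begin{align*}
\text{(1)}\quad & \text{E-Lys} + \text{ATP}
  \longrightarrow \text{E-Lys-AMP} + \text{PP}_i \\
\text{(2)}\quad & \text{E-Lys-AMP} + \text{5}'\text{-P-DNA}
  \longrightarrow \text{E-Lys} + \text{AMP-5}'\text{-P-DNA} \\
\text{(3)}\quad & \text{DNA-3}'\text{-OH} + \text{AMP-5}'\text{-P-DNA}
  \longrightarrow \text{DNA-3}'\text{-O-P-5}'\text{-DNA} + \text{AMP}
\end{align*}
```

For the NAD+-dependent enzymes only step (1) differs: E-Lys + NAD+ gives E-Lys-AMP + NMN, after which steps (2) and (3) proceed as written.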
Nuclear DNA ligase function
DNA ligase I plays a vital role during chromosomal DNA replication, and also in several DNA repair pathways . In eukaryotes, as in eubacteria, replication occurs in a semi-discontinuous manner, with the lagging strand being synthesized as a series of discrete Okazaki fragments that are first processed and then ligated by DNA ligase I to form a continuous DNA strand . The human cell line 46BR.1G1, which is defective in DNA ligase I function, exhibits abnormal joining of Okazaki fragments during S phase of the cell cycle, a defect that can be complemented by addition of exogenous DNA ligase I protein . Similar phenotypes are displayed by yeast cells defective in DNA ligase I function. DNA ligase I function is mediated by its interaction with PCNA. As shown in Figure 1, the amino terminus of the DNA ligase I protein has a p21Cip1-type PCNA-binding motif that is required for localization of the DNA ligase I protein to so-called replication factories within the nuclei of S-phase cells [5,23]. (For this reason the PBM is sometimes referred to as the 'replication factory targeting sequence', or RFTS.) The PBM also seems to play a role in regulating the phosphorylation status of DNA ligase I in human cells. At least one residue in human DNA ligase I (Ser66) is phosphorylated in a cell-cycle-dependent manner; dephosphorylation of the enzyme in early G1 is dependent upon its being targeted to the nucleus, and also on the presence of an intact PBM .
In the nucleus, DNA ligase III appears to function only in DNA repair and possibly recombination [1,9]. DNA ligase IIIα forms a heterodimeric complex with Xrcc1, the two proteins interacting via their carboxy-terminal BRCT modules (Figure 2). This complex functions in base excision repair. The function of DNA ligase IIIβ, which lacks the BRCT domain and which therefore cannot bind to Xrcc1, and whose expression is limited to germline tissues alone, is not known.
DNA ligase IV, which is exclusively nuclear, functions in DNA non-homologous end joining (NHEJ) processes [1,25]. NHEJ is the principal mechanism by which mammalian cells repair DNA double-strand breaks caused by exposure to ionizing radiation or certain classes of chemical mutagens. In mammals, NHEJ is also required for V(D)J recombination, the process by which immunoglobulin and T-cell receptor genes are rearranged to generate antibody diversity. As illustrated in Figure 2, DNA ligase IV forms a complex with Xrcc4 . Evidence from mammalian and yeast systems suggests that Xrcc4 functions to stabilize the DNA ligase IV protein, to stimulate its activity, and to target the protein to sites of DNA double-strand breaks. Interestingly, mice lacking DNA ligase IV display embryonic lethality, implying that the enzyme has an essential function during early development [26,27].
Mitochondrial DNA ligase function
In vertebrates, both isoforms of DNA ligase III appear to be capable of being targeted to mitochondria as well as to the nucleus, making it possible that both enzymes play a part in the replication and repair of mitochondrial DNA [7,8]. It should be noted, however, that there is no evidence that Xrcc1, the nuclear binding partner of DNA ligase IIIα, is present in mitochondria, implying that other factors may interact with DNA ligase IIIα in this compartment. In addition, as noted above, expression of DNA ligase IIIβ is restricted to germline tissues [7,8,9].
In budding yeast, which lacks DNA ligase III, it is DNA ligase I that plays dual roles in the nucleus and in mitochondria . In mitochondria, the Cdc9 protein appears to be required both for DNA replication and also for the repair of damaged DNA, including the repair of double-strand breaks . Note that, in both yeast and vertebrate cells, DNA ligase IV appears to have no role in mitochondria.
The past five years have seen our understanding of eukaryotic DNA ligase function increase considerably, and there is no reason to suspect that the coming years will be any less productive, with genetic, biochemical and structural approaches combining to dissect in detail the function of these important enzymes. One area in which progress will surely be made is in the determination of additional three-dimensional structures. Despite significant recent advances, in particular the determination of the T7 and Chlorella virus ligase structures [17,18], the overall structure of the eukaryotic cellular enzymes can still only be guessed at. Solving the structures of each of the eukaryotic DNA ligases would add greatly to our understanding of these enzymes' activities. Perhaps the most likely full-length structure to be solved, however, will be that of one of the archaeal enzymes, given the advantages for crystallization that proteins from these organisms frequently offer. In the absence of full-length structures, progress awaits the determination of further partial structures to add to the already solved BRCT domain from DNA ligase III  and the Xrcc4-interacting region from DNA ligase IV ; the zinc-finger domain of DNA ligase III is an obvious candidate.
We are grateful to Paul Taylor for his assistance with the preparation of Figure 2, and to Fiona C. Gray for her reading of the manuscript. S.M. is funded by a Wellcome Trust Senior Research Fellowship in Basic Biomedical Science, and I.M. by a studentship from the Darwin Trust of Edinburgh.
Mutat Res 2000, 460:301-318.
A detailed overview of DNA ligase structure and function, with particular emphasis on lessons from structural studies.
Mol Microbiol 2001, 40:1241-1248.
A comprehensive review of bacterial ligase structure and function, with details of both NAD+ and ATP-dependent enzymes.
Virus Res 1998, 56:135-147.
Functional analysis of the vaccinia and Shope fibroma virus (SFV) ATP-dependent DNA ligases.
Mol Microbiol 1995, 17:405-410.
Identification of the nucleotidyl transferase superfamily that includes both DNA ligases (ATP- and NAD+-dependent) as well as GTP-dependent mRNA capping enzymes.
BioEssays 1998, 20:195-199.
A short review of the PCNA-binding motif (PBM) first identified in the mammalian cell cycle inhibitor p21Cip1.
Curr Biol 1999, 9:1085-1094.
First evidence that budding yeast DNA ligase I could localize to both nucleus and to mitochondria.
Mol Cell Biol 1999, 19:3869-3876.
Experiments demonstrating that human DNA ligase III is able to encode both nuclear and mitochondrial proteins through the use of different start codons.
J Biol Chem 2001, 276:48978-48987.
Characterization of the Xenopus DNA ligase III proteins, showing that both IIIα and IIIβ can be detected in mitochondria.
Mackey ZB, Ramos W, Levin DS, Walter CA, McCarrey JR, Tomkinson AE: An alternative splicing event which occurs in mouse pachytene spermatocytes generates a form of DNA ligase III with distinct biochemical properties that may function in meiotic recombination.
Mol Cell Biol 1997, 17:989-998.
Demonstration that alternative splicing results in the production of two distinct DNA ligase III isoforms differing at their carboxyl termini.
FASEB J 1997, 11:68-76.
One of the first articles to document the widespread nature of the BRCT motif in proteins involved in DNA replication, repair and checkpoint function.
Nucleic Acids Res 2000, 28:3558-3563.
Biochemical analysis of the in vitro DNA-binding preferences of the zinc-finger motif common to both DNA ligases IIIα and IIIβ.
Mackey ZB, Niedergang C, Murcia JM, Leppard J, Au K, Chen J, de Murcia G, Tomkinson AE: DNA ligase III is recruited to DNA strand breaks by a zinc finger motif homologous to that of poly(ADP-ribose) polymerase. Identification of two functionally distinct DNA binding regions within DNA ligase III.
J Biol Chem 1999, 274:21679-21687.
Shows that the DNA ligase zinc-finger motif plays a role in binding to nicked DNA.
Wei YF, Robins P, Carter K, Caldecott K, Pappin DJ, Yu GL, Wang RP, Shell BK, Nash RA, Schär P, et al.: Molecular cloning and expression of human cDNAs encoding a novel DNA ligase IV and DNA ligase III, an enzyme active in DNA repair and recombination.
Mol Cell Biol 1995, 15:3206-3216.
The first evidence for the existence of DNA ligase IV in mammalian cells.
Nature 1997, 388:495-498.
One of the first papers describing the isolation of the yeast homolog of DNA ligase IV.
Curr Biol 1998, 8:873-876.
Compelling evidence that DNA ligase IV binds to Xrcc4, not via its BRCT domains, but instead via the linker region located between the BRCT domains.
Nat Struct Biol 2001, 8:1015-1019.
The crystal structure of a homodimer of human Xrcc4 complexed with a short peptide corresponding to the Xrcc4-binding region of DNA ligase IV.
Mol Cell 2000, 6:1183-1193.
Describes the first three-dimensional structure of a eukaryotic ATP-dependent DNA ligase, that of the Chlorella virus of Paramecium bursaria, PBCV-1.
Cell 1996, 85:607-615.
The first DNA ligase structure to be determined, that of the ATP-dependent enzyme encoded by bacteriophage T7.
Biochemistry 2001, 40:13158-13166.
The solution structure of the Xrcc1-interacting BRCT domain from the carboxyl terminus of human DNA ligase IIIα, as determined by NMR.
EMBO J 1998, 17:6404-6411.
Crystal structure of the DNA ligase IIIα-interacting BRCT domain from the Xrcc1 protein, the first BRCT domain structure to be solved.
Curr Biol 2001, 11:R842-R844.
A brief review of recent advances in dissecting the mechanisms of Okazaki fragment processing in eukaryotes.
J Biol Chem 1997, 272:11550-11556.
Analysis of the role of the amino-terminal domain of human DNA ligase I in DNA replication using a cell-free system.
Montecucco A, Rossi R, Levin DS, Gary R, Park MS, Motycka TA, Ciarocchi G, Villa A, Biamonti G, Tomkinson AE: DNA ligase I is recruited to sites of DNA replication by an interaction with proliferating cell nuclear antigen: identification of a common targeting mechanism for the assembly of replication factories.
EMBO J 1998, 17:3786-3795.
Clear demonstration that the PCNA-binding motif alone functions to localize DNA ligase I to sites of ongoing DNA replication (replication factories) in mammalian nuclei.
Rossi R, Villa A, Negri C, Scovassi I, Ciarrocchi G, Biamonti G, Montecucco A: The replication factory targeting sequence/PCNA binding site is required in G1 to control the phosphorylation status of DNA ligase I.
EMBO J 1999, 18:5745-5754.
Analysis of the role of the PCNA-binding motif at the amino-terminus of human DNA ligase I in regulating phosphorylation of the protein.
Curr Biol 1999, 9:R759-R761.
A brief overview of double-stranded DNA-break repair strategies.
Curr Biol 1998, 8:1395-1398.
Demonstration that DNA ligase IV is essential for embryonic development in mice.
Nature 1998, 396:173-177.
Demonstration that DNA ligase IV is essential for embryonic development and V(D)J recombination in mice.
Nucleic Acids Res 2001, 29:1582-1589.
Investigation of the role of the budding yeast DNA ligase I protein Cdc9 in the mitochondrion.