
Precision genome editing using synthesis-dependent repair of Cas9-induced DNA breaks

  1. Geraldine Seydouxa,1
  1. aDepartment of Molecular Biology and Genetics, Howard Hughes Medical Institute, The Johns Hopkins University School of Medicine, Baltimore MD 21205
  1. Contributed by Geraldine Seydoux, October 26, 2017 (sent for review July 5, 2017; reviewed by Dana Carroll and James E. Haber)

Significance

Genome editing, the introduction of precise changes in the genome, is revolutionizing our ability to decode the genome. Here we describe a simple method for genome editing in mammalian cells that takes advantage of an efficient mechanism for gene conversion that utilizes linear donors. We demonstrate that PCR fragments containing edits up to 1 kb require only 35-bp homology sequences to initiate repair of Cas9-induced double-stranded breaks in human cells and mouse embryos. We experimentally determine donor DNA design rules that maximize the recovery of edits without cloning or selection.

Abstract

The RNA-guided DNA endonuclease Cas9 has emerged as a powerful tool for genome engineering. Cas9 creates targeted double-stranded breaks (DSBs) in the genome. Knockin of specific mutations (precision genome editing) requires homology-directed repair (HDR) of the DSB by synthetic donor DNAs containing the desired edits, but HDR has been reported to be variably efficient. Here, we report that linear DNAs (single and double stranded) engage in a high-efficiency HDR mechanism that requires only ~35 nucleotides of homology with the targeted locus to introduce edits ranging from 1 to 1,000 nucleotides. We demonstrate the utility of linear donors by introducing fluorescent protein tags in human cells and mouse embryos using PCR fragments. We find that repair is local, polarity sensitive, and prone to template switching, characteristics that are consistent with gene conversion by synthesis-dependent strand annealing. Our findings enable rational design of synthetic donor DNAs for efficient genome editing.

Precision genome editing begins with the creation of a double-stranded break (DSB) in the genome near the site of the desired DNA sequence change (“edit”) (1). Generation of targeted DSBs has been greatly accelerated in recent years by the discovery of CRISPR-Cas9, a programmable DNA endonuclease that can be targeted to a specific DNA sequence by a small “guide” RNA (crRNA) (2). DSBs are lethal events that must be repaired by the cell’s DNA repair machinery. DSBs can be repaired via imprecise, nonhomology-based repair mechanisms, such as nonhomologous end joining (NHEJ), or by precise, homology-dependent repair (HDR) (3). HDR utilizes DNAs that contain homology to sequences flanking the DSB (termed homology arms) to template the repair. If a synthetic “donor” DNA containing the desired edit is available when the DSB is generated, the cellular HDR machinery will use the donor DNA to repair the DSB and the edit will be incorporated at the targeted locus (1). Several studies have reported that single-stranded oligodeoxyribonucleotides (ssODNs) can be used to introduce short edits (<50 bases) (ref. 4 and references therein). ssODNs that target the DNA strand that is first released by Cas9 after DSB generation have been reported to perform best (5). This strand preference, however, has only been tested for small edits near the DSB and has not been observed at all loci (4). Edits at a distance from the DSB (>10 bp) are recovered at lower frequencies (4, 6). Recovery of large edits (such as GFP knockins) has also been reported to be inefficient, requiring large plasmid donors with long (>500 nt) homology arms or selection markers to recover the rare edits (3). Large insertions have been obtained through nonhomologous or microhomology-mediated end joining reactions (NHEJ and MMEJ), but these approaches require simultaneous Cas9-induced cleavage of donor and target DNAs (7–13).

We documented previously that, in Caenorhabditis elegans, HDR can be very efficient, provided that the donor DNAs are linear (14). Linear donors do not appear to integrate at the DSB, but instead are used as templates for DNA synthesis, as in the synthesis-dependent strand-annealing (SDSA) model for gene conversion (1, 15, 16). In C. elegans, donors for SDSA can be single (ssODNs) or double stranded (PCR fragments) and require only short homology arms (~35 bases) to engage the DSB. The repair process is sensitive to insert size and prone to template switching, where synthesis can “jump” between two overlapping donors (14). In human cells, SDSA has been proposed as a repair mechanism for ssODNs (4, 17), but not for double-stranded donors, which are thought to participate in a different HDR pathway (18, 19). Here, we investigate how linear donors engage the DSB repair machinery in mammalian cells. First, we demonstrate that, as in C. elegans, PCR fragments with 35-bp homology arms function as efficient donors for genome editing in mouse embryos and human cells. Using PCR fragments and ssODNs, we investigate the sequence requirements for efficient repair by linear donors in human cells. Our findings are consistent with SDSA and suggest simple donor DNA design principles to maximize editing efficiency.

Materials and Methods

Detailed Results, Sequences, and Solutions.

SI Appendix, Table S1 lists all experiments, including detailed conditions and results of experimental replicates. SI Appendix, Tables S2–S5, list sequences of linear donors, plasmids, PCR primers, and cr/sgRNAs, respectively. The positions of the cr/sgRNAs on the loci targeted in this study can be found in SI Appendix, Fig. S1. Fig. 1 describes mouse experiments and Figs. 2–7 describe HEK293T cell experiments. Results presented in Figs. 2, 3, and 7 B and D and SI Appendix, Fig. S2 are the average of at least two independent experiments and the error bars represent the SD.

Fig. 2.

PCR fragments with short homology arms are efficient donors to create GFP knockins in HEK293T cells. (A) Diagrams showing PCR donors for GFP insertion at the Lamin A/C and RAB11A loci. Locus, gray; GFP, green; homology arms, blue; and DSB, vertical line. GFP was inserted at the DSB in Lamin A/C and 11 bp upstream of the DSB in RAB11A. (B) Graphs showing percentage of GFP+ cells obtained with PCR donors with homology arms of the indicated lengths (33/33 refers to a right homology arm and a left homology arm, each 33 bp long). Insert size in all cases was 714 bp. Each bar represents the average insertion efficiency from two or more independent experiments (SI Appendix, Table S1). Error bars represent the ±SD. PCR fragments were nucleofected in HEK293T cells at the concentration indicated and cells were counted by flow cytometer 3 d later. For this and all other figures, SI Appendix, Table S1 provides details. (C) Graphs showing percentage of GFP+ cells obtained with PCR or plasmid donors with homology arms of the indicated lengths. Insert size in all cases was 714 bp. Each bar represents the average insertion efficiency from two or more independent experiments (SI Appendix, Table S1). Error bars represent the ±SD. PCR fragments were nucleofected in HEK293T cells at the concentration indicated and cells were counted by flow cytometer 3 d later. (D) Confocal images of cells 3 d after nucleofection. GFP, green; DNA, blue. The GFP subcellular localizations are as expected for in-frame translational fusions.

Fig. 3.

Editing efficiency increases with decreasing insert size. Graphs showing percentage of GFP+ cells obtained with PCR donors with homology arms and inserts of the indicated lengths. Each bar represents the average insertion efficiency from two or more independent experiments (SI Appendix, Table S1). Error bars represent the ±SD. (A) Knockin of donors containing full-length GFP at the Lamin A/C locus. PCR fragments were nucleofected in HEK293T cells at the concentration indicated and cells were counted by microscopy 3 d later. (B) Knockin of donors containing full-length GFP or GFP11 at the Lamin A/C locus. PCR fragments were nucleofected at the concentration indicated in HEK293T (expressing GFP1–10) and cells were counted by microscopy 3 d later. (C) Knockin of donors containing full-length GFP or GFP11 at the RAB11A locus (11 bp upstream of DSB). PCR fragments were nucleofected at the concentration indicated in HEK293T (expressing GFP1–10), and cells were counted by flow cytometer 3 d later.

Fig. 4.

Repair is a polarity-sensitive process. (A) Synthesis-dependent strand annealing (SDSA) model for gene conversion (15, 16). In this, and all other schematics, each line corresponds to a DNA strand. Locus DNA is in gray, donor homology arms are in blue, donor insert is in green, and arrows indicate 3′ ends. Donor DNA strands of opposite polarity are shown above and below the locus for clarity. PCR donors contain both strands; ssODN donors would contain either a sense or an antisense strand. Dotted lines represent DNA synthesized during the repair process. Resection of DSB: DSB is resected, creating 3′ overhangs on each side of the DSB. Strand invasion and DNA synthesis: The overhangs pair with complementary strands in the donor and are extended by DNA synthesis. Annealing: The newly synthesized strands withdraw from the donor and anneal back at the locus. Ligation (not shown) seals the break. (B) Diagrams showing donor ssODNs with only one homology arm (same conventions as in A). The ssODNs contain a 126-bp insert (green) coding for 3×Flag and GFP11 and a homology arm targeting either the right or left side of the DSB (SI Appendix, Table S1). (C) Normalized editing efficiency of ssODNs containing only one homology arm at the Lamin A/C and RAB11A loci. The polarity that allows pairing between the ssODN and resected ends (as shown in diagram in A) is favored. Sense and antisense ssODNs were tested in parallel experiments and their efficiencies were normalized as follows: normalized efficiency of sense ssODN (light blue) = % GFP+ cells with sense ssODN/[% GFP+ cells with sense ssODN + % GFP+ cells with antisense ssODN]. Normalized efficiency of antisense ssODN (dark blue) = % GFP+ cells with antisense ssODN/[% GFP+ cells with sense ssODN + % GFP+ cells with antisense ssODN]. Numbers on top of each column indicate the nonnormalized % of GFP+ cells for each ssODN determined by microscopy (Lamin A/C) or flow cytometer (RAB11A).
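The normalization described in Fig. 4C can be written as a short calculation. The following is a minimal sketch, not the authors' analysis code; the function name and the input percentages in the usage note are hypothetical and chosen only for illustration.

```python
def normalized_efficiencies(pct_sense: float, pct_antisense: float) -> tuple[float, float]:
    """Return (sense, antisense) efficiencies normalized to their sum.

    Inputs are the raw % GFP+ cells measured in parallel experiments
    with the sense and the antisense ssODN.
    """
    total = pct_sense + pct_antisense
    if total <= 0:
        raise ValueError("at least one ssODN must yield GFP+ cells")
    return pct_sense / total, pct_antisense / total
```

With hypothetical raw values of 30% (sense) and 10% (antisense), the normalized efficiencies are 0.75 and 0.25. The two normalized values always sum to 1, which is why the raw percentages are reported on top of each column.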

Fig. 5.

Polarity of ssODNs affects incorporation of distal edits. (A) Schematics showing possible pairing interactions between resected locus (gray) and ssODNs (light or dark blue for sense and antisense ssODN, respectively, arrows indicate 3′ ends) coding for a distal insert (green). Sequences between the DSB and insert were recoded to help integration of the distal insert and prevent cutting of edited locus by Cas9. (B) Normalized efficiency of sense versus antisense ssODNs calculated as in Fig. 4 (SI Appendix, Table S1 provides detailed results). Distance from the DSB, locus, and guide RNA polarity are indicated below each experiment. ssODN polarity has little effect on editing efficiency for proximal edits, but has a larger effect for distal edits. The favored polarity changes, depending on whether the distal edit is positioned to the left or right of the DSB. Note that the favored ssODN polarity does not correlate with crRNA polarity (for example, first two columns in the graph show crRNA1776 and crRNA1777, which cut at the same position but have opposite polarity). Experiments involving the PYM1 locus were done on HEK293T that were cloned out and genotyped by PCR genotyping (size shift) for 3×Flag insertion (Fig. 6). All other experiments were performed on HEK293T (GFP1–10) cells that were directly scored for GFP+ by flow cytometer or microscopy 3 d after nucleofection. Numbers above each column indicate the overall percentage of edits. Note that overall frequency decreases with increasing distance from the DSB (also see SI Appendix, Fig. S6).

Fig. 6.

Recoding of sequences between the DSB and the edit increases recovery of distal edits. (A) Schematics showing resected locus (gray with arrow at the 3′ ends, PYM1 locus) and ssODN donor (blue with arrow at the 3′ end) coding for a proximal edit (green, restriction enzyme site, 1 bp to the right of the DSB) and a distal edit (red, 3×Flag, 23 bp to the left of the DSB). Double arrows represent the region between the proximal and distal edits that is recoded (silent mutations). (B) Graphs showing percentage of edited cells containing proximal + distal edits (purple), proximal only (green), or distal only (red), using a ssODN donor with or without a recoded region. More than 50 cell clones were analyzed by PCR genotyping (size shift) and RE digestion. (C) Schematics showing resected locus (gray with arrow at the 3′ ends, Lamin A/C locus) and PCR donor (blue, thick bar) coding for a proximal edit (green, GFP11 inserted at the DSB) and a distal edit (red, tagRFP, 33 bp to the right of the DSB). Double arrows represent the region between proximal and distal edits that is recoded (silent mutations). (D) Graphs showing percentage of edited cells containing proximal + distal edits (purple), proximal only (green), or distal only (red), using a PCR donor with or without a recoded region. Edits were determined by direct examination of >1,000 cells by microscopy.

Fig. 7.

Repair is prone to template switching between donors. (A) Schematics showing repair of a DSB at the RAB11A locus (gray) with two ssODN donors. Arrows indicate 3′ ends. Donor 1 contains GFP11 (green) with a stop codon (red cross) and two homology arms (blue). Donor 2 contains GFP11 with no stop codon and no homology arm. Double arrows indicate identical sequence shared between the donors. (B) Graphs showing the percent of GFP+ cells (y axis, as determined by flow cytometer) for each donor combination (x axis). Each bar represents the average insertion efficiency from two independent experiments (SI Appendix, Table S1). Error bars represent the ±SD. For comparison, an ssODN identical to donor 1 but without the stop codon gives 17.2% edits (discontinuous rightmost bar). (C) Schematics showing repair of a DSB at the RAB11A locus as in diagram A but with two PCR donors (thick bars). (D) Graphs showing the percent of GFP+ cells as in graphs B but with two PCR donors. Each bar represents the average insertion efficiency from two independent experiments (SI Appendix, Table S1). Error bars represent the ±SD. (E) Schematics showing repair of a DSB at the Lamin A/C locus (gray) with two ssODN donors. Arrows represent 3′ ends. Donor 1 contains GFP11 (green) and two homology arms (blue). Donor 2 contains a recoded GFP11 (stars) with no homology arm. Double arrows indicate identical sequence shared between the donors. In this experiment, the edits were amplified en masse by PCR using a locus-specific primer and an insert-specific primer and sequenced by Illumina sequencing (Materials and Methods). (F) Graph showing the percentage of reads with evidence of template switching (y axis) for each donor combination (x axis). Donor 1 + donor 2 without mutations and donor 1 + donor 2 with one mutation every 3 nucleotides (1/3) show no evidence of template switching (0%), whereas donor 1 + donor 2 (1/6) and donor 1 + donor 2 (1/12) show evidence of template switching (0.5% and 1.4%, respectively). SI Appendix, Fig. S7 and Table S6 provide details.

Repair Templates, Cas9, cr/tracrRNAs, and Plasmids for Cell Culture.

ssODNs (ultramers) and PCR primers were ordered from IDT and reconstituted at 50 μM and 100 μM, respectively, in water. For the Illumina sequencing experiment shown in Fig. 7F, ssODNs and primers were ordered PAGE purified. PCR fragment donors were synthesized as described in ref. 20.

Cas9 protein was purified as described in ref. 21. crRNAs and tracrRNA were ordered from IDT and reconstituted in 5 mM Tris-HCl pH 7.5 at 130 μM. Plasmids containing repair templates were made using gBlock gene fragments (IDT) and InFusion cloning kit (Clontech), and purified using the Qiagen miniprep kit and eluted in H2O. For experiments at the PYM1 locus, the sgRNA was cloned as described in ref. 22.

Cas9 RNP Nucleofection.

With the exception of experiments at the PYM1 locus (see below), all experiments in this study used Cas9 ribonucleoprotein (RNP) delivery (23). Nucleofections using Cas9 RNP were performed as described (24). HEK293T cells or HEK293T cells expressing a truncated GFP (GFP1–10) (25) were grown to 50–75% confluency, trypsinized, pelleted, and resuspended at 800,000 cells per 80 μL of PBS. Just before nucleofection, PBS was replaced with 80 μL of Nucleofector kit V (Lonza). A total of 40 μL of Cas9 RNP mix (see below) was added to the cells in suspension in Nucleofector kit V and processed using an Amaxa Nucleofector 2b machine (Lonza) with the A023 program. Cells were transferred to culture media and analyzed for fluorescence 3 d later.

The Cas9 RNP mix contains: 6.5 μM of crRNA and tracrRNA, 9.8 μM of Cas9 (1.6 μg/μL), a variable concentration of repair templates (SI Appendix, Table S1 provides details), 10.4% glycerol, 131 mM KCl, 5.2 mM Hepes, 1 mM MgCl2, 0.5 mM Tris-HCl, pH 7.5.

For sequencing of GFP edits at the Lamin A/C locus, cells were sorted [at the Johns Hopkins University (JHU) Ross Flow Cytometry Core Facility] for GFP signal and cloned in 96-well plates for genotyping or pooled in a 6-well plate for microscopy analysis. Single-cell clones were lysed using QuickExtract DNA Extraction Solution (Epicentre) and genotyped by PCR using Phusion Taq (NEB) with genomic primers outside of the HDR fragment. PCR products were analyzed on agarose gel and sequenced (SI Appendix, Figs. S4 and S5).

Cas9 Plasmid Transfections.

For experiments at the PYM1 locus, Cas9 and the sgRNA were delivered on plasmids. HEK293T cells were grown to 50–75% confluency in a six-well plate (with 2 mL of culture media per well). A total of 10.8 μL of Cas9 plasmid mix (containing 3.6 μL of X-tremeGENE 9 DNA Transfection Reagent from Roche, 892 ng of plasmid pX458 containing PYM1 sgRNA and 3.24 pmol of repair template) was added to 120 μL of optiMEM glutaMAX media (Thermo Fisher Scientific), incubated for 15 min at room temperature, and then added to the cells. Forty-eight hours after transfection, cells were sorted for GFP signal (to select for cells that received pX458) and grown out as single-cell clones. The single-cell clones were lysed and genotyped by PCR. PCR products were directly analyzed on agarose gel or mixed with EcoRI (NEB) and the corresponding restriction enzyme (RE) buffer, digested overnight, and analyzed on agarose gel.

Cytometer Analysis.

For each experiment, 5,000–10,000 cells were analyzed using a Guava EasyCyte 6/2L (Millipore) cytometer. Cells were scored as GFP+ if they exhibited a higher signal than 99.5% of nontransfected control cells.
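The gating rule above (a cell is GFP+ if its signal exceeds that of 99.5% of nontransfected control cells) can be sketched as follows. This is an illustrative sketch only, assuming one fluorescence value per cell; the function names are hypothetical and this is not the cytometer software's own gating procedure.

```python
import math
from fractions import Fraction


def gfp_threshold(control_signals, percentile="99.5"):
    """Smallest control signal that is >= `percentile` percent of control cells."""
    if not control_signals:
        raise ValueError("need at least one control cell")
    ordered = sorted(control_signals)
    # Exact rational arithmetic avoids float rounding in the rank calculation.
    rank = math.ceil(Fraction(percentile) * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def fraction_gfp_positive(sample_signals, control_signals, percentile="99.5"):
    """Fraction of sample cells whose signal is strictly above the control threshold."""
    if not sample_signals:
        raise ValueError("need at least one sample cell")
    threshold = gfp_threshold(control_signals, percentile)
    positives = sum(1 for signal in sample_signals if signal > threshold)
    return positives / len(sample_signals)
```

By construction, about 0.5% of nontransfected control cells fall above the threshold, which sets the background level against which edited populations are compared.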

HEK293T (GFP1–10) cells exhibit a higher basal green fluorescence than wild-type HEK293T cells. Cytometer analysis could not be performed on these cells for GFP11-tagged Lamin A/C and SMC3. For those experiments, as well as for RFP tagging, cells were analyzed by fluorescence microscopy and scored manually (see below).

Microscopy.

Cells were fixed in 4% PFA and mounted with DAPI. Cells were imaged using a confocal microscope with a 63× objective. More than 50 fields of cells (>1,000 cells) were selected in the DAPI channel, photographed, and analyzed for GFP or RFP expression manually.

PCR Amplicons for Illumina Sequencing.

HEK293T (GFP1–10) cells were nucleofected with different combinations of repair ssODNs (Fig. 7E and SI Appendix, Table S1). To control for possible template switching during PCR amplification, we also introduced single donors (wild type or mutant) in two separate cell populations and combined the cells during PCR amplification. Sixty hours after nucleofection, cells were trypsinized, washed in PBS, and 500,000 cells were lysed in 40 μL of QuickExtract DNA Extraction Solution. A total of 40 μL of H2O was added to each lysis. A total of 6 μL of DNA from each experiment was PCR amplified using Phusion Taq and the primer 390 (forward, in the left end of the insert) and the primer 1849 (reverse, in the Lamin A/C locus downstream of the right homology arm of the ssODN used for repair) for 10 cycles at 68.5 °C (SI Appendix, Table S4 provides primer sequences). After 10 PCR cycles, no band could be detected on an agarose gel stained with ethidium bromide. Each PCR was purified using Qiagen Minelute columns and eluted in 10 μL of H2O. A total of 2 μL of each PCR was amplified using Phusion Taq at 65 °C for 20 cycles. PCR reactions did not reach an amplification plateau with this number of cycles. The PCR reactions were performed using primers 1928 (forward, containing the Illumina sequence and annealing in the same region as primer 390) and reverse primers containing the Illumina sequence and a specific barcode. The Illumina reverse primers anneal with the Lamin A/C locus just upstream of primer 1849 and downstream of the right homology arm of the ssODN used for repair.

PCR amplicons were purified on a 10% nondenaturing Tris-borate-EDTA (TBE)/PAGE gel and the band corresponding to the PCR product was cut from the gel, eluted overnight, and precipitated with isopropanol. After resuspension, sample concentrations were quantified on a bioanalyzer, and the barcoded samples were pooled to a concentration of 0.4 μM per sample in 10 μL. This sample was submitted to The Johns Hopkins School of Medicine Genetics Resources Core Facility for 250-cycle paired-end sequencing on an Illumina MiSeq instrument.

Illumina Sequencing Analysis.

After demultiplexing of barcoded samples, the 3′ adaptor and all downstream nucleotides were trimmed from the forward reads using Cutadapt (journal.embnet.org/index.php/embnetjournal/article/view/200), and the resulting sequences were mapped to the insert + Lamin A/C locus using Bowtie 2 (26). After removing reads that did not fully map to the template and low-quality reads (Q score <35; error probability of 0.00032), sequences were parsed for template switching. To score template switches, we evaluated sequencing reads at diagnostic positions and determined whether each position matched the sequence of the wild-type or mutated template. Reads with a diagnostic nucleotide that did not match either the wild-type or mutated template were discarded. Because the PCR control sample contained a mixture of the fully wild-type and fully mutated templates, we used the first diagnostic position (from the right side of the insert) only as an “anchor” to determine the initial identity of the template; this position was not used to score switching. Thereafter, whenever two or more contiguous diagnostic nucleotides indicated a switch in template identity, we scored this as a switch. For the control sample in which both templates were wild type, we used the “1/6”-mutated template for comparison, to determine the rate of false-positive switches in the assay. Because the PCR control experiment was performed with the wild-type and 1/6-mutated template (SI Appendix, Fig. S7 and Table S1), we also used the 1/6-mutated template for scoring switches in this sample. SI Appendix, Table S6 provides details.
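The switch-scoring rule described above can be sketched in a few lines. This is a simplified illustration, not the pipeline used in the study: it assumes each read has already been trimmed and aligned so that it has the same length as the two templates, and that index 0 is the anchor side of the insert. The function names and the `"W"`/`"M"` encoding are hypothetical.

```python
from typing import Optional


def call_diagnostics(read: str, wild_type: str, mutated: str) -> Optional[str]:
    """Return one call per diagnostic position: 'W' (wild type) or 'M' (mutated).

    Diagnostic positions are those where the two templates differ.
    Returns None if any diagnostic base matches neither template,
    in which case the read is discarded.
    """
    if not (len(read) == len(wild_type) == len(mutated)):
        raise ValueError("read and templates must be aligned to the same length")
    calls = []
    for base, wt, mut in zip(read, wild_type, mutated):
        if wt == mut:
            continue  # not a diagnostic position
        if base == wt:
            calls.append("W")
        elif base == mut:
            calls.append("M")
        else:
            return None
    return "".join(calls)


def count_switches(calls: str, min_run: int = 2) -> int:
    """Count template switches in a string of diagnostic calls.

    The first call only anchors the initial template identity. A switch is
    scored when `min_run` or more contiguous calls disagree with the current
    template; shorter runs are ignored.
    """
    if not calls:
        return 0
    current = calls[0]
    run = 0
    switches = 0
    for call in calls[1:]:
        if call == current:
            run = 0
        else:
            run += 1
            if run >= min_run:
                switches += 1
                current = call
                run = 0
    return switches
```

Requiring two contiguous diagnostic nucleotides means that an isolated mismatch, for example from a sequencing or synthesis error, is not counted as a switch.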

Cas9 RNP Injection in Mouse Zygotes.

All mouse experiments were carried out under protocols approved by the JHU Animal Care and Use Committee.

The PCR fragment donor was synthesized as described in ref. 20. The plasmid donor was generated using a gBlock and restriction enzyme cloning, and purified by the Qiagen midi-prep kit and eluted in injection buffer (10 mM Tris-HCl, pH 7.5, 0.1 mM EDTA). Pronuclear injections of zygotes (from B6SJLF1/J parents) (The Jackson Laboratory) were performed by the JHU Transgenic Facility at a final concentration of 30 ng/μL Cas9 protein (PNABio), 0.6 μM each of crRNA/tracrRNA (Dharmacon), and PCR donor (3 ng/μL or 5 ng/μL) or plasmid donor (10 ng/μL). The Cas9 protein, crRNA, and tracrRNA were combined from stocks at 1,000 ng/μL, 20 μM, 20 μM, respectively, and incubated at 4 °C for 10 min. Then injection buffer was added to dilute to the final working concentrations above (SI Appendix, Table S1) along with repair vector or fragment. The solution was microcentrifuged 5 min at 13,000 × g and the solution used for injection. Pups were genotyped using genomic primers immediately outside of the PCR donor sequence, or by using one primer in mCherry and one upstream of the 483-bp homology arm in the case of the plasmid donor. Genomic DNA from all pups was also subjected to PCR amplification with internal mCherry-specific primers to identify random insertions of the donor template (locus-specific mCherry negative/internal mCherry product positive).
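The stock volumes needed to reach the final working concentrations follow from the standard dilution relation C1·V1 = C2·V2. The sketch below is illustrative only; the function name and the 100-μL final volume are assumptions, since the injection mix volume is not stated here.

```python
def stock_volume(final_conc: float, stock_conc: float, final_volume: float) -> float:
    """Volume of stock needed so that the final mix reaches `final_conc`.

    Concentrations must share one unit (e.g. ng/uL or uM); the result is in
    the same unit as `final_volume`.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_conc * final_volume / stock_conc


# Hypothetical 100-uL injection mix (the volume is an assumption).
final_volume_ul = 100.0
cas9_ul = stock_volume(30, 1000, final_volume_ul)   # Cas9: 1,000 -> 30 ng/uL
crrna_ul = stock_volume(0.6, 20, final_volume_ul)   # crRNA: 20 -> 0.6 uM
tracrrna_ul = stock_volume(0.6, 20, final_volume_ul)  # tracrRNA: 20 -> 0.6 uM
```

For this hypothetical volume, each of the three stocks contributes about 3 μL, and the remainder is made up of injection buffer and donor DNA.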

We identified seven pups (11%, out of 60 pups without mCherry insertion at the Adcy3 locus) with potential transgenic insertions of the PCR fragment at other undetermined loci. In contrast, we identified no transgenics (0%, out of 20 pups without mCherry insertion at the Adcy3 locus) when using the plasmid donor.

Results

mCherry Tagging of a Mouse Locus Using a PCR Donor with Short Homology Arms.

In mammalian systems, ssODNs and plasmids are most commonly used as donors for genome editing (3). To test whether PCR fragments with short homology arms can also function as donors, we designed a PCR fragment to insert mCherry near the C terminus of the mouse adenylyl cyclase 3 (Adcy3) locus. The mCherry ORF (739 bp) flanked by 36-bp homology arms for the Adcy3 locus was amplified by PCR. The purified PCR fragment and in vitro-assembled Cas9 complexes were coinjected into mouse zygotes, and the resulting pups were genotyped by PCR and Sanger sequencing (Fig. 1). We identified 27/87 pups with a correct size insertion at the Adcy3 locus (31% editing efficiency). Sequencing of 10 full-size mCherry edits revealed them all to be precise (no indels). A parallel editing experiment using an mCherry supercoiled plasmid with 500-bp homology arms yielded five edits from 25 pups (20% editing efficiency). Similar knockin efficiencies have also been reported using long single-stranded donors (27). These results suggest that single-stranded DNAs, plasmids, and PCR fragments function with similar efficiency for genome editing in mouse embryos. Unlike single-stranded DNAs and plasmids, PCR fragments have the added convenience of ease of synthesis especially for long inserts.

GFP Tagging of Human Loci Using PCR Donors with Short Homology Arms.

To determine whether PCR fragments can also function for genome editing in human cells, we attempted to knock in GFP at three loci in HEK293T cells. We designed the homology arms to insert GFP 0, 11, and 5 bp away from a Cas9 cleavage site in the Lamin A/C, RAB11A, and SMC3 ORFs, respectively (Fig. 2 and SI Appendix, Fig. S2). The PCR fragments (0.21–0.33 μM) and in vitro-assembled Cas9-guide RNA complexes were introduced by nucleofection into HEK293T cells without selection as in ref. 24. The efficiency of GFP integration was examined 3 d later by cytometer or fluorescence microscopy. These methods permit the scoring of >5,000 cells (cytometer) and >1,000 cells (fluorescence) per nucleofection experiment, and we performed at least two independent experiments for each condition (Materials and Methods). We obtained an average of 14.9%, 17.5%, and 14.0% GFP+ cells for the Lamin A/C, RAB11A, and SMC3 loci, respectively (Fig. 2B and SI Appendix, Fig. S2B). In each case, the cells expressed GFP in a pattern consistent with the targeted ORF (Fig. 2D and SI Appendix, Fig. S2C).

Reducing the molarity of the PCR fragments by 10-fold reduced efficiency by ~1/2 (compare Fig. 2 B and C). Increasing the length of the homology arms to 500 bp did not increase editing efficiency, even when controlling for the reduced molarity of the longer PCR fragments (Fig. 2C). Reducing the length of the homology arms to ~15 bp, however, decreased efficiency (Fig. 2B). PCR fragments with no homology arm or homology arms for a locus not targeted by Cas9 yielded GFP+ in the range of the background levels obtained with cells that did not receive any repair template (Fig. 2 and SI Appendix, Figs. S2 and S3 and Table S1). Plasmid donors with ~500-bp homology arms also performed poorly (Fig. 2C) as reported previously (7). We conclude that PCR fragments function as efficient donors in HEK293T cells, performing better than plasmids with much longer homology arms. Because ~35-bp homology arms are convenient to introduce by PCR amplification, we used that length for subsequent experiments. The 30- to 40-nt homology arms have also been reported to be optimal for ssODNs (4).
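Short homology arms can be added to an insert as 5′ tails on the PCR primers. The sketch below illustrates this general approach; it is not the protocol of ref. 20. The function names, the 20-nt priming region, and any sequences passed in are assumptions for illustration, and all sequences are taken to be written 5′ to 3′ on the top strand of the locus.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_donor_primers(left_arm: str, right_arm: str, insert: str,
                         priming_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers that append homology arms to an insert.

    forward = left homology arm + first `priming_len` nt of the insert
    reverse = reverse complement of (last `priming_len` nt of insert + right arm)
    """
    if len(insert) < priming_len:
        raise ValueError("insert is shorter than the priming region")
    forward = left_arm + insert[:priming_len]
    reverse = reverse_complement(insert[-priming_len:] + right_arm)
    return forward, reverse


def donor_amplicon(left_arm: str, right_arm: str, insert: str) -> str:
    """Top strand of the expected PCR donor: left arm + insert + right arm."""
    return left_arm + insert + right_arm
```

With 35-nt arms and a 20-nt priming region, each primer is 55 nt long, which is one reason arms of this length are convenient to introduce by PCR.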

Editing Efficiency Is Sensitive to Insert Size.

To test the effect of insert size on editing efficiency, we added varied sizes of DNA sequence to the GFP insert. For ease of synthesis and to maintain equimolar amounts of donor DNAs, we introduced donor fragments at the same low molarity (0.12 μM). We found that inserts beyond 1 kb performed very poorly, yielding fewer than 0.5% edits (Fig. 3A). By varying the size of the homology arms, we found that the size of the insert, and not the overall size of the donor DNA, determines editing efficiency. A 1,188-bp donor (714-bp insert with two 237-bp homology arms) performed as well as a 780-bp donor with the same size insert and 33-bp homology arms (8.5% versus 9.8% edits, Fig. 3A). The 1,188-bp donor, however, performed much better than a 1,188-bp donor with a longer insert (1,122 bp) and 33-bp homology arms (8.5% versus 0.3% edits, Fig. 3A).
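The donor sizes quoted above follow from the insert length plus two homology arms. The short sketch below recomputes them from the numbers reported in this paragraph; the function name is hypothetical.

```python
def donor_length(insert_bp: int, arm_bp: int) -> int:
    """Total length of a linear donor with two homology arms of equal size."""
    return insert_bp + 2 * arm_bp


# (insert bp, homology arm bp, % edits) as reported for Fig. 3A
donors = [(714, 237, 8.5), (714, 33, 9.8), (1122, 33, 0.3)]
for insert_bp, arm_bp, pct_edits in donors:
    print(donor_length(insert_bp, arm_bp), insert_bp, pct_edits)
```

The first and third donors have the same total length (1,188 bp) but differ in insert size, and only the one with the longer insert performs poorly, which is the basis for concluding that insert size rather than overall donor size limits efficiency.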

To test whether decreasing insert size below the size of GFP would increase editing efficiency, we took advantage of the split-GFP system (24, 25). In this system, the 11th beta-strand of GFP (57 bp, GFP11) is knocked in, in cells expressing a complementary GFP fragment (GFP1–10). We generated PCR products containing the GFP11 insert and ~35-bp homology arms and introduced these at 0.33 μM. We obtained 45.4% edits at the Lamin A/C locus (Fig. 3B) and 32.8% at the RAB11A locus (Fig. 3C). A donor with no homology arm yielded only 1.3% edits (SI Appendix, Fig. S3B). Again, we found that increasing insert size reduced efficiency, down to 17.9% for a 993-bp insert (Fig. 3B). We conclude that dsDNAs engage in an efficient repair process that requires only 35-bp homology arms, but favors relatively short inserts (<1 kb at the molarities tested here).

Accuracy of Repair Is Asymmetric.

To investigate the accuracy of repair with PCR fragments, we isolated GFP+ and GFP− cells by fluorescence-activated cell sorting from a single editing experiment targeting the Lamin A/C locus with a GFP-containing PCR fragment under optimal conditions (Fig. 2B, 33/33 homology arms, 0.33 μM molarity). Each cell was grown out as a clone and the Lamin A/C locus was amplified using two primers flanking the insertion site. As expected, all 48 GFP+ clones contained at least one Lamin A/C allele with a full-size insert (four were homozygous with two edited alleles). We sequenced the GFP insert in 23 of the 48 GFP+ clones and identified 20 precise insertions and three imprecise insertions containing small in-frame indels at the left or right junction (SI Appendix, Figs. S4 and S5). We also sequenced the wild-type–sized allele in 11 of the 44 heterozygous GFP+ clones and identified two with wild-type sequence, six with indels at the DSB, and three with small inserts (<100 bp) corresponding to either the N terminus or C terminus of GFP (SI Appendix, Fig. S5). We also screened 37 GFP− clones by PCR and, surprisingly, identified 10 that contained inserts at the Lamin A/C locus. We sequenced 7 of the 10 inserts and identified three with a full-size GFP insert with out-of-frame indels at one junction and four with smaller GFP inserts (SI Appendix, Fig. S5).

In total, we sequenced 13 imprecise GFP edits and found only one internal deletion and one insertion in the wrong orientation (SI Appendix, Fig. S5). All other imprecise edits were full-size or truncated GFP fragments inserted in the correct orientation. All had one precise junction on the nontruncated terminus of GFP. The other junction was imprecise and contained indels (SI Appendix, Fig. S5). These observations are consistent with an asymmetric repair process that uses mechanisms with different homology requirements to initiate and resolve repair.

Repair Is a Polarity-Sensitive Process.

In the SDSA model, initiation and resolution of repair proceed via distinct steps. First, the DSB is resected to yield 3′ overhangs on both sides of the DSB (Fig. 4A). The 3′ overhangs pair with the donor and are extended by DNA synthesis copying donor sequences (Fig. 4A). Bridging of the DSB is completed when the newly synthesized strands withdraw from the donor and anneal back at the locus (Fig. 4A). To determine whether initiation and resolution might have different homology requirements, we tested the editing efficiency of single-stranded donors (ssODNs) bearing only one homology arm. We designed ssODNs with a GFP11 insert and only one homology arm at either the 3′ or 5′ end of the ssODN (3′- or 5′-homology arm). The homology arm targeted sequences on the left or right side of the Cas9-induced DSB in Lamin A/C and RAB11A (Fig. 4B). At both loci, we found that editing efficiency was highest with ssODNs that had a 3′-homology arm that could anneal to a complementary 3′ end at the DSB (Fig. 4C). ssODNs of the opposite polarity yielded only background-level edits. These observations are consistent with a replicative repair process that requires pairing between a 3′-homology arm on the donor and sequences on at least one side of the DSB. Apparently, a different, less stringent mechanism can be used to bridge the donor to the other side. One possibility is that NHEJ was used to repair the gap on the side with no homology arm. Coupling of homologous and nonhomologous repair mechanisms has already been documented in mammalian cells (28).

Polarity of Single-Stranded Donors Affects Incorporation of Distal Edits.

We wondered whether the different requirements for homology on the 3′ and 5′ ends of single-stranded donors might also apply to donors that contain two homology arms at different distances from the DSB. Such homology arms are found in donors designed to insert an edit at a distance from the DSB. In these donors, one homology arm (proximal homology arm) matches sequences immediately next to the DSB and the other homology arm (recessed homology arm) matches sequences at a distance from the DSB on the distal side of the edit (Fig. 5A). We tested whether proximal and recessed homology arms function equivalently on the 5′ and 3′ ends of ssODNs using a series of 23 pairs of sense and antisense ssODNs with inserts ranging from 0 to 41 nucleotides from the DSB at four loci (Fig. 5B and SI Appendix, Table S1). (In all ssODNs, the sequence between the DSB and edit was partially recoded to promote edit incorporation as described in the next section.) Strikingly, we observed an increasing bias for a particular polarity with increasing edit-to-DSB distance (Fig. 5B). The favored ssODN polarity changed depending on whether the edit (and recessed homology arm) was positioned to the left or right of the DSB (sense polarity when the edit is on the left side of the DSB, and antisense when the edit is on the right side). ssODNs with inserts close to the DSB did not show much polarity bias (Fig. 5B). These findings demonstrate that repair favors ssODNs with a 3′-homology arm that directly abuts the DSB (proximal homology arm) and suggest that initiation of repair synthesis is enhanced by donors that can pair with sequences directly flanking the DSB. These experiments also showed that, in contrast to ssODN polarity, the polarity of the guide RNA used to create the DSB had no discernible effect on editing efficiency (Fig. 5B). We conclude that, under the conditions used here, the requirements for replicative repair have a greater impact on editing efficiency than the strand bias imposed by asymmetric Cas9 release of the DSB (5).
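The polarity rule described above can be written as a short Python sketch. It assumes the donor is first written in the sense orientation (5′ to 3′, left to right along the locus); the function names are hypothetical and are not taken from this study.

```python
# Illustrative sketch only: encodes the polarity preference reported in Fig. 5B.
# Function names are hypothetical, not part of any published tool.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def favored_polarity(edit_side: str) -> str:
    """Return the ssODN polarity that places the proximal homology arm at the 3' end.

    edit_side is 'left' or 'right' of the DSB, as read on the sense strand.
    """
    if edit_side == "left":
        return "sense"
    if edit_side == "right":
        return "antisense"
    raise ValueError("edit_side must be 'left' or 'right'")


def orient_ssodn(sense_donor: str, edit_side: str) -> str:
    """Return the donor in the favored polarity, given its sense-strand sequence."""
    if favored_polarity(edit_side) == "sense":
        return sense_donor
    return reverse_complement(sense_donor)
```

For edits placed at or very near the DSB, the text reports little polarity bias, so either orientation may serve in that case.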

Recoding of Sequences Between the DSB and the Edit Increases Recovery of Distal Edits.

Editing efficiency has been observed to decrease with increasing distance between the edit and the DSB (6). This observation is also consistent with replicative repair, which predicts that synthesis that generates sequence complementary to the other side of the DSB will promote annealing back to the locus, potentially even before the edit is copied (Fig. 6). To test this prediction directly, we designed an ssODN donor with two inserts: a proximal insert (restriction enzyme site) 1 base away from the DSB in the PYM1 locus and a distal insert (3×Flag) 23 bases away from the DSB. Each insert was flanked by a homology arm targeting the PYM1 locus (Fig. 6A). We generated 63 single-cell clones and genotyped the PYM1 locus by PCR (Materials and Methods). A total of 46% of the clones contained only the proximal edit and 12.6% contained both the proximal and distal edits (Fig. 6B). The finding that ~80% of the edits contained only the proximal edit is consistent with annealing using sequence between the two edits. To test this hypothesis, we mutated 7 bases in the 23-base region separating the proximal and distal edit. The mutations were designed to reduce homology with the locus while preserving coding potential (Fig. 6A). This partial recoding reduced the frequency of proximal edit-only clones to 10.3% and increased the frequency of proximal + distal edits to 25.8% (Fig. 6B). We conclude that sequences on the donor that span the DSB can prevent incorporation of distal edits. We note that, although recoding enhances the recovery of distal edits, recoding does not eliminate the preference for proximal edits, which are still recovered at higher frequency than distal edits even when using recoded templates (SI Appendix, Fig. S6).
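As an illustration of partial recoding, the following Python sketch swaps each codon for a synonymous codon that differs from the original at as many positions as possible, so that the protein sequence is preserved while homology with the locus is reduced. This is an assumption-laden example, not the procedure used to design the donors in this study: it assumes an in-frame, uppercase coding sequence and the standard genetic code, ignores codon usage and splice signals, and uses hypothetical function names.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate an in-frame, uppercase DNA sequence."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def recode(seq: str) -> str:
    """Replace each codon with the synonymous codon that differs most from it."""
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        # Codons with no synonym (ATG, TGG) are returned unchanged.
        recoded.append(max(synonyms, key=lambda c: mismatches(c, codon)))
    return "".join(recoded)
```

In practice only the segment between the DSB and the distal edit would be passed to such a function, and the number of changes could be capped; the donors described above carried 7 mutations in a 23-base region.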

To test whether internal homologies can also participate in the repair process when using double-stranded donors, we performed a similar experiment with a PCR fragment designed to incorporate GFP11 at the DSB, and tagRFP 33 bases from the DSB in the Lamin A/C locus (Fig. 6C). We recovered 10.8% GFP-only edits and 8.6% GFP-RFP double positives (Fig. 6D). Partial recoding of the sequence between GFP11 and tagRFP (by introducing 10 silent mutations) reduced the percent of GFP-only edits to 4.4% and raised the percent of GFP-RFP double positives to 17.6% (Fig. 6D). We conclude that internal homologies on double-stranded templates can also interact with the targeted locus. Since both polarities are present in double-stranded templates, internal sequences could participate in principle in both the initial invasion step and the annealing step back to the locus.
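The proportions discussed in the two experiments above follow from simple arithmetic on the reported frequencies. The short Python sketch below recomputes the share of edits that carry only the DSB-proximal insert, using the percentages quoted in the text; the function name and dictionary labels are illustrative.

```python
def proximal_only_fraction(proximal_only_pct: float, both_pct: float) -> float:
    """Share of edits that carry only the DSB-proximal insert."""
    return proximal_only_pct / (proximal_only_pct + both_pct)


# (proximal-only %, proximal + distal %) as quoted in the text for Fig. 6 B and D.
reported = {
    "ssODN, unrecoded": (46.0, 12.6),
    "ssODN, recoded": (10.3, 25.8),
    "PCR fragment, unrecoded": (10.8, 8.6),
    "PCR fragment, recoded": (4.4, 17.6),
}

for label, (proximal_only, both) in reported.items():
    share = proximal_only_fraction(proximal_only, both)
    print(f"{label}: {share:.1%} of edits are proximal-only")
```

For the unrecoded ssODN this gives about 78.5%, which is the "~80%" figure cited above; recoding lowers it to about 28.5%.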

Repair Is Prone to Template Switching Between Donors.

Another characteristic of SDSA first observed in yeast is the ability of the repair process to undergo sequential rounds of invasion and synthesis (29, 30). “Template switching” can create edits that combine sequences from overlapping donors (14). To test whether template switching also occurs in human cells, we used two donors to correct a single DSB. The first donor was an ssODN with two homology arms and a GFP11-coding insert containing a stop codon to prevent translation of the full-length fusion (Fig. 7A). The second donor was an ssODN with the same GFP11 insert but without the stop codon and without any homology arm. Consistent with template switching, we obtained 3.2% GFP+ edits when using both donors, compared with 0.3% and 0.4% GFP+ edits when using only the first or second ssODN, respectively (Fig. 7B). We repeated this experiment with double-stranded donors and obtained similar results (Fig. 7 C and D). We conclude that template switching between donors can occur in human cells (SI Appendix, Fig. S8).

To visualize template switching more directly, we combined wild-type donors with recoded donors where the GFP11 insert contained several silent mutations and used Illumina sequencing to sequence the insertional edits en masse (Fig. 7E). Using recoded donors with silent mutations every 12 bases in the GFP11 insert, we identified evidence of template switching in 1.4% of edits (“chimeric edits,” Materials and Methods). Interestingly, the same experiment performed with donors that contained silent mutations every six or every three nucleotides resulted in only 0.5% and 0% chimeric edits, respectively (Fig. 7F and SI Appendix, Fig. S7 and Table S6). The chimeric edits could not have resulted from sequential rounds of Cas9 cleavage and repair, since the edit destroyed the crRNA pairing sequence. The chimeric edits also could not have arisen during PCR amplification, since we observed no chimeric edits in a control experiment mixing two different cell populations (SI Appendix, Fig. S7). We conclude that template switching occurs between donors in human cells and is sensitive to the degree of homology between donors (SI Appendix, Fig. S8), as reported previously in yeast (30, 31).
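One way to picture how chimeric edits can be recognized in sequencing reads is sketched below in Python. The sketch compares a sequenced insert with the wild-type and recoded donor inserts at the positions where the two donors differ and reports a chimera when both donors contribute. It is a simplified illustration and not the analysis pipeline used in this study (which is described in Materials and Methods); it assumes error-free, pre-aligned inserts of equal length, and all names are hypothetical.

```python
def classify_edit(read: str, wild_type: str, recoded: str) -> str:
    """Classify an insert by which donor it matches at each diagnostic position."""
    if not (len(read) == len(wild_type) == len(recoded)):
        raise ValueError("sequences must be pre-aligned and of equal length")
    calls = set()
    for r, w, c in zip(read, wild_type, recoded):
        if w == c:
            continue  # position does not distinguish the two donors
        if r == w:
            calls.add("wild-type")
        elif r == c:
            calls.add("recoded")
        else:
            return "ambiguous"  # matches neither donor (e.g., sequencing error)
    if not calls:
        return "uninformative"
    if len(calls) == 2:
        return "chimeric"
    return calls.pop()
```

With toy sequences, a read that carries the recoded base at one diagnostic position and the wild-type base at another is reported as chimeric. Note that denser recoding provides more diagnostic positions, so the lower chimera frequencies reported above for donors mutated every six or three nucleotides cannot be attributed to reduced detection power.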

Discussion

In this report, we demonstrate that PCR fragments are efficient donors for genome editing in mouse embryos and human cells. PCR fragments with short homology arms (~35 bp) can be used to integrate edits up to 1 kb, long enough to encode fluorescent reporters such as GFP. Experiments using single- and double-stranded DNAs suggest that linear donors participate in a replicative repair mechanism that broadly conforms to the SDSA model for gene conversion. Our findings suggest simple guidelines to streamline donor design and maximize editing efficiency (Fig. 8).

Fig. 8.

Guidelines for donor design. (A) Schematic showing a typical editing experiment using a PCR fragment (thick line) with two homology arms (blue) to introduce an edit (green) at a distance from the DSB (stippled line). (B) Recommendations based on results presented in this study. We refer readers to refs. 5 and 23 for additional recommendations for ssODNs designed to insert edits at the DSB.
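The basic donor layout in these guidelines (an insert flanked by two ~35-bp homology arms that abut the DSB) can be sketched in Python as follows. The sketch assumes that the homology arms are appended to the insert through the 5′ tails of the PCR primers, and that the cut position is given as an index into a sense-strand locus sequence; the function names, the 20-nt annealing length, and the toy sequences are illustrative assumptions and are not taken from this study.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_pcr_donor(locus: str, cut: int, insert: str, arm: int = 35, anneal: int = 20):
    """Return (donor, forward_primer, reverse_primer) for an insertion at the DSB.

    locus  : sense-strand sequence around the target site
    cut    : index such that the DSB lies between locus[cut - 1] and locus[cut]
    insert : sequence to be inserted, sense orientation
    arm    : homology arm length (about 35 bp in this study)
    anneal : hypothetical length of the primer segment that anneals to the insert
    """
    if cut < arm or cut + arm > len(locus):
        raise ValueError("locus sequence is too short for the requested arms")
    if len(insert) < anneal:
        raise ValueError("insert is shorter than the annealing segment")
    left_arm = locus[cut - arm:cut]
    right_arm = locus[cut:cut + arm]
    donor = left_arm + insert + right_arm
    forward_primer = left_arm + insert[:anneal]
    reverse_primer = reverse_complement(insert[-anneal:] + right_arm)
    return donor, forward_primer, reverse_primer


# Toy example with artificial sequences.
donor, fwd, rev = design_pcr_donor("A" * 40 + "C" * 40, cut=40, insert="G" * 30)
```

For edits placed at a distance from the DSB, the sequence between the DSB and the edit would additionally be recoded, as described in Results.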

Linear DNAs Repair Cas9-Induced DSBs by Templating Repair Synthesis.

In principle, linear donors could repair Cas9-induced breaks by integrating directly at the DSB. For example, MMEJ could cause donor ends to become ligated to each side of the DSB (8). Alternatively, homology arms on the donor could form Holliday junctions with sequences on each side of the DSB. Crossover resolution of the two Holliday junctions could cause donor sequences to become integrated at the DSB. This type of HDR has been proposed to underlie genome editing with plasmid and viral donors (17). In these models, repair is symmetric: the same mechanism (MMEJ or recombination) is used to ligate donor sequences to each side of the break. In contrast, our observations suggest that repair with linear donors proceeds by an asymmetric, likely replicative, process. First, ssODNs with only one homology arm show strong polarity specificity (Fig. 4C), consistent with a specific requirement for pairing with 3′ ends at the DSB (Fig. 4A). Second, recessed homology arms (homology arm at a distance from the DSB) are rarely used to initiate repair synthesis, but can be used to resolve a repair event (Figs. 5 and 6). Third, internal homologies on the donor can bypass integration of distal edits (Fig. 6). Fourth, most imprecise edits have asymmetric junctional signatures (SI Appendix, Fig. S5). These observations suggest that the repair process is polar, like DNA synthesis, and has different requirements to initiate and resolve repair. These findings are consistent with the SDSA model for gene conversion (15) (Fig. 4A). SDSA initiates with DNA synthesis templated by the donor to extend 3′ ends at the DSB and resolves by annealing of the newly replicated strand(s) back to the locus. Our observations suggest that initiation of DNA synthesis is the most homology-stringent step, requiring a ~35-base homology arm on the donor complementary to sequences directly adjacent to one side of the DSB. 
Either side of the DSB can initiate repair and, contrary to an earlier report (5), we did not observe a preference consistent with biased strand release by Cas9. The observations that homology arms longer than 35 bases do not perform significantly better, and that distal homology arms perform more poorly, also suggest that resection exposes only short regions of ssDNA on either side of the DSB. In contrast to the initiation step, the resolution step has more relaxed homology requirements. Recessed homology arms can be used for that step, and in fact repair can proceed with no homology arm on the “annealing side” (Fig. 4C). In that case, NHEJ (or MMEJ) may be used to fuse the newly replicated strand to the other side of the DSB. One possibility is that NHEJ or MMEJ competes with annealing during resolution, especially in the case of long edits where synthesis has a higher chance of stalling before reaching the distal homology arm or before synthesis of a complementary strand primed from the other side of the DSB (Fig. 4A). Consistent with this view, we recovered several partial GFP insertions that were integrated in the correct orientation but contained one imprecise junction on the truncated side of GFP, consistent with premature withdrawal from the donor. We cannot exclude the possibility, however, that in these partial edits, the nonhomologous joint was made first using a broken donor.

If partial edits are due to premature withdrawal of the newly replicated strand from the donor, partial edits should be less frequent when using donors with shorter inserts. Consistent with this prediction, we found that editing efficiency decreased with increasing insert size. At the Lamin A/C locus, we obtained 45.4% edits for a 57-bp insert, 23.5% edits for a 714-bp insert (GFP), and 17.9% edits for a 993-bp insert. The size of the insert, and not the overall size of the donor, correlated with efficiency, arguing against the possibility that breakage of longer donors contributes to reduced efficiency (Fig. 3). We suggest that the low processivity of repair polymerases (32) increases the chances of aberrant dissociation/annealing events on long inserts.

We also obtained evidence for dissociation and invasion events between donors. Such template switching was also observed in yeast and C. elegans and can cause sequences from overlapping donors to become incorporated in the same edit (14, 30, 31). We found that template switching is sensitive to the degree of homology between donors and is reduced significantly by mutations every three or six bases, as was also found in yeast (30, 31). Similarly, recoding of sequences between the DSB and the edit promotes the incorporation of distal edits, presumably by increasing the rejection rate of heteroduplexes formed during annealing between the newly replicated strand and sequences flanking the DSB (33). Template switching may also explain why editing efficiency is sensitive to donor molarity, since high donor molarity is predicted to lower the frequency of aberrant dissociation/reannealing events during synthesis. It will be interesting to determine which repair polymerases are responsible for synthesis templated by linear donors and whether their processivity characteristics account for our observations of template switching. In this regard, it is interesting to note that we identified a higher frequency of full-length edits (and lower frequency of partial edits) in mice compared with HEK293T cells. This difference could reflect differences in the properties of the enzymes that mediate SDSA in the two systems. Alternatively, the higher precision in mice could be due to a more efficient method for delivering donors at high molarity (pronuclear injection in mouse zygotes versus nucleofection in HEK293T cells).

SDSA as a Repair Mechanism for Cas9-Induced DSBs: Implications for Genome Editing.

The demonstration that ssODNs and PCR fragments engage in an SDSA-like mechanism to repair Cas9-induced DSBs has two important implications for genome editing. First, the SDSA model makes simple predictions for optimal donor design (Fig. 8). These predictions improve editing efficiencies for edits at a distance from the DSB and eliminate the effort and expense used in creating donor DNAs with unnecessarily long homology arms. Linear donors with short homology arms can be chemically synthesized as single-stranded or double-stranded DNA or PCR amplified, avoiding the need for cloning. In this manner, tagging of genes with GFP can be achieved readily, without resorting to split-GFP approaches that also require expression of a complementary GFP1–10 fragment (24). Second, because SDSA is thought to be a widespread mechanism for DSB repair among eukaryotes (34), it is likely that the approaches outlined here will be applicable to other cell types and organisms. We documented previously that PCR fragments with short homology arms perform well in C. elegans (14), and we demonstrate here the same for HEK293T cells and mouse embryos. It will be interesting to investigate whether linear donors with short homology arms can also be used for genome editing in pluripotent cells and postmitotic cells.

Acknowledgments

We thank the Johns Hopkins University (JHU) Genetic Resources Core Facility’s Sequencing Facility, the JHU Transgenic Facility, and the JHU Ross Flow Cytometry Core Facility for expert support; Dr. Jonathan Weissman for the gift of HEK293T GFP1–10 cells; Andrew Holland and Tyler Moyer for tissue culture help; and Boris Zinshteyn for assistance with Illumina sequencing and data analysis. This work was supported by NIH Grants R01HD37047 (to G.S.), R01DC004553 (to R.R.R.), and F32GM117814 (to A.F.). G.S. and R.G. are investigators of the Howard Hughes Medical Institute. D.H.G. is a Damon Runyon Fellow supported by the Damon Runyon Cancer Research Foundation (DRG-2280‐16). A.P. dedicates this work to Marcel Bodelet.

Footnotes

  • 1To whom correspondence may be addressed. Email: apaix1{at}jhmi.edu or gseydoux{at}jhmi.edu.
  • Author contributions: A.P. and G.S. designed research; A.P., A.F., D.H.G., H.K., M.J.G., D.R., and S.P. performed research; A.P., A.F., D.H.G., and G.S. analyzed data; and A.P., A.F., D.H.G., R.G., R.R.R., and G.S. wrote the paper.

  • Reviewers: D.C., University of Utah; and J.E.H., Brandeis University.

  • The authors declare no conflict of interest.

  • This article contains supporting information online at www.danielhellerman.com/lookup/suppl/doi:10.1073/pnas.1711979114/-/DCSupplemental.
