Research article | Open | Published:
An efficient system for homology-dependent targeted gene integration in medaka (Oryzias latipes)
Zoological Lettersvolume 3, Article number: 10 (2017)
The Correction to this article has been published in Zoological Letters 2019 5:22
The CRISPR/Cas system is a powerful genome editing tool that enables targeted genome modifications in various organisms. In medaka (Oryzias latipes), targeted mutagenesis with small insertions and deletions using this system have become a robust technique and are now widely used. However, to date there have been only a small number of reports on targeted gene integration using this system. We thus sought in the present study to identify factors that enhance the efficiency of targeted gene integration events in medaka.
We show that longer homology arms (ca. 500 bp) and linearization of circular donor plasmids by cleavage with bait sequences enhances the efficiency of targeted integration of plasmids in embryos. A new bait sequence, BaitD, which we designed and selected by in silico screening, achieved the highest efficiency of the targeted gene integration in vivo. Using this system, donor plasmids integrated precisely at target sites and were efficiently transmitted to progeny. We also report that the genotype of F2 siblings, obtained by mating of individuals harboring two different colors of fluorescent protein genes (e.g. GFP and RFP) in the same locus, can be easily and rapidly determined non-invasively by visual observations alone.
We report that the efficiency of targeted gene integration can be enhanced by using donor vectors with longer homologous arms and linearization using a highly active bait system in medaka. These findings may contribute to the establishment of more efficient systems for targeted gene integration in medaka and other fish species.
Medaka (Oryzias latipes) is a small freshwater teleost species, which serves as an excellent vertebrate model for genetics due to it ease of breeding and unique genetic resources, such as spontaneous mutant collections, highly polymorphic inbred strains, and related species that exhibit a number of unique features [1, 2]. In medaka, a large number of transgenic strains is available for molecular genetic analysis [2, 3]. However, the development of techniques for targeted manipulation of endogenous genes, such as genetic tagging by reporter genes and site-specific introduction of single nucleotide polymorphisms, has been hindered by lack of established protocols for gene targeting in embryonic stem cells in this species.
Genome editing using targetable nucleases, including systems utilizing clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas), transcription activator-like effector nucleases (TALENs), and zinc-finger nucleases (ZFNs), have been established as a powerful methods for reverse genetics in a wide range of organisms . These nucleases can induce DNA double-strand-breaks (DSBs) at any genomic target locus, which allows for various types of targeted genome modifications via DNA DSB repair systems, such as targeted gene disruptions by small insertions and deletions (indels) via non-homologous end-joining (NHEJ) and targeted gene integration by homology-directed repair (HDR) . We previously reported targeted gene disruptions mediated by NHEJ, thus establishing efficient methods for targeted mutagenesis using targetable nuclease systems in medaka [6,7,8]. However, although targeted gene integration mediated by HDR is desirable for more precise and complex genome manipulations, there have been only a few reports on the HDR-mediated gene integration in medaka . More detailed knowledge is thus needed to establish efficient protocols for this technology.
The length of homologous sequences is known to play an important role in determining the pathways used for the repair of DNA DSBs . Relatively long homology arms (0.5–1 kb) can induce homologous recombination (HR) and have been commonly used for targeted gene integration by genome editing . Short homology arms (2–25 bp) in contrast can induce microhomology-mediated end joining (MMEJ), recently identified as a DSB repair pathway in a highly efficient gene knockin method for genome editing [11, 12]. Recent studies have reported that both of these pathways can mediate targeted gene integration in zebrafish embryos [13,14,15]. However, no studies have directly compared the integration efficiencies mediated by these two pathways in fish embryos, and thus the effects of length of homology arms on integration efficiency have remained unclear.
Previous studies have also demonstrated that simultaneous cleavage of a circular donor plasmid and the targeted genomic locus by targetable nucleases can enhance HDR-mediated gene integration efficiency in zebrafish and sea urchin embryos [13, 16]. For efficient methods of the targeted gene integration by CRISPR/Cas system, guide RNAs (gRNAs) and their targets, known as “bait” sequences, have been designed for the cleavage of circular donor plasmids. Gbait is a bait sequence designed on the coding sequence of EGFP gene, and has been used in several zebrafish studies due to its high genome-editing activity and lower frequency of off-target effects. In addition, PITCh gRNAs were designed for targeted integration by the PITCh system, and has also minimized off-target effects in various mammalian genomes . However, there has been no comparative studies on the effect of these sequences on the efficiency of targeted gene integration in vivo.
In the present study, we sought to establish an efficient system for targeted gene integration in medaka. First, we examined the effects of the length of homology arms on integration efficiencies at a genomic locus in medaka embryos. Next, we developed novel bait sequences with fewer off-target effects in fish genomes and validated their effect on integration efficiency in comparison with the previously reported bait sequences, Gbait and PITCh gRNAs. Lastly, we demonstrated that gene knockin strains harboring different fluorescent protein genes at the target locus can be generated by a method established in this study and that these strains may be helpful in maintaining mutant strains without PCR genotyping.
This study was conducted in compliance with the Regulation for Animal Experiments in Kyoto University. Fish handling and sampling methods were approved by Kyoto University (No.26–71). All efforts were made to minimize suffering.
A cab (closed colony) of medaka was used in this study. The fish were kept under a 14/10-h light/dark cycle at 26 °C.
Design of bait sequences
Candidate bait sequences that are disrupted by corresponding single guide RNAs (sgRNAs) were designed following published data sets from high-throughput screening of sgRNA activity in mammalian cells . The top 40 sgRNAs with the highest gene disrupting activity (20 from the “non ribo efficient sgRNA” data set and 20 from the “mESC efficient sgRNA” data set) were nominated as candidates. These candidates were screened by following two criteria using an offline version of Cas-OFFinder  with a genome database of 12 teleost fish species (Additional file 1: Table S1). For each candidate, we calculated the total number of potential off-target sequences with ~3 bp mismatches in 18 bp of target sequence and PAM (NGG or NAG) sequence (in total 21 bp) in the 12 fish genome database. From this group, the seven candidates with the lowest numbers of total potential off-target sequences were selected. We screened the selected candidates and the previously reported bait sequences (Gbait  and PITCh-gRNAs ) following the second criterion: genomic sequences with ~2 bp mismatches in the 18-bp target sequence for each sgRNA [8, 20]. After calculating the total number of off-target sequences in the12 fish genome database, the eight bait sequences with the lowest number were selected for use in this study.
Construction of donor plasmids
Backbone fragment 1
Backbone fragments containing a pUC replication origin and an ampicillin resistance gene were amplified from a plasmid pPBIS19-mgfc:TagBFP-8xHSE:Cre  by PCR using a primer pair (pUCoriFW-SpeI and pUCoriRV-XhoI) (Additional file 2: Table S4) and digested with XhoI and SpeI.
Backbone fragment 2
Backbone fragments containing each bait sequence, a pUC replication origin, and an ampicillin resistance gene were amplified from a plasmid pPBIS19-mgfc:TagBFP-8xHSE:Cre  by PCR using a primer pair (SpeI-Bait-FW and XhoI-Bait-RV) with each bait sequence and a restriction site (XhoI or SpeI) (Additional file 2: Table S4) and then digested with XhoI and SpeI.
GFP cassette: To avoid EGFP gene disruption induced by Gbait, we used monomeric Azami-Green (mAG) gene as a reporter gene expressing green fluorescence protein (GFP). A GFP reporter cassette, a mAG gene with a N-terminal linker and a SV40 polyA signal, was amplified from a plasmid phmAG1-MNLinker (Medical & Biological Laboratories, Aichi, Japan) by PCR using primers mAG-linker-BamHI and polyA-RV-EcoRI (Additional file 2: Table S4), and then digested with BamHI and EcoRI.
pBaitX-acta1_500 bp-mAG (Fig. 2a): To generate donor plasmids with ~500 bp of sequences homologous to acta1 locus (pBaitX-acta1_500 bp-mAG), upstream (472 bp) or downstream (473 bp) genomic region at the target site of sgRNA-acta1 was PCR-amplified using primer pairs with restriction sites, acta1-5FW-XhoI/acta1-5RV-BamHI or acta1-3FW-EcoRI/acta1-3RV-SpeI (Additional file 2: Table S4) and then digested with XhoI/BamHI or EcoRI/SpeI, respectively. These two fragments were simultaneously ligated with the “Backbone fragment 2” and the “GFP cassette”.
pNoBait-acta1_500 bp-mAG (Fig. 1a): The “Backbone fragment 1” and the “GFP cassette” are ligated to generate a donor plasmid without bait sequences.
pGbait-acta1_20 bp-mAG (Fig. 1a): Single-stranded oligo DNAs of acta1-exon3-5S1 and acta1-exon3-5AS1 or acta1-exon3-3S1 and acta1-exon3-3AS1 (Additional file 2: Table S4) were annealed to generate double-stranded oligo DNAs containing upstream or downstream short homology arms (20 bp). These double-stranded oligo DNAs were ligated with the “GFP cassette” and one of the “Backbone fragment 2,” which contains the Gbait to generate the pGbait-acta1-short-mAG donor plasmid.
pGbait-acta1_40 bp-mAG (Fig. 1a): To generate a donor plasmid containing short homology arms (40 bp), a fragment containing “GFP cassette” and homology arms (40 bp) was amplified from the above plasmid, pGbait-acta1_500 bp-mAG, using primer pairs XhoI-acta1-Fw and acta1-SpeI-Rv. The fragment was digested with XhoI and SpeI, and ligated with one of the “Backbone fragment 2,” which contains the Gbait to generate the pGbait-acta1_40 bp-mAG donor plasmid.
pBaitD-gap43_500 bp-mAG (Fig. 3a): To construct a donor plasmid for targeting gap43 locus (pBaitD-gap43_500 bp-mAG), an upstream (483 bp) or downstream (496 bp) genomic region at the target site of sgRNA-gap43 was amplified by PCR with primer pairs XhoI-GAP43-FW/GAP43-BamHI-RV or EcoRI-GAP43-FW/GAP43-SpeI-RV (Additional file 2: Table S4) and digested with XhoI/BamHI or EcoRI/SpeI, respectively. These fragments and the BamHI/EcoRI-digested GFP cassette were cloned into one of the “Backbone fragment 2” containing BaitD sequences.
pBaitD-gap43_500 bp-tdTomato (Fig. 3a): The coding sequence of the tdTomato gene was amplified from ptdTomato Plasmid (Clontech, California, USA) by PCR using primer pairs BamHI-tdTomato-Fw/tdTomato-XbaI-Rv (Additional file 2: Table S4). The donor plasmid, pBaitD-gap43_500 bp-tdTomato, was generated following the method also used for generating pBaitD-gap43_500 bp-mAG using the tdTomato coding sequence instead of the “GFP cassette”.
All constructed plasmids were purified by an alkaline lysis method or NucleoSpin Plasmid QuickPure kit (MACHEREY-NAGEL, Düren, Germany).
Preparation of donor plasmids, Cas9 RNA and sgRNAs for microinjection
To eliminate residual RNase activity of extracted plasmids, donor plasmids dissolved with 50 μL of 5 mM Tris-HCl buffer (pH 8.5) were incubated with 5 μL of 10% sodium dodecyl sulfate (SDS) and 2 μL of Proteinase K (20 mg/mL) at 55 °C for 30 min, and then purified using NucleoSpin Gel and PCR Clean-up kit (MACHEREY-NAGEL) with the Buffer NTB supplied by the manufacturer.
Cas9 RNA was transcribed from pCS2 + hSpCas9 (Addgene Plasmid 51,815)  using the mMessage mMachine SP6 Kit (Thermo Fisher Scientific, Waltham, MA). Custom-designed sgRNAs for the genomic sequence of medaka were designed using a track for the UCSC genome browser of CRISPRscan . Expression plasmids for the custom-designed sgRNAs were constructed by cloning the annealed oligonucleotides into a sgRNA expression plasmid pDR274 (Addgene Plasmid 42,250) , as described previously . The sgRNAs were transcribed from the DraI-digested template plasmids using the Ampliscribe T7-Flash Transcription Kit (Epicentre, WI). All synthesized RNAs were purified using the RNeasy Plus Mini Kit (Qiagen, Hilden, Germany) to eliminate the template DNA without DNase treatment. Sequences of the genomic target sites and annealed oligonucleotides are listed in Additional file 2: Table S4.
To evaluate the DSB inducing activity of each sgRNA at the genomic target site, an injection mixture containing 100 ng/μL of Cas9 RNA and 50 ng/μL of each sgRNA was prepared. To evaluate the efficiency of targeted gene integration, injection mixtures containing 2.5 ng/μL of each donor plasmid, 100 ng/μL of Cas9 RNA, and 50 ng/μL of sgRNA corresponding to the donor plasmid were prepared. These injection mixtures were introduced into medaka eggs at the one-cell stage, as described previously .
Genomic DNA extraction
Embryos were lysed individually at 4 days post fertilization (dpf) in 25 μL of alkaline lysis buffer containing 25 mM NaOH and 0.2 mM EDTA and incubated at 95 °C for 15 min after breaking the egg envelope with forceps. Each sample was neutralized with 25 μL of 40 mM Tris-HCl (pH 8.0) and used as a genomic DNA sample.
Heteroduplex mobility assay
Heteroduplex mobility assay (HMA) was performed to investigate the DSB inducing activity of each sgRNA on the genomic target site, as described previously . Briefly, a genomic region containing the target sequence of sgRNAs-acta1 or sgRNA-gap43 was amplified by PCR using KOD-FX DNA polymerase (Toyobo, Osaka, Japan) with primers acta1-HMA-FW and acta1-HMA-RV or GAP43-for-HMA-FW and GAP43-for-HMA-RV (Additional file 2: Table S4), respectively. The resulting amplicons were analyzed using a microchip electrophoresis system (MCE-202 MultiNA; Shimadzu, Kyoto, Japan) with DNA-500 reagent kit.
Embryos and larvae injected with the donor plasmids were observed using a fluorescence stereomicroscope MZFLIII (Leica Microsystems, Wetzlar, Germany) with a GFP2 filter set (for GFP) and a DsRed filter set (for RFP). Microscopic images were captured using a digital color-cooled charge-coupled camera and the VB-7010 image control system (Keyence, Osaka, Japan).
To evaluate the precise targeted integration, the DNA sequence around the integration site was investigated. The junction regions of the target site on the host genome and introduced gene were amplified by PCR using the following primer pairs: acta1-for-Seq-Fw and mAG-Rv, mAG-Fw and acta1-for-Seq-Rv, or GAP43-for-Seq-FW and GAP43-for Seq-RV (Additional file 2: Table S4), and KOD -plus- Neo DNA polymerase (Toyobo). The PCR conditions were as follows: one cycle at 94 °C for 2 min, followed by 35 cycles of 98 °C for 10 s, 58 °C for 30 s, and 68 °C for 1 min. The resulting PCR products were subjected to electrophoresis with a 1% agarose gel. PCR fragments predicted to contain the introduced gene were excised from the gel and purified using NucleoSpin Gel and PCR Clean-up (MACHEREY-NAGEL). The purified fragments were sequenced using the primers acta1-for-Seq-Fw and acta1-for-Seq-Rv (for acta1) or GAP43-for-Seq-FW and GAP43-for-Seq-RV (for gap43) (Additional file 2: Table S4).
Selection of sgRNA targeting to the skeletal muscle-specific actin gene in medaka genome
Selection of a genomic target locus expressed widely in the early-stage embryo and an sgRNA possessing high genome-editing activity at that locus is necessary for the accurate and rapid evaluation of the gene knockin efficiency of each donor plasmid. A skeletal muscle-specific actin gene (acta1; Ensembl gene number ENSORLG00000010881) was selected as the target locus, as detection of site-specific integration by observing the green fluorescence in the skeletal muscles is simple [25, 26].
To obtain an sgRNA targeting the acta1 gene with high DSB-inducing activity, we designed two sgRNAs targeting the third exon of the gene without potential off-target sites in the medaka genome (Additional file 3: Figure S1). Each sgRNA was injected with a Cas9 RNA into fertilized medaka eggs and its genome-editing activity was evaluated by HMA. More multiple banding patterns were observed in embryos with sgRNA-acta1 #1 than with sgRNA-acta1 #2, suggesting that sgRNA-acta1 #1 has higher DSB-inducing activity than sgRNA-acta1 #2 (Additional file 3: Figure S1). Thus, we used sgRNA-acta1 #1 for targeted genome cleavage in the following experiments.
Effect of length of homologous sequences and existence of the bait sequence on targeted gene integration events
To assess the effects of lengths of homologous sequences that are located on both ends of the inserted gene fragment in the donor plasmid, the targeted gene integrating efficiency into the third exon of acta1 gene was evaluated using three donor plasmids containing the Gbait sequences : pGbait-acta1_20 bp-mAG, which possesses short homologous sequences (20 bp each) on both ends of the insert gene fragment (mAG-pA); pGbait-acta1_40 bp-mAG, which possesses short homologous sequences (40 bp each) on both ends of the insert gene; and pGbait-acta1_500 bp-mAG, which possesses longer homologous sequences (472 and 473 bp) on both ends of the insert gene (Fig. 1a). When an integration event occurred precisely into the target site, green fluorescence was observed in the skeletal muscle. As shown in Table 1 and Fig. 1b, 17 of 70 (24.3%) injected embryos expressed green fluorescence in the skeletal muscle following injection of the long homologous plasmid. In contrast, only 2 of 110 (1.9%) and 3 of 64 (4.7%) embryos expressed green fluorescence in skeletal muscle following injection of short homologous plasmids (20_bp and 40_bp, respectively) (Table 1). This suggests that donor plasmids with longer homologous sequences (ca. 500_bp) are more efficiently integrated at the target site of this locus.
Previous studies have reported that the induction of DSBs on sgRNA target sequences next to homologous sequences in circular donor plasmids using the CRISPR/Cas9 system can enhance targeted integration events [13, 16]. We also evaluated the efficiency of targeted gene integration using donor plasmids containing the long homologous sequences for the acta1 gene with (pGbait-acta1_500 bp-mAG) or without (pNoBait-acta1_500_bp-mAG) the Gbait sequence containing long homologous sequences of acta1. Without the bait sequence, the integration efficiency was reduced to 9.7%, compared to 24.3% with the bait sequence Table 1. This highlights the importance of the bait sequence in efficient targeted gene integration events in medaka Table 1.
To investigate the precision of gene knockin events using the donor plasmid with bait sequences and long homologous sequences (pGbait-acta1_500 bp-mAG), we extracted genomic DNA from five G0 embryos with GFP fluorescence in skeletal muscle (Fig.1b). Each fragment containing an upstream or downstream junction between the genomic sequence and the insert gene was PCR-amplified from the genomic DNA (Additional file 4: Figure S2a) and was sequenced. In all five embryos examined, the donor plasmid was precisely integrated on the target site without any indels (Additional file 4: Figure S2b), indicating that the gene knockin events occurred via HR, as expected.
Design of bait sequences for targeted gene integration in teleost fish species
To improve the efficiency of targeted gene integration events in medaka and other teleost species, bait sequences with high DSB-inducing activity and without off-target effects must be designed. In this study, we selected 40 sgRNA target sequences with the highest DSB-inducing activity from a dataset from high-throughput screening of mammalian cells  as candidates for use as new bait sequences (Additional file 5: Table S2).
To screen bait sequences with fewer off-target effects in a wide range of teleost fish species, we investigated the number of potential off-target sites of 40 candidates in reference genome sequences of medaka and other 11 teleost species (Danio rerio, Gasterosteus aculeatus, Oreochromis niloticus, Takifugu rubripes, Salmo salar, Oncorhynchus mykiss, Nothobranchius furzeri, Astyanax mexicanus, Pagrus major, Thunnus orientalis, and Cynoglossus semilaevis) (Additional file 1: Table S1). The numbers of genomic sequences with up to 3-bp mismatches in a total of 21 bp of the candidates and a NRG PAM were counted (Additional file 5: Table S2) and the top seven sequences with the fewest potential off-target numbers (#3, #10, #11, #13, #26, #28, and #33) were selected as candidates. We subsequently examined in detail the numbers of mismatch base pairs in all potential off-target sites of these seven candidates and four previously designed bait sequences—Gbait  and PITCh gRNAs (the PITCh-gRNA#1-#3) . Ten candidates except the PITCh-gRNA#3 had some potential off-target sites with 2-bp mismatches in the 18-bp target sequence, but none of the 11 candidates had sites with 1 bp or less (Additional file 6: Table S3). We excluded three sequences (#13, #26, the PITCh-gRNA#2) with larger numbers of genomic sites with 2-bp mismatches. Consequently, we included five bait candidates (#3, #10, #11, #28, and #33; hereafter BaitA–E, respectively) and three previously reported sequences (Gbait, PITCh-gRNA#1 and PITCh-gRNA#3) in our evaluation in medaka embryos.
Comparison of targeted integration efficiencies among bait sequences in medaka
We compared the targeted gene integration activity of the five selected bait sequences with those of the previously designed bait sequences, Gbait  and PITCh gRNAs (the PITCh-gRNA#1 and PITCh-gRNA#3) , which have been reported to exhibit high targeted gene integration activity. A mixture of each donor plasmid with a type of bait sequences, a sgRNA corresponding to each bait sequence, Cas9 RNA, and sgRNA-acta1 was injected into one-cell stage medaka embryos. Expression of green fluorescence in skeletal muscle was observed at 4 days post fertilization (dpf). According to the area expressing green fluorescence, the larvae were categorized into the following three groups: “Strong,” in which green fluorescence was observed in >40% of the area of whole embryonic body; “Weak,” in which green fluorescence was observed in <40% of the area of whole embryonic body; and “No GFP,” in which no fluorescence was detected (Fig. 2b and c).
As shown in Fig. 2c, with all donor plasmids, larvae expressing green fluorescence were observed and all donor plasmids containing bait sequence showed a higher green fluorescence expressing ratio (13.5–55.2%) than those with no bait (without bait sequence) (9.6%). The ratio of larvae with green fluorescence varied and depended on the bait sequences. The highest rate (55.2%) of the larvae with green fluorescence (total larvae with green fluorescence) was observed with BaitD. The ratio of “Strong” area of GFP expression was also highest in BaitD larvae (Fig. 2c). These findings indicate that, of the bait systems tested, the BaitD system most efficiently induces targeted gene integration events. Unfortunately, all of the fish expressing green fluorescence in skeletal muscle died before or soon after hatching; however, the cause of death was unclear.
Germline transmission of a DNA fragment integrated at the target locus using the BaitD system
To investigate whether gene fragments integrated at the target loci using the BaitD system can be transmitted to progeny, we performed a gene knockin experiment targeting another gene, growth associated protein 43 (gap43; Ensembl gene number ENSORLG00000015837). The gap43 transcript is expressed in the central nervous system (CNS) from 4 dpf and contributes to the growth of neuroblasts in medaka . An sgRNA-gap43 with no potential off-target site in the medaka genome was designed on the second exon (Fig. 3a). After confirming the sgRNA possessed high DSB-inducing activity by HMA (Fig. 3b), a mixture of sgRNA-gap43, containing a donor plasmid with two BaitD sequences, a sgRNA for the BaitD sequence, and the Cas9 RNA, was injected into the one-cell stage medaka eggs. Green fluorescence was expressed strongly in the brains of 39 of 320 injected eggs at 4 dpf (Fig. 3c) (Table 2). This fluorescence pattern corresponded to the endogenous expression pattern of the gap43 gene as reported in a previous study . Twenty-eight of the GFP-expressing embryos were raised to adulthood and mated with wild-type fish. Of the 28 adult fish, two individuals transmitted the insert sequence to the next generation (34.6% and 26.8% germline transmission rate, respectively) (Tables 2 and 3). To investigate whether the knockin events occurred precisely at the target locus, we sequenced the DNA surrounding the inserted fragment after PCR amplification using genomic DNA extracted from five F1 embryos with GFP fluorescence in the CNS (Fig. 4a-c). In all five embryos examined, the insert fragment was precisely integrated on the target site without any indels (Fig. 4b), demonstrating that the BaitD system enables effective and precise targeted gene integration in germ cells in medaka.
PCR genotyping-free selection of double allelic gene edited fish using two different fluorescent colors
To establish a novel genotyping method using two different fluorescent colors, a donor plasmid (pBaitD-gap43_500 bp-tdTomato) containing RFP (tdTomato gene) was introduced into fertilized medaka eggs following the same method used for the GFP plasmid. RFP expression in the CNS was observed in 16 of 103 injected embryos and one individual transmitted the insert sequence to the next generation (Table 2).
To investigate whether the double allelic gene knockin fish could be selected simply by fluorescence without PCR genotyping, the F1 fish harboring the RFP gene in the gap43 locus was mated with a F1 individual harboring the GFP gene in the locus. The resultant F2 individuals were divided into the following four groups by color of fluorescence: green fluorescence (G+/R–; n = 12), red fluorescence (G−/R+; n = 11), both green and red fluorescence (G+/R+; n = 13), and no fluorescence (G−/R–; n = 11) (Fig. 5a). This distribution indicates that the inserted gene fragments were transmitted in a Mendelian manner. With the genomic DNA extracted from the embryo of each group, PCR analysis was carried out to discriminate among the two inserted gene fragments (GFP and RFP) and an intact allele of the gap43 gene. As shown in Fig. 5b, in the group “G+/R–”, a GFP allele and an intact allele were detected, indicating that one of the gap43 alleles was disrupted by the insertion of the GFP gene. Similarly, one of the gap43 alleles was disrupted by the insertion of the RFP gene in the group “G−/R+”. In the group “G+/R+”, both the GFP and RFP genes were detected in the gap43 locus, indicating that both alleles were disrupted by the inserted genes. These results indicate that, by mating fish harboring fluorescent reporter genes of different colors in a targeted locus, the genotypes of their progeny (fish without any mutations, monoalleleic mutants, and biallelic mutants) can be determined simply by fluorescent observation at the embryonic stage.
Previous studies have suggested that the length of homologous sequences of donor plasmids is important in determining which DSB repair pathway can be induced by targetable nuclease systems . Targeted gene integration mediated by MMEJ using plasmids with short homologous arms (10–40 bp) has been applied in cultured cells, frogs, mice, and zebrafish, as short arms can be inserted into the donor plasmids easily by PCR or oligonucleotide annealing [11, 12, 14, 28]. Although our results showed that the donor plasmid with 40 bp of homologous arms slightly improved the integration efficiency than that with 20 bp of arms, the efficiencies were much lower than that of the plasmid with longer homology arms (~500 bp each) in medaka (Table 1). These results indicate that donor plasmids with longer homology arms (ca. 500 bp) represent a potentially attractive system for targeted gene integration in medaka. One previous study showed that MMEJ is active during G1 and early S phases when HR is inactive . It has also been reported that some molecules are involved in MMEJ but not in HR . The activity of the DSB repair pathways induced by the nuclease systems may vary among species and across developmental stages. Thus, specifying pathways that are highly active in species and/or developmental stages of interest is needed to establish highly efficient systems for targeted gene integration in any given model system.
Previous studies have reported that simultaneous cleavage of the bait sequences of a donor plasmid and a genomic target site by targetable nucleases can improve the targeted integration efficiency of the plasmid [13, 16]. In our experiments, donor plasmids cleaved by bait systems also showed higher integration efficiencies than that of a donor plasmid with no targeted cleavage (Fig. 2c), which suggests that linearization of donor plasmids using bait systems enhances integration efficiency by HR in medaka.
We additionally designed a new bait sequence BaitD and demonstrated successful targeted gene integration using donor plasmids harboring the bait sequences. Our data indicate that the BaitD system represents a potentially useful system for targeted gene integration with high efficiency in medaka and other fish species, for the following reasons. 1) In an in vivo screening by gene knockin in medaka embryos, the BaitD system showed the highest integration efficiency among eight bait sequences tested (Fig. 2c). 2) Unlike Gbait, the BaitD system can be used in previously established GFP-transgenic strains, as there is no potential target site of the sgRNA for the BaitD in EGFP gene. 3) In silico screening in the reference genomes of 12 fish species showed that the BaitD system has less potential off-target effect sites in fish genomes (Additional file 5: Table S2 and Additional file 6: S3).
To date, to identify the genotype of a targeted locus in genome-edited animals, genomic DNA extraction from a tissue sample and the subsequent PCR-based genotyping works have been required. However, these processes are invasive, laborious, and time-consuming. In the present study, we demonstrated a simple method to genotype the genome-edited fish using two different colors of fluorescent protein genes inserted at the target locus (Fig. 5a). Using this method, we were able to determine the genotype of each individual simply by observing the fluorescence; no sacrifice of individuals or labor-intensive processes such as genomic DNA preparation were required. This is especially advantageous for cases in which it is difficult to obtain genomic DNA from living embryos and larvae, as for example in the genotyping of mutants with embryonic lethal phenotypes or in the selection of fish harboring desired mutations at early stages (to reduce breeding costs). Thus, generation of gene knockin strains harboring the different colors of fluorescent protein genes may also represent an effective approach for targeted mutagenesis in medaka and others.
Although some G0 fish injected with components to target the gap43 locus transmitted the genes integrated at the target site to their progenies, the transmission rates were lower than those of other methods in mimicking endogenous gene expression by reporter genes. One example is the homology-independent gene knockin using donor plasmids with a hsp70 promoter and a reporter gene in zebrafish . That study reported that 5.0–10.0% of injected embryos showed broad reporter gene expression and 30.0–40.0% transmitted the gene to next generation; that is, 1.5–4.0% of injected embryos became transgenic founders. In our study, targeted integration of reporter genes into the gap43 locus resulted in transmission by only 0.6–1.0% of injected embryos to their progeny (Table 2). This lower efficiency may be attributable to our use of promoter-less constructs in the present study. While constructs harboring a promoter can express when integrated in any direction and frame, in-frame integrations into the genomic target site are required for expression of promoter-less constructs. Thus, even if constructs with or without a promoter are integrated at similar efficiencies, the integration efficiency of the promoter-less constructs will be lower than that of constructs with promoter. In another example, bacterial artificial chromosome (BAC)-based transgenes exhibited with high germline transmission rate (~15%) mediated by Tol2 transposon in zebrafish . Although the integration rate of our method is lower than that of the BAC transgenesis, our gene knockin system has the advantage that the transgenes can be precisely integrated into the target site with no positional effects. However, the lower rates of germline transmission in our system indicate the need for further studies to establish methods for improving the efficiency of targeted gene integration in medaka.
Here, we have established an efficient system for targeted gene integration in medaka, which is applicable to the generation of medaka strains with a wide range of genetic modifications. In addition to the gene tagging and knockout by fluorescent reporter genes as shown in this study, this system may enable generation of conditional knockout strains by the insertion of a site-specific recombinase (e.g. LoxP sites for Cre recombinase or FRT sites for Flp) at the genomic targeted position, and for precise introduction of site-specific point mutations in genes of interest [13, 15]. These types of advanced genome editing by HR-mediated targeted gene integration will be useful in accelerating the detailed analysis of gene functions in medaka. Recently, targeted mutagenesis with small indels by the targetable nucleases was also demonstrated in a wide range of fish species in some similar ways to medaka and zebrafish [29,30,31,32,33,34,38]. Our findings in improving the efficiency of targeted gene integration may thus be effective for establishing targeted gene integration systems in other fish species as well.
In this study, we demonstrated targeted gene integration events using CRISPR/Cas9 system and donor plasmids with homologous sequences in medaka. First, we showed that plasmids with longer homology arms (~500 bp each) induced targeted gene integration events more efficiently than those with short homology arms (20 bp and 40 bp). We also found that linearization of the circular donor plasmid by site-specific cleavage with the Cas9 nuclease and the sgRNA targeting to bait sequences increased the efficiency of HR-mediated gene integration in medaka. In addition, a comparison of five newly designed bait sequences and three previously reported sequences revealed that the new bait sequence, BaitD, exhibited the highest efficiency of targeted integration in medaka embryos. Using donor plasmids with longer homology arms and the BaitD sequence, we successfully established gene knockin strains for the gap43 locus. Taken together, these results open new avenues into the establishment of efficient methods for targeted gene integration using the CRISPR/Cas system in medaka.
Takeda H, Shimada A. The art of medaka genetics and genomics: what makes them so unique? Annu Rev Genet. 2010;44:217–41.
Kirchmaier S, Naruse K, Wittbrodt J, Loosli F. The genomic and genetic toolbox of the teleost medaka (Oryzias latipes). Genetics. 2015;199:905–18.
Kinoshita M, Murata K, Naruse K, Tanaka M, editors. Medaka: Biology, Management, and Experimental Protocols. Ames, Iowa, USA: Wiley-Blackwell; 2009.
Peng Y, Clark KJ, Campbell JM, Panetta MR, Guo Y, Ekker SC. Making designer mutants in model organisms. Development. 2014;141:4042–54.
Urnov FD, Rebar EJ, Holmes MC, Zhang HS, Gregory PD. Genome editing with engineered zinc finger nucleases. Nat Rev Genet. 2010;11:636–46.
Ansai S, Sakuma T, Yamamoto T, Ariga H, Uemura N, Takahashi R, et al. Efficient targeted mutagenesis in medaka using custom-designed transcription activator-like effector nucleases. Genetics. 2013;193:739–49.
Ansai S, Inohaya K, Yoshiura Y, Schartl M, Uemura N, Takahashi R, et al. Design, evaluation, and screening methods for efficient targeted mutagenesis with transcription activator-like effector nucleases in medaka. Develop Growth Differ. 2014;56:98–107.
Ansai S, Kinoshita M. Targeted mutagenesis using CRISPR/Cas system in medaka. Biol Open. 2014;3:362–71.
Stemmer M, Thumberger T, Del Sol KM, Wittbrodt J, Mateo JL. CCTop: An Intuitive, Flexible and Reliable CRISPR/Cas9 Target Prediction Tool. PLoS One. 2015;10:e0124633.
McVey M, Lee SE. MMEJ repair of double-strand breaks (director’s cut): deleted sequences and alternative endings. Trends Genet. 2008;24:529–38.
Nakade S, Tsubota T, Sakane Y, Kume S, Sakamoto N, Obara M, et al. Microhomology-mediated end-joining-dependent integration of donor DNA in cells and animals using TALENs and CRISPR/Cas9. Nat Commun. 2014;5:5560.
Sakuma T, Nakade S, Sakane Y, Suzuki K-IT, Yamamoto T. MMEJ-assisted gene knock-in using TALENs and CRISPR-Cas9 with the PITCh systems. Nat Protoc. 2016;11:118–33.
Irion U, Krauss J, Nusslein-Volhard C. Precise and efficient genome editing in zebrafish using the CRISPR/Cas9 system. Development. 2014;141:4827–30.
Hisano Y, Sakuma T, Nakade S, Ohga R, Ota S, Okamoto H, et al. Precise in-frame integration of exogenous DNA mediated by CRISPR/Cas9 system in zebrafish. Sci Rep. 2015;5:8841.
Hoshijima K, Jurynec MJ, Grunwald DJ. Precise Editing of the Zebrafish Genome Made Simple and Efficient. Dev Cell. 2016;36:654–67.
Ochiai H, Sakamoto N, Fujita K, Nishikawa M, Suzuki K, Matsuura S, et al. Zinc-finger nuclease-mediated targeted insertion of reporter genes for quantitative imaging of gene expression in sea urchin embryos. Proc Natl Acad Sci U S A. 2012;109:10915–20.
Xu H, Xiao T, Chen CH, Li W, Meyer CA, Wu Q, et al. Sequence determinants of improved CRISPR sgRNA design. Genome Res. 2015;25:1147–57.
Bae S, Park J, Kim JS. Cas-OFFinder: A fast and versatile algorithm that searches for potential off-target sites of Cas9 RNA-guided endonucleases. Bioinformatics. 2014;30:1473–5.
Auer TO, Duroure K, De Cian A, Concordet JP, Del Bene F. Highly efficient CRISPR/Cas9-mediated knock-in in zebrafish by homology-independent DNA repair. Genome Res. 2014;24:142–53.
Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013;31:827–32.
Okuyama T, Isoe Y, Hoki M, Suehiro Y, Yamagishi G, Naruse K, et al. Controlled Cre/loxP site-specific recombination in the developing brain in medaka fish, Oryzias latipes. PLoS One. 2013;8:e66597.
Moreno-Mateos MA, Vejnar CE, Beaudoin J, Fernandez JP, Mis EK, Khokha MK, et al. CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo. Nat Methods. 2015;12:982–8.
Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, et al. Efficient genome editing in zebrafish using a CRISPR-Cas system. Nat Biotechnol. 2013;31:227–9.
Kinoshita M, Kani S, Ozato K, Wakamatsu Y. Activity of the medaka translation elongation factor 1alpha-A promoter examined using the GFP gene as a reporter. Develop Growth Differ. 2000;42:469–78.
Kusakabe R, Kusakabe T, Suzuki N. In vivo analysis of two striated muscle actin promoters reveals combinations of multiple regulatory modules required for skeletal and cardiac muscle-specific gene expression. Int J Dev Biol. 1999;43:541–54.
Kinoshita M. Transgenic medaka with brilliant fluorescence in skeletal muscle under normal light. Fish Sci. 2004;70:645–9.
Fujimori KE, Kawasaki T, Deguchi T, Yuba S. Characterization of a nervous system-specific promoter for growth-associated protein 43 gene in Medaka (Oryzias latipes). Brain Res. 2008;1245:1–15.
Aida T, Nakade S, Sakuma T, Izu Y, Oishi A, Mochida K, et al. Gene cassette knock-in in mammalian cells and zygotes by enhanced MMEJ. BMC Genomics. 2016;17:979.
Taleei R, Nikjoo H. Biochemical DSB-repair model for mammalian cells in G1 and early S phases of the cell cycle. Mutat. Res. - Genet. Toxicol. Environ. Mutagenesis. 2013;756:206–12.
Kimura Y, Hisano Y, Kawahara A, Higashijima S-I. Efficient generation of knock-in transgenic zebrafish carrying reporter/driver genes by CRISPR/Cas9-mediated genome engineering. Sci Rep. 2014;4:6545.
Suster ML, Abe G, Schouw A, Kawakami K. Transposon-mediated BAC transgenesis in zebrafish. Nat Protoc. 2011;6:1998–2021.
Yano A, Guyomard R, Nicol B, Jouanno E, Quillet E, Klopp C, et al. An immune-related gene evolved into the master sex-determining gene in rainbow trout, Oncorhynchus mykiss. Curr Biol. 2012;22:1423–8.
Edvardsen RB, Leininger S, Kleppe L, Skaftnesmo KO, Wargelius A. Targeted mutagenesis in Atlantic salmon (Salmo salar L.) using the CRISPR/Cas9 system induces complete knockout individuals in the F0 generation. PLoS One. 2014;9:e108622.
Li M, Yang H, Zhao J, Fang L, Shi H, Li M, et al. Efficient and heritable gene targeting in tilapia by CRISPR/Cas9. Genetics. 2014;197:591–9.
Harel I, Benayoun BA, Machado B, Singh PP, Hu CK, Pech MF, et al. A platform for rapid exploration of aging and diseases in a naturally short-lived vertebrate. Cell. 2015;160:1013–26.
Ma L, Jeffery WR, Essner JJ, Kowalko JE. Genome editing using TALENs in blind Mexican Cavefish, Astyanax mexicanus. PLoS One. 2015;10:e0119370.
Erickson PA, Ellis NA, Miller CT. Microinjection for Transgenesis and Genome Editing in Threespine Sticklebacks. J Vis Exp. 2016;111:e54055.
Cui Z, Liu Y, Wang W, Wang Q, Zhang N, Lin F, et al. Genome editing reveals dmrt1 as an essential male sex-determining gene in Chinese tongue sole (Cynoglossus semilaevis). Sci Rep. 2017;7:42213.
We thank Dr. A. Shimizu for providing a plasmid pPBIS19-mgfc:TagBFP-8xHSE:Cre and Dr. A. Toyoda for sharing an unpublished genome assembly of Pagrus major.
This work was partially supported by a Grant-in-Aid for Scientific Research (A) 15H02540 (MK) and a Grant-in-Aid for JSPS fellows 13 J01682 (SA). The funders had no role in the design of the study and collection, analysis, or interpretation of data or in the writing of the manuscript.
Availability of data and materials
Donor plasmids containing the BaitD sequence will be distributed from RIKEN Bioresource Center DNA Bank (http://dna.brc.riken.jp/).
Ethics approval and consent to participate
Fish were maintained and used in accordance with the Regulation on Animal Experimentation at Kyoto University (approval number: 26–71).
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Reference genome sequences of teleost species used in this study. (XLSX 9 kb)
Oligonucleotide sequences used in this study. (XLSX 13 kb)
Selection of sgRNA targeting to the skeletal muscle-specific actin gene (acta1) in medaka. (Upper): The design of sgRNAs targeting the acta1 locus. (Lower): An electrophoresis image shows results of the heteroduplex mobility assay (HMA) in embryos injected with 50 ng/μL of a sgRNA (sgRNA-acta1 #1 or #2) and 100 ng/μL of Cas9 mRNA. Control shows a result from an embryo without injection. (PPTX 76 kb)
Precise gene integration of a donor vector containing the GFP gene into the acta1 locus. (a) Schematic illustration of the locus. The primer pair (acta1-for-Seq-Fw and mAG-Rv or mAG-Fw and acta1-for-Seq-Rv) was used for investigating the precise integration into the acta1 locus. The amplicons containing 5′ junction or 3′ junction are 1.0 kbp, respectively. (b) The sequence analysis of the PCR amplicons (1.0 kbp) containing 5′ or 3′ junction. (Upper): The sequence observed in the G0 fish with the inserted gene (#1–#5). (Middle): The sequence in the donor plasmid, pBaitD-acta1_500 bp-mAG (Plasmid). (Lower): The sequence in the wild type fish without the insertion (WT). (PPTX 41 kb)
The numbers of potential off-target sequences in reference genomes of teleost fish species. Numbers of genomic sites with up to 3 bp mismatches in the total 21 bp sequences are shown. (XLSX 11 kb)
Potential off-target sites of 7 candidates of bait sequence that selected in the first screening and previously reported bait sequences. Potential off-target sites are defined as genomic sequence harboring up to 2 bp mismatches in the total 18 bp sequences and a NGG PAM. (XLSX 11 kb)