Abstract

The CRISPR system is a powerful defense mechanism in bacteria and archaea that provides immunity against viruses. Recently, this process has found a new application in the targeted modification of genomes. CRISPR-mediated genome editing is performed by two main components, namely a single guide RNA and the Cas9 protein. Despite the enormous amount of data generated in this area, there is a dearth of dedicated high-throughput resources. Therefore, we have developed CrisprGE, a central hub of CRISPR/Cas-based genome editing. Presently, this database holds a total of 4680 entries of 223 unique genes from 32 model and other organisms. It encompasses information about the organism, gene, target gene sequences, genetic modification, modification length, genome editing efficiency, cell line, assay, etc. The repository is developed using the open source LAMP (Linux, Apache, MySQL, PHP) server. User-friendly browsing and searching facilities are integrated for easy data retrieval. It also includes useful tools such as BLAST CrisprGE, BLAST NTdb and CRISPR Mapper. Considering the potential utility of CRISPR in the vast area of biology and therapeutics, we foresee this platform as an aid to accelerate research in the burgeoning field of genome engineering.

Database URL : http://crdd.osdd.net/servers/crisprge/ .

Introduction

Genome editing is a method to target any desired sequence in the genome. Over the past few years, this technique has achieved significant success in therapeutics and gene therapy with the help of artificially designed nucleases ( 1 ). In this method, a sequence-specific DNA-binding domain is fused to a nuclease domain that cuts DNA at the intended site with high efficiency but in a non-sequence-specific manner ( 2 ).

The primary tools used for genome editing are constructed from zinc finger (ZF) ( 3 ) and transcription activator-like effector (TALE) ( 4 ) proteins, but they have their own limitations. A new class of nucleases, known as clustered regularly interspaced short palindromic repeats/CRISPR-associated proteins (CRISPR/Cas), has emerged in recent times ( 5 ). It is a type of adaptive immunity in bacteria and archaea, which is acquired in response to exposure to foreign genetic material ( 6 ). This approach has generated considerable interest in the scientific community as a means of crafting sequence-specific alterations in the genomes of various organisms ( 7 ).

CRISPR was first identified in the genome of Escherichia coli as uncommon repeat segments ( 8 ). Later, it was discovered that CRISPR loci contain an array of repeats separated by spacer sequences, the spacers being derived from attacking bacteriophages ( 9 ). A set of cas genes is also present at one end of this array; these genes are key players in cleaving the foreign genetic material ( 10 ). The type II CRISPR/Cas system from the bacterium Streptococcus pyogenes then emerged as a powerful tool for editing the genomes of various organisms ( 5 ). It contains a single Cas protein, the Cas9 endonuclease, and a crRNA that together with a tracrRNA forms a dual-RNA system to cleave a particular target site ( 11 , 12 ). The single guide RNA (sgRNA) is a chimeric RNA created by fusing the 3′-end of the crRNA to the 5′-end of the tracrRNA. Cas9 requires an ‘NGG’ protospacer adjacent motif (PAM) downstream of the target site ( 5 ) ( Figure 1 ). It has been reported that the chimeric sgRNA is more efficient than the crRNA and tracrRNA used separately ( 5 ).
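The PAM requirement described above can be illustrated with a minimal sketch that scans a DNA string for candidate target sites followed by ‘NGG’. The function name, the 20-nt protospacer length and the restriction to the given strand are assumptions made for illustration; they are not taken from the CrisprGE implementation.

```python
import re

def find_cas9_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) for each site followed by an NGG PAM.

    Only the strand given in `seq` is scanned; the opposite strand would
    require searching the reverse complement as well.
    The 20-nt protospacer length is an illustrative assumption.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Toy input: a 20-nt stretch followed directly by an AGG PAM.
example = "ACGT" * 5 + "AGG" + "CC"
sites = find_cas9_targets(example)
```

Positions are 0-based offsets into the input string. Real guide design would additionally weigh off-target matches and sequence composition, which this sketch does not attempt.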

General mechanism of CRISPR/Cas genome editing.
Figure 1.


The breaks induced by Cas9 are repaired by homology-directed repair or non-homologous end joining, creating alterations, i.e. insertions, deletions and substitutions, at the target site. CRISPR constructs are easy to design, and plenty of data has been generated in the last few years. The efficiency of this approach motivated Cong et al . ( 11 ) to execute human genome editing. Subsequently, genome editing using CRISPR was accomplished in model organisms, namely Rattus norvegicus , Caenorhabditis elegans , Danio rerio , Mus musculus , Drosophila melanogaster and Arabidopsis thaliana , and in other organisms ( 12–18 ).

The CRISPR/Cas method has demonstrated a wide range of potential applications, comprising knockout ( 27 , 28 ), knock-in, large chromosomal deletions and replacement of genes in different cells ( 29–31 ). This technique has also been successfully used to make knockout mice with heritable mutated alleles ( 32 ). It is now being used to target long non-coding RNAs in vivo ( 33 ), to check the changes in the proteome after transcription activation ( 34 ) and to delete synaptic proteins for studying their functions ( 35 ). Its important uses include the correction of genetic disorders such as beta thalassemia and Duchenne muscular dystrophy ( 36–38 ). This system has also helped in creating indels to inactivate human papillomavirus, hepatitis B virus, HIV-1 and virulent phages ( 39–43 ).

Within a short time, CRISPR/Cas has gained considerable importance in the field of genome editing. The main aim of CrisprGE is to provide a single platform that integrates the growing information generated by this genome editing approach.

Materials and Methods

Data search

An extensive literature search was carried out, and data were retrieved from PubMed using different combinations of keywords comprising ‘Clustered regularly interspaced short palindromic repeats’, ‘CRISPRs’, ‘CRISPR*’, ‘CRISPR’, ‘genome editing’, ‘genome engineering’, etc. The query used for the advanced search option is as follows:

(((((Clustered regularly interspaced short palindromic repeats) OR CRISPRs) OR CRISPR) OR CRISPR*)) AND ((genome editing) OR genome engineering)

With this query, 575 articles were obtained as of April 2015. We extracted articles having data related to organisms and genes, along with the modifications generated by this targeting. Reviews and general methodology articles were excluded. Similarly, articles lacking the desired information were also omitted. In total, 4680 entries were extracted.
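As an illustration of how such a nested Boolean query can be assembled from keyword lists, the following sketch rebuilds the query string given above. The helper name `nest_or` is hypothetical, and the sketch only constructs the text of the query; it does not contact PubMed.

```python
def nest_or(terms):
    """Chain terms with OR, nesting to the left as in the query shown above."""
    expr = f"({terms[0]})"
    for term in terms[1:]:
        expr = f"({expr} OR {term})"
    return expr

crispr_terms = [
    "Clustered regularly interspaced short palindromic repeats",
    "CRISPRs",
    "CRISPR",
    "CRISPR*",
]
editing_terms = ["genome editing", "genome engineering"]

# The first group carries one extra pair of parentheses in the original query.
query = f"({nest_or(crispr_terms)}) AND {nest_or(editing_terms)}"
```

Keeping the keywords in lists makes it straightforward to add or drop a term when the search is repeated for a later update.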

Database organization

The database is organized to capture the different aspects of genome editing ( Figure 2 ) and includes the following fields:

  • CrisprID: a unique ID is given to each entry.

  • Organism: all organisms are displayed according to their Latin names (e.g. Homo sapiens ).

  • Gene/locus: genes are formatted according to NCBI’s Gene database and literature (e.g. CCR5).

  • Target sequence: sequence of the target gene from the respective study.

  • Target/mutant: sequence of the wild-type gene and the modified sequence or mutant.

  • Cell line: cell lines on which experiments were performed (e.g. HEK293).

  • Assay: experimental method used to find indels (e.g. sequencing).

  • Genetic modification: insertion, deletion, point mutation, indels.

  • Modification length: length of insertion, deletion, indels (e.g. D1, D2).

  • PMIDs: references are specified as PubMed IDs.
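The fields listed above can be pictured as a single record type. The sketch below is only an illustration: the class name, attribute names and the placeholder identifier and sequence are assumptions and do not reflect the actual MySQL schema of CrisprGE; the organism, gene, cell line, assay and length examples are the ones quoted in the field list.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CrisprGEEntry:
    """One database entry, mirroring the fields listed above (names are hypothetical)."""
    crispr_id: str                               # unique ID of the entry
    organism: str                                # Latin name, e.g. Homo sapiens
    gene: str                                    # gene or locus, e.g. CCR5
    target_sequence: str                         # target site reported in the study
    wild_type_sequence: Optional[str] = None     # sequence before editing
    mutant_sequence: Optional[str] = None        # sequence after editing
    cell_line: Optional[str] = None              # e.g. HEK293
    assay: Optional[str] = None                  # e.g. sequencing
    genetic_modification: Optional[str] = None   # insertion, deletion, point mutation, indels
    modification_length: Optional[str] = None    # e.g. D1, D2
    pmid: Optional[str] = None                   # PubMed ID of the source article

# Placeholder values only; the ID and sequence below are invented for illustration.
entry = CrisprGEEntry(
    crispr_id="EXAMPLE-0001",
    organism="Homo sapiens",
    gene="CCR5",
    target_sequence="N" * 20,
    cell_line="HEK293",
    assay="sequencing",
    genetic_modification="deletion",
    modification_length="D1",
)
```

Optional fields default to `None` because, as noted in the data search section, not every article reports every item.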

CrisprGE design.
Figure 2.


The database is equipped with easy browsing and searching options. Analysis tools such as BLAST CrisprGE, BLAST NTdb and CRISPR Mapper are also present. Individual entries are hyperlinked to other resources such as UniProt, KEGG and PubMed.

Implementation of web-interface

CrisprGE is constructed using the open source LAMP server on Red Hat Enterprise Linux 5 with MySQL and Apache on the back end. The front end is implemented with PHP. It is freely available at: http://crdd.osdd.net/servers/crisprge/ .

Results

Database statistics

CrisprGE is a dedicated repository holding a total of 4680 entries of genes edited by the CRISPR/Cas approach. It comprises 223 unique genes targeted in 32 model and other organisms, along with the different modifications induced by repair mechanisms. It also contains details of the various organisms in which genome editing has been carried out ( Figure 3 A). The experiments reported in the database have been performed on different cell lines. Among these, injection of sgRNA constructs into embryos ( Figure 3 B) is the most commonly applied strategy, followed by injection into plant cells and protoplasts. There are different methods to detect indels at the target site. Among them, the most widely used method in the literature is sequencing, followed by the T7 endonuclease I assay ( Figure 3 C).

CrisprGE statistics: graphs representing the statistical distribution of ( A ) organisms, ( B ) cell lines and ( C ) assays. PCR, polymerase chain reaction; T7E1, T7 endonuclease 1 assay; HMA, heteroduplex mobility assay; HRMA, high-resolution melting assay; RFLP, restriction fragment length polymorphism; RE, restriction enzyme assay; CAPS, cleaved amplified polymorphic sequences; SSA assay, single-strand annealing assay.
Figure 3.

The modifications achieved on the target sites are mainly insertions or deletions, point mutations and in some cases both. The range of deletions has been observed between 1 and 294 24 bp and that of insertion from 1 to 1837 bp. It has been seen that most of the deletions and insertions created were of 1 bp followed by 3 bp or 4 bp. The deletion pattern is shown in Figure 4 .
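A length distribution of this kind can be tallied from the modification-length codes stored with each entry. The sketch below assumes codes of the form ‘D1’ or ‘D2’ for deletions, as in the field description, and additionally assumes an analogous ‘I’ prefix for insertions; that prefix and the function name are hypothetical.

```python
import re
from collections import Counter

# 'D<n>' follows the examples in the field list; 'I<n>' is an assumed analogue.
_CODE = re.compile(r"^([DI])(\d+)$")

def tally_lengths(codes):
    """Count how often each deletion and insertion length occurs."""
    counts = {"D": Counter(), "I": Counter()}
    for code in codes:
        match = _CODE.match(code.strip().upper())
        if match is None:
            continue  # skip values that do not follow the assumed pattern
        counts[match.group(1)][int(match.group(2))] += 1
    return counts

# Invented toy codes, not values taken from the database.
toy = tally_lengths(["D1", "D1", "D3", "I1", "unknown"])
```

Calling `toy["D"].most_common()` then lists deletion lengths from most to least frequent, which is the ordering a bar graph of the distribution would use.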

Bar graph showing the length of insertions and deletions in various genes. Del, deletion; Ins, insertion and p, point mutation.
Figure 4.

In this repository, we have also listed the top 20 genes ( Table 1 ), each of which has been targeted at least 66 times by the CRISPR/Cas method. Among them, Tyr and alcohol dehydrogenase 1 (ADH1) are the most commonly edited genes, followed by phytoene desaturase (PDS), Prkdc and Tet1 from M. musculus and TT4 from A. thaliana . A list of all genes and the organism-wise frequency distribution are also provided (see Supplementary Tables S1 and Supplementary Data , respectively).

Table 1.

List of top genes targeted by CRISPR/Cas system

| Genes | Number of entries | Organism |
| --- | --- | --- |
| Tyr | 252 | Mus musculus, Rattus norvegicus, Xenopus tropicalis, Danio rerio |
| ADH1 | 238 | Arabidopsis thaliana, Nicotiana benthamiana |
| PDS | 155 | Nicotiana tabacum, Nicotiana benthamiana, Oryza sativa, Citrus sinensis |
| Prkdc | 125 | Rattus norvegicus, Mus musculus |
| Tet1 | 118 | Rattus norvegicus, Mus musculus |
| TT4 | 108 | Arabidopsis thaliana |
| B2m | 95 | Rattus norvegicus, Mus musculus |
| YSA | 92 | Oryza sativa |
| Tet2 | 88 | Rattus norvegicus, Mus musculus |
| DDM1 | 87 | Glycine max |
| CCR5 | 86 | Homo sapiens |
| PCSK9 | 81 | Mus musculus |
| DMD | 80 | Homo sapiens |
| fh | 72 | Danio rerio |
| Pcdh | 72 | Homo sapiens |
| HBB | 70 | Homo sapiens |
| ApoE | 69 | Rattus norvegicus, Danio rerio |
| Tet3 | 68 | Rattus norvegicus, Mus musculus |
| Prf1 | 67 | Rattus norvegicus, Mus musculus |
| PDS3 | 66 | Arabidopsis thaliana, Nicotiana benthamiana |

DMD, Duchenne muscular dystrophy.


Data retrieval

CrisprGE browse

CrisprGE provides easy browsing options. Users can browse it by any of five fields, namely Organism name, Gene/Locus, Target sequence, Cell line and Assay (see Supplementary Figure S1 ).

Database search and advanced search

In the basic search option, the user can enter a query in the box and search within the provided fields. The search output has information on essential components such as CrisprID, organism, gene, target, modification, location and PMIDs (see Supplementary Figure S2 ). Sorting and filtering functionality is also offered in the search output.

Along with the simple search, a user-friendly advanced search tool is also offered for extensive data search. Users can apply comparison operators (=/like) along with logical operators (AND/OR) on various fields such as organism, gene, target and modification. Users can add any number of keywords by clicking the Add button and so build the final query (see Supplementary Figure S3 ). The output can be sorted and further filtered based on specific keywords using a filter box. Additionally, hints on allowed search keywords are provided to assist users.
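A search of this kind maps naturally onto a parameterized SQL query. The sketch below uses Python's built-in sqlite3 module as a stand-in for the MySQL/PHP back end; the table name `entries`, the column names and the helper functions are assumptions for illustration and are not the actual CrisprGE code.

```python
import sqlite3

ALLOWED_FIELDS = {"organism", "gene", "target", "modification"}

def build_query(conditions):
    """Build a parameterized query from (joiner, field, operator, keyword) tuples.

    The joiner of the first condition is ignored. Field names are checked
    against a whitelist; keywords are passed as bound parameters.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    clauses, params = [], []
    for index, (joiner, field, operator, keyword) in enumerate(conditions):
        if field not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field}")
        if index > 0:
            if joiner.upper() not in ("AND", "OR"):
                raise ValueError(f"unknown joiner: {joiner}")
            clauses.append(joiner.upper())
        if operator.lower() == "like":
            clauses.append(f"{field} LIKE ?")
            params.append(f"%{keyword}%")  # substring match
        elif operator == "=":
            clauses.append(f"{field} = ?")
            params.append(keyword)
        else:
            raise ValueError(f"unknown operator: {operator}")
    return "SELECT * FROM entries WHERE " + " ".join(clauses), params

def demo():
    """Run one query against an in-memory table holding two invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (organism TEXT, gene TEXT, target TEXT, modification TEXT)"
    )
    conn.executemany(
        "INSERT INTO entries VALUES (?, ?, ?, ?)",
        [
            ("Homo sapiens", "CCR5", "N" * 20, "deletion"),
            ("Danio rerio", "fh", "N" * 20, "insertion"),
        ],
    )
    sql, params = build_query(
        [(None, "organism", "=", "Homo sapiens"), ("AND", "gene", "like", "CCR")]
    )
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows
```

Because the clauses are joined without parentheses, SQL precedence applies and AND binds more tightly than OR; a production interface would need to decide whether to group conditions explicitly.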

Analysis tools

Various tools have been integrated to assist the analysis of CRISPRs. The ‘BLAST NTdb’ tool is available in CrisprGE to help users align their target sequence against the NCBI non-redundant nucleotide database. It was built by downloading standalone BLAST programs from the NCBI BLAST ftp site ( ftp://ftp.ncbi.nlm.nih.gov/blast/db/ ). After installation, it was implemented on the Red Hat Enterprise Linux 5 web server. A text box is provided in which the query sequence can be entered in FASTA format. Default parameters such as Expected value (10), Scoring Matrix (BLOSUM62), Alignment view (Pairwise), etc. are used to query the target sequence. The output displays the alignment, graphical view and score. The ‘BLAST CrisprGE’ tool helps users to align their desired sequence with the target sequences from the CrisprGE repository and to find the best possible target site hits for their gene. Default parameters and the resulting output of this tool are similar to those of nucleotide BLAST.
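For readers who wish to run a comparable search locally, the sketch below assembles a command line for the `blastn` program of NCBI BLAST+ using the E-value and pairwise view mentioned above. The source does not state which BLAST executable or version the server uses, so the choice of `blastn`, the database name `nt` and the file name are assumptions; the function only builds the argument list and does not run BLAST.

```python
def blastn_command(query_fasta, db="nt", evalue=10):
    """Return an argument list for a BLAST+ blastn search (not executed here).

    -outfmt 0 requests the pairwise alignment view; -evalue sets the
    expectation value threshold. The database name is an assumption.
    """
    return [
        "blastn",
        "-query", query_fasta,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "0",
    ]

# Hypothetical file name; pass the list to subprocess.run() on a machine
# where BLAST+ and a formatted nucleotide database are installed.
command = blastn_command("target.fasta")
```

Building the command as a list rather than a single string avoids shell quoting problems when a query path contains spaces.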

‘CRISPR Mapper’ can be used to find possible off-target sequence regions within a particular gene or genome. It lets users search for perfectly matching target sequences in a user-provided nucleotide sequence and generates a list of target sites with details. The output of this tool displays the CrisprID, organism name, gene or locus, target sequence and start position, along with the associated genetic modification and its length (see Supplementary Figure S4 ).
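The perfect-match search described above can be sketched as a plain substring scan. The function names, the dictionary of targets and the decision to search the reverse-complement strand as well are assumptions made for illustration; the source does not describe how CRISPR Mapper is implemented.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def map_targets(query, targets):
    """Report every exact occurrence of each target in the query sequence.

    `targets` maps an identifier to a target sequence. Each hit is a tuple
    (identifier, strand, 1-based start position on the query).
    """
    query = query.upper()
    hits = []
    for target_id, target in targets.items():
        target = target.upper()
        if not target:
            continue  # ignore empty target sequences
        patterns = [("+", target)]
        rc = reverse_complement(target)
        if rc != target:  # avoid reporting palindromic targets twice
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            pos = query.find(pattern)
            while pos != -1:
                hits.append((target_id, strand, pos + 1))
                pos = query.find(pattern, pos + 1)  # allow overlapping hits
    return sorted(hits, key=lambda hit: hit[2])

# Toy sequences, not entries from the database.
toy_hits = map_targets("CCGATTACACCTGTAATCGG", {"t1": "GATTACA"})
```

Allowing mismatches, which matters for realistic off-target prediction, would require an alignment-based or indexed search rather than exact matching.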

Each entry in this database is curated manually and further verified by cross-checking. The tools included in the web server are also checked for proper working. The database will be updated half-yearly or yearly to include newer records.

Comparison of genome editing methods

Besides CRISPR/Cas, artificially designed nucleases based on ZF proteins and TALEs are also exploited for genome editing ( 19 , 20 ). Both these nucleases have a DNA-binding domain and a catalytic domain ( 21 , 22 ). The catalytic domain in ZFNs and TALENs is derived from FokI (a type II restriction endonuclease), while in the CRISPR system it originates from the Cas9 nuclease. Although ZFNs and TALENs have been successfully used for genome editing, they have some constraints, specifically on their delivery due to their large size ( 23 ), and may also be toxic ( 24 ). Further, a new enzyme has to be constructed for every new DNA target. In CRISPR/Cas, a single Cas9 nuclease is sufficient to perform these tasks ( 25 ).

We compared the effectiveness and frequency of editing mediated by all three approaches of genome editing. The genes targeted by CRISPR/Cas in our resource were checked in EENdb, a database of ZFN- and TALEN-based genome editing ( 26 ). The list of genes targeted by all these methods is shown in Table 2 . For example, CRISPR/Cas-mediated editing of the human CCR5 gene was 76.00% efficient, whereas ZFNs and TALENs achieved efficiencies of 16.70% and 20.00%, respectively. CRISPR/Cas-based editing of the ben-1 gene in C. elegans was 88.00% efficient, compared with 3.50% for each of the other two techniques. However, in a few cases another technique showed better efficiency, e.g. ZFNs for the ADH1 gene of A. thaliana (16% versus 8%; Table 2 ). These observations suggest that CRISPR/Cas is comparatively more efficient than the other methods of genome editing.

Table 2.

Comparison of genome editing efficiency with different methods

| Organism/species | Gene | Method | Modification method | Efficiency (%) | Efficiency detection method | PMID |
| --- | --- | --- | --- | --- | --- | --- |
| Human ( Homo sapiens ) | CCR5 | CRISPR/Cas9 | NHEJ | 76 | T7E1 assay/Sequencing | 23939622 |
| | | ZFNs | NHEJ | 16.70 | MDNA/SSA assay | 19470664 |
| | | TALENs | NHEJ | 20 | MDNA | 21179091 |
| Human ( Homo sapiens ) | HBB | CRISPR/Cas9 | NHEJ | 70 | T7E1 assay/Sequencing | 23939622 |
| | | ZFNs | NHEJ, HR | 2.1/12.9 | Sequencing | 21898685 |
| | | TALENs | NHEJ | NA | Reporter gene addition assay | 22301904 |
| Rat ( Rattus norvegicus ) | Prkdc | CRISPR/Cas9 | NHEJ | 66.70 | T7E1 assay | 24598943 |
| | | ZFNs | NHEJ | NA | Sequencing | 22981234 |
| | | TALENs | NA | NA | NA | NA |
| Worm ( Caenorhabditis elegans ) | ben-1 | CRISPR/Cas9 | NHEJ | 88 | Sequencing | 24013562 |
| | | ZFNs | NHEJ | 3.50 | MDNA & high-throughput sequencing | 21700836 |
| | | TALENs | NHEJ | 3.50 | MDNA | 21700836 |
| Zebrafish ( Danio rerio ) | gria3a | CRISPR/Cas9 | NHEJ | 61 | T7E1 assay | 23360964 |
| | | ZFNs | NHEJ | 26 | Sequencing | 21822241 |
| | | TALENs | NHEJ | 15 | Sequencing | 21822241 |
| | | TALENs | NA | NA | SSA assay | 21493687 |
| Thale cress ( Arabidopsis thaliana ) | ADH1 | CRISPR/Cas9 | NHEJ | 8 | HRMA, sequencing | 24836556 |
| | | ZFNs | NHEJ | 16 | Restriction-enzyme-resistance assay | 20508152 |
| | | TALENs | NHEJ, HR | NA | SSA assay & restriction-enzyme-resistance assay | 21493687 |
| Silk worm ( Bombyx mori ) | BLOS2 | CRISPR/Cas9 | NHEJ | 35.60 | PCR | 24165890 |
| | | ZFNs | NHEJ | 0 | Reporter gene disruption assay/direct sequencing | 20692340 |
| | | TALENs | NHEJ | 0.45 | Reporter gene disruption assay | 23028749 |

NHEJ, non-homologous end joining; HR, homologous recombination; PCR, polymerase chain reaction; ZFNs, zinc finger nucleases; TALENs, transcription activator-like effector nucleases; T7E1, T7 endonuclease 1 assay; HRMA, high-resolution melting assay; SSA, single-strand annealing; MDNA, mismatch-detection nuclease assay; NA, not available.
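The comparison in Table 2 can be summarized as a fold difference between CRISPR/Cas9 and the better of the two alternative nucleases. The sketch below uses the efficiencies from the rows of Table 2 that report a single numeric CRISPR/Cas9 value and at least one numeric alternative; the function name is hypothetical, and because the values come from different studies and detection assays, the ratios are only a rough indication.

```python
# Efficiencies (%) copied from Table 2; None marks a value reported as NA.
efficiency = {
    "CCR5":   {"CRISPR/Cas9": 76.0, "ZFNs": 16.7, "TALENs": 20.0},
    "ben-1":  {"CRISPR/Cas9": 88.0, "ZFNs": 3.5,  "TALENs": 3.5},
    "gria3a": {"CRISPR/Cas9": 61.0, "ZFNs": 26.0, "TALENs": 15.0},
    "ADH1":   {"CRISPR/Cas9": 8.0,  "ZFNs": 16.0, "TALENs": None},
    "BLOS2":  {"CRISPR/Cas9": 35.6, "ZFNs": 0.0,  "TALENs": 0.45},
}

def fold_over_best_alternative(values):
    """Ratio of CRISPR/Cas9 efficiency to the best non-CRISPR efficiency.

    Returns None when no alternative has a usable (non-zero) value.
    """
    crispr = values.get("CRISPR/Cas9")
    others = [v for method, v in values.items()
              if method != "CRISPR/Cas9" and v is not None]
    best = max(others) if others else None
    if crispr is None or not best:
        return None
    return crispr / best

folds = {gene: fold_over_best_alternative(v) for gene, v in efficiency.items()}
```

A ratio above 1 means CRISPR/Cas9 was reported as more efficient for that gene, and a ratio below 1, as for ADH1, means an alternative nuclease was reported as more efficient.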


Discussion

CRISPR/Cas-based genome editing has been extensively explored since the invention of the sgRNA. This method has been successfully applied to edit the genomes of various organisms, namely humans ( 11 , 44 ), M. musculus ( 45 ), D. rerio ( 46 ), A. thaliana ( 18 ), etc. These findings led to the generation of a huge amount of data on genome editing. CrisprGE is the first specialized resource to encompass vital data on CRISPR/Cas-based genome editing. Presently, it comprises a total of 4680 entries of 223 unique genes from 32 model and important organisms. Prior to our resource, only 439 entries of TALEN-mediated and 340 of ZFN-mediated genome editing were available in EENdb ( 26 ). Moreover, EENdb provides only eight data fields, while CrisprGE covers 12 data fields, each offering significant information.

We have analysed the pattern of modifications mediated by the CRISPR/Cas method. We observed that every kind of mutation, including insertions, deletions and point mutations, has been generated using this method. Deletions and insertions range from as small as 1 bp to as large as several kilobase pairs. The efficiency of small indels of 1–2 bp was high in different organisms, but large indels have also been generated with good efficiency ( 47 , 48 ). This technique has mostly been applied to target a particular location in the genome. Lately, however, it has also shown the potential to target many genes, or even several locations within a gene, simultaneously with high efficiency. For example, the Tet1 , Tet2 and Tet3 genes were targeted together in M. musculus ( 49 ), as were multiple locations in the Coe gene of Ciona intestinalis ( 50 ) and in the w gene of D. melanogaster ( 47 ).

We have provided a user-friendly web server with data retrieval capabilities. Standard browse, search and advanced search options are offered for easy access to data. The advanced search facility helps users to combine multiple terms and restrict the search in one click. Sorting and filtering options help users to refine their search further. A ‘How to use’ section with step-by-step pictorial representation is offered on the web server. In addition, various analysis tools have been integrated for further help. For example, using the KEGG Mapper analysis tool, we found that the targeted genes are involved in various metabolic pathways. Among the frequently targeted genes, Tyr (tyrosinase) is involved in tyrosine metabolism, ADH1 is involved in glucose metabolism and PDS is engaged in carotenoid biosynthesis. This suggests that CrisprGE harbors genes that regulate various biological pathways.

The main limitation is that data on genome editing are increasing very fast, as is evident from the recent literature; therefore, it is necessary to keep the database up to date. Each record in CrisprGE is curated manually at the time of data extraction and further cross-checked. The same strategy will be continued for the addition of new entries, preferably on a half-yearly or yearly basis. Further emphasis will be given to incorporating newer analysis tools for CRISPR.

Genome editing has generated a large amount of data, so there is a pressing need for a repository that can accommodate high-throughput data. In a very short span, this method has been successfully applied to knock in and knock out genes and to create mutations and large chromosomal deletions. It has also shown therapeutic potential in correcting genetic disorders and inhibiting viral infections. Therefore, we expect that the CrisprGE resource will assist the wider scientific community working on different aspects of CRISPR-based genome editing.

Acknowledgements

This work was supported by Council of Scientific and Industrial Research (CSIR) and Department of Biotechnology, Government of India (GAP001). Open access charges provided by CSIR- Institute of Microbial Technology.

Conflict of interest . None declared.

Supplementary Data

Supplementary data are available at Database Online.

References

1

Tebas
P.
Stein
D.
Tang
W.W.
et al.  . (
2014
)
Gene editing of CCR5 in autologous CD4 T cells of persons infected with HIV
.
N. Engl. J. Med.
,
370
,
901
910
.

2

Carroll
D.
(
2011
)
Genome engineering with zinc-finger nucleases
.
Genetics
,
188
,
773
782
.

3

Bibikova
M.
Golic
M.
Golic
K.G.
et al.  . (
2002
)
Targeted chromosomal cleavage and mutagenesis in Drosophila using zinc-finger nucleases
.
Genetics
,
161
,
1169
1175
.

4

Boch
J.
Scholze
H.
Schornack
S.
et al.  . (
2009
)
Breaking the code of DNA binding specificity of TAL-type III effectors
.
Science
,
326
,
1509
1512
.

5

Jinek
M.
Chylinski
K.
Fonfara
I.
et al.  . (
2012
)
A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity
.
Science
,
337
,
816
821
.

6

Horvath
P.
Barrangou
R.
(
2010
)
CRISPR/Cas, the immune system of bacteria and archaea
.
Science
,
327
,
167
170
.

7

Sander
J.D.
Joung
J.K.
(
2014
)
CRISPR-Cas systems for editing, regulating and targeting genomes
.
Nat. Biotechnol.
,
32
,
347
355
.

8

Ishino
Y.
Shinagawa
H.
Makino
K.
et al.  . (
1987
)
Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli , and identification of the gene product
.
J. Bacteriol.
,
169
,
5429
5433
.

9

Bolotin
A.
Quinquis
B.
Sorokin
A.
et al.  . (
2005
)
Clustered regularly interspaced short palindrome repeats (CRISPRs) have spacers of extrachromosomal origin
.
Microbiology
,
151
,
2551
2561
.

10

Haft
D.H.
Selengut
J.
Mongodin
E.F.
et al.  . (
2005
)
A guild of 45 CRISPR-associated (Cas) protein families and multiple CRISPR/Cas subtypes exist in prokaryotic genomes
.
PLoS Comput. Biol.
,
1
,
e60
.

11

Cong
L.
Ran
F.A.
Cox
D.
et al.  . (
2013
)
Multiplex genome engineering using CRISPR/Cas systems
.
Science
,
339
,
819
823
.

12

Cradick
T.J.
Fine
E.J.
Antico
C.J.
et al.  . (
2013
)
CRISPR/Cas9 systems targeting beta-globin and CCR5 genes have substantial off-target activity
.
Nucleic Acids Res.
,
41
,
9584
9592
.

13

Ma
Y.
Shen
B.
Zhang
X.
et al.  . (
2014
)
Heritable multiplex genetic engineering in rats using CRISPR/Cas9
.
PLoS One
,
9
,
e89413
.

14

Friedland
A.E.
Tzur
Y.B.
Esvelt
K.M.
et al.  . (
2013
)
Heritable genome editing in C. elegans via a CRISPR-Cas9 system
.
Nat. Methods
,
10
,
741
743
.

15

Hwang
W.Y.
Fu
Y.
Reyon
D.
et al.  . (
2013
)
Efficient genome editing in zebrafish using a CRISPR-Cas system
.
Nat. Biotechnol.
,
31
,
227
229
.

16

Mashiko
D.
Fujihara
Y.
Satouh
Y.
et al.  . (
2013
)
Generation of mutant mice by pronuclear injection of circular plasmid expressing Cas9 and single guided RNA
.
Sci. Rep.
,
3
,
3355
.

17

Port
F.
Chen
H.M.
Lee
T.
et al.  . (
2014
)
Optimized CRISPR/Cas tools for efficient germline and somatic genome engineering in Drosophila
.
Proc. Natl. Acad. Sci. U.S.A.
,
111
,
E2967
E2976
.

18

Mao
Y.
Zhang
H.
Xu
N.
et al.  . (
2013
)
Application of the CRISPR-Cas system for efficient genome engineering in plants
.
Mol. Plant
,
6
,
2008
2011
.

19

Meng
X.
Noyes
M.B.
Zhu
L.J.
et al.  . (
2008
)
Targeted gene inactivation in zebrafish using engineered zinc-finger nucleases
.
Nat. Biotechnol.
,
26
,
695
701
.

20

Baker
M.
(
2012
)
Gene-editing nucleases
.
Nat. Methods
,
9
,
23
-
26
.

21

Kim
Y.G.
Cha
J.
Chandrasegaran
S.
(
1996
)
Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain
.
Proc. Natl. Acad. Sci. U.S.A.
,
93
,
1156
1160
.

22

Li
T.
Huang
S.
Jiang
W.Z.
et al.  . (
2011
)
TAL nucleases (TALNs): hybrid proteins composed of TAL effectors and FokI DNA-cleavage domain
.
Nucleic Acids Res.
,
39
,
359
372
.

23

Gaj
T.
Gersbach
C.A.
Barbas
C.F.
III
. (
2013
)
ZFN, TALEN, and CRISPR/Cas-based methods for genome engineering
.
Trends Biotechnol.
,
31
,
397
405
.

24

Szczepek
M.
Brondani
V.
Buchel
J.
et al.  . (
2007
)
Structure-based redesign of the dimerization interface reduces the toxicity of zinc-finger nucleases
.
Nat. Biotechnol.
,
25
,
786
793
.

25

Mali
P.
Esvelt
K.M.
Church
G.M.
(
2013
)
Cas9 as a versatile tool for engineering biology
.
Nat. Methods
,
10
,
957
963
.

26

Xiao
A.
Wu
Y.
Yang
Z.
et al.  . (
2013
)
EENdb: a database and knowledge base of ZFNs and TALENs for endonuclease engineering
.
Nucleic Acids Res.
,
41
,
D415
D422
.

27

Sasaki
H.
Yoshida
K.
Hozumi
A.
et al.  . (
2014
)
CRISPR/Cas9-mediated gene knockout in the ascidian Ciona intestinalis
.
Dev. Growth Differ.
,
56
,
499
510
.

28. Shen Z., Zhang X., Chai Y. et al. (2014) Conditional knockouts generated by engineered CRISPR-Cas9 endonuclease reveal the roles of coronin in C. elegans neural development. Dev. Cell, 30, 625-636.

29. Heo Y., Quan X., Xu Y. et al. (2015) CRISPR/Cas9 nuclease-mediated gene knock-in in bovine-induced pluripotent cells. Stem Cells Dev., 24, 393-402.

30. Zhou H., Liu B., Weeks D.P. et al. (2014) Large chromosomal deletions and heritable small genetic changes induced by CRISPR/Cas9 in rice. Nucleic Acids Res., 42, 10903-10914.

31. Zheng Q., Cai X., Tan M.H. et al. (2014) Precise gene deletion and replacement using the CRISPR/Cas9 system in human cells. Biotechniques, 57, 115-124.

32. Fujii W., Onuma A., Sugiura K. et al. (2014) One-step generation of phenotype-expressing triple-knockout mice with heritable mutated alleles by the CRISPR/Cas9 system. J. Reprod. Dev., 60, 324-327.

33. Han J., Zhang J., Chen L. et al. (2014) Efficient in vivo deletion of a large imprinted lncRNA by CRISPR/Cas9. RNA Biol., 11.

34. Waldrip Z.J., Byrum S.D., Storey A.J. et al. (2014) A CRISPR-based approach for proteomic analysis of a single genomic locus. Epigenetics, 9, 1207-1211.

35. Incontro S., Asensio C.S., Edwards R.H. et al. (2014) Efficient, complete deletion of synaptic proteins using CRISPR. Neuron, 83, 1051-1057.

36. Xie F., Ye L., Chang J.C. et al. (2014) Seamless gene correction of beta-thalassemia mutations in patient-specific iPSCs using CRISPR/Cas9 and piggyBac. Genome Res., 24, 1526-1533.

37. Long C., McAnally J.R., Shelton J.M. et al. (2014) Prevention of muscular dystrophy in mice by CRISPR/Cas9-mediated editing of germline DNA. Science, 345, 1184-1188.

38. Yoshimi K., Kaneko T., Voigt B. et al. (2014) Allele-specific genome editing and correction of disease-associated phenotypes in rats using the CRISPR-Cas platform. Nat. Commun., 5, 4240.

39. Hu Z., Yu L., Zhu D. et al. (2014) Disruption of HPV16-E7 by CRISPR/Cas system induces apoptosis and growth inhibition in HPV16 positive human cervical cancer cells. Biomed. Res. Int., 2014, 612823.

40. Lin S.R., Yang H.C., Kuo Y.T. et al. (2014) The CRISPR/Cas9 system facilitates clearance of the intrahepatic HBV templates in vivo. Mol. Ther. Nucleic Acids, 3, e186.

41. Martel B., Moineau S. (2014) CRISPR-Cas: an efficient tool for genome engineering of virulent bacteriophages. Nucleic Acids Res., 42, 9504-9513.

42. Xue W., Chen S., Yin H. et al. (2014) CRISPR-mediated direct mutation of cancer genes in the mouse liver. Nature, 514, 380-384.

43. Hu W., Kaminski R., Yang F. et al. (2014) RNA-directed gene editing specifically eradicates latent and prevents new HIV-1 infection. Proc. Natl. Acad. Sci. U.S.A., 111, 11461-11466.

44. Mali P., Yang L., Esvelt K.M. et al. (2013) RNA-guided human genome engineering via Cas9. Science, 339, 823-826.

45. Wang H., Yang H., Shivalila C.S. et al. (2013) One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell, 153, 910-918.

46. Yu C., Zhang Y., Yao S. et al. (2014) A PCR based protocol for detecting indel mutations induced by TALENs and CRISPR/Cas9 in zebrafish. PLoS One, 9, e98282.

47. Ren X., Sun J., Housden B.E. et al. (2013) Optimized gene editing technology for Drosophila melanogaster using germ line-specific Cas9. Proc. Natl. Acad. Sci. U.S.A., 110, 19012-19017.

48. Li J.F., Norville J.E., Aach J. et al. (2013) Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nat. Biotechnol., 31, 688-691.

49. Horii T., Morita S., Kimura M. et al. (2013) Genome engineering of mammalian haploid embryonic stem cells using the Cas9/RNA system. PeerJ, 1, e230.

50. Stolfi A., Gandhi S., Salek F. et al. (2014) Tissue-specific genome editing in Ciona embryos by CRISPR/Cas9. Development, 141, 4115-4120.

Author notes

Citation details: Kaur,K., Tandon,H., Gupta,A.K. et al. CrisprGE: a central hub of CRISPR/Cas-based genome editing. Database (2015) Vol. 2015: article ID bav055; doi:10.1093/database/bav055

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Supplementary data