Methods for Discovery and Analysis of Class2 CRISPR-Cas Systems

ZHU Xiaofei; HUANG Jiaomei; YUAN Hao; WAN Yi

doi:10.15886/j.cnki.rdswxb.2021.01.017

Volume 12 Issue 1

Apr. 2021


ZHU Xiaofei, HUANG Jiaomei, YUAN Hao, WAN Yi. Methods for Discovery and Analysis of Class2 CRISPR-Cas Systems[J]. Journal of Tropical Biology, 2021, 12(1): 115-123. doi: 10.15886/j.cnki.rdswxb.2021.01.017


ZHU Xiaofei^1, HUANG Jiaomei^1, YUAN Hao^2, WAN Yi^{1,3}

1. Marine College/State Laboratory of Marine Utilization in South China Sea, Hainan University, Haikou, Hainan 570228
2. College of Information and Communication Engineering, Hainan University, Haikou, Hainan 570228
3. Institute of Oceanology/Shandong Key Laboratory of Corrosion Science, Chinese Academy of Sciences, Qingdao, Shandong 266071

Received Date: 2020-07-08
Rev Recd Date: 2020-09-20

Available Online: 2021-01-11

Publish Date: 2021-04-12

Abstract

Clustered regularly interspaced short palindromic repeats and their associated genes (CRISPR-Cas) have been widely used in recent years as a gene-editing tool in animals and plants. Proven Class 2 CRISPR-Cas systems, such as CRISPR-Cas12 and CRISPR-Cas14, were discovered through bioinformatics mining, and bioinformatics has become an important tool for discovering new CRISPR-Cas systems and their subtypes. Two methods for bioinformatics mining of Cas enzymes are reviewed. One builds a hidden Markov model (HMM) from known Cas enzymes to predict similar Cas enzymes; the other identifies the marker sequence Cas1 or CRISPR and then analyzes the upstream and downstream regions for possible Cas enzymes. The limitations of both methods are discussed. Methods for further analysis of Cas proteins and CRISPR sequences are also reviewed, including Cas protein homology analysis, phylogenetic analysis, and analysis of CRISPR spacers, protospacers, and protospacer adjacent motifs (PAMs).

Keywords: mining of Cas enzymes; CRISPR-Cas system; bioinformatics analysis

References

[1]	ISHINO Y, SHINAGAWA H, MAKINO K, et al. Nucleotide sequence of the iap gene responsible for alkaline phosphatase isozyme conversion in Escherichia coli and identification of the gene product [J]. Journal of Bacteriology, 1987, 169(12): 5429 − 5433. doi: 10.1128/JB.169.12.5429-5433.1987
[2]	JANSEN R, EMBDEN J D, GAASTRA W, et al. Identification of genes that are associated with DNA repeats in prokaryotes [J]. Molecular Microbiology, 2002, 43(6): 1565 − 1575. doi: 10.1046/j.1365-2958.2002.02839.x
[3]	MAKAROVA K S, GRISHIN N V, SHABALINA S A, et al. A putative RNA-interference-based immune system in prokaryotes: computational analysis of the predicted enzymatic machinery, functional analogies with eukaryotic RNAi, and hypothetical mechanisms of action [J]. Biology Direct, 2006, 1(7): 1 − 26.
[4]	BARRANGOU R, FREMAUX C, DEVEAU H, et al. CRISPR provides acquired resistance against viruses in prokaryotes [J]. Science, 2007, 315(5819): 1709 − 1712. doi: 10.1126/science.1138140
[5]	KONERMANN S, LOTFY P, BRIDEAU N J, et al. Transcriptome engineering with RNA-targeting type Ⅵ-D CRISPR effectors [J]. Cell, 2018, 173(3): 665 − 676. doi: 10.1016/j.cell.2018.02.033
[6]	LUCAS B H, DAVID B, JANICE S C, et al. Programmed DNA destruction by miniature CRISPR-Cas14 enzymes [J]. Science, 2018, 362(6416): 839 − 842. doi: 10.1126/science.aav4294
[7]	HYATT D, CHEN G L, LOCASCIO P F, et al. Prodigal: prokaryotic gene recognition and translation initiation site identification [J]. BMC Bioinformatics, 2010, 11(119): 1 − 11.
[8]	DELCHER A L, BRATKE K A, POWERS E C, et al. Identifying bacterial genes and endosymbiont DNA with Glimmer [J]. Bioinformatics, 2007, 23(6): 673 − 679. doi: 10.1093/bioinformatics/btm009
[9]	BESEMER J, LOMSADZE A, BORODOVSKY M. GeneMarkS: a self-training method for prediction of gene starts in microbial genomes. Implications for finding sequence motifs in regulatory regions [J]. Nucleic Acids Research, 2001, 29(12): 2607 − 2618. doi: 10.1093/nar/29.12.2607
[10]	BURSTEIN D, HARRINGTON L B, STRUTT S C, et al. New CRISPR-Cas systems from uncultivated microbes [J]. Nature, 2017, 542(7640): 237 − 241. doi: 10.1038/nature21059
[11]	周海廷. 隐马尔科夫过程在生物信息学中的应用[J]. 生命科学研究, 2002, 6(3): 204 − 210. doi: 10.3969/j.issn.1007-7847.2002.03.004
[12]	WONG K M, SUCHARD M A, HUELSENBECK J P. Alignment Uncertainty and Genomic Analysis [J]. Science, 2008, 319(5862): 473 − 476. doi: 10.1126/science.1151532
[13]	POTTER S C, LUCIANI A, EDDY S R, et al. HMMER web server: 2018 update [J]. Nucleic Acids Research, 2018(46): 200 − 204.
[14]	BISWAS A, STAALS J, MORALES S E, et al. CRISPRDetect: A flexible algorithm to define CRISPR arrays [J]. BMC Genomics, 2016, 17(1): 1 − 14.
[15]	IBTISSEM G, GILLES V, CHRISTINE P. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats [J]. Nucleic Acids Research, 2007(35): 52 − 57.
[16]	Robert C E. PILER-CR: Fast and accurate identification of CRISPR repeats [J]. BMC Bioinformatics, 2007, 8(18): 1 − 6.
[17]	ZETSCHE B, GOOTENBERG J S, ABUDAYYEH O O, et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system [J]. Cell, 2015(163): 1 − 13.
[18]	COUVIN D, BERNHEIM A, TOFFANO-NIOCHE C, et al. CRISPRCasFinder, an update of CRISRFinder, includes a portable version enhanced performance and integrates search for Cas proteins [J]. Nucleic Acids Research, 2018(46): 246 − 251.
[19]	TAKEUCHI N, WOLF Y I, MAKAROVA S, et al. Nature and intensity of selection pressure on CRISPR-associated genes [J]. Journal of Bacteriology, 2011, 194(5): 1216 − 1225.
[20]	SHMAKOV S, SMARGON A, SCOTT D, et al. Diversity and evolution of class 2 CRISPR–Cas systems [J]. Nature Reviews Microbiology, 2017, 15(3): 169 − 182. doi: 10.1038/nrmicro.2016.184
[21]	WENHAN ZHU, LOMSADZE A, BORODOVSKY M. Ab initio gene identification in metagenomic sequences [J]. Nucleic Acids Research, 2010, 38(12): e132. doi: 10.1093/nar/gkq275
[22]	MAKAROVA K S, WOLF Y I, ALKHNBASHI O S, et al. An updated evolutionary classification of CRISPR-Cas systems [J]. Nature Reviews Microbiology, 2015, 13(3569): 722 − 736.
[23]	SMARGON A A, COX D B, PYZOCHA N K, et al. Cas13b is a type VI-B CRISPR-associated RNA-guided RNase differentially regulated by accessory proteins Csx27 and Csx28 [J]. Molecular Cell, 2017(65): 618 − 630.
[24]	NISHIMASU H, RAN A F, PATRICK D H, et al. Crystal structure of Cas9 in complex with guide RNA and target DNA [J]. Cell, 2014(156): 935 − 949.
[25]	NISHIMASU H, CONG L, YAN W, et al. Crystal structure of Staphylococcus aureus Cas9 [J]. Cell, 2015, 162(5): 1113 − 1126. doi: 10.1016/j.cell.2015.08.007
[26]	YAMANO T, NISHIMASU H, ZETSCHE B, et al. Crystal structure of Cpf1 in complex with guide RNA and target DNA [J]. Cell, 2016, 165(4): 949 − 962. doi: 10.1016/j.cell.2016.04.003
[27]	唐东明, 朱清新, 陈科, 等. 一种有效的蛋白质序列聚类分析方法[J]. 软件学报, 2011, 22(8): 1827 − 1837.
[28]	YING ZHAO, KARYPIS G. Data clustering in life sciences [J]. Molecular Biotechnology, 2005, 31(1): 55 − 80. doi: 10.1385/MB:31:1:055
[29]	LI L. OrthoMCL: Identification of ortholog groups for eukaryotic genomes [J]. Genome Research, 2003, 13(9): 2178 − 2189. doi: 10.1101/gr.1224503
[30]	ENRIGHT A J, DONGEN S V, OUZOUNIS C A. An efficient algorithm for large-scale detection of protein families [J]. Nucleic Acids Research, 2002, 30(7): 1575 − 1584. doi: 10.1093/nar/30.7.1575
[31]	ARON M B, PANCHENKO A R, SHOEMAKER B A, et al. CDD: a database of conserved domain alignments with links to domain three-dimensional structure [J]. Nucleic Acids Research, 2002(30): 281 − 283.
[32]	UNIPROT C. The UniProt Consortium. UniProt: a hub for protein information [J]. Nucleic Acids Research, 2015, 43(D1): D204 − D212. doi: 10.1093/nar/gku989
[33]	REMMERT M, BIEGERT A, HAUSER A, et al. HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment [J]. Nature Methods, 2011, 9(2): 173 − 175.
[34]	ALEXANDROS S. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies [J]. Bioinformatics, 2014(9): 1312 − 1313.
[35]	GASCUEL O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0 [J]. Systematic Biology, 2010, 59(3): 307 − 321. doi: 10.1093/sysbio/syq010
[36]	IVICA L, PEER B. Interactive Tree of Life (iTOL): An online tool for phylogenetic tree display and annotation[M]. New York: Oxford University Press, 2007.
[37]	MAKAROVA K S, WOLF Y I, KOONIN E V. Comparative genomics of defense systems in archaea and bacteria [J]. Nucleic Acids Research, 2013, 41(8): 4360 − 4377. doi: 10.1093/nar/gkt157
[38]	Alexey D, Christian C, James P, et al. JPred4: a protein secondary structure prediction server [J]. Nucleic Acids Research, 2015, 43(332): 389 − 394.
[39]	MARCHLER-BAUER A, STEPHEN H B. CDD: conserved domains and protein three-dimensional structure [J]. Nucleic Acids Research, 2004, 32(454): 327 − 331.
[40]	SODING J. Protein homology detection by HMM-HMM comparison [J]. Bioinformatics, 2005(21): 951 − 960.
[41]	KELLEY L A, MEZULIS S, YATES C M, et al. The Phyre2 web portal for protein modeling, prediction and analysis [J]. Nature Protocols, 2015, 10(6): 845 − 858. doi: 10.1038/nprot.2015.053
[42]	ROY A, KUCUKURAL A, ZHANG Y. I-TASSER: a unified platform for automated protein structure and function prediction [J]. Nature Protocols, 2010, 5(4): 725 − 738. doi: 10.1038/nprot.2010.5
[43]	SKENNERTON C T, MICHAEL I, TYSON G W. Crass: identification and reconstruction of CRISPR from unassembled metagenomic data [J]. Nucleic Acids Research, 2013, 41(10): 105. doi: 10.1093/nar/gkt183
[44]	ZHANG Z, SCHWARTZ S, WAGNER L, et al. A greedy algorithm for aligning DNA sequences [J]. Journal of Computational Biology, 2000, 7(2): 203 − 214.
[45]	JINEK M, CHYLINSKI K, FONFARA I, et al. A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity [J]. Science, 2012, 337(6096): 816 − 821. doi: 10.1126/science.1225829
[46]	GAVIN E C, GARY H, JOHN J M, et al. WebLogo: a sequence logo generator [J]. Genome Research, 2004, 14(6): 1188 − 1190. doi: 10.1101/gr.849004
[47]	MAKAROVA K S, WOLF Y I, IRANZO J, et al. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants [J]. Nature Reviews Microbiology, 2020, 18(2): 67 − 83. doi: 10.1038/s41579-019-0299-x
[48]	KOONIN E V, MAKAROVA K S. Mobile genetic elements and evolution of CRISPR-Cas systems: all the way there and back [J]. Genome Biology and Evolution, 2017, 9(10): 2812 − 2825. doi: 10.1093/gbe/evx192
[49]	GUILHEM F K, MAKAROVA K S, KOONIN E V. CRISPR-Cas: complex functional networks and multiple roles beyond adaptive immunity [J]. Journal of Molecular Biology, 2019, 4(431): 3 − 20.
[50]	PETERS J E, MAKAROVA K S, SHMAKOV S, et al. Recruitment of CRISPR-Cas systems by Tn7-like transposons [J]. Proceedings of the National Academy of Sciences, 2017, 114(35): 7358 − 7366. doi: 10.1073/pnas.1709035114
[51]	MIGLE K, GEORGIJ K, CESLOVAS V, et al. A cyclic oligonucleotide signaling pathway in type III CRISPR-Cas systems [J]. Science, 2017(357): 605 − 609.
[52]	NIEWOEHNER O, GARCIA-DOVAL C, ROSTOL J T, et al. Type III CRISPR-Cas systems produce cyclic oligoadenylate second messengers [J]. Nature, 2017, 548(7669): 543 − 548. doi: 10.1038/nature23467



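Clustered regularly interspaced short palindromic repeats and their associated genes (CRISPR-Cas) were first found in Escherichia coli in 1987. While studying the E. coli iap (alkaline phosphatase) gene, ISHINO Y et al.^[1] found highly conserved 29 bp nucleotide repeats in the 3′ flanking sequence of its coding region, separated by 32 bp intervals. Further study showed that such repeats are widespread in the genomes of archaea and bacteria, and in 2002 JANSEN R et al. formally named them CRISPR. That study also found four homologous genes in the sequences flanking CRISPR loci, the CRISPR-associated genes cas1, cas2, cas3, and cas4, which encode proteins functionally related to CRISPR^[2]. The immune function of the CRISPR-Cas system was uncovered gradually: the system was first proposed to resemble eukaryotic RNA interference (RNAi)^[3] and was later confirmed to give bacteria acquired immunity against phages and other pathogens^[4].

CRISPR-Cas defense against an invading phage proceeds in three stages. Stage 1 is adaptation: when a phage invades, the Cas1-Cas2 protein complex uses the protospacer adjacent motif (PAM) site to cut out a segment of phage target DNA (the protospacer) and inserts it at the 5′ end of the CRISPR repeat array, creating a new spacer. Stage 2 is expression and processing: the spacers and CRISPR repeats are transcribed together into a primary transcript, the pre-CRISPR RNA (pre-crRNA), which Cas proteins then cleave into mature CRISPR RNAs (crRNAs), each containing a spacer and repeat sequence. Processing of pre-crRNA differs among CRISPR-Cas systems: some use multiple Cas protein subunits, some a single Cas protein, and some rely on a host-cell RNase. Stage 3 is interference: directed by the guide RNA (formed from crRNA and tracrRNA), a single Cas protein or a Cas protein complex cleaves the target DNA or RNA.

Class 1 CRISPR-Cas systems need a complex of several Cas proteins to cleave the target strand, whereas Class 2 CRISPR-Cas systems need only a single Cas protein plus a guide RNA (gRNA) to cleave target DNA or RNA. Class 2 CRISPR-Cas systems have therefore become important tools for gene editing.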

4. Summary and Outlook

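Focusing on bioinformatics approaches, we divided the mining of CRISPR-Cas systems from microbial genomes into: 1) mining based on hidden Markov models: i) open reading frame prediction, ii) collecting known Cas proteins to build a hidden Markov model, iii) CRISPR sequence identification; 2) mining of CRISPR-Cas systems with Cas1 and CRISPR as marker sequences: i) searching genomes for the marker sequence Cas1 or CRISPR, ii) analyzing the proteins upstream and downstream of the marker sequence to find possible Cas enzymes. We also presented methods for analyzing the Cas enzymes of a newly identified CRISPR-Cas system: cluster analysis (BLAST, HHpred, and similar software), phylogenetic tree construction (RAxML and similar software), and domain and tertiary structure prediction (JPred4 and similar software); 3) for the CRISPR sequences of a new CRISPR-Cas system, analysis of spacers (CRASS and similar software), protospacers (blastn and similar tools), and protospacer adjacent motifs.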
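The protospacer adjacent motif analysis named in this summary can be illustrated with a minimal Python sketch. It assumes the protospacer hits have already been located (for example with blastn) and that an equal-length flanking sequence has been cut out next to each hit. The function names `pam_frequencies` and `consensus`, the 0.75 threshold, and the toy flanks are hypothetical and are not taken from any of the reviewed tools.

```python
from collections import Counter


def pam_frequencies(flanks):
    """Return, for each flank position, the frequency of A, C, G and T."""
    if not flanks:
        raise ValueError("no flanking sequences given")
    length = len(flanks[0])
    if any(len(f) != length for f in flanks):
        raise ValueError("flanks must all have the same length")
    table = []
    for pos in range(length):
        # Count the nucleotides observed at this position across all flanks.
        counts = Counter(f[pos].upper() for f in flanks)
        total = sum(counts.values())
        table.append({nt: counts.get(nt, 0) / total for nt in "ACGT"})
    return table


def consensus(table, threshold=0.75):
    """Call a base where one nucleotide reaches the threshold, else 'N'."""
    called = []
    for column in table:
        nt, freq = max(column.items(), key=lambda item: item[1])
        called.append(nt if freq >= threshold else "N")
    return "".join(called)


# Hypothetical 3-nt flanks taken next to four protospacer hits.
toy_flanks = ["AGG", "TGG", "CGG", "GGG"]
pam_table = pam_frequencies(toy_flanks)
pam_consensus = consensus(pam_table)
print(pam_consensus)
```

The per-position frequency table is the same information a sequence logo displays, so it can be inspected directly or passed to a logo generator. The orientation of the flank relative to the repeat has to be known beforehand, otherwise the motif may be read from the wrong side of the protospacer.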

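However, each method has its own limitations in practice. For Cas enzyme mining, building a hidden Markov model can only predict types similar to the known Cas enzymes used to build it; it cannot predict Cas proteins of a different type whose sequences differ greatly. Mining new Cas enzymes through the marker sequences Cas1 and CRISPR places strict requirements on the architecture of the CRISPR-Cas locus: a discovered system must contain a marker sequence within 20 kb upstream or downstream. With the discovery that the Cas proteins of the Class 2 CRISPR-Cas14 system have only 400 to 700 amino acids^[6], the traditional view that a single Cas protein capable of targeted cleavage needs more than 950 amino acid residues was overturned, so the filter that keeps only proteins of >700 amino acid residues upstream and downstream of the marker gene should be updated. In addition, for the evolutionary classification of Cas proteins, the finding that Cas12 may be related to the transposon protein TnpB offers the new view that different Cas proteins have different origins. For CRISPR sequence identification, some software cannot show the direct repeat (DR) sequences or the sequence orientation, which may introduce errors into PAM analysis and structure analysis.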
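A short Python sketch can make the effect of the 20 kb window and the length filter concrete. It assumes genes have already been predicted and reduced to coordinates and protein lengths; the `Gene` class, the function `candidates_near_marker`, and all coordinates below are hypothetical illustrations, not the procedure of any specific published pipeline.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int       # start coordinate on the contig (bp)
    end: int         # end coordinate on the contig (bp), inclusive
    aa_length: int   # length of the predicted protein in amino acids


def candidates_near_marker(genes, marker_start, marker_end,
                           window_bp=20_000, min_aa=700):
    """Keep genes lying entirely within window_bp of the marker
    (Cas1 or a CRISPR array) whose protein has at least min_aa residues."""
    lo = marker_start - window_bp
    hi = marker_end + window_bp
    return [g for g in genes
            if g.start >= lo and g.end <= hi and g.aa_length >= min_aa]


# Hypothetical contig with a CRISPR array at 10,000-10,500 bp.
genes = [
    Gene("large_effector", 5_000, 8_900, 1_300),
    Gene("small_effector", 12_000, 13_500, 500),   # Cas14-sized protein
    Gene("distant_protein", 60_000, 64_000, 1_200),
]

strict = candidates_near_marker(genes, 10_000, 10_500, min_aa=700)
relaxed = candidates_near_marker(genes, 10_000, 10_500, min_aa=400)
print([g.name for g in strict])
print([g.name for g in relaxed])
```

With the 700-residue threshold the Cas14-sized protein is discarded even though it sits beside the array, while lowering the threshold to 400 residues recovers it. A gene outside the window is missed under either setting, which is the structural limitation described above.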

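As for the classification of CRISPR systems, the classification scheme should be updated continually as research on CRISPR-Cas systems advances. The main reasons are as follows. 1) With the continued development of bioinformatics tools for CRISPR-Cas mining, type Ⅵ and type Ⅴ CRISPR-Cas systems that cleave target RNA have been discovered, and subtypes of type Ⅴ have been identified. Studies indicate that type Ⅴ CRISPR-Cas systems arose from transposon TnpB nucleases through locus transfer and duplication, so type Ⅴ systems show a large number of variants, a considerable share of which have evolved into independent subtypes^[48]. 2) Some recently discovered CRISPR-Cas systems are thought to perform functions other than acquired immunity in bacteria or archaea^[49] and lack the ability to cleave a target strand; these functionally distinct CRISPR-Cas loci are usually encoded in mobile regions such as transposons^[48,50]. 3) Several marker genes involved in CRISPR-Cas systems are associated with signal transduction and regulation^[51-52].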

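As a revolutionary technology for targeted gene editing, the CRISPR-Cas system has great potential and broad research prospects. The Class 2 CRISPR-Cas systems found so far can cleave target single-stranded DNA/RNA and target double-stranded DNA; however, no CRISPR-Cas system that cleaves double-stranded RNA has yet been found. As more microbial and metagenomic data become available, genome sequencing improves, and bioinformatics analysis methods mature, more CRISPR-Cas systems will be discovered and applied to targeted genome editing, helping researchers analyze and understand gene function in animals and plants.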



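Software	Advantages	Disadvantages
Prodigal	Simple to use; all genomes can be run in a single file	Fewer predictions
Glimmer	Many predictions	Complex to use
GeneMarkS	Relies on a self-training set	Must be run on one genome at a time

Software	Advantages	Disadvantages
CRISPRDetect	Identifies sequence orientation	Background noise
CRISPRFinder	Identifies and displays DRs; accurately identifies short sequences	Runs on one genome sequence at a time
PILER-CR	Simple to use; all genomes can be placed in a single file; fast	Lower identification accuracy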
