Identification of SNPs in rice GPAT genes and in silico analysis of their functional impact on GPAT proteins
DOI:
https://doi.org/10.15835/nbha49312346
Keywords:
3000 Rice Genome project, functional SNPs, in silico analysis, nucleotide variation
Abstract
SNPs are the most common nucleotide variations in the genome. Functional SNPs in the coding region, known as nonsynonymous SNPs (nsSNPs), change amino acid residues and affect protein function. Identifying functional SNPs is an uphill task because it is difficult to correlate variation with phenotype in association studies. In silico analysis offers a way to assess the functional impact of SNPs on proteins and supports experimental approaches to understanding the relationship between genotype and phenotype. Advances in sequencing technologies have made it possible to sequence thousands of genomes, and many public databases now incorporate these data for exploring nucleotide variation. In this study, we explored functional SNPs in the rice GPAT family (as a model plant gene family) using data from the 3000 Rice Genome Sequencing Project. We identified 1056 SNPs in 26 GPAT genes among a hundred rice varieties and filtered these to 98 nsSNPs. We further investigated the structural and functional impact of these nsSNPs using various computational tools and shortlisted 13 SNPs predicted to have highly damaging effects on protein structure. We found that rice GPAT genes can be influenced by nsSNPs, which might have a major effect on the regulation and function of these genes. This information will be useful for understanding the possible relationships between genetic mutation and phenotypic variation, and their functional implications for rice GPAT proteins. The study also provides a computational pathway for identifying SNPs in other rice gene families.
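The filtering step from all coding SNPs to nsSNPs can be illustrated with a minimal sketch. It is not the authors' pipeline, and the function name `classify_snp`, its arguments, and the toy sequence are hypothetical. It assumes an in-frame coding sequence and the standard genetic code, and it simply checks whether a single-base substitution changes the encoded amino acid.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: coding sequence starting at the first base of the first codon
    pos: 0-based position of the SNP within cds (hypothetical convention)
    alt: alternate base at that position
    Returns (label, reference amino acid, alternate amino acid).
    Stop codons are written as '*', so a stop-gain counts as nonsynonymous.
    """
    cds = cds.upper()
    alt = alt.upper()
    start = pos - pos % 3                 # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) != 3:
        raise ValueError("SNP falls in an incomplete codon")
    offset = pos % 3                      # position of the SNP inside the codon
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return label, ref_aa, alt_aa


if __name__ == "__main__":
    toy_cds = "ATGGCTGAA"  # toy sequence (Met-Ala-Glu), not a real GPAT gene
    print(classify_snp(toy_cds, 4, "T"))  # GCT -> GTT
    print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC
```

A check like this only separates synonymous from nonsynonymous changes; predicting whether an nsSNP is damaging requires the kind of dedicated tools the abstract refers to.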
License
Copyright (c) 2021 Notulae Botanicae Horti Agrobotanici Cluj-Napoca

This work is licensed under a Creative Commons Attribution 4.0 International License.
Open Access Journal:
The journal allows the author(s) to retain publishing rights without restriction. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or the author.