Published by De Gruyter, September 1, 2017

Bayesian comparison of protein structures using partial Procrustes distance

  • Nasim Ejlali, Mohammad Reza Faghihi and Mehdi Sadeghi

Abstract

Protein structure alignment is an important topic in bioinformatics. Several statistical methods have been proposed for this problem, but most of them align two protein structures using global geometric information only, without considering neighbourhood effects within the structures. In this paper, we propose a Bayesian model for aligning protein structures that accounts for both local and global geometric information. Local geometric information is incorporated into the model through the partial Procrustes distance of small substructures. These substructures are composed of β-carbon atoms from the side chains. Parameters are estimated using a Markov chain Monte Carlo (MCMC) approach. We evaluate the performance of our model through simulation studies. Furthermore, we apply our model to a real dataset and assess its accuracy and convergence rate. The results show that our model is much more efficient than previous approaches.
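As an illustration of the local term mentioned in the abstract, the following is a minimal sketch of a Procrustes-type distance between two small substructures with known point correspondence. It is not the authors' implementation. It assumes the rigid-body form (optimal translation and rotation, no rescaling); some texts define the partial Procrustes distance on unit-size pre-shapes instead, and the paper may use either convention. The function name `partial_procrustes_distance` and the example coordinates are hypothetical.

```python
import numpy as np


def partial_procrustes_distance(x, y):
    """Distance between two k x 3 point sets after removing translation
    and rotation (no scaling). Rows of x and y are assumed to be matched.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("point sets must have the same shape")

    # Remove translation by centring each configuration at its centroid.
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)

    # Singular values of the cross-product matrix give the best rotation fit.
    u, s, vt = np.linalg.svd(yc.T @ xc)

    # Restrict to proper rotations (no reflections): flip the sign of the
    # smallest singular value when the unconstrained optimum is a reflection.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]

    # min over rotations R of ||xc - yc R||^2
    #   = ||xc||^2 + ||yc||^2 - 2 * (sum of adjusted singular values)
    d2 = (xc ** 2).sum() + (yc ** 2).sum() - 2.0 * s.sum()
    return float(np.sqrt(max(d2, 0.0)))  # clamp tiny negative rounding error


# Hypothetical substructure of four points and a rotated, shifted copy.
sub_a = np.array([[0.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
sub_b = sub_a @ rot_z + np.array([5.0, -2.0, 1.0])

print(partial_procrustes_distance(sub_a, sub_b))  # close to zero
```

A rigidly moved copy of a substructure gives a distance near zero, while a mirror image does not, because reflections are excluded from the fit. In a model of the kind the abstract describes, such a distance would be computed for small neighbourhoods of atoms rather than for whole structures.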

Acknowledgement

We are thankful to the editor, the associate editor, and four anonymous referees for their helpful comments and suggestions, which led to improvements in the manuscript.

Published Online: 2017-9-1
Published in Print: 2017-9-26

©2017 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 13.5.2024 from https://www.degruyter.com/document/doi/10.1515/sagmb-2016-0014/html