Abstract—This paper introduces an efficient method for multiple structural alignment of proteins. The method encodes the geometry of protein secondary and tertiary structures as linear sequences and then uses a hierarchical procedure to superimpose proteins based on these sequences. To capture similarities between secondary structure sequences, the method uses an entropy-based n-gram modeling technique adopted from computational linguistics. A step-by-step algorithm then aligns the relative residue position sequences at the tertiary structure level. Several case studies compare the method with other structure alignment tools, and the results provide evidence for the efficiency and applicability of the proposed method.
Index Terms—Multiple structure alignment, linear encoding methods, text modeling.
J. Razmara is with the Department of Computer Sciences, Faculty of Mathematical Sciences, University of Tabriz, Tabriz, Iran (e-mail: razmara@tabrizu.ac.ir).
Cite: Jafar Razmara, "A Method for Multiple Structural Alignment of Proteins Using Text Modeling Techniques," International Journal of Future Computer and Communication, vol. 4, no. 2, pp. 143-146, 2015.