Improving the performance of protein threading using insertion/deletion frequency arrays

Kyle Ellrott, Jun Tao Guo, Victor Olman, Ying Xu

Research output: Contribution to journal, Article

1 Citation (Scopus)

Abstract

As a protein evolves, not every part of its amino acid sequence is equally likely to be deleted or to tolerate insertions, because not every amino acid plays an equally important role in maintaining the protein structure. However, the most prevalent models in fold recognition methods treat every amino acid deletion and insertion as an equally probable event. We analyzed the alignment patterns of homologous and analogous sequences to determine patterns of insertion and deletion, and used that information to derive insertion and deletion statistics for the individual amino acids of a target sequence. We define these patterns as insertion/deletion (indel) frequency arrays (IFAs). By applying IFAs to the protein threading problem, we improved alignment accuracy, especially for proteins with low sequence identity. We also demonstrated that applying this information can improve fold recognition.
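
One way to picture an indel frequency array is as a pair of per-residue tallies taken from a multiple alignment of the target with related sequences. The sketch below is only an illustration under stated assumptions, not the authors' procedure, which the abstract does not specify. The function names `indel_frequency_arrays` and `position_gap_penalty`, the `-` gap character, and the linear penalty scaling are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def indel_frequency_arrays(
    target: str, homologs: List[str], gap: str = "-"
) -> Tuple[List[float], List[float]]:
    """Return (deletion_freq, insertion_freq), one value per target residue.

    Assumption: `target` and every row in `homologs` come from the same
    multiple alignment, so all strings have equal length.
    """
    if any(len(row) != len(target) for row in homologs):
        raise ValueError("all aligned rows must have the same length")

    n_res = sum(1 for c in target if c != gap)
    deletions = [0] * n_res
    insertions = [0] * n_res

    for row in homologs:
        pos = -1  # index of the most recent target residue seen so far
        inserted_after = set()
        for t, h in zip(target, row):
            if t != gap:
                pos += 1
                if h == gap:
                    # The homolog lacks this target residue: a deletion.
                    deletions[pos] += 1
            elif h != gap and pos >= 0:
                # The homolog has extra residues after target residue `pos`.
                # Insertions before the first target residue are ignored here.
                inserted_after.add(pos)
        for p in inserted_after:
            insertions[p] += 1  # count each homolog once per position

    n = len(homologs)
    if n == 0:
        return [0.0] * n_res, [0.0] * n_res
    return [d / n for d in deletions], [i / n for i in insertions]


def position_gap_penalty(base: float, freq: float) -> float:
    """Hypothetical scaling: frequently observed indels cost less."""
    return base * (1.0 - freq)


if __name__ == "__main__":
    target = "AC-DE"
    homologs = ["A-GDE", "ACGD-", "AC-DE", "AC-DE"]
    del_freq, ins_freq = indel_frequency_arrays(target, homologs)
    print(del_freq, ins_freq)
    print([position_gap_penalty(10.0, f) for f in ins_freq])
```

A threading aligner could then look up a per-position penalty instead of using one uniform gap cost, which is the general idea the abstract describes; how the published method weights these frequencies is not stated here.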

Original language: English (US)
Pages (from-to): 585-602
Number of pages: 18
Journal: Journal of Bioinformatics and Computational Biology
Volume: 6
Issue number: 3
DOI: 10.1142/S0219720008003552
State: Published - Jun 2008
Externally published: Yes

Fingerprint

Amino acids
Proteins
Amino Acids
Protein Array Analysis
Sequence Homology
Amino Acid Sequence
Statistics

Keywords

  • Indel
  • Proteins
  • Threading
  • Z-score

ASJC Scopus subject areas

  • Medicine(all)
  • Cell Biology

Cite this

Improving the performance of protein threading using insertion/deletion frequency arrays. / Ellrott, Kyle; Guo, Jun Tao; Olman, Victor; Xu, Ying.

In: Journal of Bioinformatics and Computational Biology, Vol. 6, No. 3, 06.2008, p. 585-602.

@article{0d7e305efc6840baba48a6f201f11e63,
title = "Improving the performance of protein threading using insertion/deletion frequency arrays",
abstract = "As a protein evolves, not every part of the amino acid sequence has an equal probability of being deleted or for allowing insertions, because not every amino acid plays an equally important role in maintaining the protein structure. However, the most prevalent models in fold recognition methods treat every amino acid deletion and insertion as equally probable events. We have analyzed the alignment patterns for homologous and analogous sequences to determine patterns of insertion and deletion, and used that information to determine the statistics of insertions and deletions for different amino acids of a target sequence. We define these patterns as insertion/deletion (indel) frequency arrays (IFAs). By applying IFAs to the protein threading problem, we have been able to improve the alignment accuracy, especially for proteins with low sequence identity. We have also demonstrated that the application of this information can lead to an improvement in fold recognition.",
keywords = "Indel, Proteins, Threading, Z-score",
author = "Kyle Ellrott and Guo, {Jun Tao} and Victor Olman and Ying Xu",
year = "2008",
month = "6",
doi = "10.1142/S0219720008003552",
language = "English (US)",
volume = "6",
pages = "585--602",
journal = "Journal of Bioinformatics and Computational Biology",
issn = "0219-7200",
publisher = "World Scientific Publishing Co. Pte Ltd",
number = "3",

}

TY - JOUR

T1 - Improving the performance of protein threading using insertion/deletion frequency arrays

AU - Ellrott, Kyle

AU - Guo, Jun Tao

AU - Olman, Victor

AU - Xu, Ying

PY - 2008/6

Y1 - 2008/6

N2 - As a protein evolves, not every part of the amino acid sequence has an equal probability of being deleted or for allowing insertions, because not every amino acid plays an equally important role in maintaining the protein structure. However, the most prevalent models in fold recognition methods treat every amino acid deletion and insertion as equally probable events. We have analyzed the alignment patterns for homologous and analogous sequences to determine patterns of insertion and deletion, and used that information to determine the statistics of insertions and deletions for different amino acids of a target sequence. We define these patterns as insertion/deletion (indel) frequency arrays (IFAs). By applying IFAs to the protein threading problem, we have been able to improve the alignment accuracy, especially for proteins with low sequence identity. We have also demonstrated that the application of this information can lead to an improvement in fold recognition.

KW - Indel

KW - Proteins

KW - Threading

KW - Z-score

UR - http://www.scopus.com/inward/record.url?scp=46049119218&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=46049119218&partnerID=8YFLogxK

U2 - 10.1142/S0219720008003552

DO - 10.1142/S0219720008003552

M3 - Article

VL - 6

SP - 585

EP - 602

JO - Journal of Bioinformatics and Computational Biology

JF - Journal of Bioinformatics and Computational Biology

SN - 0219-7200

IS - 3

ER -