PROSPECT II: Protein structure prediction program for genome-scale applications

Dongsup Kim, Dong Xu, Jun Tao Guo, Kyle Ellrott, Ying Xu

Research output: Contribution to journal (Article)

79 Citations (Scopus)

Abstract

A new method for fold recognition is developed and added to the general protein structure prediction package PROSPECT (http://compbio.ornl.gov/PROSPECT/). The new method (PROSPECT II) has four key features. (i) We have developed an efficient way to utilize the evolutionary information for evaluating the threading potentials including singleton and pairwise energies. (ii) We have developed a two-stage threading strategy: (a) threading using dynamic programming without considering the pairwise energy and (b) fold recognition considering all the energy terms, including the pairwise energy calculated from the dynamic programming threading alignments. (iii) We have developed a combined z-score scheme for fold recognition, which takes into consideration the z-scores of each energy term. (iv) Based on the z-scores, we have developed a confidence index, which measures the reliability of a prediction and a possible structure-function relationship based on a statistical analysis of a large data set consisting of threadings of 600 query proteins against the entire FSSP templates. Tests on several benchmark sets indicate that the evolutionary information and other new features of PROSPECT II greatly improve the alignment accuracy. We also demonstrate that the performance of PROSPECT II on fold recognition is significantly better than any other method available at all levels of similarity. Improvement in the sensitivity of the fold recognition, especially at the superfamily and fold levels, makes PROSPECT II a reliable and fully automated protein structure and function prediction program for genome-scale applications.
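The combined z-score idea in feature (iii) can be illustrated with a minimal sketch. The abstract does not give the actual combination formula, so the weighted sum below, the term names (`singleton`, `pairwise`), the weights, and the energy values are all assumptions made for illustration and are not taken from PROSPECT II itself. The sketch also assumes that lower energy is better, so z-scores are sign-flipped to make larger values indicate better templates.

```python
from statistics import mean, pstdev


def z_scores(energies):
    """Z-score of each energy against the distribution over all templates.

    Assumption: lower energy is better, so the sign is flipped and a large
    positive z-score marks a template scoring well below the mean energy.
    """
    mu = mean(energies)
    sigma = pstdev(energies)
    if sigma == 0:
        # All templates score identically; no template stands out.
        return [0.0 for _ in energies]
    return [(mu - e) / sigma for e in energies]


def combined_z_scores(term_energies, weights):
    """Weighted sum of per-term z-scores for every template (hypothetical scheme)."""
    per_term = {term: z_scores(vals) for term, vals in term_energies.items()}
    n_templates = len(next(iter(term_energies.values())))
    return [
        sum(weights[term] * per_term[term][i] for term in per_term)
        for i in range(n_templates)
    ]


# Made-up energies for one query threaded against four templates.
term_energies = {
    "singleton": [-12.0, -3.0, -2.5, -2.5],
    "pairwise": [-8.0, -1.0, -2.0, -1.0],
}
weights = {"singleton": 1.0, "pairwise": 1.0}  # illustrative, not fitted

scores = combined_z_scores(term_energies, weights)
best = max(range(len(scores)), key=scores.__getitem__)
print(f"best template index: {best}, combined z-score: {scores[best]:.2f}")
```

In a scheme like this, a template is ranked by how far it stands out from the background distribution on every energy term at once, which is one plausible reading of "takes into consideration the z-scores of each energy term".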

Original language: English (US)
Pages (from-to): 641-650
Number of pages: 10
Journal: Protein Engineering
Volume: 16
Issue number: 9
State: Published - Sep 2003
Externally published: Yes

Fingerprint

Genes
Genome
Proteins
Dynamic programming
Benchmarking
Statistical methods
Datasets

Keywords

  • Fold recognition
  • PROSPECT
  • Protein structure prediction
  • Threading
  • z-score

ASJC Scopus subject areas

  • Molecular Biology
  • Biochemistry

Cite this

PROSPECT II: Protein structure prediction program for genome-scale applications. / Kim, Dongsup; Xu, Dong; Guo, Jun Tao; Ellrott, Kyle; Xu, Ying.

In: Protein Engineering, Vol. 16, No. 9, 09.2003, p. 641-650.

@article{3f755620d17947c89e329df58ae0528e,
title = "PROSPECT II: Protein structure prediction program for genome-scale applications",
keywords = "Fold recognition, PROSPECT, Protein structure prediction, Threading, z-score",
author = "Dongsup Kim and Dong Xu and Guo, {Jun Tao} and Kyle Ellrott and Ying Xu",
year = "2003",
month = "9",
language = "English (US)",
volume = "16",
pages = "641--650",
journal = "Protein Engineering, Design and Selection",
issn = "1741-0126",
publisher = "Oxford University Press",
number = "9",

}

TY - JOUR

T1 - PROSPECT II

T2 - Protein structure prediction program for genome-scale applications

AU - Kim, Dongsup

AU - Xu, Dong

AU - Guo, Jun Tao

AU - Ellrott, Kyle

AU - Xu, Ying

PY - 2003/9

Y1 - 2003/9

KW - Fold recognition

KW - PROSPECT

KW - Protein structure prediction

KW - Threading

KW - z-score

UR - http://www.scopus.com/inward/record.url?scp=0142184275&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0142184275&partnerID=8YFLogxK

M3 - Article

C2 - 14560049

AN - SCOPUS:0142184275

VL - 16

SP - 641

EP - 650

JO - Protein Engineering, Design and Selection

JF - Protein Engineering, Design and Selection

SN - 1741-0126

IS - 9

ER -