Evolutionary sequence modeling for discovery of peptide hormones

Mustafa (Kemal) Sonmez, Naunihal T. Zaveri, Ilan A. Kerman, Sharon Burke, Charles R. Neal, Xinmin Xie, Stanley J. Watson, Lawrence Toll

Research output: Contribution to journal › Article

47 Citations (Scopus)

Abstract

There are currently a large number of "orphan" G-protein-coupled receptors (GPCRs) whose endogenous ligands (peptide hormones) are unknown. Identification of these peptide hormones is a difficult and important problem. We describe a computational framework that models spatial structure along the genomic sequence simultaneously with the temporal evolutionary path structure across species and show how such models can be used to discover new functional molecules, in particular peptide hormones, via cross-genomic sequence comparisons. The computational framework incorporates a priori high-level knowledge of structural and evolutionary constraints into a hierarchical grammar of evolutionary probabilistic models. This computational method was used for identifying novel prohormones and the processed peptide sites by producing sequence alignments across many species at the functional-element level. An initial implementation of the algorithm was used to identify potential prohormones by comparing the human and non-human proteins in the Swiss-Prot database of known annotated proteins. In this proof of concept, we identified 45 out of 54 prohormones with only 44 false positives. The comparison of known and hypothetical human and mouse proteins resulted in the identification of a novel putative prohormone with at least four potential neuropeptides. Finally, in order to validate the computational methodology, we present the basic molecular biological characterization of the novel putative peptide hormone, including its identification and regional localization in the brain. This cross-species, hidden Markov model (HMM)-based computational approach succeeded in identifying a previously undiscovered neuropeptide from whole-genome protein sequences. This novel putative peptide hormone is found in discrete brain regions as well as other organs. The success of this approach will have a great impact on our understanding of GPCRs and associated pathways and help to identify new targets for drug development.
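
As a rough reading of the proof-of-concept figures quoted in the abstract (45 of 54 known prohormones recovered, 44 false positives), the short Python sketch below works out the implied recall and precision. It assumes that "false positives" means flagged proteins that are not known prohormones, which the abstract does not spell out; the variable names are illustrative and not taken from the paper.

```python
# Counts quoted in the abstract.
known_prohormones = 54   # annotated prohormones in the test set
true_positives = 45      # known prohormones the method recovered
false_positives = 44     # assumption: flagged proteins that are not known prohormones

# Known prohormones the method missed.
false_negatives = known_prohormones - true_positives

# Recall: fraction of known prohormones that were recovered.
recall = true_positives / known_prohormones

# Precision: fraction of flagged proteins that are known prohormones.
precision = true_positives / (true_positives + false_positives)

print(f"missed prohormones: {false_negatives}")
print(f"recall: {recall:.3f}")
print(f"precision: {precision:.3f}")
```

Under that assumption the method recovers roughly 83% of the known prohormones, and about half of its 89 predictions are known prohormones.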

Original language: English (US)
Article number: e1000258
Journal: PLoS Computational Biology
Volume: 5
Issue number: 1
DOI: https://doi.org/10.1371/journal.pcbi.1000258
State: Published - Jan 2009

Fingerprint

Peptide Hormones
Hormones
Peptides
Proteins
Modeling
Neuropeptides
G Protein
G-Protein-Coupled Receptors
Receptor
Genomics
Brain

ASJC Scopus subject areas

  • Cellular and Molecular Neuroscience
  • Ecology
  • Molecular Biology
  • Genetics
  • Ecology, Evolution, Behavior and Systematics
  • Modeling and Simulation
  • Computational Theory and Mathematics

Cite this

Sonmez, M. K., Zaveri, N. T., Kerman, I. A., Burke, S., Neal, C. R., Xie, X., ... Toll, L. (2009). Evolutionary sequence modeling for discovery of peptide hormones. PLoS Computational Biology, 5(1), [e1000258]. https://doi.org/10.1371/journal.pcbi.1000258

@article{afeda460d8804498ad5329f0ac27c971,
title = "Evolutionary sequence modeling for discovery of peptide hormones",
author = "Sonmez, {Mustafa (Kemal)} and Zaveri, {Naunihal T.} and Kerman, {Ilan A.} and Sharon Burke and Neal, {Charles R.} and Xinmin Xie and Watson, {Stanley J.} and Lawrence Toll",
year = "2009",
month = "1",
doi = "10.1371/journal.pcbi.1000258",
language = "English (US)",
volume = "5",
journal = "PLoS Computational Biology",
issn = "1553-734X",
publisher = "Public Library of Science",
number = "1",

}

TY - JOUR

T1 - Evolutionary sequence modeling for discovery of peptide hormones

AU - Sonmez, Mustafa (Kemal)

AU - Zaveri, Naunihal T.

AU - Kerman, Ilan A.

AU - Burke, Sharon

AU - Neal, Charles R.

AU - Xie, Xinmin

AU - Watson, Stanley J.

AU - Toll, Lawrence

PY - 2009/1

Y1 - 2009/1

UR - http://www.scopus.com/inward/record.url?scp=59149099840&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=59149099840&partnerID=8YFLogxK

U2 - 10.1371/journal.pcbi.1000258

DO - 10.1371/journal.pcbi.1000258

M3 - Article

C2 - 19132080

AN - SCOPUS:59149099840

VL - 5

JO - PLoS Computational Biology

JF - PLoS Computational Biology

SN - 1553-734X

IS - 1

M1 - e1000258

ER -