Development of a robust classifier for quality control of reverse-phase protein arrays

Zhenlin Ju, Wenbin Liu, Paul L. Roebuck, Doris R. Siwak, Nianxiang Zhang, Yiling Lu, Michael A. Davies, Rehan Akbani, John N. Weinstein, Gordon Mills, Kevin R. Coombes

Research output: Contribution to journal › Article

15 Citations (Scopus)

Abstract

Motivation: High-throughput reverse-phase protein array (RPPA) technology allows for the parallel measurement of protein expression levels in approximately 1000 samples. However, the many steps required in the complex protocol (sample lysate preparation, slide printing, hybridization, washing and amplified detection) may create substantial variability in data quality. We are not aware of any other quality control algorithm that is tuned to the special characteristics of RPPAs.

Results: We have developed a novel classifier for quality control of RPPA experiments using a generalized linear model and logistic function. The outcome of the classifier, ranging from 0 to 1, is defined as the probability that a slide is of good quality. After training, we tested the classifier using two independent validation datasets. We conclude that the classifier can distinguish RPPA slides of good quality from those of poor quality sufficiently well such that normalization schemes, protein expression patterns and advanced biological analyses will not be drastically impacted by erroneous measurements or systematic variations.

Availability and implementation: The classifier, implemented in the "SuperCurve" R package, can be freely downloaded at http://bioinformatics.mdanderson.org/main/OOMPA:Overview or http://r-forge.r-project.org/projects/supercurve/. The data used to develop and validate the classifier are available at http://bioinformatics.mdanderson.org/MOAR.
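As a rough illustration of the kind of model the abstract describes (a generalized linear model with a logistic function whose output lies between 0 and 1), the following Python sketch maps a linear combination of slide-level quality metrics to a probability. It is not the SuperCurve implementation: the function names, the two example metrics, and all coefficient values are hypothetical placeholders, since the abstract does not state which predictors or fitted coefficients the authors used.

```python
import math


def logistic(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def slide_quality_probability(features, coefficients, intercept):
    """Return a value in (0, 1) from a logistic-link linear model.

    `features` are per-slide quality metrics; `coefficients` and
    `intercept` would come from training. All are placeholders here.
    """
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have equal length")
    # Linear predictor: intercept + sum of coefficient * feature
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return logistic(z)


# Hypothetical values for illustration only (not taken from the paper).
HYPOTHETICAL_INTERCEPT = -1.0
HYPOTHETICAL_COEFFICIENTS = [2.0, -0.5]
example_features = [1.5, 0.4]

p_good = slide_quality_probability(
    example_features, HYPOTHETICAL_COEFFICIENTS, HYPOTHETICAL_INTERCEPT
)
```

In such a setup, a slide would be flagged as poor quality when the returned probability falls below a chosen cutoff; the abstract does not specify a cutoff, so any threshold used with this sketch is likewise an assumption.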

Original language: English (US)
Pages (from-to): 912-918
Number of pages: 7
Journal: Bioinformatics
Volume: 31
Issue number: 6
DOIs: https://doi.org/10.1093/bioinformatics/btu736
State: Published - Mar 15 2015
Externally published: Yes

Fingerprint

Protein Array Analysis
Quality Control
Quality control
Reverse
Classifiers
Classifier
Computational Biology
Proteins
Protein
Printing
Bioinformatics
Linear Models
Technology
Data Quality
Generalized Linear Model
Washing
Logistics
High Throughput
Control Algorithm
Normalization

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

Cite this

Ju, Z., Liu, W., Roebuck, P. L., Siwak, D. R., Zhang, N., Lu, Y., ... Coombes, K. R. (2015). Development of a robust classifier for quality control of reverse-phase protein arrays. Bioinformatics, 31(6), 912-918. https://doi.org/10.1093/bioinformatics/btu736

Development of a robust classifier for quality control of reverse-phase protein arrays. / Ju, Zhenlin; Liu, Wenbin; Roebuck, Paul L.; Siwak, Doris R.; Zhang, Nianxiang; Lu, Yiling; Davies, Michael A.; Akbani, Rehan; Weinstein, John N.; Mills, Gordon; Coombes, Kevin R.

In: Bioinformatics, Vol. 31, No. 6, 15.03.2015, p. 912-918.

Ju, Z, Liu, W, Roebuck, PL, Siwak, DR, Zhang, N, Lu, Y, Davies, MA, Akbani, R, Weinstein, JN, Mills, G & Coombes, KR 2015, 'Development of a robust classifier for quality control of reverse-phase protein arrays', Bioinformatics, vol. 31, no. 6, pp. 912-918. https://doi.org/10.1093/bioinformatics/btu736
Ju, Zhenlin ; Liu, Wenbin ; Roebuck, Paul L. ; Siwak, Doris R. ; Zhang, Nianxiang ; Lu, Yiling ; Davies, Michael A. ; Akbani, Rehan ; Weinstein, John N. ; Mills, Gordon ; Coombes, Kevin R. / Development of a robust classifier for quality control of reverse-phase protein arrays. In: Bioinformatics. 2015 ; Vol. 31, No. 6. pp. 912-918.
@article{c680bd66d41e4b4b8672f6dd31c3c372,
title = "Development of a robust classifier for quality control of reverse-phase protein arrays",
abstract = "Motivation: High-throughput reverse-phase protein array (RPPA) technology allows for the parallel measurement of protein expression levels in approximately 1000 samples. However, the many steps required in the complex protocol (sample lysate preparation, slide printing, hybridization, washing and amplified detection) may create substantial variability in data quality. We are not aware of any other quality control algorithm that is tuned to the special characteristics of RPPAs. Results: We have developed a novel classifier for quality control of RPPA experiments using a generalized linear model and logistic function. The outcome of the classifier, ranging from 0 to 1, is defined as the probability that a slide is of good quality. After training, we tested the classifier using two independent validation datasets. We conclude that the classifier can distinguish RPPA slides of good quality from those of poor quality sufficiently well such that normalization schemes, protein expression patterns and advanced biological analyses will not be drastically impacted by erroneous measurements or systematic variations. Availability and implementation: The classifier, implemented in the {"}SuperCurve{"} R package, can be freely downloaded at http://bioinformatics.mdanderson.org/main/OOMPA:Overview or http://r-forge.r-project.org/projects/supercurve/. The data used to develop and validate the classifier are available at http://bioinformatics.mdanderson.org/MOAR.",
author = "Zhenlin Ju and Wenbin Liu and Roebuck, {Paul L.} and Siwak, {Doris R.} and Nianxiang Zhang and Yiling Lu and Davies, {Michael A.} and Rehan Akbani and Weinstein, {John N.} and Gordon Mills and Coombes, {Kevin R.}",
year = "2015",
month = "3",
day = "15",
doi = "10.1093/bioinformatics/btu736",
language = "English (US)",
volume = "31",
pages = "912--918",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR

T1 - Development of a robust classifier for quality control of reverse-phase protein arrays

AU - Ju, Zhenlin

AU - Liu, Wenbin

AU - Roebuck, Paul L.

AU - Siwak, Doris R.

AU - Zhang, Nianxiang

AU - Lu, Yiling

AU - Davies, Michael A.

AU - Akbani, Rehan

AU - Weinstein, John N.

AU - Mills, Gordon

AU - Coombes, Kevin R.

PY - 2015/3/15

Y1 - 2015/3/15

AB - Motivation: High-throughput reverse-phase protein array (RPPA) technology allows for the parallel measurement of protein expression levels in approximately 1000 samples. However, the many steps required in the complex protocol (sample lysate preparation, slide printing, hybridization, washing and amplified detection) may create substantial variability in data quality. We are not aware of any other quality control algorithm that is tuned to the special characteristics of RPPAs. Results: We have developed a novel classifier for quality control of RPPA experiments using a generalized linear model and logistic function. The outcome of the classifier, ranging from 0 to 1, is defined as the probability that a slide is of good quality. After training, we tested the classifier using two independent validation datasets. We conclude that the classifier can distinguish RPPA slides of good quality from those of poor quality sufficiently well such that normalization schemes, protein expression patterns and advanced biological analyses will not be drastically impacted by erroneous measurements or systematic variations. Availability and implementation: The classifier, implemented in the "SuperCurve" R package, can be freely downloaded at http://bioinformatics.mdanderson.org/main/OOMPA:Overview or http://r-forge.r-project.org/projects/supercurve/. The data used to develop and validate the classifier are available at http://bioinformatics.mdanderson.org/MOAR.

UR - http://www.scopus.com/inward/record.url?scp=84925269960&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84925269960&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btu736

DO - 10.1093/bioinformatics/btu736

M3 - Article

C2 - 25380958

AN - SCOPUS:84925269960

VL - 31

SP - 912

EP - 918

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 6

ER -