Sparse multitask regression for identifying common mechanism of response to therapeutic targets

Kai Zhang, Joe Gray, Bahram Parvin

Research output: Contribution to journalArticle

21 Citations (Scopus)

Abstract

Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response. Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multi-task regression has a superior reduction in the regression error when compared with traditional L1-and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents of CI-1040 and Iressa, which are currently in clinical use. Contact: b_parvin@lbl.gov.

Original languageEnglish (US)
Article numberbtq181
JournalBioinformatics
Volume26
Issue number12
DOIs
StatePublished - Jun 1 2010
Externally publishedYes

Fingerprint

Cell Cycle
Regression
Cells
Synthetic Data
Gene Expression
Gene expression
Target
Coordination Complexes
Therapeutic Uses
Genes
Association reactions
Gene
Perturbation
Cluster Analysis
Hidden Variables
Clustering Analysis
Multidimensional Data
Regression Analysis
Learning
Breast Neoplasms

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computational Theory and Mathematics
  • Computer Science Applications
  • Computational Mathematics
  • Statistics and Probability
  • Medicine(all)

Cite this

Sparse multitask regression for identifying common mechanism of response to therapeutic targets. / Zhang, Kai; Gray, Joe; Parvin, Bahram.

In: Bioinformatics, Vol. 26, No. 12, btq181, 01.06.2010.

Research output: Contribution to journalArticle

@article{ab03de7a58cf470e91494a17b1ec3a14,
title = "Sparse multitask regression for identifying common mechanism of response to therapeutic targets",
abstract = "Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response. Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multi-task regression has a superior reduction in the regression error when compared with traditional L1-and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents of CI-1040 and Iressa, which are currently in clinical use. Contact: b_parvin@lbl.gov.",
author = "Kai Zhang and Joe Gray and Bahram Parvin",
year = "2010",
month = "6",
day = "1",
doi = "10.1093/bioinformatics/btq181",
language = "English (US)",
volume = "26",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "12",

}

TY - JOUR

T1 - Sparse multitask regression for identifying common mechanism of response to therapeutic targets

AU - Zhang, Kai

AU - Gray, Joe

AU - Parvin, Bahram

PY - 2010/6/1

Y1 - 2010/6/1

N2 - Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response. Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multi-task regression has a superior reduction in the regression error when compared with traditional L1-and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents of CI-1040 and Iressa, which are currently in clinical use. Contact: b_parvin@lbl.gov.

AB - Motivation: Molecular association of phenotypic responses is an important step in hypothesis generation and for initiating design of new experiments. Current practices for associating gene expression data with multidimensional phenotypic data are typically (i) performed one-to-one, i.e. each gene is examined independently with a phenotypic index and (ii) tested with one stress condition at a time, i.e. different perturbations are analyzed separately. As a result, the complex coordination among the genes responsible for a phenotypic profile is potentially lost. More importantly, univariate analysis can potentially hide new insights into common mechanism of response. Results: In this article, we propose a sparse, multitask regression model together with co-clustering analysis to explore the intrinsic grouping in associating the gene expression with phenotypic signatures. The global structure of association is captured by learning an intrinsic template that is shared among experimental conditions, with local perturbations introduced to integrate effects of therapeutic agents. We demonstrate the performance of our approach on both synthetic and experimental data. Synthetic data reveal that the multi-task regression has a superior reduction in the regression error when compared with traditional L1-and L2-regularized regression. On the other hand, experiments with cell cycle inhibitors over a panel of 14 breast cancer cell lines demonstrate the relevance of the computed molecular predictors with the cell cycle machinery, as well as the identification of hidden variables that are not captured by the baseline regression analysis. Accordingly, the system has identified CLCA2 as a hidden transcript and as a common mechanism of response for two therapeutic agents of CI-1040 and Iressa, which are currently in clinical use. Contact: b_parvin@lbl.gov.

UR - http://www.scopus.com/inward/record.url?scp=77954196545&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77954196545&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btq181

DO - 10.1093/bioinformatics/btq181

M3 - Article

C2 - 20529943

AN - SCOPUS:77954196545

VL - 26

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - 12

M1 - btq181

ER -