TY - JOUR
T1 - Coupled dimensionality reduction and classification for supervised and semi-supervised multilabel learning
AU - Gönen, Mehmet
N1 - Funding Information:
Most of this work has been done while the author was working at the Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland. This work was financially supported by the Integrative Cancer Biology Program of the National Cancer Institute (Grant No. 1U54CA149237) and the Academy of Finland (Finnish Centre of Excellence in Computational Inference Research COIN, Grant No. 251170). A preliminary version of this work appears in Gönen (2012).
PY - 2014/3/1
Y1 - 2014/3/1
N2 - Coupled training of dimensionality reduction and classification has previously been proposed to improve prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find the intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of Hamming loss, average AUC, macro F1, and micro F1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding the intrinsic subspace dimensionality and in semi-supervised learning tasks.
AB - Coupled training of dimensionality reduction and classification has previously been proposed to improve prediction performance for single-label problems. Following this line of research, in this paper, we first introduce a novel Bayesian method that combines linear dimensionality reduction with linear binary classification for supervised multilabel learning and present a deterministic variational approximation algorithm to learn the proposed probabilistic model. We then extend the proposed method to find the intrinsic dimensionality of the projected subspace using automatic relevance determination and to handle semi-supervised learning using a low-density assumption. We perform supervised learning experiments on four benchmark multilabel learning data sets by comparing our method with baseline linear dimensionality reduction algorithms. These experiments show that the proposed approach achieves good performance values in terms of Hamming loss, average AUC, macro F1, and micro F1 on held-out test data. The low-dimensional embeddings obtained by our method are also very useful for exploratory data analysis. We also show the effectiveness of our approach in finding the intrinsic subspace dimensionality and in semi-supervised learning tasks.
KW - Automatic relevance determination
KW - Dimensionality reduction
KW - Multilabel learning
KW - Semi-supervised learning
KW - Supervised learning
KW - Variational approximation
UR - http://www.scopus.com/inward/record.url?scp=84891616874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84891616874&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2013.11.021
DO - 10.1016/j.patrec.2013.11.021
M3 - Article
AN - SCOPUS:84891616874
SN - 0167-8655
VL - 38
SP - 132
EP - 141
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
IS - 1
ER -