### Abstract

We extend kernelized matrix factorization with a fully Bayesian treatment and with the ability to work with multiple side information sources expressed as different kernels. Kernel functions have been introduced to matrix factorization to integrate side information about the rows and columns (e.g., objects and users in recommender systems), which is necessary for making out-of-matrix (i.e., cold start) predictions. We specifically discuss bipartite graph inference, where the output matrix is binary, but extensions to more general matrices are straightforward. We extend the state of the art in two key aspects: (i) A fully conjugate probabilistic formulation of the kernelized matrix factorization problem enables an efficient variational approximation, whereas fully Bayesian treatments are not computationally feasible in the earlier approaches; (ii) Multiple side information sources are included, treated as different kernels in multiple kernel learning that additionally reveals which side information sources are informative. Our method outperforms alternatives in predicting drug-protein interactions on two data sets. We then show that our framework can also be used for solving multilabel learning problems by considering samples and labels as the two domains on which matrix factorization operates. Our algorithm obtains the lowest Hamming loss values on 10 out of 14 multilabel classification data sets compared to five state-of-the-art multilabel learning algorithms.
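The core idea the abstract describes — low-rank factors obtained as kernel projections, so that a new row or column with known kernel values can be embedded without retraining — can be sketched as follows. This is a minimal NumPy illustration, not the paper's variational algorithm: the projection matrices `Ax`, `Ay` are random placeholders here, whereas the paper learns them (and kernel weights) by variational Bayesian inference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions: Nx drugs (rows), Ny proteins (columns), R latent factors.
Nx, Ny, R = 5, 4, 2

def random_kernel(n):
    """Random positive semidefinite matrix standing in for a side-information kernel."""
    X = rng.standard_normal((n, n))
    return X @ X.T / n

# Side-information kernels over rows and columns.
Kx, Ky = random_kernel(Nx), random_kernel(Ny)

# Projection matrices (hypothetical placeholders; learned in the actual method).
Ax = rng.standard_normal((Nx, R))
Ay = rng.standard_normal((Ny, R))

# Low-dimensional embeddings are projections of the kernels, which is what
# enables out-of-matrix (cold start) prediction: a new entity only needs its
# kernel values against the training entities.
Gx = Kx @ Ax          # (Nx, R) row embeddings
Gy = Ky @ Ay          # (Ny, R) column embeddings

# Score matrix; for a binary output matrix, predict interactions via the sign.
F = Gx @ Gy.T         # (Nx, Ny)
Y_pred = np.sign(F)
```

With multiple side information sources, each `Kx` would instead be a weighted combination of several kernels, with the learned weights indicating which sources are informative.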

Original language | English (US)
---|---
Title of host publication | 30th International Conference on Machine Learning, ICML 2013
Publisher | International Machine Learning Society (IMLS)
Pages | 1901-1909
Number of pages | 9
Edition | PART 3
State | Published - 2013
Externally published | Yes
Event | 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States; Duration: Jun 16 2013 → Jun 21 2013

### Other

Other | 30th International Conference on Machine Learning, ICML 2013
---|---
Country | United States
City | Atlanta, GA
Period | 6/16/13 → 6/21/13

### ASJC Scopus subject areas

- Human-Computer Interaction
- Sociology and Political Science

### Cite this

Gonen, Mehmet; Khan, Suleiman A.; Kaski, Samuel. **Kernelized Bayesian matrix factorization.** In *30th International Conference on Machine Learning, ICML 2013* (PART 3 ed., pp. 1901-1909). International Machine Learning Society (IMLS).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

