
Bayesian Mixture Models For Semi-Supervised Clustering

2020

Abstract

In most real-world applications of clustering, data is partially labeled by an expert. Classical clustering approaches have been extensively studied in the presence of partial labels; however, little work has been done to treat the general case of Bayesian mixture models. In this paper, we propose a new approach to perform semi-supervised clustering using parametric and nonparametric mixture models. We show how our approach generalizes mixture models with different types of emission distributions and priors under the same theoretical framework for semi-supervised clustering. The partial labels enter the clustering through a Hidden Markov Random Field (HMRF) that introduces a penalty when the partial labels are not respected. We demonstrate how to perform inference in both the finite and infinite cases, with priors on the mixture components and the parameters, using variational inference. Our experimental evaluations on synthetic data show how the method can leverage the part...
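
As a rough illustration of the penalty idea mentioned in the abstract, the sketch below counts how often a hard cluster assignment disagrees with the partial labels. It is not the paper's formulation: the pairwise must-link/cannot-link reading of the labels, the function name `hmrf_penalty`, and the constant `weight` are all assumptions made for this example, and the paper's variational treatment is not reproduced here.

```python
from itertools import combinations


def hmrf_penalty(assignments, partial_labels, weight=1.0):
    """Illustrative penalty (assumed form, not taken from the paper).

    assignments    : cluster index for every data point
    partial_labels : expert label for a point, or None when unlabeled
    weight         : hypothetical constant cost per violated pair
    """
    # Only points that carry an expert label take part in the penalty.
    labeled = [i for i, y in enumerate(partial_labels) if y is not None]

    penalty = 0.0
    for i, j in combinations(labeled, 2):
        same_label = partial_labels[i] == partial_labels[j]
        same_cluster = assignments[i] == assignments[j]
        # A pair is violated when the labels say "together" but the clusters
        # say "apart", or the labels say "apart" but the clusters say "together".
        if same_label != same_cluster:
            penalty += weight
    return penalty


labels = ["a", "a", None, "b"]
consistent = hmrf_penalty([0, 0, 1, 1], labels)
inconsistent = hmrf_penalty([0, 1, 1, 1], labels)
```

In this toy case the first assignment respects every labeled pair, while the second splits the two points labeled `"a"` and merges one of them with the point labeled `"b"`, so it is penalized twice. Unlabeled points never contribute, which is what lets the mixture model place them freely.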