TensorClus: A python library for tensor (Co)-clustering

Lazhar Labiod

2021, Neurocomputing

13 pages

Abstract

Tensor data analysis is the natural extension of data analysis to more than two dimensions. Work on tensor data often relies on tensor decomposition methods. The present paper focuses on unsupervised learning and provides a Python package, referred to as TensorClus, that includes novel co-clustering algorithms for three-way data. All proposed algorithms are based on latent block models and are suited to different types of data, sparse or not. They are successfully evaluated on tasks in text mining, recommender systems, and hyperspectral image clustering. TensorClus is open source and interacts easily with other Python packages such as NumPy and TensorFlow; it also offers an interface with the tensor decomposition packages Tensorly and TensorD on the one hand, and with the co-clustering package Coclust on the other. Finally, it provides CPU and GPU compatibility. The TensorClus library is available at https://pypi.org/project/TensorClus/.

Lazhar Labiod

International Journal of Data Science and Analytics, 2020

With the exponential growth of data collected in fields such as recommender systems (users, items), text mining (documents, terms), and bioinformatics (individuals, genes), co-clustering, the simultaneous clustering of both dimensions of a data matrix, has become a popular technique. Co-clustering aims to obtain homogeneous blocks, which leads to a straightforward joint interpretation of row clusters and column clusters. Many approaches exist; in this paper, we rely on the latent block model (LBM), which is flexible enough to model different types of data matrices. We extend its use to tensor (3D matrix) data by proposing a Tensor LBM (TLBM), which allows different relations between entities. To show the interest of TLBM, we consider continuous, binary, and contingency table datasets. To estimate the parameters, a variational EM algorithm is developed. Its performance is evaluated on synthetic and real datasets to highlight different possible applications.

Keywords: Co-clustering • Tensor • Data science

This submission is an extended version of the PAKDD 2019 paper 'Co-clustering from Tensor Data'.
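To make the idea of co-clustering a three-way tensor concrete, here is a minimal NumPy sketch. It is not the TLBM variational EM algorithm and it does not use the TensorClus API: it is a simplified hard-assignment variant with a squared-error criterion, assuming continuous data stored as an array of shape (rows, columns, slices). The function names `block_means` and `hard_tensor_cocluster` are hypothetical and chosen only for illustration.

```python
import numpy as np

def block_means(X, z, w, g, m):
    """Mean vector (one value per slice) of every (row cluster, column cluster) block."""
    Z = np.eye(g)[z]                                # one-hot row assignments, n x g
    W = np.eye(m)[w]                                # one-hot column assignments, d x m
    sums = np.einsum('ik,ijb,jl->klb', Z, X, W)     # per-block sums, g x m x v
    counts = np.outer(Z.sum(axis=0), W.sum(axis=0)) # cells per block, g x m
    # Empty blocks get a zero mean instead of a division by zero.
    return sums / np.maximum(counts, 1)[:, :, None]

def hard_tensor_cocluster(X, g, m, n_iter=20, seed=0):
    """Alternate row and column reassignments (assumed simplification, not TLBM)."""
    rng = np.random.default_rng(seed)
    n, d, _ = X.shape
    z = rng.integers(g, size=n)   # random initial row clusters
    w = rng.integers(m, size=d)   # random initial column clusters
    for _ in range(n_iter):
        mu = block_means(X, z, w, g, m)
        # Cost of putting row i in cluster k, given the current column clusters.
        row_cost = ((X[:, None, :, :] - mu[None, :, w, :]) ** 2).sum(axis=(2, 3))
        z = row_cost.argmin(axis=1)
        mu = block_means(X, z, w, g, m)
        # Cost of putting column j in cluster l, given the current row clusters.
        col_cost = ((X[:, :, None, :] - mu[z][:, None, :, :]) ** 2).sum(axis=(0, 3))
        w = col_cost.argmin(axis=1)
    return z, w, block_means(X, z, w, g, m)
```

Calling `hard_tensor_cocluster(X, g=3, m=2)` on an array of shape (n, d, v) returns one cluster label per row, one per column, and a g x m x v array of block means. A model-based method such as TLBM would replace the hard assignments with posterior probabilities and the squared error with a likelihood matched to the data type (continuous, binary, or contingency).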
