Papers (not up-to-date) by Dmitry Ignatov
Frequent Itemset Mining for Clustering Near Duplicate Web Documents
A vast number of documents on the Web have duplicates, which poses a challenge for developing efficient methods to compute clusters of similar documents. In this paper we use an approach based on computing (closed) sets of attributes with large support (large extent) as clusters of similar documents. The method is tested in a series of computer experiments on large public collections of web documents and compared to other established methods and software, such as biclustering, on the same datasets. The practical efficiency of different algorithms for computing frequent closed sets of attributes is also compared.
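As a minimal sketch of the core idea above, the following brute-force Python fragment computes closed attribute sets with large support over a toy binary document-attribute table; the documents, attributes, and the support threshold are illustrative assumptions, not the paper's experimental pipeline or its efficient closed-set miners.

```python
from itertools import combinations

# Toy binary data: each document is described by a set of attributes (e.g. shingles).
# The documents and the min_support threshold are illustrative assumptions.
docs = {
    "d1": {"a", "b", "c"},
    "d2": {"a", "b", "c"},
    "d3": {"a", "b"},
    "d4": {"b", "d"},
}
min_support = 2  # minimal extent size to report a cluster of similar documents

def extent(attrs):
    """Documents containing every attribute in attrs."""
    return {d for d, a in docs.items() if attrs <= a}

def closure(attrs):
    """Attributes shared by all documents in the extent of attrs."""
    ext = extent(attrs)
    return set.intersection(*(docs[d] for d in ext)) if ext else set()

all_attrs = set.union(*docs.values())
closed_sets = set()
for k in range(1, len(all_attrs) + 1):
    for cand in map(set, combinations(sorted(all_attrs), k)):
        if len(extent(cand)) >= min_support and closure(cand) == cand:
            closed_sets.add(frozenset(cand))

for c in sorted(closed_sets, key=len):
    print(sorted(c), "-> documents:", sorted(extent(c)))
```

Each reported closed attribute set plays the role of a cluster: its extent is the group of near-duplicate documents sharing exactly those attributes.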
Computing Research Repository, 2009
The problem of detecting terms that may be of interest to an advertiser is considered. If a company has already bought some advertising terms describing certain services, it is reasonable to find out which terms have been bought by competing companies; a part of them can be recommended to the company as future advertising terms. The goal of this work is to propose more interpretable recommendations based on FCA and association rules.
A novel approach to triclustering of three-way binary data is proposed. A tricluster is defined in terms of Triadic Formal Concept Analysis as a dense triset of a binary relation Y describing the relationship between objects, attributes and conditions. This definition is a relaxation of the triconcept notion and makes it possible to find all triclusters and triconcepts contained in triclusters of large datasets. This approach generalizes the similar study of concept-based biclustering.
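For readability, one common way to state the density behind such a "dense triset" is the following (the exact operators and thresholds of the proposed method may differ): for sets $A$, $B$, $C$ of objects, attributes and conditions,

$$\rho(A, B, C) \;=\; \frac{|\,Y \cap (A \times B \times C)\,|}{|A|\,|B|\,|C|},$$

and $(A, B, C)$ is accepted as a tricluster if $\rho(A, B, C) \ge \rho_{\min}$ for a chosen threshold $\rho_{\min} \in (0, 1]$; a triconcept corresponds to a maximal triple with density 1.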
Computing Research Repository, 2009
Owners of a website are often interested in the analysis of groups of users of their site. Information on these groups can help optimize the structure and contents of the site. In this paper we use an approach based on formal concepts for constructing taxonomies of user groups. To reduce the huge number of concepts that arise in applications, we employ the stability index of a concept, which describes how a group given by a concept extent differs from other such groups. We analyze the resulting taxonomies of user groups for three target websites.
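For reference, the standard stability index from the FCA literature (the paper's variant may differ in details): for a formal concept $(A, B)$ with extent $A$ and intent $B$,

$$\sigma(A, B) \;=\; \frac{|\{\, C \subseteq A \;:\; C' = B \,\}|}{2^{|A|}},$$

i.e. the fraction of subsets of the user group $A$ whose set of common attributes is still exactly $B$. Groups with high stability do not hinge on the presence of a few individual users.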
Papers by Dmitry Ignatov

Research Square, Jun 1, 2023
The article describes an approach to the construction of complex distributed cyber-physical systems with a high level of architectural dynamics built on fog and edge computing platforms. The key idea of the developed approach is to use digital twins as dynamic models of the observed and managed systems, which are kept up to date by processing the event flow received in the form of logs. A reference architecture of a dynamic runtime digital twin is proposed. Possible approaches to the synthesis of models that form the basis of digital twins are discussed. Examples of using the proposed approach to solve practical problems are given. The described approach may be of interest to specialists engaged in research and development of various kinds of information systems realized on IoT platforms, such as smart cities, smart transport, medical information systems, etc.

We show how purpose can be used as a central guiding principle for organizing knowledge about artifacts. It allows the actions in which the artifact participates to be related naturally to other objects. Similarly, the structure or parts of the artifact can also be related to the actions. A knowledge base called PurposeNet has been built using these principles. A comparison with other knowledge bases shows that it is a superior method in terms of coverage. It also makes automatic extraction of simple facts (or information) from text possible for populating a richly structured knowledge base. An experiment in domain-specific question answering from a given passage shows that PurposeNet, used along with scripts (or knowledge of stereotypical situations), can lead to substantially higher accuracy in question answering. In the domain of car racing, individually they produce correct answers to 50% and 37.5% of the questions, respectively, but together they produce 89% correct answers.

arXiv (Cornell University), Feb 23, 2016
Triadic Formal Concept Analysis (3FCA) was introduced by Lehmann and Wille almost two decades ago. Many researchers in Data Mining and Formal Concept Analysis work with the notions of closed sets, Galois and closure operators, and closure systems; however, even though different researchers actively work on mining triadic and n-ary relations, a proper closure operator for the enumeration of triconcepts, i.e. maximal triadic cliques of tripartite hypergraphs, has not been introduced to date. In this paper we show that the previously introduced operators for obtaining triconcepts and maximal connected and complete sets (MCCSs) are not always consistent, and we provide the reader with a definition of a valid closure operator and the associated set system. Moreover, we study the difficulties of related problems from order-theoretic and combinatorial points of view and provide the reader with justifications of the complexity classes of these problems.
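As a point of reference (the standard notion from Lehmann and Wille, not the new operator proposed in the paper): given a triadic context $(G, M, C, Y)$ with $Y \subseteq G \times M \times C$, a triple $(A_1, A_2, A_3)$ with $A_1 \subseteq G$, $A_2 \subseteq M$, $A_3 \subseteq C$ is a triconcept iff

$$A_1 \times A_2 \times A_3 \subseteq Y$$

and $(A_1, A_2, A_3)$ is maximal with respect to component-wise set inclusion among all triples satisfying this containment; these are exactly the maximal triadic cliques of the corresponding tripartite hypergraph.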
arXiv (Cornell University), Feb 23, 2014
This paper presents an analysis of data from a gift-exchange-game experiment. The experiment was described in 'The Impact of Social Comparisons on Reciprocity' by Gächter et al., 2012. Since this paper uses state-of-the-art data science techniques, the results provide a different point of view on the problem. As already shown in the relevant literature from experimental economics, human decisions deviate from rational payoff maximization. The average gift rate was 31%, and the gift rate was not zero under any condition. Further, we derive some specific findings and calculate their significance.

International Joint Conference on Artificial Intelligence, 2021
In this paper we study certain properties of the GreConD algorithm for Boolean matrix factorisation, a popular technique in Data Mining with binary relational data. This greedy algorithm was inspired by the fact that an optimal set of factors for Boolean matrix factorisation can be chosen among the formal concepts of the corresponding formal context. In particular, we consider one of the hardest cases (in terms of the number of possible factors), the so-called contranominal scales, and show that the output of GreConD is not optimal in this case. Moreover, we formally analyse its output by means of recurrences and generating functions and provide the reader with a closed form for the number of returned factors. We also provide an algorithm generating the optimal number of factors and the corresponding product matrices P and Q for the case of contranominal scales.
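The following is a minimal, hedged sketch of a GreConD-style greedy step applied to a small contranominal scale (the complement of an identity matrix); the tie-breaking, data, and output format are illustrative assumptions, and the paper's formal analysis and optimal algorithm are not reproduced here.

```python
def derive_down(I, attrs):
    """Objects (row indices) having all attributes in attrs."""
    return {i for i in range(len(I)) if all(I[i][j] for j in attrs)}

def derive_up(I, objs):
    """Attributes (column indices) shared by all objects in objs."""
    return {j for j in range(len(I[0])) if all(I[i][j] for i in objs)}

def greedy_concept_cover(I):
    """GreConD-style greedy Boolean factorisation: cover the 1s of I by formal concepts."""
    uncovered = {(i, j) for i, row in enumerate(I) for j, v in enumerate(row) if v}
    factors = []
    while uncovered:
        intent, best_gain = set(), 0
        improved = True
        while improved:
            improved = False
            for j in range(len(I[0])):
                if j in intent:
                    continue
                ext = derive_down(I, intent | {j})
                cand = derive_up(I, ext) if ext else set()
                gain = len({(i, k) for i in ext for k in cand} & uncovered)
                if gain > best_gain:  # extend the factor only while coverage grows
                    best_gain, best_intent, best_ext = gain, cand, ext
                    improved = True
            if improved:
                intent = best_intent
        factors.append((best_ext, intent))
        uncovered -= {(i, k) for i in best_ext for k in intent}
    return factors

# Contranominal scale of size 3: complement of the identity matrix.
I = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(greedy_concept_cover(I))  # list of (extent, intent) factors
```

Each factor pairs an extent (a set of objects) with a concept intent; the number of factors such a greedy cover returns on larger contranominal scales is the kind of quantity the paper analyses in closed form.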
Concept Lattices and their Applications, 2016
We propose a new algorithm for consensus clustering, FCA-Consensus, based on Formal Concept Analysis. As input, the algorithm takes T partitions of a certain set of objects obtained by the k-means algorithm after T runs from different initialisations. The resulting consensus partition is extracted from an antichain of the concept lattice built on a formal context objects × classes, where the classes are the set of all cluster labels from each initial k-means partition. We compare the results of the proposed algorithm in terms of the ARI measure with state-of-the-art algorithms on synthetic datasets. Under certain conditions, the best ARI values are demonstrated by FCA-Consensus.
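A hedged sketch of only the context-construction step (the antichain extraction from the concept lattice is omitted; object names and labelings are illustrative assumptions):

```python
# T partitions of the same objects, e.g. from T runs of k-means
# with different initialisations (illustrative labelings, not real data).
partitions = [
    {"o1": 0, "o2": 0, "o3": 1, "o4": 1},
    {"o1": 1, "o2": 1, "o3": 0, "o4": 2},
]

# Formal context objects x classes: attribute (t, c) reads
# "assigned to cluster c in run t".
objects = sorted(partitions[0])
attributes = sorted({(t, c) for t, p in enumerate(partitions) for c in p.values()})
crosses = {(o, (t, c)) for o in objects for (t, c) in attributes if partitions[t][o] == c}
print(len(objects), "objects,", len(attributes), "attributes,", len(crosses), "crosses")
```

The concept lattice of this context groups together objects that share cluster labels across runs; the consensus partition is then read off a suitable antichain of its concepts.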

Springer Proceedings in Mathematics & Statistics, 2020
This paper is related to the problem of finding the maximal quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in the bigraph is its "almost" complete subgraph. The relaxation of completeness can be understood in various ways; here, we assume that the subgraph is a γ-quasi-biclique if it lacks a certain number of edges to form a biclique such that its density is at least γ ∈ (0, 1]. For a bigraph and fixed γ, the problem of searching for the maximal quasi-biclique consists of finding a subset of vertices of the bigraph such that the induced subgraph is a quasi-biclique and its size is maximal. Several models based on Mixed Integer Programming (MIP) to search for a quasi-biclique are proposed and tested for efficiency. An alternative model inspired by biclustering is formulated and tested; this model simultaneously maximizes both the size of the quasi-biclique and its density, using a least-squares criterion similar to the one exploited by the TriBox triclustering method.
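In symbols, the density relaxation described above reads as follows (how the "size" of the subgraph is measured, e.g. $|A| + |B|$ or $|A|\cdot|B|$, is a modelling choice; the MIP formulations themselves are not reproduced here): for a bigraph $G = (U \cup V, E)$ and vertex subsets $A \subseteq U$, $B \subseteq V$, the induced subgraph is a $\gamma$-quasi-biclique if

$$\rho(A, B) \;=\; \frac{|\,E \cap (A \times B)\,|}{|A|\,|B|} \;\ge\; \gamma, \qquad \gamma \in (0, 1],$$

and the search problem is to maximise the size of $(A, B)$ subject to this density constraint.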
Concept Lattices and their Applications, 2015
In this work we propose and study an approach for collaborative filtering which is based on Boolean matrix factorisation and exploits additional (context) information about users and items. To avoid similarity loss in the case of a Boolean representation, we use an adjusted type of projection of a target user onto the obtained factor space. We have compared the proposed method with an SVD-based approach on the MovieLens dataset. The experiments demonstrate that the proposed method has better MAE and Precision and comparable Recall and F-measure. We also report an increase in quality in the presence of context information.

Applied Soft Computing, Jun 1, 2019
Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression methods in the context of their deployment on devices: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference, and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. The experimental study with the Penn Treebank (PTB) dataset shows that the most efficient results in terms of speed and compression-perplexity balance are obtained by matrix decomposition techniques.
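A minimal numpy sketch of the low-rank factorisation idea mentioned above (the matrix shape, rank, and random data are illustrative assumptions; pruning, quantization, and tensor-train decomposition from the paper's pipeline are not shown):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((10000, 650))  # e.g. an output projection onto a large vocabulary
rank = 64                              # illustrative target rank

# Truncated SVD: W ~= A @ B with A of shape (10000, rank) and B of shape (rank, 650).
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * S[:rank]
B = Vt[:rank, :]

compression = W.size / (A.size + B.size)
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"parameter compression: {compression:.1f}x, relative error: {rel_error:.3f}")
```

Replacing the single large matrix by the two factors trades a controlled amount of approximation error for a large reduction in parameters, which is the size-perplexity trade-off discussed above.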
Reduction of the number of parameters is one of the most important goals in Deep Learning. In this article we propose an adaptation of Doubly Stochastic Variational Inference for Automatic Relevance Determination (DSVI-ARD) for neural network compression. We find this method to be especially useful in language modeling tasks, where the large number of parameters in the input and output layers is often excessive. We also show that DSVI-ARD can be applied together with encoder-decoder weight tying, allowing even better sparsity and performance to be achieved. Our experiments demonstrate that more than 90% of the weights in both the encoder and decoder layers can be removed with a minimal quality loss.
This short paper is related to the problem of finding maximum quasi-bicliques in a bipartite graph (bigraph). A quasi-biclique in a bigraph is its "almost" complete subgraph; here, we assume that the subgraph is a quasi-biclique if it lacks γ·100% of the edges needed to become a biclique. The problem of finding the maximal quasi-biclique(s) consists of finding subset(s) of vertices of an input bigraph such that the subgraph induced by these subsets is a quasi-biclique and its size is maximal. A model based on mixed integer programming (MIP) to search for a quasi-biclique is proposed and tested. Another variant of the model is also tested, one that simultaneously maximizes both the size of the quasi-biclique and its density, using a least-squares criterion similar to the one exploited by the Tri-Box method for tricluster generation. Therefore, the output patterns can be called large dense biclusters as well.

International Joint Conference on Artificial Intelligence, 2018
In this paper, we explain how the Galois connection and related operators between sets of users and items naturally arise in user-item data for forming neighbourhoods of a target user or item for Collaborative Filtering. We compare the properties of these operators and their applicability in simple collaborative user-to-user and item-to-item settings. Moreover, we propose a new neighbourhood-forming operator based on pair-wise similarity ranking of users, which takes an intermediate place between the studied closure operators and their relaxations in terms of neighbourhood size and demonstrates a comparatively good Precision-Recall trade-off. In addition, we compare the studied neighbourhood-forming operators in the collaborative filtering setting against a simple but strong benchmark, the SlopeOne algorithm, using bimodal cross-validation on the MovieLens dataset.
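A small sketch of the derivation ("prime") operators from which such neighbourhoods can be formed (the toy user-item relation is an illustrative assumption; the paper's ranked relaxation is not reproduced):

```python
# Toy binary user-item relation: which items each user has rated (illustrative).
R = {
    "u1": {"i1", "i2", "i3"},
    "u2": {"i1", "i2", "i3", "i4"},
    "u3": {"i2", "i4"},
}

def items_of(users):
    """Items common to all given users (the 'prime' of a user set)."""
    return set.intersection(*(R[u] for u in users)) if users else set()

def users_of(items):
    """Users possessing all given items (the 'prime' of an item set)."""
    return {u for u, its in R.items() if items <= its}

target = "u1"
# Closure-style neighbourhood: users who share all the items of the target user.
neighbourhood = users_of(items_of({target})) - {target}
print(neighbourhood)  # {'u2'}
```

Relaxing the requirement, e.g. to users sharing at least a fixed number of the target's items, enlarges the neighbourhood; the operator proposed in the paper sits between such relaxations and the strict closure shown here.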

Concept Lattices and their Applications, 2020
We propose the use of two power indices from cooperative game theory and public choice theory for ranking attributes of closed sets, namely intents of formal concepts (or closed itemsets). The introduced indices are related to extensional concept stability and are based on counting generators, especially those that contain a selected attribute. The introduction of such indices is motivated by so-called interpretable machine learning, which supposes that we have not only the class membership decision of a trained model for a particular object, but also a set of attributes (in the form of JSM-hypotheses or other patterns) along with the individual importance of their single attributes (or more complex constituent elements). We characterise the computation of the Shapley and Banzhaf values of a formal concept in terms of minimal generators and their order filters, provide the reader with their properties important for computation purposes, and show experimental results.
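For reference, the classical Shapley value on which such attribute rankings build (stated in its standard game-theoretic form; the concept-specific characteristic function $v$, e.g. defined via generators as in the paper, is not reproduced here): for an attribute $m$ in an intent $B$,

$$\varphi_m(v) \;=\; \sum_{S \subseteq B \setminus \{m\}} \frac{|S|!\,\bigl(|B| - |S| - 1\bigr)!}{|B|!}\,\bigl(v(S \cup \{m\}) - v(S)\bigr),$$

i.e. the average marginal contribution of $m$ over all orders in which the attributes of $B$ can be added.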
arXiv (Cornell University), Jul 20, 2015
We propose a new algorithm for recommender systems with numeric ratings which is based on Pattern Structures (RAPS). As input, the algorithm takes a rating matrix, e.g., one containing movies rated by users. For a target user, the algorithm returns a rated list of items (movies) based on the user's previous ratings and the ratings of other users. We compare the results of the proposed algorithm in terms of precision and recall measures with Slope One, one of the state-of-the-art item-based algorithms, on the MovieLens dataset, and RAPS demonstrates the best or comparable quality.
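A hedged sketch of the interval pattern-structure similarity operation commonly used for numeric ratings (component-wise interval meet); the rating vectors are illustrative, and this is not necessarily the exact description operator used in RAPS:

```python
# Ratings of the same items by two users; None marks a missing rating (illustrative).
u = [5, 3, None, 4]
v = [4, 3, 2, None]

def interval_meet(a, b):
    """Component-wise similarity of two rating vectors: the smallest interval
    containing both ratings, skipping items missing for either user."""
    return [
        None if x is None or y is None else (min(x, y), max(x, y))
        for x, y in zip(a, b)
    ]

print(interval_meet(u, v))  # [(4, 5), (3, 3), None, None]
```

Narrow intervals on many shared items indicate users with similar tastes, whose ratings can then be aggregated for the target user.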
arXiv (Cornell University), Feb 23, 2016
Triadic Formal Concept Analysis (3FCA) was introduced by Lehmann and Wille almost two decades ago, but even though different researchers actively work on this branch of FCA, a proper closure operator for the enumeration of triconcepts, i.e. maximal triadic cliques of tripartite hypergraphs, has not been introduced to date. In this paper we show that the previously introduced operators for obtaining triconcepts and maximal connected and complete sets (MCCS) are not always consistent and provide the reader with the definition of a valid closure operator and the associated set system.

arXiv (Cornell University), Sep 25, 2022
In this paper we count set closure systems (also known as Moore families) for the case when all singleton sets are closed. In particular, we give the numbers of such strict (empty set included) and non-strict families for a base set of size n = 6. We also provide the number of such inequivalent Moore families with respect to all permutations of the base set up to n = 6. A search in the OEIS and the existing literature revealed that the numbers found coincide with the entry for D. M. Davis' set union lattice (A235604, up to n = 5) and with |L_n|, the number of atomic lattices on n atoms, obtained by S. Mapes (up to n = 6), respectively. Thus we study all these cases, establish one-to-one correspondences between them via Galois adjunctions and Formal Concept Analysis, and provide the reader with two of our enumerative algorithms as well as the results of these algorithms used for additional tests. Other results include the largest size of intersection-free families for n = 6 together with our conjecture for n = 7, an upper bound for the number of atomic lattices L_n, and some structural properties of L_n based on the theory of extremal lattices.
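A brute-force sanity check for very small n, under one natural reading of the definitions (the family contains the base set and all singletons and is closed under pairwise intersection); this is not one of the paper's enumerative algorithms, and the strict/non-strict distinction is not modelled:

```python
from itertools import chain, combinations

def count_closure_systems_with_singletons(n):
    """Count families on {0,..,n-1} containing the base set and all singletons
    and closed under pairwise intersection (brute force, tiny n only)."""
    subsets = [frozenset(s) for s in chain.from_iterable(
        combinations(range(n), k) for k in range(n + 1))]
    required = {frozenset(range(n))} | {frozenset({i}) for i in range(n)}
    optional = [s for s in subsets if s not in required]

    count = 0
    for k in range(len(optional) + 1):
        for extra in combinations(optional, k):
            fam = required | set(extra)
            if all(a & b in fam for a in fam for b in fam):
                count += 1
    return count

for n in range(1, 5):
    print(n, count_closure_systems_with_singletons(n))
```

Note that for n ≥ 2 the empty set is forced into every such family, since the intersection of two distinct singletons is empty.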