Papers by Hung Son Nguyen

IEEE Access, 2021
Mining frequent subgraphs is an interesting and important problem in the graph mining field. In particular, mining frequent subgraphs from a single large graph has been developed extensively and has recently attracted many researchers. Among the existing methods, MNI-based approaches such as the GraMi algorithm are considered state-of-the-art. Besides frequent subgraph mining (FSM), closed frequent subgraph mining has also been developed; it has many practical applications and is a fundamental premise for many studies. This paper proposes the CloGraMi (Closed Frequent Subgraph Mining) algorithm, based on GraMi, to find all closed frequent subgraphs in a single large graph. Two effective strategies are also developed: the first is a new level-order traversal strategy that quickly determines closed subgraphs during the search, and the second sets a condition for early pruning of a large portion of non-closed candidates. Both aim to reduce the running time and memory requirements of the proposed algorithm. Our experiments are performed on five real datasets (both directed and undirected graphs), and the results show that the running time and memory requirements of our algorithm are significantly better than those of the GraMi-based algorithm.
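
The abstract above relies on MNI (minimum image-based) support and on the notion of a closed frequent subgraph. As a rough illustration only, and not the CloGraMi or GraMi implementation, the sketch below computes MNI support from an already enumerated list of embeddings and checks closedness against the supports of immediate supergraphs. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def mni_support(embeddings):
    """Minimum image-based support of a pattern.

    embeddings: list of dicts, each mapping a pattern node to the graph
    node it is matched to in one occurrence of the pattern.
    """
    images = defaultdict(set)
    for emb in embeddings:
        for pattern_node, graph_node in emb.items():
            images[pattern_node].add(graph_node)
    # The support is the smallest number of distinct images of any pattern node.
    return min((len(nodes) for nodes in images.values()), default=0)


def is_closed(support, supergraph_supports):
    """A pattern is closed if no immediate supergraph has the same support.

    MNI support is anti-monotone, so supergraph supports never exceed
    the support of the pattern itself.
    """
    return all(s < support for s in supergraph_supports)


# Toy data (hypothetical): three occurrences of a one-edge pattern u-v.
embeddings = [{"u": 1, "v": 2}, {"u": 1, "v": 3}, {"u": 4, "v": 2}]
support = mni_support(embeddings)
```

In this toy case both pattern nodes have two distinct images, so the support is 2; the pattern would be reported as closed only if every one-edge extension had support below 2.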

Advances in Intelligent Systems and Computing, 2015
In this paper we describe the architecture of a simple evacuation model which is based on a graph representation of the scene. Such graphs are typically constructed using the Medial Axis Transform (MAT) or the Straight Medial Axis Transform (S-MAT), the former being a part of the Voronoi diagram (Dirichlet tessellation) of the floor plan. In our work we construct such graphs for floor plans using the Voronoi diagram along with the dual Delaunay triangulation of a set of points approximating the scene. Information supplied by the Delaunay triangulation complements the graph in two ways: it determines capacities of some paths associated with edges, and provides a bijection between graph vertices and a set of regions forming a partition of the floor plan. We call the representation granular for this reason. We provide an exposition of the fire scene representation that aids egress time calculations, discuss the algorithm for constructing this representation, and briefly discuss the applicability of network flow models (e.g. the Ford-Fulkerson method or the push-relabel method) in our setting.
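
To make the network flow idea mentioned above concrete, here is a minimal sketch of the Edmonds-Karp variant of the Ford-Fulkerson method on a tiny floor-plan graph. It is not the authors' model: the node names, the capacities (read here as persons per second) and the function name `max_flow` are assumptions made for illustration.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: Ford-Fulkerson with shortest (BFS) augmenting paths.

    capacity: dict of dicts, capacity[u][v] is the capacity of edge u -> v.
    """
    # Residual graph with reverse edges initialised to zero.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow

        # Find the bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical floor plan: edge capacities stand for path widths.
floor_plan = {
    "room": {"corridor": 4, "hall": 2},
    "corridor": {"exit": 3},
    "hall": {"exit": 5},
}
egress_rate = max_flow(floor_plan, "room", "exit")
```

Here the corridor-to-exit edge and the room-to-hall edge are saturated, which is the kind of bottleneck a flow analysis would point to.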

Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, 2014
In this paper we consider a problem of automatic labeling of textual data with concepts explicitly defined in an external knowledge base. We describe our tagging system and we also present a framework for adaptive learning of associations between terms or phrases from the texts and the concepts. Those associations are then utilized by our semantic interpreter, which is based on the Explicit Semantic Analysis (ESA) method, in order to label scientific articles indexed by our SONCA platform. Apart from the description of the learning algorithm, we show a few practical application examples of our system, in which it was used for tagging scientific articles with headings from the MeSH ontology, categories from the ACM Computing Classification System, and categories from the OECD Fields of Science and Technology Classification.

In this paper, we investigate the problem of quality analysis of clustering results using semantic annotations given by experts. We propose a novel approach to the construction of an evaluation measure, which is based on the Minimal Description Length (MDL) principle. The proposed measure, called SEE (Semantic Evaluation by Exploration), is an improvement on existing evaluation methods such as the Rand Index or Normalized Mutual Information, and it fixes some of the weaknesses of the original methods. We illustrate the proposed evaluation method on the freely accessible biomedical research articles from PubMed Central (PMC). Many articles from PubMed Central are annotated by experts using the Medical Subject Headings (MeSH) thesaurus. This paper is a part of the research on designing and developing a dialog-based semantic search engine for the SONCA system, which is a part of the SYNAT project. We compare different semantic techniques for search result clustering using the proposed measure.
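
For reference, the Rand Index named above as one of the existing baselines can be sketched in a few lines. This is only the classical baseline, not the SEE measure proposed in the paper, and the label lists are made-up toy data.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two labelings agree.

    A pair agrees when both labelings put the two objects together,
    or both put them apart. Requires at least two objects.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Toy data (hypothetical): cluster ids versus expert-assigned headings.
clusters = [0, 0, 1, 1]
expert = ["a", "a", "a", "b"]
score = rand_index(clusters, expert)
```

On this toy input three of the six pairs agree, so the score is 0.5.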

Proceedings of the 2014 Federated Conference on Computer Science and Information Systems, 2014
In this paper we describe an evacuation modeling framework based on a graph representation of the scene which is derived from its geometric description. Typically such graphs (geometric networks) are constructed using the Medial Axis Transform (MAT) or the Straight Medial Axis Transform (S-MAT). In our work we use the Voronoi tessellation of a set of points approximating the scene (a single floor plan) along with its dual graph, the Delaunay triangulation. Using these two graphs we extract not only the information about paths in the building, but also information about path widths and areas assigned to vertices. Typically only path lengths from MAT or S-MAT based geometric networks are used in evacuation modeling. Our approach enables us to include flow analysis and, e.g., locate bottlenecks. We discuss a typical density-based evacuation model coupled with a partial behavioral evacuation model within the proposed framework.
Lecture Notes in Computer Science, 2012
The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
CS&P, 2019
Tolerance Rough Set Model (TRSM) is an extension of Rough Set theory and can be used as a tool for approximation of hidden concepts in collections of documents. In recent years, numerous successful applications of TRSM in web intelligence, including text classification, clustering, thesaurus generation, semantic indexing, and semantic search, have been proposed. This paper reviews the basic concepts of TRSM, some of its possible extensions and some typical applications of TRSM in text mining. We also discuss some further research on TRSM.
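
As a hedged illustration of how TRSM approximates a document, the sketch below uses the common co-occurrence formulation: the tolerance class of a term contains the term itself plus every term that co-occurs with it in at least `theta` documents. The function names, the threshold and the toy corpus are assumptions for illustration, not code from the paper.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(documents, theta):
    """Map each term to its tolerance class.

    documents: list of sets of terms.
    theta: minimum number of documents in which two terms must co-occur.
    """
    cooccurrence = Counter()
    terms = set()
    for doc in documents:
        terms |= doc
        for a, b in combinations(sorted(doc), 2):
            cooccurrence[(a, b)] += 1
    classes = {t: {t} for t in terms}
    for (a, b), count in cooccurrence.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Terms whose tolerance class shares at least one term with doc."""
    return {t for t, cls in classes.items() if cls & doc}


def lower_approximation(doc, classes):
    """Terms whose whole tolerance class is contained in doc."""
    return {t for t, cls in classes.items() if cls <= doc}


# Toy corpus (hypothetical).
corpus = [
    {"rough", "set"},
    {"rough", "set", "cluster"},
    {"cluster", "search"},
]
classes = tolerance_classes(corpus, theta=2)
query = {"rough", "search"}
upper = upper_approximation(query, classes)
lower = lower_approximation(query, classes)
```

With this corpus only "rough" and "set" co-occur twice, so the upper approximation of the query gains the related term "set" while the lower approximation keeps only "search".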
Springer eBooks, 2013
Mathematical publications are often labelled with Mathematics Subject Classification (MSC) codes. These codes are grouped in a treelike hierarchy created by experts. In this paper we posit that this hierarchy is highly correlated with the content of publications. Following this assumption, we try to reconstruct the MSC tree based on our publication corpora. Results are compared to the original hierarchy and conclusions are drawn.
Studies in computational intelligence, 2013
One of the common problems when dealing with digital libraries is the lack of classification codes in some of the documents. In this publication we deal with this problem in the multi-label, hierarchical case of the Mathematics Subject Classification system. We develop modifications of the ML-KNN algorithm and show how they improve the results given by the algorithm on the example of Springer textual data.

On Exploring Soft Discretization of Continuous Attributes
Cognitive technologies, 2004
Searching for a binary partition of attribute domains is an important task in data mining. It is present in both decision tree construction and discretization. The most important advantages of decision tree methods are compactness and clearness of knowledge representation as well as high accuracy of classification. Decision tree algorithms also have some drawbacks. In cases of large data tables, existing decision tree induction methods are often inefficient in both computation and description aspects. Another disadvantage of standard decision tree methods is their instability, i.e., small data deviations may require a significant reconstruction of the decision tree. We present novel soft discretization methods using soft cuts instead of traditional crisp (or sharp) cuts. This new concept makes it possible to generate more compact and stable decision trees with high accuracy of classification. We also present an efficient method for soft cut generation from large databases.
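
A minimal sketch of the soft cut idea, under the assumption that a soft cut on an attribute is given by a left bound and a right bound: values below the left bound go to one branch, values above the right bound go to the other, and values in between are treated as uncertain. The function name and the example bounds are hypothetical; the paper's own definitions and its handling of the uncertain region may differ.

```python
def soft_cut_side(value, left, right):
    """Classify a value against a soft cut with bounds left <= right.

    Returns "left", "right", or "uncertain" for values in [left, right].
    """
    if left > right:
        raise ValueError("left bound must not exceed right bound")
    if value < left:
        return "left"
    if value > right:
        return "right"
    return "uncertain"


# Hypothetical soft cut on a temperature attribute with bounds 36.5 and 37.5.
sides = [soft_cut_side(v, 36.5, 37.5) for v in (36.0, 37.0, 38.0)]
```

A small perturbation of a value near the cut moves it into or out of the uncertain band rather than flipping it between branches, which is the intuition behind the stability claim in the abstract.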
Electronic Notes in Theoretical Computer Science, Mar 1, 2003
Towards VNUMED for Healthcare Research Activities in Vietnam
Ercim News, 2019
Tolerance Rough Set Model and Its Applications in Web Intelligence
Web Intelligence/IAT Workshops, 2013
Tolerance Rough Set Model (TRSM) has been introduced as a tool for approximation of hidden concepts in text databases. In recent years, numerous successful applications of TRSM in web intelligence, including text classification, clustering, thesaurus generation, semantic indexing, and semantic search, have been proposed. This paper will review the fundamental concepts of TRSM, some of its possible extensions and some typical applications of TRSM in text mining. Moreover, the architecture of a semantic information retrieval system, called SONCA, will be presented to demonstrate the main idea as well as to stimulate further research on TRSM.
Speeding Up Recommender Systems Using Association Rules
Lecture Notes in Computer Science, 2022
A Method of Web Search Result Clustering Based on Rough Sets
A method of web search result clustering based on rough sets. Chi Lang Ngo, Hung Son Nguyen (∗), Institute of Mathematics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland. (∗) E-mail (contact person): [email protected] Abstract ...

Rough Sets and Boolean Reasoning
IGI Global eBooks, May 24, 2011
This chapter presents the Boolean reasoning approach to problem solving and its applications in rough sets. The Boolean reasoning approach has become a powerful tool for designing effective and accurate solutions for many problems in decision-making, approximate reasoning and optimization. In recent years, Boolean reasoning has become a recognized technique for developing many interesting concept approximation methods in rough set theory. This chapter presents a general framework for concept approximation by combining the classical Boolean reasoning method with many modern techniques in machine learning and data mining. This modified approach, called the “approximate Boolean reasoning” methodology, has been proposed as an even more powerful tool for problem solving in rough set theory and its applications in data mining. Through some of the most representative applications in many KDD problems, including feature selection, feature extraction, data preprocessing, classification of decision rules and decision trees, and association analysis, the author hopes to convince the reader that the proposed approach not only maintains all the merits of its antecedent but also offers the possibility of balancing the quality of the designed solution against its computational time.
Discretization Problem for Rough Sets Methods
Springer eBooks, 1998
We study the relationship between the reduct problem in Rough Sets theory and the problem of real value attribute discretization. We consider the problem of searching for a minimal set of cuts on attribute domains that preserves discernibility of objects with respect to any ...

Rough Sets and Current Trends in Computing: 5th International Conference, RSCTC 2006, Kobe, Japan, November 6-8, 2006, Proceedings (Lecture Notes in Computer Science)
Springer eBooks, Nov 1, 2006
Invited Papers.- Decision Trees and Flow Graphs.- Granular Computing - The Concept of Generalized Constraint-Based Computation.- Bipolar Representations in Reasoning, Knowledge Extraction and Decision Processes.- Kansei Engineering and Rough Sets Model.- Stochastic Approach to Rough Set Theory.- Commemorative Papers for Professor Pawlak.- Zdzisław Pawlak Commemorating His Life and Work.- Pawlak Rough Set Model, Medical Reasoning and Rule Mining.- Logics in Rough Sets.- Algebras of Terms in Pawlak's Information Systems.- Monads Can Be Rough.- On Testing Membership to Maximal Consistent Extensions of Information Systems.- The Research of Rough Sets in Normed Linear Space.- Two Kinds of Rough Algebras and Brouwer-Zadeh Lattices.- Logics in Fuzzy Sets.- Balanced Fuzzy Gates.- Triangle Algebras: Towards an Axiomatization of Interval-Valued Residuated Lattices.- Fuzzy-Rough Hybridization.- An Approach to Parameterized Approximation of Crisp and Fuzzy Sets.- Rough Fuzzy Set Approximations in Fuzzy Formal Contexts.- Webpage Classification with ACO-Enhanced Fuzzy-Rough Feature Selection.- Approximate and Uncertain Reasoning.- Association Reducts: Complexity and Heuristics.- Planning Based on Reasoning About Information Changes.- Rough Approximation Operators in Covering Approximation Spaces.- Variable Precision Rough Set Models.- A New Method for Discretization of Continuous Attributes Based on VPRS.- On Variable Consistency Dominance-Based Rough Set Approaches.- Variable-Precision Dominance-Based Rough Set Approach.- Incomplete/Nondeterministic Information Systems.- Applying Rough Sets to Data Tables Containing Imprecise Information Under Probabilistic Interpretation.- Ensembles of Decision Rules for Solving Binary Classification Problems in the Presence of Missing Values.- Expanding Tolerance RST Models Based on Cores of Maximal Compatible Blocks.- Local and Global Approximations for Incomplete Data.- Missing Template Decomposition Method and Its Implementation in Rough Set Exploration System.- On Possible Rules and Apriori Algorithm in Non-deterministic Information Systems.- Decision Support.- Generalized Conflict and Resolution Model with Approximation Spaces.- Rough Set Approach to Customer Satisfaction Analysis.- Utility Function Induced by Fuzzy Target in Probabilistic Decision Making.- Multi-criteria Decision Support.- Dominance-Based Rough Set Approach to Decision Involving Multiple Decision Makers.- Quality of Rough Approximation in Multi-criteria Classification Problems.- Rough-Set Multiple-Criteria ABC Analysis.- Rough Sets in KDD.- A Method of Generating Decision Rules in Object-Oriented Rough Set Models.- Knowledge Reduction in Set-Valued Decision Information System.- Local Reducts and Jumping Emerging Patterns in Relational Databases.- Mining Rough Association from Text Documents.- NetTRS - Induction and Postprocessing of Decision Rules.- Outlier Detection Based on Rough Membership Function.- Rough Sets in Medicine.- An Approach to a Rough Set Based Disease Inference Engine for ECG Classification.- Attribute Selection for EEG Signal Classification Using Rough Sets and Neural Networks.- Automatic Planning of Treatment of Infants with Respiratory Failure Through Rough Set Modeling.- Developing a Decision Model for Asthma Exacerbations: Combining Rough Sets and Expert-Driven Selection of Clinical Attributes.- Granular Computing.- A GrC-Based Approach to Social Network Data Protection.- An Interpretation of Flow Graphs by Granular Computing.- Attribute Reduction Based on Granular Computing.- Methodological Identification of Information Granules-Based Fuzzy Systems by Means of Genetic Optimization.- Optimization of Information Granulation-Oriented Fuzzy Set Model Using Hierarchical Fair Competition-Based Parallel Genetic Algorithms.- Grey Systems.- A Grey-Based Rough Set Approach to Suppliers Selection Problem.- A Hybrid Grey-Based Dynamic Model for International Airlines Amount Increase Prediction.- On the Combination of Rough Set Theory and Grey Theory Based on Grey Lattice Operations.- Ontology and Mereology.- An Ontology-Based First-Order Modal Logic.- Enhancing a Biological Concept Ontology to Fuzzy Relational Ontology with Relations Mined from Text.- On a Parthood Specification Method for Component Software.- Ontology Driven Concept Approximation.- Statistical Methods.- A Statistical Method for Determining Importance of Variables in an Information System.- Distribution of Determinants of Contingency Matrix.- Interpretation of Contingency Matrix Using Marginal Distributions.- Machine Learning.- A Model of Machine Learning Based on User Preference of Attributes.- Combining Bi-gram of Character and Word to Classify Two-Class Chinese Texts in Two Steps.- Combining Monte Carlo Filters with Support Vector Machines for Option Price Forecasting.- Domain Knowledge Assimilation by Learning Complex Concepts.- Learning Compound Decision Functions for Sequential Data in Dialog with Experts.- Sampling of Virtual…
Data mining, also known as knowledge discovery from datasets, has been recognized as an important area of database research. This area can be defined as efficiently discovering interesting patterns from large datasets. In this paper a generic method is proposed to extract interesting periodicities of patterns from large datasets in which the transactions are associated with patterns and with the time intervals in which the patterns hold. Considering the hierarchy associated with time stamps of the form day-date-hour-minutes-seconds, different types of periodic patterns, such as daily, weekly, and monthly patterns, can be extracted.
Rough Sets and Current Trends in Computing : 5th International Conference, RSCTC 2006, Kobe, Japan, November 6-8, 2006, Proceedings
... Our special thanks go to Tsuneo Okura, Mika Kuroda, Daisuke Toyama, Namiko Sugimoto, and Masahiro Kagawa for their help ... Jun Sun, Wenbo Xu, Wei Fang Identification and Speed Control of Ultrasonic Motors Based on Modified Immune Algorithm and Elman Neural Networks ...