Learning from examples in a single graph

2005

Related papers

Learning from Supervised Graphs

Studies in Computational Intelligence, 2007

We describe an approach to learning patterns in relational data represented as a graph. The approach, implemented in the Subdue system, searches for patterns that maximally compress the input graph. Subdue can be used for supervised learning, as well as unsupervised pattern discovery and clustering. Mining graph-based data raises challenges not found in linear attribute-value data. However, additional requirements can further complicate the problem. In particular, we describe how concepts can be learned from training examples which are embedded into a single connected graph, or supervised graph. We demonstrate the technique using data from a NASA SST domain as well as a homeland security domain.

Graph-Based Structural Pattern Learning

2006

Graph-based Mining of Complex Data

Advanced Information and Knowledge Processing

We describe an approach to learning patterns in relational data represented as a graph. The approach, implemented in the Subdue system, searches for patterns that maximally compress the input graph. Subdue can be used for supervised learning, as well as unsupervised pattern discovery and clustering. Mining graph-based data raises challenges not found in linear attribute-value data. However, additional requirements can further complicate the problem. In particular, we describe how Subdue can incrementally process structured data that arrives as streaming data. We also employ these techniques to learn structural concepts from examples embedded in a single large connected graph.

Graph-based hierarchical conceptual clustering

The Journal of Machine Learning …, 2002

Hierarchical conceptual clustering has proven to be a useful, although under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides one such combination of approaches. This work presents SUBDUE and the development of its clustering functionalities. Several examples are used to illustrate the validity of the approach both in structured and unstructured domains, as well as to compare SUBDUE to the Cobweb clustering algorithm. We also develop a new metric for comparing structurally-defined clusterings. Results show that SUBDUE successfully discovers hierarchical clusterings in both structured and unstructured data.

Structural mining of molecular biology data

IEEE Engineering in Medicine and Biology Magazine, 2001

Knowledge Discovery and Data Mining

2001

Substructure discovery in the SUBDUE system

Knowledge Discovery and Data Mining, 1994

Because many databases contain or can be embellished with structural information, a method for identifying interesting and repetitive substructures is an essential component to discovering knowledge in such databases. This paper describes the SUBDUE system, which uses the minimum description length (MDL) principle to discover substructures that compress the database and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. Inclusion of background knowledge guides SUBDUE toward appropriate substructures for a particular domain or discovery goal, and the use of an inexact graph match allows a controlled amount of deviation in the instances of a substructure concept. We describe the application of SUBDUE to a variety of domains. We also discuss approaches to combining SUBDUE with non-structural discovery systems.
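
The compression criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the SUBDUE implementation: it assumes a simplified size measure (vertex count plus edge count) in place of a true MDL bit-length encoding, and it assumes the caller already supplies non-overlapping instances of the candidate substructure, so no graph matching is performed. The function name `compression_value` and its parameters are hypothetical.

```python
def compression_value(vertices, edges, sub_vertex_count, sub_edge_count, instances):
    """Score a candidate substructure by how much it shrinks the graph.

    vertices:  set of vertex ids in the input graph
    edges:     list of (u, v) pairs
    instances: list of disjoint vertex-id sets, one per occurrence of the
               substructure (assumed to be found elsewhere)

    Returns size(G) / (size(S) + size(G compressed by S)), where size is
    vertices + edges. This is a simplified stand-in for an MDL encoding.
    """
    # Map every vertex inside an instance to a single placeholder vertex.
    owner = {}
    for index, instance in enumerate(instances):
        for vertex in instance:
            owner[vertex] = ("SUB", index)

    # Collapse each instance to its placeholder.
    compressed_vertices = {owner.get(v, v) for v in vertices}

    # Drop edges that lie entirely inside one instance (a simplification:
    # they are assumed to belong to the substructure); redirect the rest.
    compressed_edges = [
        (owner.get(u, u), owner.get(v, v))
        for u, v in edges
        if not (u in owner and v in owner and owner[u] == owner[v])
    ]

    size_graph = len(vertices) + len(edges)
    size_sub = sub_vertex_count + sub_edge_count
    size_compressed = len(compressed_vertices) + len(compressed_edges)
    return size_graph / (size_sub + size_compressed)


# Two triangles joined by one edge; the candidate substructure is a triangle.
example_vertices = {1, 2, 3, 4, 5, 6}
example_edges = [(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4), (3, 4)]
example_value = compression_value(
    example_vertices, example_edges, 3, 3, [{1, 2, 3}, {4, 5, 6}]
)
print(example_value)  # 13 / 9, about 1.44
```

Under this measure a value above 1 means the graph plus the substructure definition is smaller after compression than the original graph, so candidates with higher values would be preferred.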

Graph-based data mining

IEEE Intelligent Systems, 2000

The large amount of data collected today is quickly overwhelming researchers' abilities to interpret the data and discover interesting patterns in it. In response to this problem, researchers have developed techniques and systems for discovering concepts in databases [1-3]. Much of the collected data, however, has an explicit or implicit structural component (spatial or temporal), which few discovery systems are designed to handle [4]. So, in addition to the need to accelerate data mining of large databases, there is an urgent need to develop scalable tools for discovering concepts in structural databases. One method for discovering knowledge in structural data is the identification of common substructures within the data. Substructure discovery is the process of identifying concepts describing interesting and repetitive substructures within structural data. The discovered substructure concepts allow abstraction from the detailed data structure and provide relevant attributes for interpreting the data. The substructure discovery method is the basis of Subdue, which performs data mining on databases represented as graphs. The system performs two key data-mining techniques: unsupervised pattern discovery and supervised concept learning from examples. Our test applications have demonstrated the scalability and effectiveness of these techniques on a variety of structural databases.

Substructure Analysis of Metabolic Pathways by Graph-Based Relational Learning

Biomedical Data and Applications, 2009

Systems biology has become a major field of post-genomic bioinformatics research. A biological network containing various objects and their relationships is a fundamental way to represent a bio-system. A graph consisting of vertices and edges between these vertices is a natural data structure to represent biological networks. Substructure analysis of metabolic pathways by graph-based relational learning provides biologically meaningful substructures for system-level understanding of organisms. This chapter presents a graph representation of metabolic pathways to describe all features of metabolic pathways and describes the application of graph-based relational learning for structure analysis on metabolic pathways in both supervised and unsupervised scenarios. We show that the learned substructures can not only distinguish between two kinds of biological networks and generate hierarchical clusters for better understanding of them, but also have important biological meaning.

Concept formation using graph grammars

2002

Recognizing the expressive power of graph representation and the ability of certain graph grammars to generalize, we attempt to use graph grammar learning for concept formation. In this paper we describe our initial progress toward that goal, and focus on how certain graph grammars can be learned from examples. We also establish grounds for using graph grammars in machine learning tasks. Several examples are presented to highlight the validity of the approach.

Graph-Based Analysis of Nuclear Smuggling Data

2009

Much of the data that is collected and analyzed today is structural, consisting not only of entities but also of relationships between the entities. As a result, analysis applications rely on automated structural data mining approaches to find patterns and concepts of interest. This ability to analyze structural data has become a particular challenge in many security-related domains. In these

An Empirical Study of Domain Knowledge and Its Benefits to Substructure Discovery

IEEE Transactions on Knowledge and Data Engineering, 1999

Discovering repetitive, interesting, and functional substructures in a structural database improves the ability to interpret and compress the data. However, scientists working with a database in their area of expertise often search for predetermined types of structures or for structures exhibiting characteristics specific to the domain. This paper presents a method for guiding the discovery process with domain-specific knowledge. In this paper, the SUBDUE discovery system is used to evaluate the benefits of using domain knowledge to guide the discovery process. Domain knowledge is incorporated into SUBDUE following a single general methodology to guide the discovery process. Results show that domain-specific knowledge improves the search for substructures that are useful to the domain and leads to greater compression of the data. To illustrate these benefits, examples and experiments from the computer programming, computer-aided design circuit, and artificially generated domains are presented.

An Empirical Study of Domain Knowledge and Its Benefits to Substructure Discovery

IEEE Transactions on Knowledge and Data Engineering, 1997

Discovering repetitive, interesting, and functional substructures in a structural database improves the ability to interpret and compress the data. However, scientists working with a database in their area of expertise often search for predetermined types of structures or for structures exhibiting characteristics specific to the domain. This paper presents a method for guiding the discovery process with domain-specific knowledge. In this paper, the SUBDUE discovery system is used to evaluate the benefits of using domain knowledge to guide the discovery process. Domain knowledge is incorporated into SUBDUE following a single general methodology to guide the discovery process. Results show that domain-specific knowledge improves the search for substructures that are useful to the domain and leads to greater compression of the data. To illustrate these benefits, examples and experiments from the computer programming, computer-aided design circuit, and artificially generated domains are presented.

The Integration of Graph-Based Knowledge Discovery with Image Segmentation Hierarchies for Data Analysis, Data Mining and Knowledge Discovery

IGARSS 2008 - 2008 IEEE International Geoscience and Remote Sensing Symposium, 2008

Currently available pixel-based image analysis techniques do not effectively extract the information content from the increasingly available high spatial resolution remotely sensed imagery data. We are exploring an approach to object-based image analysis in which hierarchical image segmentations provided by the Recursive Hierarchical Segmentation (RHSEG) software are analyzed by the Subdue graph-based knowledge-discovery system. In this paper we discuss our initial approach to representing the RHSEG-produced hierarchical image segmentations in a graphical form understandable by Subdue, and discuss results from real and simulated data.

Substructure Discovery Using Minimum Description Length and Background Knowledge

Journal of Artificial Intelligence Research, 1994

The ability to identify interesting and repetitive substructures is an essential component to discovering knowledge in structural data. We describe a new version of our SUBDUE substructure discovery system based on the minimum description length principle. The SUBDUE system discovers substructures that compress the original data and represent structural concepts in the data. By replacing previously-discovered substructures in the data, multiple passes of SUBDUE produce a hierarchical description of the structural regularities in the data. SUBDUE uses a computationally-bounded inexact graph match that identifies similar, but not identical, instances of a substructure and finds an approximate measure of closeness of two substructures when under computational constraints. In addition to the minimum description length principle, other background knowledge can be used by SUBDUE to guide the search towards more appropriate substructures. Experiments in a variety of domains demonstrate SUB...

Efficient mining of graph-based data

2000

With the increasing amount of structural data being collected, there arises a need to efficiently mine information from this type of data. The goal of this research is to provide a system that performs data mining on structural data represented as a labeled graph. We demonstrate how the graph-based discovery system Subdue can be used to perform structural pattern discovery and structural hierarchical clustering on graph data.

Graph-Based Analysis of Nuclear Smuggling Data

Journal of Applied Security Research, 2009

Much of the data that is collected and analyzed today is structural, consisting not only of entities but also of relationships between the entities. As a result, analysis applications rely on automated structural data mining approaches to find patterns and concepts of interest. This ability to analyze structural data has become a particular challenge in many security-related domains. In these domains, focusing on the relationships between entities in the data is critical to detect important underlying patterns. In this study we apply structural data mining techniques to automate analysis of nuclear smuggling data. In particular, we choose to model the data as a graph and use graph-based relational learning to identify patterns and concepts of interest in the data. In this article, we identify the analysis questions that are of importance to security analysts and describe the knowledge representation and data mining approach that we adopt for this challenge. We analyze the results using the Russian nuclear smuggling event database.

Graph-based relational learning with application to security

2005

We describe an approach to learning patterns in relational data represented as a graph. The approach, implemented in the Subdue system, searches for patterns that maximally compress the input graph. Subdue can be used for supervised learning, as well as unsupervised pattern discovery and clustering. We apply Subdue in domains related to homeland security and social network analysis.

Unsupervised and Supervised Pattern Learning in Graph Data

Cook/Mining Graph Data, 2006

The success of machine learning and data mining for business and scientific purposes has fueled the expansion of its scope to new representations and techniques. Much collected data is structural in nature, containing entities as well as relationships between these entities. Compelling data in bioinformatics [32], network intrusion detection [15], web analysis [2, 8], and social network analysis [7, 27] has become available that requires effective handling of structural data. The ability to learn...

Footnote 1: This work is partially supported by the National Science Foundation grants IIS-0505819 and IIS-0097517.

Related topics

Computer Science

Data representation
