Data analysis with intersection graphs

Irene Vairinhos

Abstract

This paper presents a new framework for multivariate data analysis, based on graph theory, using intersection graphs. We have named this approach DAIG (Data Analysis with Intersection Graphs). The framework represents data vectors as paths on a graph, which has a number of advantages over the classical table representation of data. Each node represents an atom of information, i.e. a pair of a variable and a value, and is associated with the set of observations in which that pair occurs. An edge exists between two nodes whenever the intersection of their respective sets is not empty. We show that this representation of data as an intersection graph allows an easy and intuitive geometric interpretation of data observations, groups of observations, and the results of multivariate data analysis techniques such as biplots, principal components, cluster analysis, or multidimensional scaling. These appear as paths on the graph, relating variables, values and observations. The approach also allows a compact and memory-efficient representation of data that contains many missing values or multi-valued attributes. The basic principles and advantages of the approach are presented through its application to a simple toy problem. The main features of the methodology are illustrated with the aid of software specifically developed for this purpose.
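The construction described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' software: the function names (`build_intersection_graph`, `observation_path`), the toy records, and the conventions that `None` marks a missing value and a list marks a multi-valued attribute are all hypothetical choices made for this example.

```python
from itertools import combinations


def build_intersection_graph(records):
    """Build nodes and edges from a list of observations.

    Each record maps a variable name to a value, to a list/tuple/set of
    values (multi-valued attribute), or to None (missing value).
    These conventions are assumptions of this sketch.
    """
    # Node = (variable, value) pair -> set of observation indices where it occurs.
    nodes = {}
    for i, record in enumerate(records):
        for variable, value in record.items():
            if value is None:
                continue  # a missing value simply creates no node membership
            if isinstance(value, (list, tuple, set, frozenset)):
                values = value
            else:
                values = [value]
            for v in values:
                nodes.setdefault((variable, v), set()).add(i)

    # Edge between two nodes whenever their observation sets intersect.
    edges = {}
    for a, b in combinations(nodes, 2):
        shared = nodes[a] & nodes[b]
        if shared:
            edges[frozenset((a, b))] = shared
    return nodes, edges


def observation_path(nodes, i):
    """Return the nodes that observation i passes through, in a stable order."""
    return sorted((n for n, obs in nodes.items() if i in obs), key=repr)


# Hypothetical toy data: three observations, two variables.
records = [
    {"colour": "red", "size": "S"},
    {"colour": "red", "size": None},             # missing size
    {"colour": ["red", "blue"], "size": "L"},    # multi-valued colour
]
nodes, edges = build_intersection_graph(records)
```

In this toy example the node `("colour", "red")` carries the observation set `{0, 1, 2}`, and it is linked to `("size", "S")` because observation 0 belongs to both sets. The observation with the missing value contributes to one node only, so no placeholder cell has to be stored for it.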

Related papers

Jan de Leeuw

Institute of Mathematical Statistics Lecture Notes - Monograph Series, 2000

In this paper we explore the relationship between multivariate data analysis and techniques for graph drawing or graph layout. Although both classes of techniques were created for quite different purposes, we find many common principles and implementations. We start with a discussion of the data analysis techniques, in particular multiple correspondence analysis, multidimensional scaling, parallel coordinate plotting, and seriation. We then discuss parallels in the graph layout literature.

[Figure 1. The multivariable graph of a toy example.]

Footnote 1: A bipartite graph is a 2-layered graph in which edges only go from one layer to the other.
