Visualization Analysis & Design
Network Data (Ch 9)
Dr. Lucy Nwosu
University of Houston
@tamaramunzner
Network data
• networks
– model relationships Spatial
between things
• aka graphs
– two kinds of items,
both can have attributes
• nodes
• links
• tree
– special case
– no cycles
• one parent per node
2
Network tasks: topology-based and attribute-based
• topology based tasks
– find paths
– find (topological) neighbors
– compare centrality/importance measures
– identify clusters / communities
• attribute based tasks (similar to table data)
– find distributions, ...
• combination tasks, incorporating both
– example: find friends-of-friends who like cats
• topology: find all adjacent nodes of given node
• attributes: check if has-pet (node attribute) == cat
3
Node-link diagrams
• nodes: point marks
• links: line marks
– straight lines or arcs
– connections between nodes
• intuitive & familiar
– most common
– many, many variants
4
Criteria for good node-link layouts
• minimize
– edge crossings, node overlaps
– distances between topological neighbor nodes
– total drawing area
– edge bends
• maximize
– angular distance between different edges
– aspect ratio disparities
• emphasize symmetry
– similar graph structures should look similar in layout
5
Criteria conflict
• most criteria NP-hard individually
• many criteria directly conflict with each other
6
Optimization-based layouts
• formulate layout problem as optimization problem
• convert criteria into weighted cost function
– F(layout) = a*[crossing counts] + b*[drawing space used]+...
• use known optimization techniques to find layout at minimal cost
– energy-based physics models
– force-directed placement
– spring embedders
7
Force-directed placement
• physics model
– links = springs pull together
– nodes = magnets repulse apart
• algorithm
– place vertices in random locations
– while not equilibrium
• calculate force on vertex
– sum of
» pairwise repulsion of all nodes
» attraction between connected nodes
• move vertex by c * vertex_force
[Link]
8
Force-directed placement properties
• strengths
– reasonable layout for small, sparse graphs
– clusters typically visible
– edge length uniformity
• weaknesses
– nondeterministic
– computationally expensive: O(n^3) for n nodes
• each step is n^2, takes ~n cycles to reach equilibrium
– naive FD doesn't scale well beyond 1K nodes [Link]
– iterative progress: engaging but distracting
9
Idiom: force-directed placement
• visual encoding
–link connection marks, node point marks
• considerations
–spatial position: no meaning directly encoded
• left free to minimize crossings
–proximity semantics?
• sometimes meaningful
• sometimes arbitrary, artifact of layout algorithm
• tension with length
–long edges more visually salient than short [Link]
• tasks
–explore topology; locate paths, clusters
• scalability
–node/edge density E < 4N
10
Idiom: circular layouts / arc diagrams (node-link)
• restricted node-link layouts: lay out nodes around circle or along
line
• data
– original: network
–derived: node ordering attribute (global computation)
• considerations: node ordering crucial to avoid
excessive clutter from edge crossings
–examples: before & after barycentric ordering
[Link] 11
Adjacency matrix representations
• derive adjacency matrix from network
12
Adjacency matrix examples
13
Node order is crucial: Reordering
[Link] 14
Adjacency matrix
•˜
bad for topology tasks
good for topology tasks related to paths
related to neighborhoods
(node 1-hop neighbors)
15
Structures visible in both
[Link] 16
Idiom: adjacency matrix view
• data: network
– transform into same data/encoding as heatmap
• derived data: table from network [NodeTrix: a Hybrid Visualization of Social Networks.
Henry, Fekete, and McGuffin. IEEE TVCG (Proc.
– 1 quant attrib InfoVis) 13(6):1302-1309, 2007.]
• weighted edge between nodes
– 2 categ attribs: node list x 2
• visual encoding
– cell shows presence/absence of edge
• scalability
– 1K nodes, 1M edges
[Points of view: Networks. Gehlenborg and Wong. Nature Methods 9:115.]
17
Node-link vs. matrix comparison
• node-link diagram strengths
–topology understanding, path tracing
–intuitive, flexible, no training needed
• adjacency matrix strengths [Link]
–focus on edges rather than nodes
–layout straightforward (reordering needed)
–predictability, scalability
–some topology tasks trainable
• empirical study
–node-link best for small networks
–matrix best for large networks [On the readability of graphs using node-link and matrix-based
• if tasks don’t involve path tracing! representations: a controlled experiment and statistical analysis. Ghoniem,
Fekete, and Castagliola. Information Visualization 4:2 (2005), 114–135.]
18
Idiom: NodeTrix
• hybrid nodelink/matrix
• capture strengths of both
[NodeTrix: a Hybrid Visualization of Social Networks. Henry, Fekete, and McGuffin. IEEE TVCG (Proc.
InfoVis) 13(6):1302-1309, 2007.]
19
Trees
20
Node-link trees
• Reingold-Tilford
– tidy drawings of trees
• exploit parent/child structure
– allocate space: compact but
without overlap
• rectilinear and radial variants
[Tidier drawing of trees. Reingold and Tilford. IEEE
Trans. Software Eng., SE-7(2):223–228, 1981.]
– nice algorithm writeup
[Link] [Link]
[Link]
21
Idiom: radial node-link tree
[Link]
• data
– tree
• encoding
– link connection marks
– point node marks
– radial axis orientation
• angular proximity: siblings
• distance from center: depth in tree
• tasks
– understanding topology, following paths
• scalability
– 1K - 10K nodes (with/without labels)
22
Link marks: Connection and containment
• marks as links (vs. nodes)
– common case in network drawing
– 1D case: connection
• ex: all node-link diagrams
• emphasizes topology, path tracing
• networks and trees
– 2D case: containment
• ex: all treemap variants
• emphasizes attribute values at leaves (size coding)
• only trees
[Elastic Hierarchies: Combining Treemaps and Node-Link
Diagrams. Dong, McGuffin, and Chignell. Proc. InfoVis 2005, p.
57-64.]
23
Idiom: treemap
• data
– tree
– 1 quant attrib at leaf nodes
• encoding
– area containment marks for hierarchical structure
– rectilinear orientation
– size encodes quant attrib
• tasks
– query attribute at leaf nodes
– ex: disk space usage within filesystem [Link]
• scalability [Cushion Treemaps. van Wijk and van de Wetering.
Proc. Symp. InfoVis 1999, 73-78.]
– 1M leaf nodes
24
Idiom: implicit tree layouts (sunburst, icicle plot)
• alternative to connection and containment: position
– show parent-child relationships only through relative positions
containment position (radial) position (rectilinear)
25
Idiom: implicit tree layouts (sunburst, icicle plot)
• alternative to connection and containment: position
– show parent-child relationships only through relative positions
containment position (radial) position (rectilinear)
only leaves visible inner nodes & leaves visible inner nodes & leaves visible
26
Idiom: implicit tree layouts (sunburst, icicle plot)
• alternative to connection and containment: position
– show parent-child relationships only through relative positions
containment position (radial) position (rectilinear)
only leaves visible inner nodes & leaves visible inner nodes & leaves visible
Implicit
Spatial Position
27
Tree drawing idioms comparison
[Quantifying the Space-Efficiency of 2D Graphical Representations of Trees.
McGuffin and Robert. Information Visualization 9:2 (2010), 115–140.] 28
Comparison: tree drawing idioms
• data shown
–link relationships
–tree depth
–sibling order
[Quantifying the Space-Efficiency of 2D Graphical Representations of Trees.
McGuffin and Robert. Information Visualization 9:2 (2010), 115–140.] 29
Comparison: tree drawing idioms
• data shown
–link relationships
–tree depth
–sibling order
• design choices
–connection vs containment link marks
–rectilinear vs radial layout
–spatial position channels
[Quantifying the Space-Efficiency of 2D Graphical Representations of Trees.
McGuffin and Robert. Information Visualization 9:2 (2010), 115–140.] 30
Comparison: tree drawing idioms
• data shown
–link relationships
–tree depth
–sibling order
• design choices
–connection vs containment link marks
–rectilinear vs radial layout
–spatial position channels
• considerations
–redundant? arbitrary?
–information density?
• avoid wasting space [Quantifying the Space-Efficiency of 2D Graphical Representations of Trees.
McGuffin and Robert. Information Visualization 9:2 (2010), 115–140.]
• consider where to fit labels!
31
[Link]: Many, many options!
[Link] 32
Arrange networks and trees
Implicit
Spatial Position