M2 - Visualization Across Time, Space, Relationships


Data Visualization

Visualization Across Time, Space and Relationships


Table of Contents
Introduction
1. Visualization of Statistical Data
2. Visualizing Variation Through Time
2.1. Line Graphs
2.2. Combination Line and Bar Graphs
2.4. Sparklines
3. Visualizing Variation Across Space
4. Visualizing Relationships Between Numerical Measures
4.1. Scatter Plots
4.2. Bubble Chart
5. Visualizing Relationships Between Categorical Measures
5.1. Visual CrossTabs
5.2. Radar/Spider Charts
Summary

Introduction
The techniques described in this topic pertain to visualizing statistical data. In data analysis,
a thorough understanding of the data is required before modelling it and extracting insights.
Because enterprises now process very large volumes of data, understanding that data is a
difficult task. Visualization methods therefore play a major role in analysing sampled
enterprise data to arrive at an initial set of characteristics that describe it.

Two important methods for visualizing the variation of measures through time are line graphs
and sparklines. Relationships between numerical measures are visualized using scatter plots
and bubble charts.

Commonly used methods to describe relationships between categories are scatter plots
depicted as visual crosstabs, and radar charts.

Learning Objectives
Upon completion of this topic, you will be able to:
• Describe the techniques used for time-series and spatial visualization
• Describe techniques used for visualizing inter-measure relationships

1. Visualization of Statistical Data


Examining the Data
To understand the data in a dataset, the categorical and numerical measures associated
with it need to be examined. Statistical meaning is found both in the variations within the
categorical and numerical measures as well as in the relationships among these measures.

The following table summarises the different types of variations and relationships for
measures.
Variation within Categorical measures       How items in the categories relate to each other? (Ranking, Part-to-whole)
Variation within Numerical measures         How values in the measure are distributed across the range? (Distribution)
Variation through Time                      How values change through time? (Time-series)
Relationship between Numerical measures     How measures relate to one another? (Correlation)
Variation across Space                      Where are values located in space relative to one another? (Spatial)
Relationship between Categorical measures   How categories relate to each other mediated by measures? (Inter-category)
Table 1.1. – Summary table of variations and relationships within and across measures.

2. Visualizing Variation Through Time


This analysis examines how data changes through time, that is, time-series relationships.
The visualization methods used in this analysis are given below.

2.1. Line Graphs


We use line graphs to compare multiple instances of one or more measures taken at
equidistant points in time.

Lines describe the overall shape of the values and connect the individual data values to give
a sense of continuity across them. Time is always on the horizontal axis.

Figure 2.1. - Earthquakes per month


Source: canvasjs.com
Unlike a bar graph, the quantitative scale of a line graph need not begin at zero. It can be
narrowed to a range that starts just below the lowest value and ends just above the highest
value in the data.
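
As a minimal sketch of the narrowed scale described above, the following Python helper computes axis limits that sit just outside the data range. The function name `narrowed_axis_range`, the 5% padding and the sample values are illustrative assumptions, not part of the original material.

```python
def narrowed_axis_range(values, padding_fraction=0.05):
    """Return (low, high) axis limits just outside the range of the data."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    # If every value is identical, fall back to a fixed margin of 1 unit.
    padding = span * padding_fraction if span > 0 else 1.0
    return lowest - padding, highest + padding


# Made-up monthly counts, used only to demonstrate the helper.
monthly_counts = [112, 118, 109, 121, 115, 117]
low, high = narrowed_axis_range(monthly_counts)
print(f"Axis runs from {low:.1f} to {high:.1f}")
```

A bar graph would not use this helper, because the length of a bar is read against a zero baseline.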

2.2. Combination Line and Bar Graphs


We use a combination of line and bar graphs when some data is best displayed using bars,
with an emphasis on individual values, and some data is best displayed using a line, with an
emphasis on the overall shape of the data.

Figure 2.2. - The bars show individual player scores and the line shows average score
Source: jpowered.com

2.4. Sparklines
A sparkline is a very small line chart, typically drawn without axes or coordinates.
It presents the general shape of the variation over time in a measurement such as
temperature or stock market price in a simple and highly condensed way.
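
To make the idea concrete, here is a small sketch that renders a sparkline as text using Unicode block characters, with no axes or coordinates. The function name `sparkline` and the sample readings are assumptions made for illustration; real sparklines are usually drawn as tiny line charts by a charting tool.

```python
# Eight block characters, from lowest to highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to a block character scaled between min and max."""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A flat series is drawn at the lowest level throughout.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lowest) / span * top)] for v in values)


# Made-up daily temperatures, used only to demonstrate the function.
temperatures = [21, 23, 22, 26, 29, 27, 24]
print("Temperature this week:", sparkline(temperatures))
```

Because the result is an ordinary string, it can sit inside a sentence or a table cell, which is the main appeal of a sparkline.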

Figure 2.3. - Sparkline showing agent performance trend over month

Source - msdnshared.blob.core.windows.net/media/TNBlogsFS/BlogFileStorage/blogs_msdn/seanboon/WindowsLiveWriter/HowToBuildSparklineReportsinSQLServerRep_13C99/SSRS%20Sparkline_2.jpg

Sparklines are drawn small enough to be embedded in a line of text. Alternatively, several
sparklines may be grouped together as elements of a small multiple, as shown in Figure 2.4.

Figure 2.4. - Multiple Sparklines


Source - excelcharts.com

3. Visualizing Variation Across Space


Examining the variation of data across space is useful in understanding the geo-spatial
location of values and their relative size and density. This is useful in cases where much of
the information that the enterprise needs to monitor and understand is tied to geographical
locations.

As an example, sales information is understood more easily if the location of those sales
is shown on a map.

Quantitative values are typically encoded on maps using:


• Objects such as points or bubbles that vary in size (and colour intensity)
• Colour-filled regions with varying intensity
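
The second encoding can be sketched as a simple normalisation step: each region's value is rescaled to an intensity between 0 (lightest) and 1 (darkest), which a mapping tool could then turn into a fill colour. The function name `colour_intensity`, the linear scaling and the regional figures are assumptions for illustration only.

```python
def colour_intensity(values_by_region):
    """Rescale each region's value linearly to an intensity in [0, 1]."""
    lowest = min(values_by_region.values())
    highest = max(values_by_region.values())
    span = highest - lowest
    if span == 0:
        # All regions share the same value, so none stands out.
        return {region: 0.0 for region in values_by_region}
    return {
        region: (value - lowest) / span
        for region, value in values_by_region.items()
    }


# Made-up sales totals per region, used only to demonstrate the function.
sales = {"East": 100, "South": 200, "West": 300}
for region, intensity in colour_intensity(sales).items():
    print(f"{region}: intensity {intensity:.2f}")
```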

Figure 3.1. - Mapping of tweet density across the USA


Source: Slate.com

4. Visualizing Relationships Between Numerical Measures


As described earlier, variation within numerical measures is examined by visualizing the
data in the form of distributions. In the following section, visualization methods which are
used to examine relationships between numerical measures are described.

The relationship between quantitative measures is called correlation. Two variables are
said to be correlated when the value of one of them varies systematically with the value of
the other.

The nature of a correlation can be characterised by its strength, direction and shape.
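
Strength and direction are often summarised numerically with the Pearson correlation coefficient, which ranges from -1 to +1. The sketch below is one such summary, not a method prescribed by this topic. It measures linear association only, so it says nothing about the shape of a curved relationship, and the cut-offs used in `describe` are arbitrary assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(spread_x * spread_y)


def describe(r):
    """Label direction and strength; the 0.7 and 0.3 cut-offs are assumptions."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "no"
    strength = "strong" if abs(r) >= 0.7 else "moderate" if abs(r) >= 0.3 else "weak"
    return f"{strength} {direction} correlation"


# Made-up paired measurements, used only to demonstrate the functions.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
print(round(r, 2), describe(r))
```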

4.1. Scatter Plots


Scatter plots were invented specifically for examining correlations between two quantitative
variables.

A scatter plot consists of a set of data points, usually with a trend line. The two paired
variables are displayed using the horizontal X-axis and the vertical Y-axis. This layout is
easy to read because human perception is tuned to grasp 2-D positions and patterns.

The scatter plot displays whether, in what direction, and to what degree two paired sets of
quantitative values are correlated. It also reveals gaps (where no values occur) and clusters
(where data points are dense).
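
The trend line mentioned above is commonly fitted by ordinary least squares. This topic does not say which fitting method is used, so the following is a sketch under that assumption; the function name `trend_line` and the sample points are hypothetical.

```python
def trend_line(xs, ys):
    """Least-squares straight line through paired points: (slope, intercept)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    if spread_x == 0:
        raise ValueError("all x values are identical; the line is vertical")
    co_spread = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = co_spread / spread_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up paired values, used only to demonstrate the function.
slope, intercept = trend_line([0, 1, 2, 3], [1.2, 2.9, 5.1, 6.8])
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

A positive slope corresponds to a positive correlation and a negative slope to a negative one; how tightly the points hug the line reflects the strength.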

Figure 4.1. - Correlation between consumption of sugar and tooth decay in adults
Source: Pew Research Center

4.2. Bubble Chart


The bubble chart adds a third dimension of data to the scatter plot by using the size or
colour of the points as another attribute.

Each data point is plotted as a bubble that expresses two of the values through its 2-D
location on the graph and the third through its size. Colour can be used as a separate fourth
characteristic, giving the option of describing the data in yet another dimension.
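
When the third value is encoded as size, a common design choice is to make the bubble's area, rather than its radius, proportional to the value, so that large values are not visually exaggerated. This topic does not specify a scaling rule, so the sketch below is an assumption; `bubble_radius` and the 30-unit maximum radius are hypothetical.

```python
import math


def bubble_radius(value, max_value, max_radius=30.0):
    """Radius that makes the bubble's area proportional to the value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area grows with the square of the radius, hence the square root.
    return max_radius * math.sqrt(value / max_value)


# Made-up (x, y, size) triples, used only to demonstrate the function.
points = [(1.0, 2.0, 25), (2.5, 1.0, 100), (3.0, 3.5, 50)]
largest = max(size for _, _, size in points)
for x, y, size in points:
    print(f"({x}, {y}) -> radius {bubble_radius(size, largest):.1f}")
```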

Figure 4.2. - Bubble chart sample


Source: Oracle site

5. Visualizing Relationships Between Categorical Measures


Multivariate data is data described by more than one attribute, and it occurs very
frequently in data analysis. Once the data exceeds two dimensions, it becomes difficult for
people to picture it mentally.

5.1. Visual CrossTabs


Though pivot tables (crosstabs) were designed for the purpose of examining relationships
between categories, they are useful mainly for looking up specific values. When multiple
dimensions must be compared and patterns in the data observed, a graphical
representation of these tables, called a visual crosstab, is used.
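
For reference, the underlying crosstab is simply a measure summed over every combination of two categories. The sketch below builds one with the Python standard library; the function name `crosstab`, the field names and the sales figures are made up for illustration. A visual crosstab would replace each cell's number with a small chart.

```python
from collections import defaultdict


def crosstab(records, row_key, column_key, value_key):
    """Sum value_key for every (row_key, column_key) combination."""
    table = defaultdict(lambda: defaultdict(float))
    for record in records:
        table[record[row_key]][record[column_key]] += record[value_key]
    # Convert to plain dicts so the result prints and compares cleanly.
    return {row: dict(columns) for row, columns in table.items()}


# Made-up sales records, used only to demonstrate the function.
records = [
    {"category": "Furniture", "segment": "Consumer", "sales": 100.0},
    {"category": "Furniture", "segment": "Consumer", "sales": 50.0},
    {"category": "Furniture", "segment": "Corporate", "sales": 80.0},
    {"category": "Technology", "segment": "Consumer", "sales": 200.0},
]
for category, by_segment in crosstab(records, "category", "segment", "sales").items():
    print(category, by_segment)
```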

Figure 5.1. - Multiple scatterplot


Source: The Information Lab

Scatter plots depicted as visual crosstabs in the small multiples format are very useful for
examining correlations across multiple categories. Each scatter plot shows one pairwise
relation among the variables; together the plots form a scatter plot matrix. The matrix
enables visual inspection of the correlations between any pair of variables. Figure 5.1
depicts the correlation between profit and sales revenue for three office products (Furniture,
Office Supplies, Technology) across three market segments (Consumer, Corporate, Home
Office) and four geographical regions (Central, East, South, West).

5.2. Radar/Spider Charts


Radar or spider charts are used for displaying multivariate data with three or more
quantitative variables. A radar chart is essentially a line graph with the categorical scale
arranged along a circular axis.
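
The circular arrangement can be sketched as a coordinate calculation: each variable gets an evenly spaced angle around the circle, and its value sets the distance from the centre. The function name `radar_vertices`, the choice to start at the top and proceed clockwise, and the sample figures are all assumptions for illustration.

```python
import math


def radar_vertices(values, max_value):
    """Return (x, y) points for a radar chart scaled to a unit circle."""
    count = len(values)
    if count < 3:
        raise ValueError("a radar chart needs at least three variables")
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    points = []
    for index, value in enumerate(values):
        # Start at the top of the circle and move clockwise.
        angle = math.pi / 2 - 2 * math.pi * index / count
        radius = value / max_value
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


# Made-up units sold for four variables, used only to demonstrate the function.
for x, y in radar_vertices([40, 25, 30, 10], max_value=40):
    print(f"({x:+.2f}, {y:+.2f})")
```

Joining the points in order, and closing the shape back to the first point, gives the familiar radar polygon.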

Figure 5.2. - Comparison of units sold across 3 categories and 3 regions


Source: Datapine.com

Summary
To understand the data in a dataset, the categorical and numerical measures associated
with it need to be examined to derive the statistical meaning underlying the variations and
relationships pertaining to these measures.

Variation through time examines time-series relationships. The visualization methods used
for this are mainly line graphs, combination line and bar graphs, and sparklines.

Visualizing the variation of data across space is useful to understand the relative location,
size and density of values in a geo-spatial context.

The relationship between quantitative measures is termed correlation. The techniques
used to visualize it are mainly scatter plots and bubble charts.

Relationships between categorical measures involve multiple variables. Scatter plots
depicted as visual crosstabs in the small multiples format and radar or spider charts are
used to examine these relationships.
