Advanced Data Visualization and Interpretation (Part 3)
Overview
Uncertainties in Data Analysis
Data Storytelling
Uncertainties in Data Analysis
What are Uncertainties?
Uncertainty is not a flaw, but a feature of data.
It reflects the complexity and variability of the real world, and the
limitations of our methods and tools.
Ignoring or hiding uncertainty can lead to overconfidence,
misinterpretation, or manipulation of data.
On the other hand, acknowledging and communicating uncertainty
can help you present a more accurate, nuanced, and transparent
picture of your data, and invite your audience to engage with it
critically and constructively.
Measures of Uncertainty
Absolute Uncertainty: The uncertainty expressed in the same units as the
measurement.
Example: A length of 10.0 ± 0.1 cm.
Relative Uncertainty: The uncertainty expressed as a fraction or
percentage of the measurement.
Example: A relative uncertainty of 1% means the measurement is
uncertain by 1% of its value.
Standard Error: An estimate of how much a sample statistic, such as the
mean, varies from sample to sample.
Confidence Interval: A range of values within which the true value of a
quantity is likely to fall with a certain level of confidence.
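The link between absolute and relative uncertainty can be checked with the length example above. This is a minimal sketch; the function name `relative_uncertainty` is an illustrative choice, not something defined in these slides.

```python
def relative_uncertainty(value: float, absolute: float) -> float:
    """Return the relative uncertainty as a fraction of the measured value."""
    if value == 0:
        raise ValueError("relative uncertainty is undefined for a zero value")
    return abs(absolute / value)


# The length example from the slide: 10.0 ± 0.1 cm.
length_cm = 10.0
abs_unc_cm = 0.1

rel = relative_uncertainty(length_cm, abs_unc_cm)
print(f"{length_cm} ± {abs_unc_cm} cm -> relative uncertainty {rel:.1%}")
```

An absolute uncertainty of 0.1 cm on a 10.0 cm length therefore corresponds to a relative uncertainty of 1%.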
Standard Error
The standard error is one of the mathematical tools used in statistics to
estimate the variability. It is abbreviated as SE.
The standard error of a statistic or an estimate of a parameter is the
standard deviation of its sampling distribution.
We can define it as an estimate of that standard deviation:
SE = S / √n
where
S = standard deviation
n = number of observations.
Standard Error (Example)
Age 15 16 16 17 18 16 17 16 25 17
Mean = 17.3
Standard deviation = 2.68
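The example can be worked through in a few lines. This is a sketch using Python's standard library, assuming the standard error of the mean is the standard deviation divided by the square root of the number of observations. The slide's standard deviation of 2.68 appears to correspond to the population formula (dividing by n), so that variant is used here; the sample formula (dividing by n - 1) gives a slightly larger value.

```python
import math
import statistics

ages = [15, 16, 16, 17, 18, 16, 17, 16, 25, 17]

n = len(ages)
mean = statistics.fmean(ages)

# pstdev divides by n (population formula); this is the variant that
# comes out close to the slide's value of 2.68.
sd_population = statistics.pstdev(ages)

# stdev divides by n - 1 (sample formula) and gives a slightly larger value.
sd_sample = statistics.stdev(ages)

# Standard error of the mean: standard deviation over the square root of n.
se = sd_population / math.sqrt(n)

print(f"mean = {mean:.1f}")
print(f"SD (population) = {sd_population:.3f}, SD (sample) = {sd_sample:.3f}")
print(f"SE = {se:.3f}")
```

With ten observations, the standard error comes out at roughly 0.85 years.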
Confidence Interval
Confidence Level Z Value
80% 1.282
85% 1.440
90% 1.645
95% 1.960
99% 2.576
99.5% 2.807
99.9% 3.291
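The Z values in this table can be reproduced from the standard normal distribution. The sketch below assumes a two-sided interval, so half of the leftover probability sits in each tail; `z_value` is an illustrative helper name, not part of the slides.

```python
from statistics import NormalDist


def z_value(confidence_level: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    tail = (1.0 - confidence_level) / 2.0  # probability left in each tail
    return NormalDist().inv_cdf(1.0 - tail)


for level in (0.80, 0.85, 0.90, 0.95, 0.99, 0.995, 0.999):
    print(f"{level:.1%}  z = {z_value(level):.3f}")
```

Computing the value directly is useful when a confidence level is needed that the table does not list.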
Confidence Interval (Example)
Age 15 16 16 17 18 16 17 16 25 17
Mean = 17.3
Standard Deviation = 2.68
Confidence Level = 95%
Z = 1.96
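The interval for this example can be worked out as follows. This is a minimal sketch, assuming the usual z-based interval for a mean, that is, mean ± Z × S / √n; the formula itself is an assumption here, since the text above only lists the inputs. For a sample as small as ten, a t-based interval would be somewhat wider.

```python
import math
import statistics

ages = [15, 16, 16, 17, 18, 16, 17, 16, 25, 17]
z_95 = 1.96  # Z value for a 95% confidence level, taken from the table

mean = statistics.fmean(ages)
sd = statistics.pstdev(ages)  # population formula, close to the slide's 2.68

# Margin of error: Z times the standard error of the mean.
margin = z_95 * sd / math.sqrt(len(ages))

lower, upper = mean - margin, mean + margin
print(f"95% CI for the mean age: {mean:.1f} ± {margin:.2f} ({lower:.2f}, {upper:.2f})")
```

Under these assumptions, the 95% interval for the mean age runs from about 15.6 to about 19.0.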
Visualizing Uncertainties
1. Error bars:
Lines extending from data points to indicate the range of
uncertainty.
Types
Standard Error: Represents the standard deviation of the sample mean.
Standard Deviation: Represents the variability of the data points
around the mean.
Confidence Interval: Represents the range in which the true population
parameter is likely to lie.
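The three error-bar types can be compared on the same data. The sketch below is one possible approach, not a prescribed method: it reuses the age values from the earlier example, uses the population standard deviation and a Z value of 1.96 for a 95% interval, and keeps plotting in a separate helper that needs matplotlib. Both function names are illustrative.

```python
import math
import statistics


def error_bar_half_widths(values, z=1.96):
    """Return the half-width of each error-bar type for one group of values."""
    sd = statistics.pstdev(values)        # spread of the data points
    se = sd / math.sqrt(len(values))      # spread of the sample mean
    return {"SD": sd, "SE": se, "95% CI": z * se}


def plot_error_bars(values):
    """Draw the group mean three times, once per error-bar type."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    widths = error_bar_half_widths(values)
    labels = list(widths)
    positions = list(range(len(labels)))
    mean = statistics.fmean(values)

    fig, ax = plt.subplots()
    ax.errorbar(positions, [mean] * len(labels),
                yerr=list(widths.values()), fmt="o", capsize=4)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)
    ax.set_ylabel("Age")
    return fig


ages = [15, 16, 16, 17, 18, 16, 17, 16, 25, 17]
widths = error_bar_half_widths(ages)
print({name: round(w, 2) for name, w in widths.items()})
```

Because the three bar types differ so much in length for the same data, a chart should always state which one it shows.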
Visualizing Uncertainties
2. Confidence Bands
Confidence bands are shaded regions
around a fitted line or curve in a graph.
They visually represent the uncertainty
associated with the estimated function
or relationship between variables.
Unlike confidence intervals, which
provide a range of values for a single
point, confidence bands show the
plausible range of values for the entire
function.
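A confidence band around a straight-line fit can be sketched as follows. Several assumptions apply: the data are invented purely for illustration, the band is pointwise (a simultaneous band covering the whole line at once would be wider), and a Z value of 1.96 stands in for the t quantile that a small sample would normally call for. The function names are illustrative, and the plotting helper needs matplotlib.

```python
import math


def linear_fit_with_band(xs, ys, z=1.96):
    """Least-squares line plus a pointwise confidence band for the fitted mean."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar

    fitted = [intercept + slope * x for x in xs]
    # Residual standard deviation with n - 2 degrees of freedom.
    resid_sd = math.sqrt(sum((y - f) ** 2 for y, f in zip(ys, fitted)) / (n - 2))

    # Standard error of the fitted line at each x; smallest at x_bar.
    half_widths = [
        z * resid_sd * math.sqrt(1 / n + (x - x_bar) ** 2 / sxx) for x in xs
    ]
    lower = [f - h for f, h in zip(fitted, half_widths)]
    upper = [f + h for f, h in zip(fitted, half_widths)]
    return fitted, lower, upper


def plot_band(xs, ys):
    """Shade the band around the fitted line (requires matplotlib)."""
    import matplotlib.pyplot as plt  # lazy import, only needed for plotting

    fitted, lower, upper = linear_fit_with_band(xs, ys)
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, label="data")
    ax.plot(xs, fitted, label="fitted line")
    ax.fill_between(xs, lower, upper, alpha=0.3, label="approx. 95% band")
    ax.legend()
    return fig


# Hypothetical data, invented purely for illustration.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
fitted, lower, upper = linear_fit_with_band(xs, ys)
```

The band is narrowest near the mean of x and widens toward the ends of the data, which is why shaded bands typically have an hourglass shape.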
Visualizing Uncertainties
3. Density Plots
A density plot is a graphical
representation of the probability density
function (PDF) of a continuous variable.
It provides a smooth, continuous curve
that estimates the probability
distribution of the data.
Unlike histograms, which divide data into
bins, density plots provide a more
nuanced view of the data's distribution.
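One common way to obtain such a smooth curve is kernel density estimation, which places a small bell curve on every data point and averages them. The slides do not name a method, so the Gaussian kernel, the bandwidth of 1.0, and the function name `gaussian_kde` below are all assumptions made for illustration.

```python
import math


def gaussian_kde(data, bandwidth):
    """Return a function estimating the density of `data` at any point x."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per data point, then normalize.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


ages = [15, 16, 16, 17, 18, 16, 17, 16, 25, 17]
density = gaussian_kde(ages, bandwidth=1.0)  # bandwidth is an arbitrary choice

# Evaluate the curve on a grid; these points are what a density plot would draw.
step = 0.1
grid = [10 + step * i for i in range(201)]  # 10.0 to 30.0
curve = [density(x) for x in grid]

# Riemann-sum check: the area under a density curve should be close to 1.
area = sum(curve) * step
print(f"area under curve ≈ {area:.3f}")
```

A smaller bandwidth produces a bumpier curve and a larger one a smoother curve, so the bandwidth plays a role similar to the bin width of a histogram.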
Why is it important to communicate uncertainty?
Builds trust: By acknowledging uncertainty, you demonstrate
transparency and honesty with your audience.
Informs decision-making: Uncertainty can help viewers understand the
limitations of the data and make more informed decisions.
Improves data interpretation: Visualizing uncertainty can help viewers
understand the true variability and significance of the data.
Data Storytelling
Aspects of Data Storytelling
Audience Understanding: Tailor the Message; Address Needs and Questions
Clarity and Simplicity: Simplify complex data; Use visual aids
Emotional Connection: Create relatable stories; Highlight impact
Real World Case Studies
The World Health Organization tracks COVID-19 across the globe in an interactive dashboard.
Readers can choose to view the number of COVID-19 cases, deaths, or vaccinations, as well as
the various measures taken by each country. Healthcare organizations can also combine historical
data with clinical trial results to explain the benefits and risks of new treatments to patients.
Real World Case Studies
Data story showing the medal history of different countries across
120 years of the Olympic Games.
Users can choose season, years, country, event, athlete and gender.
Real World Case Studies
Data story showing the distribution and availability of public toilets
in Australia.
Conclusion
Visualizing uncertainty is crucial for responsible and effective data
communication.
Data storytelling transforms data into impactful and engaging
narratives.
By understanding and applying these principles, you can
communicate data more effectively and influence others with your
findings.
Assignment
Subjects  Score  Age
Math      48     17
Art       88     16
English   76     16
Art       76     15
Art       78     16
English   89     18
Math      79     16
Math      90     17
Math      77     16
English   67     16
1. State the chart types you would use to visualize the following:
a) Age distribution
b) Percentage of students taking each subject
c) Average score for each subject
2. What is the confidence interval for the Age of the students?
3. Calculate the standard error for the Scores.