APPLIED DATA SCIENCE
Data Visualization
PhD Emmanouil Papagiannidis:
[email protected]
Goals
1. Understand and apply design principles.
2. Identify the good, the bad, and the ugly.
3. Identify the tricks and the cheats.
4. Be able to find trends and insights.
5. Create interactive dashboards.
Visualization Books
What is data visualization
Importance of data visualization
Interactive data visualization refers to the use of software that enables direct
actions to modify elements on a graphical plot.
Intuitive refers to the use of data visualization to interpret the data and find useful insights.
Personalized refers to changing the appearance or functionality of the graphs to increase their relevance to an individual, a company or an organization.
Easy to share refers to the ability to share the work internally or externally.
Why visualize data – Example 1
Why visualize data – Example 1
Why visualize data – Example 2
Why visualize data – Example 2
Five Qualities of Great Visualizations
1. Truthful
2. Functional
3. Beautiful
4. Insightful
5. Enlightening
Truthful
Truthful
The whole truth!
Truthful
What is wrong here? Can you spot it?
Functional
Any visible
drawbacks?
Functional
Much easier to identify
what happened.
Beautiful
Beautiful
Is this a good example?
Beautiful
What about
this example?
Beautiful
Improvement of the
previous graph.
Insightful
“The purpose of visualization is insight, not pictures.” (Stuart Card, “Readings in Information Visualization”)
Follow the link below and note any insights you can find:
https://www.scientificamerican.com/article/how-nations-fare-in-phds-by-sex-interactive/
Enlightening
Choosing topics ethically and wisely, and casting light on relevant issues, matters a lot.
Some topics matter more than others because they are critical to the well-being of more people.
PARC design guidelines: Proximity, Alignment, Repetition, Contrast
Example: Oslo Public Transportation
A hard task.
What do you
notice when you
see the lines?
Proximity
What is important here, and where is the address?
Before:
Minas Morgul
(788) 444-2233
Sauron the Great
Mordor
MR 2667
Grouping the address information makes this easier to read.
After:
Sauron the Great
Minas Morgul
Mordor MR 2667
(788) 444-2233
Proximity
Grouping related items can help improve lists.
Capitalizing all text doesn’t usually help readability.
Before:
LINKS TO IMAGES
HOUSE
TREE
LINKS TO ESSAYS
BUILDING
BOTANY
After:
Links to Images
House
Tree

Links to Essays
Building
Botany
Proximity
More things to try:
Make sure headlines are closer to the related text than to the text or graphics above
them.
Make sure captions are close to their photographs.
Put space between unrelated things.
Ensure any hierarchy between the items on the page is represented in their spatial
arrangement.
Alignment
Line things up with each other.
Choose one alignment and use it for the whole page.
Don’t mix alignments.
It looks very odd.
And can be difficult to read
as well.
While you are doing this, try to keep text away from the edge of the page, which reduces the number of long, hard-to-read lines.
Here are some examples:
Alignment
Alignment means introducing the same text alignment, e.g. flush left, for everything on the page.
It also means aligning buttons etc. along edges to create visual connections.
Check this using grids or rulers.
Example menu (shown twice for comparison): Home, About Us, Contact Us, Events
Alignment - horizontal
Example menu row (shown twice for comparison): Home, About Us, Contact Us, Events
Horizontal alignment is as important as vertical alignment.
Horizontal baseline alignment is a feature that can be controlled for tables in most web authoring packages.
Alignment – centred
Centred alignment can be a problem because it introduces an invisible line down the centre of objects.
Visually it can be stronger to flush left or right, which makes the link between items clearer down the resulting vertical line.
Example menu (shown in both alignments): Home, Email, Browse, Search, Software, The Basics, Support
Alignment
Turn table borders off and improve the text alignment.
It can often look better and be easier to read.
Tufte suggests this as the principle of minimizing the non-data ink on a page.
The Visual Display of Quantitative Information, Edward Tufte, Graphics Press, 1983.
Example table (shown with and without borders):
Red     Green
Blue    Orange
Pink    Yellow
Violet  Indigo
Repetition
Repeating elements tie together all parts of a web site.
It should be clear that all the pages belong to the same site simply by looking at
them.
Repeating navigation buttons on every page is one example and means visitors do
not need to learn their way around again on every page they visit.
You probably already do this by consistently creating titles in the same font.
Extend this and consciously push further to create a visual key that ties your
design together.
Repetition
Unified by navigation bar and the choice of
font.
Buttons repeat the style.
Also link to the product.
Repetition
Simple design can work well.
Contrast
Contrast draws your eye to something.
It can be used to indicate hierarchy and relationships.
Contrast needs to do what it says on the tin.
If something is not the same, make it very DIFFERENT.
Contrast can be between: font, bold, style, colour, images, spatial arrangement.
Contrast
Create a focal point on a page with the other elements related to it in a hierarchy.
A focal point can be created using contrast.
A simple example is to use font size to make the most important item stand out.
Example card (shown twice for comparison):
Sauron the Great
Minas Morgul
Mordor
MR 2667
(788) 444-2233
Contrast and Repetition
Example card (shown twice for comparison):
Sauron the Great
Minas Morgul
Mordor
MR 2667
(788) 444-2233
Repeat the font and bold type of the first line on the last line.
Where does your eye go?
Contrast and Repetition
Similar lines in the left table: intentional or a mistake?
Contrasting lines in the right table seem clearer.
Moving along with the visuals…
What stands out?
Popout is an important characteristic of the visual system.
Visual features that pop out give us a tool for designing more effective visualizations.
(You will also see this referred to as pre-attentive processing.)
What stands out?
The brain detects some features more
easily than others.
Popout happens when there is strong
enough contrast between the object
and its surroundings:
Colour, Brightness, Position, Shape,
Sharpness, Lighting, Shadows,
Blinking, Motion, and stereo 3D.
Popout is best when there is a single
feature we want to highlight.
Do all differences work?
Find the inverted T.
Even within a single cue channel it can be hard to find a target, even when it differs from the other items.
We need to check carefully that we have enough cue contrast.
Can we learn and do this better?
Even if you look at this from
the side the dot still pops out,
the 6s don’t.
This is an innate ability – you
can’t improve your ability to
find the 6s by practice.
Can we combine cues?
Find the three green squares!
Combining cues is not always
a good idea as many cue
combinations are hard to see.
Here green circles mask the
three green squares – no
popout.
Can we combine cues?
These seven symbols are designed to
be independently searchable.
You could include them in the same
diagram, and you should be able to
easily spot the one you are looking
for.
There are lots of opinions published
in this area, and far too few scientific
studies.
The 48 most distinct colours in the survey
There is a case that these could be the most reliably identifiable colours.
There is a full table of the top 954 colours on the survey page:
https://blog.xkcd.com/2010/05/03/color-survey-results/
Sequential colour palettes
L varies
L and S vary
H, L and S vary
Zeileis et al., 2009
You can vary one or more of the HLS (hue, lightness, saturation) components to get a sequential palette, good for quantitative data.
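As a minimal sketch of the "L varies" case, the snippet below keeps hue and saturation fixed and steps only the lightness, using Python's standard `colorsys` module. The function name `sequential_palette`, the lightness range and the example hue are assumptions chosen for illustration; they are not taken from the cited paper.

```python
import colorsys


def sequential_palette(hue, saturation, n, l_min=0.25, l_max=0.9):
    """Return n hex colours with fixed hue and saturation, running from light to dark."""
    if n < 2:
        raise ValueError("need at least two colours")
    colours = []
    for i in range(n):
        # Step lightness evenly from l_max (light) down to l_min (dark).
        lightness = l_max - (l_max - l_min) * i / (n - 1)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colours.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colours


# Hypothetical example: five blues (hue 0.6 on colorsys's 0..1 hue scale).
palette = sequential_palette(hue=0.6, saturation=0.7, n=5)
print(palette)
```

Varying saturation or hue along with lightness would give the "L and S vary" and "H, L and S vary" cases in the same way.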
Diverging colour palettes
Zeileis et al., 2009
These have a neutral value in the middle of the palette and diverge in each direction, good for quantitative data with a meaningful midpoint.
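A palette with a neutral middle that diverges in each direction can be sketched with the standard `colorsys` module: saturation falls to zero and lightness rises towards the midpoint, with one hue on each side. The function name `diverging_palette`, the two hues and the lightness values are assumptions for illustration only.

```python
import colorsys


def diverging_palette(hue_low, hue_high, n, l_end=0.4, l_mid=0.95, saturation=0.8):
    """Return n hex colours running from hue_low through a neutral midpoint to hue_high."""
    if n < 3 or n % 2 == 0:
        raise ValueError("use an odd n >= 3 so there is a single neutral midpoint")
    half = n // 2
    colours = []
    for i in range(n):
        # t is 1.0 at both ends of the palette and 0.0 at the midpoint.
        t = abs(i - half) / half
        hue = hue_low if i < half else hue_high
        lightness = l_mid - (l_mid - l_end) * t
        # Saturation shrinks to zero at the midpoint, giving a neutral grey.
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation * t)
        colours.append(
            "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))
        )
    return colours


# Hypothetical example: blue (0.6) on one side, red (0.0) on the other.
diverging = diverging_palette(hue_low=0.6, hue_high=0.0, n=7)
print(diverging)
```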
Colour blindness
There is a range of colour vision deficiencies in the population.
It is rare to be completely colour blind, but do check your colours.
Try to make sure colour is not your only method for conveying information, e.g. use shape too.
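One way to avoid relying on colour alone is to give every category both a colour and a marker shape. The sketch below shows the idea; the hex values, the marker codes (written in the style many plotting libraries use) and the helper name `redundant_styles` are all assumptions for illustration.

```python
# Hypothetical colour and marker lists, chosen only for illustration.
COLOURS = ["#0072b2", "#e69f00", "#009e73", "#cc79a7"]
MARKERS = ["o", "s", "^", "D"]  # circle, square, triangle, diamond


def redundant_styles(categories):
    """Map each category to a (colour, marker) pair so that shape backs up colour."""
    if len(categories) > len(MARKERS):
        raise ValueError("not enough distinct markers for this many categories")
    return {
        category: (colour, marker)
        for category, colour, marker in zip(categories, COLOURS, MARKERS)
    }


styles = redundant_styles(["bus", "tram", "metro"])
for name, (colour, marker) in styles.items():
    print(name, colour, marker)
```

A reader who cannot tell two of the colours apart can still separate the series by shape.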
Factfulness
What should we do in Data Science to be more ethical?
One person who thought about this in detail was Hans Rosling. Factfulness is his response to this question, as a set of ten recommendations for Data Scientists (and indeed everyone) about using and presenting facts.
https://www.youtube.com/watch?v=usdJgEwMinM
The Gap
Where is the gap?
Beware of averages.
Beware extremes.
Don't look down.
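To illustrate "beware of averages", the sketch below compares two groups whose means differ while their ranges overlap heavily, so the apparent gap between the averages hides how similar the groups are. The numbers are invented for illustration, and `overlap` is a hypothetical helper.

```python
from statistics import mean

# Invented numbers for illustration only.
group_a = [42, 55, 61, 48, 70, 39]
group_b = [58, 66, 47, 73, 52, 80]


def overlap(a, b):
    """Return the interval shared by the two ranges, or None if they do not overlap."""
    low = max(min(a), min(b))
    high = min(max(a), max(b))
    return (low, high) if low <= high else None


print("mean A:", mean(group_a), "mean B:", mean(group_b))
print("range A:", (min(group_a), max(group_a)), "range B:", (min(group_b), max(group_b)))
print("shared range:", overlap(group_a, group_b))
```

Showing the spread next to the averages makes it visible whether there really is a gap.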
Negativity
Negative news is news.
Good news is rarely news. Gradual improvements
are not news.
Expect to mainly see bad news.
Are things bad but getting better?
More bad news may just be better records.
Be very wary of rosy pasts; they are rarely true.
Don’t assume straight lines
Straight lines are rare.
Never assume that things will continue in straight
lines.
S-Bends are much more common.
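A small sketch of why straight-line thinking misleads: the code below extends a straight line through two early points of an S-shaped (logistic) curve and compares it with the curve itself later on. The logistic form and all parameter values are assumptions chosen for illustration.

```python
import math


def logistic(t, ceiling=100.0, rate=1.0, midpoint=5.0):
    """S-shaped curve that levels off at `ceiling`."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def linear_extrapolation(t0, t1, t):
    """Extend the straight line through two points on the curve out to time t."""
    y0, y1 = logistic(t0), logistic(t1)
    slope = (y1 - y0) / (t1 - t0)
    return y0 + slope * (t - t0)


# Fit the line during the steep phase (t = 4..5), then look ahead to t = 10.
straight = linear_extrapolation(4, 5, 10)
curved = logistic(10)
print("straight-line forecast:", round(straight, 1))
print("S-curve value:", round(curved, 1))
```

The straight-line forecast ends up above the ceiling, a value the S-curve never reaches.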
Manage fears
Understand real risks
Risk = danger * exposure
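The formula above can be applied directly to rank hazards. In this sketch the danger and exposure values are invented and expressed on arbitrary 0..1 scales, purely to show that a frightening but rarely encountered hazard can carry less risk than a mild, common one.

```python
def risk(danger, exposure):
    """Risk = danger * exposure, as on the slide."""
    return danger * exposure


# Invented (danger, exposure) pairs for illustration only.
hazards = {
    "rare but severe": (0.9, 0.001),
    "common but mild": (0.05, 0.5),
}

ranked = sorted(hazards, key=lambda name: risk(*hazards[name]), reverse=True)
for name in ranked:
    print(name, risk(*hazards[name]))
```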
Size
Recognize impressive numbers and check them.
Compare in context, e.g. globally.
80/20: check what makes up 80% of the effect.
Divide to get a rate, especially when comparing groups of different sizes.
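Two of the checks above are easy to sketch in code: dividing by population to get a comparable rate, and finding the few items that account for 80% of a total. All figures are invented, and the helper names `rate_per_100k` and `top_contributors` are hypothetical.

```python
def rate_per_100k(count, population):
    """Turn a raw count into a rate per 100,000 people."""
    return count * 100_000 / population


def top_contributors(effects, share=0.8):
    """Return the largest items whose running total first reaches `share` of the whole."""
    total = sum(effects.values())
    selected, running = [], 0.0
    for name, value in sorted(effects.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(name)
        running += value
        if running >= share * total:
            break
    return selected


# Invented figures: the big city has more cases but the lower rate.
big_city = rate_per_100k(count=500, population=1_000_000)
small_town = rate_per_100k(count=120, population=100_000)
print(big_city, small_town)

# Invented breakdown of an effect by cause.
causes = {"A": 55, "B": 30, "C": 8, "D": 4, "E": 3}
print(top_contributors(causes))
```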
Generalization
Question categories.
Look for differences within groups.
Look for similarities across groups.
Look for differences across groups.
Check what “the majority” actually is.
Beware of vivid examples.
Assume people are smart, and be curious.
Destiny
Things appear constant because change happens
slowly.
Slow change is still change.
Track gradual improvements.
Update your knowledge.
Collect examples of cultural change.
Have more than one perspective
Test ideas.
Limited expertise.
Hammers and nails.
Numbers but not only numbers.
Beware simple ideas and simple solutions
Beware blame
Identify scapegoats, resist blaming an individual.
Look for causes, not for villains.
Look for systems not for heroes.
Resist Urgency
Decisions are rarely urgent.
Take a breath, look for time to reflect.
It is rarely now or never.
Insist on data that is relevant and accurate.
Beware predictions without uncertainty; look for options.
Be wary of drastic action: Kaizen!
Kaizen is a concept referring to business activities that continuously improve all functions and involve all employees, from the CEO to the assembly line workers.
Trust in data visualization
Trying to generate more trust may be counterproductive.
But one aim might be to be trustworthy when we present data.
There are three properties that could help generate trust:
1. Accessibility – can we get the data we need?
2. Usability – is it in a form, with the tools needed, to be used?
3. Assessability – can we test the data, reanalyse and compare it?
Transparency alone != Trustworthiness.
https://www.ted.com/talks/onora_o_neill_what_we_don_t_understand_about_trust?language=en
Storytelling
Creating a narrative: Aristotle introduced this structure in Poetics in the 4th century BC.
Let the data tell the story. If there are human characters in the story, are they a main or a secondary element?
Very often the characters we present will be abstract entities described by data.
Even so, the same rules of story will help you engage and inform your audience.
http://www.sheilacurranbernard.com/documentary-storytelling.html
A free chapter is available here:
http://www.sheilacurranbernard.com/uploads/1/0/2/7/10273986/else_ch19_docstory2nded.pdf
Narrative Structure
Constructing a data story (Knaflic):
Act 1: set the context, main (data) character(s) and the questions that will be answered.
Act 2: what is the data, what are the possible conclusions, what are the difficult questions.
Act 3: proposed conclusion and a call to action.
You can see her video here: https://youtu.be/8EMW7io4rSI
Legal argumentation
What would a reasonable person think?
Start by stating the conclusion (good rule in all rhetoric).
State the general rule you will use to support your conclusion.
Marshal evidence and apply the rule to the facts.
Summarize the case and come to a conclusion.
Burden of proof (facts) vs burden of rejoinder (fallacies).
e.g. see http://www.franks.org/fr01123.htm
Answer these
Who?
What?
How?
Action
3-minute story
Storyboarding
Graphs
Use text
Remove what you do not need
Make it obvious
Make it obvious
Lines can be helpful
Avoid pies
Avoid pies
Avoid 3D
Be careful with colours
Avoid secondary y-axis
Avoid secondary y-axis
Plot your uncertainty
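Before uncertainty can be plotted it has to be computed. The sketch below derives a mean and an approximate 95% interval from repeated measurements, using the normal approximation (z = 1.96) as a simplifying assumption; the sample values and the helper name `mean_with_interval` are invented for illustration.

```python
import math
from statistics import mean, stdev


def mean_with_interval(samples, z=1.96):
    """Return (mean, lower, upper) using mean +/- z * standard error."""
    m = mean(samples)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return m, m - half_width, m + half_width


# Invented measurements for illustration only.
samples = [9.8, 10.4, 10.1, 9.6, 10.6, 10.0]
m, low, high = mean_with_interval(samples)
print(f"mean = {m:.2f}, interval = [{low:.2f}, {high:.2f}]")
```

The half-width `m - low` is the kind of value a plotting library would draw as an error bar, for example via the `yerr` argument of Matplotlib's `errorbar`.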
Clean your visualization
Remove chart border
Remove gridlines
Remove data markers
Clean up axis labels
Label data directly
Leverage consistent color
Final comparison
Before After
The whole picture – Dashboard Example 1
The whole picture – Dashboard Example 2
Remember
Be consistent.
Have a layout.
Choose the right colour palette.
Give information gradually.
Interact with your data.
Do not give too much information.
Thanks for watching!
Questions?