Day 5 Descriptive Statistics

The document outlines the steps involved in quantitative data analysis, focusing on descriptive data analysis which includes summarizing data through frequency distributions, graphs, and measures of central tendency. It provides examples of analyzing categorical and continuous data, including the use of crosstabs and probability distributions. Additionally, it discusses methods for exploring trends and relationships within data using scatter plots and time series graphs.

Descriptive Data Analysis

Aliford Asoms Mpinganjira


Lecturer in Biostatistics

DEPARTMENT OF MATHEMATICAL SCIENCES

MALAWI UNIVERSITY OF BUSINESS AND APPLIED SCIENCES


Quantitative Data Analysis
In quantitative research, once the data have been collected, the
statistical analysis proceeds in two steps:

• The first step is to conduct descriptive analysis of the variables
• The second step is to conduct inferential statistics

Descriptive Data Analysis
Descriptive data analysis is the type of data analysis which
involves:
• Giving data summaries (frequency distributions and graphs)
to help visualize, explore, describe and discuss the data, and to find:
- outstanding issues (in terms of percentages, proportions or probabilities)
- the shape of the distribution (normal or skewed)
- trends, patterns and relationships
• Working out and interpreting measures of central
tendency (mean, mode and median)
• Working out measures of location (percentiles, deciles and quartiles)
• Working out measures of variability/spread/dispersion
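The measures listed above can be sketched with Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration. Quartile values depend on the interpolation method, so other software may report slightly different figures.

```python
import statistics

# Hypothetical sample, for illustration only (not taken from the lecture data).
sample = [2, 4, 4, 5, 7, 9, 10]

# Measures of central tendency
mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)

# Measures of location: the three quartiles Q1, Q2 and Q3.
# statistics.quantiles uses the "exclusive" method by default.
q1, q2, q3 = statistics.quantiles(sample, n=4)

# Measures of variability/spread/dispersion
data_range = max(sample) - min(sample)
iqr = q3 - q1                        # interquartile range
std_dev = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"Q1={q1} Q2={q2} Q3={q3}")
print(f"range={data_range} IQR={iqr} sd={std_dev:.2f}")
```

Deciles and percentiles follow the same pattern with `n=10` and `n=100`.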

Descriptive statistics involving Categorical Data
Categorical data can be summarized by
1. Categorical frequency distributions
2. Pie charts
3. Bar graphs
4. Crosstabs, also called contingency tables (two/three way)
Categorical Frequency Distribution
This distribution is used for data that can be placed in specific
categories, such as nominal or ordinal data.
Data on political affiliation, religious affiliation, race and
gender are examples of categorical data.
Example
Twenty-five graduate engineers were given a
blood test to determine their blood type. The
data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.
Solution
Category   Tally        Frequency
A          ////         5
B          //// //      7
AB         ////         4
O          //// ////    9
Interpretation: Blood type O is the most common among the 25 graduate
engineers, with a frequency of 9 (36%).
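The tally above can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library. It also adds the relative frequencies, which the table does not show.

```python
from collections import Counter

# Blood types of the 25 graduate engineers, as listed in the example.
blood_types = """
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
""".split()

# Frequency of each category
freq = Counter(blood_types)
n = sum(freq.values())

# Relative frequency = frequency / total number of observations
rel_freq = {category: count / n for category, count in freq.items()}

for category in ["A", "B", "AB", "O"]:
    print(f"{category:<3} {freq[category]:>3} {rel_freq[category]:>6.0%}")
```

Reporting the relative frequency alongside the count makes the interpretation easier to state as a percentage.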
Cont…
Activity 1
Using the variable race in the Lowbirthweight data set, create and
interpret:
- a categorical relative frequency distribution
- a pie chart
- a bar graph
Try the same using any categorical variable in your data set.
Steps:
Analyze > Descriptive Statistics > Frequencies > select variable >
choose pie or bar under Charts
Cont…
Output

Interpretation? Note: Avoid re-writing the frequency table in words.

Most of the women who gave birth in the population of interest (mention
it) are white, with a percentage of 50.79. Maybe this is because of…
Example of a pie chart

Interpretation?
Example of a bar graph

Interpretation?

Note: the variable (Likert scale) can be transformed to have just two
categories (dissatisfied and satisfied).
Comparing groups within groups

Example of a clustered bar graph

Activity 2

• Using the Lowbirthweight data set, generate a clustered bar graph of low
birthweight by smoking status and comment on it

Steps:

Graphs > Legacy Dialogs > Bar > Clustered > select variables > OK

• Try the same with any two categorical variables in your data set
Cont…
Crosstabs (used to summarize two or three categorical variables)
Activity 3
• Using the Lowbirthweight data set, create crosstabs for the
following variables and comment on them:
1. Low birthweight and smoking status
2. Low birthweight, smoking status and race
Steps
Analyze > Descriptive Statistics > Crosstabs > select variables > OK
• Re-do 1 and, instead of frequencies, find these probability
distributions: joint, marginal and conditional (given smoking
status)
Cont…
Example of crosstab

Counts can sometimes be misleading, so it is good to convert them into
proportions/probabilities.
Cont…
Joint probability distribution

Interpretation: The most likely combination (probability 0.455) is a
woman who did not smoke during pregnancy and gave birth to
a child of birth weight > 2500 g.
Cont…
Conditional probability distribution

Interpretation: Among smokers, there is a higher chance of finding a woman who gave
birth to a child of birth weight > 2500 g than of birth weight < 2500 g. However, the chance
of giving birth to a child of birth weight < 2500 g is higher among smokers than among non-smokers.
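The three probability distributions can be derived from a crosstab of counts as sketched below. The counts are hypothetical and are not taken from the Lowbirthweight data set. They are chosen only so that the pattern resembles the interpretation above.

```python
# Hypothetical 2 x 2 crosstab of counts (illustration only):
# keys are (smoking status, birth weight category).
counts = {
    ("non-smoker", "> 2500 g"): 90,
    ("non-smoker", "< 2500 g"): 30,
    ("smoker", "> 2500 g"): 45,
    ("smoker", "< 2500 g"): 35,
}
total = sum(counts.values())

# Joint distribution: each cell count divided by the grand total.
joint = {cell: c / total for cell, c in counts.items()}

# Row and column totals, needed for marginal and conditional distributions.
smoking_totals = {}
weight_totals = {}
for (smoking, weight), c in counts.items():
    smoking_totals[smoking] = smoking_totals.get(smoking, 0) + c
    weight_totals[weight] = weight_totals.get(weight, 0) + c

# Marginal distributions: row or column total divided by the grand total.
marginal_smoking = {s: c / total for s, c in smoking_totals.items()}
marginal_weight = {w: c / total for w, c in weight_totals.items()}

# Conditional distribution of birth weight given smoking status:
# each cell count divided by its own row (smoking status) total.
conditional = {
    (weight, smoking): c / smoking_totals[smoking]
    for (smoking, weight), c in counts.items()
}

for (weight, smoking), p in conditional.items():
    print(f"P({weight} | {smoking}) = {p:.3f}")
```

With these made-up counts, P(< 2500 g | smoker) exceeds P(< 2500 g | non-smoker) even though most smokers still have babies above 2500 g, which is the kind of comparison the conditional distribution makes visible.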
Descriptive statistics involving continuous data
Continuous data can be summarized by
1. Grouped frequency distributions
2. Histograms
3. Frequency polygons
4. Ogives
Example of grouped frequency distribution
(amount of waste, in kg, produced per household per month)
Class Limits   Class Boundaries   Tally        Freq   Cumulative
24 - 30        23.5 - 30.5        ///          3      3
31 - 37        30.5 - 37.5        /            1      4
38 - 44        37.5 - 44.5        ////         5      9
45 - 51        44.5 - 51.5        //// ////    9      18
52 - 58        51.5 - 58.5        //// /       6      24
59 - 65        58.5 - 65.5        /            1      25
Total                                          25
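The boundary and cumulative columns follow mechanically from the class limits and frequencies. The sketch below assumes whole-number data, so each boundary sits 0.5 below or above its class limit. The `ogive_points` name is introduced here only for illustration.

```python
from itertools import accumulate

# Class limits and frequencies copied from the table above.
class_limits = [(24, 30), (31, 37), (38, 44), (45, 51), (52, 58), (59, 65)]
frequencies = [3, 1, 5, 9, 6, 1]

# Class boundaries: halfway between adjacent class limits
# (limit - 0.5 and limit + 0.5 for data recorded as whole numbers).
boundaries = [(lower - 0.5, upper + 0.5) for lower, upper in class_limits]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Points for an ogive: cumulative frequency plotted against the upper
# class boundary, starting from zero at the first lower boundary.
ogive_points = [(boundaries[0][0], 0)] + [
    (upper, cum) for (_, upper), cum in zip(boundaries, cumulative)
]

for (lo, hi), (blo, bhi), f, cum in zip(class_limits, boundaries, frequencies, cumulative):
    print(f"{lo} - {hi}   {blo} - {bhi}   {f:>2}   {cum:>2}")
```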
Activity 4

In a study about the distances students cover when going to school, the
following data were obtained:

30 50 33 70 81 49 61 35 19 80 25
10 40 35 30 30 61 80 40 56 62 24 60
44 80 20 90 30 70 40 10 50

• Construct a grouped cumulative frequency distribution with six classes

• Comment on the constructed distribution
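One possible way to build the six classes is sketched below. It assumes whole-number data and a common textbook rule: divide the range by the number of classes and round up to the next whole number for the class width. Other width rules or starting points give different, equally valid tables.

```python
# Distances from the activity, in the order listed.
distances = [
    30, 50, 33, 70, 81, 49, 61, 35, 19, 80, 25,
    10, 40, 35, 30, 30, 61, 80, 40, 56, 62, 24, 60,
    44, 80, 20, 90, 30, 70, 40, 10, 50,
]
num_classes = 6

# Class width: range / number of classes, pushed up to the next whole
# number so that the last class always contains the maximum value.
data_range = max(distances) - min(distances)
width = data_range // num_classes + 1

# Build each class, count its observations, and keep a running total.
rows = []
cumulative = 0
for k in range(num_classes):
    lower = min(distances) + k * width
    upper = lower + width - 1
    frequency = sum(lower <= d <= upper for d in distances)
    cumulative += frequency
    rows.append((lower, upper, frequency, cumulative))

for lower, upper, frequency, cum in rows:
    print(f"{lower:>2} - {upper:>2}   {frequency:>2}   {cum:>2}")
```

The final cumulative frequency should equal the number of observations, which is a quick check that no value fell outside the classes.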


Example of a Histogram
Example of a Frequency Polygon
Example of an Ogive
Activity 5

• Using the variable age in the Lowbirthweight data set,
create and interpret the following:
grouped frequency distribution, histogram, frequency
polygon and ogive.

• Use a continuous variable in your data set to create a
grouped frequency distribution, histogram, frequency
polygon and ogive
Descriptive statistics to explore trends/relationships

• Scatter plots to explore relationships in
continuous data

• Historigram (time series graph) to
explore trends in time series data
• Example of scatter plots (panels A, B and C)

Interpretation?
Activity 6

Using the Lowbirthweight data set, create a scatter plot of age of
the mother and weight at last.., add a line of best fit and
interpret it.

Steps
Graphs > Chart Builder > Scatter > drag variables onto the appropriate axes > OK >
double-click the graph > right-click > Add Fit Line.
• Example of scatter plot

Interpretation

There is a positive linear relationship between the age of the mother and
the weight of the mother.
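A line of best fit and the strength of a linear relationship can also be computed directly. The sketch below uses the ordinary least-squares formulas and Pearson's correlation coefficient on hypothetical ages and weights, which are not taken from the Lowbirthweight data set.

```python
import math

# Hypothetical (age, weight) pairs, for illustration only.
ages = [20, 25, 30, 35, 40]
weights = [110, 120, 125, 120, 125]

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(weights) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in weights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, weights))

# Least-squares line: weight = intercept + slope * age
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Pearson correlation: sign gives the direction, magnitude the strength.
r = sxy / math.sqrt(sxx * syy)

print(f"weight = {intercept:.1f} + {slope:.2f} * age, r = {r:.2f}")
```

A positive slope and a positive r correspond to the "positive linear relationship" wording used in the interpretation. The scatter plot should still be inspected, since r alone cannot show curvature or outliers.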
Example of time series graph
Activity 7

Use any data set containing time series data and create a
time series graph of the variable against time

Steps
Graphs > Chart Builder > Scatter > drag variables onto the appropriate axes (time on the
x-axis and the other variable(s) on the y-axis) > OK > double-click the graph >
add an interpolation line from the menu bar
