
Module 3

Uploaded by

Riya Sinha
Copyright © All Rights Reserved
Module 3

• Meaning of Central Tendency
• Mean (Arithmetic, Weighted Mean)
• Median – for skewed data
• Practical Examples: Employee salaries, Sales per month

Measures of Central Tendency:
Unveiling the Heart of Your Data
Welcome to Module 3, where we embark on a journey into
the fundamental concepts of Measures of Central Tendency.
In this comprehensive exploration, we will uncover how
these powerful statistical tools help us understand the
typical, central, or average value of a dataset. Whether you're
analyzing business performance, market trends, or social
phenomena, mastering central tendency is crucial for making
informed decisions. This module is designed to provide
business students with a robust understanding of the "what,"
"why," and "how" of mean, median, and mode, equipping
you with practical skills for real-world data analysis.

Agenda: Your Roadmap to Understanding Data's Core

• Introduction to Central Tendency: Defining what central tendency means and why it's a cornerstone of descriptive statistics.
• The Arithmetic Mean (Average): Detailed breakdown of its calculation, properties, and common applications.
• Understanding the Weighted Mean: Exploring scenarios where some data points carry more importance.
• The Median: Tackling Skewed Data: Why the median is often preferred when outliers distort the mean.
• Practical Applications & Case Studies: Real-world examples of mean and median in business contexts, like salaries and sales.

Our journey through this module will build a solid foundation, starting with conceptual clarity and progressing to hands-on application. Each section is designed to provide
actionable insights, ensuring you can confidently apply these measures in your academic and professional pursuits.
What is Central Tendency?
• At its core, central tendency refers to a single value that attempts to
describe a set of data by identifying the central position within that set. It's
like finding the "typical" or "most representative" value in a group of
numbers. Imagine you have a large dataset, say, the sales figures for all
products in a quarter. Instead of looking at every single sale, measures of
central tendency provide a snapshot—a single number that summarizes
the entire distribution. This summary value acts as a benchmark, helping
us understand the overall performance, typical customer behavior, or
average market response.
• These measures are fundamental because they simplify complex data,
making it easier to interpret and communicate findings. Without them,
we'd be drowning in raw numbers, unable to discern patterns or draw
meaningful conclusions. Central tendency measures are the first step in
any data analysis, laying the groundwork for more advanced statistical
inferences. They help answer questions like: "What is the average income
of our target demographic?" or "What's the typical number of customer
complaints we receive per day?"

The Mean: Your Everyday Average

The arithmetic mean, often simply called the "average," is the most commonly used measure of central
tendency. It's calculated by summing all the values in a dataset and then dividing by the total number of
values. Mathematically, if you have a dataset with 'n' values (x₁, x₂, ..., xₙ), the mean (μ or x̄) is given by:

x̄ = (x₁ + x₂ + ... + xₙ) / n = (∑xᵢ) / n

The mean is intuitive and widely understood, making it an excellent choice for symmetric distributions
where data points are evenly distributed around the center. For instance, if you're calculating the average
height of students in a class, the mean would likely be a good representative value. It utilizes all data
points in its calculation, reflecting every value's contribution. However, this sensitivity also makes it
susceptible to the influence of outliers—extremely high or low values that can pull the mean away from
the true center of the bulk of the data. For example, a single very high salary in a company could
significantly inflate the 'average' salary for all employees, making it misleading.

Ungrouped Data/Raw Data
The arithmetic mean is the sum of the numerical values of all observations divided by the total number of
observations. Symbolically, it can be represented as:

X̄ = ∑Xi / N

where ∑Xi is the sum of the values of all the observations and N is the total number of observations.

Question 1

For example, consider the monthly salaries (Rs.) of 10 employees of a firm:
2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400

Solution 1
If we compute the arithmetic mean, then
Sum = 2500 + 2700 + 2400 + 2300 + 2550 + 2650 + 2750 + 2450 + 2600 + 2400 = 25300
Mean = 25300 / 10 = Rs. 2530
Therefore, the average monthly salary is Rs. 2530.
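
The arithmetic in Solution 1 can be checked with a few lines of Python. This is a minimal sketch; the function name `arithmetic_mean` is an illustrative choice, not part of the module.

```python
# Monthly salaries (Rs.) of the 10 employees from Question 1
salaries = [2500, 2700, 2400, 2300, 2550, 2650, 2750, 2450, 2600, 2400]


def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    if not values:
        raise ValueError("mean of an empty dataset is undefined")
    return sum(values) / len(values)


total = sum(salaries)
mean_salary = arithmetic_mean(salaries)
print(total, mean_salary)  # 25300 2530.0
```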

Discrete Data
When the observations are classified into a frequency distribution of discrete values, the arithmetic
mean is defined as

X̄ = ∑fx / N

where f is the frequency of the corresponding value x and N is the total frequency, i.e. N = Σf.
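
As a rough illustration of the frequency-based mean, the sketch below multiplies each value by its frequency and divides by the total frequency. The data and the function name `frequency_mean` are hypothetical; they are not the data from Question 2.

```python
def frequency_mean(values, freqs):
    """Mean of a discrete frequency distribution: sum(f * x) / sum(f)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    total_frequency = sum(freqs)  # N = sum of f
    weighted_sum = sum(f * x for x, f in zip(values, freqs))  # sum of f * x
    return weighted_sum / total_frequency


# Hypothetical data for illustration only
x = [10, 20, 30]
f = [2, 5, 3]
result = frequency_mean(x, f)
print(result)  # (2*10 + 5*20 + 3*30) / 10 = 21.0
```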

Question 2

Solution 2

Mean = 8270 / 200 = 41.35

Continuous Data
When the observations are classified into a frequency distribution with class intervals (grouped data), the
arithmetic mean is defined as

X̄ = ∑fX / N

where X is the midpoint of each class, f is the frequency of the corresponding class and N is the total frequency,
i.e. N = Σf.
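
For grouped data the only extra step is replacing each class by its midpoint. The sketch below assumes classes are given as (lower, upper) pairs; the class limits, frequencies, and the name `grouped_mean` are hypothetical and are not the data from Question 3.

```python
def grouped_mean(class_limits, freqs):
    """Mean of grouped data, using each class midpoint as its representative value."""
    if len(class_limits) != len(freqs):
        raise ValueError("class_limits and freqs must have the same length")
    midpoints = [(lower + upper) / 2 for lower, upper in class_limits]
    total_frequency = sum(freqs)  # N = sum of f
    return sum(f * x for x, f in zip(midpoints, freqs)) / total_frequency


# Hypothetical classes and frequencies for illustration only
classes = [(0, 10), (10, 20), (20, 30)]
frequencies = [4, 10, 6]
result = grouped_mean(classes, frequencies)
print(result)  # midpoints 5, 15, 25 -> (20 + 150 + 150) / 20 = 16.0
```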

Question 3
This method is illustrated for the following data, which relate to the monthly sales of 200 firms. The midpoint
of each class interval is treated as the representative value of that class.

Solution 3

Mean = 102000 / 200 = 510

When to Use the Weighted Mean
While the arithmetic mean treats every data point equally, there are situations where some values
inherently carry more importance or occur with greater frequency than others. This is where the weighted
mean becomes indispensable. The weighted mean assigns different "weights" to each data point,
reflecting its relative significance.
The formula for the weighted mean is:

X̄w = ∑wX / ∑w

where w is the weight attached to each value X.

Use weighted mean when:


• Some values are more important than others.
• Data is collected in groups or categories.
• You're averaging values with different frequencies or significance.
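
A weighted mean can be sketched as the sum of weight times value, divided by the sum of the weights. The scores, weights, and the name `weighted_mean` below are hypothetical and only illustrate the calculation.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: three scores with weights 2, 3 and 5
scores = [80, 90, 70]
weights = [2, 3, 5]
weighted = weighted_mean(scores, weights)
print(weighted)  # (160 + 270 + 350) / 10 = 78.0

# With equal weights the result reduces to the ordinary arithmetic mean
equal = weighted_mean(scores, [1, 1, 1])
print(equal)  # 240 / 3 = 80.0
```

The score with the largest weight (70) pulls the weighted result below the unweighted mean of 80.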

Question 4

Solution 4

The Median: The Middle Ground for Skewed Data
The median is the middle value in a dataset that has been ordered from least to greatest. Unlike the mean, the
median is not affected by extreme outliers, making it a robust measure of central tendency, especially for skewed
distributions. A dataset is skewed when its distribution is not symmetrical, meaning a significant number of data
points are concentrated on one side of the distribution, with a "tail" extending to the other side.
To find the median:

• Odd Number of Data Points: Arrange the data in ascending order. The median is the exact middle value. For
example, in {1, 3, 5, 8, 9}, the median is 5.

• Even Number of Data Points: Arrange the data in ascending order. The median is the average of the two middle
values. For example, in {2, 4, 7, 10}, the median is (4+7)/2 = 5.5.
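
The two rules above can be sketched in one small function. The name `median` is an illustrative choice; Python's standard `statistics.median` gives the same results and is used here as a cross-check.

```python
import statistics


def median(values):
    """Middle value of the sorted data; average of the two middle values if the count is even."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd count: exact middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average of the two middle values


odd_result = median([1, 3, 5, 8, 9])
even_result = median([2, 4, 7, 10])
print(odd_result, even_result)  # 5 5.5

# Cross-check against the standard library
print(statistics.median([2, 4, 7, 10]))  # 5.5
```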


Consider a dataset of household incomes in a city. A few billionaires could drastically pull up the mean
income, making it seem as if the "average" citizen is much wealthier than they are. In such a scenario,
the median income would provide a much more realistic representation of what the typical household
earns, as it simply finds the income level that separates the wealthier half from the poorer half,
regardless of how extreme the highest incomes are. The median is particularly useful in fields like real
estate (median home prices), economics (median income), and social sciences where data often exhibit
skewness.

DISCRETE SERIES

• First compute the cumulative frequencies, then locate the ((N+1)/2)th observation in the cumulative
frequency column. The value of x corresponding to it is the median.

Question 5

Solution 5

N = 200

(N+1)/2 = 100.5

Median = 40
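
The cumulative-frequency lookup for a discrete series can be sketched as follows. The data and the name `discrete_median` are hypothetical (they are not the data from Question 5), and the function follows the simple rule of taking the first value whose cumulative frequency reaches the (N+1)/2 position.

```python
from itertools import accumulate


def discrete_median(values, freqs):
    """Median of a discrete series: first x whose cumulative frequency reaches (N + 1) / 2."""
    pairs = sorted(zip(values, freqs))  # make sure x is in ascending order
    n = sum(f for _, f in pairs)  # N = total frequency
    position = (n + 1) / 2
    cumulative = accumulate(f for _, f in pairs)
    for (x, _), cf in zip(pairs, cumulative):
        if cf >= position:
            return x
    raise ValueError("frequencies must sum to a positive number")


# Hypothetical data for illustration only
x = [10, 20, 30, 40]
f = [3, 5, 8, 4]
result = discrete_median(x, f)
print(result)  # N = 20, position 10.5, cumulative frequencies 3, 8, 16, 20 -> 30
```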

CONTINUOUS DATA
• For continuous data, first compute the cumulative frequencies, then locate the ((N+1)/2)th observation in the
cumulative frequency column. The class interval containing it is the median class. The following formula
may be used to locate the value of the median:

Median = l1 + ((N/2 - cf) / f) × h

where l1 is the lower limit of the median class, cf is the cumulative frequency of the class preceding
the median class, f is the frequency of the median class, h is the width of the median class and N is
the total frequency.

Question 6
Consider the following data which relate to the age distribution of 1000 workers in an industrial
establishment. The location of median value is facilitated by the use of a cumulative frequency
distribution as shown below in the table.

Solution 6

N = 1000, so (N+1)/2 = (1000+1)/2 = 500.5. The 500.5th observation falls in the class 35-40, which is the median class.
l1 = 35, cf = 425, f = 160, N/2 = 500, h = 5
Median = 35 + {(500 - 425) × 5} / 160 = 35 + 375/160 = 35 + 2.34 = 37.34
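
The computation in Solution 6 can be checked with a short sketch. It assumes the usual interpolation formula, Median = l1 + ((N/2 - cf) / f) × h, which matches the arithmetic shown in the solution; the function name `grouped_median` is an illustrative choice.

```python
def grouped_median(l1, cf, f, n, h):
    """Interpolated median for grouped data.

    l1: lower limit of the median class
    cf: cumulative frequency of the class preceding the median class
    f:  frequency of the median class
    n:  total frequency
    h:  width of the median class
    """
    return l1 + ((n / 2 - cf) / f) * h


# Values taken from Solution 6
result = grouped_median(l1=35, cf=425, f=160, n=1000, h=5)
print(round(result, 2))  # 37.34
```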
Mean vs. Median: When to Choose Which?
Choose the Mean When:
• The data is symmetrically distributed (e.g., normal distribution).
• You need a measure that includes all data points in its calculation.
• Further statistical analysis requiring algebraic manipulation is planned.
• The dataset does not contain significant outliers.

Choose the Median When:
• The data is skewed (e.g., income, house prices, test scores).
• The dataset contains outliers that might distort the mean.
• You are dealing with ordinal data (data that can be ordered).
• You want a measure that truly represents the "typical" value, unaffected by extremes.

Understanding the strengths and weaknesses of both the mean and the median is critical for accurate data
interpretation. Misapplying these measures can lead to flawed conclusions and poor decision-making.
Always visualize your data (e.g., with a histogram) before deciding which measure of central tendency is
most appropriate. This visual inspection can quickly reveal skewness or the presence of outliers, guiding
your choice.
Practical Example: Analyzing Employee Salaries
Imagine you're the HR manager for a small startup, and you need to report on the typical employee salary. Here's a hypothetical dataset of annual salaries (in USD) for 10 employees:

• $50,000
• $55,000
• $60,000
• $62,000
• $65,000
• $68,000
• $70,000
• $75,000
• $80,000
• $500,000 (CEO's salary)

Let's calculate both the mean and the median:

Mean Salary: Sum of salaries = $1,085,000. Number of employees = 10. Mean = $1,085,000 / 10 = $108,500.

Median Salary: Ordered salaries: $50K, $55K, $60K, $62K, $65K, $68K, $70K, $75K, $80K, $500K. Since there is an even number of values (10), the median is the average of the 5th and 6th values: Median = ($65,000 + $68,000) / 2 = $66,500.

In this case, the mean salary ($108,500) is significantly higher than the median salary ($66,500). This large difference is due to the single outlier: the CEO's $500,000 salary. If you
were to state that the "average" employee earns $108,500, it would be misleading, as most employees earn far less. The median, at $66,500, provides a much more accurate
representation of the typical salary for an employee at this startup, as it is unaffected by the CEO's exceptionally high income. This highlights why the median is often preferred for
financial data like salaries or property values, which tend to be positively skewed.
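
The effect of the outlier can be seen directly by recomputing both measures with Python's standard `statistics` module. This is a small sketch using the salary list from the example; the comparison without the CEO's salary is an added illustration, not part of the original example.

```python
import statistics

# Annual salaries (USD) from the example; the last value is the CEO's salary
salaries = [50_000, 55_000, 60_000, 62_000, 65_000,
            68_000, 70_000, 75_000, 80_000, 500_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
print(f"mean = {mean_salary:,.0f}, median = {median_salary:,.0f}")
# mean = 108,500, median = 66,500

# Dropping the single outlier shows how strongly it pulls the mean
without_ceo = salaries[:-1]
mean_without = statistics.mean(without_ceo)
median_without = statistics.median(without_ceo)
print(f"mean = {mean_without:,.0f}, median = {median_without:,.0f}")
# mean = 65,000, median = 65,000
```

Removing one value moves the mean by more than $40,000 while the median shifts by only $1,500, which is the robustness the example describes.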
Practical Example: Sales Performance Analysis
Consider a sales manager evaluating monthly sales performance for 12
different product lines over a quarter. The sales data (in thousands of
USD) for a specific month is as follows:

• $15, $18, $20, $22, $25, $28, $30, $32, $35, $40, $45, $150
(a newly popular product)

Mean Monthly Sales: Sum of sales = $460,000. Number of product lines = 12.
Mean = $460,000 / 12 = $38,333.33

Median Monthly Sales: Ordered sales: $15, $18, $20, $22, $25, $28,
$30, $32, $35, $40, $45, $150. The two middle values are $28K and
$30K. Median = ($28,000 + $30,000) / 2 = $29,000

Similar to the salary example, the mean sales ($38,333) is higher than the
median ($29,000) due to the single product line with $150,000 in sales, which
acts as a positive outlier. If the manager reports only the mean, it might create
an overly optimistic view of typical product line performance. The median
provides a more grounded understanding of what most product lines are
actually generating in sales. This is crucial for resource allocation and setting
realistic targets. For example, if the company bases its inventory decisions on
the mean, it might overstock for most products, leading to excess carrying
costs. Using the median provides a more stable and representative figure for
day-to-day operational planning.
Assignment

Key Takeaways & Next Steps

Central Tendency Defined


Measures of central tendency summarize a dataset with a single, representative value, providing insights into its typical value.

Mean vs. Median


The mean is the average (sum/count), best for symmetric data. The median is the middle value, robust against outliers
and ideal for skewed distributions.

Practical Applications
Always consider the data's distribution (skewness, outliers) before choosing between mean and median for accurate
business insights.
Your understanding of these fundamental measures is a vital step in becoming a data-savvy business professional. In the next
module, we will delve into Measures of Dispersion, which will complement your understanding of central tendency by showing
how spread out your data is.
Action Item: Practice calculating mean and median on various datasets. Seek out real-world business reports and identify which
measure of central tendency they use and why.