BCA 3rd sem.
Statistical methods
UNIT-1
Introduction to Statistics
Statistics is a branch of mathematics that involves the collection, organization, analysis,
interpretation, and presentation of data. It provides tools and methods to understand and make
sense of numerical information, enabling informed decision-making in various fields.
Key Concepts of Statistics
1. Data: The raw facts and figures collected for analysis. Data can be:
o Qualitative (Categorical): Descriptive information (e.g., gender, color, type).
o Quantitative (Numerical): Numerical information (e.g., height, weight, age).
2. Population and Sample:
o Population: The entire group being studied.
o Sample: A subset of the population used for analysis.
3. Variable: A characteristic or property that can take different values.
o Independent Variable: The cause or factor being manipulated.
o Dependent Variable: The outcome or effect being measured.
4. Descriptive Statistics:
o Concerned with summarizing and describing data.
o Includes measures like:
Central tendency: Mean, median, mode.
Dispersion: Range, variance, standard deviation.
5. Inferential Statistics:
o Focuses on making predictions or inferences about a population based on a
sample.
o Involves hypothesis testing, confidence intervals, and regression analysis.
Importance of Statistics
1. Decision-Making: Facilitates better decision-making in business, government,
healthcare, etc.
2. Research: Vital for scientific studies to analyze experimental data.
3. Policy Making: Helps governments design policies based on population data.
4. Quality Control: Used in manufacturing to maintain product quality.
Collaboration with Other Fields
Statistics is a versatile tool used across various disciplines, such as:
1. Mathematics: Provides theoretical underpinnings for statistical methods.
2. Economics: Used for economic modeling, demand forecasting, and market analysis.
3. Biology and Medicine: Helps in clinical trials, genetics, and epidemiology.
4. Computer Science: Core to data science, artificial intelligence, and machine learning.
5. Social Sciences: Used for surveys, behavioral studies, and demographic analysis.
6. Engineering: Applied in quality assurance, process optimization, and risk analysis.
Statistics bridges these disciplines, providing a common framework for analyzing data, finding
patterns, and solving complex problems. Its interdisciplinary nature makes it an indispensable
tool in modern science and decision-making processes.
Data Collection, Tabulation, and Graphical Representation
Data collection, tabulation, and graphical representation are essential steps in the statistical
process. Below, we discuss each concept with detailed explanations, examples, and relevant
graphs.
1. Data Collection
Data collection is the process of gathering information for analysis. It can be categorized into
two types:
Types of Data Collection
1. Primary Data: Data collected firsthand for a specific purpose.
o Methods: Surveys, experiments, interviews, observations.
o Example: A survey to determine students' study hours.
2. Secondary Data: Data originally collected by someone else, often for a different purpose.
o Sources: Research articles, government reports, historical records.
o Example: Census data used for demographic studies.
Example of Data Collection
A study is conducted to understand the weekly study hours of students in a class of 10 students:
Data: [5, 7, 8, 4, 6, 10, 5, 7, 9, 6]
2. Data Tabulation
Tabulation involves organizing data into a structured format, typically in rows and columns, to
make it easier to analyze.
Steps in Tabulation
1. Determine the categories or classes.
2. Tally the data into respective categories.
3. Create a table with headings and frequencies.
Example: Frequency Distribution Table
Using the study hours data:
Study Hours (Range)    Frequency
4-5                    3
6-7                    4
8-9                    2
10-11                  1
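The tabulation steps can be sketched in a few lines of Python. This is only an illustration: the helper name `tabulate` and the inclusive class limits are assumptions chosen to match the table above.

```python
from collections import Counter

# Weekly study hours of the 10 students from the data collection example.
study_hours = [5, 7, 8, 4, 6, 10, 5, 7, 9, 6]

# Step 1: determine the classes (inclusive lower and upper limits).
classes = [(4, 5), (6, 7), (8, 9), (10, 11)]


def tabulate(data, classes):
    """Steps 2 and 3: tally each value into its class and return the table rows."""
    counts = Counter()
    for value in data:
        for low, high in classes:
            if low <= value <= high:
                counts[(low, high)] += 1
                break
    return [(c, counts[c]) for c in classes]


frequency_table = tabulate(study_hours, classes)
for (low, high), freq in frequency_table:
    print(f"{low}-{high}: {freq}")
```

The printed frequencies should match the table, and they add up to the 10 observations.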
3. Graphical Representation
Graphical representation transforms data into visual formats, making patterns and trends easier
to identify.
a. Bar Charts
A bar chart uses rectangular bars to represent data. The length of each bar corresponds to the
frequency of the category.
Example
Categories: Study-hour classes (4-5, 6-7, 8-9, 10-11)
Frequencies: [3, 4, 2, 1]
b. Histogram
A histogram represents the frequency of data using adjacent rectangular bars, particularly for
continuous data.
Example
Using the frequency distribution table:
X-axis: Study hours range (e.g., 4–5, 6–7).
Y-axis: Frequency.
c. Frequency Polygon
A frequency polygon connects midpoints of class intervals with straight lines, forming a closed
figure.
Steps to Draw:
1. Plot the midpoints of each class interval.
2. Connect the points with lines.
3. Extend the line to the baseline at both ends.
d. Ogive (Cumulative Frequency Graph)
An ogive is a graph representing cumulative frequencies, useful for determining medians and
percentiles.
Steps to Draw:
1. Calculate cumulative frequencies.
2. Plot cumulative frequencies against the upper boundary of each class.
3. Connect the points with a smooth curve.
Graphical Examples
Here are the graphical representations of the data:
1. Bar Chart: Displays the frequencies of study hours in distinct intervals.
2. Histogram: Shows the distribution of study hours with continuous data representation.
3. Frequency Polygon: Connects the midpoints of each class interval to visualize the
frequency distribution.
4. Ogive: Illustrates the cumulative frequency, helpful for finding medians and percentiles.
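The numbers behind the frequency polygon and the ogive can be computed as in the sketch below. It assumes the study-hours classes used earlier, and it assumes that the upper class boundary of an inclusive class such as 4-5 is taken as 5.5 (half a unit above the upper limit); other conventions exist.

```python
from itertools import accumulate

# Classes and frequencies from the study-hours frequency table.
classes = [(4, 5), (6, 7), (8, 9), (10, 11)]
frequencies = [3, 4, 2, 1]

# Frequency polygon: plot each frequency against the class midpoint.
midpoints = [(low + high) / 2 for low, high in classes]

# Ogive: plot cumulative frequency against the upper class boundary.
cumulative = list(accumulate(frequencies))
upper_boundaries = [high + 0.5 for _, high in classes]  # assumed convention

for mid, freq in zip(midpoints, frequencies):
    print(f"polygon point: ({mid}, {freq})")
for boundary, cum in zip(upper_boundaries, cumulative):
    print(f"ogive point:   ({boundary}, {cum})")
```

The last cumulative frequency equals the total number of observations, which is a quick check that the table was tallied correctly.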
Graphical presentation
Q1: Represent the data in a histogram:
Weekly Wages (Rs)    10-15   15-20   20-25   25-30   30-40   40-60   60-80
Number of workers    7       19      27      15      12      12      8
Q4: Draw a frequency polygon for the following:
Mid value    2.5   7.5   12.5   17.5   22.5   27.5
Frequency    3     5     10     15     7      5
Q5: Draw "less than" and "more than" ogive curves from the data given below:
Weekly Wages         0-20   20-40   40-60   60-80   80-100
Number of Workers    10     20      40      20      10
Q6: Draw a histogram from the following data:
Class interval    Frequency
0-10              5
10-20             11
20-30             19
30-40             16
40-50             10
50-60             8
60-70             3
UNIT-2
Measures of Central Tendency:
Measures of Central Tendency can be categorized into mathematical averages and positional
averages based on how they are calculated or identified.
1. Mathematical Averages
Mathematical averages are computed using arithmetic or algebraic methods. They are based on
mathematical operations and provide a numerical summary of the data.
Examples:
1. Arithmetic Mean (Mean):
$$\bar{X} = \frac{\sum X}{N}$$
2. Weighted Mean:
$$\bar{X}_w = \frac{\sum WX}{\sum W}$$
W = weight assigned to each data point.
Properties:
Easily influenced by extreme values (outliers).
Commonly used in quantitative data.
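As a small illustration of the two formulas, the sketch below computes the arithmetic mean of the study-hours data from Unit 1 and a weighted mean. The marks and credit weights are hypothetical values invented for the example, and the function names are illustrative.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by their number."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


study_hours = [5, 7, 8, 4, 6, 10, 5, 7, 9, 6]
mean_hours = arithmetic_mean(study_hours)

# Hypothetical marks in three subjects, weighted by hypothetical credits.
marks = [70, 80, 90]
credits = [2, 3, 5]
weighted_marks = weighted_mean(marks, credits)

print(mean_hours, weighted_marks)
```

With equal weights the weighted mean reduces to the arithmetic mean, which is one way to sanity-check the second function.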
2. Positional Averages
Positional averages are determined by the position of values in a sorted data set, rather
than by calculation.
Examples:
1. Median:
o The middle value when the data is arranged in ascending or descending order.
o If N is odd: Median = value at position (N+1)/2.
o If N is even: Median = average of the values at positions N/2 and (N/2)+1.
2. Mode:
o The most frequently occurring value in the data set.
o Positional in the sense that it depends on frequency distribution.
3. Percentiles, Quartiles, and Deciles:
o Percentiles: Divide the data into 100 equal parts (e.g., 50th percentile = median).
o Quartiles: Divide the data into 4 equal parts (e.g., Q2 = median).
o Deciles: Divide the data into 10 equal parts.
Properties:
Less affected by outliers (e.g., median).
Used for ordinal or skewed data.
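The positional averages can be tried out with Python's standard `statistics` module, as sketched below on the study-hours data from Unit 1. Note one assumption: `statistics.quantiles` uses its own interpolation rule for quartiles, so Q1 and Q3 may differ slightly from values obtained with a textbook formula.

```python
import statistics

study_hours = [5, 7, 8, 4, 6, 10, 5, 7, 9, 6]

# Median: N = 10 is even, so it is the average of the 5th and 6th sorted values.
median_hours = statistics.median(study_hours)

# Mode: multimode returns every value tied for the highest frequency.
modes = statistics.multimode(study_hours)

# Quartiles: three cut points that divide the sorted data into 4 parts.
q1, q2, q3 = statistics.quantiles(study_hours, n=4)

print(median_hours, sorted(modes), q2)
```

The second quartile coincides with the median, as stated in the list above.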
Some numerical problems for practice:
Q11: The following are the marks of 180 students in statistics. Find the mean:
Marks (more than)    0     10    20    30    40    50    60
Frequency            180   170   150   120   70    30    5
Q12: From the data given below, find out:
(1) Which factory pays a larger amount as daily wages?
(2) What is the average daily wage for the workers of the two factories combined?
Factory                A      B
No. of wage earners    250    200
Average daily wages    2.0    2.5
Q13: Find the missing frequency if the mean marks is 16.82:
Marks        0-5   5-10   10-15   15-20   20-25   25-30   30-35
Frequency    10    12     16      ---     14      10      8
Q14: Find the missing frequencies of the following data if the mean is 1.46:
Number of accidents    0    1     2     3    4    5   Total
Frequency              46   ---   ---   25   10   5   200
Q15 : Determine the Median of the following data :
Daily Wages 10 5 7 11 8
Workers 15 20 15 18 12
Q16: From the following data, calculate the median:
Value        0-4   5-9   10-19   20-29   30-39   40-49   50-59   60-69
Frequency    328   350   720     664     598     524     378     244
Q17: Calculate the median:
Age                  55-60   50-55   45-50   40-45   35-40   30-35   25-30   20-25
Number of persons    7       13      15      20      30      33      28      14
Q18: Find the median:
Age (below)       10   20   30   40   50   60   70     80
No. of persons    2    5    9    12   14   15   15.5   15.6
Q19: Calculate the median:
Size (more than)    50   40   30   20    10
Frequency           0    40   98   123   165
Q20: Compute the median of the following:
Mid-values    115   125   135   145   155   165   175   185   195
F             6     25    48    72    116   60    38    22    3
Q21: In the frequency distribution of 100 families, find the missing frequencies if the median is
known to be 50:
Expenditure           0-20   20-40   40-60   60-80   80-100
Number of families    14     ---     27      ---     15
Q22: Find the quartiles, the 7th decile (D7), and the 35th percentile (P35):
Overtime Hours     10-15   15-20   20-25   25-30   30-35   35-40
No. of families    11      20      35      20      8       6
Q23: Calculate the MODE from the following data :
Marks 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90
FRE. 2 17 7 18 6 18 4 8
Q24 : Calculate MODE :
Marks 0-5 5-10 10-15 15-20 20-25 25-30 30-35 35-40 40-45
FREQUENCY 3 5 10 12 19 18 16 8 3
Q25 :Modal marks for a group of 94 students are 54 . Find the missing frequency :
Marks 0-20 20-40 40-60 60-80 80-100
FRE. 10 ------ 30 ----- 14
Dispersion:
Q26: Find the range, interquartile range, quartile deviation, and coefficient of quartile
deviation:
Pocket expenditure    20   22   38   42   18   24
No. of students       2    3    7    5    10   3
Q27: A teacher asked the students to complete 60 pages of a record note book. Eight students
have completed only 32, 35, 37, 30, 33, 36, 35 and 37 pages. Find the standard deviation of the
pages yet to be completed by them.(Ans-2.34)
Q28: Find the variance and standard deviation of the wages of 9 workers given below: ₹310,
₹290, ₹320, ₹280, ₹300, ₹290, ₹320, ₹310, ₹280.(Ans-SD 14.91)
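The quoted answer to Q28 can be checked with a short sketch. It assumes the population form of the variance (dividing by N rather than N - 1), which is the form that reproduces the stated answer.

```python
import statistics

# Wages of the 9 workers from Q28 (in rupees).
wages = [310, 290, 320, 280, 300, 290, 320, 310, 280]

# Population variance: mean of the squared deviations from the mean.
variance = statistics.pvariance(wages)

# Standard deviation: square root of the population variance.
std_dev = statistics.pstdev(wages)

print(round(variance, 2), round(std_dev, 2))
```

Using `statistics.variance` and `statistics.stdev` instead would divide by N - 1 and give slightly larger values.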
Q29: The rainfall recorded in various places of five districts in a week is given below.
Rainfall (in mm)    45   50   55   60   65   70
Number of places    5    13   4    9    5    4
Find its standard deviation. (Ans-7.76)
Q30: In a study about viral fever, the number of people affected in a town was noted as
Age in years        0-10   10-20   20-30   30-40   40-50   50-60   60-70
Number of people    3      5       16      18      12      7       4
Find its standard deviation. (Ans-14.6)
Q31: The following table gives the values of mean and variance of heights and weights of the 10th
standard students of a school.
            Height       Weight
Mean        155 cm       46.5 kg
Variance    72.25 cm²    28.09 kg²
Which is more varying than the other?
(Ans: C.V1 = 5.48% and C.V2 = 11.40%)
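The coefficient of variation used in the answer to Q31 is not defined in these notes. Assuming the usual definition, C.V. = (standard deviation / mean) × 100, the answer can be reproduced as follows; the function name is illustrative.

```python
import math


def coefficient_of_variation(mean, variance):
    """C.V. in percent, assuming C.V. = (standard deviation / mean) * 100."""
    return math.sqrt(variance) / mean * 100


cv_height = coefficient_of_variation(155, 72.25)   # heights, in cm
cv_weight = coefficient_of_variation(46.5, 28.09)  # weights, in kg

print(round(cv_height, 2), round(cv_weight, 2))
```

Because the C.V. is unit-free, it allows heights in cm and weights in kg to be compared; the larger value indicates the more variable series.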
UNIT-3
Correlation analysis:
Correlation analysis is a statistical technique used to measure and interpret the strength and
direction of the relationship between two quantitative variables. The result of correlation
analysis is expressed as a correlation coefficient, typically ranging between -1 and +1.
Positive correlation: Indicates that as one variable increases, the other tends to increase
as well.
Negative correlation: Indicates that as one variable increases, the other tends to
decrease.
Zero correlation: Indicates no linear relationship between the variables.
Rank Correlation
Rank correlation measures the relationship between two variables based on their ranked
(ordinal) values rather than their raw data. It is often used when data is not normally distributed
or is ordinal in nature.
Spearman's Rank Correlation Coefficient (ρ or Rs):
This is the most common measure of rank correlation. It is calculated as:
$$R_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where:
d_i: Difference between the ranks of corresponding values
n: Number of observations
Key Points:
Rs ranges between -1 and +1.
Rs=+1: Perfect positive rank correlation.
Rs=−1: Perfect negative rank correlation.
Rs=0: No correlation.
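A minimal sketch of the rank correlation calculation is given below. It assumes the standard formula Rs = 1 - 6∑d²/(n(n² - 1)) with no tied ranks; the two judges' rankings are hypothetical data invented for the example.

```python
def spearman_rank_correlation(ranks_x, ranks_y):
    """Spearman's coefficient from two lists of ranks (assumes no ties)."""
    n = len(ranks_x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical ranks given to 5 contestants by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rs = spearman_rank_correlation(judge_a, judge_b)
print(round(rs, 4))
```

Identical rankings give +1 and exactly reversed rankings give -1, matching the key points above.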
Karl Pearson's Coefficient of Correlation
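Karl Pearson’s correlation coefficient (denoted by r) measures the degree of linear relationship
between two continuous variables. It is calculated as:
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$
Where:
x_i, y_i: Individual data points for variables x and y
x̄, ȳ: Means of x and y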
Key Points:
r ranges between -1 and +1.
r=+1: Perfect positive linear relationship.
r=−1: Perfect negative linear relationship.
r=0: No linear relationship.
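The sketch below computes Pearson's r from deviations about the means, assuming the standard product-moment formula. The data are the advertisement expenses and sales volumes from the first Karl Pearson practice question later in this unit, whose quoted answer is .786.

```python
import math


def pearson_r(x, y):
    """Pearson's r computed from deviations about the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


advertisement = [11, 13, 14, 16, 16, 15, 15, 14, 13, 13]  # Rs. in lakhs
sales = [50, 50, 55, 60, 65, 65, 65, 60, 60, 50]          # Rs. in lakhs

r = pearson_r(advertisement, sales)
print(round(r, 4))
```

Swapping the two arguments gives the same value, which illustrates that correlation is symmetric.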
Properties of Correlation
1. Range: The correlation coefficient (r) always lies between -1 and +1.
2. Direction:
o Positive values of r indicate a positive relationship.
o Negative values of r indicate a negative relationship.
3. No Units: Correlation is a dimensionless measure, meaning it has no units.
4. Symmetry: The correlation between X and Y is the same as between Y and X.
5. Linearity: Pearson’s correlation measures only linear relationships.
6. Independence: A correlation coefficient of 0 implies no linear relationship, but it does
not guarantee independence.
7. Affected by Outliers: Extreme values can significantly impact the correlation coefficient.
8. Correlation ≠ Causation: A high correlation does not imply that one variable causes
changes in the other.
Linear Regression Analysis
Linear regression is a statistical method used to model the relationship between a
dependent variable (y) and one or more independent variables (x). In its simplest form, it
models a linear relationship between two variables.
Simple Linear Regression
The model is: y=a+bx
Where:
y: Dependent variable
x: Independent variable
a: Intercept (the value of y when x=0)
b: Slope or regression coefficient (rate of change of y for a one-unit change in x)
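The slope (b) and intercept (a) are estimated using the least squares method, which
minimizes the sum of squared residuals:
$$S = \sum (y_i - a - b x_i)^2$$
Equation for Coefficients
Slope (b):
$$b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
Intercept (a):
$$a = \bar{y} - b\bar{x}$$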
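A minimal sketch of a least squares fit for y = a + bx follows. It assumes the standard closed-form estimates, b = ∑(x - x̄)(y - ȳ) / ∑(x - x̄)² and a = ȳ - b·x̄; the five data points are hypothetical and the function name is illustrative.

```python
def least_squares_fit(x, y):
    """Return (a, b) for the fitted line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through (mean_x, mean_y)
    return a, b


# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares_fit(x, y)
predicted_at_6 = a + b * 6  # prediction for a new value x = 6

print(round(a, 2), round(b, 2), round(predicted_at_6, 2))
```

Once a and b are estimated, a prediction is obtained by substituting a new x into the fitted equation.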
Relationship Between Correlation and Regression
Correlation and regression are closely related statistical tools, but they serve different purposes
and have distinct interpretations. Below is a detailed comparison and explanation of their
relationship:
1. Basis of Relationship
Correlation measures the strength and direction of a linear relationship between two
variables. It is a unitless measure that ranges from -1 to +1.
Regression quantifies the relationship by providing an equation to predict one variable
(dependent) based on another variable (independent).
2. Mathematical Relationship
The slope (b) of the regression line and the correlation coefficient (r) are related as:
$$b = r \cdot \frac{\sigma_y}{\sigma_x}$$
Where:
b: Slope of the regression line
r: Pearson’s correlation coefficient
σx: Standard deviation of the independent variable x
σy: Standard deviation of the dependent variable y
Key Insights:
The slope (b) depends on the scale (units) of x and y, while r is dimensionless.
If r=0 then b=0, meaning no linear relationship exists between x and y.
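The relation between the slope and the correlation coefficient can be checked numerically. The sketch below uses hypothetical data, computes the slope directly from the least squares formula, and compares it with r·σy/σx. Population standard deviations (dividing by n) are assumed; the ratio σy/σx is the same if both use n - 1.

```python
import math

# Hypothetical data points.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

r = sxy / math.sqrt(sxx * syy)   # Pearson's correlation coefficient
sigma_x = math.sqrt(sxx / n)     # standard deviation of x
sigma_y = math.sqrt(syy / n)     # standard deviation of y

b_direct = sxy / sxx             # slope of y on x from least squares
b_from_r = r * sigma_y / sigma_x # slope of y on x from the relation above
b_x_on_y = sxy / syy             # slope of the regression of x on y

print(round(b_direct, 4), round(b_from_r, 4), round(b_x_on_y, 4))
```

The two slopes of y on x agree. The slope of x on y is a different number, and the product of the two regression slopes equals r².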
3. Direction
Both correlation and regression coefficients (b) share the same sign:
o Positive (r>0,b>0): Indicates a direct relationship (as x increases, y increases).
o Negative (r<0,b<0): Indicates an inverse relationship (as x increases, y decreases).
4. Symmetry
Correlation is symmetric: The correlation of x with y is the same as y with x.
Regression is not symmetric: The regression coefficient of y on x is generally not equal
to the regression coefficient of x on y.
5. Interpretation
Correlation quantifies the degree of association between two variables without implying
causation.
Regression provides a predictive model that assumes x influences y (but still does not
imply causation).
6. Special Cases
1. Perfect Correlation (r=±1):
o The regression line perfectly fits all data points.
o The slope (b) equals ±σy/σx, the ratio of the standard deviations with the sign of r.
2. Zero Correlation (r=0):
o No linear relationship exists between x and y.
o The regression line is horizontal (slope b=0).
7. Similarities
Both assume a linear relationship between the variables.
Both are affected by outliers.
Both measure the association between two variables, but regression also predicts
values.
Q31: From the following information, find the correlation coefficient between
advertisement expenses and sales volume using Karl Pearson’s coefficient of
correlation method.
Firm                                 1    2    3    4    5    6    7    8    9    10
Advertisement Exp. (Rs. in Lakhs)    11   13   14   16   16   15   15   14   13   13
Sales Volume (Rs. in Lakhs)          50   50   55   60   65   65   65   60   60   50
(Ans-.786)
Q32: Find the correlation coefficient between age and playing habits of the
following students using Karl Pearson’s coefficient of correlation method.
(Ans-.99)
Q33: Find Karl Pearson’s coefficient of correlation between capital employed and
profit obtained from the following data.
(Ans=.85)
Q34:
(Ans-: - .4311)
Q35:
(Ans-7.5)
Q36:
(rs=.7833)
Q37:
(Ans-.75)
Q38:
(ANS- X = -1.5 + 0.8256Y, Y = 3.095 + 1.127X)
Q39:
(ANS- 3)
Q40:
(ANS- X = 3.2406 + 0.9815Y, Y = -3.1784 + 1.0057X)
Q41:
(ANS- 22,94,160 units)
UNIT-4
INDEX NUMBERS:
Index Number Definition:
Index numbers are an essential tool in statistics. An index number measures the change in a
variable, or in a group of related variables, over a specified period relative to a base period.
It shows a general relative change rather than a directly measurable figure, and it is
expressed as a percentage.
The sections below cover the importance, characteristics, types, uses, advantages, and
limitations of index numbers.
Importance of Index Number
Index numbers are most commonly used to study the economic status of a particular region. An
index number states the level of a variable relative to its level in a chosen base period. It
therefore serves as a measure of changes whose effects cannot be measured or estimated
directly.
Index numbers matter because they measure the extent of economic change over a stipulated
period.
Features and Characteristics of Index Numbers
The main features of index numbers are as follows:
It is a special category of average for measuring relative changes in such instances
where absolute measurement cannot be undertaken
An index number shows only approximate changes in factors that may not be directly
measured. It gives a general idea of the relative changes
The method of measurement varies from one variable to another
It helps compare the level of a phenomenon on a specific date with its level on a
previous date
It is a special case of an average, in particular a weighted average
Index numbers have universal utility. The index that is used to ascertain the changes in
price can also be used for industrial and agricultural production.
Types of Index Numbers
There are several types of index numbers, each suited to a particular use. The main types are
described below.
Value Index
A value index number is the ratio of the aggregate value for a particular period to the
aggregate value in the base period. The value index is used for inventories, sales, and
foreign trade, among others.
Quantity Index
A quantity index number is used to measure changes in the volume or quantity of goods that
are produced, consumed, and sold within a stipulated period. It shows the relative change
across a period for particular quantities of goods. Index of Industrial Production (IIP) is an
example of Quantity Index.
Price Index
A price index number measures how prices change over a period. It indicates a relative
value, not an absolute value. The Consumer Price Index (CPI) and Wholesale Price
Index (WPI) are major examples of a price index.
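These notes do not state the price index formulas needed for the practice questions at the end of this unit. The sketch below assumes the standard textbook definitions: Laspeyre's index weights prices by base-year quantities, Paasche's index weights them by current-year quantities, and Fisher's ideal index is the geometric mean of the two. All prices and quantities are hypothetical.

```python
import math

# Hypothetical data for two commodities.
p0 = [10, 20]  # base-year prices
q0 = [5, 2]    # base-year quantities
p1 = [12, 25]  # current-year prices
q1 = [6, 3]    # current-year quantities


def aggregate(prices, quantities):
    """Sum of price * quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


# Laspeyre's index: base-year quantities as weights.
laspeyres = aggregate(p1, q0) / aggregate(p0, q0) * 100

# Paasche's index: current-year quantities as weights.
paasche = aggregate(p1, q1) / aggregate(p0, q1) * 100

# Fisher's ideal index: geometric mean of the two.
fisher = math.sqrt(laspeyres * paasche)

print(round(laspeyres, 2), round(paasche, 2), round(fisher, 2))
```

Because Fisher's index is a geometric mean, it always lies between the Laspeyre's and Paasche's values.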
Uses of Index Number in Statistics
Having covered the features and types of index numbers, we now turn to their uses.
Index numbers are used in studies ranging from basic to complex. For example, they are used
in basic studies of a country's human population and in estimating the extinction rate of
rare animals in a particular region. The main uses are:
It helps in measuring changes in the standard of living as well as the price level.
Wage rate regulation is consistent with the changes in the price level. Once price levels
are determined, wage rates may be revised.
Government policies are framed with reference to price index numbers. Fiscal and
economic policies aimed at price stability rely on them.
It gives a pointer for international comparison of different economic variables,
for instance, living standards between two countries.
Advantages of Index Number
The advantages of index numbers follow directly from their uses. The main advantages are
summarized below:
Index numbers adjust primary data for changing prices, which is useful for deflating. This
makes it possible to convert nominal wages into real wages.
Index numbers find extensive usage in economics and help in the framing of appropriate
policies. Such findings also support research.
They help in trend analysis, such as isolating the effects of irregular and cyclical
forces.
Index numbers can be used to anticipate future economic activity. Time series analysis of
index numbers is used to determine trends and cyclical developments.
They are useful in measuring changes in the standard of living in different countries over
a given period.
Limitations of Index Number
Despite these advantages, index numbers have limitations. They are as follows:
Index numbers are based on samples, so they are subject to error. The samples are chosen
deliberately, which creates room for error. Errors can also arise in the choice of
weights, base period, and so on.
An index number is always calculated from a selected set of items. The items selected may
not reflect current trends, which leads to inaccurate analysis.
Multiple methods can be used to construct index numbers. Different methods may give
different values, which can lead to confusion.
Index numbers give only approximate indications of relative changes. Moreover,
comparisons of variables over a prolonged time may be less reliable.
The selection of representative commodities may be skewed, because these commodities
are chosen by sampling.
Q42: Calculate the price index number for 2005 by (a) Laspeyre’s method (b) Paasche’s method
(c) Marshall-Edgeworth method (d) Fisher’s ideal index number
Laspeyre’s IN = 144.8 Paasche’s IN = 144.4
Q43: Calculate by a suitable method, the index number of price from the following data:
Laspeyre’s IN = 228.2 Paasche’s IN = 225.4
Q44:Compute (i) Laspeyre’s (ii) Paasche’s (iii) Fisher’s Index numbers for the 2010 from the
following data.
Ans: Laspeyre’s IN = 106.6 Paasche’s IN = 106.8 Fisher’s IN= 106.7
Q45:Using the following data, construct Fisher’s Ideal index?
Fisher’s IN = 138.5
Q46:Using Fisher’s Ideal Formula, compute price index number for 1999 with 1996 as base year,
given the following:
Fisher’s IN = 8.
UNIT-5
Fundamentals of Probability: Concepts and Applications
What is Probability?
Probability is a branch of mathematics that deals with quantifying uncertainty. It measures the
likelihood of an event occurring and is expressed as a number between 0 and 1:
0: The event is impossible.
1: The event is certain.
Formula for Probability
$$P(A) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$$
Where:
P(A): Probability of event A
Favorable outcomes: Outcomes in which A occurs
Total outcomes: All possible outcomes in the sample space
Key Concepts in Probability
1. Experiment:
o A procedure or process with well-defined outcomes.
o Example: Tossing a coin or rolling a die.
2. Sample Space (S):
o The set of all possible outcomes of an experiment.
o Example: For a coin toss, S={H,T}
3. Event (A):
o A subset of the sample space.
o Example: Getting heads when tossing a coin (A={H}).
4. Mutually Exclusive Events:
o Events that cannot occur simultaneously.
o Example: Rolling a 3 and rolling a 4 on a single die are mutually exclusive.
5. Independent Events:
o Events where the occurrence of one does not affect the other.
o Example: Tossing two coins.
6. Complementary Events:
o The complement of an event A (Ac) is the event that A does not occur.
o Example: If A={H}, then Ac={T}.
Basic Rules of Probability
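1. Addition Rule:
o For mutually exclusive events: $P(A \cup B) = P(A) + P(B)$
o For any two events: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
2. Multiplication Rule:
o For independent events: $P(A \cap B) = P(A) \cdot P(B)$
3. Complement Rule:
o $P(A^c) = 1 - P(A)$
4. Conditional Probability:
o The probability of A given B: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$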
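The basic rules can be illustrated by counting outcomes for one roll of a fair die. This is a sketch under the assumption of equally likely outcomes; it uses the standard forms of the rules, P(A ∪ B) = P(A) + P(B) - P(A ∩ B), P(Aᶜ) = 1 - P(A), and P(A | B) = P(A ∩ B) / P(B). The events and the helper `prob` are chosen only for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Favorable outcomes divided by total outcomes (equally likely case)."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}  # an even number
B = {4, 5, 6}  # a number greater than 3
C = {3}        # a three
D = {4}        # a four (C and D are mutually exclusive)

p_c_or_d = prob(C) + prob(D)                  # addition rule, mutually exclusive
p_a_or_b = prob(A) + prob(B) - prob(A & B)    # addition rule, general form
p_not_a = 1 - prob(A)                         # complement rule
p_a_given_b = prob(A & B) / prob(B)           # conditional probability

# Multiplication rule for independent events: two heads in two coin tosses.
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

print(p_c_or_d, p_a_or_b, p_not_a, p_a_given_b, p_two_heads)
```

Using `Fraction` keeps the results exact, so they can be compared directly with hand calculations.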
Applications of Probability
1. Risk Assessment:
o Used in finance, insurance, and project management to evaluate risks and
uncertainties.
2. Games and Gambling:
o Predicting outcomes in games of chance, like dice games, card games, or
lotteries.
3. Weather Forecasting:
o Predicting the likelihood of rain, storms, or other weather conditions.
4. Medical Diagnosis:
o Assessing probabilities in medical tests (e.g., the likelihood of having a disease
given test results).
5. Reliability Engineering:
o Estimating the likelihood of system or component failures.
6. Machine Learning:
o Probability underpins algorithms such as Naive Bayes, probabilistic graphical
models, and reinforcement learning.
7. Quality Control:
o Determining defect rates and process improvements in manufacturing.
8. Sports:
o Predicting game outcomes, player performances, and tournament results.
Some numerical problems on probability
Two players, Sangeet and Rashmi, play a tennis match. The probability of
Sangeet winning the match is 0.62. What is the probability that Rashmi will
win the match?
Q42: Two coins (a one rupee coin and a two rupee coin) are tossed once. Find
a sample space.
Q43: One card is drawn from a deck of 52 cards, well-shuffled. Calculate the
probability that the card will
(i) be an ace,
(ii) not be an ace.
Q44: A coin is tossed three times, consider the following events.
P: ‘No head appears’,
Q: ‘Exactly one head appears’ and
R: ‘At least two heads appear’.
Check whether they form a set of mutually exclusive and exhaustive events.
Q45: On a six-sided die, each side has a number between 1 and 6. What
is the probability of throwing a 3 or a 4?
Q46: There are 6 blue marbles, 3 red marbles, and 5 yellow marbles in a
bag. What is the probability of selecting a blue or red marble on the first
draw?
Q47: What is the probability of a leap year containing 53 Sundays?
Q48: A math problem is given to three students whose chances of solving the
problem are 1/3, 1/2, and 4/5. If they solve it independently, what is the
probability that the math problem will be solved?
Probability Distribution:
Binomial Distribution: Theory
Definition:
The binomial distribution is a discrete probability distribution that models the number
of successes in a fixed number of independent trials of a binary experiment. Each trial
has only two possible outcomes: success (with probability p) or failure (with probability 1−p).
Characteristics:
1. Number of Trials (n):
o Fixed number of trials in the experiment.
2. Probability of Success (p):
o Probability of success in each trial remains constant.
3. Independence:
o Each trial is independent of the others.
4. Random Variable (X):
o Represents the number of successes in n trials.
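The notes do not write out the binomial formulas. Assuming the standard results, P(X = k) = C(n, k)·p^k·(1 - p)^(n - k), mean = np, and variance = np(1 - p), a short sketch with hypothetical values n = 5 and p = 0.5 is:

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Hypothetical experiment: 5 tosses of a fair coin.
n, p = 5, 0.5

p_three_heads = binomial_pmf(3, n, p)                      # exactly 3 heads
p_at_least_one = 1 - binomial_pmf(0, n, p)                 # complement of "no heads"
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))   # should be 1

mean = n * p
variance = n * p * (1 - p)

print(p_three_heads, p_at_least_one, mean, variance)
```

"At least one" questions are usually easiest through the complement, as in the second calculation.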
Practice problems of binomial distribution
Q1. A coin is tossed 4 times. What is the probability of getting exactly 2 heads?
Q2. A factory produces 10 items, and the probability of a defect in any item is 0.1. What is the
probability that at least one item is defective?
Q3. A basketball player has a 70% chance of making a free throw. If she attempts 5 shots, what
is the probability she makes all 5?
Q4. A machine produces bolts, and the probability of a defective bolt is 0.05. If 20 bolts are
sampled, find the mean and variance of the number of defective bolts.
Q5. A bag contains 3 defective items out of 10. If 4 items are randomly selected, what is the
probability of selecting at most 2 defective items?
Poisson distribution:
The Poisson distribution is a probability distribution used to model the number of
events that occur in a fixed interval of time, space, or other domains, assuming these
events occur:
1. Independently of each other.
2. At a constant average rate.
3. Without multiple occurrences at the same exact moment (for very small intervals).
The distribution is defined by the parameter λ (lambda), which represents
the average number of events in the interval.
The probability mass function (PMF) of the Poisson distribution is given by:
$$P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}, \quad k = 0, 1, 2, \ldots$$
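A minimal sketch of the Poisson calculation, assuming the standard PMF P(X = k) = λ^k·e^(-λ) / k!. The rate λ = 2 is a hypothetical value, and the function names are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the average rate is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_cdf(k, lam):
    """Probability of at most k events: sum of the PMF from 0 to k."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))


lam = 2.0  # hypothetical average number of events per interval

p_zero = poisson_pmf(0, lam)         # no events
p_at_most_one = poisson_cdf(1, lam)  # zero or one event

print(round(p_zero, 4), round(p_at_most_one, 4))
```

A "P(X ≤ k)" question is answered by adding the PMF values for 0, 1, ..., k, which is what `poisson_cdf` does.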
Practice problems on the Poisson distribution:
1: Calculate P(X=3) for λ=4
2: Find P(X≤2) for λ=5
Normal Distribution Theory
The normal distribution is one of the most commonly used continuous probability
distributions. It is often used to model natural phenomena and measurement errors. It
has a bell-shaped curve, symmetric about the mean, and is fully characterized by two
parameters:
1. Mean (μ): The center of the distribution.
2. Standard deviation (σ): A measure of the spread or dispersion of the data.
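The probability density function (PDF) of the normal distribution is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$$
A value X is converted to a standard normal score Z by:
$$Z = \frac{X - \mu}{\sigma}$$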
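Normal probabilities are usually read from a Z table after standardizing with Z = (X - μ) / σ. As a sketch, Python's standard `statistics.NormalDist` class can play the role of the table. The mean of 100 and standard deviation of 15 below are hypothetical values, not taken from the problems in these notes.

```python
from statistics import NormalDist

# Hypothetical normally distributed scores.
scores = NormalDist(mu=100, sigma=15)

# Standardize a value: Z = (X - mu) / sigma.
z = (115 - scores.mean) / scores.stdev

# P(X > 115): one minus the cumulative probability up to 115.
p_above = 1 - scores.cdf(115)

# P(85 < X < 115): difference of two cumulative probabilities.
p_between = scores.cdf(115) - scores.cdf(85)

# Minimum score needed to be in the top 5%: the 95th percentile.
cutoff_top_5 = scores.inv_cdf(0.95)

print(z, round(p_above, 4), round(p_between, 4), round(cutoff_top_5, 2))
```

"Greater than", "less than", "between", and "top x%" questions map onto `1 - cdf`, `cdf`, a difference of two `cdf` values, and `inv_cdf` respectively.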
Problem 1
The heights of adult men in a city are normally distributed with a mean of 175 cm and
a standard deviation of 8 cm. What is the probability that a randomly selected man is
taller than 180 cm?
Problem 2
The average time taken by students to solve a math problem is normally distributed
with a mean of 45 minutes and a standard deviation of 10 minutes.
Find the percentage of students who solve the problem in less than 40 minutes.
Problem 3
A machine produces bolts with diameters that are normally distributed with a mean of
20 mm and a standard deviation of 0.5 mm. What proportion of bolts have diameters
between 19.5 mm and 20.5mm?
Problem 4
The scores on a standardized test are normally distributed with a mean of 500 and a
standard deviation of 100.
If the top 10% of students are given scholarships, what is the minimum score required
to qualify for the scholarship?