Business Statistics


Unit-one

Introduction to Statistics
Definition
• The term statistics has two meanings:
• In its plural sense, it refers to numerical facts,
figures or measurements.
• But not all figures/data are statistics.
Statistics in its singular sense:
The branch of applied research that deals with the
development and application of methods for
collecting, organizing, presenting, analysing and
interpreting numerical data.

Definition …
Generally [Statistics]
• Is a science that helps us make better decisions in
business and economics as well as in other fields.
• Teaches us how to summarize, analyze, and draw
meaningful inferences from data that then lead to
improved decisions.

Maximizing information by reducing error

Requirements for numerical data to be statistical data
The data should be numerically expressed.
The data must be comparable.
The data should be collected in a systematic
manner.
The data should be collected for a
predetermined purpose.

Classifications of statistics
Depending on how data can be used
Descriptive Statistics:
Is concerned with summary calculations,
graphs, charts and tables.
A statistical method that is concerned with
the collection, organization, summarization,
and analysis of data from a sample of a
population.
Helps to describe a given set of data without
going beyond the data themselves.
……….Continued
Inferential Statistics:
• Is a method used to generalize from a sample to a
population (helps to make inferences/conclusions
about a population based on the selected
sample).
It is used to:
• Predict and forecast values of population
parameters
• Test hypotheses about values of population
parameters
• Make decisions
Stages in Statistical Investigation
There are five stages in any statistical investigation.
1. Collection of data: the process of measuring,
gathering and assembling the raw data upon which the
statistical investigation is to be based.
The process of obtaining measurements or counts.
2. Organization of data: Summarization of data in some
meaningful way, e.g. in table form.
Includes editing, classifying, and tabulating the collected
data.

……….Continued
3. Presentation of the data: The process of reorganization,
classification, compilation, and summarization of data
to present it in a meaningful form.
• Gives an overall view of what the data actually looks like.
• Facilitates further statistical analysis.
• Can be done in the form of tables, graphs or diagrams.
4. Analysis of data: The process of extracting
relevant information from the summarized data (such as the
mean, median, mode, range, variance, ...).
• To dig out useful information for decision
making.
……….Continued
5. Inference (interpretation) of data:
• Concerned with drawing conclusions from
the data collected and analyzed, and giving
meaning to the analysis results.
• A difficult task that requires a high degree of skill
and experience.
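
The summary measures named in the analysis stage (mean, median, mode, range, variance) can be sketched in a few lines. This is only an illustration: the `sales` figures are invented, and Python's standard `statistics` module is one possible tool among many.

```python
import statistics

# Hypothetical data: daily sales (units) for ten days; not from the source.
sales = [12, 15, 11, 15, 18, 20, 15, 13, 17, 14]

mean = statistics.mean(sales)            # arithmetic average
median = statistics.median(sales)        # middle value of the sorted data
mode = statistics.mode(sales)            # most frequent value
data_range = max(sales) - min(sales)     # largest value minus smallest value
variance = statistics.variance(sales)    # sample variance (divides by n - 1)

print(mean, median, mode, data_range, variance)
```

Here the mean, median and mode all equal 15, the range is 9, and the sample variance is 68/9 (about 7.56).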

Definition of Some Basic terms
Population: the set of all measurements/elements that share the
characteristic under study and in which the investigator is
interested.

Sample: a subset of the population, selected using some
predefined sampling technique in such a way that it
represents the population.

Sample size: the number of elements or observations to be included in
the sample.

Census: complete enumeration or observation of the elements of the
population (the collection of data from every element in a
population).
……….Continued
Parameter: A statistical characteristic or measure obtained from
population data.

Sampling: The process or method of selecting a sample
from the population.

Statistic: A statistical characteristic or
measure obtained from sample data.

Data: a collection of facts, values,
observations, or measurements that the
variables can assume.
……….Continued
Variable: an item of interest that can take on
different values (numerical or categorical).
Examples:
• age,
• diastolic blood pressure,
• heart rate,
• the height of adult males,
• the weights of preschool children,
• gender of statistics students,
• marital status of instructors at Bahirdar University,
• ethnic group of patients
Types of variables (data)
On the basis of the information contained in
the data
Qualitative variables are non-numeric
variables; they cannot be measured numerically, only classified into categories.
Examples: gender, religious affiliation,
color, nationality, marital status of patients
and state of birth.

……….Continued

Quantitative variables are numerical variables and
can be measured.
Examples: balance in a checking account, number of
children in a family, temperatures, salaries, number
of points scored on a 100-point exam, number of
students in a class, number of cars in a parking lot, etc.

……….Continued
Note that quantitative variables are either discrete (they
can assume only certain values, and there are usually "gaps"
between the values, such as the number of bedrooms in your
house)
or continuous (they can assume any value within a
specific range, such as the air pressure in a tire).

Discrete Variables
• are variables which assume a finite or countable number of
possible values.
• are usually obtained by counting.
• are characterized by gaps or interruptions in the values that they
can assume. These gaps or interruptions indicate the absence of
values between particular values that the variable can assume.
Examples:
• The number of daily admissions to a general hospital
• The number of first-year statistics students
• The number of decayed, missing or filled teeth per child in an
elementary school.

Continuous Variables
• are variables which assume an infinite number
of possible values between any two
specific values.
• are usually obtained by measurement.
• do not possess the gaps or interruptions
characteristic of a discrete variable.
Examples:
• Weight, age, length, temperature,
speed, salary and marks of students
Scales of measurement
On the basis of the measurement scales
Four levels of measurement scales are commonly
distinguished:
– Nominal
– Ordinal
– Interval
– Ratio
Each possesses different properties of measurement
systems.

Nominal Scales
• Only "naming" and classifying observations is possible.
When numbers are assigned to categories, it is only for
coding purposes and they do not convey a sense of size.
• Level of measurement which classifies data into mutually
exclusive, all-inclusive categories in which no order or ranking
can be imposed on the data.
• Neither arithmetic operations (+, -, *, /) nor relational
operations (<, >) can be applied; comparing sizes is impossible.
• Used for grouping or classification.

……….Continued
Examples:
– Political party preference (Republican, Democrat, or
Other)
– Sex (male or female)
– Marital status (married, single, widowed, divorced)
– Country code
– Regional differentiation of Ethiopia
– Eye color (e.g. brown, blue)
– Religion (Muslim, Christian)
– Place of residence (urban, rural), etc.
Ordinal Scales
• Level of measurement which classifies data
into categories that can be ranked.
• We can speak of "greater than" or "less than", so the order of values
conveys meaning, but
• it is impossible to express the real difference between
measurements in numerical terms.
• Ordering is the sole property of the ordinal scale.
• Used for grouping and ordering.
• Arithmetic operations (+, -, *, /) are not meaningful.
• The magnitude of the difference between values is not clearly known.

……….Continued
• Arithmetic operations are not applicable but relational
operations are applicable. Examples:
– Letter grades (A, B, C, D, F)
– Rating scales (excellent, very good, good, fair, poor)
– Military status/ranks
– Economic status (poor, medium, higher) or socio-economic
status (very low, low, medium, high, very high)
– Severity (mild, moderate, severe)
– Blood pressure (very low, low, high, very high), etc.
Interval Scales
• Level of measurement which classifies data that can be ranked and
whose differences are meaningful (the magnitude between the values is
clearly known).
• Has an arbitrary zero value.
• + and - are meaningful, but * and / are not, because ratios of
interval-scale values depend on where the arbitrary zero is placed.
• There is no meaningful zero, so ratios are meaningless.
• Relational operations are also possible.
• Examples:
– IQ
– Temperature in °C. It is meaningful to say that the difference between
30 °C and 40 °C equals the difference between 25 °C and 35 °C (i.e. 10 °C).
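
A small sketch of why differences, but not ratios, are meaningful on an interval scale. The two temperatures are made-up values; the only outside fact used is the standard Celsius-to-Kelvin conversion (add 273.15).

```python
# Two temperatures measured on the Celsius (interval) scale; illustrative values.
t1_c, t2_c = 20.0, 40.0

# Differences are meaningful on an interval scale.
diff_c = t2_c - t1_c

# The same temperatures on the Kelvin scale, which has a true zero.
t1_k, t2_k = t1_c + 273.15, t2_c + 273.15
diff_k = t2_k - t1_k      # same difference as in Celsius (up to rounding)

ratio_c = t2_c / t1_c     # 2.0, but only because of where 0 °C happens to sit
ratio_k = t2_k / t1_k     # about 1.07: 40 °C is not "twice as hot" as 20 °C

print(diff_c, diff_k, ratio_c, ratio_k)
```

The difference survives the change of scale, while the ratio does not, which is the sense in which ratios are meaningless without a true zero.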
Ratio Scales
• The highest level of measurement scale, characterized
by the fact that equality of ratios as well as equality of
intervals can be determined.
• Level of measurement which classifies data that can
be ranked, differences are meaningful, and there is a
true zero.
• True ratios exist between the different units of
measure.
• All arithmetic and relational operations are
applicable (+, -, *, / are possible).
……….Continued
Examples:
– Weight
– Height
– Number of students / items
– Age
– Salary
– Volume
– Length
Applications, uses and limitations of Statistics

Applications of statistics:
It is applicable in any field of study which seeks
quantitative evidence. For instance,
• In almost all fields of human endeavor.
• Almost everyone encounters numerical facts
in daily life, e.g. about prices.
• In specific processes, e.g. the development of certain
drugs or measuring the extent of environmental pollution.
• In industry, especially in the area of quality control.
• To compare the improvement in yield due to the
application of fertilizer, pesticide, etc.
Uses of statistics:
The main function of statistics is to enlarge our
knowledge of complex phenomena. The following are
some uses of statistics:
1. It presents facts in a definite and precise form.
2. Data reduction.
3. Measuring the magnitude of variations in data.

……….Continued
4. Furnishes a technique of comparison.
5. Estimating unknown population
characteristics.
6. Formulating and testing hypotheses.
7. Studying the relationship between two or
more variables.
8. Forecasting future events.

Limitations of statistics
As a science statistics has its own limitations. The
following are some of the limitations:
• Deals only with quantitative information (it does not
study qualitative characteristics directly).
• Deals only with aggregates of facts and not with
individual data items.
• Statistical results are only approximately, not
mathematically, correct.
• Statistics can be easily misused and therefore
should be used by experts.
Exercise-1
The following is a list of different attributes/variables or data.
Classify the variables/data into the different measurement scales.
1. Your checking account number as a name for your account.
2. Your score on statistics test as a measure of your knowledge of
statistics.
3. A response to the statement "Abortion is a woman's right" where
"Strongly Disagree" = 1, "Disagree" = 2, "No Opinion" = 3,
"Agree" = 4, and "Strongly Agree" = 5, as a measure of attitude
toward abortion.
4. Times for swimmers to complete a 50-meter race
5. Months of the year as September, October…
6. Economic status of a family when classified as low, middle and
upper classes.
7. Blood type of individuals as A, B, AB and O.
8. Regions of Ethiopia as region 1, region 2, region 3…

Exercise

• QUIZ: Identify the usual level of measurement for each of
the following:
1. year in school
2. IQ scores
3. life expectancy
4. fatigue
5. cynicism
6. grade point average
7. hair color
8. type of neighborhood
9. temperature
10. climate

Exercise-1

The time it takes an employee to drive to work is
the variable of interest. What type of variable is
being observed?
a. categorical variable
b. continuous variable
c. discrete variable
d. explanatory variable

Unit-two
Sampling & Sampling Distributions

Definition and Some Basic Terms


Population: the entire group of individuals or objects of
interest under investigation or study.
Unit: An element of the population. This will be a person
or object on which observations can be made or from
which information can be obtained.
Sampling Frame: The list of all the units in the
population.
Target population: the population about which one
wishes to make an inference.
Sample size: the number of individuals in the sample.
Sampling: the process of selecting a sample from
the population.
Major reasons why sampling is necessary

1) The destructive nature of certain
tests/studies.
2) The physical impossibility of checking all items
in the population (e.g. an infinite population).
3) The cost of studying all items in the
population is often prohibitive; sampling
reduces cost.
4) The adequacy of sample results: a properly selected
sample often gives results that are adequate and accurate enough.
5) Time: sampling is faster than studying
the whole population.
Types of Errors
Sampling errors
• Sampling error is the discrepancy between a population parameter
and the corresponding sample statistic.
• It arises because only a sample, rather than the whole population,
is used to estimate the parameter; it depends on the sampling technique
and the sample size. Even a representative sample
will introduce error if the sample size is small.
• Estimates of parameters will often be inaccurate if the
sample is not representative of the population.
Because of this we need to know how to choose a
sample.
• Equivalently, sampling error is the difference between an estimate and
the true value of the parameter being estimated.
Non sampling errors
• Even if we have a representative sample that is
large enough for our parameter estimates to have
good precision, errors may still arise,
such as measurement errors, recording errors,
non-response errors, respondent bias,
interviewer error, errors in processing the data,
and reporting errors.
• A common form of non-sampling error is
non-response error.
• Non-response can be due to refusals.
Sampling Methods

• Sampling techniques can be
grouped into two categories:
1) Random (probability) sampling
methods, and
2) Non-random (non-probability)
sampling methods.

Random (probability) sampling methods
Random sampling: a sampling method in which the items
are included in the sample on a random basis.
Simple random sample: a sampling technique in which every
member of the population is equally likely to be included
in the sample. It can be done in different ways.
Lottery method: the units to be included in the sample
are chosen by a lottery. Assign a number to each element
in the population. Write each number on a slip of paper,
mix the slips, then draw one number at a time. This method can
only be used if the population is not very large; otherwise
it is cumbersome.
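
The lottery method can be imitated in software. The following is a minimal sketch, assuming a made-up population of 50 numbered units; `random.sample` from Python's standard library draws without replacement, so every unit is equally likely to be chosen.

```python
import random

# Hypothetical population: 50 units numbered 1..50 (the "slips of paper").
population = list(range(1, 51))

rng = random.Random(42)               # fixed seed only to make the sketch repeatable
sample = rng.sample(population, 5)    # draw 5 distinct units, each equally likely

print(sample)
```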
……….Continued
Stratified random sampling: is often used
when the population is split into subgroups or
"strata".
• The different subgroups are believed to be
very different from each other, but it is
thought that the individuals who make up
each subgroup are similar.
• The number of units to be chosen from
each subgroup is fixed in advance and the
units are chosen by simple random
sampling within the subgroup.
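
A rough sketch of stratified random sampling as described above: the number of units per stratum is fixed in advance, then a simple random sample is drawn inside each stratum. The strata, their sizes and the proportional allocation are all assumptions made for illustration.

```python
import random

# Hypothetical strata: employee IDs grouped by department.
strata = {
    "sales": list(range(100, 160)),      # 60 units
    "finance": list(range(200, 230)),    # 30 units
    "it": list(range(300, 310)),         # 10 units
}

# Number of units to take from each stratum, fixed in advance
# (here proportional to stratum size, for a total sample of 10).
allocation = {"sales": 6, "finance": 3, "it": 1}

rng = random.Random(1)                   # fixed seed for repeatability
stratified_sample = {
    name: rng.sample(units, allocation[name])   # simple random sample within the stratum
    for name, units in strata.items()
}

print(stratified_sample)
```

Proportional allocation is only one choice; the slides do not say how the per-stratum numbers are fixed.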
……….Continued
Cluster sampling: in some cases the identification and
location of an ultimate unit for sampling may require
considerable time and cost; in such cases cluster
sampling is used.
In cluster sampling the population is subdivided into
groups or clusters, and a random sample of these clusters is
then drawn and studied.
Clusters may be regions, zones, weredas, kebeles, etc.
This method of sampling is cheaper, faster and
more convenient, but it may not be very efficient or
representative because units
within the same cluster tend to be similar.
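
Cluster sampling can be sketched as follows: whole clusters are selected at random, and the units inside the selected clusters are studied. The kebele names and household labels are hypothetical.

```python
import random

# Hypothetical clusters: households grouped by kebele.
clusters = {
    "kebele_01": ["h1", "h2", "h3"],
    "kebele_02": ["h4", "h5"],
    "kebele_03": ["h6", "h7", "h8", "h9"],
    "kebele_04": ["h10", "h11"],
}

rng = random.Random(7)                       # fixed seed for repeatability
chosen = rng.sample(sorted(clusters), 2)     # randomly select 2 whole clusters

# Every unit in a selected cluster is studied.
cluster_sample = [unit for name in chosen for unit in clusters[name]]

print(chosen, cluster_sample)
```

Note the contrast with stratified sampling, where some units are drawn from every subgroup.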
……….Continued
Systematic sampling: the items or individuals of
the population are arranged in some way, e.g.
alphabetically, in a file drawer by date received, or by
some other method.
A random starting point is selected and then
every Kth member of the population is selected
for the sample. For example, if we want to select n
items from a population of size N using
systematic sampling, we compute K = N/n,
choose a random start between 1 and K, and then take every Kth
member.
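
The systematic procedure can be written as a short function. This is a sketch under the assumption that N is at least n and that K is taken as the integer part of N/n; the function name `systematic_sample` and the file numbers are invented for the example.

```python
import random

def systematic_sample(population, n, rng=random):
    """Select n units: a random start within the first K units, then every Kth unit."""
    N = len(population)
    k = N // n                      # sampling interval K (assumes n <= N)
    start = rng.randrange(k)        # index 0..k-1, i.e. the 1st..Kth unit
    return [population[start + i * k] for i in range(n)]

# Hypothetical population of N = 100 files, ordered by date received.
files = list(range(1, 101))
sample = systematic_sample(files, n=10, rng=random.Random(3))

print(sample)
```

With N = 100 and n = 10 the interval is K = 10, so the selected files are always 10 apart.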
Non-random (non-probability) sampling methods

In non-probability sampling, the sample is not based
on chance. It is instead determined by the personal
judgment of the researcher. This method is cost
effective; however, we cannot make objective
statistical inferences.
Depending on the technique used, non-probability
samples are classified into quota, judgment (or
purposive) and convenience samples.
Judgment sampling: the subjective judgment of the
researcher is the basis for selecting items to be
included in a sample. Judgment sampling is often used
to pre-test a questionnaire.
……….Continued
Quota sampling: in this sampling technique major
population characteristics play an important role
in the selection of the sample. It has some aspects in
common with stratified sampling, but has no
randomization. Here the population is
divided into groups, as in stratified sampling, a quota
is set for each group, and units are selected from each group.
Convenience sampling: a technique of selecting a
sample that is simply convenient to the
researcher in terms of time, money and
administration.
Sampling Distribution of the Mean and Proportion
Sampling Distribution of the Mean
Suppose we have a simple random
sample of size n, drawn from a
population of size N. We take
measurements on each sample member
for the characteristic of interest and
denote the observations as x1, x2, …, xn,
respectively. The sample mean for this
sample is defined as:
X̄ = (x1 + x2 + … + xn) / n
……….Continued
• The possible values of this random variable depend on
the possible values of the elements in the random
sample from which the sample mean is to be computed.
• The random sample, in turn, depends on the
distribution of the population from which it is drawn. As
a random variable, X̄ has a probability distribution. This
probability distribution is the sampling distribution of X̄.

……….Continued
• The sampling distribution of X̄ is the probability distribution of all possible
values the random variable X̄ may take when a sample of size n is taken from a
specified population.
• There are commonly three properties of interest of a given sampling
distribution:
– its mean,
– its variance,
– its functional form.
• When sampling without replacement from a finite population, the probability
distribution of the second random variable depends on the
outcome of the first pick, and so on. In other words, the n random variables
representing the n sample members do not remain independent, and the
expression for the variance of X̄ changes.
……….Continued

 The results in this case will be:
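
The original slide's formulas are not available here. The standard textbook result for sampling without replacement from a finite population is E(X̄) = μ and Var(X̄) = (σ²/n) · (N − n)/(N − 1), where (N − n)/(N − 1) is the finite population correction. The sketch below checks this on a tiny invented population by listing every possible sample; the population values are an assumption made for illustration.

```python
from itertools import combinations
from statistics import mean, pvariance

# Hypothetical finite population (N = 5) and sample size n = 2.
population = [2, 4, 6, 8, 10]
N, n = len(population), 2

mu = mean(population)               # population mean
sigma2 = pvariance(population)      # population variance (divides by N)

# Every possible sample drawn without replacement, and its mean.
sample_means = [mean(s) for s in combinations(population, n)]

mean_of_means = mean(sample_means)
var_of_means = pvariance(sample_means)

# Variance predicted with the finite population correction factor.
predicted_var = (sigma2 / n) * (N - n) / (N - 1)

print(mean_of_means, var_of_means, predicted_var)
```

For this population μ = 6 and σ² = 8, and both the enumerated and the predicted variance of X̄ equal 3.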

Sampling Distribution of The Proportion

Unit-three
Statistical Estimations

Basic concepts
• The objective of estimation is to determine
the approximate value of a population
parameter on the basis of a sample statistic.
• For example, the sample mean is employed to
estimate the population mean.
• We refer to the sample mean as the estimator
of the population mean. Once the sample
mean has been computed, its value is called
the estimate.
Point estimators of the mean and proportion

Point Estimator
• A point estimate is a single value of a statistic used to estimate
a population parameter.
• Suppose Best Buy, Inc. wants to estimate the mean
age of buyers of high-definition televisions.
• They select a random sample of 50 recent purchasers,
determine the age of each purchaser, and compute
the mean age of the buyers in the sample.
• The mean of this sample is a point estimate of the
mean of the population.
• In general, a point estimate is the statistic, computed
from sample information, which is used to estimate
the population parameter.
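
As an illustration of point estimation, the sketch below computes a sample mean and a sample proportion. The ages are invented (and only 10 values are used rather than the 50 in the example), and the "aged 40 or over" proportion is a hypothetical characteristic chosen just to show the calculation.

```python
from statistics import mean

# Hypothetical ages of 10 sampled buyers; illustrative values only.
ages = [34, 41, 29, 52, 38, 45, 31, 47, 36, 27]

x_bar = mean(ages)      # point estimate of the population mean age

# Point estimate of a proportion: share of sampled buyers aged 40 or over.
p_hat = sum(age >= 40 for age in ages) / len(ages)

print(x_bar, p_hat)
```

For these values the point estimate of the mean age is 38 and the estimated proportion is 0.4.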
There are three drawbacks to using point estimators.
• It is virtually certain that the estimate will be wrong.
• We often need to know how close the estimator is to the
parameter.
• In drawing inferences about a population, it is intuitively
reasonable to expect that a large sample will produce
more accurate results because it contains more
information than a smaller sample does. But point
estimators don't have the capacity to reflect the effects
of larger sample sizes. As a consequence, we use the
second method of estimating a population parameter,
the interval estimator.

Methods of Estimation

• Let us outline the procedures by
which we can find the point
estimators of a parameter.
• The procedures to be used here are:
(1) the method of moments, and
(2) the maximum likelihood method.
The Method of Moments

Maximum likelihood estimators
• The essential feature of the principle of
maximum likelihood is that it requires
the investigator to choose, as the estimate of the
parameter, that value of the parameter for
which the probability of obtaining
the sample actually observed is as large
as possible.
• This probability will in general depend on the
parameter, which is then given the value for
which this probability is as large as possible.
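
The principle can be illustrated with a simple case that is not in the slides: estimating a proportion p from 0/1 data. The sketch evaluates the (log) probability of the observed sample for many candidate values of p and keeps the largest; the data and the grid search are assumptions made for illustration, not the slides' own derivation.

```python
import math

# Hypothetical sample: 1 = defective item, 0 = good item (7 defectives in 20).
sample = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

def log_likelihood(p, data):
    """Log of the probability of observing `data` if each item is defective with probability p."""
    successes = sum(data)
    failures = len(data) - successes
    return successes * math.log(p) + failures * math.log(1 - p)

# Evaluate the likelihood on a grid of candidate values of p and keep the best.
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=lambda p: log_likelihood(p, sample))

print(p_mle)
```

The value that maximizes the likelihood is 0.35, which coincides with the sample proportion 7/20.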
Statistics as Estimators for Parameters
• We use statistics to estimate parameters
because time, energy and resources are limited,
and because some populations are infinite.
• Statistics computed from the sample include:
proportions, arithmetic averages,
ranges, quartiles, deciles, percentiles,
variances, and standard deviations.
• What each one means and what it stands for
is explained as it is introduced.
Point estimator of the proportion

Point estimator of the mean

Interval Estimator
• An interval estimator draws inferences about
a population by estimating the value of an
unknown parameter using an interval.
• An interval estimate is a range of values,
calculated from the information in the
sample, within which the population parameter
is expected to lie with some stated degree of
confidence.

……….Continued
• The purpose of an interval estimate is to provide information
about how close the point estimate, provided by the sample,
is to the value of the population parameter.
• The general form of an interval estimate of a population mean
is: x̄ ± margin of error
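
A minimal sketch of an interval estimate of a population mean, built as the sample mean plus or minus a margin of error. The data are invented, and the sketch assumes a large sample so that the margin of error can be taken as z · s/√n with z = 1.96 for 95% confidence; other situations call for a different multiplier.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of 36 purchase amounts; values are illustrative only.
sample = [48, 52, 50, 47, 55, 49, 51, 53, 46, 50, 52, 48,
          54, 49, 51, 50, 47, 53, 52, 48, 50, 49, 51, 55,
          46, 50, 52, 49, 51, 48, 53, 50, 47, 52, 49, 51]

n = len(sample)
x_bar = mean(sample)                 # point estimate of the population mean
s = stdev(sample)                    # sample standard deviation
z = 1.96                             # standard normal value for 95% confidence

margin = z * s / math.sqrt(n)        # margin of error
interval = (x_bar - margin, x_bar + margin)

print(x_bar, margin, interval)
```

A larger sample shrinks the margin of error, which is how the interval estimator reflects sample size where a point estimate cannot.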

Unit-Four
HYPOTHESIS TESTING

Concepts of Hypothesis Testing


• Testing a statistical hypothesis is the second
major part of inferential statistics.
• A statistical hypothesis is an assumption or
statement about one or more parameters of
one or more populations.
• A statistical hypothesis may or may not be true.
We need to decide, based on the data in a sample
or samples, whether the stated hypothesis is true
or not.
……….Continued
• Testing a statistical hypothesis is a technique,
or procedure, by which we gather
evidence, using the data of the sample, to
support or reject the hypothesis we have in
mind. This is also one way of making inferences
about a population parameter when the
investigator has a prior notion about the value
of the parameter.

THE NULL AND ALTERNATIVE HYPOTHESES
• The first step in testing a statistical hypothesis is to set
up a null hypothesis and an alternative hypothesis.
• When we conjecture a statement about one parameter
of a population, or two parameters of two populations,
we usually keep in mind an alternative conjecture to
the first one. Only one of the conjectures can be true.
So, in essence, we are weighing the truth of one
conjecture against the truth of the other.
• This idea is the first basic principle in testing a statistical
hypothesis.

……….Continued
• For example, a person accused of a crime
faces a trial. The prosecution presents its case, and a
jury must make a decision on the basis of the
evidence presented. In effect, the jury conducts a test
of hypothesis.
• Typically, the question of interest is represented
by the alternative hypothesis. In each of the
following examples of questions we
might encounter, note that what is
interesting to the analyst is the alternative hypothesis.
Example 1
• An accountant doing an audit is becoming suspicious
of the figures shown in the books of a big company
called Unron.
• She/he extracts the data from several hundred
transactions and wants to know if the frequencies of
the ten digits (0, 1, …, 9) in the last positions of the
entries are equal (a radical deviation from equality
would suggest that the numbers were fraudulently
invented, since people aren't very good at making up
numbers that fit the uniform probability
distribution).
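
One common way to test equality of the ten digit frequencies is a chi-square goodness-of-fit test; the slides do not name a test, so this choice is an assumption. Here the null hypothesis is that all ten digits are equally likely and the alternative is that they are not. The counts below are invented, and 16.919 is the usual tabulated 5% critical value for 9 degrees of freedom.

```python
# Hypothetical counts of last digits 0-9 in 500 ledger entries (invented data).
observed = [38, 61, 45, 52, 70, 41, 49, 58, 36, 50]

n = sum(observed)
expected = n / 10          # under H0 every digit is equally likely

# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over the ten digits.
chi_square = sum((o - expected) ** 2 / expected for o in observed)

# Critical value of the chi-square distribution with 10 - 1 = 9 degrees
# of freedom at the 5% significance level (from standard tables).
critical_value = 16.919

reject_h0 = chi_square > critical_value

print(chi_square, reject_h0)
```

For these made-up counts the statistic is 20.72, which exceeds 16.919, so the hypothesis of equal digit frequencies would be rejected at the 5% level.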

Example 2
• Suppose a stock broker has become interested
in the performance of the shares of DASHEN
BANK.
• He wants to know if the data for the last three
years support the view that the growth rate is
at least 6% per year. If so, he will advise
a client interested in long-term investments
that the shares fit the client's profile.

Example 3
• An investor is looking at two different
manufacturers of plant as potential
investments. One of the steps in due diligence
is to examine the reliability of quality control
of the two factories' production lines by
comparing the variances of the products.
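
Comparing two variances is usually done with the ratio of the sample variances (an F statistic); the slides do not state the test, so this is an assumption. The null hypothesis is that the two production lines have equal variances. The measurements below are invented.

```python
from statistics import variance

# Hypothetical measurements (e.g. part lengths in mm) from each production line.
factory_a = [10.2, 9.8, 10.1, 10.0, 9.9, 10.3, 9.7, 10.0]
factory_b = [10.6, 9.4, 10.9, 9.1, 10.2, 9.8, 10.7, 9.3]

var_a = variance(factory_a)      # sample variance, divides by n - 1
var_b = variance(factory_b)

# F statistic: ratio of the larger sample variance to the smaller one.
f_ratio = max(var_a, var_b) / min(var_a, var_b)

print(var_a, var_b, f_ratio)
```

The ratio would then be compared with a critical value from an F table with (n1 − 1, n2 − 1) degrees of freedom; a ratio far above 1 suggests the two lines differ in consistency.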

Possible Decisions
