EDA (Engineering Data Analysis) Reviewer
The word statistics derives from the New Latin statisticum collegium (“council of state”) and
the Italian word statista (“statesman”), a term tied to the political state or government.
Methods for decision making based on small data sets were developed by William S. Gosset
in the early 20th century.
Plural Sense of Statistics: refers to numerical facts and figures collected systematically for a
specific purpose. In other words, the data themselves, expressed in numerical form.
Singular Sense of Statistics: refers to the science of methods for collecting, organizing,
presenting, analyzing, and interpreting data, and for drawing conclusions about a population
from that data.
Importance of Statistics in Different Fields: essential in all areas of human activity, including
business and management, economics, mathematics, war, the state, and research.
Power of Statistics in Everyday Life: medical studies, genetics, political campaigns, insurance,
consumer goods and retail, quality testing, and the stock market.
DESCRIPTIVE STATISTICS: concerned with describing the population or sample under study. It
organizes, analyzes, and presents data in a meaningful way using charts, graphs, and tables. Its
purpose is to describe a situation and to summarize a sample.
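The organize, summarize, and present steps can be sketched in a few lines of Python. This is only an illustration: the quiz scores are made-up values, and the text "chart" stands in for a real graph.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample: quiz scores of ten students (made-up values).
scores = [7, 8, 8, 9, 6, 8, 10, 7, 9, 8]

# Organize: build a frequency table of each distinct score.
frequency = Counter(scores)

# Summarize: compute measures of center for the sample.
sample_mean = mean(scores)
sample_median = median(scores)

# Present: print a simple text table with one asterisk per occurrence.
for score in sorted(frequency):
    print(f"{score:>2} | {'*' * frequency[score]} ({frequency[score]})")
print(f"mean = {sample_mean}, median = {sample_median}")
```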
Population: The entire group of individuals, data, or objects being studied.
Sample: A smaller, representative part of the population that is free from bias. Used when
studying the entire population is impractical.
Sampling: the process of selecting the sample.
Sampling units: The individual items or people in the sample.
Sample Size: The number of units in the sample.
Parameter: a fixed numerical measure that describes a population (e.g., the population mean or median).
Statistic: a numerical value calculated from a sample (e.g., the sample mean).
Simply put, parameters describe populations; statistics describe samples.
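A minimal sketch of the parameter/statistic distinction, assuming a made-up population of 1,000 heights generated at random. The names `parameter_mean` and `statistic_mean` are illustrative only.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: heights in cm of 1,000 students (simulated values).
population = [random.randint(150, 190) for _ in range(1000)]

# Parameter: computed from the whole population, so it is a fixed value.
parameter_mean = mean(population)

# Sampling: select 50 sampling units without replacement (sample size = 50).
sample = random.sample(population, 50)

# Statistic: computed from the sample; it varies from sample to sample.
statistic_mean = mean(sample)

print(f"population mean (parameter): {parameter_mean:.2f}")
print(f"sample mean (statistic):     {statistic_mean:.2f}")
```

Drawing a different sample would change the statistic but not the parameter.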
Variable: any characteristic, number, or quantity that can be measured or counted (e.g., age,
sex, income, expenses, eye color, class grades, vehicle type).
TWO TYPES OF VARIABLES: Quantitative and Qualitative
Quantitative Variable: a numerical variable that represents a measurable quantity and answers
“how many” or “how much”.
Two Types: Continuous and Discrete Variable
Continuous Variables: can take any value within a range (including decimals and fractions)
Discrete Variables: can take whole number values only (mostly countable)
Qualitative Variable: a categorical variable that describes a quality or characteristic and
answers “what type” or “which category”.
Two Types: Ordinal Variables and Nominal Variables
Ordinal Variables: categories have a logical order, but no fixed numeric difference between
them (e.g., grades (A, B, C), size (S, M, L), attitude (agree to disagree)).
Nominal Variables: categories have no logical order (e.g., eye color, gender, religion, brand).
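The practical difference between ordinal and nominal variables can be sketched as follows. The data and the `SIZE_ORDER` mapping are assumptions made for illustration.

```python
from collections import Counter

# Ordinal: categories carry a rank, so sorting and a middle category make sense.
SIZE_ORDER = {"S": 0, "M": 1, "L": 2}  # hypothetical ranking of shirt sizes
sizes = ["M", "S", "L", "M", "S"]
sorted_sizes = sorted(sizes, key=SIZE_ORDER.get)
median_size = sorted_sizes[len(sorted_sizes) // 2]

# Nominal: categories have no rank, so only counting (and the mode) is meaningful.
eye_colors = ["brown", "blue", "brown", "green", "brown"]
most_common_color, count = Counter(eye_colors).most_common(1)[0]

print(sorted_sizes, median_size)
print(most_common_color, count)
```

Sorting eye colors alphabetically is possible in code, but the resulting order carries no statistical meaning.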
Data: quantitative information about a characteristic (which can originally be qualitative or
quantitative).
Qualitative characteristics are often converted to numerical form for analysis.
Univariate Data: involves one variable, focuses on describing or analyzing a single characteristic
(e.g., estimating the average weight of high school students).
Bivariate Data: involves two variables, used to explore relationships or correlations between
variables (e.g., studying the relationship between height and weight of high school students).
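As a rough sketch of the two cases, the code below computes an average weight (univariate) and a Pearson correlation between height and weight (bivariate). The five data points are made up, and `pearson_r` is a helper written here for illustration, not a library function.

```python
from math import sqrt
from statistics import mean

# Hypothetical measurements for five students (made-up values).
heights_cm = [150, 155, 160, 165, 170]
weights_kg = [45, 50, 54, 61, 65]

# Univariate: describe a single variable.
average_weight = mean(weights_kg)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Bivariate: measure how two variables move together.
r = pearson_r(heights_cm, weights_kg)

print(f"average weight = {average_weight} kg")
print(f"correlation between height and weight = {r:.3f}")
```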
THE FOUR LEVELS OF MEASUREMENT: Guides Data Interpretation and Determines Appropriate
Statistical Analysis
The Four Levels of Measurement: Nominal, Ordinal, Interval, Ratio
Nominal Measurement: “in name only”. Used for names, labels, and categories of qualitative
data; no order or ranking is implied (e.g., eye color, yes/no surveys, favorite cereal, jersey numbers).
Ordinal Measurement: order matters, but the differences do not. Data can be ranked, but the gaps
between values are not meaningful.
Interval Measurement: both order and the differences between values are meaningful. Equal
intervals: the gap between values is meaningful. There is no true zero, so ratios do not make
sense (e.g., the change from 30°F to 40°F is the same as from 70°F to 80°F, but 80°F is not
twice as hot as 40°F).
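The Fahrenheit example can be checked numerically. The sketch below converts to kelvin, a temperature scale with a true zero, to show why the naive ratio of Fahrenheit readings is misleading. The function name `fahrenheit_to_kelvin` is chosen here for illustration.

```python
def fahrenheit_to_kelvin(f):
    """Convert degrees Fahrenheit to kelvin (a scale with a true zero)."""
    return (f - 32) * 5 / 9 + 273.15

# Differences are meaningful on an interval scale: both gaps are 10 degrees.
diff_low = 40 - 30
diff_high = 80 - 70

# The naive ratio of Fahrenheit readings suggests "twice as hot".
naive_ratio = 80 / 40

# On a scale with a true zero, the ratio is far smaller than 2.
kelvin_ratio = fahrenheit_to_kelvin(80) / fahrenheit_to_kelvin(40)

print(diff_low == diff_high)
print(f"naive ratio = {naive_ratio}, kelvin ratio = {kelvin_ratio:.3f}")
```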
Ratio Measurement: has all the features of the interval level. An absolute zero exists, so ratios
are meaningful (e.g., weight, number of clients, age, income).