Course: SCIENTIFIC RESEARCH METHODS AND PUBLICATIONS ETHICS.
Lecture 9: Measurement and Scaling Concepts
Covered Areas:
I. Measurement and Levels of Measurement
II. Measurement levels: non-metric levels and metric levels
III. Scales Used to Measure Attitudes and Types
IV. Accuracy of Measurement
V. Errors in Measuring
Objectives:
* To be able to comprehend the importance of measurement,
* To be able to have information about measurement levels,
* To be able to learn the types of scales,
* To be able to evaluate the accuracy of the measurement.
I. MEASUREMENT AND LEVELS OF MEASUREMENT
* Measurement is not only an activity related to scientific curiosity, but also a very important part of our
lives.
* Through measurement, we make evaluations and decisions. Our success in business and in making the
right decisions depends on the accuracy and reliability of our measurements.
* Making good decisions is closely related to the accuracy, reliability and validity of the data.
* Measurement is an effort to quantify.
* Measurement is the assignment of numerical values to observations (or answers) within the framework
of certain rules.
* Four characteristics are used to describe the levels of measurement:
1. Identification: Each value serves only as an identifier or label. Examples are "yes-no" or "true-false"
answers; even the ages of the respondents can be treated as purely descriptive labels.
2. Ranking: Shows the relative size or magnitude of the identifiers, as in "greater than", "less than" or
"equal to".
3. Distance: If the absolute differences between the descriptors are known and can be expressed in
units, the scale has "distance".
4. Origin: If the scale has a unique starting point, such as a true zero, it has an "origin".
* It is not possible to determine the measurement level of a scale without knowing which of the above
characteristics it has.
* There are four levels of measurement: nominal measurement, ordinal measurement, interval
measurement, and ratio measurement. The first two levels are non-metric (categorical) and the last
two levels are metric.
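The link between the four characteristics and the four levels can be sketched in a few lines of Python. This is only an illustration: the function names `measurement_level` and `is_metric` and the example variables are assumptions made for this sketch, and identification is assumed to hold for every scale.

```python
def measurement_level(has_ranking: bool, has_distance: bool, has_origin: bool) -> str:
    """Return the level of measurement implied by the characteristics a scale has.

    Identification is assumed for every scale, so it is not a parameter.
    Each level requires all the characteristics of the levels below it.
    """
    if has_ranking and has_distance and has_origin:
        return "ratio"
    if has_ranking and has_distance:
        return "interval"
    if has_ranking:
        return "ordinal"
    return "nominal"


def is_metric(level: str) -> bool:
    """Interval and ratio are metric; nominal and ordinal are non-metric."""
    return level in {"interval", "ratio"}


# Hypothetical example variables, chosen only for illustration.
examples = {
    "eye color": measurement_level(False, False, False),
    "order of preferred cities": measurement_level(True, False, False),
    "temperature in Celsius": measurement_level(True, True, False),
    "income": measurement_level(True, True, True),
}

for name, level in examples.items():
    kind = "metric" if is_metric(level) else "non-metric"
    print(f"{name}: {level} ({kind})")
```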
II. MEASUREMENT LEVELS
A) TYPES OF VARIABLES:
1. Categorical variable: Categorical variables are measured through classification. Numbers are just
labels. For example, religion (e.g., Muslim or Christian) is a categorical variable.
2. Continuous variable: A continuous variable can take an infinite number of values within one or more
intervals. Examples of continuous variables are age, degrees on a thermometer, or income.
3. Discrete variable: The values that the variable can take are both finite and countable. For example,
the number of children is a discrete variable.
Examples:
marital status - single/married - categorical
age - continuous
height - continuous
how many defects in a work shift - discrete
eye color - categorical
weight - continuous
choice of beverage a/b/c - categorical
how many children do you have? - discrete
Non-metric (categorical) levels
Nominal and ordinal scales are considered to be at the non-metric (categorical) level.
* At the nominal measurement level, numbers are used only as labels.
* This measurement identifies the different categories into which the answers fall.
* A number is assigned to each object. The numbers do not indicate largeness, smallness, or level of
liking.
1. NOMINAL MEASURES
* gender F/M
* marital status M/S/D
* education - none-high school-college-PhD.
Counting (nominal) measurement
Proportions can be used. For example, 30% of the respondents are blue-collar and 70% are white-collar.
The size of a number does not imply magnitude or rank. For example, apartment number 20 is not 20
times the size of apartment number 1.
The arithmetic mean cannot be used. For example, the codes given to occupations in a survey
cannot be added up and divided by the number of observations.
The mode can be calculated. The mode is the most frequently occurring value in a series.
Frequencies can be calculated.
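The statistics allowed at the nominal level (frequency, proportion, mode) can be sketched as follows. The ten answers are hypothetical data, constructed to match the 30% / 70% example above.

```python
from collections import Counter

# Hypothetical survey answers: 3 blue-collar and 7 white-collar respondents.
occupations = ["blue-collar"] * 3 + ["white-collar"] * 7

# Frequency: how many times each category occurs.
counts = Counter(occupations)

# Proportion: the share of each category among all answers.
total = sum(counts.values())
proportions = {category: n / total for category, n in counts.items()}

# Mode: the most frequently occurring category.
mode_category = counts.most_common(1)[0][0]

print(counts)
print(proportions)
print(mode_category)
```

No mean is computed here: the categories are labels, so adding them up has no meaning.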
2. Sequential (Ordinal) measurement
* At this level of measurement, the assigned numbers can also be used for ordering. A measurement
made at this level can express the degree to which a feature is present, an order of preference, or an
order of liking.
* Mode, frequency and proportions can be used.
* The median can be calculated. The median, the value that falls in the middle of the ordered data, can
be applied to data measured at the ordinal level.
* The arithmetic mean cannot be calculated. It is not possible to calculate averages because the
distances between the numbers are not necessarily equal.
Example 1: Please rank the cities in which you would like to live.
Istanbul Ankara Izmir Adana Mardin
Example 2: Which beverages do you prefer at breakfast?
1 2 3 4 5
Example 3: How many minutes do you exercise a day?
<30 mins 30-60 mins >60 mins
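As a small sketch of the median at the ordinal level, the categories of Example 3 can be coded by their rank and the middle answer looked up. The seven answers are hypothetical, and `median_low` is used so that the result is always one of the actual categories.

```python
from statistics import median_low

# Ordered answer categories from Example 3 (lowest to highest).
categories = ["<30 mins", "30-60 mins", ">60 mins"]
rank = {label: position for position, label in enumerate(categories)}

# Hypothetical answers from seven respondents.
answers = [
    "<30 mins", ">60 mins", "30-60 mins", "30-60 mins",
    "<30 mins", "30-60 mins", ">60 mins",
]

# The codes only preserve order; the distances between them carry no meaning.
codes = [rank[answer] for answer in answers]

# median_low returns an actual data value, so it maps back to a category.
median_category = categories[median_low(codes)]

print(median_category)
```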
Metric levels
* Measurements made with interval and ratio scales are accepted as metric levels.
3. Interval measurement
* Interval measurement has all the features of ordinal measurement, and in addition the absolute
differences between the measurement values are considered equal. In other words, the difference
between 1 and 2 and the difference between 7 and 8 are considered equal.
* Mode, frequency, ratios, median can be used.
* The arithmetic mean can be used.
* Standard deviation can be used.
* Skewness and kurtosis measures can be calculated.
* Correlation, regression analyses, t- and F tests can also be applied to such data.
4. Ratio measurement
* Ratio measurement incorporates all the features of nominal, ordinal and interval scales, and in
addition the scale has a true zero point.
* It is the most powerful and most detailed level of measurement. Answers to questions such as age,
income, and consumption amount can be given as examples of ratio measurement.
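Why the zero point matters can be illustrated with temperature, a common textbook example that is assumed here rather than taken from the lecture. Celsius has equal distances but an arbitrary zero (interval), while Kelvin has a true zero (ratio). Differences agree on both scales, but only the Kelvin ratio is meaningful.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to Kelvin, which has a true zero point."""
    return celsius + 273.15


c_low, c_high = 10.0, 20.0

# Differences are the same on both scales: both have "distance".
diff_celsius = c_high - c_low
diff_kelvin = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

# The Celsius ratio is 2.0, but 20 degrees is not "twice as hot" as 10 degrees.
ratio_celsius = c_high / c_low

# On the Kelvin scale the ratio is only about 1.035.
ratio_kelvin = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

print(diff_celsius, diff_kelvin)
print(ratio_celsius, round(ratio_kelvin, 3))
```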
III. Scales Used to Measure Attitudes and Types
Custom Scale Types and Their Features
1. Likert Scale
* The Likert scale was developed by Rensis Likert to measure attitudes and is named after him. On this
scale, respondents are asked how strongly they agree with statements that the researcher has
determined in advance.
* Although the Likert scale can be used with 5, 7, 9 or 11 points, the five-point form is the most
common. Because the number of points is odd, the scale is balanced around a neutral midpoint. In
addition, the distances between the points are considered equal.
2. Semantic Differential Scale
* The semantic differential scale is also called the dimensional separation scale. In this scale,
attitudes are measured by placing opposite adjectives at the two extremes of a scale, typically with 7
points. The midpoint expresses a neutral position, while the two extremes express the degree to which
the respondent agrees with each adjective. The semantic differential scale is mostly used to learn or
discover connotations and similar meanings.
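One way to summarize answers on such a scale is to average the ratings for each adjective pair and compare the average with the midpoint. The sketch below assumes a 7-point scale coded 1 to 7; the adjective pairs, the ratings and the helper `leaning` are hypothetical.

```python
from statistics import mean

# Hypothetical 7-point ratings: 1 = left adjective, 7 = right adjective.
ratings = {
    ("cheap", "expensive"): [2, 3, 2, 4, 3],
    ("old-fashioned", "modern"): [6, 5, 7, 6, 6],
    ("unfriendly", "friendly"): [4, 4, 5, 3, 4],
}
MIDPOINT = 4  # neutral point of a 7-point scale

# Profile: the average rating for each adjective pair.
profile = {pair: mean(scores) for pair, scores in ratings.items()}


def leaning(pair: tuple, score: float) -> str:
    """Return the adjective the average leans toward, or 'neutral' at the midpoint."""
    left, right = pair
    if score < MIDPOINT:
        return left
    if score > MIDPOINT:
        return right
    return "neutral"


for pair, score in profile.items():
    print(pair, score, leaning(pair, score))
```

Taking the mean relies on the assumption that the distances between the scale points are equal.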
IV. ACCURACY OF MEASUREMENT
* The accuracy of measurement is one of the most basic indicators of whether a study can give sound
and accurate results.
1. Reliability
Reliability means obtaining similar or consistent results when measurements are repeated.
To determine the reliability of a study, one can check whether the scale gives similar results in
different measurements (sessions), or whether different researchers obtain similar results in different
situations.
The approaches used to determine reliability are test-retest, alternative forms and split-half.
A) Test-retest: The correlation between the results of measurements made at different times is
examined. If the correlation is high, the reliability of the scale is also high; if it is low, the reliability is
low. The time between the two measurements is typically two to four weeks.
B) Alternative forms: In the alternative forms approach, two similar scales developed to measure
the same subject are used. Again, the interval between the two applications is two to four weeks. The
two measurement results must be highly correlated.
C) Split-half: Another approach used to determine the reliability of a scale is split-half. The scale
items are randomly divided into two halves. A high correlation is expected between the results of the
two halves.
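The correlation checks described above can be sketched as follows. All scores are hypothetical, the Pearson coefficient is assumed as the correlation measure (the notes do not name one), and the fixed random seed only makes the split reproducible.

```python
import random
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Test-retest: hypothetical total scores of six people, measured twice.
first_session = [12, 15, 9, 20, 17, 11]
second_session = [13, 14, 10, 19, 18, 10]
test_retest_r = pearson(first_session, second_session)

# Split-half: hypothetical answers of five people to six items (one row per person).
responses = [
    [4, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 4],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 2, 2, 1],
]

# Randomly divide the item positions into two halves.
rng = random.Random(0)
items = list(range(6))
rng.shuffle(items)
half_a, half_b = items[:3], items[3:]

# Score each person on each half, then correlate the two half scores.
scores_a = [sum(person[i] for i in half_a) for person in responses]
scores_b = [sum(person[i] for i in half_b) for person in responses]
split_half_r = pearson(scores_a, scores_b)

print(round(test_retest_r, 2), round(split_half_r, 2))
```

The alternative forms approach would use the same `pearson` call, with the scores from the two similar scales as the two lists.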
A scale very rarely has perfect reliability, but the reliability of a scale can be improved in
four ways.
1. Clear conceptualization of scale: Reliability increases when each scale measures only a single
concept. This means a clear, precise, theoretical definition of the scale.
2. Increasing the level of the scale: The clearer and more detailed information a scale provides, the
more reliable it becomes. The basic principle is to measure at the highest possible level.
3. Using more indicators: The third way to increase the reliability of the scale is to use two or more
indicators to measure a structure. For example, if we ask several questions instead of one to
measure mood, we get a more reliable measurement for two reasons. First, it helps to define the
concept better, because the concept is measured in many aspects. Second, when measuring with a
single question, an error may occur because a poor question was selected.
4. Conducting preliminary tests, pilot studies and iterations: In order to increase the level of
reliability, it is useful to conduct pilot studies that test how clear and understandable the scale is
before finalizing and using it, and to try it repeatedly on different samples.
2. Validity
If a scale measures what is intended to be measured, it can be said that that scale is valid. Such a scale
would be valid to the extent that it is free from systematic error, or in other words, to the extent that it
reflects real differences between persons in a given period or for the same person over time.
Validity is evaluated on the basis of three different types, usually called prediction validity, content
validity, and structural validity.
A) Prediction validity: It is the degree of correlation between the quality measured by the scale and the
actual quality (observed quality). For prediction validity, different measurements can be made on the
same subject and they can be compared.
B) Content validity: It is a measure of the validity of the content of the scale. Experts express their
personal judgment about the extent to which the scale represents what it is intended to measure. Based
on these judgments, a conclusion is reached about the content validity of the scale.
C) Structural (construct) validity: It is related to the extent to which the theoretical reasons for the
prediction and content validity of the scale can be determined. Structural validity examines which
concepts and characteristics the scale measures. For this purpose, two things are checked: to what
extent the scale agrees with other criteria that measure the same structure, and how weakly it relates
to scales that measure a different structure. The first is convergent validity, and the second is
discriminant validity.
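Convergent and discriminant validity can be illustrated with two correlations. The sketch below uses hypothetical scores and assumes the Pearson coefficient as the measure of relationship: the new scale should correlate strongly with an established scale for the same structure and only weakly with a scale for a different structure.

```python
from math import sqrt


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical scores of six respondents on three scales.
new_scale = [10, 14, 8, 18, 16, 12]
same_structure_scale = [11, 15, 9, 17, 17, 11]   # established scale, same structure
other_structure_scale = [7, 3, 5, 5, 7, 3]       # scale for a different structure

# Convergent validity: a high correlation is expected.
convergent_r = pearson(new_scale, same_structure_scale)

# Discriminant validity: a low correlation is expected.
discriminant_r = pearson(new_scale, other_structure_scale)

print(round(convergent_r, 2), round(discriminant_r, 2))
```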
• Relationship between Reliability and Validity
* Reliability is a prerequisite for validity, and it is easier to ensure reliability than it is to ensure validity.
* Reliability is a necessary but not a sufficient condition for validity.
* Validity and reliability are often complementary concepts.
3. Sensitivity
* Sensitivity is a measure of how precisely the scale can make measurements. To increase it,
new qualities or points need to be added to the scale (Kurtuluş, 2010: 110). Measurements should
yield values that are as detailed and precise as possible. If data can be measured at the metric level,
they should not be measured at a non-metric level, in other words with a nominal or ordinal scale.
V. ERRORS IN MEASURING
1. Systematic errors: The researcher's knowledge, skills, value judgments or wrong decisions, and
mistakes made in the scale, questionnaire, form or sampling, are sources of systematic error in all
measurements.
* For example, assume that a study is designed to measure consumers' shopping habits at retailers
and is applied comparatively in two different convenience stores. Suppose the survey in the first
store was conducted between the 15th and 18th of the month, and in the second store between the
25th and 28th. Since civil servants receive their salaries on the 15th of the month, they do most of
their shopping between the 15th and 20th and spread only small purchases over the rest of the month.
Therefore, the scores in the first measurement will be higher. If the researcher is not aware of this
systematic error, he may wrongly claim that the difference in sales amounts is due to the difference
between the stores.
2. Random errors: Another type of error is caused by accidental, unforeseen factors. A random error
can arise from any effect that influences the measurement results. An example of a random error is a
poor performance in an exam because the student is sick on the exam day.
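The difference between the two error types can be made concrete with a small simulation. This is a sketch under stated assumptions: each measurement is modeled as true value + systematic bias + random noise, and the true value, bias, noise level and number of measurements are all hypothetical.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is reproducible

TRUE_VALUE = 100.0  # hypothetical true score
BIAS = 5.0          # hypothetical systematic error, the same in every measurement
NOISE_SD = 10.0     # hypothetical spread of the random error
N = 10_000          # number of repeated measurements

# Measurements affected by random error only.
random_only = [TRUE_VALUE + rng.gauss(0, NOISE_SD) for _ in range(N)]

# Measurements affected by random error and a systematic error.
with_bias = [TRUE_VALUE + BIAS + rng.gauss(0, NOISE_SD) for _ in range(N)]

# Random errors largely cancel out when many measurements are averaged.
random_only_error = mean(random_only) - TRUE_VALUE

# A systematic error does not cancel out: the average stays shifted by the bias.
with_bias_error = mean(with_bias) - TRUE_VALUE

print(round(random_only_error, 2), round(with_bias_error, 2))
```

Under this model, repeating the measurement reduces the effect of random error, but a systematic error such as the survey dates in the store example has to be removed by changing the design.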