
M&E Course Chapters

Upon completion of this lesson on reduction and oxidation, students will be able to accurately describe the three main components of the reduction and oxidation system and their functions with 80% accuracy on an assessment.

Uploaded by

Hamse Abdihakin

Chapter 1

Concepts of Assessment,
Measurement and Evaluation
Measurement, Assessment, Test, and
Evaluation
• Assessment is any of a variety of procedures used to obtain
information about student performance.
• Test is an instrument or systematic procedure for measuring a
sample of behavior by posing a set of questions in a uniform
manner.
• Measurement is the process of obtaining a numerical
description of the degree to which an individual possesses a
particular characteristic.
• Evaluation is the process of delineating, obtaining, and
providing useful information for judging decision alternatives.
PURPOSE OF ASSESSMENT
• Assessment involves deciding how well students have learnt
a given content, or how far the objectives set out earlier
have been achieved, in quantitative terms.
• The data so obtained can serve various educational functions
in the school:
o Classroom function
▪ Determination of level of achievement
▪ Effectiveness of the teacher, teaching method, learning
situation and instructional materials
▪ Motivating the child by showing him his progress i.e.
success breeds success.
▪ It can be used to predict students' performance in novel
situations.
o Guidance functions
▪ Give the teacher diagnostic data about individual pupils
in his class
▪ Help decide which method to use or which remedial
activities are necessary
o Administrative functions
▪ Assessing can serve as communication of information
when data collected are used in reports to parents
▪ It could form the basis upon which streaming, grading,
selection and placement are based.
▪ Making appropriate decisions and recommendations on
curricula packages and curricula activities.
THE CONCEPT OF CONTINUOUS ASSESSMENT

• Continuous assessment is defined as:
“A mechanism whereby the final grading of a student in the
cognitive, affective and psychomotor domains of behaviour
takes account, in a systematic way, of all his performances
during a given period of schooling”
• Advantages of a continuous assessment:
o It provides useful information about the academic progress
of the learner.
o It keeps the learner working in a progressive
manner.
o It informs the teacher about the teaching-learning
effectiveness achieved.
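The definition above says that the final grade takes systematic account of all performances during a period. One simple way to picture this is a weighted average of component scores. The sketch below is only an illustration: the component names and weights are hypothetical and are not prescribed by this course.

```python
def final_grade(scores, weights):
    """Combine percentage scores from several assessments into one final grade.

    scores  -- mapping of assessment name to percentage score (0-100)
    weights -- mapping of assessment name to its relative weight
    """
    if set(scores) != set(weights):
        raise ValueError("every assessment needs both a score and a weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted mean: each score counts in proportion to its weight.
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical term record for one student.
scores = {"quiz_1": 70, "quiz_2": 80, "practical": 90, "final_exam": 60}
weights = {"quiz_1": 10, "quiz_2": 10, "practical": 20, "final_exam": 60}
print(final_grade(scores, weights))  # 69.0
```

Here the final examination still dominates, but the quizzes and the practical contribute 40% of the grade, so performance throughout the term matters.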
• Using continuous assessment to improve teaching and
learning:
o Motivation
o Individual Differences
o Record-Keeping
o Examination Malpractice
• Characteristics of continuous assessment tests:
o Continuous assessment tests are periodical, systematic, and
well planned.
o Continuous assessment tests may be oral, written, or
practical; announced or unannounced; and objective (e.g.
multiple choice) or subjective (e.g. essay).
o Continuous assessment tests are often based on what has
been learnt within a particular period.
o All continuous assessment tests should meet specified
criteria.
• Continuous assessment tests have been abused by some
dishonest teachers in the following ways:
o Making the tests extremely easy so that undeserving students in their
school can pass;
o Inflating the marks of the continuous assessment tests so that
undeserving students can pass the final examinations and be given
certificates they have not worked for;
o Conducting too few continuous assessment tests,
thus making the process neither continuous nor progressive;
o Reducing the quality of the tests simply because the classes are too
large for a teacher to examine thoroughly.
Tests in the Classroom
What is a test?

• Tests are detailed or small-scale tasks carried out to identify
the candidate’s level of performance and to find out how far
the person has learnt what was taught or is able to do what
he/she is expected to do after teaching.
• Tests are carried out in order to measure the efforts of the
candidate and characterize the performance.
• A test is therefore an instrument for assessment.
Purpose of Tests
• To find out whether the objectives we set for a particular
course, lesson or topic have been achieved or not.
• To determine the progress made by the students
• To determine what students have learnt or not learnt in the
class
• To place students/candidates into a particular class, school,
level, or employment
• To reveal the problems or difficulty areas of a learner
• To predict outcomes – whether a learner can do a specific job, can
perform well in college, etc.
OBJECTIVES OF CLASSROOM TESTS
• Inform teachers about the performance of the learners in their classes.
• Show progress that the learners are making in the class.
• Compare the performance of one learner with that of others in order to
classify them as weak learners, average learners, or strong (high)
achievers; the high achievers can then be used to assist the weak learners.
• Promote a pupil or student from one class to another.
• Reshape teaching items, especially where tests show that certain items are
poorly learnt either because they are poorly taught or difficult for the
learners to learn.
• For certification – we test in order to certify that a learner has completed
the course and can leave. After such tests or examinations, certificates are
issued.
• Conduct research – sometimes we conduct class tests for research
purposes
• Nature of a test
o It is reliable
o It is valid (effective)
o It is objective
o It conforms to norms
o It is not expensive
o It is not time-consuming
o It produces results that can be implemented
o It is feasible
o It has educational value
Chapter 2
EDUCATIONAL AND
LEARNING OBJECTIVES
TAXONOMY OF EDUCATIONAL OBJECTIVES
• Benjamin Bloom classified all educational objectives into
three domains:
o Cognitive domain involves remembering previously learnt matter
(What do you want your student to know?).
o Affective domain relates to interests, appreciation, attitudes and
values. (What do you want your student to think or care about?)
o Psychomotor domain deals with motor and manipulative skills. (What
do you want your student to be able to do?)
• Educational aims, goals, and objectives are slightly different:
o Aim refers to the kind of outcome you expect to see or produce in the
future, e.g. Students will be proficient in the English language.
o Goal refers to the intended outcomes in general terms within a specific
period of time, e.g. At the end of the course, students will be able to read,
write, and speak the English language.
o Objective refers to the intended outcomes in specific terms within a specific
period of time, e.g. At the end of the course, students will be able to write
a short paragraph of at least three sentences.
• Instructional objectives describe in detail the behaviours that
students will be able to perform at the conclusion of a unit of
instruction. The ABCD model is used to construct an instructional
objective:
o Audience – Object of the behaviour (learners, readers, participants, etc.)
o Behaviour – What the audience is intended to do
o Condition – When or while the behaviour is being done
o Degree – Standard or criteria (speed, quality, quantity, accuracy, etc.)
How to write a learning objective
• Specify both an observable behaviour (actions) and the
object (audience) of that behaviour
“Students will be able to write a research paper”
• In addition, the criterion (degree) could also be specified
“Students will be able to write a research paper in the
appropriate scientific style”
• The condition (optional) under which the behaviour
occurs could be specified:
“At the end of the field research, students will be able
to write a research paper in appropriate scientific
style”
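The three steps above (behaviour and audience, then degree, then condition) can be sketched as a small helper that assembles an objective from its ABCD parts. The class and field names are hypothetical and chosen only for illustration; the example values are taken from the research paper objective above.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    audience: str        # who performs the behaviour
    behaviour: str       # observable action, starting with a verb
    condition: str = ""  # optional: when or while the behaviour is done
    degree: str = ""     # optional: standard or criterion

    def sentence(self) -> str:
        parts = []
        if self.condition:
            parts.append(self.condition + ",")
        parts.append(f"{self.audience} will be able to {self.behaviour}")
        if self.degree:
            parts.append(self.degree)
        text = " ".join(parts)
        # Capitalize the first letter without touching the rest.
        return text[0].upper() + text[1:]


objective = Objective(
    audience="students",
    behaviour="write a research paper",
    condition="At the end of the field research",
    degree="in appropriate scientific style",
)
print(objective.sentence())
# At the end of the field research, students will be able to write a research paper in appropriate scientific style
```

Leaving out the optional parts reproduces the shorter versions: with only audience and behaviour, the helper returns "Students will be able to write a research paper".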
Objectives should be specific, measurable, attainable, realistic,
and time-bound (SMART)
Consider the following Examples:
“At the end of the lesson, students will be able to describe
the human systems”
Poor
“At the end of the lesson, students will be able to describe
the digestive and respiratory systems”
Better
“At the end of the lesson, students will be able to describe
the components of the digestive and respiratory systems”
Best
Activity 1:
Evaluate the following learning objectives as “Poor”, “Better”, and “Best”

“At the end of the lesson, students will be able to explain at least two
functions of the ear drum”
“At the end of the lesson, students will be able to explain the human
hearing system”
“At the end of the lesson, students will be able to explain the three
parts of the ear”

Construct a learning objective for the reduction and oxidation system using
the following format:
CONDITION + AUDIENCE + BEHAVIOR + DEGREE
Learning Outcomes:
• Learning outcomes are statements that specify what learners will
know or be able to do as a result of a learning activity. Learning
outcomes address knowledge, skills, and dispositions of ideal
students.
• Possible formats of a learning outcome include:
o Format #1: To (Action verb) + (Target) + (Object) + (Modifiers)
o Format #2: The (Object) + (Action verb) + (Modifiers) + (Target)
Example:
Format #1: To demonstrate an understanding of chemical
bonds by the students naming at least two types of
bonds
Format #2: Students will demonstrate an understanding of
chemical bonds by naming at least two types of bonds
• Learning outcomes and learning objectives are slightly different:
o Learning objectives are intended results of instruction, curricula,
etc
o Learning outcomes are achieved results of what was learnt.
• Following are examples of learning outcomes and learning
objectives:
o Learning Outcome: Students will show mastery of addition of
two numbers up to 5-digits
o Learning Objective: Students will solve correctly a minimum of
35 of 40 problems on the addition of 5-digit numbers
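Because the learning objective above states a measurable criterion, checking whether a student has met it is simple arithmetic. The following is a minimal sketch; the function name is hypothetical.

```python
def meets_criterion(correct, total, minimum_correct):
    """Return the percentage score and whether the stated minimum was reached."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total, and total must be positive")
    percent = 100 * correct / total
    return percent, correct >= minimum_correct


# The objective above: at least 35 of 40 addition problems solved correctly.
print(meets_criterion(35, 40, minimum_correct=35))  # (87.5, True)
print(meets_criterion(30, 40, minimum_correct=35))  # (75.0, False)
```

The minimum of 35 out of 40 corresponds to a score of 87.5%.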
BLOOM’S TAXONOMY IN THE COGNITIVE DOMAIN
1. Knowledge: define, label, list, name, order, recognize, recall,
memorize, reproduce, repeat
2. Comprehension: classify, describe, discuss, explain, identify, indicate, locate,
recognize, report, review, select, translate
3. Application: apply, choose, demonstrate, employ, illustrate, interpret,
operate, practice, schedule, sketch, solve, use
4. Analysis: analyze, appraise, calculate, categorize, compare, contrast,
diagram, differentiate, discriminate, distinguish, examine, test,
question
5. Synthesis: arrange, assemble, collect, compose, construct, create, design,
formulate, manage, organize, plan, prepare, propose, write
6. Evaluation: argue, assess, choose, defend, estimate, judge, predict, rate,
score, select, support, value, evaluate
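The verb lists above can be used as a lookup when drafting or checking an objective. The sketch below stores the lists exactly as given and returns every level under which a verb appears; the names `BLOOM_VERBS` and `levels_for` are hypothetical. Note that some verbs (for example "choose" and "select") are listed under more than one level, so the lookup can return several levels.

```python
BLOOM_VERBS = {
    "Knowledge": ["define", "label", "list", "name", "order", "recognize",
                  "recall", "memorize", "reproduce", "repeat"],
    "Comprehension": ["classify", "describe", "discuss", "explain", "identify",
                      "indicate", "locate", "recognize", "report", "review",
                      "select", "translate"],
    "Application": ["apply", "choose", "demonstrate", "employ", "illustrate",
                    "interpret", "operate", "practice", "schedule", "sketch",
                    "solve", "use"],
    "Analysis": ["analyze", "appraise", "calculate", "categorize", "compare",
                 "contrast", "diagram", "differentiate", "discriminate",
                 "distinguish", "examine", "test", "question"],
    "Synthesis": ["arrange", "assemble", "collect", "compose", "construct",
                  "create", "design", "formulate", "manage", "organize",
                  "plan", "prepare", "propose", "write"],
    "Evaluation": ["argue", "assess", "choose", "defend", "estimate", "judge",
                   "predict", "rate", "score", "select", "support", "value",
                   "evaluate"],
}


def levels_for(verb):
    """Return every cognitive level whose verb list contains the given verb."""
    verb = verb.strip().lower()
    return [level for level, verbs in BLOOM_VERBS.items() if verb in verbs]


print(levels_for("define"))  # ['Knowledge']
print(levels_for("choose"))  # ['Application', 'Evaluation']
print(levels_for("dance"))   # []
```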
Stages in the Assessment of Cognitive Behaviors
A. Preparation
i. Break curriculum into contents (tasks) to be dealt with weekly.
ii. Break contents into content elements
iii. Specify the performance objectives
B. Practice
i. Give quality instruction
ii. Engage pupils in activities designed to achieve objectives or give them tasks
to perform.
iii. Measure their performance and assess them in relation to set objectives.
C. Use of Outcome
i. Take note of how effective the teaching has been; feedback to teacher and
pupils.
ii. Record the result
iii. Cancel if necessary
iv. Result could lead to guidance and counselling and/or re-teaching.
Stages in the Assessment of Psychomotor Behaviors
• The learning outcomes stretch from handling of writing materials to
activities in role play, practicals, laboratory activities, technical
subjects, games and athletics. The learning outcomes may be
subject based or non-subject based. Examples of subject-based outcomes:
o Fluency in speech from language
o Saying prayers from religious studies
o Measuring quantities and distance from mathematics
o Laboratory activities in sciences
o Planting of crops/Experiments
• Example of psychomotor objective:
“At the end of the lesson, the student should be able to draw a
rectangular figure with sides 2 cm long”
Stages in the Assessment of Affective Behaviors
• These learning outcomes include feelings, beliefs, attitudes,
interests, social relationships etc. which, at times are referred to as
personality traits.
• The most appropriate instrument for assessment here is
observation. Others, such as self-reporting inventories, questionnaires,
interviews, and rating scales, may also be used as the occasion
demands.
• For example, if you want to assess student’s personality traits, you
may use the following:
o Attendance behavior
o Appearance
o Conduct
o Cooperative behavior in groups
• The affective domain has been divided into five categories:
o Receiving – Before you can have an effect on a student, he must be willing to
give you his attention, e.g.
o Responding – When a student pays attention, he responds to or
participates in the process
o Valuing – The student must develop his own values and make them internally
consistent if a sense of personal stability is to result.
o Organization – The student must organize his own values
o Characterization – The student develops values, then makes the values
internally consistent, and finally his behavior is controlled by his
value system
• Example of affective objective
“At the end of the lesson, the student should be able to pay attention during
arithmetic calculations with numbers of up to 3 digits”
Chapter 3
Types and Characteristics of
Tests
Types of Tests
• Types of tests can be determined from different perspectives.
The first is whether they are discrete or integrative. Discrete point tests
are expected to test one item or skill at a time, while
integrative tests combine various items, structures, and skills into
one single test.
Discrete Test
Example: Fill in the gap with the correct form of the verb:
“Abdi ________________ to the market yesterday (go)”
Integrative Test
Example: Fill in the gap with the correct words
“Firstly, he has to understand the _______ as the speaker says
_____________. He must not stop the _________ in order to look up
a _________ or an unfamiliar sentence”
• The second perspective for identifying different kinds of tests
is by the aim and objectives of the test:
o Placement test: for placing students at a particular level,
school, or college
o Achievement tests: for measuring the achievement of a
candidate in a particular course either during or at the end
of the course
o Diagnostic tests: for determining the problems of a
student in a particular area, task, course, or programme.
o Aptitude tests: are designed to determine the aptitude of a
student for a particular task, course, programme, job, etc
o Predictive tests: designed to be able to predict the
learning outcomes of the candidate.
o Standardized tests: are any of the above mentioned tests
that have been tried out with large groups of individuals,
whose scores provide standard norms or reference points
for interpreting any scores that anybody who writes the
tests has attained.
o Continuous assessment tests: are designed to measure
the progress of students in a continuous manner.
o Teacher-made tests: are tests produced by teachers for a
particular classroom use
• The third perspective for identifying tests is their
structure:
o Multiple-choice test: the test-taker is supposed to select the
"best" choice among a set of four or five options. (They are
sometimes called "selected-response tests.")
o True-false tests: tests in which the student or examinee
indicates whether each of several statements is true or false.
o Matching tests: tests consisting of two sets of items to be
matched with each other for a specified attribute.
o Essay tests: tests which consist of a small number of questions
to which the student is expected to demonstrate in his/her
response his/her ability to (a) recall factual, conceptual, or
procedural knowledge, (b) organize this knowledge, and (c)
interpret the information critically in a logical, integrated
answer to the question.
Characteristics of Good Tests
• There are some qualities that are observed and analyzed in a good
test:
o A good test should be valid
o A good test should be reliable
o A good test must be capable of accurate measurement of the
academic ability of the learner
o A good test should combine both discrete point and
integrative test procedures for a fuller representation of
teaching-learning points
o A good test must represent teaching-learning objectives and
goals
o Test materials must be properly and systematically selected
o Variety is also a characteristic of a good test
Reliability
• Reliability means the degree to which an assessment tool produces
stable and consistent results. Reliability essentially denotes
consistency, stability, dependability, and accuracy of assessment
results.
Types of Reliability:
• Test-retest reliability is a measure of reliability obtained by
administering the same test twice over a period of time to a group of
individuals.
• Parallel forms reliability is a measure of reliability obtained by
administering different versions of an assessment tool (both
versions must contain items that probe the same construct, skill,
knowledge base, etc.) to the same group of individuals.
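Both test-retest and parallel forms reliability come down to comparing two sets of scores from the same group of individuals. A common way to summarize that comparison, assumed here because the text does not name a statistic, is the Pearson correlation coefficient: values close to 1 indicate consistent results. The scores below are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    if spread == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return covariance / spread


# Hypothetical scores of five students on two administrations of the same test.
first_administration = [55, 62, 70, 48, 81]
second_administration = [58, 60, 73, 50, 79]
print(round(pearson(first_administration, second_administration), 2))
```

For parallel forms reliability the same function would be applied to the scores on the two versions of the test.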
• Inter-rater reliability is a measure of reliability used to assess the
degree to which different judges or raters agree in their assessment
decisions. Inter-rater reliability is useful because human observers
will not necessarily interpret answers the same way; raters may
disagree as to how well certain responses or material demonstrate
knowledge of the construct or skill being assessed.
• Internal consistency reliability is a measure of reliability used to
evaluate the degree to which different test items that probe the
same construct produce similar results.
o Average inter-item correlation is obtained by taking all of the items on a test
that probe the same construct (e.g., reading comprehension), determining the
correlation coefficient for each pair of items, and finally taking the average of
all of these correlation coefficients.
o Split-half reliability is obtained by first
“splitting in half” all items of a test that are intended to probe the same area of
knowledge (e.g., World War II) in order to form two “sets” of items. The scores on
the two sets are then correlated.
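The two internal consistency measures above can be sketched in a few lines. The data are hypothetical (each row is one student, each column one item probing the same construct), and splitting the items into odd-numbered and even-numbered positions is only one possible way to form the two halves; the text does not prescribe a particular split.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread = sqrt(sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y))
    if spread == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return covariance / spread


def average_inter_item_correlation(responses):
    """Mean of the correlations between every pair of items."""
    items = list(zip(*responses))  # one tuple of scores per item
    pairs = list(combinations(items, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def split_half_correlation(responses):
    """Correlation between each student's totals on the two halves of the test."""
    first_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return pearson(first_half, second_half)


# Hypothetical scores of five students on four items (1 = low, 5 = high).
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(average_inter_item_correlation(responses), 2))
print(round(split_half_correlation(responses), 2))
```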
Validity
• Validity of a test is the extent to which a test (such as a chemical,
physical, or scholastic test) accurately measures what it is supposed to
measure.
Types of Validity
• Face validity ascertains that the measure appears to be assessing
the intended construct under study. The stakeholders can easily
assess face validity
• Construct validity is used to ensure that the measure is actually
measuring what it is intended to measure (i.e. the construct), and not
other variables
• Criterion-Related Validity is used to predict future or current
performance - it correlates test results with another criterion of
interest
• Formative validity, when applied to outcomes assessment, is used
to assess how well a measure is able to provide information to help
improve the program under study.
• Sampling validity (similar to content validity) ensures that the
measure covers the broad range of areas within the concept under
study.
What are some ways to improve validity?
o Make sure your goals and objectives are clearly defined and
operationalized
o Match your assessment measure to your goals and objectives
o Get students involved; have the students look over the assessment for
troublesome wording, or other difficulties
o If possible, compare your measure with other measures, or data that
may be available
Chapter 4
Principles of Test
Construction
Test Construction
• Teacher-made tests are indispensable in evaluation as they
are handy in assessing the degree of mastery of the specific
units taught by the teacher.
• The principles behind the construction of the different
categories of tests mentioned above are essentially the same.
Planning Test Construction
• In planning test construction, major questions should be
considered:
o What is the intended function of the test?
o What are the specific objectives of the content area?
o What content area has been taught?
o What type of test shall be suitable to achieve the intended objectives?
• Planning a test requires several major decisions.
The following are some of the decisions
crucial for test construction:
o Specificity and acceptance of the objectives
o Comprehensiveness of the measure (test)
o Test Administration
o Level of teacher’s expectation of the students
o Reporting the results
Specificity and acceptance of objectives
• Whenever you are faced with the task of preparing an
evaluation device for your classroom, first determine which of
these categories seems to fit your particular situation best:
o All objectives are specified in performance terms, e.g. English grammar,
typing, trainings on safe use of chemicals in laboratory.
o Not all objectives are specified in performance terms, but most agree
that the specified objectives represent fairly well the major goals of
instruction, e.g. mathematical skills
o Not all objectives are specified in performance terms; also, there is a lack of
agreement regarding the degree of completeness of the objectives
o No real attempt was made to translate objectives into performance
terms.
Comprehensiveness of the measure (test)
• In considering the comprehensiveness of the measure, three
distinct possibilities exist:
o Every objective is evaluated by one or more items in the
measurement device
o A random sample of items is selected for the measurement
device. Not every objective is evaluated.
o A biased selection of objectives is evaluated (Bias occurs in
deliberate selection of evaluation items by the teacher)
Test Administration
• Test administration can be categorized by the content of the
measure (test) and the time of testing. Four categories need
attention:
o Same items for all students are administered at the same
time.
o Same items for all students are administered at different
times (as they become ready)
o Different items are administered to students at the same
time
o Different items are administered to students at different
times
Level of teacher’s expectation of students
• When you administer some test to students, your expectation falls
into one of four categories:
o All students will completely master the objectives
o All students will be brought to some minimum performance level
on the objectives
o A distribution of performance is expected with the measurement
device (test)
o The range of scores on the test is maximized by concentrating
additional effort on the higher performers
Reporting the results
• Attainment criteria of the objectives
• Comparing individual performance of students
Table of Specification (Blueprint)
• Table of specification (test blueprint) is a table showing the number
of items that will be asked under each topic of the content and/or
behavioral objective.
• When constructing a test, teachers need to be concerned that the
test measures an adequate sampling of the class content at the
cognitive level that the material was taught.
• The table of specification can help teachers map the amount of class
time spent on each objective with the cognitive level at which each
objective was taught thereby helping teachers to identify the types
of items they need to include on their tests.
• Selecting number of items in the different cognitive levels can be
determined by the relative importance of the content/objectives in
terms of time and depth
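One way to turn "relative importance in terms of time" into numbers is to share out the test items in proportion to the class time spent on each topic. The sketch below does this and uses the largest-remainder method to keep the total exact after rounding; the rounding rule, the function name, and the hours are assumptions for illustration, not part of the course material.

```python
def allocate_items(time_spent, total_items):
    """Share out test items across topics in proportion to class time.

    time_spent  -- mapping of topic to hours (or periods) of instruction
    total_items -- number of items the whole test should contain
    """
    total_time = sum(time_spent.values())
    if total_time <= 0:
        raise ValueError("total class time must be positive")
    exact = {topic: total_items * hours / total_time
             for topic, hours in time_spent.items()}
    counts = {topic: int(share) for topic, share in exact.items()}
    # Hand leftover items to the topics with the largest fractional remainders.
    leftover = total_items - sum(counts.values())
    by_remainder = sorted(exact, key=lambda t: exact[t] - counts[t], reverse=True)
    for topic in by_remainder[:leftover]:
        counts[topic] += 1
    return counts


# Hypothetical hours of instruction for a unit on matter.
hours = {"States of matter": 2, "Changes of state": 3, "Temperature graphs": 5}
print(allocate_items(hours, total_items=10))
# {'States of matter': 2, 'Changes of state': 3, 'Temperature graphs': 5}
```

The same counts can then be divided among the cognitive levels at which each topic was taught to fill in the cells of the table.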
Example of Table of Specification based on objectives
Learning Objectives                           Cognitive Domain Levels
                                              Knowledge   Comprehension   Application
List the three states of matter                   2
Identify the different ways matter can be
transformed to its different states                            1
Interpret the graph of the stages of matter
change at different temperatures                                               2
Example of Table of Specification based on content
Content                                 Cognitive Domain Levels
                                        (Knowledge, Comprehension, Application)
States of Matter                        1   1
Matter Change                           1   2
Matter change temperatures in graph     1   2
Multiple Choice Construction
• Multiple choice question (MCQ) consists of two distinct parts:
o The first part, which contains the task or problem, is called the stem
of the item. The stem may be presented either
as a question or as an incomplete statement. The form
makes no difference as long as it presents a clear and
specific problem to the examinee.
o The second part presents a series of options or alternatives.
Each option represents a possible answer to the question. In
the standard form, one option is the correct or best
answer, called the correct response, and the others are
misleads or foils, called distracters.
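The two parts of an item described above map naturally onto a small data structure. The sketch below is one hypothetical representation (the class and attribute names are assumptions); the item text is borrowed from one of the example questions in this chapter.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MultipleChoiceItem:
    stem: str            # the task or problem
    options: List[str]   # all alternatives shown to the examinee
    correct_index: int   # position of the correct response in options

    def __post_init__(self):
        if not 0 <= self.correct_index < len(self.options):
            raise ValueError("correct_index must point at one of the options")

    @property
    def correct_response(self) -> str:
        return self.options[self.correct_index]

    @property
    def distracters(self) -> List[str]:
        # Every option other than the correct response.
        return [opt for i, opt in enumerate(self.options) if i != self.correct_index]


item = MultipleChoiceItem(
    stem="In objective testing, the term objective refers to the method of:",
    options=[
        "identifying the learning outcomes.",
        "selecting the test content.",
        "presenting the problem.",
        "scoring the answers.",
    ],
    correct_index=3,
)
print(item.correct_response)   # scoring the answers.
print(len(item.distracters))   # 3
```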
Strengths of Multiple-Choice Items
o Versatility in measuring all levels of cognitive skills.
o Permit a wide sampling of content and objectives.
o Provide highly reliable test scores.
o Can be machine-scored quickly and accurately.
o Reduced guessing factor compared with true-false items.
Limitations of Multiple-Choice Items
o Difficult and time-consuming to construct.
o Depend on student’s reading skills and instructor’s writing
ability.
o Ease of writing low- level knowledge items leads instructors to
neglect writing items to test higher- level thinking.
o May encourage guessing (but less than true- false).
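The claim that multiple-choice items reduce the guessing factor compared with true-false items can be checked with simple arithmetic: a student who guesses blindly has a 1-in-k chance on an item with k equally attractive options. The sketch below assumes purely random guessing and a hypothetical 40-item test.

```python
def expected_chance_score(num_items, options_per_item):
    """Expected number of correct answers from blind guessing alone."""
    if num_items < 0 or options_per_item < 2:
        raise ValueError("need a non-negative item count and at least two options per item")
    return num_items / options_per_item


# Hypothetical 40-item test in three formats.
print(expected_chance_score(40, 2))  # true-false: 20.0
print(expected_chance_score(40, 4))  # four options: 10.0
print(expected_chance_score(40, 5))  # five options: 8.0
```

In practice students rarely guess completely at random, so these figures are only a baseline for comparison.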
• General guidelines for constructing multiple-choice items:
o Each question should be designed to assess an important learning
outcome/objective.
o Present a single clearly formulated problem in the stem of the question
o State the stem in simple and clear language
o Place as much of the wording as possible in the stem
o Whenever possible, state the stem in positive form
o Make certain that the intended answer is correct or clearly best
o Check all alternatives are grammatically consistent with the stem and
parallel in form
o Avoid verbal clues
o Make the distracters plausible and attractive
o Vary the relative length of the correct answer to eliminate length as a clue
o Avoid using the alternative ‘all of the above’ and use ‘none of the above’
with extreme caution
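A few of the guidelines above are mechanical enough to check automatically before a test is reviewed by hand. The sketch below covers only three of them (the 'all of the above' and 'none of the above' alternatives, a negatively stated stem, and answer length as a clue). The function name and the thresholds are hypothetical, and the checks are crude heuristics that cannot replace the teacher's judgement.

```python
def review_item(stem, options, correct_index):
    """Return a list of warnings for one multiple-choice item."""
    warnings = []
    normalized = [opt.strip().lower().rstrip(".") for opt in options]

    if "all of the above" in normalized:
        warnings.append("avoid the alternative 'all of the above'")
    if "none of the above" in normalized:
        warnings.append("'none of the above' should be used with extreme caution")

    # Crude check for a negatively stated stem: look for the word "not".
    stem_words = stem.lower().replace("?", " ").replace(":", " ").split()
    if "not" in stem_words:
        warnings.append("stem is stated in negative form")

    # Crude length clue: the correct answer is much longer than every distracter.
    correct_length = len(options[correct_index])
    distracter_lengths = [len(opt) for i, opt in enumerate(options) if i != correct_index]
    if distracter_lengths and correct_length > 1.5 * max(distracter_lengths):
        warnings.append("correct answer is noticeably longer than the distracters")

    return warnings


# Hypothetical item that breaks two of the guidelines.
problems = review_item(
    stem="Which one of the following is not a type of test?",
    options=["Essay.", "Matching.", "All of the above."],
    correct_index=0,
)
for warning in problems:
    print(warning)
```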
• Each question should be designed to assess an important learning
outcome/objective.
o Avoid testing unimportant details, unrelated bits of
information, and material that is irrelevant to the desired
outcomes
• Present a single clearly formulated problem in the stem of the
question, e.g.
Poor: A table of specification
a) Indicates how a test will be used to improve learning.
b) Provides a more balanced sampling of content.*
c) Arranges the instructional objectives in order of their
importance.
d) Specifies the method of scoring to be used on a test.
Better: What is the main advantage of using a table of specifications
when preparing an achievement test?
a) It reduces the amount of time required.
b) It improves the sampling of content.*
c) It makes the construction of test questions easier.
d) It increases the objectivity of the test
• State the stem in simple and clear language
Poor: The paucity of plausible, but incorrect, statements that can be
related to a central idea poses a problem when constructing
which one of the following types of test questions?
A. Short-answer.
B. True-false.
C. Multiple-choice.*
D. Essay.
Better: The lack of plausible, but incorrect, alternatives will cause the
greatest difficulty when constructing:
A. short-answer questions.
B. true-false questions.
C. multiple-choice questions.*
D. essay questions
Another common fault in stating MCQs is to load the stem with irrelevant and,
thus, nonfunctioning material.
Poor: Testing can contribute to the instructional program of the school in
many important ways. However, the main function of testing in
teaching is:
Better: The main function of testing in teaching is:
• Place as much of the wording as possible in the stem
Poor: In objective testing, the term objective:
A. refers to the method of identifying the learning outcomes.
B. refers to the method of selecting the test content.
C. refers to the method of presenting the problem.
D. refers to the method of scoring the answers.*
Better: In objective testing, the term objective refers to the method of:
A. identifying the learning outcomes.
B. selecting the test content.
C. presenting the problem.
D. scoring the answers*
Sometimes the problem is to reword the entire question.
Poor: Instructional objectives are most apt to be useful for test construction
purposes when they are stated in such a way that they show:
A. the course content to be covered during the instructional period.
B. the kinds of behavior students should demonstrate upon reaching
the goal.*
C. the things the teacher will do to obtain maximum student learning.
D. the types of learning activities to be participated in during the
course.
Better: Instructional objectives are most useful for test-construction purposes
when they are stated in terms of:
A. course content.
B. student behaviour.*
C. teacher behavior.
D. learning activities
• Whenever possible, state the stem in positive form
Poor: Which one of the following is not a desirable practice when preparing
MCQs?
A. Stating the stem in positive form.
B. Using a stem that could function as a short-answer question.
C. Underlining certain words in the stem for emphasis.
D. Shortening the stem by lengthening the alternatives*
Better: All of the following are desirable practices when preparing MCQs
EXCEPT:
A. stating the stem in positive form.
B. using a stem that could function as a short-answer question.
C. underlining certain words in the stem for emphasis.
D. shortening the stem by lengthening the alternatives.*
• Make certain that the intended answer is correct or clearly best
Poor: What is the best method of selecting subject content for test questions?
Better: Which one of the following is the best method of selecting subject
content for test questions?
• Check all alternatives are grammatically consistent with the stem and
parallel in form
Poor: The recall of factual information can be measured best with a:
A. matching question.
B. multiple-choice question
C. short-answer question.*
D. essay question.
Better: The recall of factual information can be measured best with:
A. matching questions.
B. multiple-choice questions.
C. short-answer questions.*
D. essay questions.
Multiple Choice Construction
Stating all of the alternatives in parallel form also prevents unnecessary clues
from being given to students.
Poor: Why should negative terms be avoided in the stem of an MCQ?
A. They may be overlooked.*
B. The stem tends to be longer.
C. The construction of alternatives is more difficult.
D. The scoring is more difficult.
Better: Why should negative terms be avoided in the stem of an MCQ?
A. They may be overlooked.*
B. They tend to increase the length of the stem.
C. They make the construction of alternatives more difficult.
D. They may increase the difficulty of the scoring.
Multiple Choice Construction
• Avoid verbal clues
Example 1 Similarity of wording in both the stem and the correct answer
Poor: Which one of the following would you consult first to locate research
articles on achievement testing?
A. Journal of Educational Psychology
B. Journal of Educational Measurement
C. Journal of Consulting Psychology
D. Review of Educational Research*
Example 2 Stereotyped correct answer
Poor: Learning outcomes are most useful in preparing tests when they are:
A. clearly stated in behavioral terms.*
B. developed co-operatively by teachers and students.
C. prepared after the instruction has ended.
D. stated in general terms
Multiple Choice Construction
Example 3 Too much detail in the correct answer may provide a clue
Poor: Lack of attention to learning outcomes during test preparation:
A. will lower the technical quality of the questions.
B. will make the construction of test questions more difficult.
C. will result in the greater use of essay questions.
D. may result in a test that is less relevant to the instructional program.*
Example 4 Use of two responses that are all inclusive makes it possible to
eliminate the other alternatives.
Poor: Which one of the following types of test questions assesses learning
outcomes at the recall level?
A. Supply-type questions.*
B. Selection-type questions.
C. Matching questions.
D. MCQs.
Multiple Choice Construction
• Make the distracters plausible and attractive
Poor: Obtaining a dependable ranking of students is of major concern when
using:
A. norm-referenced summative tests.*
B. behavior descriptions.
C. check lists.
D. questionnaires.
Better: Obtaining a dependable ranking of students is of major concern when
using:
A. norm-referenced summative tests.*
B. teacher-made diagnostic tests.
C. mastery achievement tests.
D. criterion-referenced formative tests.
Multiple Choice Construction
• Vary the relative length of the correct answer to eliminate length as a
clue
Poor: One advantage of MCQs over essay questions is that they:
A. measure more complex outcomes.
B. depend more on recall.
C. require less time to score.
D. provide for a more extensive sampling of course content.*
Better: One advantage of MCQs over essay questions is that they:
A. provide for the measurement of more complex learning outcomes.
B. place greater emphasis on the recall of factual information.
C. require less time for test preparation and scoring.
D. provide for a more extensive sampling of course content.*
Multiple Choice Construction
• Avoid using the alternative ‘all of the above’ and use ‘none of the above’
with extreme caution
Poor: Which one of the following is a category of the cognitive domain in
Bloom’s Taxonomy?
A. Critical thinking.
B. Scientific thinking.
C. Reasoning ability.
D. None of the above.*
Multiple Choice Construction
Constructing multiple choice items with the different cognitive domain
levels:
Knowledge: All of the following are states of matter EXCEPT:
A. Solid
B. Particle
C. Liquid
D. Gas
Comprehension: When water is heated, its temperature rises. If the
temperature of the water goes beyond 100°C, it changes to:
A. Solid
B. Concrete
C. Liquid
D. Gas
Multiple Choice Construction
Application: When water reaches a temperature of 0°C, it turns into ice. If salt
is mixed with the water at 0°C, it does not turn into ice. Which one of the
following changes has happened?
A. The freezing point of the water has changed from 0°C to a lower point
B. The melting point of the water has changed from 0°C to a lower point
C. The boiling point of the water has changed from 100°C to a lower point
D. The freezing point of the water has changed from 0°C to a higher point

Analysis: When water is heated at a temperature of 100°C for some time, it
turns into vapor. The relationship between the state change and time is
that:
A. As time is increased, the state of the matter changes to vapor at 100°C
B. As time is decreased, the state of the matter changes to vapor at 100°C
C. As time is decreased, the state of the matter changes to solid at 100°C
D. As time is increased, the state of the matter remains liquid at 100°C
Multiple Choice Construction
Evaluation: Several applications are based on the fact that matter changes
from one state to another. Which of the following shows the reality of
matter change in our lives?
A. Milk is powdered because it can then be used for a longer time
B. Food is cooked because it can then be easily eaten
C. Books are made from trees because they help us to write
D. Mobile phones use electromagnetic waves because sound can be
passed on them
True-False Construction
Advantages of True-False
There are several advantages to using True-False items. They can provide:
• The widest sampling of content or objectives per unit of
testing time.
• Scoring efficiency and accuracy.
• Versatility in measuring all levels of cognitive ability.
• Highly reliable test scores.
• An objective measurement of student achievement or ability.
True-False Construction
Disadvantages of True-False
True-False items also have several limitations. They:
• Incorporate an extremely high guessing factor. For simple true-false
items, each student has a 50/50 chance of correctly answering the
item without any knowledge of the item's content.
• Can often lead an instructor to write ambiguous statements due to
the difficulty of writing statements which are unequivocally true or
false.
• Do not discriminate between students of varying ability as well as
other item types.
• Can often include more irrelevant clues than do other item types.
• Can often lead an instructor to favor testing of trivial knowledge.
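The 50/50 guessing factor noted above can be made concrete. The sketch below assumes a hypothetical 10-item true-false test with a pass mark of 6 (both figures are illustrative and not taken from the text) and uses the binomial distribution to estimate the chance that a student who guesses every item blindly still reaches the pass mark.

```python
from math import comb


def prob_at_least(correct_needed, n_items, p_guess=0.5):
    """Probability of getting at least `correct_needed` items right by blind guessing."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(correct_needed, n_items + 1)
    )


# Hypothetical test: 10 true-false items, pass mark of 6 correct answers.
p_pass = prob_at_least(6, 10)
print(f"Chance of passing by guessing alone: {p_pass:.3f}")
```

Under these assumptions the chance is 386/1024, or roughly 0.38, which illustrates why a short true-false test on its own is a weak basis for a pass/fail decision.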
True-False Construction
• Guidelines for constructing True-False questions
o Avoid using more than one idea in a True or False question. Make your
main point prominent.
o Keep the statement short and simple. The question should be based on
the learner’s knowledge and not their ability to interpret the question.
o True statements should be true under all circumstances. Avoid using
may, seldom, possible, often, and other qualifiers.
o Use negative statements sparingly and do not use double negatives.
Negative words are often overlooked and should be underlined or in
capital letters.
o Opinion statements should be attributed to some source. Rather than
agreeing with the stated opinion, students should show that they know the
opinions of the organization or individuals cited.
o When cause-and-effect relationships are being measured, use only true
propositions.
True-False Construction
• Avoid using more than one idea in a True or False question.
Make your main point prominent.
Poor: All spiders have exoskeletons and only prey on insects
Better: All spiders have exoskeletons
• Keep the statement short and simple. The question should be
based on the learner’s knowledge and not their ability to
interpret the question.
Poor: Hydrogen is used in the Haber process for the fixation of
atmospheric nitrogen, in the production of methanol, and
in hydrogenation of fats and oils.
Better: Hydrogen is used for the production of methanol
True-False Construction
• True statements should be true under all circumstances. Avoid
using may, seldom, possible, often, and other qualifiers
Poor: Solar energy is often used as an alternative energy source
Better: Solar energy is an alternative energy source

• Use negative statements sparingly and do not use double
negatives. Negative words are often overlooked and should be
underlined or in capital letters.
Poor: Bread and grains are not at the top of the food pyramid
Better: Bread and grains are at the bottom of the food pyramid
True-False Construction
• Opinion statements should be attributed to some source.
Rather than agreeing with the stated opinion, students
should show that they know the opinions of the organization or
individuals cited
Poor: Scientific method is the only way of studying science
Better: Dr. Bartels prefers using chaos theory to study
science
• When cause-and-effect relationships are being measured, use
only true propositions
Poor: Sulfur dioxide produces sulfuric acid because sulfur gases
are emitted from industrial smoke stacks.
Better: Sulfur dioxide produces sulfuric acid because of oxidation
True-False Construction
Bloom’s Taxonomy
Knowledge:      T F  Solid is one of the states of matter
Comprehension:  T F  When liquid is heated at 100°C, it changes to solid
Application:    T F  Salt is used to decrease the melting point of ice
Analysis:       T F  The relationship between the state change and time is
                     that as time is decreased, the state of the matter
                     changes to vapor at 100°C
Evaluation:     T F  The reality of matter change is significant in many
                     aspects of life, like powdered milk
Matching Construction
In matching questions, students are given instructions (called the lead-in), are
presented with a list of explanations, descriptions, or definitions (called
premises), and are required to match these on a factual or logical basis with the
correct alternatives (called responses) from a second list
Advantages of Matching Questions
• Allow the comparison of related ideas, concepts, etc
• Efficient means of assessing the association between a variety of items
• Encourage the integration of information
• Preferable to several multiple choice questions that have the same answer
choices
• Relatively quick and easy to score
• Objective nature limits bias in scoring; no bias for good or bad writing
skills
• Easy to administer to large groups
Matching Construction
Disadvantages of matching questions
• Difficult to generate a sufficient number of plausible premises
• Not effective at testing isolated facts
• May limit assessment to lower levels of understanding
• Students can guess
Matching Construction
Guidelines for writing matching questions
• Use only homogeneous material in a single matching exercise.
• Include an unequal number of responses and premises and instruct
the student that responses may be used once, more than once, or
not at all.
• Keep the list of items to be matched brief, and place the shorter
responses on the right.
• Arrange the list of responses in logical order. Place words in
alphabetical order and numbers in sequence.
• Indicate in the directions the basis for matching the responses and
premises. This avoids ambiguity and confusion and saves testing
time.
• Place all of the items for one matching exercise on the same page.
Matching Construction
• The following example fits all the guidelines stated in the previous
slide
Direction: On the line to the left of each compound in the Premises,
write the letter of the compound's formula presented in Responses.
Use each formula only once.
Premise                      Response
1. _______ Water             A. H2SO4
2. _______ Salt              B. H2O
3. _______ Ammonia           C. NH3
4. _______ Sulfuric Acid     D. HCl
                             E. NaCl
Essay Construction
• As defined by Stalnaker (1951, p.495), “An essay question is a test
item which requires a response composed by the examinee, usually
in the form of one or more sentences, of a nature that no single
response or pattern of responses can be listed as correct, and the
accuracy and quality of which can be judged subjectively only by
one skilled or informed in the subject”.
• Advantages of an essay type test
o By their very nature essay questions assess higher-order thinking.
o Essay questions are easy to construct.
o The use of essay questions eliminates the problem of guessing.
o Essay questions benefit all students by placing emphasis on the
importance of written communication skills.
o Essay questions encourage students to prepare more thoroughly.
Essay Construction
• Disadvantages of an essay type test
o Grading is often subjective and not consistent, colored by
preconceptions of student, prior performance, time of day,
neatness and handwriting, spelling and grammar, etc.
o Sampling of content can be limited
o Good writing requires time to think, organize, write and revise
o Time consuming to correct
o Advantageous to students with good writing and verbal skills
o Essay questions are often not properly developed to assess higher
thinking skills
o Advantageous to students who are quick to develop arguments
Essay Construction
• General guidelines for constructing essay type tests
o Essay questions should draw on higher-level questioning
o Where possible, a series of short-answer essay questions is better
than one more general question
o Frame the question so that there is a correct response; that is, a
response that people knowledgeable in the field would agree is correct
o If the items are to be used with students from varying performance
levels, try to set the items up so all can make at least some response
o In general, don’t allow a choice of questions
o Write the expected response to the question as you write the question
o Developing model answers as you write the questions will also allow
you to develop point values in advance
o A well-constructed essay question should establish a framework within
which the student operates by delimiting the area covered by the
question, using clear descriptive words, and aiming the student toward the
desired response.
Essay Construction
• Essay questions should draw on higher-level questioning
Example: Compare and contrast the aerobic and anaerobic respiratory
processes
• Where possible, a series of short-answer essay questions is better than
one more general question
General: Describe how the different kinds of pollution adversely
affect our environment by giving examples
Short questions:
Describe the process whereby thermal pollution causes adverse
effects in a lake
Contrast the effects of carbon monoxide, sulfur dioxide, and soot on
man and his environment
Describe the process whereby DDT causes softness in eagle eggs
Essay Construction
• Frame the question so that there is a correct response; that is, a
response that people knowledgeable in the field would agree is
correct
Poor: In your opinion, what are the causes of cardiovascular
diseases?
Better: State three risk factors of cardiovascular diseases
• If the items are to be used with students from varying performance
levels, try to set the items up so all can make at least some response
Example: Explain two stages of matter change
• In general, don’t allow a choice of questions. Don’t include in your
directions a statement such as “Answer three out of the five”
Essay Construction
• A well-constructed essay question should establish a framework within
which the student operates by delimiting the area covered by the question,
using clear descriptive words, and aiming the student toward the desired response.
Poor: Describe the respiratory system
Better: Describe the function of lungs

Poor: Discuss the advantages of essay-type questions
Better: Discuss the advantages of essay-type questions by comparing them with
objective-type questions

Poor: Discuss the Somaliland Education Act
Better: Discuss the statement “Primary education should be free”. In your
answer, include how it was received by a) human rights activists, b)
religious leaders, and c) primary school headmasters
Essay Construction
• Scoring or grading essay type questions
Guidelines:
o Student papers should be scored/graded anonymously, and all
answers to a given item should be scored at one time, rather than grading
each student’s whole paper separately
o Avoid distractors in scoring essay tests, such as the student’s good
handwriting, writing style, correct grammar, neatness, and prior knowledge
of the student
o Decide what factors constitute a good answer before administering an
essay question
o Explain the factors in the test item
o Read all answers to a single essay question before reading other
questions
o If possible, reread the essay answers a second time after the initial scoring
Essay Construction
• Scoring or grading essay type questions
Ways of scoring essay test
o Analytical scoring – in this type, the essay is scored in terms of each of its
components.

Example: Identify the three types of chemical reactions in terms of their
formation (6 Marks)
Model answer:
Chemical reactions occur in different types. First, synthesis reactions (1
mark) occur when two or more chemical species combine to form a more
complex product (1 mark). Second, decomposition reactions (1 mark) occur
when a compound is broken into smaller chemical species (1 mark). Third,
single displacement reactions (1 mark) occur when one element is
displaced from a compound by another element (1 mark).
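As a rough illustration of analytical scoring, the model answer above can be treated as a checklist of six one-mark components. The sketch below is only one possible way to record this; the component labels and the sample student answer are hypothetical and are not part of the original marking scheme.

```python
# Hypothetical rubric for the 6-mark model answer: one mark per component.
RUBRIC = {
    "names synthesis": 1,
    "explains synthesis": 1,
    "names decomposition": 1,
    "explains decomposition": 1,
    "names single displacement": 1,
    "explains single displacement": 1,
}


def analytic_score(components_present):
    """Sum the marks of every rubric component found in the student's answer."""
    return sum(marks for name, marks in RUBRIC.items() if name in components_present)


# Hypothetical student who names all three reaction types but explains only two.
student_components = {
    "names synthesis",
    "explains synthesis",
    "names decomposition",
    "explains decomposition",
    "names single displacement",
}
score = analytic_score(student_components)
print(f"Score: {score} / {sum(RUBRIC.values())}")
```

Writing the components down before marking makes it easier to award the same mark to equivalent answers.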
Essay Construction
• Another example
Question: Describe the process involved from dumping of the
phosphates to the death of the fish in lakes (6 Marks)
Model Answer:
“Phosphates dumped in the lakes become food for algae, and then more
algae grow (1 mark). The increased algae lead to increased dead algae
(1 mark), which leads to increased bacteria (1 mark). The bacteria use
oxygen in the water (1 mark), so increased bacteria cause increased use
of oxygen (1 mark). Less oxygen is then available to the fish in the lake,
which causes the fish to die (1 mark).”
Essay Construction
• Holistic/Rating Scoring – in this type, a single total score is assigned to
each essay based on the teacher’s general impression or overall
standards
Example: Explain how a new substance is formed through the process
of chemical reaction (10 Marks)
Model Answer:
In a new substance, only the atoms present in the reactants end up in the
product. No new atoms are created, and no atoms are destroyed.
Reactants contact each other, bonds between atoms in the reactants are
broken, and atoms rearrange and form new bonds to make the product.
The atoms in the reactants rearrange themselves and bond together
differently to form one or more new products with different
characteristics than the reactants.
Superior = 5   Above-average = 4   Average = 3   Below-average = 2   Inferior = 1
Chapter 5
Test Contingencies
Test Contingencies
• A number of factors can affect the outcome of the test in the
classroom.
o Student Factors
▪ Socio-economic background
▪ Health
▪ Anxiety
▪ Interest
▪ Mood etc
o Teacher Factors
▪ Teacher characteristics
▪ Instructional Techniques
▪ Teachers’ qualifications/knowledge
Test Contingencies
o Learning Materials
▪ The nature
▪ Appropriateness etc.
o Environmental
▪ Time of day
▪ Weather condition
▪ Arrangement
▪ Invigilation etc
• Other factors that affect tests negatively are inherent in the design
of the test itself. These include:
o Appropriateness of the objective of the test.
o Appropriateness of the test format
o Relevance and adequacy of the test content to what was taught
Test Contingencies
Validity
• Validity of a test is the extent to which a test accurately measures what it is
supposed to measure
Types of Validity
• Face validity - This is a validity that depends on the judgment of the
external observer of the test. It is the degree to which a test appears to
measure the knowledge and ability based on the judgment of the external
observer.
• Construct Validity - This refers to how accurately a given test actually
describes an individual in terms of a stated psychological trait.
• Criterion-Related Validity - This validity involves specifying the ability
domain of the learner and defining the end points so as to provide an absolute
scale. In order to achieve this goal, the test that is constructed is compared
or correlated with an outside criterion, measure or judgment.
Test Contingencies
• Predictive validity - It suggests the degree to which a test accurately
predicts future performance. For example, if we assume that a student who
does well in a particular mathematics aptitude test should be able to
undergo a physics course successfully, predictive validity is achieved if the
student does well in the course
• Content Validity - This validity suggests the degree to which a test
adequately and sufficiently measures the particular skills, subject
components, item functions, or behavior it sets out to measure.
• What are some ways to improve validity?
o Make sure your goals and objectives are clearly defined and
operationalized
o Match your assessment measure to your goals and objectives
o Get students involved; have the students look over the assessment for
troublesome wording, or other difficulties
o If possible, compare your measure with other measures, or data that
may be available
Test Contingencies
Reliability
• Reliability means the degree to which an assessment tool produces
stable and consistent results. Reliability essentially denotes
consistency, stability, dependability, and accuracy of assessment
results.
Types of Reliability:
• Test-retest reliability is a measure of reliability obtained by
administering the same test twice over a period of time to a group of
individuals.
• Parallel forms reliability is a measure of reliability obtained by
administering different versions of an assessment tool (both
versions must contain items that probe the same construct, skill,
knowledge base, etc.) to the same group of individuals.
Test Contingencies
• Inter-rater reliability is a measure of reliability used to assess the
degree to which different judges or raters agree in their assessment
decisions. Inter-rater reliability is useful because human observers
will not necessarily interpret answers the same way; raters may
disagree as to how well certain responses or material demonstrate
knowledge of the construct or skill being assessed.
• Internal consistency reliability is a measure of reliability used to
evaluate the degree to which different test items that probe the
same construct produce similar results.
o Average inter-item correlation is obtained by taking all of the items on a test
that probe the same construct (e.g., reading comprehension), determining the
correlation coefficient for each pair of items, and finally taking the average of
all of these correlation coefficients.
o Split-half reliability: the process of obtaining split-half reliability begins by
“splitting in half” all items of a test that are intended to probe the same area of
knowledge (e.g., World War II) in order to form two “sets” of items
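The two internal-consistency measures above can be sketched in a few lines. The item scores below are hypothetical (four items, five students), the odd/even split is just one assumed way of "splitting in half", and correlating the totals of the two halves is the usual next step rather than something spelled out in the text.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_inter_item_correlation(items):
    """Average of the correlations over every pair of items."""
    pairs = list(combinations(items, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def split_half_correlation(items):
    """Correlate students' totals on odd-numbered items with totals on even-numbered items."""
    first_half, second_half = items[0::2], items[1::2]
    totals_first = [sum(scores) for scores in zip(*first_half)]
    totals_second = [sum(scores) for scores in zip(*second_half)]
    return pearson(totals_first, totals_second)


# Hypothetical data: each inner list holds one item's scores for five students.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
    [2, 1, 4, 4, 5],
]
avg_r = average_inter_item_correlation(items)
half_r = split_half_correlation(items)
print(f"Average inter-item correlation: {avg_r:.2f}")
print(f"Split-half correlation: {half_r:.2f}")
```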
Test Contingencies
FACTORS AFFECTING RELIABILITY
• Some of the factors that affect reliability include:
o The relationship between the objective of the tester and that of
the students.
o The clarity and specificity of the items of the test.
o The significance of the test to the students.
o Familiarity of the test takers with the subject matter.
o Interest and disposition of the test takers.
o Level of difficulty of items.
o Socio-cultural variables.
o Practice and fatigue effects
Test Contingencies
Correlation Coefficient
• Reliability of tests is often expressed in terms of correlation
coefficients. Correlation concerns the similarity between two
persons, events or things.
• A correlation coefficient is a statistic that describes, with
numbers, the degree of relationship between two sets or pairs of
scores.
• Positive correlations are between 0.00 and +1.00, while negative
correlations are between 0.00 and –1.00. A correlation at or close to
zero shows no reliability; a correlation between 0.00 and +1.00 shows
some reliability; a correlation of +1.00 shows perfect reliability.
Test Contingencies
Methods of Calculating Correlation Coefficient
• Product-moment correlation method, which uses the
deviations of students’ scores in the two subjects being compared

• Pearson product – moment Correlation coefficient
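For reference, the two methods are usually written as follows in standard textbook notation. The symbols are an assumption here: $x$ and $y$ are the paired scores, $\bar{x}$ and $\bar{y}$ are their means, and $N$ is the number of pairs.

Deviation form:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \, \sum (y - \bar{y})^2}}$$

Raw-score form:

$$r = \frac{N \sum xy - \sum x \sum y}{\sqrt{\left[N \sum x^2 - \left(\sum x\right)^2\right]\left[N \sum y^2 - \left(\sum y\right)^2\right]}}$$

The two forms are algebraically equivalent and give the same value of $r$.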


Test Contingencies
Example 1
A teacher administered a biology test to a group of five students and
recorded their results. After two weeks, the teacher administered
the same test to the same students and recorded the results. The table
below shows the results of the two tests.

Students              1  2  3  4  5
1st Test Results (x)  8  3  7  6  6
2nd Test Results (y)  7  4  4  7  8

a) Find the correlation coefficient by using:
o Pearson product-moment correlation method
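One way to work Example 1 is sketched below, using the standard deviation-score and raw-score forms of the Pearson coefficient (the variable names are illustrative). Both forms are computed so that they can be checked against each other.

```python
from math import sqrt

x = [8, 3, 7, 6, 6]  # 1st test results
y = [7, 4, 4, 7, 8]  # 2nd test results
n = len(x)

# Deviation-score form: r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2))
mean_x, mean_y = sum(x) / n, sum(y) / n
sum_dxdy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sum_dx2 = sum((a - mean_x) ** 2 for a in x)
sum_dy2 = sum((b - mean_y) ** 2 for b in y)
r_dev = sum_dxdy / sqrt(sum_dx2 * sum_dy2)

# Raw-score form: r = (N*Sxy - Sx*Sy) / sqrt((N*Sx2 - Sx^2) * (N*Sy2 - Sy^2))
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
r_raw = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)

print(f"r (deviation form) = {r_dev:.2f}")
print(f"r (raw-score form) = {r_raw:.2f}")
```

Both means are 6, the sum of the deviation products is 6, and each sum of squared deviations is 14, so r = 6/14, or about 0.43. On the scale given earlier, this indicates some reliability but far from perfect reliability.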
Chapter 6
Interpretation of Test Scores
Using Test Results
• Before tests can be used for those purposes, the teacher
needs to know how well designed the test is in terms of
difficulty level and discrimination power. The teacher should then be
able to compare a child’s performance with that of his or her peers
in the class
• To do this, the following will be carried out:
o Item analysis.
o Finding measures of central tendency (Mean, Mode,
Median)
o Assigning grades
Item Analysis
• Item analysis helps to decide whether a test is good or poor in two
ways:
o It gives information about the difficulty level of a question.
o It indicates how well each question shows the difference
(discriminates) between the bright and dull students. In essence,
item analysis is used for reviewing and refining a test.
Difficulty Level
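• By difficulty level we mean the proportion of candidates who got a particular
item right in any given test. The proportion usually ranges from 0 to 1 or 0 to
100%.

$$p = \frac{n \times 100}{N} \quad \text{or} \quad p = \frac{U + L}{N}$$

p = item difficulty, n = number of students who got the item correct,
N = total number of students, U = number of students in the upper 1/3 who got the
item correct, L = number of students in the lower 1/3 who got the item correct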
Item Analysis
Example: In a class of 45 students, 30 got a question correct (20 from the
upper 1/3 and 10 from the lower 1/3). What is the difficulty index?
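A minimal sketch of this calculation, using the figures from the example and both forms of the difficulty formula (the function and variable names are illustrative):

```python
def difficulty_index(n_correct, n_total):
    """Proportion of students who answered the item correctly."""
    return n_correct / n_total


N = 45  # total number of students
U = 20  # correct answers from the upper group
L = 10  # correct answers from the lower group

p_from_total = difficulty_index(30, N)      # n / N
p_from_groups = difficulty_index(U + L, N)  # (U + L) / N

print(f"p = {p_from_total:.2f} ({p_from_total * 100:.1f}%)")
```

Both forms give 30/45, or about 0.67 (66.7%), meaning roughly two thirds of the class got the item right.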

Item Discrimination
• The discrimination index shows how a test item discriminates between the
bright and the dull students. A test with many poor questions will give a
false impression of the learning situation. Usually, a discrimination index of
0.4 and above is acceptable. Items which discriminate negatively are bad.

$$p = \frac{U - L}{\tfrac{1}{2} N}$$

Example: Consider a class of 60 students. If 36 of the upper 1/3 of
the students and 20 of the lower 1/3 of the students got a question item
correct, what will be the discrimination index?
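The example can be worked by applying the formula above directly to the stated figures. In the sketch below the function name is illustrative, and the 0.4 cut-off is the acceptability threshold mentioned in the text.

```python
def discrimination_index(upper_correct, lower_correct, n_total):
    """(U - L) divided by half the total number of students, as in the formula above."""
    return (upper_correct - lower_correct) / (0.5 * n_total)


d = discrimination_index(upper_correct=36, lower_correct=20, n_total=60)
acceptable = d >= 0.4  # threshold quoted in the text

print(f"Discrimination index = {d:.2f}, acceptable: {acceptable}")
```

With these figures the index is 16/30, or about 0.53, which is above the 0.4 threshold.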
Measures of Central Tendency
• Mode is the most frequent or popular score in the population
• Median is the middle score after all the scores have been arranged
in order of magnitude, i.e. 50% of the scores are on either side of it.
• Mean is the average of all the scores and it is obtained by adding the
scores together and dividing the sum by the number of scores.

$$\bar{x} = \frac{\text{Sum of all scores}}{\text{Number of scores}}$$

Example: A class of 7 students sat a math exam and obtained the
following grades:
6, 8, 7, 5, 6, 5, 6
a) Find the mode, median, and mean
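The three measures for this example can be checked with Python's standard `statistics` module; the sketch below uses only the seven grades listed above.

```python
import statistics

grades = [6, 8, 7, 5, 6, 5, 6]

mode = statistics.mode(grades)      # most frequent score
median = statistics.median(grades)  # middle score of the sorted list 5, 5, 6, 6, 6, 7, 8
mean = statistics.mean(grades)      # sum of scores (43) divided by number of scores (7)

print(f"mode = {mode}, median = {median}, mean = {mean:.2f}")
```

The mode and median are both 6, and the mean is 43/7, or about 6.14.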
Assigning Grades
Credit System and Grading System
• Credit Units - Courses are often weighted according to their credit
units in the course credit system. Credit units of courses often range
from 1 to 4. Each credit unit equals 15 hours of teaching.
• Grade Point - This is a point system which has replaced the A to F
Grading System.
• Grade Point Average (GPA) - This is obtained by multiplying the
Grade Point attained in each course by the number of Credit Units
assigned to that course, and then summing these up and dividing by
the total number of credit units taken.

$$GPA = \frac{\sum (\text{Grade Points} \times \text{Credit Hours})}{\text{Total Credit Hours}}$$
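A short sketch of the GPA calculation described above. The three courses, their grade points, and their credit units are hypothetical values chosen only to illustrate the arithmetic.

```python
# Hypothetical transcript: (course, grade point, credit units)
courses = [
    ("Course A", 4.0, 3),
    ("Course B", 3.0, 4),
    ("Course C", 2.0, 2),
]

# Multiply each grade point by its credit units, sum, then divide by total credit units.
weighted_points = sum(grade_point * units for _, grade_point, units in courses)
total_units = sum(units for _, _, units in courses)
gpa = weighted_points / total_units

print(f"GPA = {weighted_points} / {total_units} = {gpa:.2f}")
```

Here the weighted points are 12 + 12 + 4 = 28 over 9 credit units, giving a GPA of about 3.11. Courses with more credit units pull the average more strongly toward their grade point.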
