Data Processing and Tabulation
Editing, Coding and Tabulation of Data
Data are cleaned and treated for missing responses
Stages of Data Analysis
Raw Data
The unedited responses from a respondent exactly as indicated by that
respondent.
Preparing the Raw Data
i) Adherence to Sampling Instructions
ii) Legibility
iii) Understandability
iv) Completeness
v) Consistency
Non-respondent Error
Error that the respondent is not responsible for creating, such as when the
interviewer marks a response incorrectly.
Data Integrity
The notion that the data file actually contains the information that the
researcher is trying to obtain to adequately address research questions.
Why Editing?
Editing - I
Editing
The process of checking the completeness, consistency, and
legibility of data and making the data ready for coding and
transfer to storage.
Field Editing
Preliminary editing by a field supervisor on the same day as
the interview to catch technical omissions, check legibility
of handwriting, and clarify responses that are logically or
conceptually inconsistent.
In-House Editing
A rigorous editing job performed by a centralized office
staff.
Editing - II
Checking for Consistency
Respondents match defined population
Check for consistency within the data collection framework
Deleting Incorrect Answers
Editing for Completeness
What about missing data?
List-wise deletion
The entire record for a respondent that has left
a response missing is excluded from use in
statistical analysis.
Pair-wise deletion
Only the actual variables for a respondent that
do not contain information are eliminated from
use in statistical analysis.
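The difference between the two deletion rules is easiest to see on a small example. The sketch below is illustrative only: the records, the variable names (age, income) and the helper functions are hypothetical, and None stands for a missing response.

```python
# Hypothetical survey records; None marks a missing response.
records = [
    {"id": 1, "age": 25, "income": 40000},
    {"id": 2, "age": None, "income": 52000},
    {"id": 3, "age": 31, "income": None},
    {"id": 4, "age": 47, "income": 61000},
]


def listwise(rows, variables):
    """List-wise deletion: drop a respondent if ANY listed variable is missing."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


def pairwise_mean(rows, variable):
    """Pair-wise deletion: use every respondent who answered THIS variable."""
    values = [r[variable] for r in rows if r[variable] is not None]
    return sum(values) / len(values)


# List-wise: only respondents 1 and 4 survive, for every statistic.
complete = listwise(records, ["age", "income"])
listwise_mean_income = sum(r["income"] for r in complete) / len(complete)

# Pair-wise: respondent 2 still contributes to the income mean.
pairwise_mean_income = pairwise_mean(records, "income")

print(len(complete), listwise_mean_income, pairwise_mean_income)
```

List-wise deletion keeps the sample identical across all statistics at the cost of discarding more data; pair-wise deletion keeps more data, but each statistic may then rest on a different set of respondents.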
Editing - III
Pitfalls of Editing
Allowing subjectivity to enter into the editing process.
Data editors should be intelligent, experienced, and
objective.
A systematic procedure for assessing the questionnaire should
be developed so that the editor has clearly defined decision
rules.
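One way to keep subjectivity out of editing is to write the decision rules down as explicit checks. The following is a minimal sketch, not a prescribed procedure: the field names (age, drives, years_driving), the adult-only population and the driving rule are all assumptions made for illustration.

```python
# Hypothetical edit rules. Each returns a message when a record fails the check,
# or None when the record passes.

def check_completeness(r):
    missing = [k for k, v in r.items() if v is None]
    return "missing: " + ", ".join(missing) if missing else None


def check_population(r):
    # Assumed target population: adults aged 18 or older.
    if r["age"] is not None and r["age"] < 18:
        return "outside defined population (under 18)"
    return None


def check_consistency(r):
    # Assumed logical rule: a non-driver cannot report years of driving.
    if r["drives"] == "No" and (r["years_driving"] or 0) > 0:
        return "inconsistent: non-driver with driving years"
    return None


RULES = [check_completeness, check_population, check_consistency]


def edit_record(r):
    """Apply every rule and collect the problems found for one questionnaire."""
    return [msg for rule in RULES if (msg := rule(r)) is not None]


flagged = edit_record({"age": 16, "drives": "No", "years_driving": 3})
clean = edit_record({"age": 30, "drives": "Yes", "years_driving": 10})
print(flagged)
print(clean)
```

Because every editor applies the same list of rules, two editors looking at the same questionnaire should reach the same decision.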
Pretesting Edit
Editing during the pretest stage can prove very valuable for
improving questionnaire format and identifying poor instructions
or inappropriate question wording.
Coding
Coding means assigning a code, usually a number, to
each possible response to each question in the
questionnaire.
Rules for Coding
i) Exhaustive: a code number for each category
ii) Mutually Exclusive: each response fits only one category
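The two rules can be illustrated with a small coding scheme for a single closed question. This is a sketch under assumptions: the categories, the code numbers (8 and 9 for "Don't know" and "No answer") and the function names are hypothetical, not taken from any particular code book.

```python
# Hypothetical coding scheme for one closed question.
CODES = {"Yes": 1, "No": 2, "Don't know": 8, "No answer": 9}


def is_mutually_exclusive(codes):
    """No two categories may share the same code number."""
    return len(set(codes.values())) == len(codes)


def code_response(answer, codes=CODES):
    """Return the numeric code for one response.

    Blank answers go to the residual "No answer" category; an answer
    that fits no category raises an error, which signals that the
    scheme is not yet exhaustive and needs another category.
    """
    label = answer.strip() if answer else ""
    if not label:
        return codes["No answer"]
    if label not in codes:
        raise ValueError(f"no category for response: {label!r}")
    return codes[label]


coded = [code_response(a) for a in ["Yes", "No", "", "Don't know", "Yes"]]
print(is_mutually_exclusive(CODES), coded)
```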
Test Tabulation
Tallying of a small sample of the total number of replies to a particular question
in order to construct coding categories.
Devising the Coding Scheme
A coding scheme should not be too elaborate.
The coder's task is only to summarize the data.
Categories should be sufficiently unambiguous that coders will not classify items
in different ways.
Code book
Identifies each variable in a study and gives the variable's description, code
name, and position in the data matrix.
Tabulation
Tabulation: sorting data into categories and classes,
as laid out in previously established dummy tables,
and counting the number of responses associated with
each category
Results are summarized and presented in a more compact
way
Hand Tabulation (tally bars)
Computer Tabulation
Frequency Table
A table showing the different ways respondents answered a question.
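A frequency table of this kind can be produced by counting coded responses. The sketch below assumes ten hypothetical responses to one question and a hypothetical set of code labels.

```python
from collections import Counter

# Hypothetical coded responses to one question, with assumed code labels.
responses = [1, 2, 1, 1, 9, 2, 1, 8, 2, 1]
labels = {1: "Yes", 2: "No", 8: "Don't know", 9: "No answer"}


def frequency_table(values):
    """Return (code, count, percent) rows, one per observed code."""
    counts = Counter(values)
    total = len(values)
    return [(code, counts[code], 100 * counts[code] / total)
            for code in sorted(counts)]


table = frequency_table(responses)
for code, n, pct in table:
    print(f"{labels[code]:<12}{n:>4}{pct:>8.1f}%")
```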
Series: Individual, Discrete, Continuous
Analysis of Data
Descriptive Analysis of Data
The elementary transformation of raw data in a
way that describes the basic characteristics such as
central tendency, distribution, and variability.
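As a rough illustration, the usual summary measures can be computed with Python's standard statistics module. The list of ages is made-up data, and the choice of measures shown is only one reasonable selection.

```python
import statistics

# Hypothetical ages of seven respondents.
ages = [23, 25, 25, 31, 38, 44, 52]

summary = {
    # Central tendency
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    # Variability
    "range": max(ages) - min(ages),
    "stdev": statistics.stdev(ages),  # sample standard deviation (n - 1)
}

for name, value in summary.items():
    print(f"{name:<8}{value:>8.2f}")
```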
Histogram
A graphical way of showing a frequency distribution
in which the height of a bar corresponds to the
observed frequency of the category.
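The idea can be sketched without any plotting library by drawing bars as text, laid out horizontally so that bar length plays the role of height. The ages, the class width of 10 and the starting point of 20 are assumptions chosen for the example, and the helper assumes integer values no smaller than the starting point.

```python
from collections import Counter


def text_histogram(values, width, start):
    """Group integer values into equal-width classes; one '#' per observation."""
    bins = Counter((v - start) // width for v in values)
    lines = []
    for i in range(max(bins) + 1):
        low = start + i * width
        # Counter returns 0 for an empty class, so the bar is simply empty.
        lines.append(f"{low}-{low + width - 1}: " + "#" * bins[i])
    return lines


ages = [23, 25, 25, 31, 38, 44, 52]  # hypothetical data
bars = text_histogram(ages, width=10, start=20)
print("\n".join(bars))
```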
Levels of Scale Measurement and Suggested Descriptive Statistics
Computer Programs for Analysis
Statistical Packages
Spreadsheets
Excel
Statistical software:
SAS
SPSS (Statistical Package for the Social Sciences)
MINITAB
Common types of charts
Bar Charts
Column Chart
Pie Chart
Line Chart