18-May-20
Dr. Rohit Vishal Kumar
Associate Professor, IMI Bhubaneswar
[Flow diagram: stages of data preparation and analysis]
Raw Data -> Editing of Data -> Data Cleaning / Coding of Data -> Data File for Analysis -> Data Analysis Software / Manual Analysis
Analysis Plan: Descriptive Analysis, Univariate Analysis, Bivariate Analysis, Multivariate Analysis
Raw Data:
The unedited response from a respondent exactly as indicated by that respondent
Non-Respondent Error:
Errors that the respondent is not responsible for, such as those made by the interviewer
Data Integrity:
Refers to the notion that the data file actually contains the information which the researcher
promised to the decision maker.
Alternatively, it can be viewed as the requirement that the data are a true and accurate
representation of the respondent's views
Editing:
The process of checking the completeness, consistency and legibility of data and making
the data ready for coding and transfer to storage
Can be broadly classified into:
Field Editing
In-House Editing
Checking done for
Transference errors
Units of measurement errors
Item non-response (unanswered questions or partially answered questions)
How long have you lived at the current address? 48
What is your age? 32 years
Does your organisation have more than one computer network?
Yes No
If “Yes” How Many? 3
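The two example items above can be checked for consistency mechanically. The following is a minimal sketch, assuming the "48" is meant in years (the item gives no unit, which is itself a units of measurement problem) and that neither "Yes" nor "No" was ticked on the network question. All field names are hypothetical.

```python
def find_inconsistencies(record):
    """Return a list of problems found in one respondent's record."""
    problems = []

    age = record.get("age_years")
    years_at_address = record.get("years_at_address")
    # Nobody can have lived at an address longer than they have been alive.
    if age is not None and years_at_address is not None and years_at_address > age:
        problems.append("years_at_address exceeds age_years")

    has_networks = record.get("multiple_networks")  # "yes", "no", or None if unticked
    network_count = record.get("network_count")
    # A follow-up count only makes sense after a "yes" on the filter question.
    if network_count is not None and has_networks != "yes":
        problems.append("network_count given without a 'yes' to multiple_networks")

    return problems


# Hypothetical record mirroring the examples on the slide.
record = {
    "age_years": 32,
    "years_at_address": 48,
    "multiple_networks": None,
    "network_count": 3,
}
issues = find_inconsistencies(record)
for issue in issues:
    print(issue)
```

A check like this only flags a record; deciding how to resolve it (call back, correct, or mark as missing) remains an editing decision.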
Ethical Issue:
Should we “plug in” or “impute” the data that is missing? Will it lead to better answers, or will it
bias the answers and findings?
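One way to see why imputation raises this question is to try the simplest possible rule. The sketch below assumes mean imputation (one of many possible methods, chosen here only for illustration) on made-up responses: the mean stays the same, but the spread of the data shrinks, which can distort later analysis.

```python
from statistics import mean, variance


def impute_with_mean(values):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Made-up responses; None marks an unanswered item.
responses = [2, 4, None, 6, 8]
observed = [v for v in responses if v is not None]
imputed = impute_with_mean(responses)

print("mean:", mean(observed), "->", mean(imputed))
print("variance:", variance(observed), "->", variance(imputed))
```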
Coding is the process of assigning a numerical score or other character symbol to previously edited
data
Codes refer to the numerical symbols assigned to pieces of information
We shall learn about Coding by using a Small Questionnaire
Codes are Transferred into a Data-File
The objective is to understand how to convert a hard copy of the questionnaire into an electronic file
suitable for analysis
We shall talk about:
How to translate the Questionnaire into Codes
How to get these codes into a common base file [Delimited File]
How to get it into SPSS or some other program for analysis
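As an illustration of the "common base file" step, here is a minimal sketch that turns coded responses into comma-delimited text using Python's standard `csv` module. The variable names (`resp_id`, `q1`, `q2`) and the codes are made up for the example.

```python
import csv
import io

# Hypothetical coded responses: one dictionary per questionnaire.
coded_responses = [
    {"resp_id": 1, "q1": 1, "q2": 3},
    {"resp_id": 2, "q1": 2, "q2": 1},
]


def to_delimited(rows, fieldnames, delimiter=","):
    """Return the rows as delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


text = to_delimited(coded_responses, ["resp_id", "q1", "q2"])
print(text)
```

To save the result, the same writer can be pointed at a file opened with `open(path, "w", newline="")`. Most analysis programs can import a delimited text file of this kind.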
Fully Practical Session
“We will prepare a sample data-file fit for analysis”
Please Fill in the Questionnaire given to you
Dichotomous
Only One Response Possible
The least difficult to deal with
Multiple Choice
One or more than one response possible
Somewhat difficult to deal with
Open Ended:
Responses are recorded verbatim
Most difficult to deal with
Dichotomous:
Only one column needed in MS-Excel
Multiple Choice
Can be looked upon in 2 different ways
A series of Dichotomous Questions
A set of Responses
SPSS uses MR-Group to handle these kinds of responses
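The two ways of looking at a multiple choice question can be sketched as follows. This is an illustration only, not how SPSS stores MR-Groups internally; the question (media used), the option list and the column prefix `m_` are all hypothetical.

```python
# Hypothetical options of a multiple choice question; code = position + 1.
OPTIONS = ["paper", "tv", "radio", "web"]


def to_dichotomies(selected, options=OPTIONS):
    """View 1: one 0/1 column per option (a series of dichotomous questions)."""
    chosen = set(selected)
    unknown = chosen - set(options)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return {f"m_{opt}": 1 if opt in chosen else 0 for opt in options}


def to_response_set(selected, options=OPTIONS, max_answers=3):
    """View 2: a fixed number of columns, each holding the code of one answer."""
    codes = [options.index(opt) + 1 for opt in selected]
    if len(codes) > max_answers:
        raise ValueError("more answers than columns available")
    # Unused columns stay empty (None).
    return codes + [None] * (max_answers - len(codes))


answer = ["tv", "web"]
print(to_dichotomies(answer))
print(to_response_set(answer))
```

The first view needs as many columns as there are options; the second needs only as many columns as the maximum number of answers a respondent may give.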
Go through the open-ended questions in about 10-20 questionnaires
Identify key outputs
Start numbering the key outputs [known as the Post-Code Number]
If an idea is similar to an identified key output, put it under the same Post-Code Number
Else create a new Post-Code Number
Keep track of the Post-Code Numbers
Decide whether some of the post codes can be merged
Now write the Post-Code Numbers on the questionnaire and enter the data
Essentially, reduce the responses to “a set of responses” and then use MR-Group for analysis
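The bookkeeping part of post-coding can be sketched in a few lines. The sketch assumes the coder has already reduced each verbatim response to a key output; judging whether two ideas are "similar" remains a human decision, and here only differences in case and spacing are treated as the same idea. The example ideas and the merge decision are made up.

```python
def build_codebook(key_outputs):
    """Give each distinct key output the next free Post-Code Number."""
    codebook = {}
    codes = []
    for idea in key_outputs:
        key = idea.strip().lower()
        if key not in codebook:
            # New idea: create a new Post-Code Number.
            codebook[key] = len(codebook) + 1
        codes.append(codebook[key])
    return codebook, codes


def merge_codes(codes, merge_map):
    """Apply the coder's decision to merge some post codes into others."""
    return [merge_map.get(code, code) for code in codes]


# Hypothetical key outputs identified from four questionnaires.
ideas = ["Good service", "Low price", "good service ", "Convenient location"]
codebook, codes = build_codebook(ideas)
print(codebook)
print(codes)

# Suppose the coder decides post code 3 belongs with post code 1.
print(merge_codes(codes, {3: 1}))
```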
SPSS considers a full row as a “Single Case”
SPSS considers a full column as a “Single Variable”
A Variable can be of the Following Type:
Nominal
Ordinal
Scaled
Data are considered to be of type “scale” if they are measured on an interval or ratio scale
A Variable Name can be up to 8 characters long
It can also have a long descriptive name, called a label
[Figure: SPSS data window, with labels: Case, Variable, Current Cursor Position]
Click on the “Variable View” Tab
Set the desired properties in the tab
The Key Properties are:
Name: Represents the variable name, up to 8 characters
Type: Represents the type of data: 8 Different Types
Width: The total number of places
Decimal: The number of decimal places to be shown
Label: The Long Descriptive Name [OPTIONAL]
Value: Value Names of the Variable [NOMINAL & ORDINAL]
Missing: Value used to represent Missing Value
Columns: The number of columns to use
Align: Alignment [LEFT, CENTER, RIGHT]
Measure: The Scaling Type [NOMINAL, ORDINAL, SCALED]
Nominal or Ordinal Data can have different levels
For Example:
Value Names    Variable Values
Excellent      1
Very Good      2
Good           3
Average        4
Bad            5
Use Data -> Define Variable Properties, or refer to the questionnaire
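Outside SPSS, the same pairing of value names and variable values can be kept as a simple lookup table. This is a minimal sketch using the five levels from the example; the function names are hypothetical.

```python
# Variable values mapped to their value names, as in the example.
VALUE_LABELS = {1: "Excellent", 2: "Very Good", 3: "Good", 4: "Average", 5: "Bad"}

# Reverse lookup, used when entering data from the questionnaire.
LABEL_TO_CODE = {label: code for code, label in VALUE_LABELS.items()}


def encode(label):
    """Turn a ticked answer into its numeric code."""
    return LABEL_TO_CODE[label]


def decode(code):
    """Turn a stored code back into a readable value name."""
    return VALUE_LABELS[code]


answers = ["Good", "Excellent", "Bad"]
coded = [encode(a) for a in answers]
print(coded)
print([decode(c) for c in coded])
```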
NEVER WORK ON THE ORIGINAL DATA FILE
Keep the original data-file safe
Email it to yourself
Keep it on a CD/ Flash Drive and put it in the locker
Use Dropbox or Google Drive
Never truncate your data-file
Statistical software cannot read truncated data files
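One simple habit that follows from the rule above is to make a working copy before opening the data for analysis. The sketch below uses Python's standard `shutil` and `pathlib` modules; the file name `survey.csv` and the `_working` suffix are made up, and the demonstration runs in a temporary folder so nothing real is touched.

```python
import shutil
import tempfile
from pathlib import Path


def make_working_copy(original, suffix="_working"):
    """Copy the original data file and return the path of the copy."""
    original = Path(original)
    working = original.with_name(f"{original.stem}{suffix}{original.suffix}")
    if working.exists():
        # Refuse to overwrite an existing working copy silently.
        raise FileExistsError(f"{working} already exists")
    shutil.copy2(original, working)
    return working


# Demonstration with a made-up data file in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "survey.csv"
    source.write_text("resp_id,q1\n1,2\n")
    copy_path = make_working_copy(source)
    copy_name = copy_path.name
    same_content = copy_path.read_text() == source.read_text()

print(copy_name, same_content)
```

All cleaning and analysis then happens on the copy, while the original stays untouched in its safe location.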
Deal with missing values
Missing Completely at Random (MCAR): Can be completely ignored
Missing at Random (MAR): May be ignored
Not Missing at Random (NMAR): Cannot be ignored
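Before deciding how to deal with missing values, it helps to know how much is missing for each variable. The sketch below only counts; it cannot tell whether the data are MCAR, MAR or NMAR, which needs judgment about why the values are missing. The missing value code 99 and the variable names are assumptions made for the example.

```python
MISSING_CODE = 99  # hypothetical code used to represent a missing value


def missing_share(rows, variables, missing_code=MISSING_CODE):
    """Return the share of cases with a missing value, per variable."""
    total = len(rows)
    if total == 0:
        raise ValueError("no cases to check")
    shares = {}
    for var in variables:
        n_missing = sum(1 for row in rows if row.get(var) in (None, missing_code))
        shares[var] = n_missing / total
    return shares


# Made-up data file: four cases, two variables.
rows = [
    {"q1": 1, "q2": 99},
    {"q1": 2, "q2": 3},
    {"q1": 99, "q2": 99},
    {"q1": 1, "q2": 4},
]
shares = missing_share(rows, ["q1", "q2"])
print(shares)
```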