
Biomedical Research Data Management

The document discusses key aspects of data management for biomedical research studies. It covers defining variables, creating a study database and data dictionary, entering data and correcting errors, backing up data, and creating a dataset for analysis. Additional topics include data structure, individual and aggregated databases, the basic structure of a database with records and variables, and elements like identifiers, variable names, coding, and constructing a data dictionary. Proper data management is essential for high-quality research.

Uploaded by

Anant Khot

8/1/2019

BASIC COURSE IN BIO-MEDICAL RESEARCH


nie.gov.in

Data management includes



Define variables

Create study database and data dictionary

Enter data and correct errors

Create dataset for analysis

Back up and archive the dataset


Key elements of data management



• Data structure
• Data entry
• Individual and aggregated databases
• Mother and daughter databases


Basic structure of a database


• Lines represent records
• Columns represent variables

            Identifier  Variable 1  Variable 2  Variable 3  Variable 4  Etc…
Record 1
Record 2
Record 3
Etc…


Data documentation



• Structure
  • Name, number of records, etc.
• Variables
  • Name, values, coding
• History
  • Creation, modification
• Storage information
  • Media, location, backup
• Additional information

Identifier in the database



• Unique
• Maintained by a computerized index
• Secured by quality assurance procedures
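
One way to check that an identifier is unique is sketched below. This is a minimal illustration in Python, assuming the records are held as a list of dictionaries; the field name `ID`, the function name and the sample values are hypothetical and do not come from the course material.

```python
from collections import Counter


def find_duplicate_ids(records, id_field="ID"):
    """Return the identifier values that occur in more than one record."""
    counts = Counter(record[id_field] for record in records)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical records; the identifier 1002 is deliberately repeated.
records = [
    {"ID": 1001, "AGE": 34},
    {"ID": 1002, "AGE": 51},
    {"ID": 1002, "AGE": 29},
]

duplicates = find_duplicate_ids(records)
print(duplicates)
```

A check like this could be run after each data entry session, so that a repeated identifier is caught while the paper forms are still at hand.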


Using codes within the unique identifier



• Unique identifier may contain all information about that particular ID
• Each digit or set of digits refers to specific information
• Example:
  • First and second digit: Village
  • Third and fourth digit: Street
  • Fifth digit: House
  • Sixth and seventh digit: Person

Digit:  1 2      3 4     5      6 7
Refers: Village  Street  House  Person
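
The digit layout in the example can be decoded as in the sketch below. It assumes the identifier is stored as a 7-character string so that leading zeros are preserved; the function name and the sample identifier `0312405` are hypothetical.

```python
def decode_identifier(identifier):
    """Split a 7-digit identifier into village, street, house and person codes."""
    if len(identifier) != 7 or not (identifier.isascii() and identifier.isdigit()):
        raise ValueError(f"expected 7 digits, got {identifier!r}")
    return {
        "village": identifier[0:2],  # first and second digit
        "street": identifier[2:4],   # third and fourth digit
        "house": identifier[4],      # fifth digit
        "person": identifier[5:7],   # sixth and seventh digit
    }


parts = decode_identifier("0312405")  # hypothetical identifier
print(parts)
```

Keeping the codes as strings rather than integers is a design choice made here so that a village coded `03` is not silently turned into `3`.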

Structure of the variables in the database


• Integer
  • Specify the number of digits
• Numeric
  • Specify the number of decimals
• Alpha-numeric
  • Specify length
  • Turn all letters to capitals
• Dates (specific format)
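
The four variable types above could be enforced at entry time roughly as follows. This is a sketch only: the function names, the limits passed in the examples and the day/month/year date format are assumptions, not something the course material prescribes.

```python
from datetime import datetime


def clean_integer(raw, digits):
    """Parse an integer and check that it has at most `digits` digits."""
    value = int(raw)
    if len(str(abs(value))) > digits:
        raise ValueError(f"{raw!r} has more than {digits} digits")
    return value


def clean_numeric(raw, decimals):
    """Parse a number and round it to the specified number of decimals."""
    return round(float(raw), decimals)


def clean_text(raw, length):
    """Trim an alpha-numeric value, turn it to capitals and check its length."""
    value = raw.strip().upper()
    if len(value) > length:
        raise ValueError(f"{raw!r} is longer than {length} characters")
    return value


def clean_date(raw, fmt="%d/%m/%Y"):
    """Parse a date written in one fixed format (day/month/year assumed here)."""
    return datetime.strptime(raw, fmt).date()


print(clean_integer("42", digits=3))
print(clean_numeric("61.237", decimals=1))
print(clean_text(" chennai ", length=10))
print(clean_date("01/08/2019"))
```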


Creating variable names



• Clear
  • Need to refer to the questionnaire item
  • Understandable (e.g., “EXERDAILY” for “Exercise daily”)
• Short, no spaces
  • Most software requires fewer than 10 characters
• Consistent
  • “EXERPAST” for “Exercise daily in the past”
  • “EXERCURRDLY” for “Exercise daily in the current”
  • “EXERPASTOCC” for “Exercise occasionally in the past”
  • “EXERCURROCC” for “Exercise occasionally in the current”
  • “VARIAB” for all crude variables (EXERCISE)
  • “VARIAB_12” for all dichotomized variables (EXERCISE_12)
• No duplicates
  • Trimming of names by software can create duplicate names
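
The risk of duplicates created by trimming can be checked before the database is built. The sketch below uses the variable names from the list above; the 8-character limit and the function name are hypothetical, chosen only to show the effect.

```python
from collections import defaultdict


def names_colliding_after_trim(names, max_length):
    """Group variable names that become identical once cut to max_length."""
    groups = defaultdict(list)
    for name in names:
        groups[name[:max_length]].append(name)
    return {short: full for short, full in groups.items() if len(full) > 1}


names = ["EXERPAST", "EXERCURRDLY", "EXERPASTOCC", "EXERCURROCC"]

# Hypothetical software limit of 8 characters per variable name.
collisions = names_colliding_after_trim(names, max_length=8)
print(collisions)
```

With this assumed limit, the past and current variables each collapse into a single trimmed name, which is the situation the last bullet warns about.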

Design data entry-friendly data collection instrument

• Outline
  • Identifiers
  • Demographics
  • Outcome (Health problem/disease)
  • Exposures (variables, including third factors)
• Auto-coding function


Coding



• Prefer numerical coding
• Decide on
  • Missing values (.) or (9, 99, 999)
  • Not applicable (8, 98, 998)
• Avoid cumbersome codes
  • WALKING (1) and CYCLING (2)
  • Doing WALKING and CYCLING (12)
• Use “1” or “0” (or “1” or “2”) as the baseline for gradients (Yes/No or Present/Absent), as appropriate for the analysis software
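
A sketch of how these coding decisions might look in practice is given below. The codes 9 and 8 follow the list above; coding each activity as its own 1/0 variable is one possible alternative to the combined code 12 and is an assumption of this example, as are the function names and sample answers.

```python
MISSING = 9          # code chosen for "missing" in a single-digit variable
NOT_APPLICABLE = 8   # code chosen for "not applicable"


def to_analysis_value(code):
    """Return None for missing or not-applicable codes, else the code itself."""
    return None if code in (MISSING, NOT_APPLICABLE) else code


def code_activities(activities):
    """Code each activity as its own 1/0 variable instead of a combined code."""
    return {
        "WALKING": 1 if "walking" in activities else 0,
        "CYCLING": 1 if "cycling" in activities else 0,
    }


answers = [1, 2, 9, 1, 8]  # hypothetical coded answers to one question
cleaned = [to_analysis_value(code) for code in answers]
print(cleaned)
print(code_activities({"walking", "cycling"}))
```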

Constructing a data dictionary


• Contains, for each variable:
  • Variable name
  • Description of questionnaire item
  • Various values of variable (e.g., 1, 2, 3)
  • Meaning of each value (e.g., 1 = Yes, 2 = No)
• The catalogue is particularly useful:
  • When a database is shared with others
  • If the researcher has to get back to the database later

Question  Variable name  Type     Format  Values                    Logical checks
1         EXERDAILY      Integer          Yes = 1, No = 2           Skip pattern
2         EXERTYPE       Integer          Walking = 1, Cycling = 2
ETC…

Some software creates a variable catalogue automatically; ideally, the investigator constructs the same.
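
A data dictionary can also be kept in machine-readable form alongside the database. The sketch below stores the two example variables, EXERDAILY and EXERTYPE, in a Python dictionary; the key names (`question`, `type`, `values`) and the `label` helper are hypothetical choices for this illustration.

```python
DATA_DICTIONARY = {
    "EXERDAILY": {
        "question": 1,
        "type": "integer",
        "values": {1: "Yes", 2: "No"},
    },
    "EXERTYPE": {
        "question": 2,
        "type": "integer",
        "values": {1: "Walking", 2: "Cycling"},
    },
}


def label(variable, code):
    """Look up the meaning of a code for a variable in the data dictionary."""
    return DATA_DICTIONARY[variable]["values"][code]


print(label("EXERDAILY", 2))
print(label("EXERTYPE", 1))
```

Storing the meanings of codes in one place means that anyone who receives the database later can translate the numbers back into answers without the original questionnaire.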


Check specifications before data entry



• Minimum and maximum values
• Legal codes
  • Set of values that will be accepted, e.g., 1, 0 and 9 for “Yes”, “No” and “Missing”
• Skip patterns
• Automatic coding
  • Copying data from preceding record
  • Calculations
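
The first three kinds of check can be sketched as a single function applied to each record. The variable names AGE, EXERDAILY and EXERTYPE, the 0 to 120 age range, and the rule that EXERTYPE is skipped unless EXERDAILY is "Yes" are all assumptions made for this illustration.

```python
def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Minimum and maximum values (the 0-120 range is an assumption).
    if not 0 <= record["AGE"] <= 120:
        problems.append("AGE out of range")
    # Legal codes: 1 = Yes, 0 = No, 9 = Missing.
    if record["EXERDAILY"] not in (1, 0, 9):
        problems.append("EXERDAILY has an illegal code")
    # Skip pattern (assumed): EXERTYPE is asked only when EXERDAILY is Yes.
    if record["EXERDAILY"] != 1 and record["EXERTYPE"] is not None:
        problems.append("EXERTYPE should be skipped")
    return problems


# Hypothetical records: the first passes, the second fails two checks.
good = {"AGE": 34, "EXERDAILY": 1, "EXERTYPE": 2}
bad = {"AGE": 150, "EXERDAILY": 0, "EXERTYPE": 1}

print(check_record(good))
print(check_record(bad))
```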

Data entry

• Use as opportunity for partial data cleaning
  • Write comments
  • Seek clarification
  • Use checks
• Mark each paper as data entry is completed
• Validate after data entry


Individual and aggregated databases



• Individual databases
  • Each record is an observation
• Aggregated database
  • Records contain counts
• Normalized database
  • Only one count by record
  • Facilitates further aggregation

Aggregating individual data

Individual data

ID  Place  Age  Sex  Onset
1   A      3    1    1 Jan 06
2   B      1    2    1 Jan 06
3   C      35   2    3 Jan 06
4   D      67   1    4 Jan 06
5   A      2    1    2 Jan 06
6   B      2    1    4 Jan 06
5   C      2    1    5 Jan 06
…   …      …    …    …

Aggregated file

ID  Place  Count
1   A      5
2   B      3
3   C      37
4   D      67
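
Aggregating individual records into counts by place can be sketched as follows. The records reproduce only the place, age and sex values of the rows visible in the individual data; because the remaining rows are elided, the resulting counts are illustrative and are not expected to match the counts shown in the aggregated file. The variable names are hypothetical.

```python
from collections import Counter

# Visible rows of the individual data (identifier and onset date omitted).
individual = [
    {"PLACE": "A", "AGE": 3, "SEX": 1},
    {"PLACE": "B", "AGE": 1, "SEX": 2},
    {"PLACE": "C", "AGE": 35, "SEX": 2},
    {"PLACE": "D", "AGE": 67, "SEX": 1},
    {"PLACE": "A", "AGE": 2, "SEX": 1},
    {"PLACE": "B", "AGE": 2, "SEX": 1},
    {"PLACE": "C", "AGE": 2, "SEX": 1},
]

# One record per place, each holding a single count (normalized form).
counts = Counter(record["PLACE"] for record in individual)
aggregated = [{"PLACE": place, "COUNT": n} for place, n in sorted(counts.items())]

for row in aggregated:
    print(row["PLACE"], row["COUNT"])
```

Because each aggregated record carries only one count, the file can be aggregated further (for example, by grouping places into districts) by summing the counts.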


Mother and daughter databases



• Information is available at various levels
  • Village
  • Household
  • Individual
  • Illness episode
• Store information at each level in separate databases
• Link databases together with identifiers

Mother and daughter databases


Household level data

HousID  Location  Community  HousIncom
1       A         3          1
2       B         1          2
3       C         35         2
4       D         67         1
5       E         2          1
6       F         2          1
5       G         2          1
…       …         …          …

Individual level data

HousID  PersonID  Diseased  Exposed
1       101       1         1
1       102       2         1
2       201       2         2
2       202       1         2

• Each database has its own unique identifier
• Link these relational databases using a common index identifier
• Merge files when needed
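
Linking the two levels through the common identifier can be sketched as below. The individual records are the four rows shown in the individual level data, and the household records are the first two rows of the household level data; holding the household database as a dictionary keyed by HousID, and the function name `merge`, are choices made for this illustration.

```python
# Household (mother) database, keyed by its unique identifier HousID.
households = {
    1: {"Location": "A", "Community": 3, "HousIncom": 1},
    2: {"Location": "B", "Community": 1, "HousIncom": 2},
}

# Individual (daughter) database; HousID links each person to a household.
individuals = [
    {"HousID": 1, "PersonID": 101, "Diseased": 1, "Exposed": 1},
    {"HousID": 1, "PersonID": 102, "Diseased": 2, "Exposed": 1},
    {"HousID": 2, "PersonID": 201, "Diseased": 2, "Exposed": 2},
    {"HousID": 2, "PersonID": 202, "Diseased": 1, "Exposed": 2},
]


def merge(individuals, households):
    """Attach the household-level variables to each individual record."""
    merged = []
    for person in individuals:
        household = households[person["HousID"]]  # KeyError if household is absent
        merged.append({**person, **household})
    return merged


merged = merge(individuals, households)
print(merged[0])
```

Household information is stored once per household and copied onto individual records only at merge time, which avoids re-entering the same household values for every member.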


Summing up on data management



• Code database numerically
• Enter data using quality assurance procedures
• Store information at the level where it needs to be stored
• Relate/Merge files when needed and as required
