2022/08/19
Data & File Management
Section 6
Data and File Management
Outline
Data vs information
Ways of presenting information
Data Concepts
Analogue vs digital data
Need for data converters
Data representation in computers
Data types
Data entry techniques
Methods of data collection
Methods of data capture
Codes
Data entry methods
Data entry checks
Verification
Validation
File organisation
File components
File access methods
Types of files
Data and File Management
Data vs information
Data:
Raw facts and figures, e.g. 140593
Has no meaning to the user
Input to the computer
Information:
Processed data, e.g. 140593 is the date of birth
Has a meaning to the user
Output from the computer
Data Concepts
Ways of presenting information
There are various ways of presenting information as computer output. Information can be presented as:
A screen display: a soft copy (digital information) displayed on the screen.
Hard copy: information printed on a physical medium such as paper.
Multimedia presentations: text and graphics combined with motion and sound, including video, audio, animation and photographs.
Sound: e.g. music.
Virtual reality: gives the user a lifelike experience through multimedia effects; it is used to simulate real-life situations.
Analogue vs Digital data
Analogue data:
Consists of continuous values. A good example is an analogue clock, which shows the time with a smoothly moving seconds hand. The change is continuous.
Not directly processed by the computer, because it must first be converted into digital form by an ADC.
Digital data:
Consists of discrete or fixed values. A good example is a digital clock, which jumps from one second to the next in clear steps. The change is not smooth or continuous.
Directly processed by the computer, because it is already in the form the computer understands.
Need for Data converters
These are devices used to convert data from one form to another:
Analogue to Digital Converter (ADC): Used to convert analogue data into digital form so that the computer can understand and process the data.
E.g. when processing data sent from a sensor.
Digital to Analogue Converter (DAC): Used to convert digital data from the computer into analogue form so that the computer can effectively control analogue devices.
E.g. if the computer is being used to control a device (motor/valve), the device will be controlled by variable voltages; the DAC is used to send out the analogue signal.
Modem (MODulator DEModulator):
Converts the computer's digital signals into analogue form (modulates them) for transmission through telephone lines.
Reverses this process: converts the analogue signal from a telephone line into digital form (demodulates it) for the computer to process.
Its main use is to connect to computer networks over long distances using existing telephone lines.
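The conversion done by an ADC and a DAC can be sketched in a few lines of Python. This is only an illustration: the function names adc_convert and dac_convert, the 0 to 5 volt range and the 8-bit resolution are assumptions chosen for the example, not details taken from these notes.

```python
def adc_convert(voltage, v_min=0.0, v_max=5.0, bits=8):
    """Map a continuous (analogue) voltage onto one of 2**bits discrete levels."""
    levels = 2 ** bits  # 256 possible values for an 8-bit converter
    # Clamp the reading so it stays inside the range the converter supports.
    voltage = max(v_min, min(v_max, voltage))
    # Scale to 0..levels-1 and round to the nearest whole step.
    return round((voltage - v_min) / (v_max - v_min) * (levels - 1))


def dac_convert(value, v_min=0.0, v_max=5.0, bits=8):
    """Reverse step: turn a digital value back into an output voltage."""
    levels = 2 ** bits
    return v_min + value / (levels - 1) * (v_max - v_min)


reading = adc_convert(2.5)              # analogue sensor reading -> digital value
print(reading, format(reading, "08b"))  # the value and its 8-bit pattern
print(dac_convert(255))                 # largest digital value -> top of the voltage range
```

Any voltage between two steps is rounded to the nearest step, which is why digital data has discrete values while the original analogue signal is continuous.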
Data types
Many different data types can be stored on a computer system. The data types which are commonly used are as follows:
Numeric
Numeric (number) data can be in two forms: integer (whole numbers, e.g. 1288, -156) and real (numbers with a fractional part, e.g. 12.45).
Text/Alphanumeric
This allows you to type in text, numbers and symbols. E.g. John Smith, John123
Date/time
Usually formatted in a specific way, e.g. dd/mm/yy, dd/mmm/yyyy, long time, etc. The format depends upon the setup of the computer, the software in use and the user's preferences. E.g. 25/10/2007, 15:56
Percentage
Percentage numbers are real numbers (decimals) that have been formatted to show values out of 100. Percentages are usually shown with the percentage symbol (%). For example: 10%, -178%
Currency
Currency refers to real numbers that are formatted in a specific way. Currency is usually shown with a currency symbol and two decimal places, e.g. P5.23
Boolean
Boolean data is sometimes called 'logical' data (or in some software, 'yes/no' data). Boolean data can only have two values: TRUE or FALSE
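As a rough illustration, the same data types can be written in Python as shown below. The variable names are made up for the example, and other software (spreadsheets, databases) will name and format these types differently.

```python
from datetime import datetime

quantity = 1288                         # Numeric: integer
price = 12.45                           # Numeric: real (called float in Python)
name = "John123"                        # Text/alphanumeric
when = datetime(2007, 10, 25, 15, 56)   # Date/time
rate = 0.10                             # Percentage: a real number shown out of 100
amount = 5.23                           # Currency: a real number shown with a symbol
in_stock = True                         # Boolean: only True or False

print(when.strftime("%d/%m/%Y %H:%M"))  # dd/mm/yyyy and 24-hour time
print(f"{rate:.0%}")                    # formatted as a percentage
print(f"P{amount:.2f}")                 # currency symbol and two decimal places
```

Percentage and currency are stored as ordinary real numbers; only the way they are displayed changes.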
Data representation in computers
When people count, they use the digits 0 to 9, which are the digits of the decimal system.
A computer understands only two states, so it uses a number system that has just 2 unique digits, 0 and 1. This number system is referred to as the binary system.
Each 0 or 1 is called a bit (binary digit) and represents the smallest unit of data the computer can handle.
By itself a bit is not very informative; when 8 bits are grouped together as a unit, they are called a byte. 1 byte is used to represent 1 character.
1024 bytes = 1 KB, 1024 KB = 1 MB, 1024 MB = 1 GB, and so on.
Combinations of 0s and 1s are used to represent the characters on the keyboard. The patterns are defined by a coding scheme such as ASCII.
In ASCII, for example, the character 1 is represented as 00110001, the character 2 as 00110010, and so on.
Everything stored and processed by a computer is represented in binary form: images, audio, video, text, etc.
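The bit patterns quoted above can be checked with a short Python sketch. The helper name to_bits is hypothetical, and the sketch assumes the ASCII coding scheme, which is the scheme that gives 00110001 for the character 1.

```python
def to_bits(character):
    """Return the 8-bit ASCII pattern for a single character."""
    return format(ord(character), "08b")


for ch in "12A":
    print(ch, to_bits(ch))   # each character with its 8-bit pattern

# Storage units built up from the byte, each 1024 times the previous one.
KB = 1024        # bytes in a kilobyte
MB = 1024 * KB   # bytes in a megabyte
GB = 1024 * MB   # bytes in a gigabyte
print(GB)
```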
Data entry techniques
Methods of data collection
Observation
Involves looking at how things are done while making notes on the information obtained.
Interviews
Preparing questions and putting them to respondents, who answer immediately, either orally or in writing.
Questionnaires
Preparing a set of questions and giving them to respondents to answer in their own time.
Document study
Allows the analyst to see how the paper files are kept, look at operating instructions and training manuals, check the accounts, etc.
Methods of data capture
Key-to-disk
Optical Character Recognition
Magnetic Ink Character Recognition
Optical Mark Recognition
Barcode readers/scanners
Magnetic Stripe reading
Voice Recognition System
Data logging
List an advantage and a disadvantage for each of the methods above.
Key to disk
The information is keyed in via a keyboard and then stored.
Advantages:
There shouldn't be much need for training, as it is the most common data input method
No need for specialised data collection documents
A keyboard is cheaper to purchase
Disadvantages:
Slow to enter data
Transcription (data entry) errors can occur
Handwriting recognition can be unreliable
Barcode Reading
USE: Used to scan barcodes, which contain unique information about a product.
Advantages:
Much faster than using manual methods (key to disk)
Very accurate as there is no manual typing involved
Disadvantages:
Barcodes may be swapped during data preparation
Relies on undamaged barcodes in order to function (i.e. does not work on damaged codes)
Barcode reader may be expensive to purchase
Barcode only contains a numerical code
Optical Mark Recognition
USE: Used to scan in marks from multiple choice exams, surveys and lottery tickets.
Advantages:
Recognition is very accurate
It is a much faster method of recording data than doing it all manually
Less chance of errors
Disadvantages:
Damaged and dirty documents are difficult to read
The forms need special designing to make sure that the marks can easily be read by the machine
If the forms are not correctly filled in, they cannot be read properly
Optical Character Recognition
USE: Scans text from hard copies and converts it into an editable form which can be used and edited in a range of software, including word processors.
Advantages:
Much faster than entering all the data manually
No special data-preparation equipment required; it just uses text on ordinary paper
Data is easily read by humans as well as the computer
Disadvantages:
Damaged and dirty documents are difficult to read
The system cannot easily read handwriting (text and/or numbers)
It is not very accurate
Converted documents will need to be checked
Magnetic stripe reading
USE: Used to read data held on the magnetic stripes found on the back of cards.
Advantages:
Fast data entry compared to manual entry
Less prone to error, as no typing is involved
Not affected by water and robust if dropped
Disadvantages:
Limited storage capacity on a magnetic stripe
If the stripe becomes damaged in any way, all of the data is lost
The card needs to be close to the reader for it to work properly
Not secure: stripes are easily duplicated
Voice Recognition
USE: Converts sounds made by a user, via a microphone, into commands that the computer can carry out.
Advantages:
No special data-preparation equipment required; you just say the data
Data is easily understood by humans as well as the computer
Little training is required
Disadvantages:
Recognition is not 100% accurate
Dictation systems need to be trained
Not everything is easy to describe in words, e.g. mathematical formulae
Magnetic Ink Character Recognition
USE: Used to process bank cheques. The characters at the bottom of a cheque, which are printed in a special ink, are read by the Magnetic Ink Character Reader.
Advantages:
No need to manually enter text, so less chance of human error
Characters cannot be altered
Characters can be read even if they have been written over
Disadvantages:
Specialist high-quality printing equipment is required, which costs more
Only certain characters can be written that the device will be able to interpret
It is more expensive than most direct data entry methods
Only a limited number of characters can be read
Codes
Introduction
Coding usually means shortening the original data in an agreed manner. The agreement is between the users of the system. The coding scheme could be covered in the training on how to use the system, and it could also be documented within the system for new users.
E.g. suppose that a field could contain one of three possible values: Small, Medium or Large. Instead of typing in the full word each time, we could type S, M or L.
Reasons for using codes
Speeds up data entry: there is less to type.
Increases accuracy of data entry: fewer key presses are needed when entering a value in the field, so there is less chance of a typing error.
Allows for use of validation: packages allow automatic validation checks to be set up to make sure that only the allowed codes have been input in a field.
Less storage space required: codes use fewer characters.
Faster searching for data: it is quicker to search for a short code than to type the whole word.
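The S, M, L example can be sketched in Python as a lookup table. The names SIZE_CODES and decode_size are invented for this illustration; a real package would normally set up the same rule through its own validation settings.

```python
# The agreed coding scheme: short code -> full value.
SIZE_CODES = {"S": "Small", "M": "Medium", "L": "Large"}


def decode_size(code):
    """Return the full value for a code, rejecting anything not in the scheme."""
    code = code.strip().upper()
    if code not in SIZE_CODES:
        raise ValueError(f"Unknown size code: {code!r}")
    return SIZE_CODES[code]


print(decode_size("m"))   # one key press stands for the word Medium
```

Because only three codes are allowed, checking an entry is a simple lookup, which is what makes validation of coded fields easy.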
Data entry checks
Checking data
Data stored on a computer is only useful as long as it is correct and up to date.
It is important to check data when it is entered to make sure that it is both sensible and correct.
If data is not checked before it is processed, any errors could cause the final output to be nonsense.
There are two methods that can be used to check data when it is input. These are called verification and validation.
Verification
Verification means checking that the data you have entered into the system is identical to the data on the original source document.
Verification does not check whether data makes sense or is within acceptable boundaries; it only checks that the data entered is identical to the original source (source document).
Verification methods:
Double entry: Data is entered twice and the computer checks that the two entries match.
Visual check/proofreading: The user manually reads and compares the newly entered data against the original source to ensure they match.
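Double entry can be sketched as follows. The function name enter_with_double_entry and the limit of three attempts are assumptions made for the example; read_value stands for whatever supplies one typed entry (in a real program it could be the built-in input function).

```python
def enter_with_double_entry(read_value, max_attempts=3):
    """Ask for the same value twice and accept it only if both entries match."""
    for _ in range(max_attempts):
        first = read_value()
        second = read_value()
        if first == second:
            return first
        print("Entries do not match, please re-enter.")
    raise ValueError("The two entries never matched")


# Simulated typing: the first pair differs by one character, the second pair matches.
typed = iter(["B101AAA", "B101AAB", "B999ABD", "B999ABD"])
value = enter_with_double_entry(lambda: next(typed))
print(value)
```

The check only confirms that the two entries agree with each other; it says nothing about whether the value is sensible, which is the job of validation.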
Validation
Validation is automated checking by a program that data is reasonable before it is accepted into a computer system.
Data that is not sensible or allowed is rejected by the computer.
Validation checks:
Range check: Checks that the data falls between an acceptable lower and upper value, i.e. within a set range.
Format check: Checks that the data is in a specific format, e.g. a date should be in the form dd/mm/yyyy.
Length check: Checks that the input data contains the required number of characters.
Character/type check: Checks that the data entered is of the expected type, e.g. text or a number.
Presence check: Checks that data is actually present and has not been missed out.
Check digit: An extra digit which is calculated from the digits of a number and then put on the end of the number. When the number is entered, the computer recalculates the digit and compares it with the one entered. Check digits can identify three types of error:
Two digits transposed (swapped) during input
An incorrect digit entered twice
A digit missed out altogether
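The checks listed above can each be written as a small Python function. This is a sketch under stated assumptions: all function names are invented for the example, the format check tests only the shape dd/mm/yyyy (not whether the date exists), and the check digit uses a modulus-11 scheme with weights 10 down to 2, which is one common scheme and not necessarily the one a given system uses. The sample number 0306406152 was chosen because it satisfies that scheme.

```python
import re


def presence_check(value):
    """True if something was actually entered."""
    return value is not None and str(value).strip() != ""


def range_check(value, low, high):
    """True if value lies between low and high inclusive."""
    return low <= value <= high


def length_check(value, required_length):
    """True if the text has exactly the required number of characters."""
    return len(value) == required_length


def type_check(value):
    """True if the text consists only of digits (a whole number)."""
    return value.isdecimal()


def format_check(value):
    """True if the text has the shape dd/mm/yyyy."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", value) is not None


def check_digit(digits):
    """Modulus-11 check digit for a 9-digit string, using weights 10 down to 2."""
    total = sum(int(d) * w for d, w in zip(digits, range(10, 1, -1)))
    result = (11 - total % 11) % 11
    return "X" if result == 10 else str(result)


def check_digit_valid(number):
    """True if the last character matches the check digit of the first nine."""
    return (len(number) == 10 and number[:9].isdecimal()
            and check_digit(number[:9]) == number[9])


print(range_check(15, 1, 31), format_check("25/10/2007"), length_check("B101AAA", 7))
print(check_digit_valid("0306406152"))   # the number as issued
print(check_digit_valid("0306406512"))   # two digits swapped during input
print(check_digit_valid("030640652"))    # one digit missed out
```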
Validation vs Verification
Verification:
Can be done manually or by the computer
Checks data against the source document
Applied after data entry, before processing
Validation:
Only done by software
Checks data for reasonableness and accuracy
Applied at the time of data entry
File organisation
File components
A database is an organised collection of information consisting of one or more files (or tables).
A database file is a collection of related records. For example, a file of information about all the pupils in a school.
A record in a database file is a collection of related fields. In records of the same type, the fields are in the same order.
A field contains one individual item of data. For example, each pupil's surname.
A key field or primary key uniquely identifies a record.
In the example below, each row is a record, each column is a field, and CarReg is the key field:
CarReg   Make    Model
B101AAA  Toyota  Corolla
B999ABD  VW      Polo
B675BDA  BMW     X5
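One way to picture the file, record, field and key field structure is as a Python dictionary, sketched below. The helper add_record and the extra record B222CCC are hypothetical and only there to show that a key field value must be unique.

```python
# The file: key field (CarReg) -> record; each record holds the other fields.
cars = {
    "B101AAA": {"Make": "Toyota", "Model": "Corolla"},
    "B999ABD": {"Make": "VW", "Model": "Polo"},
    "B675BDA": {"Make": "BMW", "Model": "X5"},
}


def add_record(table, key, record):
    """Add a record, refusing a key field value that is already in use."""
    if key in table:
        raise KeyError(f"Duplicate key field value: {key}")
    table[key] = record


record = cars["B999ABD"]                 # the key field identifies exactly one record
print(record["Make"], record["Model"])

add_record(cars, "B222CCC", {"Make": "Nissan", "Model": "Note"})  # made-up record
print(len(cars))
```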
File access methods
Direct access
This method is also called random access.
It means that the required data can be found straight away without having to read through all the data on the disk.
Hard disks, CDs, DVDs and USB memory sticks all allow direct access to data.
Speed of access is faster than with serial access.
Serial access
Data is accessed by starting at the beginning and searching through it, in order/sequence, until the required information is found.
Because it takes longer to locate data on serial access devices, they are only used for backup and batch processing, where the speed of locating data is not important.
Magnetic tape allows only serial access to data.
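The difference between the two access methods can be illustrated in Python. This is a simplified model, not how a disk or tape drive is actually programmed: a list read from the start stands in for serial access, and a dictionary lookup stands in for direct access. The names serial_search and index are invented for the example.

```python
records = [
    ("B101AAA", "Toyota Corolla"),
    ("B999ABD", "VW Polo"),
    ("B675BDA", "BMW X5"),
]


def serial_search(records, wanted_key):
    """Read records in order from the start; return the data and the number of reads."""
    for reads, (key, data) in enumerate(records, start=1):
        if key == wanted_key:
            return data, reads
    return None, len(records)


# Direct access: an index lets us go straight to the wanted record.
index = {key: data for key, data in records}

print(serial_search(records, "B675BDA"))   # every earlier record is read first
print(index["B675BDA"])                    # found without reading the others
```

With three records the difference is small, but on a serial device the number of reads grows with the position of the record in the file.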
Types of files
Transaction file:
A temporary file containing the transactions/changes that have taken place.
Used to update the master file.
Master file:
Represents the ongoing information pertaining to an organisation.
The most important, permanent file of an organisation.
If it is lost or damaged, the whole system may break down.
The grandfather-father-son principle is often used for file security.
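A minimal sketch of updating a master file from a transaction file is given below. The layout is an assumption made for illustration: the master file is modelled as a dictionary of account balances, each transaction as an (account, change) pair, and update_master is a hypothetical name. Keeping the old master untouched and producing a new one on each run is what gives the three generations in the grandfather-father-son principle.

```python
def update_master(master, transactions):
    """Return a new master file with every transaction applied in order."""
    new_master = dict(master)   # the old master is left unchanged as a backup generation
    for account, change in transactions:
        new_master[account] = new_master.get(account, 0) + change
    return new_master


grandfather = {"A001": 100}                                    # oldest generation
father = update_master(grandfather, [("A001", -30)])           # after the first run
son = update_master(father, [("A001", 50), ("A002", 20)])      # current master file

print(grandfather, father, son)
```

If the son file is lost or damaged, it can be recreated by applying the saved transaction file to the father file again.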