Module 7
INTEGRATED DISEASE SURVEILLANCE PROJECT
BASIC EPIDEMIOLOGY IN DISEASE SURVEILLANCE
CONTENTS
1. Introduction
2. Specific Instructional Objectives
3. Format of the Training Session at a Glance
4. Key Points to Remember
5. Group Activities
6. Frequently Asked Questions
7. Handout on Basic Epidemiology
8. Evaluation Questions
1. INTRODUCTION
Epidemiology is the basic science of public health, and surveillance is the backbone of
public health. It is therefore very important that all those who practise surveillance
for public health understand some basic concepts of epidemiology. Statistics and
epidemiology form the cornerstone of public health surveillance. An understanding of
statistical principles is necessary to comprehend the published literature and to
practise in a rational manner. The purpose of this section is to review some of the basic
statistical principles and formulas. More in-depth discussion can be found in texts on
epidemiology and biostatistics.
2. SPECIFIC INSTRUCTIONAL OBJECTIVES
• Incidence
• Prevalence
• Case fatality
• Clustering
• Disease trends
• Define at least 70% of the common definitions and terminology related to disease
surveillance.
• Make tables, graphs and maps from the data provided and perform essential
analysis to interpret the data.
3. FORMAT OF THE TRAINING SESSION AT A GLANCE
DURATION OF SESSION: 3 HOURS
Unit No.    Content    Methodology    Duration    Tentative Teaching Aids
4. KEY POINTS TO REMEMBER
Case-based surveillance: The surveillance of a disease by collecting specific data on
each case (e.g. reporting of details on each case of acute flaccid paralysis, AFP).
Cluster: The occurrence of an unusual number of cases in person, place, and time,
compared to a known or estimated baseline frequency of occurrence.
Community surveillance: Surveillance where the starting point is a health event
occurring in the community and reported by a community worker or actively sought by
investigators. This may be particularly useful during an outbreak and where syndromic
case definitions can be used. (The active identification of community cases of Ebola
virus infection in Kikwit was an example of this type of surveillance.)
Contact: An individual who has had interaction with a case in a way that is considered
to have caused significant exposure and therefore risk of infection.
Due dates: The dates by which reports from a specified period should be received by
each level of the surveillance system (used to calculate timeliness).
Endemic: The continuing presence of a disease within a given geographic area or
population group.
Epidemic: The occurrence of an illness or injury clearly in excess of expectancy. This is
often referred to as an outbreak (see Outbreak below).
Epidemiological case definitions: The definition of a case used for reporting to the
surveillance system. The definition may be clinical, laboratory or both. It may relate to
a specified disease (e.g. Measles, yellow fever) or may identify a syndrome (Fever with
Rash, Jaundice, Cough, AFP).
Exception flagging system: The existence of an automated system of data analysis
that calculates thresholds for unusual events or exceptions.
Exposed: A person who has had contact with an infectious agent in a manner that is
known to possibly lead to disease.
Feedback: The regular process of sending analyses and surveillance reports on the
surveillance data back through all levels of the surveillance system so that all
participants can be informed of trends and performance.
Health event: Any event relating to the health of an individual (e.g. the occurrence of a
special disease or syndrome, the administration of a vaccine or an admission to hospital).
Hospital Surveillance: Surveillance where the starting point for a report is the admission
of a patient with a particular disease or syndrome.
Infectious disease: An illness due to a specific infectious agent or its toxic products
that arises through transmission of that agent or its products from an infected person,
animal, or reservoir to a susceptible host, either directly or indirectly through an
intermediate plant or animal host, vector, or inanimate environment.
Intensified surveillance: The upgrading from a passive to an active surveillance system
for a specified reason and period (usually because of an outbreak). It must be noted
that the system becomes more sensitive, so secular trends may need to be interpreted
carefully.
Integrated Surveillance: A coordinated approach to data collection, analysis,
interpretation, use, and dissemination of surveillance information designed for decision
making for public health action. The approach involves integration of surveillance
activities at national, regional, district and health-facility levels.
Laboratory Surveillance: Surveillance where the starting point is the identification or
isolation of a particular organism in a laboratory (e.g. surveillance of salmonellosis).
Mandatory Surveillance: A surveillance system in which participants must report to the system.
Notifiable diseases are one example of a mandatory system where reporting is required
by law. Another example is when a health authority requires that all public laboratories
report diseases in conjunction with their contractual duties.
Notifiable disease: A disease that must be reported to the authorities by law or
ministerial decree.
Outbreak: The occurrence of two or more linked cases of a communicable disease.
Passive surveillance: Surveillance where reports are awaited and no attempt is made
to actively seek reports from the participants in the system.
Primary care Surveillance: Surveillance where the starting point for a report is a new
consultation for a particular disease or syndrome with a primary care physician or
health worker at a clinic.
Performance indicators: Specific agreed measurements of the process of reporting,
action taken in response to surveillance information and the impact of surveillance on
the disease or syndrome in question.
Periodicity: The presence of a repeating pattern of cases. The repeat period can be
in years, months or weeks.
Reporting completeness: Proportion of all expected reports that were actually received.
(Usually stated as “% completeness as of a certain date”)
Reporting timeliness: Proportion of all expected reports that were received by a certain
due date.
Reporting system: The specific process by which diseases or health events are
reported. This process varies with the importance of the disease and the type of
surveillance conducted.
Routine surveillance: The regular systematic collection of specified data in order to
monitor a disease or health event.
Sentinel surveillance: The surveillance of a specified health event in a setting of the
population at risk, e.g. HIV surveillance among antenatal clinic attendees.
Sero-surveillance: The surveillance of an infectious disease by measuring pathogen
specific antibodies in a population or sub population.
Surveillance: The ongoing systematic collection and analysis of data and the
dissemination of information to those who need to know in order that action may be
taken.
Surveillance report: A regular publication with specific information on the disease
under surveillance. It should contain updates of standard tables and graphs as well as
information on outbreaks, etc. In addition it may contain information on the performance
of participants using agreed performance indicators.
Surveillance sensitivity: The ability of a surveillance system to detect an outbreak
(the proportion of all outbreaks that can be detected by the system).
Surveillance predictive value: The likelihood that an “outbreak” detected by a
surveillance system is truly an outbreak.
Survey: An investigation in which information is collected systematically. It is usually
carried out in a sample of a defined population group and in a defined time period.
Unlike surveillance, it is not ongoing but may be repeated. If repeated regularly, surveys
can form the basis of a surveillance system.
Unusual event: The occurrence of a disease or health event in excess of the expectation.
This expectation is either a static or dynamic threshold set by the system.
Voluntary surveillance: A surveillance system where participants take part and report
voluntarily.
Zero reporting: The reporting of zero cases when the participant has detected no
cases. This allows the next level of the system to be sure that the participant has not
sent data that have been lost or has not forgotten to report.
Rate is the frequency of disease expressed per unit size of population and in relation to
time. Note that in a rate the denominator includes the numerator. Examples are the
incidence rate and the prevalence rate.
Ratio is the number of affected persons relative to the number who are unaffected.
Here, the numerator is not a part of the denominator.
Sex ratio is the number of females per 1000 males.
Incidence: The number of persons in a defined population who become ill with a certain
disease during a defined time period.
Incidence Rate: The number of new cases that have occurred per 1000 population
over a fixed period of time.
Formula for calculation of incidence rate:
$$\text{Incidence rate} = \frac{\text{Number of new cases}}{\text{Population at risk}} \times 1000$$
Prevalence: The number of persons in a defined population who have a disease at a
specific time.
Case fatality is the total number of deaths that occur among those who had the
disease, expressed as a percentage.
The case fatality rate is calculated as:
$$\text{Case fatality rate} = \frac{\text{Number of deaths}}{\text{Number of cases}} \times 100$$
Example: In a village with a population of 5000, 50 cases of diarrhea were reported in
the month of July 2004. Calculate the incidence rate.
Total population at risk = 5000
Cases of diarrhea reported = 50
$$\text{Incidence rate} = \frac{50}{5000} \times 1000 = 10 \text{ per 1000 population}$$
Example: In a town with a total population of 10,000 there have been 200 cases of malaria
reported in the last year. Find the prevalence of this disease in that town for that year.
Population of the town = 10,000
Cases of malaria reported in the year = 200
$$\text{Prevalence rate} = \frac{200}{10{,}000} \times 1000 = 20 \text{ per 1000 population}$$
Example: In a village, 300 cases of measles were reported in a month and 30 of them
died. Calculate the case fatality rate.
300 cases occurred in a month and 30 died
$$\text{Case fatality rate} = \frac{30}{300} \times 100 = 10\%$$
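The three worked examples above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `incidence_rate`, `prevalence_rate` and `case_fatality_rate` are invented for this handout, and the figures are those used in the examples.

```python
def incidence_rate(new_cases, population_at_risk, per=1000):
    """New cases per `per` population at risk over the reporting period."""
    return new_cases * per / population_at_risk


def prevalence_rate(old_and_new_cases, population, per=1000):
    """Old and new cases per `per` population."""
    return old_and_new_cases * per / population


def case_fatality_rate(deaths, cases):
    """Deaths among those who had the disease, as a percentage."""
    return deaths * 100 / cases


# Figures taken from the three examples in the text.
print(incidence_rate(50, 5000))       # diarrhea in a village of 5000 -> 10.0
print(prevalence_rate(200, 10_000))   # malaria in a town of 10,000 -> 20.0
print(case_fatality_rate(30, 300))    # measles, 30 deaths in 300 cases -> 10.0
```

Keeping the multiplier (`per`) as a parameter makes it easy to express the same rate per 100 or per 100,000 population when a disease is much more or much less common.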
Qualities Of Good Data
Completeness - All the information needed is provided.
Timeliness - The report reaches the reporting site by the stipulated date and time.
Accuracy - The people who are in charge of collecting the information do it in the
stipulated manner.
Example: When a health worker is informed by some village women that 5 cases of
diarrhea have occurred in village A, and the health worker enters them in the reporting
format on the basis of what she has been told, without confirming the cases, the
information is not accurate and may even be misleading.
Standard and uniform case (disease) definition
Methods of Presentation of Data
Tables
    Simple
    Cross
Graphical presentations
    Bar diagrams
        Simple bar diagram
        Multiple bar diagram
        Sub-divided or component bar diagram
    Histogram
    Line diagrams (one or many)
    Pie diagram
    Spot maps
5. GROUP ACTIVITIES
Topics for Group Work
Basic Epidemiology
Exercise 1
The yearly report of one CHC is given below. The due date for forwarding the reports to
the higher authorities is the 5th of each month. The MO in PHC I was transferred in the
month of January, so PHC I did not send the report for that month. The February report of
PHC I was received on the 10th of the month in August, as there were floods in the area.
The MO of PHC II is regular but failed to send the report just once in the whole year.
Monthly Analysis of Data
Reporting site / month   A     M     J     J     A     S     O     N     D     J     F     M
Hospital 1               12    14    -     32    11    20    12    -     -     45    12    15
Hospital 2               12    15    14    16    12    10    16    13    12    10    12    11
PHC 1                    45    45    57    65    68    35    34    58    59    30    -     -
PHC 2                    13    1     5     36    23    47    56    23    35    12    24    -
Total                    82    75    76    136   114   69    118   91    106   97    48    26
7. HANDOUT ON BASIC EPIDEMIOLOGY
An understanding of statistical principles is necessary to comprehend the published
literature and practice in a rational manner. The purpose of this section is to review
some of the basic statistical principles and formulas. More in-depth discussion can be
obtained in texts of epidemiology and biostatistics.
7.2 Measurements of disease frequency
Prevalence is the most frequently used measure of disease frequency and is defined
as:
$$\text{Prevalence} = \frac{\text{Number of existing cases of a disease}}{\text{Total population at a given point in time}}$$
Incidence quantifies the number of new cases that develop in a population at risk during
a specific time interval:
$$\text{Cumulative incidence} = \frac{\text{Number of new cases of a disease during a given time period}}{\text{Total population at risk}}$$
Cumulative incidence reflects the probability that an individual will develop a disease
during a given time period.
Mortality rate is an incidence measure:
$$\text{Mortality rate} = \frac{\text{Number of deaths}}{\text{Total population}}$$
Case-fatality rate is another incidence measure:
$$\text{Case-fatality rate} = \frac{\text{Number of deaths from the disease}}{\text{Total number of cases of the disease}}$$
Attack rate is also an incidence measure:
$$\text{Attack rate} = \frac{\text{Number of cases of the disease during a given time period}}{\text{Total population at risk due to having been exposed}}$$
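The measures above differ mainly in their denominators. The short Python sketch below makes that explicit. All figures are hypothetical and invented only for illustration, the helper name `proportion` is not part of any IDSP tool, and the sketch assumes that every case occurred among the exposed persons.

```python
def proportion(numerator, denominator):
    """Return numerator / denominator, refusing an empty denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Hypothetical outbreak figures (not taken from the handout).
population_at_risk = 2000    # everyone living in the village
exposed = 120                # persons who ate at a shared meal
cases_among_exposed = 30     # exposed persons who fell ill during the week
deaths_among_cases = 3

# Same numerator, different denominators.
attack_rate = proportion(cases_among_exposed, exposed)
cumulative_incidence = proportion(cases_among_exposed, population_at_risk)
case_fatality = proportion(deaths_among_cases, cases_among_exposed)

print(f"Attack rate among the exposed: {attack_rate:.1%}")
print(f"Cumulative incidence in the village: {cumulative_incidence:.1%}")
print(f"Case-fatality rate: {case_fatality:.1%}")
```

With these assumed figures the attack rate is far higher than the cumulative incidence, because the attack rate counts only the persons who were actually exposed in its denominator.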
7.3 Test result characteristics
It is important to understand predictive value, which helps in interpreting test results
for an individual. The predictive value positive expresses the probability that a person
with a positive test result is actually infected; the predictive value negative is the
probability that a person with a negative test result is not infected. The predictive
value depends not only on the accuracy of the test itself but also on the prevalence (the
percentage of persons who are infected in the population tested). The predictive value
of a positive test result decreases as the prevalence declines in the population tested.
The table in section 7.4 shows how these values are generated.
7.4 Comparison of a survey test with a reference test
Survey test result    Reference test result
                      Positive                         Negative
Positive              True positives (a)               False positives (b)              Survey test positives: a+b
Negative              False negatives (c)              True negatives (d)               Survey test negatives: c+d
                      Reference test positives: a+c    Reference test negatives: b+d
From this table four important statistics can be derived:
• Sensitivity: A sensitive test detects a high proportion of the true cases, and this
quality is measured by a/(a+c).
• Specificity: A specific test has few false positives, and this quality is measured by
d/(b+d).
• Systematic error: For epidemiological rates it is particularly important for the test
to give the right total count of cases. This is measured by the ratio of the total
numbers positive to the survey and the reference tests, or (a+b)/(a+c).
• Predictive value: The proportion of positive test results that are truly positive; it
is important in screening. Predictive value is measured by a/(a+b). It should be noted
that both systematic error and predictive value depend on the relative frequency of
true positives and true negatives in the study sample (that is, on the prevalence of
the disease or exposure that is being measured).
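The four statistics can be computed directly from the cells a, b, c and d of the table. The following Python sketch is only an illustration: the class name `TwoByTwo` and the counts are hypothetical and do not come from any survey.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    a: int  # true positives  (survey +, reference +)
    b: int  # false positives (survey +, reference -)
    c: int  # false negatives (survey -, reference +)
    d: int  # true negatives  (survey -, reference -)

    def sensitivity(self):
        return self.a / (self.a + self.c)

    def specificity(self):
        return self.d / (self.b + self.d)

    def systematic_error(self):
        # Survey test positives relative to reference test positives.
        return (self.a + self.b) / (self.a + self.c)

    def predictive_value_positive(self):
        return self.a / (self.a + self.b)


# Hypothetical counts for 1000 persons tested.
table = TwoByTwo(a=80, b=30, c=20, d=870)
print(f"Sensitivity:      {table.sensitivity():.3f}")
print(f"Specificity:      {table.specificity():.3f}")
print(f"Systematic error: {table.systematic_error():.3f}")
print(f"Predictive value: {table.predictive_value_positive():.3f}")
```

A systematic error above 1 means that the survey test counts more positives than the reference test; a value below 1 means that it undercounts them.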
7.5 How Does Surveillance Case Definition Relate to Sensitivity and Specificity?
As noted above, public health officials rely on reporting units (health-care providers,
laboratory personnel, and other public health personnel) to report the occurrence of
diseases, conditions, injuries, and so on to health departments. To facilitate this
reporting, case definitions are developed to provide uniform criteria for identifying
these diseases and conditions.
Case definitions always involve a balancing act of sensitivity as opposed to specificity.
A definition is sensitive if it identifies all the cases of a disease or condition in question.
A definition is specific if it excludes individuals without the disease or condition in
question. Sensitivity and specificity thus describe the accuracy of the test. Sensitivity
determines the percentage of false-negative results, and specificity determines the
percentage of false-positive results, when a large number of positive and negative
samples are tested.
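Sensitivity and specificity combine with prevalence to give the predictive value described in section 7.3. The sketch below shows this. It assumes a test or case definition with 95% sensitivity and 95% specificity (hypothetical values chosen only for illustration) and applies it to populations with different prevalence. The function name is likewise an assumption.

```python
def predictive_value_positive(sensitivity, specificity, prevalence):
    """Probability that a person with a positive result is truly infected."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Same test, populations with decreasing prevalence (hypothetical values).
prevalences = (0.20, 0.05, 0.01, 0.001)
values = [predictive_value_positive(0.95, 0.95, p) for p in prevalences]

for p, ppv in zip(prevalences, values):
    print(f"prevalence {p:7.1%}   predictive value positive {ppv:6.1%}")
```

The printed values fall as prevalence falls, even though the test itself has not changed. This is why a case definition that works well during an epidemic produces a larger share of false positives once the disease becomes rare.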
An insensitive case definition may suffice when cases are plentiful and it does not
matter if some cases are missed. On the other hand, in the end-game of control (when
a disease nears elimination), it is important to have a sensitive definition to ensure that
all possible cases are captured, even if many are false positives. Discuss the case
definition of cholera with regard to sensitivity and specificity.
CHOLERA
Clinical case definition
• In an area where the disease is not known to be present: severe dehydration or
death from acute watery diarrhea in a patient aged 5 years or more; or
• In an area where there is a cholera epidemic: acute watery diarrhea, with or
without vomiting, in a patient aged 5 years or more.
Laboratory criteria for diagnosis
• Isolation of Vibrio cholerae O1 or O139 from stools in any patient with diarrhea.
Case classification
• Suspected: A case that meets the clinical definition.
• Probable: Not applicable.
• Confirmed: A suspected case that is laboratory confirmed.
Note: In a cholera-threatened area, when the number of “confirmed” cases rises, a
shift should be made to using primarily the “suspected” case classification.
Cholera does appear in children under 5 years old; however, the inclusion of all cases of
acute watery diarrhea in the 2 to 4 year-old group in the reporting of cholera greatly
reduces the specificity of reporting. For management of cases of acute watery diarrhea in
an area where there is a cholera epidemic, cholera should be suspected in all patients.
7.7 Data
What is data? It is health-related information that is collected and documented. It
may be either quantitative or qualitative.
Quantitative data is measurable data in the form of numbers, such as hemoglobin
level, blood pressure measurement, number of ANC visits attended, etc.
Qualitative data gives details about the event, such as reasons for smoking, place of
seeking treatment, reasons for non-utilization of a health service, etc.
Where can this data be generated?
Under IDSP the routine sources of data and reporting are explained in module 6.
Apart from these sources, data can also come from:
• The outreach areas, i.e. the communities
• House to house surveys
• Family surveys
• The OPD: General OPD, Special OPD
• Antenatal clinics
• Postnatal clinics
• The indoor departments
• Surveys, routine and sentinel
Who collects the data?
• The health workers
• Para medical workers
• Nurses
• Private health care facilities
• Members of the community
Characteristics of good data
Valid - Data are valid when the reporting is as per the (accepted) case definitions
and the definitions are adhered to at all times.
Complete - This means that all the components of the information from all reporting
units are incorporated in the report.
A report is said to be complete when all the reporting units within its catchment
area have submitted their reports on time. If only 8 out of 10 centres have sent their
reports, then the report is said to be incomplete (or 80% complete).
Timeliness - Sending the report at the stipulated time.
Time of sending the information:
• Daily in case of outbreaks
• Weekly in some cases
• Monthly in most situations
A centre is said to be sending its reports on time if the reports reach the designated
level within the prescribed time period. If a report reaches later, it is considered
to be late (and of lesser public health use). The timeliness of a reporting unit can be
calculated by assessing how many of its expected reports have come on time.
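Completeness and timeliness are both simple percentages of the expected reports. The Python sketch below uses the "8 out of 10 centres" example from the text. The due date and the receipt dates are hypothetical, as are the function names.

```python
from datetime import date


def completeness(reports_received, reports_expected):
    """Percentage of expected reports that were actually received."""
    return reports_received * 100 / reports_expected


def timeliness(received_on, due_date, reports_expected):
    """Percentage of expected reports received on or before the due date."""
    on_time = sum(1 for day in received_on if day <= due_date)
    return on_time * 100 / reports_expected


# From the text: only 8 of the 10 centres have sent their reports.
print(completeness(8, 10))                   # 80.0

# Hypothetical receipt dates for those 8 reports; due date assumed to be the 5th.
due = date(2004, 8, 5)
received = [date(2004, 8, d) for d in (2, 3, 3, 4, 5, 5, 7, 10)]
print(timeliness(received, due, 10))         # 60.0
```

Both indicators use the number of expected reports as the denominator, so a report that never arrives lowers completeness and timeliness alike, while a late report lowers only timeliness.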
7.8 Methods of Presentation of the Data
Simple tables: Data can be presented in the form of tables. When the data is
presented in the form of variables and numbers, they form tables. Frequency tables
are simple tables that show variables with their numbers and percentages.
Table showing the blood slides tested for malarial parasite
Blood slides tested for Malarial parasite    Number    Percentage
Positive tests                               130       65
Negative tests                               70        35
Total                                        200       100
Cross tables
Compound tables (cross-tabulations) are tables in which two variables are compared
and analyzed for correlation or association. If an association is found, tests of
significance can be applied to find out its statistical significance.
In the example above, if the information with respect to ages is available, then the
positive and negative tests can be further classified as shown below:
Blood slides tested for Malarial parasite    Infants    All other age groups    Total
Negative tests                               1          69                      70
Cases of URTI & Diarrhea in 2 villages
                     Village A               Village B               Sub centre XYZ
Variable             Number    Percentage    Number    Percentage    Total    Percentage
Cases of URTI        35        45.45         38        74.5          73       57.03
Cases of diarrhea    42        54.55         13        25.5          55       42.97
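The percentages in the table above are column percentages: each count is divided by the total for its own village. A minimal Python sketch of that calculation follows. It uses the counts from the table, and the variable and function names are assumptions made for illustration.

```python
# Counts from the table above.
counts = {
    "Village A": {"URTI": 35, "Diarrhea": 42},
    "Village B": {"URTI": 38, "Diarrhea": 13},
}


def column_percentages(column):
    """Share of each disease within one column, rounded to 2 decimals."""
    total = sum(column.values())
    return {name: round(100 * n / total, 2) for name, n in column.items()}


# The sub-centre column is the sum of the two villages.
sub_centre = {}
for village in counts.values():
    for name, n in village.items():
        sub_centre[name] = sub_centre.get(name, 0) + n

for label, column in {**counts, "Sub centre XYZ": sub_centre}.items():
    print(label, column, column_percentages(column))
```

Rounded to two decimals, Village B gives 74.51 and 25.49, which the table shows to one decimal place as 74.5 and 25.5.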
2) Multiple bar diagram: This method can be used for data which is made up of two
or more components. In this method the components are shown as separate
adjoining bars. The height of each bar represents the actual value of the component.
The components are shown by different shades or colors. Multiple bar charts are used
where only the changes in the actual values of the components need to be shown.
3) Sub-divided or component bar diagram: While constructing such a diagram,
the various components in each bar should be kept in the same order. A common
and helpful arrangement is that of presenting each bar in the order of magnitude
with the largest component at the bottom and the smallest at the top. The
components are shown with different shades or colors with a proper index.
Illustration: During 1998-2000, the number of cases of URTI and Diarrhoea reported
by 3 sub-centres is as follows. Represent the data by a similar diagram.
CASES OF URTI AND DIARRHOEA IN 3 VILLAGES
Year         Village A    Village B    Village C    Total
1998-1999    35           24           23           112
1999-2000    38           60           10           108
(b) Histogram: When the data is continuous, such as data on age, there is no gap
between the bars, so the histogram is a continuous bar diagram.
(c) Line diagrams: Instead of representing the values as bars, each value can be
plotted and the points joined together to obtain what is called a line diagram.
The advantage of line diagrams is that many sets of variables can be plotted
simultaneously without the diagram looking cluttered. It is generally
recommended that not more than 5 to 8 lines or value sets be used. Village-wise
diarrhea cases over the period 1995 to 2000 have been represented as a line diagram
below. It may be noted that the trend in the occurrence of the cases over the years
also becomes evident.
(d) Pie Charts
The pie chart is a staple form of graphical data presentation. Used properly, it can be
an effective way of presenting a small number of pieces of data, provided the
following limitations are observed:
• It should be used only where the values have a constant sum (usually 100%).
• It should be used where the individual values show significant variations; a pie
chart of seven equal values is of no use.
• It should be used when the number of categories (‘slices’) is reasonably small; as
a rule of thumb, the number of categories should normally be between 3 and 10.
It is often worthwhile adding annotations, especially the values for each category
(thus saving the need for a separate table of data values).
(e) Pictogram: At times the common man does not understand the matter depicted by
a bar diagram, so pictures are used, each depicting a fixed percentage of the matter,
and multiples of these are used to present the data.
(f) Scatter graphs
Scatter graphs are widely used in science to present measurements on two (or
more) variables that are thought to be related; in particular, the values of the
variable plotted on the y (vertical) axis are thought to be dependent on the values
of the variable plotted along the x (horizontal) axis. The latter is said to be the
independent variable. In the scattergram below, the ages of the 25 cases of
poliomyelitis are plotted. It can be seen that most of the cases are scattered in and
around 2 to 4 years of age.
Rates And Proportions
Rate is the frequency of disease expressed per unit size of population and in relation to
time. In a rate the denominator also includes the numerator. Examples are the incidence
rate and the prevalence rate.
Ratio is the number of affected persons relative to the number of unaffected. Here the
numerator is not a part of the denominator.
Sex ratio is the number of females per 1000 males.
Incidence of a disease: The number of new cases of a specific disease that have been
detected in the specified period in a given population.
Incidence rate: The number of new cases that have occurred per 1000 population
over a fixed period of time.
Formula for calculation of incidence rate:
$$\text{Incidence rate} = \frac{\text{Number of new cases}}{\text{Population at risk}} \times 1000$$
Incidence rate of disease A:
$$\text{Incidence rate of disease A} = \frac{\text{No. of new cases of disease A}}{\text{Mid-year population in the catchment area}} \times 1000$$
Uses of incidence rates
1. To find out the new cases occurring in a given population in a specified time.
2. It tells us about the force of transmission of the disease.
3. It tells us about the location of the newly found cases, so that action can be
focused on that group and the spark stamped out before it becomes a big fire
requiring a fire brigade to put it out.
4. It tells us the trends when rates are compared over a period of time.
Prevalence of disease is the number of old and new cases that have been detected in
the specified period of time or at a given point in time.
Prevalence rate is the number of old and new cases that have occurred per 1000
population over a fixed period of time or at a given point in time.
Prevalence rate of disease A:
$$\text{Prevalence rate of disease A} = \frac{\text{No. of old and new cases of disease A}}{\text{Mid-year population in the catchment area}} \times 1000$$
Uses of Prevalence Rates
It gives an idea of the total load of disease (Burden of disease)
It gives us an idea of the amount of resources that may be required to manage this
burden
Maps: These help the officer to identify the areas where health events are occurring.
Many diseases are concentrated in a fixed geographical area, where the factors
responsible for their causation are affecting the people. It is therefore essential to
locate on the map of the area each and every case that has occurred over a period of
time. The map gives the concerned health care providers an idea of where quick action
needs to be taken to stop the spread of the disease. It also tells the epidemiologists
about impending dangers if a similar pattern has been seen in the past.
Clustering of cases indicates that a large number of similar cases have occurred in a
limited geographical area, or around the index case, or in the vicinity of an event
where a large number of people gathered. This gives us an idea of the causative and
predisposing factors that might have played a role in its occurrence.
Trends can be identified by comparing the annual number of cases with the number of
cases in previous years. When the incidence of a disease is studied over a period of
time, the reduction or increase in the incidence shows us the trend.
These trends indicate the direction of the work done on the disease under study.
Example: As a result of implementation of an effective immunization programme, measles
has shown a decreasing trend, i.e. the incidence of measles has reduced over the years.
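One simple way to describe a trend is the percentage change in the annual number of cases from one year to the next. The sketch below illustrates this. The measles figures are hypothetical and are not taken from any surveillance report, and the function name is an assumption.

```python
def year_on_year_change(annual_cases):
    """Percentage change in reported cases from each year to the next."""
    years = sorted(annual_cases)
    changes = {}
    for previous, current in zip(years, years[1:]):
        before, after = annual_cases[previous], annual_cases[current]
        changes[current] = (after - before) * 100 / before
    return changes


# Hypothetical annual measles cases reported by one district.
measles_cases = {2000: 400, 2001: 300, 2002: 240, 2003: 180}
changes = year_on_year_change(measles_cases)

for year, change in changes.items():
    print(f"{year}: {change:+.1f}% compared with the previous year")

if all(change < 0 for change in changes.values()):
    print("Decreasing trend over the whole period")
```

Comparing raw case counts assumes that the population and the completeness of reporting stayed roughly the same over the years; where they did not, incidence rates should be compared instead.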
Cyclic pattern of the disease: Some diseases tend to occur in peaks once in a few years,
and this is called the cyclic occurrence of the disease. By studying the incidence of
the disease over years, we can anticipate the increased incidence.
Examples: Prior to the immunization programme, a cyclic pattern in the incidence of
vaccine preventable diseases was recorded. For example, an increase in the incidence
of measles was reported every two to four years. The period between the peaks will,
however, increase and the intensity of the peaks decline with increasing immunization
coverage levels. If the cyclic pattern of the disease in the area is known, an increase in
the incidence can be anticipated and precautionary measures can be taken in the
high-risk pockets. If the recorded data do not show any change in the cyclic pattern
despite high sustained immunization coverage levels, the quality of the immunization
programme, including cold chain maintenance, potency testing results of the DPT and
the reported immunization coverage levels, should be checked.
7.11 Definition of an Epidemic
An epidemic is commonly defined as the occurrence in a community or area of cases of
a disease that are clearly in excess of what is expected (compared to that in previous
years). Epidemics occur over a large area.
Example: In the rainy season there are cases of diarrhea reported from many places
spread over a large area.
7.12 Definition of an Outbreak
An outbreak is defined as the occurrence in a community of cases of an illness clearly in
excess of expected numbers. While an outbreak is usually limited to a small focal area,
an epidemic covers a large geographic area and has more than one focal point.
There is yet another definition of an outbreak: the occurrence of two or more
epidemiologically linked cases of a disease of ‘outbreak potential’ like measles, cholera,
dengue or JE (Japanese encephalitis).
Example: In a slum pocket in an urban area there was sudden reporting of 50 cases of
fever with rash in adults, and one of the cases was diagnosed as dengue on laboratory
investigation. Following this, 6 more cases had laboratory results confirmed positive
for dengue. There were no similar reports of fever from this area in the past. Here we
can say that there is an outbreak of dengue in this slum area.
8. EVALUATION QUESTIONS
What do you understand by incidence of a disease?
How do you calculate the incidence rate of a disease?
What do you understand by prevalence rate?
How do you calculate the prevalence rate?
Clustering of diseases indicates ______
To find out the trends of a disease we need to find the ______
What is the difference between rate and ratio? Give one example of each.
Fill in the gaps
__________ is the minimum number of cases in a district for qualification as an epidemic
for Japanese encephalitis.
____________ of data refers to accuracy and quality of data made available through the
systems.
___________ refers to the receipt of data on or before the due date.