Using Structured Data
Contemporary Challenges in
Data Policy and Governance
Lecture #06
18-08-2025 PS-632 CPS IIT Bombay 1
Physical Data Model
• Tables
• Columns
• Keys/Indices
• Triggers
The standard for creating and manipulation relational databases
is SQL
18-08-2025 PS-632 CPS IIT Bombay 2
Tables and Columns
• A Table is a two dimensional structure. The rows of a table are
just like the records in a file. Each row can have many columns.
Each column has a specific type. A table can be used to store
all attributes of an entity as well as to represent relations
• The columns in a table are like the fields of a record in a file.
They are used to store attributes of an entity or data that
participate in a relation
• Each row is a collection of attribute values of different types
• Each column is a collection of attribute values of the same type
18-08-2025 PS-632 CPS IIT Bombay 3
Downloading and installing PostgreSQL (optional)
https://www.postgresql.org/download/
18-08-2025 PS-632 CPS IIT Bombay 4
Downloading and installing pgAdmin (optional)
https://www.pgadmin.org/
18-08-2025 PS-632 CPS IIT Bombay 5
Physical Data Model for our example
Each entity will be represented by a table
• Reg_users
• Beneficiaries
• Hospitals
• Vaccinators
• Vaccines
• Appointments
• Vac_certs
18-08-2025 PS-632 CPS IIT Bombay 6
Creating a Database
A database can be created using the CREATE DATABASE
command. We will create a database called tempdb.
CREATE DATABASE tempdb;
18-08-2025 PS-632 CPS IIT Bombay 7
The Reg_users table
Table: Reg_users
mobile_number : string
CREATE TABLE Reg_users (
mobile_number varchar(15) );
18-08-2025 PS-632 CPS IIT Bombay 8
The Beneficiaries table
Table: Beneficiaries
linke bene first last mid ph_t ph_n gen year
CREATE TABLE Beneficiaries (
linked_mobile_number varchar(15) ,
beneficiary_id varchar(15) ,
first_name varchar(50) ,
last_name varchar(50) ,
middle_names varchar(50) ,
photo_id_type varchar(15) ,
photo_id_number varchar(15) ,
gender char(1) ,
18-08-2025 year_of_birth int ); PS-632 CPS IIT Bombay 9
The Hospitals table
Table: Hospitals
hid hname hstate hdistrict hpincode
CREATE TABLE Hospitals (
hid varchar(15) ,
hname varchar(50) ,
hstate varchar(20) ,
hdistrict varchar(20) ,
hpincode char(6));
18-08-2025 PS-632 CPS IIT Bombay 10
The Vaccinators table
Table: Vaccinators
mob vid first last mid id_ty id_nu gen
CREATE TABLE Vaccinators (
linked_mobile_number varchar(15) ,
vaccinator_id varchar(15) ,
first_name varchar(50) ,
last_name varchar(50) ,
middle_names varchar(50) ,
photo_id_type varchar(15) ,
photo_id_number varchar(15) ,
gender char(1) );
18-08-2025 PS-632 CPS IIT Bombay 11
The Vaccines Table
Table: Vaccines
vaccine_name manufacturer batch_number
CREATE TABLE Vaccines (
vaccine_name varchar(15) ,
manufacturer varchar(50) ,
batch_number varchar(20) );
18-08-2025 PS-632 CPS IIT Bombay 12
The Appointments Table
Table: Appointments
beneficiary_id hospital_id apt_date apt_time
CREATE TABLE Appointments (
beneficiary_id varchar(15) ,
hospital_id varchar(15) ,
apt_date date,
apt_time time );
18-08-2025 PS-632 CPS IIT Bombay 13
The Vaccination_certs Table
Table: Vaccination_certs
c_id c_ty b_id b_n b_g b_a vr_i vr_n h_id h_nm d_vc t_vc v_n v_b
CREATE TABLE Vaccination_certs(
vaccination_certificate_id varchar(15),
certificate_type char,
beneficiary_id varchar(15),
beneficiary_name varchar(50),
beneficiary_gender char,
beneficiary_age int,
vaccinator_id varchar(15),
vaccinator_name varchar(50),
hospital_id varchar(15),
hospital_name varchar(50),
date_vaccination date,
time_vaccination time,
vaccine_name varchar(50),
vaccine_batch_number varchar(50));
18-08-2025 PS-632 CPS IIT Bombay 14
Keys
• Keys are columns whose value is used to identify a row in a table
• Key can be of two main types
– Primary key : This is a column whose values can be used to uniquely
identify a row in the table in which it is a primary key
– Foreign key : This is a column whose values are the primary key in some
other table
• Keys are used to uniquely identify rows and establish relationships
across tables
• Keys are also called indices since an index is maintained on a
column which is a key to speed up access to a specific row
associated with a specific key value
18-08-2025 PS-632 CPS IIT Bombay 15
The Reg_users table
Table: Reg_users
mobile_number : string
primary key
CREATE TABLE Reg_users (
mobile_number varchar(15) );
18-08-2025 PS-632 CPS IIT Bombay 16
The Beneficiaries table
Table: Beneficiaries
linke bene first last mid ph_t ph_n gen year
primary key
CREATE TABLE Beneficiaries (
linked_mobile_number varchar(15) ,
beneficiary_id varchar(15) ,
first_name varchar(50) ,
last_name varchar(50) ,
middle_names varchar(50) ,
photo_id_type varchar(15) ,
photo_id_number varchar(15) ,
gender char(1) ,
18-08-2025 year_of_birth int ); PS-632 CPS IIT Bombay 17
The Hospitals table
Table: Hospitals
hid hname hstate hdistrict hpincode
primary key
CREATE TABLE Hospitals (
hid varchar(15) ,
hname varchar(50) ,
hstate varchar(20) ,
hdistrict varchar(20) ,
hpincode char(6));
18-08-2025 PS-632 CPS IIT Bombay 18
The Vaccinators table
Table: Vaccinators
mob vid first last mid id_ty id_nu gen
primary key
CREATE TABLE Vaccinators (
linked_mobile_number varchar(15) ,
vaccinator_id varchar(15) ,
first_name varchar(50) ,
last_name varchar(50) ,
middle_names varchar(50) ,
photo_id_type varchar(15) ,
photo_id_number varchar(15) ,
18-08-2025 gender char(1) ); PS-632 CPS IIT Bombay 19
The Vaccines Table
Table: Vaccines
vaccine_name manufacturer batch_number
No single column is a primary key
CREATE TABLE Vaccines (
vaccine_name varchar(15) ,
manufacturer varchar(50) ,
batch_number varchar(20) );
18-08-2025 PS-632 CPS IIT Bombay 20
The Appointments Table
Table: Appointments
beneficiary_id hospital_id apt_date apt_time
foreign key
foreign key
is primary key in
is primary key in
Beneficiaries table
Hospitals table
CREATE TABLE Appointments (
beneficiary_id varchar(15) ,
hospital_id varchar(15) ,
apt_date date,
apt_time time );
18-08-2025 PS-632 CPS IIT Bombay 21
The Vaccination_certs Table
Table: Vaccination_certs
c_id c_ty b_id b_n b_g b_a vr_i vr_n h_id h_nm d_vc t_vc v_n v_b
primary key
CREATE TABLE Vaccination_certs(
vaccination_certificate_id varchar(15),
certificate_type char,
beneficiary_id varchar(15),
beneficiary_name varchar(50), Foreign key to Beneficiaries table
beneficiary_gender char,
beneficiary_age int, Foreign key to Vaccinators table
vaccinator_id varchar(15),
vaccinator_name varchar(50), Foreign key to Hospitals table
hospital_id varchar(15),
hospital_name varchar(50),
date_vaccination date,
time_vaccination time,
vaccine_name varchar(50),
vaccine_batch_number varchar(50))
18-08-2025 PS-632 CPS IIT Bombay 22
Redundancy and Integrity
• Sometimes the physical data model may contain some redundancy,
for example the Vaccination_certs table need not contain the
beneficiary_name, hospital_name and vaccinator_name since the
beneficiary_id, hospital_id and vaccinator_id uniquely identify the
names. This may de done for reasons of speed of processing
• While introducing redundancy care must be taken to ensure
integrity. For example the beneficiary_name corresponding to a
beneficiary_id in the Vaccinations table must be the same as in the
Beneficiaries table.
18-08-2025 PS-632 CPS IIT Bombay 23
Triggers
• A trigger is a special procedure that runs automatically when
certain activity occurs in a database. These are typically set to
run when changes occur to a database.
• One of the important reasons for using triggers is to ensure
integrity constraints are met whenever a change is made to a
database.
18-08-2025 PS-632 CPS IIT Bombay 24
SQL
• SQL is a computer language that allows users to create and
manipulate relational databases
• SQL is an ANSI (American National Standards Institute)
standard
• There are many flavours of SQL but they all share a common
core
• The CREATE TABLE statement is an SQL statement
• SQL is used to write queries
18-08-2025 PS-632 CPS IIT Bombay 25
We wish to find
Using the Data Model 1. All the hospitals in a state
CREATE TABLE Hospitals (
2. All the hospitals in a district
hid varchar(15) ,
hname varchar(50) , 3. All hospital with a given PIN code
hstate varchar(20) ,
hdistrict varchar(20) ,
hpincode char(6));
18-08-2025 PS-632 CPS IIT Bombay 26
All the hospitals in a state
Using the Data Model
CREATE TABLE Hospitals ( SELECT hname FROM “Hospitals” WHERE hstate =
hid varchar(15) , ‘<name of the state>’;
hname varchar(50) , This will give only the hospital name field
hstate varchar(20) ,
hdistrict varchar(20) , SELECT * FROM “Hospitals” WHERE hstate = ‘<name
hpincode char(6)); of the state>’;
This will give all fields
SELECT hid , hname FROM “Hospitals” WHERE hstate
= ‘<name of the state>’;
This will give the hospital id and hospital name fields
18-08-2025 PS-632 CPS IIT Bombay 27
All the hospitals in a district and all the
Using the Data Model hospitals with a given PIN code
CREATE TABLE Hospitals (
hid varchar(15) , SELECT * FROM “Hospitals” WHERE hdistrict =
hname varchar(50) , ‘<name of the district>’;
hstate varchar(20) ,
hdistrict varchar(20) ,
SELECT * FROM “Hospitals” WHERE hpincode =
hpincode char(6));
‘<required pin code>’;
18-08-2025 PS-632 CPS IIT Bombay 28
A more complex query - All the
Using the Data Model hospitals in a district except those with
CREATE TABLE Hospitals ( a given pin code
hid varchar(15) ,
hname varchar(50) ,
SELECT * FROM “Hospitals” WHERE hdistrict =
hstate varchar(20) ,
‘<name of the district>’
hdistrict varchar(20) ,
AND
hpincode char(6));
hpincode <> ‘<the pincode to be excluded>’;
The symbol combination of <> stands for “not equal
to”
18-08-2025 PS-632 CPS IIT Bombay 29
Complex queries involving more than
Using the Data Model one table – Find all registered
CREATE TABLE Beneficiaries (
linked_mobile_number varchar(15) ,
beneficiaries that have not been
beneficiary_id varchar(15) ,
first_name varchar(50) ,
last_name varchar(50) ,
vaccinated
middle_names varchar(50) ,
photo_id_type varchar(15) ,
photo_id_number varchar(15) ,
gender char(1) ,
year_of_birth int );
SELECT * FROM “Beneficiaries” WHERE
NOT EXISTS
CREATE TABLE Vaccination_certs(
vaccination_certificate_id varchar(15), (SELECT * FROM “Certificates”
certificate_type char,
beneficiary_id varchar(15), WHERE “Certificates”.beneficiary_id =
beneficiary_name varchar(50),
beneficiary_gender char, “Beneficiaries”.beneficiary_id);
beneficiary_age int,
vaccinator_id varchar(15),
vaccinator_name varchar(50),
hospital_id varchar(15),
hospital_name varchar(50),
date_vaccination date,
time_vaccination time,
vaccine_name varchar(50),
vaccine_batch_number varchar(50))
18-08-2025 PS-632 CPS IIT Bombay 30
Find all beneficiaries for whom the
Using the Data Model second dose is overdue
CREATE TABLE Vaccination_certs(
vaccination_certificate_id varchar(15), SELECT beneficiary_id, beneficiary_name
certificate_type char,
beneficiary_id varchar(15),
FROM "Certificates" AS c1
beneficiary_name varchar(50), WHERE (CURRENT_DATE - date_vaccination)::INT > 84
beneficiary_gender char,
beneficiary_age int, AND NOT EXISTS (
vaccinator_id varchar(15),
vaccinator_name varchar(50),
SELECT 1
hospital_id varchar(15), FROM "Certificates" AS c2
hospital_name varchar(50),
date_vaccination date, WHERE c1.beneficiary_id = c2.beneficiary_id
time_vaccination time,
vaccine_name varchar(50),
AND c2.certificate_type = 'F'
vaccine_batch_number varchar(50)) );
18-08-2025 PS-632 CPS IIT Bombay 31
Some Shortcommings of the Model
• Due to reported malpractices it is desired to check on a
vaccination certificate if the specific vaccine batch number was
ever allocated to the hospital where the vaccination was
administered
• A report is required on how many beneficiaries did not take
the vaccination in a hospital in the area with the same pincode
as their residential address
18-08-2025 PS-632 CPS IIT Bombay 32
Possible addition of new relations
• In order to check if a vaccination given at a hospital is genuine,
we need to keep track of which batch numbers of which
vaccines have been allotted to which hospital.
• If we know that a vaccine batch was actually allotted to a
hospital, then we can subject the vaccination certificate to this
additional check.
18-08-2025 PS-632 CPS IIT Bombay 33
The Vaccine_Allotment table
• The Vaccine_Allotment table relates the batch number of a
vaccine to a hospital. It is possible that more than one hospital
may receive the same batch but it will not be very many
• The Vaccine_Allotment table has the following structure
vaccine_name hospital_id
CREATE TABLE Vaccine_Allotment(
vaccine_name varchar (15),
vaccine_batch_number varchar(20),
hospital_id varchar(15))
18-08-2025 PS-632 CPS IIT Bombay 34
Genuine Certificates
SELECT * FROM Vaccination_certs WHERE
EXISTS
(SELECT * FROM Vaccine_allotment
WHERE (Vaccination_certs.hospital_id = Vaccine_allotment.hospital_id)
AND
(Vaccination_certs.vaccine_batch_number_id =
Vaccine_allotment.vaccine_batch_number)
)
18-08-2025 PS-632 CPS IIT Bombay 35
Problematic Certificates
SELECT * FROM Vaccination_certs WHERE
NOT EXISTS
(SELECT * FROM Vaccine_allotment
WHERE (Vaccination_certs.hospital_id = Vaccine_allotment.hospital_id)
AND
(Vaccination_certs.vaccine_batch_number_id =
Vaccine_allotment.vaccine_batch_number)
)
18-08-2025 PS-632 CPS IIT Bombay 36
Triggers
• We discussed the need to see if a hospital has actually received
the batch number of vaccine that it is administering.
• This can be seen as an integrity requirement.
• One way of checking integrity is to initiate the check whenever
a change is detected in a database.
• This can be implemented by the use of a trigger
18-08-2025 PS-632 CPS IIT Bombay 37
Checking a certificate
CREATE TRIGGER check_cert BEFORE INSERT on Vaccination_certs
EXECUTE PROCEDURE check_cert();
CREATE FUNCTION check_cert() RETURNS TRIGGER AS $check_cert$
BEGIN
IF NOT EXISTS
(SELECT * FROM Vaccine_allotment
WHERE (NEW.hospital_id = Vaccine_allotment.hospital_id)
AND
(NEW.vaccine_batch_number_id =
Vaccine_allotment.vaccine_batch_number)
)
THEN RAISE EXCEPTION “% Vaccine batch has not been allocated to %
hospital”, NEW.vaccine_batch_number, NEW.hospital_id;
END IF;
END;
$check_cert$ LANGUAGE plpgsql;
18-08-2025 PS-632 CPS IIT Bombay 38
Useful Reports
• There is considerable value in knowing the vaccination status by
state, district and pincode. Some examples of such information
could be
– Percentage of people vaccinated with one dose in each age group by state,
district and pincode
– Percentage of people vaccinated with both doses in each age group by
state, district and pincode
– People whose second dose is overdue by pincode
• These types of queries are not possible in the current data model
since the location data for beneficiaries is not part of this data
model
18-08-2025 PS-632 CPS IIT Bombay 39
Possible Solutions – Explicitly seek this input
• Add the location data to the current data model
• Ask for it to be entered
• Infer it from the context like ip-address
– This will require changes to the beneficiary table where the location
information will be stored
18-08-2025 PS-632 CPS IIT Bombay 40
Possible Solutions – Use other databases
• Get the location data from another database which could be
the photo-id document (Aadhar, Passport etc.)
– The other databases may not be available online
– If they are will the data be copied to a modified version of the
beneficiaries database or will the data be fetched on demand as
needed by a query?
18-08-2025 PS-632 CPS IIT Bombay 41
Summary
• The physical data model involves the creation of tables
• Each table has several columns
• Some columns may be keys
• One column can be a primary key and many columns may be
foreign keys
• Physical data models must be contructed carefully and
redundancy balanced with the need for integrity
18-08-2025 PS-632 CPS IIT Bombay 42
Summary
• The data model can be used by writing queries in SQL
• We have seen examples of SQL – CREATE TABLE and SELECT. SQL is a very
rich language. The most commonly used SQL statement to use data is
SELECT
• Given a data model some types of queries are much easier to implement
than others. Therefore data model design must be done carefully
• It is not easy to make changes to a data model so careful analysis is
required before design
• In practical scenarios it may be required to link a given data model to
another external one. Doing this presents several alternatives and must be
done with care
18-08-2025 PS-632 CPS IIT Bombay 43