Roll no 57
Experiment No 2
Aim- Experiment to study query evaluation plan for different queries and analyze
it.
Queries:
1. select * from trains order by train_ticket_price ASC;
● Relational algebra
π train_name,train_id,train_seat_no,train_ticket_price,train_type,train_description (τ
train_ticket_price asc (trains))
● Query Tree
Roll no 57
● Query Evaluation
Clustered Index
There are clustered and nonclustered indexes. A clustered index is a special type of index
that reorders the way records in the table are physically stored. Therefore table can have only
one clustered index. The leaf nodes of a clustered index contain the data pages.
Selection took 0% time , and for sorting 78% for scanning data and finding the required it took
only 22%.
As train ticket price is not index for our table hence Server used clustered index scan (when need
to search whole table not only index).
2. select * from trains as t where not exists (select * from payment p where p.train_no =
t.train_id)
● Relational algebra
Πtrain_name, train_id, train_seat_no, train_ticket_price, train_type, train_description not exists
(𝜎train_no = train_id)(payment,train)
● Query Tree
Roll no 57
● Query Evaluation
As we have not exists in the query SQL performed left anti Semi join.
Scanning trains took 49% and payment 51%
3. select count(pay_cust_id) from payment where train_no = 1111
● Relational algebra
Π G(count)(pay_cust_id) 𝜎 train_no = 1111 (trains)
● Query Tree
Roll no 57
● Query Evaluation
Compute scalar is used to compute and give scalar value.
Stream Aggregate is used to compute scalar aggregate instead of hash [Link]
aggregate is used when data is arriving sorted by group by.
No need of sorting as primary key is used will directly fetch that data.
4. select sum(pay_amount) from payment group by pay_amount
● Relational algebra
Π G(sum)(pay_amount) G(group_by)(pay_amount)(payment)
● Query Tree
Roll no 57
● Query Evaluation
Sort data according to pay amount which took 78%.
Then process further the aggregate function and find scalar value.
5. select user.user_id, user.user_email, payment.train_no, payment.pay_amount from
user_rail as user inner join payment where user.user_id = payment.pay_cust_id
● Relational algebra
Π user.user_id, user.user_email, payment.train_no, payment.pay_amount (user_rail ⍴
user ⋈ user.user_id = payment.pay_cust_id payment)
● Query Tree
Roll no 57
● Query Evaluation
Performed inner join as given in the query.
More time taken for user_rail as it has more data then payment
6. select * from user_rail where user_id in (select user_id from payment where pay_amount
> 500)
● Relational algebra
R1 ← Πuser_id 𝜎pay_amount>500
(payment)
R2 ← Πuser_id, user_email,
user_mobile, user_address,user_pass (𝜎
user_id = R1(user_rail)
Roll no 57
● Query Tree
● Query Evaluation
Performed Left semi join as we need data only of left table which also exist in right table.
7. select distinct pay_amount from payment a where 3 >= (select count(distinct pay_amount)
from payment b where a.pay_amount <= b.pay_amount) order by a.pay_amount desc;
● Relational algebra
R1 ← 𝜎 pay_amount (payment)
R2 ← 𝜎 G
count_distinct(pay_amount) τ
Roll no 57
desc pay_amount (𝜎
pay_amount=R1(Payment))
R3 ← Π payamount( 𝜎
R2<=3(payment))
● Query Tree
Roll no 57
● Query Evaluation
Filter scan gives only required data depending on the predicates as output.
Inner join as only matched data is required.
Distinct Sort is performed as we have order by in the query.
Index spool SQL create temporary index when needs to read non indexed data,lazy
spool is when it create index for a row only when that row is required.