EXISTS Conditions: A Condition That Tests for the Existence of Rows in a Subquery

The document describes an efficient method for deleting duplicate rows from a table using a PL/SQL stored procedure. The procedure works by:

1. Selecting duplicate rows into a cursor, sorted by the duplicate key columns.
2. Looping through the cursor rows, comparing the current row's key to the previous one, and deleting any rows with matching keys except the first.

This allows controlling which duplicate row is kept for each group, and performs the deletion more efficiently than a NOT IN clause by avoiding multiple table scans. On a test table with 500k rows and 45k duplicates, the stored procedure completed the deletion much faster than an alternative SQL method.

Uploaded by

yalamandu4358

EXISTS Conditions

An EXISTS condition tests for existence of rows in a subquery.

exists_condition::=

[Syntax diagram: exists_condition]

Table 5-10 shows the EXISTS condition.

Table 5-10 EXISTS Conditions

Condition: EXISTS
Operation: TRUE if a subquery returns at least one row.
Example:

SELECT department_id
  FROM departments d
  WHERE EXISTS
    (SELECT * FROM employees e
      WHERE d.department_id = e.department_id);

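
As a complement to the example in Table 5-10, the negated form can be sketched as follows. It assumes the same departments and employees tables. Because EXISTS only tests whether the subquery returns a row, the subquery's select list does not matter, so a constant is used here.

```sql
-- Departments that have no employees: the subquery returns no row for them,
-- so NOT EXISTS evaluates to TRUE.
SELECT department_id
  FROM departments d
 WHERE NOT EXISTS
       (SELECT 1
          FROM employees e
         WHERE e.department_id = d.department_id);
```
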

Description: With reference to your tip of the week for 08/04/2002, submitted by Madan Patil, this is another
way of deleting duplicate rows from a table. The difference is that this method deletes the duplicate rows
many times faster than the earlier method.

Suppose a table contains three columns, col1, col2 and col3. The following command deletes the duplicate
rows from it (table_name stands for the name of your table):

delete from table_name where rowid in (
SELECT rowid FROM table_name
group by rowid,col1,col2,col3
minus
SELECT min(rowid) FROM table_name
group by col1,col2,col3);
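
Before running a delete like this, it can help to see what it would remove. The query below is a minimal sketch of such a check; table_name, col1, col2 and col3 are hypothetical stand-ins for your own table and key columns.

```sql
-- Hypothetical names: replace table_name and col1..col3 with your own.
-- Lists each duplicated key, how many copies exist, and the rowid
-- that the MIN(rowid) rule would keep.
SELECT col1, col2, col3,
       COUNT(*)   AS copies,
       MIN(rowid) AS kept_rowid
  FROM table_name
 GROUP BY col1, col2, col3
HAVING COUNT(*) > 1;
```

For each key returned, the delete removes copies - 1 rows.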

To show the difference let's consider a table:

Table: EMP
EMPNO NUMBER
ENAME VARCHAR2(20)
JOB VARCHAR2(20)

CREATE TABLE EMP (
  EMPNO NUMBER,
  ENAME VARCHAR2(20),
  JOB VARCHAR2(20)
);

-- 20 copies of employee 1
begin
  for i in 1..20 loop
    insert into emp values (1,'xx','clerk');
  end loop;
  commit;
end;
/
-- 20 copies of employee 2
begin
  for i in 1..20 loop
    insert into emp values (2,'yy','accountant');
  end loop;
  commit;
end;
/
-- 20000 copies of employee 3
begin
  for i in 1..20000 loop
    insert into emp values (3,'zz','manager');
  end loop;
  commit;
end;
/
-- 10000 copies of employee 4
begin
  for i in 1..10000 loop
    insert into emp values (4,'ab','accountant');
  end loop;
  commit;
end;
/

Using the previous method as in your TIP for the Week 08/04/2002

------------------------------------------------------------------------
SQL> select count(*) from emp;

COUNT(*)
----------
30040

SQL> set timing on;

SQL> DELETE FROM EMP E
2 WHERE E.ROWID > ANY (SELECT ROWID
3 FROM EMP M
4 WHERE E.EMPNO = M.EMPNO
5 AND E.ENAME = M.ENAME
6 AND E.JOB = M.JOB );

30036 rows deleted.

Elapsed: [Link].48

SQL> select count(*) from emp;

COUNT(*)
----------
4

Elapsed: [Link].10

Using the NEW suggested method:

-------------------------------------------
SQL> select count(*) from emp;

COUNT(*)
----------
30040

SQL> delete from emp where rowid in (
2 SELECT rowid FROM emp
3 group by rowid,empno,ename,job
4 minus
5 SELECT min(rowid) FROM emp
6 group by empno,ename,job);

30036 rows deleted.

Elapsed: [Link].94

SQL> select count(*) from emp;

COUNT(*)
----------
4

Elapsed: [Link].10
--------------------------------------------------------------------

As the timings show, the new method is several times faster for the same result. This is because it uses
the MINUS set operator to compute the list of duplicate rows, instead of a correlated subquery that refers
back to each row being deleted. The bigger the table, the more noticeable the difference.
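
One way to look at why the two statements behave differently is to compare their execution plans. The sketch below assumes the EMP table from this example, a release that provides the DBMS_XPLAN package, and an available PLAN_TABLE. EXPLAIN PLAN only records the plan and does not delete any rows.

```sql
-- Plan for the correlated-subquery version
EXPLAIN PLAN FOR
DELETE FROM emp e
 WHERE e.rowid > ANY (SELECT m.rowid
                        FROM emp m
                       WHERE m.empno = e.empno
                         AND m.ename = e.ename
                         AND m.job   = e.job);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Plan for the MINUS version
EXPLAIN PLAN FOR
DELETE FROM emp
 WHERE rowid IN (SELECT rowid FROM emp
                  GROUP BY rowid, empno, ename, job
                 MINUS
                 SELECT MIN(rowid) FROM emp
                  GROUP BY empno, ename, job);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

The plans you get depend on the database version, statistics and optimizer settings, so they may not match the behavior measured above.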

MIN() lets you select one row per group, for duplicates and non-duplicates alike, so that you
get a list of all the rows you want to keep:

SELECT MIN(ID) AS ID, LastName, FirstName
FROM Customers
GROUP BY LastName, FirstName;

Listing 5 shows the output of the above code.

Now you just need to delete rows that are not in this list, using the last query as a subquery inside
an antijoin (the NOT IN clause):

DELETE FROM Customers
WHERE ID NOT IN
(SELECT MIN(ID)
FROM Customers
GROUP BY LastName, FirstName);

However, an antijoin query with the NOT IN clause is an inefficient way to do this. In our case,
two (!) full table scans are needed to resolve the statement, which leads to a substantial
performance loss on big data sets. For performance testing I created a Customers data set with
500,000 rows and 45,000 duplicates (9 percent of the total). The above command ran for more than
one hour with no result, other than exhausting my patience, so I killed the process.

Another disadvantage of this syntax is that you can't control which row in each group of
duplicates is kept in the database.
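
For comparison, the same delete can also be phrased with a correlated EXISTS condition. This is only a sketch: it assumes the Customers(ID, LastName, FirstName) table from the text with a unique ID, and its performance has not been measured here.

```sql
-- Assumes Customers(ID, LastName, FirstName) with ID unique.
-- A row is deleted when another row with the same name and a lower ID exists,
-- so the lowest ID in each group survives.
-- Change "k.ID < c.ID" to "k.ID > c.ID" to keep the highest ID instead.
DELETE FROM Customers c
 WHERE EXISTS (SELECT 1
                 FROM Customers k
                WHERE k.LastName  = c.LastName
                  AND k.FirstName = c.FirstName
                  AND k.ID < c.ID);
```

Note one behavioral difference from the GROUP BY form: rows whose LastName or FirstName is NULL never compare equal with "=", so this statement does not treat them as duplicates of each other.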

A PL/SQL Solution: Deleting Duplicate Data with a Stored Procedure

Let me give you an example of a PL/SQL stored procedure, called DeleteDuplicates (see Listing
6), that cleans up duplicates. The algorithm for this procedure is pretty straightforward:

1. It selects the duplicate data in the cursor, sorted by duplicate key (LastName, FirstName
in our case), as shown in Listing 4.

2. It opens the cursor and fetches each row, one by one, in a loop.

3. It compares the duplicate key value with the previously fetched one.

4. If this is the first fetch, or the value is different, then this is the first row in a new
group, so it skips it and fetches the next row. Otherwise, it's a duplicate row within the same
group, so it deletes it.
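
The four steps above can be sketched in PL/SQL as follows. This is not the article's Listing 6, which is not reproduced here; it is a minimal reconstruction that assumes a Customers(ID, LastName, FirstName) table. The csr_Duplicates cursor name comes from the text, while the variable names and the secondary sort on ID are assumptions made for illustration.

```sql
-- Sketch only: assumes Customers(ID, LastName, FirstName); not the article's Listing 6.
CREATE OR REPLACE PROCEDURE DeleteDuplicates IS
  -- Step 1: rows whose key occurs more than once, sorted by key, then by ID
  CURSOR csr_Duplicates IS
    SELECT c.rowid AS rid, c.LastName, c.FirstName
      FROM Customers c
     WHERE (c.LastName, c.FirstName) IN
           (SELECT LastName, FirstName
              FROM Customers
             GROUP BY LastName, FirstName
            HAVING COUNT(*) > 1)
     ORDER BY c.LastName, c.FirstName, c.ID;

  v_PrevLast   Customers.LastName%TYPE;
  v_PrevFirst  Customers.FirstName%TYPE;
  v_FirstFetch BOOLEAN := TRUE;
BEGIN
  -- Step 2: the cursor FOR loop opens the cursor and fetches the rows one by one
  FOR rec IN csr_Duplicates LOOP
    -- Steps 3 and 4: compare the key with the previously fetched one
    IF v_FirstFetch
       OR rec.LastName <> v_PrevLast
       OR rec.FirstName <> v_PrevFirst THEN
      -- First row of a new group: keep it and remember its key
      v_FirstFetch := FALSE;
      v_PrevLast   := rec.LastName;
      v_PrevFirst  := rec.FirstName;
    ELSE
      -- Same key as the previous row: a duplicate, so delete it
      DELETE FROM Customers WHERE rowid = rec.rid;
    END IF;
  END LOOP;
  COMMIT;
END DeleteDuplicates;
/
```

In this sketch the last column of the ORDER BY decides which row of each group is fetched first and therefore kept. Sorting by c.ID DESC, for example, would keep the highest ID.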

Let's run the stored procedure and check it against the Customers data:

BEGIN
DeleteDuplicates;
END;
/

SELECT LastName, FirstName, COUNT(*)
FROM Customers
GROUP BY LastName, FirstName
HAVING COUNT(*) > 1;

The last SELECT statement returns no rows, so the duplicates are gone.

The main job of extracting duplicates in this procedure is done by a SQL statement, which is
defined in the csr_Duplicates cursor. The PL/SQL procedural code is used only to implement the
logic of deleting all rows in the group except the first one. Could it all be done by one SQL
statement?
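
One possible answer, offered as a sketch and not as the author's own solution, is to number the rows inside each group with the ROW_NUMBER() analytic function and delete every row after the first. It assumes the Customers(ID, LastName, FirstName) table and an Oracle release that supports analytic functions.

```sql
-- Assumes Customers(ID, LastName, FirstName); keeps the lowest ID in each group.
-- The inner query numbers the rows of each (LastName, FirstName) group by ID;
-- every row numbered 2 or higher is a duplicate and is deleted by rowid.
DELETE FROM Customers
 WHERE rowid IN
       (SELECT rid
          FROM (SELECT rowid AS rid,
                       ROW_NUMBER() OVER (PARTITION BY LastName, FirstName
                                          ORDER BY ID) AS rn
                  FROM Customers)
         WHERE rn > 1);
```

As with the procedure, the ORDER BY inside the OVER clause controls which row of each group is kept.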
