IBM

IBM i

Database
SQL programming
7.1

Note
Before using this information and the product it supports, read the information in "Notices," on
page 493.

This edition applies to IBM i 7.1 (product number 5770-SS1) and to all subsequent releases and modifications until
otherwise indicated in new editions. This version does not run on all reduced instruction set computer (RISC)
models nor does it run on CISC models.
© Copyright IBM Corporation 1998, 2010.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.

Contents
SQL programming . . . 1
What's new for IBM i 7.1 . . . 1
PDF file for SQL programming . . . 4
Introduction to DB2 for i Structured Query Language . . . 4
SQL concepts . . . 5
SQL relational database and system terminology . . . 6
SQL and system naming conventions . . . 7
Types of SQL statements . . . 7
SQL communication area . . . 9
SQL diagnostics area . . . 9
SQL objects . . . 9
Schemas . . . 9
Journals and journal receivers . . . 9
Catalogs . . . 10
Tables, rows, and columns . . . 10
Aliases . . . 10
Views . . . 10
Indexes . . . 11
Constraints . . . 11
Triggers . . . 12
Stored procedures . . . 12
Sequences . . . 12
Global variables . . . 12
User-defined functions . . . 12
User-defined types . . . 13
XSR objects . . . 13
SQL packages . . . 13
Application program objects . . . 13
User source file . . . 15
Output source file member . . . 15
Program . . . 15
SQL package . . . 15
Module . . . 16
Service program . . . 16
Data definition language . . . 16
Creating a schema . . . 16
Creating a table . . . 17
Adding and removing constraints . . . 17
Referential integrity and tables . . . 18
Adding and removing referential constraints . . . 18
Example: Adding referential constraints . . . 19
Example: Removing constraints . . . 20
Check pending . . . 20
Creating a table using LIKE . . . 21
Creating a table using AS . . . 21
Creating and altering a materialized query table . . . 22
Declaring a global temporary table . . . 23
Creating a table with remote server data . . . 23
Creating a row change timestamp column . . . 24
Creating and altering an identity column . . . 24
Using ROWID . . . 25
Creating and using sequences . . . 26
Comparison of identity columns and sequences . . . 27
Defining field procedures . . . 28
Field definition for field procedures . . . 29
Specifying the field procedure . . . 29
When field procedures are invoked . . . 29
Parameter list for execution of field procedures . . . 30
The field procedure parameter value list (FPPVL) . . . 32
Parameter value descriptors for field procedures . . . 32
Field-definition (function code 8) . . . 33
Field-encoding (function code 0) . . . 34
Field-decoding (function code 4) . . . 35
Example field procedure program . . . 36
General guidelines for writing field procedures . . . 37
Index considerations . . . 38
Thread considerations . . . 38
Debug considerations . . . 38
Guidelines for writing field procedures that mask data . . . 38
Example field procedure program that masks data . . . 41
Creating descriptive labels using the LABEL ON statement . . . 42
Describing an SQL object using COMMENT ON . . . 43
Changing a table definition . . . 44
Adding a column . . . 44
Changing a column . . . 44
Allowable conversions of data types . . . 45
Deleting a column . . . 46
Order of operations for the ALTER TABLE statement . . . 46
Using CREATE OR REPLACE TABLE . . . 47
Creating and using ALIAS names . . . 48
Creating and using views . . . 48
WITH CHECK OPTION on a view . . . 50
WITH CASCADED CHECK OPTION . . . 51
WITH LOCAL CHECK OPTION . . . 51
Example: Cascaded check option . . . 52
Creating indexes . . . 53
Creating and using global variables . . . 53
Replacing existing objects . . . 54
Catalogs in database design . . . 54
Getting catalog information about a table . . . 55
Getting catalog information about a column . . . 55
Dropping a database object . . . 55
Data manipulation language . . . 56
Retrieving data using the SELECT statement . . . 56
Basic SELECT statement . . . 56
Specifying a search condition using the WHERE clause . . . 57
Expressions in the WHERE clause . . . 58
Comparison operators . . . 59
NOT keyword . . . 60
GROUP BY clause . . . 60
HAVING clause . . . 62
ORDER BY clause . . . 63
Static SELECT statements . . . 65
Handling null values . . . 65
Special registers in SQL statements . . . 66
Casting data types . . . 68
Date, time, and timestamp data types . . . 68
Specifying current date and time values . . . 68
Date/time arithmetic . . . 69
Row change expressions . . . 69
Handling duplicate rows . . . 69
Defining complex search conditions . . . 70
Special considerations for LIKE . . . 71
Multiple search conditions within a WHERE clause . . . 72
Using OLAP specifications . . . 73
Joining data from more than one table . . . 76
Inner join . . . 76
Left outer join . . . 77
Right outer join . . . 78
Exception join . . . 78
Cross join . . . 79
Full outer join . . . 80
Multiple join types in one statement . . . 81
Using table expressions . . . 81
Using recursive queries . . . 83
Using the UNION keyword to combine subselects . . . 95
Specifying the UNION ALL keyword . . . 98
Using the EXCEPT keyword . . . 100
Using the INTERSECT keyword . . . 102
Data retrieval errors . . . 104
Inserting rows using the INSERT statement . . . 105
Inserting rows using the VALUES clause . . . 106
Inserting rows using a select-statement . . . 107
Inserting multiple rows using the blocked INSERT statement . . . 108
Inserting data into tables with referential constraints . . . 108
Inserting values into an identity column . . . 109
Selecting inserted values . . . 110
Inserting data from a remote database . . . 110
Changing data in a table using the UPDATE statement . . . 111
Updating a table using a scalar-subselect . . . 112
Updating a table with rows from another table . . . 112
Updating tables with referential constraints . . . 113
Examples: UPDATE rules . . . 114
Updating an identity column . . . 114
Updating data as it is retrieved from a table . . . 114
Removing rows from a table using the DELETE statement . . . 116
Removing rows from tables with referential constraints . . . 117
Example: DELETE rules . . . 118
Merging data . . . 119
Using subqueries . . . 120
Subqueries in SELECT statements . . . 121
Subqueries and search conditions . . . 122
Usage notes on subqueries . . . 122
Including subqueries in the WHERE or HAVING clause . . . 123
Correlated subqueries . . . 124
Correlated names and references . . . 124
Example: Correlated subquery in a WHERE clause . . . 125
Example: Correlated subquery in a HAVING clause . . . 126
Example: Correlated subquery in a select-list . . . 127
Example: Correlated subquery in an UPDATE statement . . . 127
Example: Correlated subquery in a DELETE statement . . . 128
Sort sequences and normalization in SQL . . . 128
Sort sequence used with ORDER BY and row selection . . . 129
Sort sequence and ORDER BY . . . 129
Sort sequence and row selection . . . 131
Sort sequence and views . . . 131
Sort sequence and the CREATE INDEX statement . . . 132
Sort sequence and constraints . . . 132
ICU sort sequence . . . 132
Normalization . . . 133
Data protection . . . 134
Security for SQL objects . . . 134
Authorization ID . . . 135
Views . . . 135
Auditing . . . 135
Data integrity . . . 136
Concurrency . . . 136
Journaling . . . 138
Commitment control . . . 138
Savepoints . . . 142
Atomic operations . . . 144
Constraints . . . 145
Adding and using check constraints . . . 146
Save and restore functions . . . 146
Damage tolerance . . . 147
Index recovery . . . 148
Catalog integrity . . . 148
User auxiliary storage pool . . . 149
Independent auxiliary storage pool . . . 149
Routines . . . 149
Stored procedures . . . 149
Defining an external procedure . . . 150
Defining an SQL procedure . . . 151
Defining a procedure with default parameters . . . 156
Calling a stored procedure . . . 158
Using the CALL statement where procedure definition exists . . . 158
Using the embedded CALL statement where no procedure definition exists . . . 159
Using the embedded CALL statement with an SQLDA . . . 160
Using the dynamic CALL statement where no CREATE PROCEDURE exists . . . 161
Examples: CALL statements . . . 161
Returning result sets from stored procedures . . . 167
Example 1: Calling a stored procedure that returns a single result set . . . 168
Example 2: Calling a stored procedure that returns a result set from a nested procedure . . . 169
Writing a program or SQL procedure to receive the result sets from a stored procedure . . . 174
Parameter passing conventions for stored procedures and user-defined functions . . . 180
Indicator variables and stored procedures . . . 185
Returning a completion status to the calling program . . . 187
Passing parameters from DB2 to external procedures . . . 187
Parameter style SQL . . . 187
Parameter style GENERAL . . . 189
Parameter style GENERAL WITH NULLS . . . 189
Parameter style DB2GENERAL . . . 189
Parameter style Java . . . 190
Using user-defined functions . . . 190
UDF concepts . . . 190
Writing UDFs as SQL functions . . . 192
Example: SQL scalar UDFs . . . 193
Example: SQL table UDFs . . . 193
Writing UDFs as external functions . . . 194
Registering UDFs . . . 194
Passing arguments from DB2 to external functions . . . 197
Table function considerations . . . 203
Error processing for UDFs . . . 203
Threads considerations . . . 204
Parallel processing . . . 204
Fenced or unfenced considerations . . . 204
Save and restore considerations . . . 205
Examples: UDF code . . . 205
Example: Square of a number UDF . . . 205
Example: Counter . . . 207
Example: Weather table function . . . 208
Using UDFs in SQL statements . . . 214
Using parameter markers or the NULL values as function arguments . . . 214
Using qualified function references . . . 214
Using unqualified function references . . . 215
Summary of function references . . . 216
Triggers . . . 217
SQL triggers . . . 218
BEFORE SQL triggers . . . 219
AFTER SQL triggers . . . 220
Multiple event SQL triggers . . . 221
INSTEAD OF SQL triggers . . . 222
Handlers in SQL triggers . . . 224
SQL trigger transition tables . . . 225
External triggers . . . 225
Array support in SQL procedures . . . 226
Debugging an SQL routine . . . 227
Obfuscating an SQL routine . . . 228
Managing SQL and external routine objects . . . 230
Improving performance of procedures and functions . . . 231
Improving implementation of procedures and functions . . . 231
Redesigning routines for performance . . . 233
Processing special data types . . . 234
Large objects . . . 234
Large object data types . . . 235
Large object locators . . . 235
Example: Using a locator to work with a CLOB value . . . 236
Example: LOBLOC in C . . . 236
Example: LOBLOC in COBOL . . . 238
Indicator variables and LOB locators . . . 239
LOB file reference variables . . . 239
Example: Extracting CLOB data to a file . . . 240
Example: LOBFILE in C . . . 241
Example: LOBFILE in COBOL . . . 242
Example: Inserting data into a CLOB column . . . 243
Displaying the layout of LOB columns . . . 243
Journal entry layout of LOB columns . . . 243
User-defined distinct types . . . 244
Defining a UDT . . . 244
Example: Money . . . 245
Example: Resumé . . . 245
Defining tables with UDTs . . . 245
Example: Sales . . . 245
Example: Application forms . . . 246
Manipulating UDTs . . . 246
Examples: Using UDTs . . . 246
Example: Comparisons between UDTs and constants . . . 246
Example: Casting between UDTs . . . 247
Example: Comparisons involving UDTs . . . 248
Example: Sourced UDFs involving UDTs . . . 248
Example: Assignments involving UDTs . . . 249
Example: Assignments in dynamic SQL . . . 249
Example: Assignments involving different UDTs . . . 249
Example: Using UDTs in UNION . . . 250
Examples: Using UDTs, UDFs, and LOBs . . . 251
Example: Defining the UDT and UDFs . . . 251
Example: Using the LOB function to populate the database . . . 252
Example: Using UDFs to query instances of UDTs . . . 252
Example: Using LOB locators to manipulate UDT instances . . . 253
Using DataLinks . . . 253
Linking control levels in DataLinks . . . 254
NO LINK CONTROL . . . 254
FILE LINK CONTROL with FS permissions . . . 254
FILE LINK CONTROL with DB permissions . . . 255
Working with DataLinks . . . 255
SQL statements and SQL/XML functions . . . 257
XML input and output overview . . . 258
Comparison of XML and relational models . . . 259
Tutorial for XML . . . 261
Exercise 1: Creating a table that can store XML data . . . 261
Exercise 2: Inserting XML documents into XML typed columns . . . 261
Exercise 3: Updating XML documents stored in an XML column . . . 262
Exercise 4: Validating XML documents against XML schemas . . . 263
Exercise 5: Transforming with XSLT stylesheets . . . 265
Inserting XML data . . . 267
Addition of XML columns to existing tables . . . 267
Insertion into XML columns . . . 268
XML parsing . . . 269
SQL/XML publishing functions for constructing XML values . . . 270
Example: Construct an XML document with values from a single table . . . 271
Example: Construct an XML document with values from multiple tables . . . 272
Example: Construct an XML document with values from table rows that contain null elements . . . 273
Example: Transforming with XSLT stylesheets . . . 273
Example: Using XSLT as a formatting engine . . . 275
Example: Using XSLT for data exchange . . . 277
Example: Using XSLT to remove namespaces . . . 278
Important considerations for transforming XML documents . . . 281
Special character handling in SQL/XML publishing functions . . . 281
XML serialization . . . 282
Differences in an XML document after storage and retrieval . . . 284
Data types for archiving XML documents . . . 284
Using XMLTABLE to reference XML content as a relational table . . . 284
Example: Use XMLTABLE to handle missing elements . . . 286
Example: Use XMLTABLE to subset result data . . . 286
Example: Use XMLTABLE to handle multiple values . . . 287
Example: Use XMLTABLE with namespaces . . . 289
Example: Number result rows for XMLTABLE . . . 291
Updating XML data . . . 292
Deletion of XML data from tables . . . 293
XML schema repository . . . 293
Application programming language support . . . 294
XML column inserts and updates in CLI applications . . . 294
XML data retrieval in CLI applications . . . 295
Declaring XML host variables in embedded SQL applications . . . 296
Example: Referencing XML host variables in embedded SQL applications . . . 297
Recommendations for developing embedded SQL applications with XML . . . 298
Identifying XML values in an SQLDA . . . 298
Java . . . 298
XML data in JDBC applications . . . 298
XML data in SQLJ applications . . . 304
Routines . . . 306
XML support in SQL procedures . . . 307
XML data type support in external routines . . . 307
XML data encoding . . . 311
Encoding considerations when storing or passing XML data . . . 312
Encoding considerations for input of XML data to a database . . . 312
Encoding considerations for retrieval of XML data from a database . . . 312
Encoding considerations for passing XML data in routine parameters . . . 312
Encoding considerations for XML data in JDBC and SQLJ applications . . . 312
Effects of XML encoding and serialization on data conversion . . . 313
Encoding scenarios for input of internally encoded XML data to a database . . . 313
Encoding scenarios for input of externally encoded XML data to a database . . . 315
Encoding scenarios for retrieval of XML data with implicit serialization . . . 316
Encoding scenarios for retrieval of XML data with explicit XMLSERIALIZE . . . 319
Mappings of encoding names to effective CCSIDs for stored XML data . . . 321
Mappings of CCSIDs to encoding names for serialized XML output data . . . 321
Annotated XML schema decomposition . . . 321
Decomposing XML documents with annotated XML schemas . . . 321
Registering and enabling XML schemas for decomposition . . . 322
Sources for annotated XML schema decomposition . . . 322
XML decomposition annotations . . . 322
Specification and scope of XML decomposition annotations . . . 323
Annotations as attributes . . . 323
Annotations as structured child elements . . . 323
Global annotations . . . 324
XML decomposition annotations - Summary . . . 324
db2-xdb:defaultSQLSchema decomposition annotation . . . 326
db2-xdb:rowSet decomposition annotation . . . 327
db2-xdb:table decomposition annotation . . . 331
db2-xdb:column decomposition annotation . . . 333
db2-xdb:locationPath decomposition annotation . . . 335
db2-xdb:expression decomposition annotation . . . 338
db2-xdb:condition decomposition annotation . . . 341
db2-xdb:contentHandling decomposition annotation . . . 344
db2-xdb:normalization decomposition annotation . . . 348
db2-xdb:order decomposition annotation . . . 350
db2-xdb:truncate decomposition annotation . . . 352
db2-xdb:rowSetMapping decomposition annotation . . . 354
db2-xdb:rowSetOperationOrder decomposition annotation . . . 356
Keywords for annotated XML schema decomposition . . . 357
Treatment of CDATA sections in annotated XML schema decomposition . . . 358
NULL values and empty strings in annotated XML schema decomposition . . . 358
Checklist for annotated XML schema decomposition . . . 359
Examples of mappings in annotated XML schema decomposition . . . 360
Annotations of derived complex types . . . 360
Decomposition annotation example: Mapping to an XML column . . . 366
Decomposition annotation example: A value mapped to a single table that yields a single row . . . 367
Decomposition annotation example: A value mapped to a single table that yields multiple rows . . . 368
Decomposition annotation example: A value mapped to multiple tables . . . 370
Decomposition annotation example: Grouping multiple values mapped to a single table . . . 371
Decomposition annotation example: Multiple values from different contexts mapped to a single table . . . 373
XML schema to SQL types compatibility for annotated schema decomposition . . . 375
Limits and restrictions for annotated XML schema decomposition . . . 381
Schema for XML decomposition annotations . . . 382
Using SQL in different environments . . . 383
Using a cursor . . . 383
Types of cursors . . . 383
Examples: Using a cursor . . . 384
Step 1: Defining the cursor . . . 386
Step 2: Opening the cursor . . . 387
Step 3: Specifying what to do when the end of data is reached . . . 387
Step 4: Retrieving a row using a cursor . . . 388
Step 5a: Updating the current row . . . 389
Step 5b: Deleting the current row . . . 389
Step 6: Closing the cursor . . . 389
Using the multiple-row FETCH statement . . . 390
Multiple-row FETCH using a host structure array . . . 390
Multiple-row FETCH using a row storage area . . . 392
Unit of work and open cursors . . . 394
Dynamic SQL applications . . . 395
Designing and running a dynamic SQL application . . . 395
CCSID of dynamic SQL statements . . . 396
Processing non-SELECT statements . . . 396
Using the PREPARE and EXECUTE statements . . . 396
Processing SELECT statements and using a descriptor . . . 397
Fixed-list SELECT statements . . . 397
Varying-list SELECT statements . . . 398
SQL descriptor areas . . . 399
SQLDA format . . . 399
Example: A SELECT statement for allocating storage for SQLDA . . . 402
Example: A SELECT statement using an allocated SQL descriptor . . . 406
Parameter markers . . . 408
Using interactive SQL . . . 410
Starting interactive SQL . . . 411
Using the statement entry function . . . 412
Prompting . . . 412
Syntax checking . . . 414
Statement processing mode . . . 414
Subqueries . . . 414
CREATE TABLE prompting . . . 414
Entering DBCS data . . . 414
Using the list selection function . . . 415
Example: Using the list selection function . . . 415
Session services description . . . 417
Exiting interactive SQL . . . 418
Using an existing SQL session . . . 419
Recovering an SQL session . . . 419
Accessing remote databases with interactive SQL . . . 419
Using the SQL statement processor . . . 421
Execution of statements after errors occur . . . 423
Commitment control in the SQL statement processor . . . 423
Source listing for the SQL statement processor . . . 423
Using the RUNSQL CL command . . . 425
Distributed relational database function and SQL . . . 427
DB2 for i distributed relational database support . . . 427
DB2 for i distributed relational database example program . . . 428
SQL package support . . . 429
Valid SQL statements in an SQL package . . . 430
Considerations for creating an SQL package . . . 430
CRTSQLPKG authorization . . . 430
Creating a package on a database other than DB2 for i . . . 430
Target release (TGTRLS) parameter . . . 431
SQL statement size . . . 431
Statements that do not require a package . . . 431
Package object type . . . 432
ILE programs and service programs . . . 432
Package creation connection . . . 432
Unit of work . . . 432
Creating packages locally . . . 432
Labels . . . 432
Consistency token . . . 432
SQL and recursion . . . 433
CCSID considerations for SQL . . . 433
Connection management and activation groups . . . 433
Source code for PGM1 . . . 434
Source code for PGM2 . . . 434
Source code for PGM3 . . . 435
Multiple connections to the same relational database . . . 437
Implicit connection management for the default activation group . . . 438
Implicit connection management for nondefault activation groups . . . 439
Distributed support . . . 439
Determining the connection type . . . 440
Connect and commitment control restrictions . . . 442
Determining the connection status . . . 442
Distributed unit of work connection considerations . . . 444
Ending connections . . . 444
Distributed unit of work . . . 445
Managing distributed unit of work connections . . . 445
Checking the connection status . . . 447
Cursors and prepared statements . . . 447
DRDA stored procedure considerations . . . 448
WebSphere MQ with DB2 . . . 448
WebSphere MQ messages . . . 449
WebSphere MQ message handling . . . 449
DB2 MQ services . . . 450
DB2 MQ policies . . . 450
DB2 MQ functions . . . 451
DB2 MQ dependencies . . . 452
DB2 MQ tables . . . 453
DB2 MQ CCSID conversion . . . 458
WebSphere MQ transactions . . . 459
Basic messaging with WebSphere MQ . . . 460
Sending messages with WebSphere MQ . . . 461
Retrieving messages with WebSphere MQ . . . 462
Application to application connectivity with WebSphere MQ . . . 463
Reference . . . 463
DB2 for i sample tables . . . 463
Department table (DEPARTMENT) . . . 464
DEPARTMENT . . . 465
Employee table (EMPLOYEE) . . . 465
EMPLOYEE . . . 466
Employee photo table (EMP_PHOTO) . . . 467
EMP_PHOTO . . . 468
Employee resumé table (EMP_RESUME) . . . 468
EMP_RESUME . . . 469
Employee to project activity table (EMPPROJACT) . . . 469
EMPPROJACT . . . 470
Project table (PROJECT) . . . 471
PROJECT . . . 472
Project activity table (PROJACT) . . . 473
PROJACT . . . 474
Activity table (ACT) . . . 476
ACT . . . 476
Class schedule table (CL_SCHED) . . . 477
CL_SCHED . . . 477
In-tray table (IN_TRAY) . . . 477
IN_TRAY . . . 478
Organization table (ORG) . . . 478
ORG . . . 479
Staff table (STAFF) . . . 479
STAFF . . . 480
Sales table (SALES) . . . 481
SALES . . . 481
Sample XML tables . . . 482
Product table (PRODUCT) . . . 482
PRODUCT . . . 483
Purchase order table (PURCHASEORDER) . . . 484
PURCHASEORDER . . . 485
Customer table (CUSTOMER) . . . 487
CUSTOMER . . . 488
Catalog table (CATALOG) . . . 489
CATALOG . . . 489
Suppliers table (SUPPLIERS) . . . 489
SUPPLIERS . . . 490
Inventory table (INVENTORY) . . . 490
INVENTORY . . . 490
Product Supplier table (PRODUCTSUPPLIER) . . . 490
PRODUCTSUPPLIER . . . 491
DB2 for i CL command descriptions . . . 491

Appendix. Notices . . . 493
Programming interface information . . . 494
Trademarks . . . 495
Terms and conditions . . . 495

SQL programming
The DB2 for IBM i database provides a wide range of support for Structured Query Language (SQL).
The examples of SQL statements shown in this topic collection are based on the sample tables and
assume that the following statements are true:
• They are shown in the interactive SQL environment or they are written in ILE C or in COBOL. EXEC
  SQL and END-EXEC are used to delimit an SQL statement in a COBOL program.
• Each SQL example is shown on several lines, with each clause of the statement on a separate line.
• SQL keywords are highlighted.
• Table names provided in the sample tables use the schema CORPDATA. Table names that are not
  found in the Sample Tables should use schemas you create.
• Calculated columns are enclosed in parentheses, (), and brackets, [].
• The SQL naming convention is used.
• The APOST and APOSTSQL precompiler options are assumed although they are not the default
  options in COBOL. Character string literals within SQL and host language statements are delimited by
  single-quotation marks (').
• A sort sequence of *HEX is used, unless otherwise noted.
Whenever the examples vary from these assumptions, it is stated.
Because this topic collection is for the application programmer, most of the examples are shown as if they
were written in an application program. However, many examples can be slightly changed and run
interactively by using interactive SQL. The syntax of an SQL statement, when using interactive SQL,
differs slightly from the format of the same statement when it is embedded in a program.
Note: By using the code examples, you agree to the terms of the "Code license and disclaimer
information" on page 491.
Related concepts:
Embedded SQL programming
Related reference:
"DB2 for i sample tables" on page 463
These sample tables are referred to and used in the SQL programming and the SQL reference topic
collections.
DB2 for i5/OS SQL reference

What's new for IBM i 7.1

Read about new or significantly changed information for the SQL programming topic collection.

Result sets returned to programs

|
|
|

You can return a result set from a stored procedure to a program and have the program consume the
result set. For more information, see Writing a program or SQL procedure to receive the result sets from
a stored procedure on page 174.


XML

The XML data type has been added to SQL. This includes internal handling of XML values and XML
schema decomposition. For more information, see SQL statements and SQL/XML functions on page
257.

Field procedures


A field procedure is a user-written exit routine that transforms values in a single column. When values in
the column are changed, or new values are inserted, the field procedure is invoked for each value and can
transform that value (encode it) in any way. The encoded value is then stored. When values are retrieved
from the column, the field procedure is invoked for each encoded value and must decode it
back to the original value. For more information, see Defining field procedures on page 28.

MERGE statement

The MERGE statement can be used to either insert new rows or update existing rows in a table or view,
depending on whether they already exist in the table or view. For more information, see Merging data
on page 119.
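As a rough illustration of the insert-or-update behavior, the following sketch merges summarized transactions into an account table. The tables MYLIB.ACCOUNT and MYLIB.TRANS and their columns are hypothetical names chosen for this example; they are not part of the sample tables.

```sql
-- Hypothetical tables: MYLIB.ACCOUNT (ID, BALANCE) and MYLIB.TRANS (ID, AMOUNT).
MERGE INTO MYLIB.ACCOUNT AS A
  USING (SELECT ID, SUM(AMOUNT) AS AMOUNT
           FROM MYLIB.TRANS
           GROUP BY ID) AS T
  ON A.ID = T.ID
  -- The account already exists: update its balance.
  WHEN MATCHED THEN
    UPDATE SET BALANCE = A.BALANCE + T.AMOUNT
  -- The account does not exist yet: insert a new row.
  WHEN NOT MATCHED THEN
    INSERT (ID, BALANCE) VALUES (T.ID, T.AMOUNT);
```

The source is grouped by ID so that each target row is matched at most once.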

Global variables

Global variables can be created and used in SQL. For more information, see Creating and using global
variables on page 53.
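A minimal sketch of creating and using a global variable follows. The schema MYLIB and the variable name REPORT_YEAR are assumptions made for this example; the query uses the sample table CORPDATA.EMPLOYEE.

```sql
-- MYLIB and REPORT_YEAR are hypothetical names.
CREATE VARIABLE MYLIB.REPORT_YEAR INTEGER DEFAULT 1980;

-- Change the value of the variable for the current session.
SET MYLIB.REPORT_YEAR = 1975;

-- Use the variable in an expression, here in a WHERE clause.
SELECT EMPNO, LASTNAME, HIREDATE
  FROM CORPDATA.EMPLOYEE
  WHERE YEAR(HIREDATE) = MYLIB.REPORT_YEAR;
```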

Arrays in SQL procedures

An array data type has been added for use in SQL procedures. For more information, see Array support
in SQL procedures on page 226 and Debugging an SQL routine on page 227.

WebSphere MQ with DB2


DB2 provides an application programming interface to the WebSphere MQ message handling system
through a set of external user-defined functions, which are called DB2 MQ functions. You can use these
functions in SQL statements to combine DB2 database access with WebSphere MQ message handling. For
more information, see WebSphere MQ with DB2 on page 448.

Other functional changes to the SQL programming information


v An option to replace an existing object has been added to many DDL CREATE statements. See
Replacing existing objects on page 54.

What's new as of April 2015

The OR REPLACE option has been added to the CREATE TABLE statement. For more information, see
Using CREATE OR REPLACE TABLE on page 47.

What's new as of October 2014

Pipelined table functions

A pipelined SQL table function is a more flexible version of a table function. For more information, see
Example: SQL table UDFs on page 193.


SQL variable debug for SQL routines


Debugging of SQL routines allows you to display values of SQL variables. For more information, see
Debugging an SQL routine on page 227.

What's new as of February 2013

Multiple event triggers


A trigger can be defined for more than one event. For more information, see Multiple event SQL
triggers on page 221.

System name for tables, views, and indexes


When creating a table, view, or index, the system name for the object can be specified on the create
statement. For more information, see Creating a table on page 17 and Creating and using views on
page 48.

What's new as of October 2012

Defaults for procedure parameters and using parameter names in CALL


You can define parameters for SQL and external procedures to have default values. Parameters with
default values can be omitted when calling the procedure. The CALL statement can specify parameter
names for any arguments. For more information, see Defining a procedure with default parameters on
page 156.
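The following sketch shows one way a default value and named arguments might look. The procedure MYLIB.RAISE_SALARY and its parameter names are hypothetical; the UPDATE uses the sample table CORPDATA.EMPLOYEE.

```sql
-- Hypothetical procedure: P_PCT defaults to 3.00 when it is omitted on the CALL.
CREATE PROCEDURE MYLIB.RAISE_SALARY (
    IN P_EMPNO CHAR(6),
    IN P_PCT   DECIMAL(5,2) DEFAULT 3.00)
  LANGUAGE SQL
BEGIN
  UPDATE CORPDATA.EMPLOYEE
    SET SALARY = SALARY * (1 + P_PCT / 100)
    WHERE EMPNO = P_EMPNO;
END;

-- Omit the parameter that has a default value.
CALL MYLIB.RAISE_SALARY('000010');

-- Specify arguments by parameter name, in any order.
CALL MYLIB.RAISE_SALARY(P_PCT => 5.00, P_EMPNO => '000010');
```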

CREATE TABLE referencing a remote table


You can create a local table with the definition and data pulled from a non-local table. For more
information, see Creating a table with remote server data on page 23.

SQL and external routine management


Procedures and functions are tied to system objects that can be administered with CL commands. For
more information, see Managing SQL and external routine objects on page 230.

RUNSQLSTM OPTION parameter


The RUNSQLSTM command does not need to generate a listing. For more information, see Using the
SQL statement processor on page 421.

What's new as of April 2012

Obfuscation


The content of an SQL procedure or SQL function can be obfuscated. For more information, see
Obfuscating an SQL routine on page 228.

XMLTABLE


The XMLTABLE table function provides a way to work with XML content as a relational table. For more
information, see Using XMLTABLE to reference XML content as a relational table on page 284. In
addition, SQL XML Programming contains all the new and existing XML information that is related to
SQL.
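As a small sketch of the idea, the query below turns an XML document into rows and columns. The XML content, the variable name "doc", and the column names are made up for this example; in practice the document would usually come from an XML column.

```sql
-- Each <employee> element becomes one row of the result table X.
SELECT X.EMPNO, X.LASTNAME
  FROM XMLTABLE('$doc/dept/employee'
         PASSING XMLPARSE(DOCUMENT
           '<dept>
              <employee id="000010"><name>HAAS</name></employee>
              <employee id="000020"><name>THOMPSON</name></employee>
            </dept>') AS "doc"
         COLUMNS
           EMPNO    CHAR(6)     PATH '@id',   -- value of the id attribute
           LASTNAME VARCHAR(15) PATH 'name'   -- text of the name element
       ) AS X;
```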


Insert from a remote table

You can insert into a local table with data pulled from a non-local table. For more information, see
Inserting data from a remote database on page 110.

RUNSQL

This new CL command runs a single SQL statement. For more information, see Using the RUNSQL CL
command on page 425.

What's new as of April 2011

Connect by

Hierarchical queries can be defined using the CONNECT BY syntax. For more information, see Using
recursive queries on page 83.
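A minimal sketch of a hierarchical query over the sample table CORPDATA.DEPARTMENT follows. It assumes that ADMRDEPT holds the number of the department that a department reports to. NOCYCLE is specified because a department can report to itself in the sample data.

```sql
-- Walk the department hierarchy downward, starting at department A00.
-- LEVEL is 1 for the starting row and increases by one for each generation.
SELECT LEVEL, DEPTNO, DEPTNAME, ADMRDEPT
  FROM CORPDATA.DEPARTMENT
  START WITH DEPTNO = 'A00'
  CONNECT BY NOCYCLE PRIOR DEPTNO = ADMRDEPT;
```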

How to see what's new or changed

To help you see where technical changes have been made, the information center uses:
v An image to mark where new or changed information begins.
v An image to mark where new or changed information ends.
In PDF files, you might see revision bars (|) in the left margin of new and changed information.
To find other information about what's new or changed this release, see the Memo to users.

PDF file for SQL programming


You can view and print a PDF file of this information.
To view or download the PDF version of this document, select SQL programming (about 1,714 KB).

Saving PDF files


To save a PDF on your workstation for viewing or printing:
1. Right-click the PDF link in your browser.
2. Click the option that saves the PDF locally.
3. Navigate to the directory in which you want to save the PDF.
4. Click Save.

Downloading Adobe Reader


You need Adobe Reader installed on your system to view or print these PDFs. You can download a free
copy from the Adobe Web site (http://get.adobe.com/reader/).

Introduction to DB2 for i Structured Query Language


Structured Query Language (SQL) is a standardized language for defining and manipulating data in a
relational database. These topics describe the System i implementation of SQL using the DB2 for i
database and the IBM DB2 Query Manager and SQL Development Kit for i licensed program.


SQL manages information based on the relational model of data. SQL statements can be embedded in
high-level languages, dynamically prepared and run, or run interactively. For information about
embedded SQL, see Embedded SQL programming.
SQL consists of statements and clauses that describe what you want to do with the data in a database
and under what conditions you want to do it.
SQL can access data in a remote relational database, using the IBM Distributed Relational Database
Architecture (DRDA).
Related concepts:
Distributed database programming
Related reference:
Distributed relational database function and SQL on page 427
A distributed relational database consists of a set of SQL objects that are spread across interconnected
computer systems.

SQL concepts
DB2 for i SQL consists of several main parts, such as SQL runtime support, precompilers, and interactive
SQL.
v SQL runtime support
SQL run time parses SQL statements and runs any SQL statements. This support is part of the IBM i
licensed program, which allows applications that contain SQL statements to be run on systems where
the IBM DB2 Query Manager and SQL Development Kit for i licensed program is not installed.
v SQL precompilers
SQL precompilers support precompiling embedded SQL statements in host languages. The following
languages are supported:
ILE C
ILE C++
ILE COBOL
COBOL
PL/I
RPG III (part of RPG)
ILE RPG
The SQL host language precompilers prepare an application program that contains SQL statements.
The host language compilers then compile the precompiled host source programs. For more
information about precompiling, see Preparing and running a program with SQL statements in the
Embedded SQL programming information. The precompiler support is part of the IBM DB2 Query
Manager and SQL Development Kit for i licensed program.
v SQL interactive interface
The SQL interactive interface allows you to create and run SQL statements. For more information about
interactive SQL, see Using interactive SQL on page 410. Interactive SQL is part of the IBM DB2
Query Manager and SQL Development Kit for i licensed program.
v Run SQL Scripts
The Run SQL Scripts window in System i Navigator allows you to create, edit, run, and troubleshoot
scripts of SQL statements.
v Run SQL Statements (RUNSQLSTM) CL command
The RUNSQLSTM command can be used to run a series of SQL statements that are stored in a source
file or a source stream file. For more information about the RUNSQLSTM command, see Using the
SQL statement processor on page 421.
v DB2 Query Manager
DB2 Query Manager provides a prompt-driven interactive interface that allows you to create data, add
data, maintain data, and run reports on the databases. Query Manager is part of the IBM DB2 Query
Manager and SQL Development Kit for i licensed program. For more information, see Query Manager
Use.
v SQL REXX interface
The SQL REXX interface allows you to run SQL statements in a REXX procedure. For more information
about using SQL statements in REXX procedures, see Coding SQL statements in REXX applications in
the Embedded SQL programming information.
v SQL call level interface
The DB2 for i database supports the SQL call level interface. This allows users of any of the ILE
languages to access SQL functions directly through bound calls to a service program that is provided
by the system. Using the SQL call level interface, you can perform all the SQL functions without the
need to precompile. This is a standard set of procedure calls to prepare SQL statements, run SQL
statements, fetch rows of data, and even perform advanced functions, such as accessing the catalogs
and binding program variables to output columns.
For a complete description of all the available functions and their syntax, see SQL call level interface in
the Database section of the IBM i Information Center.
v Process Extended Dynamic SQL (QSQPRCED) API
This application programming interface (API) provides an extended dynamic SQL capability. You can
prepare SQL statements into an SQL package and run them by using this API. Statements that are
prepared into a package by this API persist until the package or statement is explicitly dropped. For
more information about the QSQPRCED API, see Process Extended Dynamic SQL (QSQPRCED) API.
For general information about APIs, see Application programming interfaces.
v Syntax Check SQL Statement (QSQCHKS) API
This API syntax checks SQL statements. For more information about the QSQCHKS API, see Syntax
Check SQL Statement (QSQCHKS) API. For general information about APIs, see Application
programming interfaces.
v DB2 Multisystem
This feature of the operating system allows your data to be distributed across multiple systems. For
more information, see DB2 Multisystem.
v DB2 Symmetric Multiprocessing
This feature of the operating system provides the query optimizer with additional methods for
retrieving data that include parallel processing. Symmetric multiprocessing (SMP) is a form of
parallelism achieved on a single system where multiple processors (CPU and I/O processors) that
share memory and disk resources work simultaneously toward achieving a single end result. This
parallel processing means that the database manager can have more than one (or all) of the system
processors working on a single query simultaneously. For more information, see Controlling parallel
processing for queries in the Database performance and query optimization topic collection.

SQL relational database and system terminology


In the relational model of data, all data is perceived as existing in tables. DB2 for i objects are created and
maintained as system objects.
The following table shows the relationship between system terms and SQL relational database terms.
Table 1. Relationship of system terms to SQL terms
Each system term is listed with its corresponding SQL term.

System term: Library. Groups related objects and allows you to find the objects by name.
SQL term: Schema. Consists of a library, a journal, a journal receiver, an SQL catalog, and optionally a data
dictionary. A schema groups related objects and allows you to find the objects by name.

System term: Physical file. A set of records.
SQL term: Table. A set of columns and rows.

System term: Record. A set of fields.
SQL term: Row. The horizontal part of a table containing a serial set of columns.

System term: Field. One or more characters of related information of one data type.
SQL term: Column. The vertical part of a table of one data type.

System term: Logical file. A subset of fields and records of one or more physical files.
SQL term: View. A subset of columns and rows of one or more tables.

System term: SQL package. An object type that is used to run SQL statements.
SQL term: Package. An object type that is used to run SQL statements.

System term: User Profile
SQL term: Authorization name or Authorization ID.

Related concepts:
Distributed database programming

SQL and system naming conventions


You can use either the system (*SYS) or the SQL (*SQL) naming convention in DB2 for i programming.
The naming convention used affects the method for qualifying file and table names and the terms used
on the interactive SQL displays. The naming convention used is selected by a parameter on the SQL
commands or by using the SET OPTION statement.

System naming (*SYS)


In the system naming convention, tables and other SQL objects in an SQL statement are qualified by
schema name in the form:
schema/table

SQL naming (*SQL)


In the SQL naming convention, tables and other SQL objects in an SQL statement are qualified by schema
name in the form:
schema.table
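For illustration, the same query against the sample table EMPLOYEE in schema CORPDATA might be written as follows under each convention. Only the qualifier separator differs in this sketch.

```sql
-- System naming (*SYS): the schema and table are separated by a slash.
SELECT EMPNO, LASTNAME FROM CORPDATA/EMPLOYEE;

-- SQL naming (*SQL): the schema and table are separated by a period.
SELECT EMPNO, LASTNAME FROM CORPDATA.EMPLOYEE;
```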

Related reference:
Qualification of unqualified object names

Types of SQL statements


There are several basic types of SQL statements. They are listed here according to their functions.
v SQL schema statements, also known as data definition language (DDL) statements
v SQL data and data change statements, also known as data manipulation language (DML) statements
v Dynamic SQL statements
v Embedded SQL host language statements


SQL schema statements

ALTER FUNCTION
ALTER PROCEDURE
ALTER SEQUENCE
ALTER TABLE
COMMENT ON
CREATE ALIAS
CREATE FUNCTION
CREATE INDEX
CREATE PROCEDURE
CREATE SCHEMA
CREATE SEQUENCE
CREATE TABLE
CREATE TRIGGER
CREATE TYPE
CREATE VARIABLE
CREATE VIEW
DROP
GRANT
LABEL ON
RENAME
REVOKE

SQL data statements


ALLOCATE CURSOR
ASSOCIATE LOCATORS
CLOSE
DECLARE CURSOR
DELETE
FETCH
FREE LOCATOR
HOLD LOCATOR
INSERT
LOCK TABLE
OPEN
REFRESH TABLE
SELECT INTO
SET variable
UPDATE
VALUES INTO

SQL data change statements


DELETE
INSERT
MERGE
UPDATE

SQL connection statements


CONNECT
DISCONNECT
RELEASE
SET CONNECTION

SQL transaction statements


COMMIT
RELEASE SAVEPOINT
ROLLBACK
SAVEPOINT
SET TRANSACTION

SQL session statements


DECLARE GLOBAL TEMPORARY TABLE
SET CURRENT DECFLOAT ROUNDING MODE
SET CURRENT DEGREE
SET CURRENT IMPLICIT XMLPARSE OPTION
SET ENCRYPTION PASSWORD
SET PATH
SET SCHEMA
SET SESSION AUTHORIZATION

Dynamic SQL statements


ALLOCATE DESCRIPTOR
DEALLOCATE DESCRIPTOR
DESCRIBE
DESCRIBE CURSOR
DESCRIBE INPUT
DESCRIBE PROCEDURE
DESCRIBE TABLE
EXECUTE
EXECUTE IMMEDIATE
GET DESCRIPTOR
PREPARE
SET DESCRIPTOR

Embedded SQL host language statements


BEGIN DECLARE SECTION
DECLARE PROCEDURE
DECLARE STATEMENT
DECLARE VARIABLE
END DECLARE SECTION
GET DIAGNOSTICS
INCLUDE
SET OPTION
SET RESULT SETS
SIGNAL
WHENEVER

SQL control statements


CALL


SQL statements can operate on objects that are created by SQL as well as externally described physical
files and single-format logical files. They do not refer to the interactive data definition utility (IDDU)
dictionary definition for program-described files. Program-described files appear as a table with only a
single column.
Related concepts:
Data definition language on page 16
Data definition language (DDL) describes the portion of SQL that creates, alters, and deletes database
objects. These database objects include schemas, tables, views, sequences, catalogs, indexes, and aliases.
Data manipulation language on page 56
Data manipulation language (DML) describes the portion of SQL that manipulates or controls data.
Related reference:
DB2 for i5/OS SQL reference

SQL communication area


The SQL communication area (SQLCA) is a set of variables that provides an application program with
information about its execution of SQL statements. The SQLCA is updated at the end of the execution of
every SQL statement.
Related concepts:
SQLCA (SQL communication area)
Handling SQL error return codes using the SQLCA

SQL diagnostics area


The SQL diagnostics area maintained by the database manager provides information about the SQL
statement that is most recently run. Your application program can access the SQL diagnostics area using
the GET DIAGNOSTICS statement.
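As a sketch of how an SQL procedure might read the diagnostics area, the following example returns the number of rows changed by an UPDATE. The procedure name MYLIB.RAISE_DEPT and its parameters are hypothetical.

```sql
-- Hypothetical procedure that reports how many rows its UPDATE changed.
CREATE PROCEDURE MYLIB.RAISE_DEPT (
    IN  P_DEPT CHAR(3),
    OUT P_ROWS INTEGER)
  LANGUAGE SQL
BEGIN
  UPDATE CORPDATA.EMPLOYEE
    SET SALARY = SALARY * 1.05
    WHERE WORKDEPT = P_DEPT;

  -- ROW_COUNT describes the most recently run statement, the UPDATE above.
  GET DIAGNOSTICS P_ROWS = ROW_COUNT;
END;
```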
Related concepts:
Using the SQL diagnostics area
Related reference:
GET DIAGNOSTICS statement

SQL objects

SQL objects are schemas, journals, catalogs, tables, aliases, views, indexes, constraints, triggers, sequences,
stored procedures, user-defined functions, user-defined types, global variables, and SQL packages. SQL
creates and maintains these objects as system objects.

Schemas
A schema provides a logical grouping of SQL objects. A schema consists of a library, a journal, a journal
receiver, a catalog, and, optionally, a data dictionary.
Tables, views, and system objects (such as programs) can be created, moved, or restored into any system
library. All system files can be created or moved into an SQL schema if the SQL schema does not contain
a data dictionary. If the SQL schema contains a data dictionary then:
v Source physical files or nonsource physical files with one member can be created, moved, or restored
into an SQL schema.
v Logical files cannot be placed in an SQL schema because they cannot be described in the data
dictionary.
You can create and own many schemas.
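A minimal sketch of creating a schema and a table in it is shown here. The names MYLIB and NOTES are hypothetical.

```sql
-- Creates the schema, which consists of a library, a journal, a journal
-- receiver, and a catalog, as described above.
CREATE SCHEMA MYLIB;

-- Tables created in the schema are qualified with the schema name.
CREATE TABLE MYLIB.NOTES (
  NOTE_ID   INTEGER      NOT NULL,
  NOTE_TEXT VARCHAR(200));
```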

Journals and journal receivers


A journal and a journal receiver are used to record changes to tables and views in the database.

Journals and journal receivers are used in processing the SQL COMMIT, ROLLBACK, SAVEPOINT, and
RELEASE SAVEPOINT statements. Journals and journal receivers can also be used as audit trails or for
forward or backward recovery.
Related concepts:
Journal management
Commitment control

Catalogs
An SQL catalog is a collection of tables and views that describe tables, views, indexes, procedures,
functions, sequences, triggers, variables, constraints, programs, packages, and XSR objects.
This information is contained in a set of cross-reference tables in libraries QSYS and QSYS2. In each SQL
schema there is a set of views built over the catalog tables that contains information about the objects in
the schema.
A catalog is automatically created when you create a schema. You cannot drop or explicitly change the
catalog.
Related reference:
Catalog

Tables, rows, and columns


A table is a two-dimensional arrangement of data that consists of rows and columns.
The row is the horizontal part containing one or more columns. The column is the vertical part
containing one or more rows of data of one data type. All data for a column must be of the same type. A
table in SQL is a keyed or non-keyed physical file.
A materialized query table is a table that is used to contain materialized data that is derived from one or
more source tables specified by a select-statement.
A partitioned table is a table whose data is contained in one or more local partitions (members).
Related concepts:
DB2 Multisystem
Related reference:
Data types
Creating and altering a materialized query table on page 22
A materialized query table is a table whose definition is based on the result of a query, and whose data is in
the form of precomputed results that are taken from the table or tables on which the materialized query
table definition is based.

Aliases
An alias is an alternate name for a table or view.
You can use an alias to refer to a table or view in those cases where an existing table or view can be
referred to. Additionally, aliases can be used to join table members.
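The following sketch shows both uses. MYLIB.EMP, MYLIB.SALES, and the member name JANUARY are hypothetical; MYLIB.SALES is assumed to be a file with more than one member.

```sql
-- An alternate name for an existing table.
CREATE ALIAS MYLIB.EMP FOR CORPDATA.EMPLOYEE;

-- The alias can be used wherever the table itself can be referred to.
SELECT EMPNO, LASTNAME FROM MYLIB.EMP;

-- An alias for one member of a multiple-member file.
CREATE ALIAS MYLIB.SALES_JAN FOR MYLIB.SALES (JANUARY);
```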
Related reference:
Aliases

Views
A view appears like a table to an application program. However, a view contains no data and only
logically represents one or more tables over which it is created.


A view can contain all the columns and rows of the given tables or a subset of them. The columns can be
arranged differently in a view than they are in the tables from which they are taken. A view in SQL is a
special form of a nonkeyed logical file.
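As an illustration, the following sketch defines a view over a subset of the columns and rows of the sample table CORPDATA.EMPLOYEE. The view name MYLIB.EMP_D11 is hypothetical.

```sql
-- The view stores no data; it only describes which rows and columns to show.
CREATE VIEW MYLIB.EMP_D11 (EMPNO, LASTNAME, WORKDEPT) AS
  SELECT EMPNO, LASTNAME, WORKDEPT
    FROM CORPDATA.EMPLOYEE
    WHERE WORKDEPT = 'D11';

-- An application queries the view as if it were a table.
SELECT * FROM MYLIB.EMP_D11 ORDER BY LASTNAME;
```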
Related reference:
Views

Indexes
An SQL index is a subset of the data in the columns of a table that are logically arranged in either
ascending or descending order.
Each index contains a separate arrangement. These arrangements are used for ordering (ORDER BY
clause), grouping (GROUP BY clause), and joining. An SQL index is a keyed logical file.
The index is used by the system for faster data retrieval. Creating an index is optional. You can create
any number of indexes. You can create or drop an index at any time. The index is automatically
maintained by the system. However, because the indexes are maintained by the system, a large number
of indexes can adversely affect the performance of the applications that change the table.
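A short sketch of creating and dropping an index follows. The index name MYLIB.EMP_DEPT_IX is hypothetical, and whether the system actually uses the index for a given query is decided by the query optimizer.

```sql
-- An index that might help queries that order, group, or join by department.
CREATE INDEX MYLIB.EMP_DEPT_IX
  ON CORPDATA.EMPLOYEE (WORKDEPT ASC, LASTNAME ASC);

-- An index can be dropped at any time without affecting the data in the table.
DROP INDEX MYLIB.EMP_DEPT_IX;
```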
Related concepts:
Creating an index strategy

Constraints
A constraint is a rule enforced by the database manager to limit the values that can be inserted, deleted,
or updated in a table.
DB2 for i supports the following constraints:
v Unique constraints
A unique constraint is the rule that the values of the key are valid only if they are unique. You can
create a unique constraint using the CREATE TABLE or ALTER TABLE statement. Although the
CREATE INDEX statement can create a unique index that also guarantees uniqueness, such an index is
not a constraint.
Unique constraints are enforced during the execution of INSERT and UPDATE statements. A PRIMARY
KEY constraint is a form of the UNIQUE constraint. The difference is that a PRIMARY KEY cannot
contain any nullable columns.
v Referential constraints
A referential constraint is the rule that the values of the foreign key are valid only if one of the following
conditions is met:
They appear as values of a parent key.
Some component of the foreign key is null.
Referential constraints are enforced during the execution of INSERT, UPDATE, and DELETE
statements.
v Check constraints
A check constraint is the rule that limits the values allowed in a column or group of columns. You can
create a check constraint using the CREATE TABLE or ALTER TABLE statement. Check constraints are
enforced during the execution of INSERT and UPDATE statements. To satisfy the constraint, each row
of data inserted or updated in the table must make the specified condition either TRUE or unknown
(because of a null value).
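The three kinds of constraints can be sketched together on two small tables. All table, column, and constraint names here are hypothetical and were chosen only for this example.

```sql
CREATE TABLE MYLIB.DEPT (
  DEPTNO   CHAR(3)     NOT NULL,
  DEPTNAME VARCHAR(36) NOT NULL,
  -- Unique constraints: a primary key and an additional unique key.
  CONSTRAINT DEPT_PK      PRIMARY KEY (DEPTNO),
  CONSTRAINT DEPT_NAME_UQ UNIQUE (DEPTNAME));

CREATE TABLE MYLIB.EMP (
  EMPNO    CHAR(6) NOT NULL,
  WORKDEPT CHAR(3),
  SALARY   DECIMAL(9,2),
  CONSTRAINT EMP_PK PRIMARY KEY (EMPNO),
  -- Referential constraint: WORKDEPT must be null or match a DEPTNO in DEPT.
  CONSTRAINT EMP_DEPT_FK FOREIGN KEY (WORKDEPT)
    REFERENCES MYLIB.DEPT (DEPTNO) ON DELETE SET NULL,
  -- Check constraint: satisfied when the condition is true or unknown (null).
  CONSTRAINT EMP_SALARY_CK CHECK (SALARY >= 0));
```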
Related reference:
Constraints on page 145
The DB2 for i database supports unique, referential, and check constraints.


Triggers
A trigger is a set of actions that runs automatically whenever a specified event occurs to a specified table
or view.
An event can be an insert, an update, a delete, or a read operation. A trigger can run either before or
after the event. DB2 for i supports SQL insert, update, and delete triggers and external triggers.
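As a sketch, an SQL update trigger that records salary changes might look like the following. The trigger name and the log table MYLIB.SALARY_LOG with its columns are hypothetical; the log table is assumed to exist already.

```sql
-- Runs after each row update that changes SALARY in the sample EMPLOYEE table.
CREATE TRIGGER MYLIB.SALARY_AUDIT
  AFTER UPDATE OF SALARY ON CORPDATA.EMPLOYEE
  REFERENCING OLD AS O NEW AS N
  FOR EACH ROW
  MODE DB2ROW
  WHEN (N.SALARY <> O.SALARY)
  BEGIN ATOMIC
    -- Record the old and new values in the hypothetical log table.
    INSERT INTO MYLIB.SALARY_LOG (EMPNO, OLD_SALARY, NEW_SALARY, CHANGED_AT)
      VALUES (N.EMPNO, O.SALARY, N.SALARY, CURRENT TIMESTAMP);
  END;
```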
Related tasks:
Triggering automatic events in your database

Stored procedures
A stored procedure is a program that can be called with the SQL CALL statement.
DB2 for i supports external procedures and SQL procedures. An external procedure can be any system
program, service program, or REXX procedure. It cannot be a System/36 program or procedure. An SQL
procedure is defined entirely in SQL and can contain SQL statements, including SQL control statements.
Related concepts:
Stored procedures on page 149
A procedure (often called a stored procedure) is a program that can be called to perform operations. A
procedure can include both host language statements and SQL statements. Procedures in SQL provide the
same benefits as procedures in a host language.

Sequences
A sequence is a data area object that provides a quick and easy way of generating unique numbers.
You can use a sequence to replace an identity column or a user-generated numeric column. A sequence
has uses similar to these alternatives.
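A minimal sketch of generating order numbers with a sequence follows. The sequence MYLIB.ORDER_SEQ and the table MYLIB.ORDERS are hypothetical; the table is assumed to exist already.

```sql
CREATE SEQUENCE MYLIB.ORDER_SEQ
  AS INTEGER
  START WITH 1
  INCREMENT BY 1
  NO CYCLE;

-- NEXT VALUE generates a new number each time it is evaluated.
INSERT INTO MYLIB.ORDERS (ORDERNO, CUSTNO)
  VALUES (NEXT VALUE FOR MYLIB.ORDER_SEQ, 123456);

-- PREVIOUS VALUE returns the number most recently generated in this session.
SELECT PREVIOUS VALUE FOR MYLIB.ORDER_SEQ
  FROM SYSIBM.SYSDUMMY1;
```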
Related reference:
Creating and using sequences on page 26
Sequences are similar to identity columns in that they both generate unique values. However, sequences
are objects that are independent of any tables. You can use sequences to generate values quickly and
easily.
Global variables
A global variable is a named variable that can be created, accessed, and modified using SQL.
A global variable can provide a unique value for a session. The variable can be used as part of any
expression in places such as a query, a create view, or an insert statement.

User-defined functions
A user-defined function is a program that can be invoked like any built-in function.
DB2 for i supports external functions, SQL functions, and sourced functions. An external function can be
any system ILE program or service program. An SQL function is defined entirely in SQL and can contain
SQL statements, including SQL control statements. A sourced function is built over any built-in or any
existing user-defined function. You can create a scalar function or a table function as either an SQL
function or an external function.
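As an example of an SQL scalar function, the following sketch adds up the pay columns of the sample table CORPDATA.EMPLOYEE. The function name MYLIB.TOTAL_PAY and its parameter names are hypothetical.

```sql
-- Hypothetical SQL scalar function; null inputs are treated as zero.
CREATE FUNCTION MYLIB.TOTAL_PAY (
    P_SALARY DECIMAL(9,2),
    P_BONUS  DECIMAL(9,2),
    P_COMM   DECIMAL(9,2))
  RETURNS DECIMAL(11,2)
  LANGUAGE SQL
  DETERMINISTIC
  NO EXTERNAL ACTION
  CONTAINS SQL
  RETURN COALESCE(P_SALARY, 0) + COALESCE(P_BONUS, 0) + COALESCE(P_COMM, 0);

-- The function is invoked in the same way as a built-in function.
SELECT EMPNO, MYLIB.TOTAL_PAY(SALARY, BONUS, COMM) AS TOTAL_PAY
  FROM CORPDATA.EMPLOYEE;
```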
Related concepts:
Using user-defined functions on page 190
In writing SQL applications, you can implement some actions or operations as a user-defined function
(UDF) or as a subroutine in your application. Although it might appear easier to implement new
operations as subroutines, you might want to consider the advantages of using a UDF instead.


User-defined types
A user-defined type is a data type that you can define independently of the data types that are provided by
the database management system.

Distinct data types map to built-in types. Array data types are defined using a built-in type as the
element type and a maximum cardinality value.
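Both kinds of user-defined types can be sketched briefly. The type names MYLIB.MONEY and MYLIB.PHONE_LIST are hypothetical.

```sql
-- A distinct type that maps to the built-in type DECIMAL(9,2).
CREATE TYPE MYLIB.MONEY AS DECIMAL(9,2);

-- An array type: the element type is CHAR(12), the maximum cardinality is 10.
-- Array types are intended for use in SQL procedures.
CREATE TYPE MYLIB.PHONE_LIST AS CHAR(12) ARRAY[10];
```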
Related concepts:
User-defined distinct types on page 244
A user-defined distinct type (UDT) is a mechanism to extend DB2 capabilities beyond the built-in data
types that are available.

XSR objects
An XSR object is one or more XML schema documents that have been registered in the XML schema
repository with the same name.
You can use an XSR object during validation of an XML document or during annotated XML schema
decomposition.

SQL packages
An SQL package is an object that contains the control structure produced when the SQL statements in an
application program are bound to a remote relational database management system (DBMS).
The DBMS uses the control structure to process SQL statements encountered while running the
application program.
SQL packages are created when a relational database name (RDB parameter) is specified on a Create SQL
(CRTSQLxxx) command and a program object is created. Packages can also be created with the Create
SQL Package (CRTSQLPKG) command.
Note: The xxx in this command refers to the host language indicators: CI for ILE C, CPPI for ILE C++,
CBL for COBOL, CBLI for ILE COBOL, PLI for PL/I, RPG for RPG/400, and RPGI for ILE RPG.
SQL packages can also be created with the Process Extended Dynamic SQL (QSQPRCED) API. The SQL
packages mentioned within this topic collection refer exclusively to distributed program SQL packages.
The QSQPRCED API uses SQL packages to provide extended dynamic SQL support.
Related reference:
Distributed relational database function and SQL on page 427
A distributed relational database consists of a set of SQL objects that are spread across interconnected
computer systems.
Process Extended Dynamic SQL (QSQPRCED) API

Application program objects


Several objects are created when a DB2 for i application program is being precompiled.
DB2 for i supports both non-ILE and ILE precompilers. Application programs can be either distributed or
nondistributed.
With the DB2 for i database, you might need to manage the following objects:
v The original source
v Optionally, the module object for ILE programs
v The program or service program
v The SQL package for distributed programs


With a nondistributed non-ILE DB2 for i program, you must manage only the original source and the
resulting program. The following figure shows the objects involved and the steps that happen during the
precompile and compile processes for a nondistributed non-ILE DB2 for i program. The precompiler
processes the user source file into a temporary source file member. This member is then compiled into a
program.

With a nondistributed ILE DB2 for i program, you might need to manage the original source, the
modules, and the resulting program or service program. The following figure shows the objects involved
and the steps that happen during the precompile and compile processes for a nondistributed ILE DB2 for
i program when OBJTYPE(*PGM) is specified on the precompile command. The precompiler processes
the user source file into a temporary source file member. This member is then compiled into a module
that is bound into a program.

With a distributed non-ILE DB2 for i program, you must manage the original source, the resulting
program, and the resulting package. The following figure shows the objects and the steps that occur
during the precompile and compile processes for a distributed non-ILE DB2 for i program. The
precompiler processes the user source file into a temporary source file member. This member is then
compiled into a program. After the program is created, an SQL package is created to hold the access
plans for the program.

With a distributed ILE DB2 for i program, you must manage the original source, module objects, the
resulting program or service program, and the resulting packages. An SQL package can be created for
each distributed module in a distributed ILE program or service program. The following figure shows the
objects and the steps that occur during the precompile and compile processes for a distributed ILE DB2
for i program. The precompiler processes the user source file into a temporary source file member. This
member is then compiled into a module that is bound into a program. After the program is created, an
SQL package is created to hold the access plans for the program.


Note: The access plans associated with the DB2 for i distributed program object are not created until the
program is run locally.
Related tasks:
Preparing and running a program with SQL statements

User source file


A source file member or a source stream file contains the application language and SQL statements. You
can create and maintain the source file member by using the source entry utility (SEU), a part of the IBM
Rational Development Studio for i licensed program.

Output source file member


By default, the precompile process creates a temporary source file QSQLTxxxxx in the QTEMP library.
However, you can specify the output source file as a permanent file on the precompile command.
If the precompile process uses the QTEMP library, the system automatically deletes the file when the job
is completed. A member with the same name as the program name is added to the output source file.
This member contains the following items:
v Calls to the SQL runtime support, which have replaced embedded SQL statements
v Parsed and syntax-checked SQL statements
By default, the precompiler calls the host language compiler.
Related tasks:
Preparing and running a program with SQL statements

Program
A program is an object that is created as a result of the compilation process for non-ILE compilations or as
a result of the bind process for ILE compilations.
An access plan is a set of internal structures and information that tells SQL how to run an embedded SQL
statement most effectively. It is created only when the program has been successfully created. Access
plans are not created during program creation for SQL statements if the statements refer to an object,
such as a table or view, that cannot be found or to which you are not authorized.
The access plans for such statements are created when the program is run. If, at that time, the table or
view still cannot be found or you are still not authorized, a negative SQLCODE is returned. Access plans
are stored and maintained in the program object for non-distributed SQL programs and in the SQL
package for distributed SQL programs.

SQL package
An SQL package contains the access plans for a distributed SQL program.
An SQL package is an object that is created when:
v You successfully create a distributed SQL program by specifying the relational database (RDB)
parameter on the CREATE SQL (CRTSQLxxx) commands.
v You run the Create SQL Package (CRTSQLPKG) command.
When a distributed SQL program is created, the name of the SQL package and an internal consistency
token are saved in the program. They are used at run time to find the SQL package and to verify that the
SQL package is correct for this program. Because the name of the SQL package is critical for running
distributed SQL programs, an SQL package cannot be:
v Moved
v Renamed
v Duplicated
v Restored to a different library

Module
A module is an Integrated Language Environment (ILE) object that you create by compiling source code
using the Create Module (CRTxxxMOD) command (or any of the Create Bound Program (CRTBNDxxx)
commands, where xxx is C, CBL, CPP, or RPG).
You can run a module only if you use the Create Program (CRTPGM) command to bind it into a
program. You typically bind several modules together, but you can bind a module by itself. Modules
contain information about the SQL statements; however, the SQL access plans are not created until the
modules are bound into either a program or service program.
Related reference:
Create Program (CRTPGM) command

Service program
A service program is an Integrated Language Environment (ILE) object that provides a means of packaging
externally supported callable routines (functions or procedures) into a separate object.
Bound programs and other service programs can access these routines by resolving their imports to the
exports provided by a service program. The connections to these services are made when the calling
programs are created. This improves call performance to these routines without including the code in the
calling program.

Data definition language


Data definition language (DDL) describes the portion of SQL that creates, alters, and deletes database
objects. These database objects include schemas, tables, views, sequences, catalogs, indexes, and aliases.
Related concepts:
Types of SQL statements on page 7
There are several basic types of SQL statements. They are listed here according to their functions.
Related tasks:
Getting started with SQL

Creating a schema
A schema provides a logical grouping of SQL objects. To create a schema, use the CREATE SCHEMA
statement.
A schema consists of a library, a journal, a journal receiver, a catalog, and optionally, a data dictionary.
Tables, views, and system objects (such as programs) can be created, moved, or restored into any system
libraries. All system files can be created or moved into an SQL schema if the SQL schema does not
contain a data dictionary. If the SQL schema contains a data dictionary then:
v Source physical files or nonsource physical files with one member can be created, moved, or restored
into an SQL schema.
v Logical files cannot be placed in an SQL schema because they cannot be described in the data
dictionary.

You can create and own many schemas.


You can create a schema using the CREATE SCHEMA statement. For example, create a schema called
DBTEMP:
CREATE SCHEMA DBTEMP

Related reference:
CREATE SCHEMA

Creating a table
A table can be visualized as a two-dimensional arrangement of data that consists of rows and columns.
To create a table, use the CREATE TABLE statement.
The row is the horizontal part containing one or more columns. The column is the vertical part
containing one or more rows of data of one data type. All data for a column must be of the same type. A
table in SQL is a keyed or non-keyed physical file.
You can create a table using the CREATE TABLE statement. You provide a name for the table. If the table
name is not a valid system object name, you can use the optional FOR SYSTEM NAME clause to specify
a system name.
The definition includes the names and attributes of its columns. The definition can include other
attributes of the table, such as the primary key.
Example: Given that you have administrative authority, create a table named 'INVENTORY' with the
following columns:
v Part number: Integer between 1 and 9999, and must not be null
v Description: Character of length 0 to 24
v Quantity on hand: Integer between 0 and 100000
The primary key is PARTNO.
CREATE TABLE INVENTORY
  (PARTNO   SMALLINT     NOT NULL,
   DESCR    VARCHAR(24),
   QONHAND  INT,
   PRIMARY KEY(PARTNO))

Related concepts:
Data types

Adding and removing constraints


Constraints can be added to a new table or to an existing table. To add a unique or primary key, a
referential constraint, or a check constraint, use the CREATE TABLE or the ALTER TABLE statement. To
remove a constraint, use the ALTER TABLE statement.
For example, add a primary key to an existing table using the ALTER TABLE statement:
ALTER TABLE CORPDATA.DEPARTMENT
ADD PRIMARY KEY (DEPTNO)

To make this key a unique key, replace the keyword PRIMARY with UNIQUE.
You can remove a constraint using the same ALTER TABLE statement:
ALTER TABLE CORPDATA.DEPARTMENT
DROP PRIMARY KEY
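
Check constraints are added and removed in the same way. The following is a minimal sketch, assuming the INVENTORY table from the CREATE TABLE example; the constraint names PARTNO_RANGE and QONHAND_RANGE are hypothetical and chosen only for illustration. It enforces the value ranges that the column descriptions state but that the data types alone do not guarantee, and then removes one of the constraints by name.

```sql
-- Enforce the documented value ranges on the INVENTORY table.
-- The constraint names are illustrative, not part of any sample schema.
ALTER TABLE INVENTORY
  ADD CONSTRAINT PARTNO_RANGE
  CHECK (PARTNO BETWEEN 1 AND 9999);

ALTER TABLE INVENTORY
  ADD CONSTRAINT QONHAND_RANGE
  CHECK (QONHAND BETWEEN 0 AND 100000);

-- Remove a constraint by name when it is no longer wanted.
ALTER TABLE INVENTORY
  DROP CONSTRAINT QONHAND_RANGE;
```

Naming each constraint makes it possible to drop it later without affecting the other constraints on the table.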

Referential integrity and tables


Referential integrity is the condition of a set of tables in a database in which all references from one table
to another are valid.
Consider the following example:
v CORPDATA.EMPLOYEE serves as a master list of employees.
v CORPDATA.DEPARTMENT acts as a master list of all valid department numbers.
v CORPDATA.EMP_ACT provides a master list of activities performed for projects.
Other tables refer to the same entities described in these tables. When a table contains data for which
there is a master list, that data should actually appear in the master list, or the reference is not valid. The
table that contains the master list is the parent table, and the table that refers to it is a dependent table.
When the references from the dependent table to the parent table are valid, the condition of the set of
tables is called referential integrity.
Stated another way, referential integrity is the state of a database in which all values of all foreign keys
are valid. Each value of the foreign key must also exist in the parent key or be null. This definition of
referential integrity requires an understanding of the following terms:
v A unique key is a column or set of columns in a table that uniquely identifies a row. Although a table can
have several unique keys, no two rows in a table can have the same unique key value.
v A primary key is a unique key that does not allow nulls. A table cannot have more than one primary
key.
v A parent key is either a unique key or a primary key that is referenced in a referential constraint.
v A foreign key is a column or set of columns whose values must match those of a parent key. If any
column value used to build the foreign key is null, the rule does not apply.
v A parent table is a table that contains the parent key.
v A dependent table is the table that contains the foreign key.
v A descendent table is a table that is a dependent table or a descendent of a dependent table.
Enforcement of referential integrity prevents the violation of the rule that states that every non-null
foreign key must have a matching parent key.
SQL supports the referential integrity concept with the CREATE TABLE and ALTER TABLE statements.
Related reference:
DB2 for i sample tables on page 463
These sample tables are referred to and used in the SQL programming and the SQL reference topic
collections.
CREATE TABLE
ALTER TABLE
Adding and removing referential constraints:
You can use the CREATE TABLE statement or the ALTER TABLE statement to add a referential
constraint. To remove a referential constraint, use the ALTER TABLE statement.
Constraints are rules that ensure that references from one table, a dependent table, to data in another
table, the parent table, are valid. You use referential constraints to ensure referential integrity.
With a referential constraint, non-null values of the foreign key are valid only if they also appear as
values of a parent key. When you define a referential constraint, you specify:
v A primary or unique key
v A foreign key
v Delete and update rules that specify the action taken with respect to dependent rows when the parent
row is deleted or updated.
Optionally, you can specify a name for the constraint. If a name is not specified, one is automatically
generated.
After a referential constraint is defined, the system enforces the constraint on every INSERT, DELETE,
and UPDATE operation performed through SQL or any other interface, including System i Navigator, CL
commands, utilities, or high-level language statements.
Related reference:
CREATE TABLE
ALTER TABLE
Example: Adding referential constraints:
You define a referential constraint that every department number in the sample employee table must
appear in the department table. The referential constraint ensures that every employee belongs to an
existing department.
The following SQL statements create the CORPDATA.DEPARTMENT and CORPDATA.EMPLOYEE tables
with those constraint relationships defined.
CREATE TABLE CORPDATA.DEPARTMENT
  (DEPTNO    CHAR(3)      NOT NULL PRIMARY KEY,
   DEPTNAME  VARCHAR(29)  NOT NULL,
   MGRNO     CHAR(6),
   ADMRDEPT  CHAR(3)      NOT NULL
             CONSTRAINT REPORTS_TO_EXISTS
             REFERENCES CORPDATA.DEPARTMENT (DEPTNO)
             ON DELETE CASCADE)

CREATE TABLE CORPDATA.EMPLOYEE
  (EMPNO     CHAR(6)      NOT NULL PRIMARY KEY,
   FIRSTNME  VARCHAR(12)  NOT NULL,
   MIDINIT   CHAR(1)      NOT NULL,
   LASTNAME  VARCHAR(15)  NOT NULL,
   WORKDEPT  CHAR(3)
             CONSTRAINT WORKDEPT_EXISTS
             REFERENCES CORPDATA.DEPARTMENT (DEPTNO)
             ON DELETE SET NULL ON UPDATE RESTRICT,
   PHONENO   CHAR(4),
   HIREDATE  DATE,
   JOB       CHAR(8),
   EDLEVEL   SMALLINT     NOT NULL,
   SEX       CHAR(1),
   BIRTHDATE DATE,
   SALARY    DECIMAL(9,2),
   BONUS     DECIMAL(9,2),
   COMM      DECIMAL(9,2),
   CONSTRAINT UNIQUE_LNAME_IN_DEPT UNIQUE (WORKDEPT, LASTNAME))

In this case, the DEPARTMENT table has a column of unique department numbers (DEPTNO) which
functions as a primary key, and is a parent table in two constraint relationships:
REPORTS_TO_EXISTS
is a self-referencing constraint in which the DEPARTMENT table is both the parent and the
dependent in the same relationship. Every non-null value of ADMRDEPT must match a value of
DEPTNO. A department must report to an existing department in the database. The DELETE
CASCADE rule indicates that if a row with a DEPTNO value n is deleted, every row in the table
for which the ADMRDEPT is n is also deleted.

WORKDEPT_EXISTS
establishes the EMPLOYEE table as a dependent table, and the column of employee department
assignments (WORKDEPT) as a foreign key. Thus, every non-null value of WORKDEPT must match a
value of DEPTNO. The DELETE SET NULL rule says that if a row is deleted from
DEPARTMENT in which the value of DEPTNO is n, then the value of WORKDEPT in
EMPLOYEE is set to null in every row in which the value was n. The UPDATE RESTRICT rule
says that a value of DEPTNO in DEPARTMENT cannot be updated if there are values of
WORKDEPT in EMPLOYEE that match the current DEPTNO value.
Constraint UNIQUE_LNAME_IN_DEPT in the EMPLOYEE table causes LASTNAME to be unique within
a department. While this constraint is unlikely, it illustrates how a constraint made up of several columns
can be defined at the table level.
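
The effect of these rules can be sketched with a few statements. This is an illustration only: it assumes that the two tables above exist, that a department 'D11' exists and has employees assigned to it, and that no department 'Z99' exists. The employee values shown are hypothetical.

```sql
-- Expected to be rejected by WORKDEPT_EXISTS, because 'Z99' is assumed
-- not to appear as a DEPTNO value in CORPDATA.DEPARTMENT.
INSERT INTO CORPDATA.EMPLOYEE
       (EMPNO, FIRSTNME, MIDINIT, LASTNAME, WORKDEPT, EDLEVEL)
  VALUES ('999001', 'PAT', 'Q', 'EXAMPLE', 'Z99', 16);

-- Expected to be accepted: a null foreign key value is not checked
-- against the parent key.
INSERT INTO CORPDATA.EMPLOYEE
       (EMPNO, FIRSTNME, MIDINIT, LASTNAME, WORKDEPT, EDLEVEL)
  VALUES ('999002', 'SAM', 'R', 'SAMPLE', NULL, 16);

-- Expected to be rejected by ON UPDATE RESTRICT while employees
-- still have WORKDEPT = 'D11'.
UPDATE CORPDATA.DEPARTMENT
   SET DEPTNO = 'D12'
 WHERE DEPTNO = 'D11';

-- Expected to succeed. ON DELETE SET NULL sets WORKDEPT to null for
-- employees of 'D11'; ON DELETE CASCADE also deletes any department
-- whose ADMRDEPT is 'D11'.
DELETE FROM CORPDATA.DEPARTMENT
 WHERE DEPTNO = 'D11';
```

Statements that violate a constraint return a negative SQLCODE and leave the tables unchanged.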

Example: Removing constraints


When you remove the primary key over the DEPTNO column in the DEPARTMENT table, other tables
are affected.
You also remove the REPORTS_TO_EXISTS constraint that is defined on the DEPARTMENT table and the
WORKDEPT_EXISTS constraint that is defined on the EMPLOYEE table, because the primary key that
you remove is the parent key in those constraint relationships.
ALTER TABLE CORPDATA.DEPARTMENT DROP PRIMARY KEY

You can also remove a constraint by name, as in the following example:


ALTER TABLE CORPDATA.EMPLOYEE
DROP CONSTRAINT UNIQUE_LNAME_IN_DEPT

Check pending
Referential constraints and check constraints can be in a check pending state, where potential violations
of the constraints exist.
For referential constraints, a violation occurs when potential mismatches exist between parent and foreign
keys. For check constraints, a violation occurs when columns that are limited by the check constraint might
contain values that do not satisfy it. When the system determines that a constraint might have been violated (such as
after a restore operation), the constraint is marked as check pending. When this happens, restrictions are
placed on the use of tables involved in the constraint. For referential constraints, the following restrictions
apply:
v No input or output operations are allowed on the dependent file.
v Only read and insert operations are allowed on the parent file.
When a check constraint is in check pending, the following restrictions apply:
v Read operations are not allowed on the file.
v Insert and update operations are allowed and the constraint is enforced.
To get a constraint out of check pending, follow these steps:
1. Disable the relationship with the Change Physical File Constraint (CHGPFCST) CL command.
2. Correct the key (foreign, parent, or both) data for referential constraints or column data for check
constraints.
3. Enable the constraint again with the CHGPFCST CL command.
You can identify the rows that are in violation of the constraint with the Display Check Pending
Constraint (DSPCPCST) CL command.
Related concepts:
Check pending status in referential constraints

Related tasks:
Working with constraints that are in check pending status

Creating a table using LIKE


You can create a table that looks like another table. That is, you can create a table that includes all of the
column definitions from an existing table.
The following definitions are copied:
v Column names (and system column names)
v Data type, length, precision, and scale
v CCSID
v FIELDPROC
If the LIKE clause immediately follows the table name and is not enclosed in parentheses, the following
attributes are also included:
v Column text (LABEL ON)
v Column heading (LABEL ON)
v Default value
v Hidden attribute
v Identity attribute
v Nullability
If the specified table or view contains an identity column, you must specify the option INCLUDING
IDENTITY on the CREATE TABLE statement if you want the identity column to exist in the new table.
The default behavior for CREATE TABLE is EXCLUDING IDENTITY. There are similar options to include
the default value, the hidden attribute, and the row change timestamp attribute. If the specified table or
view is a non-SQL-created physical file or logical file, any non-SQL attributes are removed.
Create a table EMPLOYEE2 that includes all of the columns in EMPLOYEE:
CREATE TABLE EMPLOYEE2 LIKE EMPLOYEE
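
As a sketch of the copy options described above, the following statement assumes a source table ORDERS that has an identity column, such as the one defined in the identity column example in this topic collection. The name ORDERS_COPY is hypothetical. The long forms of the copy options are used here.

```sql
-- Copy the column definitions of ORDERS, keeping the identity
-- attribute and the column default values in the new table.
-- ORDERS_COPY is an illustrative name.
CREATE TABLE ORDERS_COPY LIKE ORDERS
  INCLUDING IDENTITY COLUMN ATTRIBUTES
  INCLUDING COLUMN DEFAULTS
```

Only the definitions are copied; the new table contains no rows.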

Related reference:
CREATE TABLE

Creating a table using AS


You can create a table from the result of a SELECT statement. To create this type of table, use the
CREATE TABLE AS statement.
All of the expressions that can be used in a SELECT statement can be used in a CREATE TABLE AS
statement. You can also include all of the data from the table or tables that you are selecting from.
For example, create a table named EMPLOYEE3 that includes the selected column definitions from
EMPLOYEE where DEPTNO = 'D11'.
CREATE TABLE EMPLOYEE3 AS
  (SELECT PROJNO, PROJNAME, DEPTNO
   FROM EMPLOYEE
   WHERE DEPTNO = 'D11') WITH NO DATA

If the specified table or view contains an identity column, you must specify the option INCLUDING
IDENTITY on the CREATE TABLE statement if you want the identity column to exist in the new table.
The default behavior for CREATE TABLE is EXCLUDING IDENTITY. There are similar options to include
the default value, the hidden attribute, and the row change timestamp attribute. The WITH NO DATA
clause indicates that the column definitions are to be copied without the data. If you want to include the
data in the new table EMPLOYEE3, include the WITH DATA clause. If the specified query includes a
non-SQL-created physical file or logical file, any non-SQL result attributes are removed.
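
Because any expression that is valid in a SELECT statement can be used, a derived column can be part of the new table. The following is a minimal sketch, assuming the CORPDATA.EMPLOYEE table defined in the referential constraint example; the table name EMPLOYEE_PAY and the column name TOTAL_PAY are hypothetical.

```sql
-- Create and populate a table from a query that contains an expression.
-- The expression is given a name with AS so that the new column has one.
CREATE TABLE EMPLOYEE_PAY AS
  (SELECT EMPNO, LASTNAME,
          SALARY + BONUS + COMM AS TOTAL_PAY
     FROM CORPDATA.EMPLOYEE
    WHERE WORKDEPT = 'D11')
  WITH DATA
```

With WITH DATA, the rows that satisfy the query at the time the statement runs are copied; later changes to CORPDATA.EMPLOYEE are not reflected in EMPLOYEE_PAY.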
Related concepts:
Retrieving data using the SELECT statement on page 56
The SELECT statement tailors your query to gather data. You can use the SELECT statement to retrieve a
specific row or retrieve data in a specific way.
Related reference:
CREATE TABLE

Creating and altering a materialized query table


A materialized query table is a table whose definition is based on the result of a query, and whose data is in
the form of precomputed results that are taken from the table or tables on which the materialized query
table definition is based.
If the optimizer determines that a query runs faster against a materialized query table than it does
against the base table or tables, the query will run against the materialized query table. You can directly
query a materialized query table. For more information about how the optimizer uses materialized query
tables, see the Database performance and query optimization topic.
Assume a very large transaction table named TRANS contains one row for each transaction processed by
a company. The table is defined with many columns. Create a materialized query table for the TRANS
table that contains daily summary data for the date and amount of a transaction by issuing the following:
CREATE TABLE STRANS
AS (SELECT YEAR AS SYEAR, MONTH AS SMONTH, DAY AS SDAY, SUM(AMOUNT) AS SSUM
FROM TRANS
GROUP BY YEAR, MONTH, DAY )
DATA INITIALLY DEFERRED
REFRESH DEFERRED
MAINTAINED BY USER

This materialized query table specifies that the table is not populated at the time that it is created by
using the DATA INITIALLY DEFERRED clause. REFRESH DEFERRED indicates that changes made to
TRANS are not reflected in STRANS. Additionally, this table is maintained by the user, enabling the user
to use ALTER, INSERT, DELETE, and UPDATE statements.
To populate the materialized query table or refresh the table after it has been populated, use the
REFRESH TABLE statement. This causes the query associated with the materialized query table to be run
and causes the table to be filled with the results of the query. To populate the STRANS table, run the
following statement:
REFRESH TABLE STRANS
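
After STRANS is populated, it can be queried directly like any other table. The following is a sketch only; the column name MONTH_TOTAL is hypothetical, and the result depends on the contents of TRANS at the time of the last refresh.

```sql
-- Roll the daily summary rows in STRANS up to monthly totals.
SELECT SYEAR, SMONTH, SUM(SSUM) AS MONTH_TOTAL
  FROM STRANS
 GROUP BY SYEAR, SMONTH
 ORDER BY SYEAR, SMONTH
```

Because the table is defined with REFRESH DEFERRED, run REFRESH TABLE STRANS again before the query if recent changes to TRANS must be included.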

You can create a materialized query table from an existing base table as long as the result of the
select-statement provides a set of columns that match the columns in the existing table (same number of
columns and compatible column definitions). For example, create a table TRANSCOUNT. Then, change
the base table TRANSCOUNT into a materialized query table:
To create the table:
CREATE TABLE TRANSCOUNT
  (ACCTID SMALLINT NOT NULL,
   LOCID  SMALLINT,
   YEAR   DATE,
   CNT    INTEGER)

You can alter this table to be a materialized query table:
ALTER TABLE TRANSCOUNT
ADD MATERIALIZED QUERY
(SELECT ACCTID, LOCID, YEAR, COUNT(*) AS CNT
FROM TRANS
GROUP BY ACCTID, LOCID, YEAR )
DATA INITIALLY DEFERRED
REFRESH DEFERRED
MAINTAINED BY USER

Finally, you can change a materialized query table back to a base table. For example:
ALTER TABLE TRANSCOUNT
DROP MATERIALIZED QUERY

In this example, the table TRANSCOUNT is not dropped, but it is no longer a materialized query table.
Related concepts:
Tables, rows, and columns on page 10
A table is a two-dimensional arrangement of data that consists of rows and columns.

Declaring a global temporary table


You can create a temporary table for use with your current session. To create a temporary table, use the
DECLARE GLOBAL TEMPORARY TABLE statement.
This temporary table does not appear in the system catalog and cannot be shared by other sessions.
When you end your session, the rows of the table are deleted and the table is dropped.
The syntax of this statement is similar to that of the CREATE TABLE statement and can include the LIKE
or AS clause.
For example, create a temporary table ORDERS:
DECLARE GLOBAL TEMPORARY TABLE ORDERS
  (PARTNO   SMALLINT NOT NULL,
   DESCR    VARCHAR(24),
   QONHAND  INT)
  ON COMMIT DELETE ROWS

This table is created in QTEMP. To reference the table using a schema name, use either SESSION or
QTEMP. You can issue SELECT, INSERT, UPDATE, and DELETE statements against this table, the same
as any other table. You can drop this table by issuing the DROP TABLE statement:
DROP TABLE ORDERS
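
The LIKE clause and the SESSION qualifier can be combined as in the following sketch. It assumes the INVENTORY table from the CREATE TABLE example; the name LOW_STOCK and the threshold of 10 are hypothetical.

```sql
-- Declare a temporary table with the same columns as INVENTORY.
-- PRESERVE ROWS keeps the rows across commits within the session.
DECLARE GLOBAL TEMPORARY TABLE LOW_STOCK
  LIKE INVENTORY
  ON COMMIT PRESERVE ROWS;

-- Fill it, referring to the temporary table through the SESSION schema.
INSERT INTO SESSION.LOW_STOCK
  SELECT * FROM INVENTORY
   WHERE QONHAND < 10;

-- Query it the same way as any other table.
SELECT PARTNO, DESCR, QONHAND
  FROM SESSION.LOW_STOCK
 ORDER BY QONHAND;
```

Qualifying the name with SESSION makes it explicit that the statement refers to the temporary table rather than to a permanent table of the same name.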

Related reference:
DECLARE GLOBAL TEMPORARY TABLE

Creating a table with remote server data

You can create a table on the local server that references one or more tables on a remote server.

Along with the select-statement, you can specify copy options to get attributes such as the default values
or identity column information copied for the new table. The WITH DATA or WITH NO DATA clause
must be specified to indicate whether to populate the table from the remote system.

For example, create a table named EMPLOYEE4 that includes column definitions from the EMPLOYEE
table on remote server REMOTESYS. Include the data from the remote system as well.
CREATE TABLE EMPLOYEE4 AS
  (SELECT PROJNO, PROJNAME, DEPTNO
   FROM REMOTESYS.TESTSCHEMA.EMPLOYEE
   WHERE DEPTNO = 'D11') WITH DATA

You can also create this table as a global temporary table, which will create it in QTEMP. In this example,
different column names are provided for the new table. The table definition will pick up the default
values for its columns from the remote server.
DECLARE GLOBAL TEMPORARY TABLE EMPLOYEE4 (Project_number, Project_name, Department_number) AS
  (SELECT PROJNO, PROJNAME, DEPTNO
   FROM REMOTESYS.TESTSCHEMA.EMPLOYEE
   WHERE DEPTNO = 'D11') WITH DATA INCLUDING DEFAULTS

The following restrictions apply to using a remote server as the source for the new table:
v The materialized query table clauses are not allowed.
v A column with a FIELDPROC cannot be listed in the select list.
v The copy options cannot be specified if the remote server is DB2 for LUW or DB2 for z/OS.
Related reference:
Inserting data from a remote database on page 110
You can insert into a table on the local server using a select statement to get rows from a remote server.

Creating a row change timestamp column


Every time a row is added or changed in a table with a row change timestamp column, the row change
timestamp column value is set to the timestamp corresponding to the time of the insert or update
operation.
The data type of a row change timestamp column must be TIMESTAMP. You can define only one row
change timestamp column in a table.
When you create a table, you can define a column in the table to be a row change timestamp column. For
example, create a table ORDERS with columns called ORDERNO, SHIPPED_TO, ORDER_DATE,
STATUS, and CHANGE_TS. Define CHANGE_TS as a row change timestamp column.
CREATE TABLE ORDERS
(ORDERNO SMALLINT,
SHIPPED_TO VARCHAR(36),
ORDER_DATE DATE,
STATUS CHAR(1),
CHANGE_TS TIMESTAMP FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP NOT NULL)

When a row is inserted into the ORDERS table, the CHANGE_TS column for the row is set to the
timestamp of the insert operation. Any time a row in ORDERS is updated, the CHANGE_TS column for
the row is modified to reflect the timestamp of the update operation.
You can drop the row change timestamp attribute from a column:
ALTER TABLE ORDERS
ALTER COLUMN CHANGE_TS
DROP ROW CHANGE TIMESTAMP

The column CHANGE_TS remains as a TIMESTAMP column in the table, but the system no longer
automatically updates timestamp values for this column.
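
While CHANGE_TS is still defined as a row change timestamp column, it can be used to find recently changed rows. The following is a minimal sketch against the ORDERS table defined above; the order number 1, the status value 'S', and the one-day window are hypothetical.

```sql
-- The UPDATE does not mention CHANGE_TS; the system sets it.
UPDATE ORDERS
   SET STATUS = 'S'
 WHERE ORDERNO = 1;

-- List the rows that were inserted or updated during the last day.
SELECT ORDERNO, STATUS, CHANGE_TS
  FROM ORDERS
 WHERE CHANGE_TS > CURRENT TIMESTAMP - 1 DAY;
```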

Creating and altering an identity column


Every time a row is added to a table with an identity column, the identity column value for the new row
is generated by the system.
Only columns of type SMALLINT, INTEGER, BIGINT, DECIMAL, or NUMERIC can be created as
identity columns. You are allowed only one identity column per table. When you are changing a table
definition, only a column that you are adding can be specified as an identity column; existing columns
cannot.

When you create a table, you can define a column in the table to be an identity column. For example,
create a table ORDERS with three columns called ORDERNO, SHIPPED_TO, and ORDER_DATE. Define
ORDERNO as an identity column.
CREATE TABLE ORDERS
(ORDERNO SMALLINT NOT NULL
GENERATED ALWAYS AS IDENTITY
(START WITH 500
INCREMENT BY 1
CYCLE),
SHIPPED_TO VARCHAR (36) ,
ORDER_DATE DATE)

This column is defined with a starting value of 500, incremented by 1 for every new row inserted, and
will recycle when the maximum value is reached. In this example, the maximum value for the identity
column is the maximum value for the data type. Because the data type is defined as SMALLINT, the
range of values that can be assigned to ORDERNO is from 500 to 32 767. When this column value
reaches 32 767, it will restart at 500 again. If 500 is still assigned to a column, and a unique key is
specified on the identity column, a duplicate key error is returned. The next insert operation will attempt
to use 501. If you do not have a unique key specified for the identity column, 500 is used again,
regardless of how many times it appears in the table.
For a larger range of values, specify the column to be data type INTEGER or even BIGINT. If you want
the value of the identity column to decrease, specify a negative value for the INCREMENT option. It is
also possible to specify the exact range of numbers by using MINVALUE and MAXVALUE.
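
The options mentioned above can be combined as in the following sketch. The table TICKETS and its columns are hypothetical; the identity column counts down from 1000 to 1 and, because NO CYCLE is specified, does not start over when the minimum value has been used.

```sql
-- A descending identity column with an explicit range.
CREATE TABLE TICKETS
  (TICKETNO INTEGER NOT NULL
     GENERATED ALWAYS AS IDENTITY
     (START WITH 1000
      INCREMENT BY -1
      MINVALUE 1
      MAXVALUE 1000
      NO CYCLE),
   HOLDER VARCHAR(36))
```

With NO CYCLE, an insert attempted after the last value in the range has been assigned is expected to fail rather than reuse a value.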
You can modify the attributes of an existing identity column using the ALTER TABLE statement. For
example, you want to restart the identity column with a new value:
ALTER TABLE ORDERS
ALTER COLUMN ORDERNO
RESTART WITH 1

You can also drop the identity attribute from a column:


ALTER TABLE ORDERS
ALTER COLUMN ORDERNO
DROP IDENTITY

The column ORDERNO remains as a SMALLINT column, but the identity attribute is dropped. The
system will no longer generate values for this column.
Related reference:
Comparison of identity columns and sequences on page 27
While identity columns and sequences are similar in many ways, there are also differences.
Inserting values into an identity column on page 109
You can insert a value into an identity column or allow the system to insert a value for you.
Updating an identity column on page 114
You can update the value in an identity column to a specified value or have the system generate a new
value.

Using ROWID
Using ROWID is another way to have the system assign a unique value to a column. ROWID is similar
to identity columns. But rather than being an attribute of a numeric column, it is a separate data type.
To create a table similar to the identity column example:
CREATE TABLE ORDERS
  (ORDERNO ROWID
     GENERATED ALWAYS,
   SHIPPED_TO VARCHAR (36) ,
   ORDER_DATE DATE)

Creating and using sequences


Sequences are similar to identity columns in that they both generate unique values. However, sequences
are objects that are independent of any tables. You can use sequences to generate values quickly and
easily.
Sequences are not tied to a column in a table and are accessed separately. Additionally, they are not
treated as any part of a transaction's unit of work.
You create a sequence using the CREATE SEQUENCE statement. For an example similar to the identity
column example, create a sequence ORDER_SEQ:
CREATE SEQUENCE ORDER_SEQ
START WITH 500
INCREMENT BY 1
MAXVALUE 1000
CYCLE
CACHE 24

This sequence is defined with a starting value of 500, incremented by 1 for every use, and recycles when
the maximum value is reached. In this example, the maximum value for the sequence is 1000. When this
value reaches 1000, it will restart at 500 again.
After this sequence is created, you can insert values into a column using the sequence. For example,
insert the next value of the sequence ORDER_SEQ into a table ORDERS with columns ORDERNO and
CUSTNO.
First, create the table ORDERS:
CREATE TABLE ORDERS
(ORDERNO SMALLINT NOT NULL,
CUSTNO SMALLINT);

Then, insert the sequence value:


INSERT INTO ORDERS (ORDERNO, CUSTNO)
VALUES (NEXT VALUE FOR ORDER_SEQ, 12)

Running the following statement returns the values in the columns:


SELECT *
FROM ORDERS

Table 2. Results for SELECT from table ORDERS
ORDERNO   CUSTNO
500       12

In this example, the next value for sequence ORDER_SEQ is inserted into the ORDERNO column. Issue the
INSERT statement again. Then run the SELECT statement.
Table 3. Results for SELECT from table ORDERS
ORDERNO   CUSTNO
500       12
501       12

You can also insert the previous value for the sequence ORDER_SEQ by using the PREVIOUS VALUE
expression. You can use NEXT VALUE and PREVIOUS VALUE in the following places:
v Within the select-clause of a SELECT statement or SELECT INTO statement as long as the statement
does not contain a DISTINCT keyword, a GROUP BY clause, an ORDER BY clause, a UNION
keyword, an INTERSECT keyword, or an EXCEPT keyword
v Within a VALUES clause of an INSERT statement
v Within the select-clause of the fullselect of an INSERT statement
v Within the SET clause of a searched or positioned UPDATE statement, though NEXT VALUE cannot be
specified in the select-clause of the subselect of an expression in the SET clause
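
The places listed above can be sketched as follows, assuming the ORDER_SEQ sequence and the ORDERS table (ORDERNO, CUSTNO) from this example. The customer number 15 and the column name LAST_ORDERNO are hypothetical.

```sql
-- NEXT VALUE in the VALUES clause of an INSERT statement.
INSERT INTO ORDERS (ORDERNO, CUSTNO)
  VALUES (NEXT VALUE FOR ORDER_SEQ, 15);

-- PREVIOUS VALUE in the select-clause of a SELECT statement.
-- It returns the value most recently generated in this session.
SELECT PREVIOUS VALUE FOR ORDER_SEQ AS LAST_ORDERNO
  FROM SYSIBM.SYSDUMMY1;

-- NEXT VALUE in the SET clause of a searched UPDATE statement.
UPDATE ORDERS
   SET ORDERNO = NEXT VALUE FOR ORDER_SEQ
 WHERE CUSTNO = 15;
```

PREVIOUS VALUE can be used only after NEXT VALUE has been referenced for the same sequence in the current session.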
You can alter a sequence by issuing the ALTER SEQUENCE statement. Sequences can be altered in the
following ways:
v Restarting the sequence
v Changing the increment between future sequence values
v Setting or eliminating the minimum or maximum values
v Changing the number of cached sequence numbers
v Changing the attribute that determines whether the sequence can cycle or not
v Changing whether sequence numbers must be generated in order of request
For example, change the increment of values of sequence ORDER_SEQ from 1 to 5:
ALTER SEQUENCE ORDER_SEQ
INCREMENT BY 5

After this change is complete, run the INSERT statement again and then the SELECT statement. Now the
table contains the following rows.
Table 4. Results for SELECT from table ORDERS
ORDERNO   CUSTNO
500       12
501       12
528       12

Notice that the next value that the sequence uses is 528. At first glance, this number appears to be
incorrect. However, look at the events that lead up to this assignment. First, when the sequence was
originally created, a cache value of 24 was assigned. The system assigns the first 24 values (500 through
523) for this cache. Next, the sequence was altered. When the ALTER SEQUENCE statement is issued, the
system drops the assigned values and starts up again with the next available value; in this case, the last
of the original 24 cached values, 523, plus the new increment, 5, which gives 528. If the original CREATE
SEQUENCE statement did not have the CACHE clause, the system automatically assigns a default cache
value of 20. If that sequence was altered, then the next available value is 25 (for a sequence that starts
at 1).
Related concepts:
Sequences on page 12
A sequence is a data area object that provides a quick and easy way of generating unique numbers.

Comparison of identity columns and sequences


While identity columns and sequences are similar in many ways, there are also differences.
Examine these differences before you decide which to use.


An identity column has the following characteristics:


v An identity column can be defined as part of a table when the table is created, or it can be added to a
table with the ALTER TABLE statement. After a table is created, the identity column characteristics can be changed.
v An identity column automatically generates values for a single table.
v When an identity column is defined as GENERATED ALWAYS, the values used are always generated
by the database manager. Applications are not allowed to provide their own values when changing the
contents of the table.
v The IDENTITY_VAL_LOCAL function can be used to see the most recently assigned value for an
identity column.
A sequence has the following characteristics:
v A sequence is a system object of type *DTAARA that is not tied to a table.
v A sequence generates sequential values that can be used in any SQL statement.
v There are two expressions used to retrieve the next values in the sequence and to look at the previous
value assigned for the sequence. The PREVIOUS VALUE expression returns the most recently
generated value for the specified sequence for a previous statement within the current session. The
NEXT VALUE expression returns the next value for the specified sequence. The use of these
expressions allows the same value to be used across several SQL statements within several tables.
While these are not all of the characteristics of identity columns and sequences, these characteristics can
help you determine which to use depending on your database design and the applications that use the
database.
Related reference:
Creating and altering an identity column on page 24
Every time a row is added to a table with an identity column, the identity column value for the new row
is generated by the system.

Defining field procedures

Field procedures are assigned to a table by the FIELDPROC clause of the CREATE TABLE and ALTER
TABLE statements. A field procedure is a user-written exit routine that transforms values in a single
column.

When values in the column are changed, or new values inserted, the field procedure is invoked for each
value, and can transform that value (encode it) in any way. The encoded value is then stored. When
values are retrieved from the column, the field procedure is invoked for each encoded value and must
decode it back to the original value. Any indexes defined on a non-derived column that uses a
field procedure are built with encoded values.

The transformation your field procedure performs on a value is called field-encoding. The same routine is
used to undo the transformation when values are retrieved; that operation is called field-decoding. Values
in columns with a field procedure are described to DB2 in two ways:
1. The description of the column as defined in CREATE TABLE or ALTER TABLE appears in the catalog
table QSYS2.SYSCOLUMNS. That is the description of the field-decoded value, and is called the
column description.
2. The description of the encoded value, as it is stored in the database, appears in the catalog table
QSYS2.SYSFIELDS. That is the description of the field-encoded value, and is called the field description.

Important: The field-decoding function must be the exact inverse of the field-encoding function. For
example, if a routine encodes 'ALABAMA' to '01', it must decode '01' to 'ALABAMA'. A violation of this
rule can lead to unpredictable results. See General guidelines for writing field procedures on page 37.
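The inverse requirement can be pictured with a minimal C sketch that drives both directions from one lookup table, so that the two functions cannot drift apart. Only the 'ALABAMA' and '01' pair comes from the text; the table, the second entry, and the function names are hypothetical, and a real field procedure receives its data through the parameter list rather than through C strings.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table. Only the ALABAMA/01 pair comes from the
   text; the second pair is made up for illustration. */
static const struct {
    const char *decoded;
    const char *encoded;
} state_codes[] = {
    { "ALABAMA", "01" },
    { "ALASKA",  "02" },
};

#define STATE_COUNT (sizeof state_codes / sizeof state_codes[0])

/* Field-encoding direction: returns the code, or NULL if unknown. */
const char *encode_state(const char *decoded)
{
    size_t i;
    for (i = 0; i < STATE_COUNT; ++i)
        if (strcmp(state_codes[i].decoded, decoded) == 0)
            return state_codes[i].encoded;
    return NULL;
}

/* Field-decoding direction: the exact inverse of encode_state. */
const char *decode_state(const char *encoded)
{
    size_t i;
    for (i = 0; i < STATE_COUNT; ++i)
        if (strcmp(state_codes[i].encoded, encoded) == 0)
            return state_codes[i].decoded;
    return NULL;
}
```

Because both functions read the same table, decoding the result of an encode always yields the original value for every entry in the table.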


Field procedures can also perform masking of data when decoded (retrieved). In this case, the field
procedure would decode '01' to 'ALABAMA' for certain users or environments, and for other users or
environments might return a masked value such as 'XXXXXXXX' instead. See Guidelines for writing field
procedures that mask data on page 38.

Field definition for field procedures

The field procedure is also invoked when the table is created or altered, to define the data type and
attributes of an encoded value to DB2. That operation is called field-definition.

The data type of the encoded value can be any valid SQL data type except ROWID or DATALINK. Also, a
field procedure cannot be associated with any column having values generated by IDENTITY or ROW
CHANGE TIMESTAMP.

If a DDS-created physical file is altered to add a field procedure, the encoded attribute data type cannot
be a LOB type or DataLink. If an SQL table is altered to add a field procedure, the encoded attribute
precision field must be 0 if the encoded attribute data type is any of the integer types.

A field procedure may not be added to a column that has a default value of CURRENT DATE,
CURRENT TIME, CURRENT TIMESTAMP, or USER.

A column defined with a user-defined data type can have a field procedure if the source type of the
user-defined data type is any of the allowed SQL data types. DB2 casts the value of the column to the
source type before it passes it to the field procedure.

Specifying the field procedure

To name a field procedure for a column, use the FIELDPROC clause of the CREATE TABLE or ALTER
TABLE statement, followed by the name of the procedure and, optionally, a list of parameters.

You cannot use a field procedure on a ROWID or DATALINK column of a table or on any column having
values generated by IDENTITY or ROW CHANGE TIMESTAMP. However, you can specify it for other
columns in the same table.

The optional parameter list that follows the procedure name is a list of constants, enclosed in parentheses,
called the literal list. The literal list is converted by DB2 into a data structure called the field procedure
parameter value list (FPPVL). The FPPVL is passed to the field procedure during the field-definition
operation. At that time, the procedure can modify it or return it unchanged. The output form of the
FPPVL is called the modified FPPVL. It is stored in the DB2 QSYS2.SYSFIELDS catalog table as part of the
column description. The modified FPPVL is passed again to the field procedure whenever that procedure
is invoked for field-encoding or field-decoding.

When field procedures are invoked

A field procedure that is specified for a column is invoked in three general situations.
v For field-definition, when the CREATE TABLE or ALTER TABLE statement that names the procedure is
executed. During this invocation, the procedure is expected to:
   - Determine whether the data type and attributes of the column are valid.
   - Verify the literal list, and change it if desired.
   - Provide the field description of the column.
v For field-encoding, when a column value is to be encoded. Encoding occurs for any value that:
   - Is inserted in the column by an SQL INSERT statement, SQL MERGE statement, or native write
     operation.
   - Is changed by an SQL UPDATE statement, SQL MERGE statement, or native update operation.
   - Needs to be copied to a target column that has a field procedure. In that case, the field procedure
     may be invoked to encode the copied data. Examples include the SQL statements ALTER TABLE or
     CREATE TABLE (with a LIKE or as-result-table clause) and the CL commands CPYF or RGZPFM.
   - Is compared to a column with a field procedure. The QAQQINI option
     FIELDPROC_ENCODED_COMPARISON is used by the optimizer to decide if the column value is
     decoded or if the variable, constant, or join column is encoded.
   - Is the DEFAULT value at CREATE TABLE or ALTER TABLE time, if the column has a field procedure.
   If there are any after or read triggers, the field procedure is invoked before any of these triggers. For
   before triggers, there may be multiple invocations of the field procedure with encode and decode
   operations. The number of calls to the field procedure depends on many factors, including the type of
   trigger and whether the trigger changes the data in the trigger buffer. The database manager ensures
   that the field procedure is called to encode the data that will be inserted into the table.
v For field-decoding, when a stored value is to be field-decoded back into its original value. This occurs
for any value that:
   - Is retrieved by an SQL SELECT or FETCH statement, or by a native read operation.
   - Needs to be copied from a source column that has a field procedure. In that case, the field procedure
     may be invoked to decode the data prior to making the copy. Examples include the SQL statements
     ALTER TABLE or CREATE TABLE (with a LIKE or as-result-table clause) and the CL commands CPYF
     or RGZPFM.
   - Is compared to a column with a field procedure. The QAQQINI option
     FIELDPROC_ENCODED_COMPARISON is used by the optimizer to decide if the column value is
     decoded or if the variable or constant is encoded.
A field procedure is never invoked to process a null value. It is also not invoked for a DELETE
operation without a WHERE clause when the table has no DELETE triggers. The field procedure is
invoked for empty strings.


Recommendation: Avoid encoding blanks in a field procedure. When DB2 compares the values of two
strings with different lengths, it temporarily pads the shorter string with the appropriate blank characters
(for example, EBCDIC or double-byte blanks) up to the length of the longer string. If the shorter string is
the value of a column with a field procedure, padding is done to the encoded value, but the pad
character is not encoded. Therefore, if the procedure changes blanks to some other character, encoded
blanks at the end of the longer string are not equal to padded blanks at the end of the shorter string.
That situation can lead to errors; for example, some strings that ought to be equal might not be
recognized as such. Therefore, encoding blanks in a field procedure is not recommended.
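The effect of encoding blanks can be seen in a small C sketch. It uses a toy encoding (add 1 to every byte) that is not taken from any real field procedure, and a hypothetical comparison helper that pads the shorter encoded value with unencoded blanks, as the recommendation above describes.

```c
#include <stddef.h>
#include <string.h>

/* Toy encoding, for illustration only: add 1 to every byte. When
   keep_blanks is nonzero, blanks are passed through unchanged. */
void toy_encode(const char *in, char *out, size_t len, int keep_blanks)
{
    size_t i;
    for (i = 0; i < len; ++i)
        out[i] = (keep_blanks && in[i] == ' ') ? ' ' : (char)(in[i] + 1);
}

/* Compares a short encoded value with a long encoded value the way the
   text describes: the short value is treated as if it were padded with
   blanks that have NOT been encoded. Returns 1 if equal, 0 otherwise. */
int padded_equal(const char *short_enc, size_t short_len,
                 const char *long_enc, size_t long_len)
{
    size_t i;
    if (memcmp(short_enc, long_enc, short_len) != 0)
        return 0;
    for (i = short_len; i < long_len; ++i)
        if (long_enc[i] != ' ')
            return 0;
    return 1;
}
```

When 'AB' and 'AB  ' are encoded with keep_blanks set to 0, the trailing blanks of the longer value become a different character and the comparison fails, although the decoded strings are equal under blank padding. With keep_blanks set to 1, the comparison succeeds.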

Parameter list for execution of field procedures

The field procedure parameter list communicates general information to a field procedure.
The parameter list tells what operation is to be done and allows the field procedure to signal errors. DB2
provides storage for all parameters that are passed to the field procedure. Therefore, parameters are
passed to the field procedure by address.

When defining and using the parameters in the field procedure, care should be taken to ensure that no
more storage is referenced for a given parameter than is defined for that parameter. The parameters are
all stored in the same space and exceeding a given parameter's storage space can overwrite another
parameter's value. This, in turn, can cause the field procedure to see invalid input data or cause the value
returned to the database to be invalid.
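One way to respect this limit is to route every write to a parameter through a helper that checks the defined size first. The following C sketch is a hypothetical helper, not part of the SQLFP include; the capacity argument is assumed to come from the attributes that describe the parameter, such as a byte length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into the storage of a parameter only
   if it fits. 'capacity' is the number of bytes defined for that
   parameter. Returns 0 on success, or -1 without writing anything if
   the data would overrun the parameter's storage. */
int store_bounded(char *dst, size_t capacity, const char *src, size_t src_len)
{
    if (src_len > capacity)
        return -1;   /* caller should signal an error instead of writing */
    memcpy(dst, src, src_len);
    return 0;
}
```

A field procedure that receives -1 from such a helper could set the SQLSTATE parameter to report the problem rather than write past the end of the parameter.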

Parameter 1
   A small (2 byte) integer that describes the function to be performed. This parameter is input only.
   Supported values are:
   v 0 field-encoding
   v 4 field-decoding
   v 8 field-definition

Parameter 2
   A structure that defines the field procedure parameter value list (FPPVL).
   v For function code 8, this parameter is input/output.
   v For function codes 0 and 4, this parameter contains the output of the function code 8 call. This
     parameter is input only.

Parameter 3
   The decoded data attribute that is defined by the Column Value Descriptor (CVD). These are the
   column attributes that were specified at CREATE TABLE or ALTER TABLE time. This parameter
   is input only.

Parameter 4
   The decoded data. The exact structure is dependent on function code.
   v If function code 8, then the NULL value. This parameter is input only.
   v If function code 0, then the data to be encoded. This parameter is input only.
   v If function code 4, then the location to place the decoded data. This parameter is output only.

Parameter 5
   The encoded data attribute that is defined by the Field Value Descriptor (FVD).
   v If function code 8, then the structure containing the encoded data attributes. This parameter is
     output only.
   v If function code 0 or 4, then a structure containing the encoded data attributes that was
     returned by the function code 8 call. This parameter is input only.

Parameter 6
   The encoded data that is defined by the Field Value Descriptor (FVD). The exact structure is
   dependent on function code.
   v If function code 8, then the NULL value. This parameter is input only.
   v If function code 0, then the location to place the encoded data. This parameter is output only.
   v If function code 4, then the encoded form of the data. This parameter is input only.

Parameter 7
   The SQLSTATE (character(5)). This parameter is input/output.
   This parameter is set by DB2 to '00000' before calling the field procedure. It can be set by the field
   procedure. While normally the SQLSTATE is not set by a field procedure, it can be used to signal
   an error to the database as follows:
   v If the field procedure detects an error, it should set the SQLSTATE to '38xxx', where xxx may
     be one of several possible strings. For more information, see DB2 Messages and Codes.
   Warnings are not supported for field procedures.

Parameter 8
   The message text area (varchar(1000)). This parameter is input/output.
   This argument is set by DB2 to the empty string before calling the field procedure. It is a
   VARCHAR(1000) value that can be used by the field procedure to send message text back when
   an SQLSTATE error is signaled by the field procedure. Message text is ignored by DB2 unless the
   SQLSTATE parameter is set by the field procedure. The message text is assumed to be in the job
   CCSID.

Parameter 9
   A 128-byte structure containing additional information for the field procedure. This parameter is
   input only.
   This structure is set by DB2 before calling the field procedure. For field procedures that mask
   data, it indicates that the caller is a system function that requires that the data be decoded
   without masking. For example, in some cases, RGZPFM and ALTER TABLE may need to copy
   data. If the field procedure ignores this parameter and masks data when these operations are
   performed, the column data will be lost. Hence, it is critical that a field procedure that masks
   data properly handle this parameter.

The include SQLFP in QSYSINC/H describes these parameters.
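As a rough illustration of how Parameters 7 and 8 work together, the following C sketch sets a '38xxx' SQLSTATE and fills in message text. The type message_area_t is a hypothetical stand-in for the VARCHAR(1000) message text area (a 2-byte length followed by the text); the real declaration is in the SQLFP include, and the function name is likewise an assumption.

```c
#include <string.h>

/* Hypothetical stand-in for the VARCHAR(1000) message text area:
   a 2-byte length followed by up to 1000 bytes of text. The real
   layout is declared in the SQLFP include. */
typedef struct {
    unsigned short length;
    char text[1000];
} message_area_t;

/* Signals an error: copies a 5-character '38xxx' value into the
   SQLSTATE area (which has no terminator) and stores the message
   text, truncating it if it is longer than the area. */
void signal_error(char *sqlstate, message_area_t *msg,
                  const char *state38, const char *text)
{
    size_t n = strlen(text);
    if (n > sizeof msg->text)
        n = sizeof msg->text;          /* truncate rather than overrun */
    memcpy(sqlstate, state38, 5);
    memcpy(msg->text, text, n);
    msg->length = (unsigned short)n;
}
```

Because DB2 ignores the message text unless the SQLSTATE is set, the sketch always sets both together.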


The field procedure parameter value list (FPPVL):

The field procedure parameter value list communicates the literal list, supplied in the CREATE TABLE or
ALTER TABLE statement, to the field procedure during field-definition.
At that time, the field procedure can reformat the FPPVL; it is the reformatted FPPVL that is stored in
QSYS2.SYSFIELDS and communicated to the field procedure during field-encoding and field-decoding as
the modified FPPVL.
The following tables describe the FPPVL:

Table 5. sqlfpFieldProcedureParameterList_T
Name                          Offset   Data Type                                 Description
sqlfpOptParmValueListLength   0        4-byte integer                            Length in bytes of this structure.
sqlfpNumberOfOptionalParms    4        4-byte integer                            Number of value descriptors that follow. Equal to
                                                                                 the number of parameters in the FIELDPROC clause.
                                                                                 Zero if no parameters were listed.
sqlfpParmList                 8        structure                                 A list containing sqlfpNumberOfOptionalParms
                                       sqlfpOptionalParameterValueDescriptor_T   count of sqlfpOptionalParameterValueDescriptor_T
                                                                                 items.

Table 6. sqlfpOptionalParameterValueDescriptor_T
Name                 Offset   Data Type                               Description
sqlfpOptDescLength   0        4-byte integer                          Length in bytes of this structure.
sqlfpParmDesc        4        structure sqlfpParameterDescription_T   Parameter description.
reserved2            38       character(12)                           Not used.
sqlfpParmData        40                                               The optional parameter data value.
                                                                      v If the value is a varying-length string,
                                                                        the first 2 bytes contain its length.
                                                                      v If the value is a LOB or XML string,
                                                                        the first 4 bytes contain its length.
                                                                      v If the value is numeric, the internal
                                                                        numeric representation of the data.
                                                                      v If a datetime value, the value is in *ISO
                                                                        format.

Parameter value descriptors for field procedures:

A parameter value descriptor describes the data type and other attributes of a value.
Parameter value descriptors are used with field procedures in these ways:
v During field-definition, they describe each constant in the field procedure parameter value list
(FPPVL). The set of these optional parameter value descriptors is part of the FPPVL control block.
v During field-encoding and field-decoding, the decoded (column) value and the encoded (field)
attributes are described by the column value descriptor (CVD) and the field value descriptor (FVD).


The column value descriptor (CVD) contains a description of a column value. During field-encoding, the
CVD describes the value to be encoded. During field-decoding, it describes the decoded value to be
supplied by the field procedure. During field-definition, it describes the column as defined in the
CREATE TABLE or ALTER TABLE statement.

The field value descriptor (FVD) contains a description of a field value. During field-encoding, the FVD
describes the encoded value to be returned from the field procedure. During field-decoding, it describes
the value to be decoded. During field-definition, a description of the encoded value must be put into the
FVD.

The following table describes a parameter value descriptor:

Table 7. sqlfpParameterDescription_T
Name                   Offset   Data Type                 Description
sqlfpSqlType           0        2-byte integer            SQL data type of this parameter. See Appendix D of the
                                                          SQL Reference for supported values.
sqlfpByteLength        2        unsigned 4-byte integer   Length in bytes of this parameter. For datetime
                                                          parameters, the length of the string representation
                                                          of the parameter.
sqlfpLength            6        unsigned 4-byte integer   Length in characters of this parameter. If this is not
                                                          a character or graphic type, sqlfpLength and
                                                          sqlfpByteLength are the same value.
sqlfpPrecision         10       2-byte integer            Precision if this is a numeric parameter that has
                                                          precision (decimal, zoned, binary with precision and
                                                          scale).
sqlfpScale             12       2-byte integer            Scale if this is a numeric parameter that has scale
                                                          (decimal, zoned, binary with precision and scale).
                                                          Scale of 0 if this is a date or time parameter. Scale
                                                          of 6 if this is a timestamp parameter.
sqlfpCcsid             14       unsigned 2-byte integer   CCSID of this parameter if character or graphic or XML.
sqlfpAllocatedLength   16       unsigned 2-byte integer   The allocated length specified for the column on the
                                                          CREATE TABLE or ALTER TABLE statement.
reserved1              18       character(14)             Reserved.

Field-definition (function code 8):

The input provided to the field-definition operation, and the output required, are as follows:
v Parameter 1
   Input - A small (2 byte) integer that describes the function to be performed (8 - field-definition).
v Parameter 2
   Input/Output - A structure that defines the field procedure parameter value list (FPPVL). This is
   an auto-extendable space. The minimum length of this structure is 8 bytes. The maximum
   returned length of this structure is 32K.
v Parameter 3
   Input - The structure sqlfpParameterDescription_T containing the decoded data attributes.
v Parameter 4
   Not used.
v Parameter 5
   Output - The structure sqlfpParameterDescription_T containing the encoded data attributes. The
   output sqlfpParameterDescription_T must be valid with the appropriate CCSID, length, precision,
   and scale fields set.
v Parameter 6
   Not used.
v Parameter 7
   Input/Output - The SQLSTATE (character(5)).
v Parameter 8
   Input/Output - The message text area (varchar(1000)).
v Parameter 9
   Input - Reserved.

Errors returned by a field procedure result in SQLCODE -681 (SQLSTATE '23507'), which is set in the SQL
communication area (SQLCA) and in the DB2_RETURNED_SQLCODE and RETURNED_SQLSTATE
condition area items of the SQL diagnostics area. The contents of Parameters 7 and 8 are placed into the
tokens in the SQLCA field SQLERRMC and in the SQL diagnostics area condition area item
MESSAGE_TEXT. The meaning of the error message is determined by the field procedure.

Invalid data in Parameter 5, sqlfpParameterDescription_T, or an invalid length in Parameter 2 results in
SQLCODE -685 (SQLSTATE '58002'). If the database manager is unable to invoke the field procedure, then
SQLCODE -682 (SQLSTATE '57010') is returned.
The FPPVL can be redefined to suit the field procedure, and returned as the modified FPPVL, subject to
the following restriction:
v sqlfpOptParmValueListLength must contain the actual length of the modified FPPVL. If no parameter
list is returned, then sqlfpOptParmValueListLength must be set to 8.
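The length restriction for an empty modified FPPVL can be sketched in C as follows. The type fppvl_header_t is a hypothetical mirror of the two 4-byte integers at the start of Table 5, not the real sqlfpFieldProcedureParameterList_T declaration from the SQLFP include. Setting the parameter count to zero is a choice made in this sketch; the text only requires the length to be 8.

```c
#include <stdint.h>

/* Hypothetical mirror of the first two fields of Table 5. The real
   declaration is sqlfpFieldProcedureParameterList_T in the SQLFP
   include. */
typedef struct {
    int32_t list_length;   /* stands for sqlfpOptParmValueListLength */
    int32_t parm_count;    /* stands for sqlfpNumberOfOptionalParms  */
} fppvl_header_t;

/* Returns a modified FPPVL that carries no parameter list: only the
   8-byte header remains, so the length field is set to 8. */
void return_empty_fppvl(fppvl_header_t *fppvl)
{
    fppvl->parm_count = 0;
    fppvl->list_length = (int32_t)sizeof(fppvl_header_t);
}
```

A field procedure that keeps or rewrites its parameter list would instead set the length field to the actual number of bytes it returns.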

The modified FPPVL is recorded in the catalog table QSYS2.SYSFIELDS, and is passed to the field
procedure during field-encoding and field-decoding. The modified FPPVL need not have the format of a
field procedure parameter list, and it need not describe constants by optional parameter value
descriptors.

The nullability attribute of the column may not be changed.

If the encoded data attribute is a character, graphic or XML type, the CCSID value must be set to a valid
CCSID for the data type.
If the column has a non-null default value, the encoded default value must not exceed the length allowed
for the column's default value.

Field-encoding (function code 0):

The input provided to the field-encoding operation, and the output required, are as follows:
v Parameter 1
   Input - A small (2 byte) integer that describes the function to be performed (0 - field-encoding).
v Parameter 2
   Input - A structure that defines the modified field procedure parameter value list (FPPVL).
v Parameter 3
   Input - A structure described by sqlfpParameterDescription_T containing the decoded data
   attributes.
v Parameter 4
   Input - Data to be encoded.
   If the value is a varying-length string, the first 2 bytes contain its length. If the value is a LOB or
   XML, then the first 4 bytes contain the length. If the value is numeric, it is the internal numeric
   representation of the data. If it is a datetime value, the value is in *ISO format.
v Parameter 5
   Input - A structure described by sqlfpParameterDescription_T containing the encoded data
   attributes.
v Parameter 6
   Output - Location to place the encoded data.
   If the encoded value is a varying-length string, the first 2 bytes must contain the length. If the
   encoded value is a LOB or XML, then the first 4 bytes must contain the length. If the value is
   numeric, it must be the internal numeric representation of the data. If it is a datetime value, the
   value must be in *ISO format.
v Parameter 7
   Input/Output - The SQLSTATE (character(5)).
v Parameter 8
   Input/Output - The message text area (varchar(1000)).
v Parameter 9
   Input - Reserved.
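The varying-length layout described for Parameters 4 and 6 (a 2-byte length followed by the data) can be handled with a few small helpers. The C sketch below is illustrative only: the function names are hypothetical, and it assumes that the length prefix is in the machine's native byte order and counts the bytes of data that follow it.

```c
#include <stddef.h>
#include <string.h>

/* Reads the 2-byte length prefix of a varying-length value. memcpy is
   used so that no alignment of the buffer is assumed. */
unsigned short varying_length(const void *value)
{
    unsigned short len;
    memcpy(&len, value, sizeof len);
    return len;
}

/* Returns a pointer to the data that follows the 2-byte prefix. */
const char *varying_data(const void *value)
{
    return (const char *)value + sizeof(unsigned short);
}

/* Writes a varying-length value: the 2-byte length first, then the
   data. The caller must make sure that dest has room for len + 2
   bytes. */
void write_varying(void *dest, const char *data, unsigned short len)
{
    memcpy(dest, &len, sizeof len);
    memcpy((char *)dest + sizeof len, data, len);
}
```

An encode operation would read its input with varying_length and varying_data and produce its output with write_varying; LOB and XML values would need 4-byte equivalents of these helpers.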

Errors returned by a field procedure result in SQLCODE -681 (SQLSTATE '23507'), which is set in the SQL
communication area (SQLCA) and in the DB2_RETURNED_SQLCODE and RETURNED_SQLSTATE
condition area items of the SQL diagnostics area. The contents of Parameters 7 and 8 are placed into the
tokens in the SQLCA field SQLERRMC and in the SQL diagnostics area condition area item
MESSAGE_TEXT. If the database manager is unable to invoke the field procedure, then SQLCODE -682
(SQLSTATE '57010') is returned.

Field-decoding (function code 4):

The input provided to the field-decoding operation, and the output required, are as follows:
v Parameter 1
   Input - A small (2 byte) integer that describes the function to be performed (4 - field-decoding).
v Parameter 2
   Input - A structure that defines the modified field procedure parameter value list (FPPVL).
v Parameter 3
   Input - A structure described by sqlfpParameterDescription_T containing the decoded data
   attributes.
v Parameter 4
   Output - Location to place the decoded data.
   If the decoded value is a varying-length string, the first 2 bytes must contain the length. If the
   decoded value is a LOB or XML, then the first 4 bytes must contain the length. If the value is
   numeric, it must be the internal numeric representation of the data. If it is a datetime value, the
   value must be in *ISO format.
v Parameter 5
   Input - A structure described by sqlfpParameterDescription_T containing the encoded data
   attributes.
v Parameter 6
   Input - Encoded data.
   If the value is a varying-length string, the first 2 bytes contain its length. If the value is a LOB or
   XML, then the first 4 bytes contain the length. If the value is numeric, it is the internal numeric
   representation of the data. If it is a datetime value, the value is in *ISO format.
v Parameter 7
   Input/Output - The SQLSTATE (character(5)).
v Parameter 8
   Input/Output - The message text area (varchar(1000)).
v Parameter 9
   Input - Indicates that the caller is a system function that requires that the data be decoded
   without masking.

Errors returned by a field procedure result in SQLCODE -681 (SQLSTATE '23507'), which is set in the SQL
communication area (SQLCA) and in the DB2_RETURNED_SQLCODE and RETURNED_SQLSTATE
condition area items of the SQL diagnostics area. The contents of Parameters 7 and 8 are placed into the
tokens in the SQLCA field SQLERRMC and in the SQL diagnostics area condition area item
MESSAGE_TEXT. If the database manager is unable to invoke the field procedure, then SQLCODE -682
(SQLSTATE '57010') is returned.
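For a field procedure that masks data, the decision made during field-decoding might be sketched as follows. Both flags are hypothetical: may_see_clear_text stands for whatever rule the field procedure uses to decide that a user may see the real value, and system_needs_unmasked stands for the indication that DB2 passes in Parameter 9. The real field names for Parameter 9 are declared in the SQLFP include and are not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical decode step for a masking field procedure. 'clear' is
   the value after normal decoding; 'out' receives what is returned to
   the caller. */
void decode_with_mask(const char *clear, size_t len, char *out,
                      int may_see_clear_text, int system_needs_unmasked)
{
    if (may_see_clear_text || system_needs_unmasked)
        memcpy(out, clear, len);   /* return the real decoded value */
    else
        memset(out, 'X', len);     /* return a masked value instead */
}
```

The important point in this sketch is that the system indication overrides the masking rule, so that operations that copy data internally receive the real value.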

Example field procedure program:

Add field procedure FP1 to column C1. The field procedure FP1 takes one parameter, which indicates the
number of bytes of the column the field procedure should operate on.
ALTER TABLE TESTTAB ALTER C1 SET FIELDPROC FP1(10)
#include <string.h>
#include <QSYSINC/H/SQLFP>

void reverse(char *in, char *out, long length);

int main(int argc, void *argv[])
{
  short *funccode = argv[1];                                    /* Parameter 1: function code  */
  sqlfpFieldProcedureParameterList_T *optionalParms = argv[2];  /* Parameter 2: FPPVL          */
  char *sqlstate = argv[7];                                     /* Parameter 7: SQLSTATE       */
  sqlfpMessageText_T *msgtext = argv[8];                        /* Parameter 8: not used here  */
  int bytesToProcess;
  sqlfpOptionalParameterValueDescriptor_T *optionalParmPtr;

  /* FP1 expects exactly one constant in the FIELDPROC literal list */
  if (optionalParms->sqlfpNumberOfOptionalParms != 1)
  {
    memcpy(sqlstate,"38001",5);
    return 0;
  }
  /* the single parameter is the number of bytes to process */
  optionalParmPtr = (void *)&(optionalParms->sqlfpParmList);
  bytesToProcess = *((int *)&optionalParmPtr->sqlfpParmData);

  if (*funccode == 8)          /* create time */
  {
    sqlfpParameterDescription_T *inDataType = argv[3];
    sqlfpParameterDescription_T *outDataType = argv[5];
    if (inDataType->sqlfpSqlType !=452 &&
        inDataType->sqlfpSqlType !=453 )   /* only support fixed length char */
    {
      memcpy(sqlstate,"38002",5);
      return 0;
    }
    /* do something here to determine the result data type */
    /* ..... */
    /* in this example input and output types are exactly the same */
    /* so just copy */
    memcpy(outDataType, inDataType, sizeof(sqlfpParameterDescription_T));
  }
  else if (*funccode == 0)     /* encode */
  {
    char *decodedData = argv[4];
    char *encodedData = argv[6];
    reverse(decodedData, encodedData, bytesToProcess);
  }
  else if (*funccode == 4)     /* decode */
  {
    char *decodedData = argv[4];
    char *encodedData = argv[6];
    reverse(encodedData, decodedData, bytesToProcess);
  }
  else                         /* unsupported option -- error */
    memcpy(sqlstate, "38003",5);
  return 0;
}

/* Copies the first length bytes of in to out in reverse order. */
/* Reversing twice restores the original, so the same routine   */
/* serves for both encoding and decoding.                       */
void reverse(char *in, char *out, long length)
{
  int i;
  for (i=0;i<length; ++i) {
    out[length - (i+1)] = in[i];
  }
}

General guidelines for writing field procedures

The following must be considered when writing field procedures:
v The field procedure must be an ILE *PGM object. *SRVPGMs, OPM *PGMs, and Java objects are not
supported.
v Authority to the field procedure *PGM object is checked when the field procedure is added to the
table. Authority checking is not done when the field procedure is invoked.
   - Create the field procedure program so that it runs under the user profile of the user who created it.
     In this way, users who do not have the same level of authority to the program will not encounter
     errors.
   - Create the program with USRPRF(*OWNER) and *EXCLUDE public authority. Do not grant
     authorities to the field procedure program to USER(*PUBLIC). Avoid having the field procedure
     program altered or replaced by other users.
v No SQL is allowed in a field procedure.
v The field procedure will not be called if the data to be encoded or decoded is the null value.
v On an encode operation, packed decimal and zoned decimal values will be converted to the preferred
sign prior to calling the user field procedure program.
v The field procedure must be deterministic. For SQE, caching of results will occur based on the
QAQQINI option FIELDPROC_ENCODED_COMPARISON.
v The field procedure must be parallel capable and capable of running in a multi-threaded environment.
For RPG, this means the THREAD(*SERIALIZE) control specification must be specified.
v The field procedure must be capable of running in both a fenced and non-fenced environment.
v The program cannot be created with ACTGRP(*NEW). If the program is created with
ACTGRP(*CALLER), the program will run in the default activation group.
v Field procedure programs are expected to be short running. It is recommended that the field procedure
program avoid commitment control and native database operations.
v Create the program in the physical file's library.
v If an error occurs or is detected in the field procedure program, the field procedure program should set
the SQLSTATE and message text parameters. If the SQLSTATE parameter is not set to indicate an error,
the database assumes that the field procedure ran successfully. This might cause the user data to end up
in an inconsistent state.

Warning: Field procedures are a productive way both to provide application functions and to manage
information. However, field procedure programs could provide the ability for someone with devious
intentions to create a "Trojan horse" (see footnote 1) on your system. This is why it is important to
restrict who has the authority to alter a table. If you are managing object authority carefully, the typical
user will not have sufficient authority to add a field procedure program.

Index considerations:

Indexes may be recovered at IPL time based on the RECOVER parameter of the CRTPF, CRTLF, CHGPF, or
CHGLF commands. Indexes that are based on a column that has a field procedure have special
considerations.

Use of PASE (QSH) and Java within field procedures must be avoided if the index keys are built over
expressions that contain columns with field procedures or if the sparse index criteria reference a column
with an associated field procedure. If use of PASE or Java is required, consider changing the indexes to
RECOVER(*NO) so that they are not recovered during the IPL process and are recovered during an open
operation instead.

Thread considerations:

A field procedure runs in the same job as the operation that initiated the field procedure. However, the
field procedure may or may not run in a system thread (fenced or not fenced) that is separate from the
thread of the initiating request.

Because the field procedure runs in the same job, it shares much of the same environment as the
initiating request. However, because it may run under a separate thread, the following thread
considerations apply:
v Field procedures do not inherit any program adopted authority that may have been active at the time
the request was initiated. Field procedure authority comes from the authority associated with the field
procedure program or from the authority of the user running it.
v The field procedure cannot perform any operation that is blocked from being run in a secondary
thread.
v The field procedure program must be created such that it either runs under a named activation group
or in the activation group of its caller (ACTGRP parameter). Programs that specify *CALLER will run
in the default activation group.

Debug considerations:
There are some things to keep in mind when debugging field procedures.
Because field procedures can run in a secondary thread, debug them using STRSRVJOB or the graphical
debugger.

For natively run field procedures, the database manager uses the job default wait time. If the field
procedure does not return within that time, an error is returned. This default wait time may
need to be increased when debugging field procedures. For example, to change the default wait time to 5
minutes:
CHGJOB DFTWAIT(300)

Guidelines for writing field procedures that mask data

Field procedures can be used to mask data for certain users or environments when data is decoded. Field
procedures that mask data must be coded to handle special situations to ensure data is not corrupted.
The following special situations must be handled by a field procedure that masks data:

1. In history, the Trojan horse was a large hollow wooden horse that was filled with Greek soldiers. After the horse was introduced
within the walls of Troy, the soldiers climbed out of the horse and fought the Trojans. In the computer world, a program that
hides destructive functions is often called a Trojan horse.


v Field-decoding
Masking must only be performed for field-decoding. It must not be performed for field-encoding. If
masking were performed for field-encoding, the masked data would be stored in the table and the
actual value would be lost.
In some cases, system code needs to copy data internally (the data is not being returned to the user
in these cases). For example, in some cases, RGZPFM, ALTER TABLE, and CRTDUPOBJ must copy
data internally. Likewise, data passed internally to triggers must not be masked. During these
operations, when the data is read, field-decoding will occur and when the data is written,
field-encoding will occur. If masking is performed in these cases during field-decoding, the masked
data will then be written and the actual data will be lost.
To prevent corruption, the ninth parameter to the field procedure indicates whether this is a system
operation where masking must not be performed. It is critical that the field procedure be written to
check this parameter during field-decoding. If the parameter indicates that masking must not be
performed, the field procedure must not mask, regardless of the user or environment.

v Field-encoding
For native update and insert operations, the field procedure must be able to identify when masked
data is being passed to the field procedure and take special actions. For example, a field procedure
might be written to mask a credit card number column for a particular user. That same user may be
authorized to read and update the table through an RPG application that performs READ and UPDATE
operations. When the READ is performed, the credit card number is masked to prevent the user from
seeing it, but when the user performs the UPDATE, the masked data will be passed back to the database
on the UPDATE operation and the field procedure will be called to encode the data. If the field procedure
does not recognize that the value being passed is masked, the masked data would be encoded and
stored in the table, and the original value in the row would be replaced by the encoded masked
data.
To prevent corruption, the field procedure must recognize on field-encoding that the data is masked.
Instead of encoding the data, the field procedure must return a warning SQLSTATE value of 09501
in the seventh parameter.
- For an UPDATE operation, 09501 indicates to DB2 that the current value for the column should
be used.
- For an INSERT operation, 09501 indicates to DB2 that the default value should be used for the
associated column value.
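
The two rules above can be sketched in C as a pair of small helper functions. This is only an illustration, not part of the IBM example program: the names `must_mask` and `encode_is_masked`, the flag arguments, and the 12-character mask of X characters are assumptions made for this sketch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mask value: this sketch assumes the field procedure masks
   a column by overwriting it with 12 'X' characters. */
#define MASK     "XXXXXXXXXXXX"
#define MASK_LEN 12

/* Field-decoding rule: mask only when the user is not allowed to see the
   data AND the database has not flagged a system operation through the
   ninth parameter. Returns 1 to mask, 0 to return the real value. */
int must_mask(int no_mask_flag, int user_may_see_data)
{
    if (no_mask_flag)
        return 0;               /* internal copy: never mask */
    return !user_may_see_data;
}

/* Field-encoding rule: detect that the incoming value is the mask.
   When it is, the warning SQLSTATE 09501 is copied into sqlstate
   (5 bytes, not null-terminated) and the caller must skip encoding.
   Returns 1 if the value is masked, 0 otherwise. */
int encode_is_masked(const char *value, size_t len, char *sqlstate)
{
    assert(value != NULL && sqlstate != NULL);
    if (len >= MASK_LEN && memcmp(value, MASK, MASK_LEN) == 0) {
        memcpy(sqlstate, "09501", 5);
        return 1;
    }
    return 0;
}
```

A field procedure would call `must_mask` after decoding and `encode_is_masked` before encoding. Keeping the two checks separate makes it harder to mask on the encode path by accident.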

Query Considerations: There are several considerations that apply to queries that reference a column of a
table that has a field procedure that masks data:
v Depending on how the optimizer implements a query, the same query may return different rows and
values for different users or environments. This will occur in cases where the optimizer must decode the
data in order to perform comparisons or evaluate expressions in a query. If masking is performed for
one user but not for another user, the result of the decode operation will be very different, so the
resulting rows and values can also be quite different for the two users.
For example, assume that a field procedure returns (decodes) data for user profile MAIN without
masking and returns (decodes) data for user profile QUSER with masking. An application contains the
following query:
SELECT * FROM orders WHERE cardnum = 112233

By default, the optimizer will try to implement the search condition (logically) as follows:
WHERE cardnum = FieldProc ENCODE(112233)

This is the best performing implementation since it allows DB2 to compare the encoded version of the
constant 112233 with the encoded version of the CARDNUM values that are stored in the orders
table. Since the optimizer did not decode the data to perform the comparison, the query will return the
same rows for the MAIN and QUSER user profiles. The only difference will be that QUSER will see
masked values in the result rows for the CARDNUM column.

The implementation of queries that reference a field procedure column can be controlled by the
QAQQINI FIELDPROC_ENCODED_COMPARISON option. The default value for this option is
*ALLOW_EQUAL. This option enables the optimizer to implement the comparison using the encoded
values.
In the previous example, if the FIELDPROC_ENCODED_COMPARISON option was changed to
*NONE, the query would return different rows for the two users. When the value is *NONE, an equal
comparison will be implemented internally by DB2 as follows:
WHERE FieldProc DECODE(cardnum)=112233

In this case, DB2 has to decode the CARDNUM values for every row in the table to compare against
the original constant 112233. This means that the comparison for the MAIN user profile will compare
the decoded and unmasked card number values (112233, 332211, and so on) to 112233. The MAIN user
profile will find the orders associated with the specified card number (112233). However, the query will
not return any rows for the QUSER user profile. That is because the comparison for QUSER will be
comparing the masked value of the card numbers (****33, ****11, and so on) with the constant 112233.

For more information on how the QAQQINI FIELDPROC_ENCODED_COMPARISON option affects
field procedures, see the Database Performance and Query Optimization topic in the Information
Center.


v REFRESH of a materialized query table is affected by the QAQQINI
FIELDPROC_ENCODED_COMPARISON option. If the materialized query table references a column
with a field procedure that masks, it is imperative that the REFRESH of the MQT be issued by a user
that is allowed to see unmasked data. Otherwise, the results in the MQT will be incorrect for all users.
v CREATE TABLE LIKE, CREATE TABLE AS, DECLARE GLOBAL TEMPORARY TABLE LIKE, or
DECLARE GLOBAL TEMPORARY TABLE AS are affected by the QAQQINI
FIELDPROC_ENCODED_COMPARISON option. If the statements are issued by a user that is not
allowed to see unmasked data, the resulting table will contain masked data.
v OPNQRYF and Query/400 are not affected by the QAQQINI FIELDPROC_ENCODED_COMPARISON
option. The optimizer always processes by decoding values (similar to a
FIELDPROC_ENCODED_COMPARISON option of *NONE).
v Select/omit DDS-created logical files are also not affected by the QAQQINI
FIELDPROC_ENCODED_COMPARISON option. The logical file is processed by decoding values
(similar to a FIELDPROC_ENCODED_COMPARISON option of *NONE).
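
The first consideration above (the same query returning different rows for different users) can be illustrated with a small C simulation. This is a sketch under stated assumptions, not how DB2 implements the comparison: card numbers are held as 6-character strings, the "encoding" is a simple digit reversal, the mask hides all but the last two digits, and the names `count_matches`, `reverse6`, and `mask6` are invented for the illustration.

```c
#include <assert.h>
#include <string.h>

#define LEN 6   /* length of a card number in this sketch */

/* Toy encoding assumed for this sketch: reverse the digits.
   The same routine also decodes. */
static void reverse6(const char *in, char *out)
{
    for (int i = 0; i < LEN; ++i)
        out[LEN - 1 - i] = in[i];
}

/* Toy mask assumed for this sketch: hide all but the last two digits. */
static void mask6(char *value)
{
    memset(value, '*', LEN - 2);
}

/* Count the rows that satisfy WHERE cardnum = constant.
   encoded_compare != 0 models *ALLOW_EQUAL: the constant is encoded once
   and compared with the stored (encoded) values.
   encoded_compare == 0 models *NONE: every stored value is decoded, and
   masked when user_is_masked is set, before the comparison. */
int count_matches(const char stored[][LEN], int nrows, const char *constant,
                  int encoded_compare, int user_is_masked)
{
    char work[LEN];
    int matches = 0;

    assert(stored != NULL && constant != NULL);
    if (encoded_compare) {
        reverse6(constant, work);
        for (int r = 0; r < nrows; ++r)
            if (memcmp(stored[r], work, LEN) == 0)
                ++matches;
    } else {
        for (int r = 0; r < nrows; ++r) {
            reverse6(stored[r], work);          /* decode */
            if (user_is_masked)
                mask6(work);
            if (memcmp(work, constant, LEN) == 0)
                ++matches;
        }
    }
    return matches;
}
```

With stored rows for card numbers 112233 and 332211, the encoded comparison finds one row for both users, while the decoded comparison finds one row for the unmasked user and none for the masked user.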
Best Practices: There are two QAQQINI options that are strongly recommended for use if you have field
procedures that mask data:
FIELDPROC_ENCODED_COMPARISON
    The default option for FIELDPROC_ENCODED_COMPARISON is *ALLOW_EQUAL, which
    works very well for field procedures that do not mask data. If field procedures are used that do
    mask data, however, *NONE is the most secure and recommended option.
CACHE_RESULTS
    The default option for CACHE_RESULTS is *SYSTEM. In many cases, this option works well.
    However, if field procedures that mask data are used, you should specify *JOB for
    CACHE_RESULTS.
Since these two options can affect the behavior of field procedures that mask data, it is also important to
ensure that only authorized users are allowed to specify new or different QAQQINI options:
v CHGQRYA command
Verify that only authorized users can execute the CHGQRYA command. By default, only users with job
control (*JOBCTL) special authority or QIBM_DB_SQLADM function usage are authorized to
the CHGQRYA command.
v QUSRSYS/QAQQINI file
Verify that only authorized users can create the QUSRSYS/QAQQINI file or update it if it already
exists. By default, *PUBLIC has *USE authority to QUSRSYS, which is not enough authority to create
a new QUSRSYS.QAQQINI file.

Example field procedure program that masks data:


Add field procedure FP1 to column C1. The field procedure FP1 takes one additional parameter, which
indicates the number of bytes of the column the field procedure should operate on.
ALTER TABLE TESTTAB ALTER C1 SET FIELDPROC FP1(10)

#include <string.h>
#include <QSYSINC/H/SQLFP>

void reverse(char *in, char *out, long length);

int main(int argc, void *argv[])
{
  short *funccode = argv[1];
  sqlfpFieldProcedureParameterList_T *optionalParms = argv[2];
  char *sqlstate = argv[7];
  sqlfpMessageText_T *msgtext = argv[8];
  int bytesToProcess;
  sqlfpOptionalParameterValueDescriptor_T *optionalParmPtr;
  sqlfpInformation_T *info = argv[9];
  /* Set masked to 1 when the data must be masked for the current    */
  /* user or environment. That decision is not shown in this example.*/
  int masked = 0;

  if (optionalParms->sqlfpNumberOfOptionalParms != 1)
  {
    memcpy(sqlstate,"38001",5);
    return 0;
  }
  optionalParmPtr = (void *)&(optionalParms->sqlfpParmList);
  bytesToProcess = *((int *)&optionalParmPtr->sqlfpParmData);

  /*******************************************************************/
  /* CREATE CALL                                                     */
  /*******************************************************************/
  if (*funccode == 8) /* create time */
  {
    sqlfpParameterDescription_T *inDataType = argv[3];
    sqlfpParameterDescription_T *outDataType = argv[5];
    if (inDataType->sqlfpSqlType != 452 &&
        inDataType->sqlfpSqlType != 453 ) /* only support fixed length char */
    {
      memcpy(sqlstate,"38002",5);
      return 0;
    }
    /* do something here to determine the result data type */
    /* ..... */
    /* in this example input and output types are exactly the same */
    /* so just copy */
    memcpy(outDataType, inDataType, sizeof(sqlfpParameterDescription_T));
  }
  /*******************************************************************/
  /* ENCODE (WRITE) CALL                                             */
  /*******************************************************************/
  else if (*funccode == 0) /* encode */
  {
    char *decodedData = argv[4];
    char *encodedData = argv[6];
    /* Detect that the value passed on encode is masked.         */
    /* Return 09501 to tell DB that:                             */
    /* - The field should not be updated for an update operation */
    /* - The default value should be used for an insert operation*/
    if ( memcmp(decodedData, "XXXXXXXXXXXX", 12) == 0 )
    {
      memcpy(sqlstate,"09501",5);
    }
    else
    {
      reverse(decodedData, encodedData, bytesToProcess);
    }
  }
  /*******************************************************************/
  /* DECODE (READ) CALL                                              */
  /*******************************************************************/
  else if (*funccode == 4) /* decode */
  {
    char *decodedData = argv[4];
    char *encodedData = argv[6];
    /* The 9th parameter indicates that the column must not be */
    /* masked. For example, during ALTER TABLE or RGZPFM.      */
    if ( info->sqlfpNoMask == 1 )
    {
      reverse(encodedData, decodedData, bytesToProcess);
      return 0;
    }
    else
    {
      reverse(encodedData, decodedData, bytesToProcess);
      /* Mask the data when appropriate                          */
      /* Assume masked is set to 0 when it should not be masked  */
      /* and 1 when it should be masked                          */
      if (masked == 1)
      {
        memcpy(decodedData, "XXXXXXXXXXXX", 12);
      }
    }
    return 0;
  }
  /*******************************************************************/
  /* ERROR - UNSUPPORTED OPTION                                      */
  /*******************************************************************/
  else /* unsupported option -- error */
    memcpy(sqlstate, "38003",5);
  return 0;
}

/*******************************************************************/
/* REVERSE                                                         */
/*******************************************************************/
void reverse(char *in, char *out, long length)
{
  int i;
  for (i=0;i<length; ++i) {
    out[length - (i+1)] = in[i];
  }
}

Creating descriptive labels using the LABEL ON statement


Sometimes a text description is useful for an object (such as a table or an index) or useful as column text
or column headings. You can create a more descriptive label for these names by using the LABEL ON
statement.
These labels can be seen in the SQL catalog in the LABEL column.
The LABEL ON statement looks like this:
LABEL ON
  TABLE CORPDATA.DEPARTMENT IS 'Department Structure Table'
LABEL ON
  COLUMN CORPDATA.DEPARTMENT.ADMRDEPT IS 'Reports to Dept.'

After these statements are run, the table named DEPARTMENT displays the text description as
Department Structure Table and the column named ADMRDEPT displays the heading Reports to Dept. The
label for an object or a column cannot be more than 50 bytes and the label for a column heading cannot
be more than 60 bytes (blanks included). Here are examples of LABEL ON statements for column
headings:
This LABEL ON statement provides column heading 1 and column heading 2:
      *...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.EMPNO IS
     'Employee            Number'

This LABEL ON statement provides three levels of column headings for the SALARY column:
      *...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.SALARY IS
     'Yearly              Salary              (in dollars)'

This LABEL ON statement removes the column heading for SALARY:
      *...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.SALARY IS ''

This LABEL ON statement provides a DBCS column heading with two levels specified:
      *...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.SALARY IS
     '<AABBCCDD>          <EEFFGG>'

This LABEL ON statement provides the column text for the EDLEVEL column:
      *...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.EDLEVEL TEXT IS
     'Number of years of formal education'
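
The ruler lines in the column-heading examples reflect how a heading of up to 60 bytes is divided into three 20-byte segments, one per heading line. As a minimal sketch of building such a string in C, the hypothetical helper `build_heading` below (not part of any IBM API) places each heading line at the start of its own segment and pads with blanks.

```c
#include <assert.h>
#include <string.h>

#define SEGMENT_LEN 20                  /* bytes per heading line */
#define HEADING_LEN (3 * SEGMENT_LEN)   /* 60-byte column heading */

/* Build a column-heading string from up to three lines of heading text.
   Each line is copied to the start of its own 20-byte segment and the
   rest of the segment is filled with blanks. Pass "" for an unused line.
   out must have room for HEADING_LEN + 1 bytes.
   Returns 0 on success, -1 if a line does not fit in one segment. */
int build_heading(const char *line1, const char *line2, const char *line3,
                  char *out)
{
    const char *lines[3] = { line1, line2, line3 };

    assert(out != NULL);
    memset(out, ' ', HEADING_LEN);
    out[HEADING_LEN] = '\0';

    for (int i = 0; i < 3; ++i) {
        size_t n = strlen(lines[i]);
        if (n > SEGMENT_LEN)
            return -1;
        memcpy(out + i * SEGMENT_LEN, lines[i], n);
    }
    return 0;
}
```

The resulting string can be placed between quotation marks after the IS keyword of a LABEL ON COLUMN statement. Generating it this way avoids counting blanks by hand.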

Related reference:
LABEL

Describing an SQL object using COMMENT ON


After you create an SQL object, such as a table or view, you can provide object information for future
reference using the COMMENT ON statement.
The information can be the purpose of the object, who uses it, and anything unusual or special about it.
You can also include similar information about each column of a table or view. A comment is especially
useful if your names do not clearly indicate the contents of the columns or objects. In that case, use a
comment to describe the specific contents of the column or objects. Usually, your comment must not be
more than 2000 characters, but for sequences the maximum length is 500 characters. If the object already
contains a comment, the old comment is replaced by the new one.
An example of using COMMENT ON follows:
COMMENT ON TABLE CORPDATA.EMPLOYEE IS
  'Employee table. Each row in this table represents
  one employee of the company.'

Getting comments after running a COMMENT ON statement


After you run a COMMENT ON statement for a table, your comments are stored in the
LONG_COMMENT column of SYSTABLES. Comments for the other objects are stored in the
LONG_COMMENT column of the appropriate catalog table. The following example gets the comments
that are added by the COMMENT ON statement in the previous example:
SELECT LONG_COMMENT
  FROM CORPDATA.SYSTABLES
  WHERE NAME = 'EMPLOYEE'

Related reference:
COMMENT

Changing a table definition


You can change the definition of a table by adding a new column, changing an existing column definition
such as its length or default value, dropping an existing column, adding or removing constraints,
changing partitioning for the table, or altering one of many other options.
To change a table definition, you can use the ALTER TABLE statement. You can add, change, or drop
columns and add or remove constraints all with one ALTER TABLE statement. You can also include other
options to define or change partitioning, add a materialized query table definition, or change your media
preference for the table to use solid state disk storage media.
A second option for changing a table definition is to use the CREATE OR REPLACE TABLE statement.
With CREATE OR REPLACE TABLE, all the changes to the table definition are made with the standard
CREATE TABLE statement. If the table already exists, any changes between the new table definition and
the existing table definition are recognized and applied to the existing table. For example, based on the
content of the CREATE OR REPLACE TABLE statement, several new columns could be added, the data
type of a column could be changed, and a primary key could be defined.
Related reference:
ALTER TABLE

Adding a column
When you add a new column to a table, the column is initialized with its default value for all existing
rows. If NOT NULL is specified, a default value must also be specified.
You can add a column to a table using the ADD COLUMN clause of the SQL ALTER TABLE statement.
The altered table may consist of up to 8000 columns. The sum of the byte counts of the columns must not
be greater than 32766 or, if a VARCHAR or VARGRAPHIC column is specified, 32740. If a LOB column is
specified, the sum of record data byte counts of the columns must not be greater than 15 728 640.
Related reference:
ALTER TABLE

Changing a column
You can change a column definition in a table using the ALTER COLUMN clause of the ALTER TABLE
statement.
When you change the data type of an existing column, the old and new attributes must be compatible.
You can always change a character, graphic, or binary column from fixed length to varying length or
LOB; or from varying length or LOB to fixed length.
When you convert to a data type with a longer length, data is padded with the appropriate pad
character. When you convert to a data type with a shorter length, data might be lost because of
truncation. An inquiry message prompts you to confirm the request.
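
The effect of a length change on a fixed-length character value can be sketched in C as follows. This is an illustration only: the function name `resize_fixed_char` is hypothetical, and the sketch assumes that the pad character is a blank.

```c
#include <assert.h>
#include <string.h>

/* Convert a fixed-length character value to a column of a different
   length. Assumption for this sketch: the pad character is a blank.
   A longer target is padded on the right; a shorter target truncates.
   Returns the number of bytes dropped by truncation (0 if none). */
size_t resize_fixed_char(const char *old_val, size_t old_len,
                         char *new_val, size_t new_len)
{
    assert(old_val != NULL && new_val != NULL);
    if (new_len >= old_len) {
        memcpy(new_val, old_val, old_len);
        memset(new_val + old_len, ' ', new_len - old_len);
        return 0;
    }
    memcpy(new_val, old_val, new_len);
    return old_len - new_len;
}
```

A nonzero return value corresponds to the case in which data might be lost and the change needs to be confirmed.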


If you have a column that does not allow the null value and you want to change it to now allow the null
value, use the DROP NOT NULL clause. If you have a column that allows the null value and you want
to prevent the use of null values, use the SET NOT NULL clause. If any of the existing values in that
column are the null value, the ALTER TABLE will not be performed and an SQLCODE of -190 will result.
Related reference:
Allowable conversions of data types
When you change the data type of an existing column, the old and new attributes must be compatible.
Related information:
ALTER TABLE

Allowable conversions of data types


When you change the data type of an existing column, the old and new attributes must be compatible.
Table 8. Allowable conversions
From data type              To data type
Decimal                     Numeric
Decimal                     Bigint, Integer, Smallint
Decimal                     Decfloat
Decimal                     Float
Numeric                     Decimal
Numeric                     Bigint, Integer, Smallint
Numeric                     Decfloat
Numeric                     Float
Bigint, Integer, Smallint   Decimal
Bigint, Integer, Smallint   Numeric
Bigint, Integer, Smallint   Decfloat
Bigint, Integer, Smallint   Float
Float                       Decimal
Float                       Numeric
Float                       Bigint, Integer, Smallint
Float                       Decfloat
Character                   DBCS-open
Character                   UCS-2 or UTF-16 graphic
DBCS-open                   Character
DBCS-open                   UCS-2 or UTF-16 graphic
DBCS-either                 Character
DBCS-either                 DBCS-open
DBCS-either                 UCS-2 or UTF-16 graphic
DBCS-only                   DBCS-open
DBCS-only                   DBCS graphic
DBCS-only                   UCS-2 or UTF-16 graphic
DBCS graphic                UCS-2 or UTF-16 graphic
UCS-2 or UTF-16 graphic     Character
UCS-2 or UTF-16 graphic     DBCS-open
UCS-2 or UTF-16 graphic     DBCS graphic
distinct type               source type
source type                 distinct type

When you change an existing column, only the attributes that you specify are changed. All other
attributes remain unchanged. For example, you have a table with the following table definition:
CREATE TABLE EX1 (COL1 CHAR(10) DEFAULT 'COL1',
                  COL2 VARCHAR(20) ALLOCATE(10) CCSID 937,
                  COL3 VARGRAPHIC(20) ALLOCATE(10)
                       NOT NULL WITH DEFAULT)

After you run the following ALTER TABLE statement, COL2 still has an allocated length of 10 and
CCSID 937, and COL3 still has an allocated length of 10.
ALTER TABLE EX1 ALTER COLUMN COL2 SET DATA TYPE VARCHAR(30)
ALTER COLUMN COL3 DROP NOT NULL

Related reference:
Changing a column on page 44
You can change a column definition in a table using the ALTER COLUMN clause of the ALTER TABLE
statement.

Deleting a column
You can delete a column using the DROP COLUMN clause of the ALTER TABLE statement.
Dropping a column deletes that column from the table definition. If CASCADE is specified, any views,
indexes, and constraints dependent on that column will also be dropped. If RESTRICT is specified, and
any views, indexes, or constraints are dependent on the column, the column will not be dropped and
SQLCODE of -196 will be issued.
ALTER TABLE DEPT
DROP COLUMN NUMDEPT

Related reference:
ALTER TABLE

Order of operations for the ALTER TABLE statement


Operations for the ALTER TABLE statement are performed in a defined order.
The ALTER TABLE statement is performed as this set of steps shows:
1. Drop constraints
2. Drop materialized query table
3. Drop partition information
4. Drop columns for which the RESTRICT option was specified
5. Alter column definitions (this includes adding columns and dropping columns for which the
CASCADE option was specified)
6. Add or alter materialized query table
7. Add partitioning to a table
8. Add constraints
Within each of these steps, the order in which you specify the clauses is the order in which they are
performed, with one exception. If any columns are being dropped, that operation is logically done before
any column definitions are added or altered, in case record length is increased as a result of the ALTER
TABLE statement.
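
The ordering rules above can be sketched in C as a stable sort of the clauses in a statement. This is a hedged illustration of the documented order, not of how the database manager is implemented: the `alter_step` enumeration, the `clause` structure, and `order_clauses` are names invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* The eight steps, numbered as in the list above. */
enum alter_step {
    DROP_CONSTRAINT = 1,
    DROP_MQT,
    DROP_PARTITIONING,
    DROP_COLUMN_RESTRICT,
    ALTER_COLUMNS,          /* add, alter, and DROP COLUMN ... CASCADE */
    ADD_OR_ALTER_MQT,
    ADD_PARTITIONING,
    ADD_CONSTRAINT
};

struct clause {
    enum alter_step step;
    int is_column_drop;     /* 1 for a column drop, otherwise 0 */
    const char *text;       /* clause text, for display only */
};

/* Returns 1 when clause a must be performed before clause b. */
static int runs_before(const struct clause *a, const struct clause *b)
{
    if (a->step != b->step)
        return a->step < b->step;
    /* The one exception: within a step, column drops go first. */
    return a->is_column_drop && !b->is_column_drop;
}

/* Stable insertion sort: clauses that are not ordered by the rules
   keep the order in which they were specified. */
void order_clauses(struct clause *c, size_t n)
{
    assert(c != NULL || n == 0);
    for (size_t i = 1; i < n; ++i) {
        struct clause key = c[i];
        size_t j = i;
        while (j > 0 && runs_before(&key, &c[j - 1])) {
            c[j] = c[j - 1];
            --j;
        }
        c[j] = key;
    }
}
```

For example, a statement that specifies ADD CONSTRAINT, ADD COLUMN, DROP COLUMN ... CASCADE, and DROP CONSTRAINT in that order is processed as DROP CONSTRAINT, DROP COLUMN, ADD COLUMN, ADD CONSTRAINT.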


Using CREATE OR REPLACE TABLE

The OR REPLACE option on the CREATE TABLE statement can be used to change an existing table
definition.

Using CREATE OR REPLACE TABLE lets you consolidate the master definition of a table into one
statement. You do not need to maintain the source for the original CREATE TABLE statement plus a
complex list of ALTER TABLE statements needed to recreate the most current version of a table. This
CREATE TABLE statement can be executed to deploy the current definition of the table either as a new
table or to replace a prior version of the table.

There are options to either keep the existing data in the table or to clear the data from the table during
the replace. The default is to keep all data. If you elect to clear all the data, your new table definition
does not need to be compatible with the original version. In all cases, other objects that depend on the
table, such as referential constraints, triggers, and views, must remain satisfied or the replace will fail.

Suppose your original table was this basic INVENTORY table.

CREATE TABLE INVENTORY
  (PARTNO  SMALLINT NOT NULL,
   DESCR   VARCHAR(24),
   QONHAND INT,
   PRIMARY KEY(PARTNO))

Perhaps over time, you have updated the column names to be more descriptive, changed the DESCR
column to be a longer Unicode column, and added a timestamp column for when the row was last
updated. The following statement reflects all of these changes and can be executed against any prior
version of the table, as long as the column names can be matched to the prior column names and the
data types are compatible.

CREATE OR REPLACE TABLE INVENTORY
  (PART_NUMBER FOR PARTNO       SMALLINT NOT NULL,
   DESCRIPTION FOR DESCR        VARGRAPHIC(500) CCSID 1200,
   QUANTITY_ON_HAND FOR QONHAND INT,
   LAST_MODIFIED FOR MODIFIED   TIMESTAMP
       NOT NULL GENERATED ALWAYS FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP,
   PRIMARY KEY(PARTNO))

Partitioned tables can be modified using CREATE OR REPLACE TABLE. The following example
demonstrates splitting a single partition into multiple partitions.

Suppose your original table was defined to have 3 partitions:

CREATE TABLE PARTITIONED
  (KEYFLD  BIGINT,
   DATAFLD VARCHAR(200))
  PARTITION BY RANGE (KEYFLD)
    (PARTITION FIRST STARTING 0 ENDING 100,
     PARTITION SECOND STARTING 100 EXCLUSIVE ENDING 200,
     PARTITION THIRD STARTING 200 EXCLUSIVE ENDING 300)

To break the second partition into 3 pieces, modify the original CREATE TABLE statement to redefine the
partitions.

CREATE OR REPLACE TABLE PARTITIONED
  (KEYFLD  BIGINT,
   DATAFLD VARCHAR(200))
  PARTITION BY RANGE (KEYFLD)
    (PARTITION FIRST STARTING 0 ENDING 100,
     PARTITION SECOND STARTING 100 EXCLUSIVE ENDING 150,
     PARTITION SPLIT1 STARTING 151 EXCLUSIVE ENDING 175,
     PARTITION SPLIT2 STARTING 176 EXCLUSIVE ENDING 200,
     PARTITION THIRD STARTING 200 EXCLUSIVE ENDING 300)

Now the table will have 5 partitions with the data spread among them according to the new definition.

This example uses the default of ON REPLACE PRESERVE ALL ROWS. That means that all data for all
rows is guaranteed to be kept. If data from an existing partition doesn't fit in any new partition, the
statement fails. To remove a partition and the data from that partition, omit the partition definition from
the CREATE OR REPLACE TABLE statement and use ON REPLACE PRESERVE ROWS. This will
preserve all the data that can be assigned to the remaining partitions and discard any that no longer has
a defined partition.
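
Whether a row "fits" a new partition definition comes down to a range check on the partitioning key. The following C sketch shows that check under the assumption that a STARTING bound is inclusive unless EXCLUSIVE is specified and that an ENDING bound is inclusive. The `range_partition` structure and `find_partition` are hypothetical names, not DB2 interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* One range partition: STARTING start [EXCLUSIVE] ENDING end.
   The ending bound is treated as inclusive in this sketch. */
struct range_partition {
    const char *name;
    long long   start;
    int         start_exclusive;   /* 1 when EXCLUSIVE was specified */
    long long   end;
};

/* Return the index of the partition that holds key, or -1 when the key
   falls in no partition. With ON REPLACE PRESERVE ALL ROWS, a row whose
   key has no partition would cause the statement to fail. */
int find_partition(const struct range_partition *p, int n, long long key)
{
    assert(p != NULL);
    for (int i = 0; i < n; ++i) {
        int above_start = p[i].start_exclusive ? key > p[i].start
                                               : key >= p[i].start;
        if (above_start && key <= p[i].end)
            return i;
    }
    return -1;
}
```

Applied to the five-partition definition as printed above, this check places key 150 in SECOND and key 152 in SPLIT1. Under this reading of EXCLUSIVE, the keys 151 and 176 fall in no partition, so existing rows with those key values would not fit the new definition.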

Creating and using ALIAS names


When you refer to an existing table or view, or to a physical file that consists of multiple members, you
can avoid using file overrides by creating an alias. To create an alias, use the CREATE ALIAS statement.
You can create an alias for:
v A table or view
v A member of a table
A table alias defines a name for the file, including the specific member name. You can use this alias name
in an SQL statement in the same way that a table name is used. Unlike overrides, alias names are objects
that exist until they are dropped.
For example, if there is a multiple member file MYLIB.MYFILE with members MBR1 and MBR2, an alias
can be created for the second member so that SQL can easily refer to it.
CREATE ALIAS MYLIB.MYMBR2_ALIAS FOR MYLIB.MYFILE (MBR2)

When alias MYLIB.MYMBR2_ALIAS is specified on the following insert statement, the values are inserted
into member MBR2 in MYLIB.MYFILE:
INSERT INTO MYLIB.MYMBR2_ALIAS VALUES('ABC', 6)

Alias names can also be specified on DDL statements. Assume that MYLIB.MYALIAS is an alias for table
MYLIB.MYTABLE. The following DROP statement drops table MYLIB.MYTABLE:
DROP TABLE MYLIB.MYALIAS

If you really want to drop the alias name instead, specify the ALIAS keyword on the drop statement:
DROP ALIAS MYLIB.MYALIAS

Related reference:
CREATE ALIAS

Creating and using views


A view can be used to access data in one or more tables or views. You create a view by using a SELECT
statement.
For example, create a view that selects only the family name and the department of all the managers:
CREATE VIEW CORPDATA.EMP_MANAGERS FOR SYSTEM NAME EMPMANAGER AS
  SELECT LASTNAME, WORKDEPT FROM CORPDATA.EMPLOYEE
  WHERE JOB = 'MANAGER'

Since the view name, EMP_MANAGERS, is too long for a system object name, the FOR SYSTEM NAME
clause can be used to provide the system name. Without adding this clause, a name like EMP_M00001
will be generated for the system object.

After you create the view, you can use it in SQL statements just like a table. You can also change the data
in the base table through the view. The following SELECT statement displays the contents of
EMP_MANAGERS:
SELECT *
FROM CORPDATA.EMP_MANAGERS

The results follow.

LASTNAME     WORKDEPT
THOMPSON     B01
KWAN         C01
GEYER        E01
STERN        D11
PULASKI      D21
HENDERSON    E11
SPENSER      E21

If the select list contains elements other than columns such as expressions, functions, constants, or special
registers, and the AS clause was not used to name the columns, a column list must be specified for the
view. In the following example, the columns of the view are LASTNAME and YEARSOFSERVICE.
CREATE VIEW CORPDATA.EMP_YEARSOFSERVICE
(LASTNAME, YEARSOFSERVICE) AS
SELECT LASTNAME, YEAR (CURRENT DATE - HIREDATE)
FROM CORPDATA.EMPLOYEE

Because the results of querying this view change as the current year changes, they are not included here.
You can also define the previous view by using the AS clause in the select list to name the columns in the
view. For example:
CREATE VIEW CORPDATA.EMP_YEARSOFSERVICE AS
  SELECT LASTNAME,
         YEAR (CURRENT DATE - HIREDATE) AS YEARSOFSERVICE
  FROM CORPDATA.EMPLOYEE

Using the UNION keyword, you can combine two or more subselects to form a single view. For example:
CREATE VIEW D11_EMPS_PROJECTS AS
  (SELECT EMPNO
   FROM CORPDATA.EMPLOYEE
   WHERE WORKDEPT = 'D11'
   UNION
   SELECT EMPNO
   FROM CORPDATA.EMPPROJACT
   WHERE PROJNO = 'MA2112' OR
         PROJNO = 'MA2113' OR
         PROJNO = 'AD3111')

This view has the following data.

Table 9. Results of creating a view as UNION
EMPNO
000060
000150
000160
000170
000180
000190
000200
000210
000220
000230
000240
200170
200220

Views are created with the sort sequence in effect at the time the CREATE VIEW statement is run. The
sort sequence applies to all character, or UCS-2 or UTF-16 graphic comparisons in the CREATE VIEW
statement subselect.
You can also create views using the WITH CHECK OPTION clause to specify the level of checking when
data is inserted or updated through the view.
Related concepts:
Retrieving data using the SELECT statement on page 56
The SELECT statement tailors your query to gather data. You can use the SELECT statement to retrieve a
specific row or retrieve data in a specific way.
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Using the UNION keyword to combine subselects on page 95
Using the UNION keyword, you can combine two or more subselects to form a fullselect.
CREATE VIEW

WITH CHECK OPTION on a view


WITH CHECK OPTION is an optional clause on the CREATE VIEW statement. It specifies the level of
checking when data is inserted or updated through a view.
If WITH CHECK OPTION is specified, every row that is inserted or updated through the view must
conform to the definition of the view. The option cannot be specified if the view is read-only. The
definition of the view must not include a subquery.
If the view is created without a WITH CHECK OPTION clause, insert and update operations that are
performed on the view are not checked for conformance to the view definition. Some checking might still
occur if the view is directly or indirectly dependent on another view that includes WITH CHECK
OPTION. Because the definition of the view is not used, rows that do not conform to the definition of the
view might be inserted or updated through the view. This means that the rows cannot be selected again
through the view.
Related reference:
CREATE VIEW


WITH CASCADED CHECK OPTION:


The WITH CASCADED CHECK OPTION clause specifies that every row that is inserted or updated
through a view must conform to the definition of the view.
In addition, the search conditions of all views on which that view depends, directly or indirectly, are checked when a row is inserted or updated.
If a row does not conform to the definition of the view, that row cannot be retrieved through the view.
For example, consider the following updatable view:
CREATE VIEW V1 AS SELECT COL1
FROM T1 WHERE COL1 > 10

Because no WITH CHECK OPTION is specified, the following INSERT statement is successful even
though the value being inserted does not meet the search condition of the view.
INSERT INTO V1 VALUES (5)

Create another view over V1, specifying the WITH CASCADED CHECK OPTION clause:
CREATE VIEW V2 AS SELECT COL1
FROM V1 WITH CASCADED CHECK OPTION

The following INSERT statement fails because it produces a row that does not conform to the definition
of V2:
INSERT INTO V2 VALUES (5)

Consider one more view created over V2:


CREATE VIEW V3 AS SELECT COL1
FROM V2 WHERE COL1 < 100

The following INSERT statement fails only because V3 is dependent on V2, and V2 has a WITH
CASCADED CHECK OPTION.
INSERT INTO V3 VALUES (5)

However, the following INSERT statement is successful because it conforms to the definition of V2.
Because V3 does not have a WITH CASCADED CHECK OPTION, it does not matter that the statement
does not conform to the definition of V3.
INSERT INTO V3 VALUES (200)

WITH LOCAL CHECK OPTION:


The WITH LOCAL CHECK OPTION clause is identical to the WITH CASCADED CHECK OPTION
clause except that you can update a row so that it can no longer be retrieved through the view. This can
happen only when the view is directly or indirectly dependent on a view that was defined with no WITH
CHECK OPTION clause.
For example, consider the same updatable view used in the previous example:
CREATE VIEW V1 AS SELECT COL1
FROM T1 WHERE COL1 > 10

Create a second view over V1, this time specifying WITH LOCAL CHECK OPTION:
CREATE VIEW V2 AS SELECT COL1
FROM V1 WITH LOCAL CHECK OPTION

The same INSERT statement that failed in the previous CASCADED CHECK OPTION example succeeds
now because V2 does not have any search conditions, and the search conditions of V1 do not need to be
checked since V1 does not specify a check option.
INSERT INTO V2 VALUES (5)

Consider one more view created over V2:


CREATE VIEW V3 AS SELECT COL1
FROM V2 WHERE COL1 < 100

The following INSERT is successful again because the search condition on V1 is not checked due to the
WITH LOCAL CHECK OPTION on V2, versus the WITH CASCADED CHECK OPTION in the previous
example.
INSERT INTO V3 VALUES (5)

The difference between LOCAL and CASCADED CHECK OPTION lies in how many search conditions of the
views on which the view depends are checked when a row is inserted or updated.
• WITH LOCAL CHECK OPTION specifies that the search conditions of only those underlying views
  that have the WITH LOCAL CHECK OPTION or WITH CASCADED CHECK OPTION are checked
  when a row is inserted or updated.
• WITH CASCADED CHECK OPTION specifies that the search conditions of all underlying views are
  checked when a row is inserted or updated.
Example: Cascaded check option:
This example shows how the check option is enforced on a number of dependent views that are defined
with or without a check option.
Use the following table and views:
CREATE TABLE T1 (COL1 CHAR(10))
CREATE VIEW V1 AS SELECT COL1
FROM T1 WHERE COL1 LIKE 'A%'
CREATE VIEW V2 AS SELECT COL1
FROM V1 WHERE COL1 LIKE '%Z'
WITH LOCAL CHECK OPTION
CREATE VIEW V3 AS SELECT COL1
FROM V2 WHERE COL1 LIKE 'AB%'
CREATE VIEW V4 AS SELECT COL1
FROM V3 WHERE COL1 LIKE '%YZ'
WITH CASCADED CHECK OPTION
CREATE VIEW V5 AS SELECT COL1
FROM V4 WHERE COL1 LIKE 'ABC%'

Different search conditions are going to be checked depending on which view is being operated on with
an INSERT or UPDATE statement.
• If V1 is operated on, no conditions are checked because V1 does not have a WITH CHECK OPTION
  specified.
• If V2 is operated on, COL1 must end in the letter 'Z', but it does not need to start with the letter 'A'.
  This is because the check option is LOCAL, and view V1 does not have a check option specified.
• If V3 is operated on, COL1 must end in the letter 'Z', but it does not need to start with the letter 'A'.
  V3 does not have a check option specified, so its own search condition does not need to be met.
  However, the search condition for V2 must be checked because V3 is defined on V2, and V2 has a
  check option.
• If V4 is operated on, COL1 must start with 'AB' and must end with 'YZ'. Because V4 has the WITH
  CASCADED CHECK OPTION specified, every search condition for every view on which V4 is
  dependent must be checked.
• If V5 is operated on, COL1 must start with 'AB', but not necessarily 'ABC'. This is because V5 does not
  specify a check option, so its own search condition does not need to be checked. However, because V5
  is defined on V4, and V4 has a cascaded check option, every search condition for V4, V3, V2, and V1
  must be checked. That is, COL1 must start with 'AB' and end with 'YZ'.
  If V5 were created WITH LOCAL CHECK OPTION, operating on V5 means that COL1 must start with
  'ABC' and end with 'YZ'. The LOCAL CHECK OPTION adds the additional requirement that the third
  character must be a 'C'.
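The rules above can be exercised with a few INSERT statements. The following is a sketch only: it
assumes that T1 and V1 through V5 exist exactly as defined in this example, and the inserted values are
made up for illustration. Each value is a full 10 characters long because COL1 is CHAR(10); a shorter
value is padded with trailing blanks, which would keep it from matching a pattern such as '%Z'.

```sql
-- Assumes T1 and views V1 through V5 from the example above.
-- The values are illustrative; each is exactly 10 characters long.

-- V2 (LOCAL): only V2's own condition (ends in 'Z') is checked.
INSERT INTO V2 VALUES ('QRSTUVWXYZ');   -- accepted

-- V4 (CASCADED): the conditions of V4, V3, V2, and V1 are all checked.
INSERT INTO V4 VALUES ('QRSTUVWXYZ');   -- rejected: does not start with 'AB'
INSERT INTO V4 VALUES ('ABDEFGHIYZ');   -- accepted

-- V5 (no check option): its own 'ABC%' condition is not checked,
-- but the cascaded check on V4 still applies.
INSERT INTO V5 VALUES ('ABDEFGHIYZ');   -- accepted
INSERT INTO V5 VALUES ('ABCDEFGHIJ');   -- rejected: does not end with 'YZ'
```

The semicolons are statement separators for a script runner; they are not part of the statements.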

Creating indexes
You can use indexes to sort and select data. In addition, indexes help the system retrieve data faster for
better query performance.
Use the CREATE INDEX statement to create indexes. The following example creates an index over the
column LASTNAME in the CORPDATA.EMPLOYEE table:
CREATE INDEX CORPDATA.INX1 ON CORPDATA.EMPLOYEE (LASTNAME)

You can also create an index that does not exactly match the data for a column in a table. For example,
you can create an index that uses the uppercase version of an employee name:
CREATE INDEX CORPDATA.INX2 ON CORPDATA.EMPLOYEE (UPPER(LASTNAME))

Most expressions allowed by SQL can be used in the definition of the key columns.
You can create any number of indexes. However, because the indexes are maintained by the system, a
large number of indexes can adversely affect performance. One type of index, the encoded vector index
(EVI), allows for faster scans that can be more easily processed in parallel.
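As a sketch of these two ideas, the following statements show a query whose predicate uses the same
expression as the key of INX2, and the creation of an encoded vector index. The index name INX3 and
the choice of WORKDEPT as its key column are assumptions made for illustration. Whether the
optimizer actually uses either index depends on the query and the data.

```sql
-- A predicate written with the same expression as the INX2 key,
-- which gives the optimizer the option of using that index.
SELECT EMPNO, LASTNAME
  FROM CORPDATA.EMPLOYEE
  WHERE UPPER(LASTNAME) = 'HAAS';

-- An encoded vector index (EVI). INX3 is a hypothetical name;
-- WORKDEPT is picked here because it holds few distinct values.
CREATE ENCODED VECTOR INDEX CORPDATA.INX3
  ON CORPDATA.EMPLOYEE (WORKDEPT);
```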
If an index is created that has exactly the same attributes as an existing index, the new index shares the
existing index's binary tree. Otherwise, another binary tree is created. If the attributes of the new index
are exactly the same as another index, except that the new index has fewer columns, another binary tree
is still created. It is still created because the extra columns prevent the index from being used by cursors
or UPDATE statements that update those extra columns.
Indexes are created with the sort sequence in effect at the time the CREATE INDEX statement is run. The
sort sequence applies to all SBCS character fields, or UCS-2 or UTF-16 graphic fields of the index.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Creating an index strategy
Related reference:
CREATE INDEX

Creating and using global variables

You can use global variables to assign specific variable values for a session.


Use the CREATE VARIABLE statement to create a global variable. The following example creates a global
variable that defines a user class.
CREATE VARIABLE USER_CLASS INT DEFAULT (CLASS_FUNC(USER))

This variable will have its initial value set based on the result of invoking a function called
CLASS_FUNC. This function is assumed to assign a class value such as administrator or clerk based on
the USER special register value.
A global variable is instantiated for a session the first time it is referenced. Once it is set, it will maintain
its value unless explicitly changed within the session.
A global variable can be used in a query to determine what results will be returned. In the following
example, all employees from department A00 are listed. Only a session in which the USER_CLASS global
variable has a value of 1 will see the salaries for these employees.
SELECT EMPNO, LASTNAME, CASE WHEN USER_CLASS = 1 THEN SALARY ELSE NULL END
FROM EMPLOYEE
WHERE WORKDEPT = 'A00'

Global variables can be used in any context where an expression is allowed. Unlike a host variable, a
global variable can be used in a CREATE VIEW statement.
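The following sketch shows one way to work with the USER_CLASS variable defined above. It assumes
that the variable already exists and that the session is authorized to change it. The view name
EMP_SALARIES is hypothetical, and SYSIBM.SYSDUMMY1 is used only as a convenient one-row table.

```sql
-- Change the value for the current session only.
SET USER_CLASS = 1;

-- Display the current value of the variable.
SELECT USER_CLASS FROM SYSIBM.SYSDUMMY1;

-- Reference the global variable in a view definition.
-- EMP_SALARIES is a hypothetical name. The variable is resolved
-- when the view is queried, so each session sees salaries or nulls
-- according to its own USER_CLASS value.
CREATE VIEW EMP_SALARIES AS
  SELECT EMPNO, LASTNAME,
         CASE WHEN USER_CLASS = 1 THEN SALARY ELSE NULL END AS SALARY
    FROM EMPLOYEE;
```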

Replacing existing objects

You can replace an existing object using a CREATE statement rather than always needing to drop the
object first.
For many SQL objects, you can optionally replace an existing object when using the CREATE SQL
statement. The existing object is effectively dropped before the new object is created. The following SQL
statements have that option:
• CREATE ALIAS
• CREATE FUNCTION
• CREATE PROCEDURE
• CREATE SEQUENCE
• CREATE TRIGGER
• CREATE VARIABLE
• CREATE VIEW
When the replace option is used for any of these statements, the privileges for the existing object are
kept. The object definition is replaced by the new definition.

Example: Create or replace sequence

To create a sequence called MYSEQUENCE, or replace a sequence of that name if it exists, use the
following SQL statement:
CREATE OR REPLACE SEQUENCE MYSEQUENCE AS BIGINT

The sequence will be created if it does not already exist. If it does exist, the privileges from the existing
sequence will be transferred to the new sequence.
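The same option is available on the other statements in the list. As a sketch, the following statement
creates a view, or replaces it if it already exists. The view name D11_EMPS is an assumption made for
illustration.

```sql
-- D11_EMPS is a hypothetical view name.
-- If the view already exists, its definition is replaced
-- and the privileges granted on it are kept.
CREATE OR REPLACE VIEW CORPDATA.D11_EMPS AS
  SELECT EMPNO, LASTNAME
    FROM CORPDATA.EMPLOYEE
    WHERE WORKDEPT = 'D11';
```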

Catalogs in database design


A catalog is automatically created when you create a schema. There is also a system-wide catalog that is
always in the QSYS2 library.
When an SQL object is created in a schema, information is added to both the system catalog tables and
the schema's catalog tables. When an SQL object is created in a library, only the QSYS2 catalog is
updated. A table created with DECLARE GLOBAL TEMPORARY TABLE is not added to a catalog.


As the following examples show, you can display catalog information. You cannot insert, delete, or
update catalog information. You must have SELECT privileges on the catalog views to run the following
examples.
Related reference:
DB2 for i5/OS catalog views

Getting catalog information about a table


The SYSTABLES view contains a row for each table and view in the SQL schema. The SYSTABLES view
provides information such as the object type (table or view), the object name, the owner of the object, and
the schema where the object exists.
The following example statement displays information for the CORPDATA.DEPARTMENT table:
SELECT *
FROM CORPDATA.SYSTABLES
WHERE TABLE_NAME = 'DEPARTMENT'
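
To narrow the result, you can name the catalog columns you want and filter on them. The following
sketch lists only the views in the CORPDATA schema. It assumes that the TABLE_TYPE column of
SYSTABLES holds the value 'V' for views, as described for the DB2 for i catalog views.

```sql
-- List the views (TABLE_TYPE = 'V') recorded in the schema's catalog.
SELECT TABLE_NAME, TABLE_OWNER, TABLE_TYPE
  FROM CORPDATA.SYSTABLES
  WHERE TABLE_TYPE = 'V'
  ORDER BY TABLE_NAME;
```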

Getting catalog information about a column


The SYSCOLUMNS view contains a row for each column of each table and view in the schema.
The following example statement displays all the column names in the CORPDATA.DEPARTMENT table:
SELECT *
FROM CORPDATA.SYSCOLUMNS
WHERE TABLE_NAME = 'DEPARTMENT'

The result of the previous example statement is a row of information for each column in the table.
For specific information about each column, specify a select-statement like this:
SELECT COLUMN_NAME, TABLE_NAME, DATA_TYPE, LENGTH, HAS_DEFAULT
FROM CORPDATA.SYSCOLUMNS
WHERE TABLE_NAME = 'DEPARTMENT'

In addition to the column name for each column, the select-statement shows:
• The name of the table that contains the column
• The data type of the column
• The length attribute of the column
• Whether the column has a default value
The result looks like this.
COLUMN_NAME  TABLE_NAME  DATA_TYPE  LENGTH  HAS_DEFAULT
DEPTNO       DEPARTMENT  CHAR
DEPTNAME     DEPARTMENT  VARCHAR    29
MGRNO        DEPARTMENT  CHAR
ADMRDEPT     DEPARTMENT  CHAR

Dropping a database object


The DROP statement deletes an object. Depending on the action requested, any objects that are directly or
indirectly dependent on that object might also be deleted or might prevent the drop from happening.
For example, if you drop a table, any aliases, constraints, triggers, views, or indexes associated with that
table are also dropped. Whenever an object is deleted, its description is deleted from the catalog.
For example, to drop table EMPLOYEE, issue the following statement:
DROP TABLE EMPLOYEE RESTRICT
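
The keyword at the end of the statement selects the action to take when other objects depend on the
table. As a sketch of the contrast, using the same table name (the two statements are alternatives, not a
sequence to run together):

```sql
-- RESTRICT: the drop is rejected if other objects, such as views,
-- depend on the table.
DROP TABLE EMPLOYEE RESTRICT;

-- CASCADE: the table is dropped together with the objects
-- that depend on it.
DROP TABLE EMPLOYEE CASCADE;
```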

Related reference:
DROP

Data manipulation language


Data manipulation language (DML) describes the portion of SQL that manipulates or controls data.
Related concepts:
Types of SQL statements on page 7
There are several basic types of SQL statements. They are listed here according to their functions.

Retrieving data using the SELECT statement


The SELECT statement tailors your query to gather data. You can use the SELECT statement to retrieve a
specific row or retrieve data in a specific way.
If SQL is unable to find a row that satisfies the search condition, an SQLCODE of +100 is returned.
If SQL finds errors while running your select-statement, a negative SQLCODE is returned. If SQL finds
more host variables than results, +326 is returned.
Related reference:
Creating a table using AS on page 21
You can create a table from the result of a SELECT statement. To create this type of table, use the
CREATE TABLE AS statement.
Creating and using views on page 48
A view can be used to access data in one or more tables or views. You create a view by using a SELECT
statement.

Basic SELECT statement


The basic format and syntax of the SELECT statement consists of several required and optional clauses.
You can write SQL statements on one line or on many lines. For SQL statements in precompiled
programs, the rules for the continuation of lines are the same as those of the host language (the language
the program is written in). A SELECT statement can also be used by a cursor in a program. Finally, a
SELECT statement can be prepared in a dynamic application.
Notes:
1. The SQL statements described in this section can be run on SQL tables and views, and
database physical and logical files.
2. Character strings specified in an SQL statement (such as those used with WHERE or VALUES
clauses) are case-sensitive; that is, uppercase characters must be entered in uppercase and
lowercase characters must be entered in lowercase.
WHERE ADMRDEPT='a00'   (does not return a result)
WHERE ADMRDEPT='A00'   (returns a valid department number)

Comparisons might not be case sensitive if a shared-weight sort sequence is used where
uppercase and lowercase characters are treated as the same characters.
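When the case of the stored data is not known, one common approach is to fold the column to uppercase
before comparing it. This is a sketch of that approach, not the only option; a shared-weight sort
sequence, as noted above, is another.

```sql
-- Matches 'A00' regardless of how the column value is cased.
SELECT DEPTNO, DEPTNAME
  FROM CORPDATA.DEPARTMENT
  WHERE UPPER(ADMRDEPT) = 'A00';
```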
A SELECT statement can include the following:
1. The name of each column you want to include in the result.
2. The name of the table or view that contains the data.
3. A search condition to identify the rows that contain the information you want.
4. The name of each column used to group your data.
5. A search condition that uniquely identifies a group that contains the information you want.
6. The order of the results so a specific row among duplicates can be returned.
A SELECT statement looks like this:
SELECT column names
FROM table or view name
WHERE search condition
GROUP BY column names
HAVING search condition
ORDER BY column-name

The SELECT and FROM clauses must be specified. The other clauses are optional.
With the SELECT clause, you specify the name of each column you want to retrieve. For example:
SELECT EMPNO, LASTNAME, WORKDEPT

You can specify that only one column be retrieved, or as many as 8000 columns. The value of each
column you name is retrieved in the order specified in the SELECT clause.
If you want to retrieve all columns (in the same order as they appear in the table's definition), use an
asterisk (*) instead of naming the columns:
SELECT *

The FROM clause specifies the table that you want to select data from. You can select columns from more
than one table. When issuing a SELECT, you must specify a FROM clause. Issue the following statement:
SELECT *
FROM EMPLOYEE

The result is all of the columns and rows from the table EMPLOYEE.
The SELECT list can also contain expressions, including constants, special registers, and scalar fullselects.
An AS clause can be used to give the resulting column a name. For example, issue the following
statement:
SELECT LASTNAME, SALARY * .05 AS RAISE
FROM EMPLOYEE
WHERE EMPNO = '200140'

The result of this statement follows.


Table 10. Results for query
LASTNAME  RAISE
NATZ      1421

Specifying a search condition using the WHERE clause


The WHERE clause specifies a search condition that identifies the row or rows that you want to retrieve,
update, or delete.
The number of rows you process with an SQL statement then depends on the number of rows that satisfy
the WHERE clause search condition. A search condition consists of one or more predicates. A predicate
specifies a test that you want SQL to apply to a specified row or rows of a table.
In the following example, WORKDEPT = 'C01' is a predicate, WORKDEPT and 'C01' are expressions, and
the equal sign (=) is a comparison operator. Note that character values are enclosed in apostrophes (');
numeric values are not. This applies to all constant values wherever they are coded within an SQL
statement. For example, to specify that you are interested in the rows where the department number is
C01, issue the following statement:
... WHERE WORKDEPT = 'C01'

In this case, the search condition consists of one predicate: WORKDEPT = 'C01'.
To further illustrate WHERE, put it into a SELECT statement. Assume that each department listed in the
CORPDATA.DEPARTMENT table has a unique department number. You want to retrieve the department
name and manager number from the CORPDATA.DEPARTMENT table for department C01. Issue the
following statement:
SELECT DEPTNAME, MGRNO
FROM CORPDATA.DEPARTMENT
WHERE DEPTNO = 'C01'

The result of this statement is one row.


Table 11. Result table
DEPTNAME            MGRNO
INFORMATION CENTER  000030

If the search condition contains character, or UCS-2 or UTF-16 graphic column predicates, the sort
sequence that is in effect when the query is run is applied to those predicates. If a sort sequence is not
being used, character constants must be specified in uppercase or lowercase to match the column or
expression they are being compared to.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Defining complex search conditions on page 70
In addition to the basic comparison predicates, such as = and >, a search condition can contain any of
these predicates: BETWEEN, IN, EXISTS, IS NULL, and LIKE.
Multiple search conditions within a WHERE clause on page 72
You can qualify your request further by coding a search condition that includes several predicates.
Expressions in the WHERE clause:
An expression in a WHERE clause names or specifies something that you want to compare to something
else.
The expressions you specify can be:
• A column name names a column. For example:
  ... WHERE EMPNO = '000200'

  EMPNO names a column that is defined as a 6-byte character value.
• An expression identifies two values that are added (+), subtracted (-), multiplied (*), divided (/), have
  exponentiation (**), or concatenated (CONCAT or ||) to result in a value. The most common operands
  of an expression are:
  - A constant
  - A column
  - A host variable
  - A global variable
  - A function
  - A special register
  - A scalar fullselect
  - Another expression
  For example:
  ... WHERE INTEGER(PRENDATE - PRSTDATE) > 100

  When the order of evaluation is not specified by parentheses, the expression is evaluated in the
  following order:
  1. Prefix operators
  2. Exponentiation
  3. Multiplication, division, and concatenation
  4. Addition and subtraction
  Operators on the same precedence level are applied from left to right.
• A constant specifies a literal value for the expression. For example:
  ... WHERE 40000 < SALARY

  SALARY names a column that is defined as a 9-digit packed decimal value (DECIMAL(9,2)). It is
  compared to the numeric constant 40000.
• A host variable identifies a variable in an application program. For example:
  ... WHERE EMPNO = :EMP

• A special register identifies a special value defined by the database manager. For example:
  ... WHERE LASTNAME = USER

• The NULL value specifies the condition of having an unknown value. For example:
  ... WHERE DUE_DATE IS NULL
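
The effect of the evaluation order can be seen with constants. The following sketch uses
SYSIBM.SYSDUMMY1, a one-row table supplied with the system, only as a convenient source for a
single result row.

```sql
-- Multiplication is applied before addition: 2 + (3 * 4) = 14
SELECT 2 + 3 * 4 FROM SYSIBM.SYSDUMMY1;

-- Parentheses override the default order: (2 + 3) * 4 = 20
SELECT (2 + 3) * 4 FROM SYSIBM.SYSDUMMY1;
```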

A search condition can specify many predicates separated by AND and OR. No matter how complex the
search condition, it supplies a TRUE or FALSE value when evaluated against a row. There is also an
unknown truth value, which is effectively false. That is, if the value being compared in a row is null, that
row is not returned as a result of the search because a null value is not less than, equal to, or greater than
the value specified in the search condition.
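For example, assume that the MGRNO column of CORPDATA.DEPARTMENT allows null values (an
assumption; the column attributes are not shown here) and that the employee number '000010' is just an
illustrative value. A department with no manager is then returned by neither of the first two queries in
this sketch. Only the IS NULL predicate finds it.

```sql
-- Rows where MGRNO is null are not returned: the comparison is unknown.
SELECT DEPTNO FROM CORPDATA.DEPARTMENT WHERE MGRNO = '000010';

-- Rows where MGRNO is null are not returned here either.
SELECT DEPTNO FROM CORPDATA.DEPARTMENT WHERE MGRNO <> '000010';

-- Only IS NULL selects the rows where MGRNO is null.
SELECT DEPTNO FROM CORPDATA.DEPARTMENT WHERE MGRNO IS NULL;
```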
To fully understand the WHERE clause, you need to know the order in which SQL evaluates search
conditions and predicates, and how it compares the values of expressions. This topic is discussed in the
DB2 for i SQL reference topic collection.
Related concepts:
Using subqueries on page 120
You can use subqueries in a search condition as another way to select data. Subqueries can be used
anywhere an expression can be used.
Related reference:
Defining complex search conditions on page 70
In addition to the basic comparison predicates, such as = and >, a search condition can contain any of
these predicates: BETWEEN, IN, EXISTS, IS NULL, and LIKE.
Expressions
Comparison operators:
SQL supports several comparison operators.
Comparison operator   Description
<> or ¬= or !=        Not equal to
=                     Equal to
<                     Less than
>                     Greater than
<= or ¬> or !>        Less than or equal to (or not greater than)
>= or ¬< or !<        Greater than or equal to (or not less than)

NOT keyword:
You can precede a predicate with the NOT keyword to specify that you want the opposite of the
predicate's value (that is, TRUE if the predicate is FALSE).
NOT applies only to the predicate it precedes, not to all predicates in the WHERE clause. For example, to
indicate that you are interested in all employees except those working in the department C01, you can
say:
... WHERE NOT WORKDEPT = 'C01'

which is equivalent to:

... WHERE WORKDEPT <> 'C01'
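
Because NOT applies only to the predicate it precedes, parentheses are needed to negate a combined
condition. The following sketch contrasts the two forms using columns from the
CORPDATA.EMPLOYEE table.

```sql
-- NOT applies only to WORKDEPT = 'C01':
-- female employees outside department C01.
SELECT EMPNO, LASTNAME
  FROM CORPDATA.EMPLOYEE
  WHERE NOT WORKDEPT = 'C01' AND SEX = 'F';

-- Parentheses make NOT apply to the combined condition:
-- everyone except the female employees of department C01.
SELECT EMPNO, LASTNAME
  FROM CORPDATA.EMPLOYEE
  WHERE NOT (WORKDEPT = 'C01' AND SEX = 'F');
```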

GROUP BY clause
The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual
rows.
When you specify a GROUP BY clause, SQL divides the selected rows into groups such that the rows of
each group have matching values in one or more columns or expressions. Next, SQL processes each
group to produce a single-row result for the group. You can specify one or more columns or expressions
in the GROUP BY clause to group the rows. The items you specify in the SELECT statement are
properties of each group of rows, not properties of individual rows in a table or view.
Without a GROUP BY clause, the application of SQL aggregate functions returns one row. When GROUP
BY is used, the function is applied to each group, thereby returning as many rows as there are groups.
For example, the CORPDATA.EMPLOYEE table has several sets of rows, and each set consists of rows
describing members of a specific department. To find the average salary of people in each department,
you can issue:
SELECT WORKDEPT, DECIMAL (AVG(SALARY),5,0)
FROM CORPDATA.EMPLOYEE
GROUP BY WORKDEPT

The result is several rows, one for each department.


WORKDEPT  AVG-SALARY
A00       40850
B01       41250
C01       29722
D11       25147
D21       25668
E01       40175
E11       21020
E21       24086

Notes:
1. Grouping the rows does not mean ordering them. Grouping puts each selected row in a
group, which SQL then processes to derive characteristics of the group. Ordering the rows
puts all the rows in the results table in ascending or descending collating sequence.
Depending on the implementation selected by the database manager, the resulting groups
might appear to be ordered.
2. If there are null values in the column you specify in the GROUP BY clause, a single-row result
is produced for the data in the rows with null values.
3. If the grouping occurs over character, or UCS-2 or UTF-16 graphic columns, the sort sequence
in effect when the query is run is applied to the grouping.
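Because grouping does not imply ordering (note 1), a query that needs its groups in a particular order
should request it explicitly. A minimal sketch, reusing the earlier average-salary query with an added
ORDER BY clause; the result column name AVG_SALARY is chosen here for illustration.

```sql
-- One row per department, returned in department-number order.
SELECT WORKDEPT, DECIMAL(AVG(SALARY),5,0) AS AVG_SALARY
  FROM CORPDATA.EMPLOYEE
  GROUP BY WORKDEPT
  ORDER BY WORKDEPT;
```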
When you use GROUP BY, you list the columns or expressions you want SQL to use to group the rows.
For example, suppose that you want a list of the number of people working on each major project
described in the CORPDATA.PROJECT table. You can issue:
SELECT SUM(PRSTAFF), MAJPROJ
FROM CORPDATA.PROJECT
GROUP BY MAJPROJ

The result is a list of the company's current major projects and the number of people working on each
project.
SUM(PRSTAFF)  MAJPROJ
              AD3100
              AD3110
10            MA2100
              MA2110
              OP1000
              OP2000
              OP2010
32.5

You can also specify that you want the rows grouped by more than one column or expression. For
example, you can issue a select statement to find the average salary for men and women in each
department, using the CORPDATA.EMPLOYEE table. To do this, you can issue:
SELECT WORKDEPT, SEX, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES
FROM CORPDATA.EMPLOYEE
GROUP BY WORKDEPT, SEX

The result follows.


WORKDEPT  SEX  AVG_WAGES
A00            49625
A00            35000
B01            41250
C01            29722
D11            25817
D11            24764
D21            26933
D21            24720
E01            40175
E11            22810
E11            16545
E21            25370
E21            23830

Because you did not include a WHERE clause in this example, SQL examines and processes all rows in
the CORPDATA.EMPLOYEE table. The rows are grouped first by department number and next (within
each department) by sex before SQL derives the average SALARY value for each group.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
ORDER BY clause on page 63
The ORDER BY clause specifies the particular order in which you want selected rows returned. The order
is sorted by ascending or descending collating sequence of a column's or an expression's value.

HAVING clause
The HAVING clause specifies a search condition for the groups selected by the GROUP BY clause.
The HAVING clause says that you want only those groups that satisfy the condition in that clause.
Therefore, the search condition you specify in the HAVING clause must test properties of each group
rather than properties of individual rows in the group.
The HAVING clause follows the GROUP BY clause and can contain the same kind of search condition as
you can specify in a WHERE clause. In addition, you can specify aggregate functions in a HAVING
clause. For example, suppose that you want to retrieve the average salary of women in each department.
To do this, use the AVG aggregate function and group the resulting rows by WORKDEPT and specify a
WHERE clause of SEX = 'F'.
To specify that you want this data only when all the female employees in the selected department have
an education level equal to or greater than 16 (a college graduate), use the HAVING clause. The HAVING
clause tests a property of the group. In this case, the test is on MIN(EDLEVEL), which is a group
property:
SELECT WORKDEPT, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES, MIN(EDLEVEL) AS MIN_EDUC
FROM CORPDATA.EMPLOYEE
WHERE SEX='F'
GROUP BY WORKDEPT
HAVING MIN(EDLEVEL)>=16

The result follows.


WORKDEPT  AVG_WAGES  MIN_EDUC
A00       49625      18
C01       29722      16
D11       25817      17

You can use multiple predicates in a HAVING clause by connecting them with AND and OR, and you
can use NOT for any predicate of a search condition.


Note: If you intend to update a column or delete a row, you cannot include a GROUP BY or HAVING
clause in the SELECT statement within a DECLARE CURSOR statement. These clauses make it a
read-only cursor.
Predicates with arguments that are not aggregate functions can be coded in either WHERE or HAVING
clauses. It is typically more efficient to code the selection criteria in the WHERE clause because it is
handled earlier in the query processing. The HAVING selection is performed in post processing of the
result table.
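As a sketch of this point, the two statements below return the same rows. The department numbers are
picked for illustration, and the comments describe the conceptual processing order rather than what the
optimizer necessarily does.

```sql
-- Selection in the WHERE clause: rows for other departments
-- are discarded before grouping.
SELECT WORKDEPT, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES
  FROM CORPDATA.EMPLOYEE
  WHERE WORKDEPT IN ('D11', 'D21')
  GROUP BY WORKDEPT;

-- Same result, but conceptually every department is grouped first
-- and the unwanted groups are discarded afterward.
SELECT WORKDEPT, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES
  FROM CORPDATA.EMPLOYEE
  GROUP BY WORKDEPT
  HAVING WORKDEPT IN ('D11', 'D21');
```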
If the search condition contains predicates involving character, or UCS-2 or UTF-16 graphic columns, the
sort sequence in effect when the query is run is applied to those predicates.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Using a cursor on page 383
When SQL runs a SELECT statement, the resulting rows comprise the result table. A cursor provides a
way to access a result table.

ORDER BY clause
The ORDER BY clause specifies the particular order in which you want selected rows returned. The order
is sorted by ascending or descending collating sequence of a column's or an expression's value.
For example, to retrieve the names and department numbers of female employees listed in the
alphanumeric order of their department numbers, you can use this select-statement:
SELECT LASTNAME,WORKDEPT
FROM CORPDATA.EMPLOYEE
WHERE SEX='F'
ORDER BY WORKDEPT

The result follows.


LASTNAME   WORKDEPT
HAAS       A00
HEMMINGER  A00
KWAN       C01
QUINTANA   C01
NICHOLLS   C01
NATZ       C01
PIANKA     D11
SCOUTTEN   D11
LUTZ       D11
JOHN       D11
PULASKI    D21
JOHNSON    D21
PEREZ      D21
HENDERSON  E11
SCHNEIDER  E11
SETRIGHT   E11
SCHWARTZ   E11
SPRINGER   E11
WONG       E21

Note: Null values are ordered as the highest value.


The column specified in the ORDER BY clause does not need to be included in the SELECT clause. For
example, the following statement will return all female employees ordered with the largest salary first:
SELECT LASTNAME,FIRSTNME
FROM CORPDATA.EMPLOYEE
WHERE SEX='F'
ORDER BY SALARY DESC

If an AS clause is specified to name a result column in the select-list, this name can be specified in the
ORDER BY clause. The name specified in the AS clause must be unique in the select-list. For example, to
retrieve the full names of employees listed in alphabetic order, you can use this select-statement:
SELECT LASTNAME CONCAT FIRSTNME AS FULLNAME
FROM CORPDATA.EMPLOYEE
ORDER BY FULLNAME

This select-statement can optionally be written as:


SELECT LASTNAME CONCAT FIRSTNME
FROM CORPDATA.EMPLOYEE
ORDER BY LASTNAME CONCAT FIRSTNME

Instead of naming the columns to order the results, you can use a number. For example, ORDER BY 3
specifies that you want the results ordered by the third column of the results table, as specified by the
select-list. Use a number to order the rows of the results table when the sequencing value is not a named
column.
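For example, in the following sketch the third item in the select-list is an expression with no name, so
the ORDER BY clause refers to it by position.

```sql
-- The third select-list item is an unnamed expression,
-- so it is referenced by its position.
SELECT LASTNAME, WORKDEPT, SALARY * .05
  FROM CORPDATA.EMPLOYEE
  ORDER BY 3 DESC;
```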
You can also specify whether you want SQL to collate the rows in ascending (ASC) or descending (DESC)
sequence. An ascending collating sequence is the default. In the previous select-statement, SQL first
returns the row with the lowest FULLNAME expression (alphabetically and numerically), followed by
rows with higher values. To order the rows in descending collating sequence based on this name, specify:
... ORDER BY FULLNAME DESC

You can specify a secondary ordering sequence (or several levels of ordering sequences) as well as a
primary one. In the previous example, you might want the rows ordered first by department number,
and within each department, ordered by employee name. To do this, specify:
... ORDER BY WORKDEPT, FULLNAME
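The ordering rules above can be tried out with a small sketch. It is not part of the IBM i sample programs: it assumes an in-memory SQLite database as a stand-in for DB2 for i, a few illustrative EMPLOYEE rows, and the || operator in place of CONCAT. Sort sequences and the ordering of null values on your system can differ.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (LASTNAME TEXT, FIRSTNME TEXT, WORKDEPT TEXT)")
# Illustrative rows only; this is not the full CORPDATA.EMPLOYEE sample table.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("QUINTANA", "DOLORES", "C01"),
     ("HAAS", "CHRISTINE", "A00"),
     ("KWAN", "SALLY", "C01"),
     ("HEMMINGER", "DIAN", "A00")])

select = ("SELECT WORKDEPT, LASTNAME || ' ' || FIRSTNME AS FULLNAME "
          "FROM EMPLOYEE ")

# Primary ordering by department, secondary ordering by the AS name.
by_name = conn.execute(select + "ORDER BY WORKDEPT, FULLNAME").fetchall()

# The same ordering expressed with column numbers of the select-list.
by_number = conn.execute(select + "ORDER BY 1, 2").fetchall()

# Descending department order, ascending names within each department.
dept_desc = conn.execute(
    select + "ORDER BY WORKDEPT DESC, FULLNAME").fetchall()

for row in by_name:
    print(row)
```

Each ordering column takes its own ASC or DESC keyword, so the third query reverses only the department order.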

If character columns, or UCS-2 or UTF-16 graphic columns are used in the ORDER BY clause, ordering
for these columns is based on the sort sequence in effect when the query is run.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
GROUP BY clause on page 60
The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual
rows.


Static SELECT statements


For a static SELECT statement (one embedded in an SQL program), an INTO clause must be specified
before the FROM clause.
The INTO clause names the host variables (variables in your program used to contain retrieved column
values). The value of the first result column specified in the SELECT clause is put into the first host
variable named in the INTO clause; the second value is put into the second host variable, and so on.
The result table for a SELECT INTO should contain just one row. For example, each row in the
CORPDATA.EMPLOYEE table has a unique EMPNO (employee number) column. The result of a SELECT
INTO statement for this table, if the WHERE clause contains an equal comparison on the EMPNO
column, will be exactly one row (or no rows). Finding more than one row is an error, but one row is still
returned. You can control which row will be returned in this error condition by specifying the ORDER BY
clause. If you use the ORDER BY clause, the first row in the result table is returned.
If you want more than one row to be the result of a SELECT INTO statement, use a DECLARE CURSOR
statement to select the rows, followed by a FETCH statement to move the column values into host
variables one or many rows at a time.
When using the select-statement in an application program, list the column names to give your program
more data independence. There are two reasons for this:
1. When you look at the source code statement, you can easily see the one-to-one correspondence
between the column names in the SELECT clause and the host variables named in the INTO clause.
2. If a column is added to a table or view you access and you use SELECT * ..., and you create the
program again from source, the INTO clause does not have a matching host variable named for the
new column. The extra column causes you to get a warning (not an error) in the SQLCA (SQLWARN3
will contain a W). When using the GET DIAGNOSTICS statement, the RETURNED_SQLSTATE item
will have a value of '01503'.
Related reference:
Using a cursor on page 383
When SQL runs a SELECT statement, the resulting rows comprise the result table. A cursor provides a
way to access a result table.

Handling null values


A null value indicates the absence of a column value in a row. A null value is an unknown value; it is not
the same as zero or all blanks.
Null values can be used as a condition in the WHERE and HAVING clauses. For example, a WHERE
clause can specify a column that, for some rows, contains a null value. A basic comparison predicate
using a column that contains null values does not select a row that has a null value for the column. This
is because a null value is not less than, equal to, or greater than the value specified in the condition. The
IS NULL predicate is used to check for null values. To select the values for all rows that contain a null
value for the manager number, you can specify:
SELECT DEPTNO, DEPTNAME, ADMRDEPT
FROM CORPDATA.DEPARTMENT
WHERE MGRNO IS NULL

The result follows.


DEPTNO   DEPTNAME             ADMRDEPT
D01      DEVELOPMENT CENTER   A00
F22      BRANCH OFFICE F2     E01
G22      BRANCH OFFICE G2     E01
H22      BRANCH OFFICE H2     E01
I22      BRANCH OFFICE I2     E01
J22      BRANCH OFFICE J2     E01

To get the rows that do not have a null value for the manager number, you can change the WHERE
clause like this:
WHERE MGRNO IS NOT NULL

Another predicate that is useful for comparing values that can contain the NULL value is the DISTINCT
predicate. Comparing two columns using a normal equal comparison (COL1 = COL2) will be true if both
columns contain an equal non-null value. If both columns are null, the result will be false because null is
never equal to any other value, not even another null value. Using the DISTINCT predicate, null values
are considered equal. So COL1 IS NOT DISTINCT FROM COL2 will be true if both columns contain an
equal non-null value and also when both columns are the null value.
For example, suppose that you want to select information from two tables that contain null values. The
first table T1 has a column C1 with the following values.
C1
2
1
null

The second table T2 has a column C2 with the following values.


C2
2
null

Run the following SELECT statement:


SELECT *
FROM T1, T2
WHERE C1 IS DISTINCT FROM C2

The result follows.


C1      C2
1       2
1       null
2       null
null    2

For more information about the use of null values, see the DB2 for i SQL reference topic collection.
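The difference between a basic comparison and the DISTINCT predicate can be reproduced with the T1 and T2 values shown above. This is a sketch under stated assumptions, not DB2 for i code: it uses an in-memory SQLite database, and it uses SQLite's IS and IS NOT operators as stand-ins for IS NOT DISTINCT FROM and IS DISTINCT FROM.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (C1 INTEGER);
    CREATE TABLE T2 (C2 INTEGER);
    INSERT INTO T1 VALUES (2), (1), (NULL);
    INSERT INTO T2 VALUES (2), (NULL);
""")

# Basic comparison: the predicate is unknown whenever either side is null,
# so the (null, null) pair is not selected.
equal_rows = conn.execute(
    "SELECT C1, C2 FROM T1, T2 WHERE C1 = C2").fetchall()

# SQLite's IS operator treats two nulls as equal (stand-in for
# IS NOT DISTINCT FROM).
not_distinct_rows = conn.execute(
    "SELECT C1, C2 FROM T1, T2 WHERE C1 IS C2").fetchall()

# SQLite's IS NOT operator is the stand-in for IS DISTINCT FROM.
distinct_rows = conn.execute(
    "SELECT C1, C2 FROM T1, T2 WHERE C1 IS NOT C2").fetchall()

print(equal_rows)
print(not_distinct_rows)
print(distinct_rows)
```

Python shows a null value as None. The order of the returned rows is not defined because the queries have no ORDER BY clause.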

Special registers in SQL statements


You can specify certain special registers in SQL statements. A special register, for example, CURRENT
DATE, contains information that can be referenced in SQL statements.
For locally run SQL statements, the special registers and their contents are shown in the following table.

Special registers                     Contents

CURRENT CLIENT_ACCTNG                 The accounting string for the client connection.
CLIENT ACCTNG

CURRENT CLIENT_APPLNAME               The application name for the client connection.
CLIENT APPLNAME

CURRENT CLIENT_PROGRAMID              The program ID for the client connection.
CLIENT PROGRAMID

CURRENT CLIENT_USERID                 The client user ID for the client connection.
CLIENT USERID

CURRENT CLIENT_WRKSTNNAME             The workstation name for the client connection.
CLIENT WRKSTNNAME

CURRENT DATE                          The current date.
CURRENT_DATE

CURRENT DEBUG MODE                    The debug mode to be used when creating or altering
                                      routines.

CURRENT DECFLOAT ROUNDING MODE        The rounding mode to be used when working with
                                      decimal floating point values.

CURRENT DEGREE                        The number of tasks the database manager should run in
                                      parallel.

CURRENT IMPLICIT XMLPARSE OPTION      The whitespace handling options to be used for XML
                                      data when implicitly parsed without validation.

CURRENT PATH                          The SQL path used to resolve unqualified data type
CURRENT_PATH                          names, procedure names, and function names in
CURRENT FUNCTION PATH                 dynamically prepared SQL statements.

CURRENT SCHEMA                        The schema name used to qualify unqualified database
                                      object references where applicable in dynamically
                                      prepared SQL statements.

CURRENT SERVER                        The name of the relational database currently being used.
CURRENT_SERVER

CURRENT TIME                          The current time.
CURRENT_TIME

CURRENT TIMESTAMP                     The current date and time in timestamp format.
CURRENT_TIMESTAMP

CURRENT TIMEZONE                      A duration of time that links local time to Universal Time
CURRENT_TIMEZONE                      Coordinated (UTC) using the formula:
                                      local time - CURRENT TIMEZONE = UTC
                                      It is taken from the system value QUTCOFFSET.

SESSION_USER                          The runtime authorization identifier (user profile) of the
USER                                  job.

SYSTEM_USER                           The authorization identifier (user profile) of the user
                                      connected to the database.

If a single statement contains more than one reference to any of CURRENT DATE, CURRENT TIME, or
CURRENT TIMESTAMP special registers, or the CURDATE, CURTIME, or NOW scalar functions, all
values are based on a single clock reading.
For remotely run SQL statements, the values for special registers are determined at the remote system.


When a query over a distributed table references a special register, the contents of the special register on
the system that requests the query are used. For more information about distributed tables, see the DB2
Multisystem topic collection.

Casting data types


Sometimes you need to cast or change the type of an expression to a different data type or to the same
data type with a different length, precision, or scale.
For example, if you want to compare two columns of different types, such as a user-defined type based
on a character and an integer, you can change the character to an integer or the integer to a character to
make the comparison possible. A data type that can be changed to another data type is castable from the
source data type to the target data type.
You can use cast functions or CAST specification to explicitly cast a data type to another data type. For
example, if you have a column of dates (BIRTHDATE) defined as DATE and want to cast the column
data type to CHARACTER with a fixed length of 10, enter the following:
SELECT CHAR (BIRTHDATE,USA)
FROM CORPDATA.EMPLOYEE

You can also use the CAST specification to cast data types directly:
SELECT CAST(BIRTHDATE AS CHAR(10))
FROM CORPDATA.EMPLOYEE

Related reference:
Casting between data types

Date, time, and timestamp data types


Date, time, and timestamp are data types that are represented in an internal form not seen by an SQL
user.
Date, time, and timestamp can be represented by character string values and assigned to character string
variables. The database manager recognizes the following as date, time, and timestamp values:
v A value returned by the DATE, TIME, or TIMESTAMP scalar function.
v A value returned by the CURRENT DATE, CURRENT TIME, or CURRENT TIMESTAMP special
register.
v A value of a character string in the ANSI/ISO standard date, time, or timestamp format, for example,
DATE '1950-01-01'.
v A character string when it is an operand of an arithmetic expression or a comparison and the other
operand is a date, time, or timestamp. For example, in the predicate:
... WHERE HIREDATE < '1950-01-01'

if HIREDATE is a date column, the character string '1950-01-01' is interpreted as a date.


v A character string variable or constant used to set a date, time, or timestamp column in either the SET
clause of an UPDATE statement, or the VALUES clause of an INSERT statement.
Related reference:
Data types
Specifying current date and time values:
You can specify a current date, time, or timestamp in an expression by using one of these special
registers: CURRENT DATE, CURRENT TIME, and CURRENT TIMESTAMP.
The value of each is based on a time-of-day clock reading obtained during the running of the statement.
Multiple references to CURRENT DATE, CURRENT TIME, or CURRENT TIMESTAMP within the same
SQL statement use the same value. The following statement returns the age (in years) of each employee
in the EMPLOYEE table when the statement is run:

SELECT YEAR(CURRENT DATE - BIRTHDATE)
FROM CORPDATA.EMPLOYEE
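Subtracting one date from another yields a duration, and the year part of that duration is the number of whole years elapsed. The following Python sketch illustrates that calculation outside the database. The function name age_in_years is hypothetical, and the sketch assumes that the later date is passed as the second argument.

```python
from datetime import date


def age_in_years(birthdate: date, today: date) -> int:
    """Return the whole years elapsed between birthdate and today.

    Hypothetical helper that mirrors the year part of the duration
    produced by CURRENT DATE - BIRTHDATE; assumes today >= birthdate.
    """
    years = today.year - birthdate.year
    # Borrow one year when this year's birthday has not been reached yet.
    if (today.month, today.day) < (birthdate.month, birthdate.day):
        years -= 1
    return years


# The day before and the day of a 60th birthday.
day_before = age_in_years(date(1950, 6, 15), date(2010, 6, 14))
birthday = age_in_years(date(1950, 6, 15), date(2010, 6, 15))
print(day_before, birthday)
```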

The CURRENT TIMEZONE special register allows a local time to be converted to Universal Time
Coordinated (UTC). For example, if you have a table named DATETIME that contains a column of type
TIME named STARTT, and you want to convert STARTT to UTC, you can use the following statement:
SELECT STARTT - CURRENT TIMEZONE
FROM DATETIME

Date/time arithmetic:
Addition and subtraction are the only arithmetic operators applicable to date, time, and timestamp
values.
You can increment and decrement a date, time, or timestamp by a duration; or subtract a date from a
date, a time from a time, or a timestamp from a timestamp.
Related reference:
Datetime arithmetic in SQL

Row change expressions


The ROW CHANGE TIMESTAMP and ROW CHANGE TOKEN expressions can be used to determine
when a row was last changed.
To use a ROW CHANGE TIMESTAMP expression for a table, the table must be defined to have a row
change timestamp column.
The following query can find all the orders that are at least four weeks old and can list when they were
last modified:
SELECT ORDERNO, ROW CHANGE TIMESTAMP FOR ORDERS
FROM ORDERS
WHERE ORDER_DATE < CURRENT DATE - 4 WEEKS

The ROW CHANGE TOKEN expression can be used for both tables that have a row change timestamp
and tables that do not. It represents a modification point for a row. If a table has a row change
timestamp, it is derived from the timestamp. If a table does not have a row change timestamp, it is based
on an internal modification time that is not row-based, so it is not as accurate as for a table that has a
row change timestamp.

Handling duplicate rows


When SQL evaluates a select-statement, several rows might qualify to be in the result table, depending on
the number of rows that satisfy the search condition of the select-statement. Some of the rows in the
result table might be duplicates.
You can specify that you do not want any duplicates by using the DISTINCT keyword, followed by the
list of expressions:
SELECT DISTINCT JOB, SEX
...

DISTINCT means that you want to select only the unique rows. If a selected row duplicates another row
in the result table, the duplicate row is ignored (it is not put into the result table). For example, suppose
you want a list of employee job codes. You do not need to know which employee has what job code.
Because it is probable that several people in a department have the same job code, you can use
DISTINCT to ensure that the result table has only unique values.
The following example shows how to do this:

SELECT DISTINCT JOB
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'

The result is two rows.


JOB
DESIGNER
MANAGER

If you do not include DISTINCT in a SELECT clause, you might find duplicate rows in your result,
because SQL returns the JOB column's value for each row that satisfies the search condition. Null values
are treated as duplicate rows for DISTINCT.
If you include DISTINCT in a SELECT clause and you also include a shared-weight sort sequence, fewer
values might be returned. The sort sequence causes values that contain the same characters to be
weighted the same. If 'MGR', 'Mgr', and 'mgr' are all in the same table, only one of these values is
returned.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.

Defining complex search conditions


In addition to the basic comparison predicates, such as = and >, a search condition can contain any of
these predicates: BETWEEN, IN, EXISTS, IS NULL, and LIKE.
A search condition can include a scalar fullselect.
For character, or UCS-2 or UTF-16 graphic column predicates, the sort sequence is applied to the
operands before evaluation of the predicates for BETWEEN, IN, EXISTS, and LIKE clauses.
You can also perform multiple search conditions.
v BETWEEN ... AND ... is used to specify a search condition that is satisfied by any value that falls on or
between two other values. For example, to find all employees who were hired in 1987, you can use
this:
... WHERE HIREDATE BETWEEN '1987-01-01' AND '1987-12-31'

The BETWEEN keyword is inclusive. A more complex, but explicit, search condition that produces the
same result is:
... WHERE HIREDATE >= '1987-01-01' AND HIREDATE <= '1987-12-31'

v IN says you are interested in rows in which the value of the specified expression is among the values
you listed. For example, to find the names of all employees in departments A00, C01, and E21, you can
specify:
... WHERE WORKDEPT IN ('A00', 'C01', 'E21')

v EXISTS says you are interested in testing for the existence of certain rows. For example, to find out if
there are any employees that have a salary greater than 60000, you can specify:
EXISTS (SELECT * FROM EMPLOYEE WHERE SALARY > 60000)

v IS NULL says that you are interested in testing for null values. For example, to find out if there are
any employees without a phone listing, you can specify:
... WHERE EMPLOYEE.PHONE IS NULL


v LIKE says you are interested in rows in which an expression is similar to the value you supply. When
you use LIKE, SQL searches for a character string similar to the one you specify. The degree of
similarity is determined by two special characters used in the string that you include in the search
condition:
_      An underline character stands for any single character.

%      A percent sign stands for an unknown string of 0 or more characters. If the percent sign starts
       the search string, then SQL allows 0 or more character(s) to precede the matching value in the
       column. Otherwise, the search string must begin in the first position of the column.

Note: If you are operating on MIXED data, the following distinction applies: an SBCS underline
character refers to one SBCS character. No such restriction applies to the percent sign; that is, a
percent sign refers to any number of SBCS or DBCS characters. See the DB2 for i SQL reference
topic collection for more information about the LIKE predicate and MIXED data.
Use the underline character or percent sign either when you do not know or do not care about all the
characters of the column's value. For example, to find out which employees live in Minneapolis, you
can specify:
... WHERE ADDRESS LIKE '%MINNEAPOLIS%'

SQL returns any row with the string MINNEAPOLIS in the ADDRESS column, no matter where the
string occurs.
In another example, to list the towns whose names begin with 'SAN', you can specify:
... WHERE TOWN LIKE 'SAN%'

If you want to find any addresses where the street name isn't in your master street name list, you can
use an expression in the LIKE expression. In this example, the STREET column in the table is assumed
to be upper case.
... WHERE UCASE (:address_variable) NOT LIKE '%'||STREET||'%'

If you want to search for a character string that contains either the underscore or percent character, use
the ESCAPE clause to specify an escape character. For example, to see all businesses that have a
percent in their name, you can specify:
... WHERE BUSINESS_NAME LIKE '%@%%' ESCAPE '@'

The first and last percent characters in the LIKE string are interpreted as the normal LIKE percent
characters. The combination '@%' is taken as the actual percent character.
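The LIKE patterns described above can be exercised with a short sketch. It assumes an in-memory SQLite database as a stand-in for DB2 for i and a hypothetical BUSINESS table with three illustrative names. Note that SQLite's LIKE ignores case for ASCII letters, which might not match the sort sequence in effect on your system.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, for illustration only.
conn.execute("CREATE TABLE BUSINESS (BUSINESS_NAME TEXT)")
conn.executemany(
    "INSERT INTO BUSINESS VALUES (?)",
    [("10% OFF OUTLET",), ("SAN JOSE TOOLS",), ("HALF_PRICE BOOKS",)])


def names_like(predicate):
    """Return the business names selected by the given LIKE predicate."""
    sql = ("SELECT BUSINESS_NAME FROM BUSINESS WHERE BUSINESS_NAME "
           + predicate + " ORDER BY BUSINESS_NAME")
    return [row[0] for row in conn.execute(sql)]


# '@%' is an actual percent character; the outer percent signs are wildcards.
percent_names = names_like("LIKE '%@%%' ESCAPE '@'")

# '@_' is an actual underscore rather than a single-character wildcard.
underscore_names = names_like("LIKE '%@_%' ESCAPE '@'")

# Without ESCAPE, '_' matches any single character, so every name qualifies.
any_char_names = names_like("LIKE '%_%'")

# The pattern must match from the first position of the column value.
san_names = names_like("LIKE 'SAN%'")

print(percent_names, underscore_names, san_names)
```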
Related concepts:
Using subqueries on page 120
You can use subqueries in a search condition as another way to select data. Subqueries can be used
anywhere an expression can be used.
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Specifying a search condition using the WHERE clause on page 57
The WHERE clause specifies a search condition that identifies the row or rows that you want to retrieve,
update, or delete.
Expressions in the WHERE clause on page 58
An expression in a WHERE clause names or specifies something that you want to compare to something
else.
Predicates
Special considerations for LIKE:
Here are some considerations for using the LIKE predicate.

v When host variables are used in place of string constants in a search pattern, you should consider
using varying length host variables. This allows you to:
Assign previously used string constants to host variables without any changes.
Obtain the same selection criteria and results as if a string constant were used.
v When fixed-length host variables are used in place of string constants in a search pattern, you should
ensure that the value specified in the host variable matches the pattern previously used by the string
constants. All characters in a host variable that are not assigned a value are initialized with a blank.
For example, if you do a search using the string pattern 'ABC%' in a varying length host variable, these
are some of the values that can be returned:
ABCD

ABCDE

ABCxxx

ABC

However, if you do a search using the search pattern 'ABC%' contained in a host variable with a fixed
length of 10, these values can be returned, assuming that the column has a length of 12:
ABCDE

ABCD

ABCxxx

ABC

Note: All returned values start with 'ABC' and end with at least 6 blanks. Blanks are used because the
last 6 characters in the host variable are not assigned a specific value.
If you want to do a search using a fixed-length host variable where the last 7 characters can be
anything, search for 'ABC%%%%%%%'. These are some of the values that can be returned:
ABCDEFGHIJ

ABCXXXXXXX

ABCDE

ABCDD
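The effect of blank padding in a fixed-length host variable can be simulated outside an embedded SQL program. The following sketch is only an approximation of the behavior described above: it assumes an in-memory SQLite database, a hypothetical table T whose values are padded to 12 characters to imitate a CHAR(12) column, and Python strings padded to 10 characters to imitate a CHAR(10) host variable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (C TEXT)")
# Imitate a CHAR(12) column by blank-padding every value to 12 characters.
values = ["ABCD", "ABCDE", "ABCxxx", "ABC", "ABCDEFGHIJ"]
conn.executemany("INSERT INTO T VALUES (?)", [(v.ljust(12),) for v in values])


def matches(pattern):
    """Return the values selected by the pattern, without trailing blanks."""
    rows = conn.execute("SELECT C FROM T WHERE C LIKE ?", (pattern,))
    return sorted(row[0].rstrip() for row in rows)


# Varying-length host variable: the pattern is exactly 'ABC%'.
varying = matches("ABC%")

# Fixed-length host variable of length 10: 'ABC%' followed by 6 blanks,
# so a matching value must end with at least 6 blanks.
fixed = matches("ABC%".ljust(10))

# Filling the fixed-length variable with percent signs removes that limit.
widened = matches("ABC%%%%%%%")

print(varying)
print(fixed)
print(widened)
```

In this simulation, ABCDEFGHIJ is selected by the first and third patterns but not by the blank-padded one, because only 2 blanks follow it in the 12-character column.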

Multiple search conditions within a WHERE clause:


You can qualify your request further by coding a search condition that includes several predicates.
The search condition you specify can contain any of the comparison operators or the predicates
BETWEEN, DISTINCT, IN, LIKE, EXISTS, IS NULL, and IS NOT NULL.
You can combine any two predicates with AND and OR. In addition, you can use the NOT keyword to
specify that the search condition that you want is the negated value of the specified search condition. A
WHERE clause can have as many predicates as you want.
v AND says that, for a row to qualify, the row must satisfy both predicates of the search condition. For
example, to find out which employees in department D21 were hired after December 31, 1987, specify:
...
WHERE WORKDEPT = 'D21' AND HIREDATE > '1987-12-31'

v OR says that, for a row to qualify, the row can satisfy the condition set by either or both predicates of
the search condition. For example, to find out which employees are in either department C01 or D11,
you can specify :
...
WHERE WORKDEPT = 'C01' OR WORKDEPT = 'D11'

Note: You can also use IN to specify this request: WHERE WORKDEPT IN ('C01', 'D11').
v NOT says that, to qualify, a row must not meet the criteria set by the search condition or predicate that
follows the NOT. For example, to find all employees in the department E11 except those with a job
code equal to analyst, you can specify:
...
WHERE WORKDEPT = 'E11' AND NOT JOB = 'ANALYST'

When SQL evaluates search conditions that contain these connectors, it does so in a specific order. SQL
first evaluates the NOT clauses, next evaluates the AND clauses, and then the OR clauses.
You can change the order of evaluation by using parentheses. The search conditions enclosed in
parentheses are evaluated first. For example, to select all employees in departments E11 and E21 who
have education levels greater than 12, you can specify:

...
WHERE EDLEVEL > 12 AND
(WORKDEPT = 'E11' OR WORKDEPT = 'E21')

The parentheses determine the meaning of the search condition. In this example, you want all rows that
have a:
v WORKDEPT value of E11 or E21, and
v EDLEVEL value greater than 12
If you did not use parentheses:
...
WHERE EDLEVEL > 12 AND WORKDEPT = 'E11'
OR WORKDEPT = 'E21'

Your result is different. The selected rows are rows that have:
v WORKDEPT = 'E11' and EDLEVEL > 12, or
v WORKDEPT = 'E21', regardless of the EDLEVEL value
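The effect of the parentheses can be checked with a few rows. This sketch assumes an in-memory SQLite database as a stand-in for DB2 for i and five illustrative EMPLOYEE rows; the employee numbers and education levels are made up for the demonstration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMPNO TEXT, WORKDEPT TEXT, EDLEVEL INTEGER)")
# Illustrative rows only.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("1", "E11", 16), ("2", "E11", 12), ("3", "E21", 16),
     ("4", "E21", 10), ("5", "D11", 18)])


def empnos(where_clause):
    """Return the employee numbers selected by the given search condition."""
    sql = "SELECT EMPNO FROM EMPLOYEE WHERE " + where_clause + " ORDER BY EMPNO"
    return [row[0] for row in conn.execute(sql)]


# Parentheses force the OR to be evaluated before the AND.
with_parens = empnos(
    "EDLEVEL > 12 AND (WORKDEPT = 'E11' OR WORKDEPT = 'E21')")

# Without parentheses, AND is evaluated first, so every E21 row qualifies.
without_parens = empnos(
    "EDLEVEL > 12 AND WORKDEPT = 'E11' OR WORKDEPT = 'E21'")

print(with_parens)
print(without_parens)
```

Employee 4 (department E21, education level 10) appears only in the second result, because the EDLEVEL test no longer applies to the E21 rows.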
If you are combining multiple equal comparisons, you can write the predicate with the ANDs as shown
in the following example:
...
WHERE WORKDEPT = 'E11' AND EDLEVEL = 12 AND JOB = 'CLERK'

You can also compare two lists, for example:


...
WHERE (WORKDEPT, EDLEVEL, JOB) = ('E11', 12, 'CLERK')

When two lists are used, the first item in the first list is compared to the first item in the second list, and
so on through both lists. Thus, each list must contain the same number of entries. Using lists is identical
to writing the query with AND. Lists can only be used with the equal and not equal comparison
operators.
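The equivalence between the list form and the AND form can be verified with a short sketch. It assumes an in-memory SQLite database as a stand-in for DB2 for i (SQLite 3.15 or later is needed for list comparisons) and four illustrative EMPLOYEE rows.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE "
    "(EMPNO TEXT, WORKDEPT TEXT, EDLEVEL INTEGER, JOB TEXT)")
# Illustrative rows only.
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
    [("1", "E11", 12, "CLERK"), ("2", "E11", 12, "OPERATOR"),
     ("3", "E11", 16, "CLERK"), ("4", "E21", 12, "CLERK")])

# Three equal comparisons combined with AND.
with_and = conn.execute(
    "SELECT EMPNO FROM EMPLOYEE "
    "WHERE WORKDEPT = 'E11' AND EDLEVEL = 12 AND JOB = 'CLERK' "
    "ORDER BY EMPNO").fetchall()

# The same condition written as a comparison of two lists.
with_lists = conn.execute(
    "SELECT EMPNO FROM EMPLOYEE "
    "WHERE (WORKDEPT, EDLEVEL, JOB) = ('E11', 12, 'CLERK') "
    "ORDER BY EMPNO").fetchall()

print(with_and, with_lists)
```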
Related reference:
Specifying a search condition using the WHERE clause on page 57
The WHERE clause specifies a search condition that identifies the row or rows that you want to retrieve,
update, or delete.

Using OLAP specifications


Online analytical processing (OLAP) specifications are used to return ranking numbers and row numbers
for the result rows of a query. You can specify RANK, DENSE_RANK, and ROW_NUMBER.

Example: Ranking and row numbering


Suppose that you want a list of the top 10 salaries along with their ranking. The following query
generates the ranking number for you:
SELECT EMPNO, SALARY,
RANK() OVER(ORDER BY SALARY DESC),
DENSE_RANK() OVER(ORDER BY SALARY DESC),
ROW_NUMBER() OVER(ORDER BY SALARY DESC)
FROM EMPLOYEE
FETCH FIRST 10 ROWS ONLY

This query returns the following information.

Table 12. Results of the previous query
EMPNO    SALARY      RANK   DENSE_RANK   ROW_NUMBER
000010   52,750.00   1      1            1
000110   46,500.00   2      2            2
200010   46,500.00   2      2            3
000020   41,250.00   4      3            4
000050   40,175.00   5      4            5
000030   38,250.00   6      5            6
000070   36,170.00   7      6            7
000060   32,250.00   8      7            8
000220   29,840.00   9      8            9
200220   29,840.00   9      8            10

In this example, the rows are ordered by SALARY in descending order, and only the top 10 are returned. The RANK column shows the
relative ranking of each salary. Notice that there are two rows with the same salary at position 2. Each of
those rows is assigned the same rank value. The following row is assigned the value of 4. RANK returns
a value for a row that is one more than the total number of rows that precede that row. There are gaps in
the numbering sequence whenever there are duplicates.
In contrast, the DENSE_RANK column shows a value of 3 for the row directly after the duplicate rows.
DENSE_RANK returns a value for a row that is one more than the number of distinct row values that
precede it. There will never be gaps in the numbering sequence.
ROW_NUMBER returns a unique number for each row. For rows that contain duplicate values according
to the specified ordering, the assignment of a row number is arbitrary; the row numbers could be
assigned in a different order for the duplicate rows when the query is run another time.
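The three numbering rules can be compared side by side with the ten salaries from this example. The sketch assumes an in-memory SQLite database as a stand-in for DB2 for i (SQLite 3.25 or later is needed for OLAP specifications). The column names RNK, DENSE_RNK, and ROW_NUM are hypothetical, and the final ORDER BY is added only to make the output order repeatable.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for DB2 for i (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMPNO TEXT, SALARY INTEGER)")
# The ten highest salaries from the example above.
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)", [
    ("000010", 52750), ("000110", 46500), ("200010", 46500),
    ("000020", 41250), ("000050", 40175), ("000030", 38250),
    ("000070", 36170), ("000060", 32250), ("000220", 29840),
    ("200220", 29840)])

rows = conn.execute("""
    SELECT SALARY,
           RANK() OVER(ORDER BY SALARY DESC) AS RNK,
           DENSE_RANK() OVER(ORDER BY SALARY DESC) AS DENSE_RNK,
           ROW_NUMBER() OVER(ORDER BY SALARY DESC) AS ROW_NUM
    FROM EMPLOYEE
    ORDER BY ROW_NUM
""").fetchall()

ranks = [row[1] for row in rows]          # gaps follow duplicate salaries
dense_ranks = [row[2] for row in rows]    # no gaps
row_numbers = [row[3] for row in rows]    # always unique

print(ranks)
print(dense_ranks)
print(row_numbers)
```

Only the salary and the three numbers are selected, because which of two employees with the same salary receives the lower row number is arbitrary.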

Example: Ranking groups


Suppose that you want to find out which department has the highest average salary. The following query
groups the data by department, determines the average salary for each department, and ranks the
resulting averages.
SELECT WORKDEPT, INT(AVG(SALARY)) AS AVERAGE,
RANK() OVER(ORDER BY AVG(SALARY) DESC) AS AVG_SALARY
FROM EMPLOYEE
GROUP BY WORKDEPT

This query returns the following information.


Table 13. Results of the previous query
WORKDEPT   AVERAGE   AVG_SALARY
B01        41,250    1
A00        40,850    2
E01        40,175    3
C01        29,722    4
D21        25,668    5
D11        25,147    6
E21        24,086    7
E11        21,020    8


Example: Ranking within a department


Suppose that you want a list of employees along with how their bonus ranks within their department.
Using the PARTITION BY clause, you can specify groups that are to be numbered separately.
SELECT LASTNAME, WORKDEPT, BONUS,
DENSE_RANK() OVER(PARTITION BY WORKDEPT ORDER BY BONUS DESC)
AS BONUS_RANK_IN_DEPT
FROM EMPLOYEE
WHERE WORKDEPT LIKE 'E%'

This query returns the following information.


Table 14. Results of the previous query
LASTNAME    WORKDEPT   BONUS    BONUS_RANK_IN_DEPT
GEYER       E01        800.00   1
HENDERSON   E11        600.00   1
SCHNEIDER   E11        500.00   2
SCHWARTZ    E11        500.00   2
SMITH       E11        400.00   3
PARKER      E11        300.00   4
SETRIGHT    E11        300.00   4
SPRINGER    E11        300.00   4
SPENSER     E21        500.00   1
LEE         E21        500.00   1
GOUNOT      E21        500.00   1
WONG        E21        500.00   1
ALONZO      E21        500.00   1
MENTA       E21        400.00   2

Example: Ranking and ordering by table expression results


Suppose that you want to find the top five employees whose salaries are the highest along with their
department names. The department name is in the department table, so a join operation is needed.
Because ordering is already being done in the nested table expression, that ordering can also be used for
determining the ROW_NUMBER value. The ORDER BY ORDER OF table clause is used to do this.
SELECT ROW_NUMBER() OVER(ORDER BY ORDER OF EMP),
EMPNO, SALARY, DEPTNO, DEPTNAME
FROM (SELECT EMPNO, WORKDEPT, SALARY
FROM EMPLOYEE
ORDER BY SALARY DESC
FETCH FIRST 5 ROWS ONLY) EMP,
DEPARTMENT
WHERE DEPTNO = WORKDEPT

This query returns the following information.


Table 15. Results of the previous query
ROW_NUMBER   EMPNO    SALARY      DEPTNO   DEPTNAME
1            000010   52,750.00   A00      SPIFFY COMPUTER SERVICE DIV.
2            000110   46,500.00   A00      SPIFFY COMPUTER SERVICE DIV.
3            200010   46,500.00   A00      SPIFFY COMPUTER SERVICE DIV.
4            000020   41,250.00   B01      PLANNING
5            000050   40,175.00   E01      SUPPORT SERVICES

Joining data from more than one table


Sometimes the information you want to see is not in a single table. To form a row of the result table, you
might want to retrieve some column values from one table and some column values from another table.
You can retrieve and join column values from two or more tables into a single row.
Several different types of joins are supported by DB2 for i: inner join, left outer join, right outer join, left
exception join, right exception join, and cross join.

Usage notes on join operations


When you join two or more tables, consider the following items:
v If there are common column names, you must qualify each common name with the name of the table
(or a correlation name). Column names that are unique do not need to be qualified. However, the
USING clause can be used in a join to allow you to identify columns that exist in both tables without
specifying table names.
v If you do not list the column names you want, but instead use SELECT *, SQL returns rows that
consist of all the columns of the first table, followed by all the columns of the second table, and so on.
v You must be authorized to select rows from each table or view specified in the FROM clause.
v The sort sequence is applied to all character, or UCS-2 or UTF-16 graphic columns being joined.
Inner join:
An inner join returns only the rows from each table that have matching values in the join columns. Any
rows that do not have a match between the tables do not appear in the result table.
With an inner join, column values from one row of a table are combined with column values from
another row of another (or the same) table to form a single row of data. SQL examines both tables
specified for the join to retrieve data from all the rows that meet the search condition for the join. There
are two ways of specifying an inner join: using the JOIN syntax, and using the WHERE clause.
Suppose you want to retrieve the employee numbers, names, and project numbers for all employees that
are responsible for a project. In other words, you want the EMPNO and LASTNAME columns from the
CORPDATA.EMPLOYEE table and the PROJNO column from the CORPDATA.PROJECT table. Only
employees with last names starting with 'S' or later should be considered. To find this information, you
need to join the two tables.
Inner join using the JOIN syntax:
To use the inner join syntax, both of the tables you are joining are listed in the FROM clause, along with
the join condition that applies to the tables.
The join condition is specified after the ON keyword and determines how the two tables are to be
compared to each other to produce the join result. The condition can be any comparison operator; it does
not need to be the equal operator. Multiple join conditions can be specified in the ON clause separated by
the AND keyword. Any additional conditions that do not relate to the actual join are specified in either
the WHERE clause or as part of the actual join in the ON clause.
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE INNER JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > 'S'

In this example, the join is done on the two tables using the EMPNO and RESPEMP columns from the
tables. Since only employees that have last names starting with at least 'S' are to be returned, this
additional condition is provided in the WHERE clause.
This query returns the following output.
EMPNO   LASTNAME   PROJNO
000250  SMITH      AD3112
000060  STERN      MA2110
000100  SPENSER    OP2010
000020  THOMPSON   PL2100
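
As noted earlier, an additional condition can also be written as part of the ON clause. The following is a minimal sketch of that form, reusing the same sample tables; for an inner join it is expected to select the same rows as the WHERE form, although the row order is not guaranteed.

```sql
-- Sketch: the extra selection condition is moved from WHERE into ON.
-- For an INNER JOIN this selects the same rows as the WHERE form.
SELECT EMPNO, LASTNAME, PROJNO
  FROM CORPDATA.EMPLOYEE INNER JOIN CORPDATA.PROJECT
    ON EMPNO = RESPEMP
   AND LASTNAME > 'S'
```

Keeping non-join conditions in the WHERE clause usually makes the intent easier to read; the ON form matters more for outer joins, where the two placements are not interchangeable.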

Inner join using the WHERE clause:


To use the WHERE clause to perform the same join as you perform using the INNER JOIN syntax, enter
both the join condition and the additional selection condition in the WHERE clause.
The tables to be joined are listed in the FROM clause, separated by commas.
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE, CORPDATA.PROJECT
WHERE EMPNO = RESPEMP
AND LASTNAME > 'S'

This query returns the same output as the previous example.


Joining data with the USING clause:
You can use the USING clause for a shorthand way of defining join conditions. The USING clause is
equivalent to a join condition where each column from the left table is compared to a column with the
same name in the right table.
For example, look at the USING clause in this statement:
SELECT EMPNO, ACSTDATE
FROM CORPDATA.PROJACT INNER JOIN CORPDATA.EMPPROJACT
USING (PROJNO, ACTNO)
WHERE ACSTDATE > '1982-12-31';

The syntax in this statement is valid and equivalent to the join condition in the following statement:
SELECT EMPNO, ACSTDATE
FROM CORPDATA.PROJACT INNER JOIN CORPDATA.EMPPROJACT
ON CORPDATA.PROJACT.PROJNO = CORPDATA.EMPPROJACT.PROJNO AND
CORPDATA.PROJACT.ACTNO = CORPDATA.EMPPROJACT.ACTNO
WHERE ACSTDATE > '1982-12-31';

Left outer join:


A left outer join returns all the rows that an inner join returns plus one row for each of the other rows in
the first table that do not have a match in the second table.
Suppose you want to find all employees and the projects they are currently responsible for. You want to
see those employees that are not currently in charge of a project as well. The following query will return
a list of all employees whose names are greater than 'S', along with their assigned project numbers.
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE LEFT OUTER JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > 'S'

The result of this query contains some employees that do not have a project number. They are listed in
the query, but have the null value returned for their project number.
EMPNO   LASTNAME   PROJNO
000020  THOMPSON   PL2100
000060  STERN      MA2110
000100  SPENSER    OP2010
000170  YOSHIMURA
000180  SCOUTTEN
000190  WALKER
000250  SMITH      AD3112
000280  SCHNEIDER
000300  SMITH
000310  SETRIGHT
200170  YAMAMOTO
200280  SCHWARTZ
200310  SPRINGER
200330  WONG

Note: Using the RRN scalar function to return the relative record number for a column in the table on
the right in a left outer join or exception join will return a value of 0 for the unmatched rows.
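
Because unmatched rows carry the null value in the columns of the right table, a predicate on one of those columns can isolate them. The following is a sketch only; it assumes that PROJNO is never null in the PROJECT table itself, so a null PROJNO in the join result can only come from an unmatched employee.

```sql
-- Sketch: keep only the employees that have no matching project.
-- Assumption: PROJECT.PROJNO does not allow nulls, so IS NULL here
-- identifies rows that the outer join filled in.
SELECT EMPNO, LASTNAME
  FROM CORPDATA.EMPLOYEE LEFT OUTER JOIN CORPDATA.PROJECT
    ON EMPNO = RESPEMP
  WHERE LASTNAME > 'S'
    AND PROJNO IS NULL
```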
Right outer join:
A right outer join returns all the rows that an inner join returns plus one row for each of the other rows
in the second table that do not have a match in the first table. It is the same as a left outer join with the
tables specified in the opposite order.
The query that was used as the left outer join example can be rewritten as a right outer join as follows:
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.PROJECT RIGHT OUTER JOIN CORPDATA.EMPLOYEE
ON EMPNO = RESPEMP
WHERE LASTNAME > 'S'

The results of this query are identical to the results from the left outer join query.
Exception join:
A left exception join returns only the rows from the first table that do not have a match in the second
table.
Using the same tables as before, return those employees that are not responsible for any projects.

SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE EXCEPTION JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > 'S'

This join returns the following output.


EMPNO   LASTNAME   PROJNO
000170  YOSHIMURA
000180  SCOUTTEN
000190  WALKER
000280  SCHNEIDER
000300  SMITH
000310  SETRIGHT
200170  YAMAMOTO
200280  SCHWARTZ
200310  SPRINGER
200330  WONG

An exception join can also be written as a subquery using the NOT EXISTS predicate. The previous query
can be rewritten in the following way:
SELECT EMPNO, LASTNAME
FROM CORPDATA.EMPLOYEE
WHERE LASTNAME > 'S'
AND NOT EXISTS
(SELECT * FROM CORPDATA.PROJECT
WHERE EMPNO = RESPEMP)

The only difference in this query is that it cannot return values from the PROJECT table.
There is a right exception join, too, that works just like a left exception join but with the tables reversed.
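
As a sketch of the right exception join just mentioned, the earlier exception join example can be written with the tables in the opposite order. This assumes the same sample tables and is expected to select the same employees as the left exception join.

```sql
-- Sketch: rows from the second table (EMPLOYEE) that have no match
-- in the first table (PROJECT).
SELECT EMPNO, LASTNAME
  FROM CORPDATA.PROJECT RIGHT EXCEPTION JOIN CORPDATA.EMPLOYEE
    ON EMPNO = RESPEMP
  WHERE LASTNAME > 'S'
```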
Cross join:
A cross join, also known as a Cartesian Product join, returns a result table where each row from the first
table is combined with each row from the second table.
The number of rows in the result table is the product of the number of rows in each table. If the tables
involved are large, this join can take a very long time.
A cross join can be specified in two ways: using the JOIN syntax or by listing the tables in the FROM
clause separated by commas without using a WHERE clause to supply join criteria.
Suppose that the following tables exist.
Table 16. Table A
ACOL1  ACOL2
A1     AA1
A2     AA2
A3     AA3

Table 17. Table B
BCOL1  BCOL2
B1     BB1
B2     BB2

The following two select statements produce identical results.


SELECT * FROM A CROSS JOIN B
SELECT * FROM A, B

The result table for either of these SELECT statements looks like this.
ACOL1  ACOL2  BCOL1  BCOL2
A1     AA1    B1     BB1
A1     AA1    B2     BB2
A2     AA2    B1     BB1
A2     AA2    B2     BB2
A3     AA3    B1     BB1
A3     AA3    B2     BB2
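
The row count of a cross join is the product of the row counts of the joined tables. As a quick check, the following sketch counts the rows of the join of the sample tables A and B; with three rows in A and two rows in B as shown above, the count is 3 x 2 = 6.

```sql
-- Sketch: count the rows produced by the cross join of A and B.
-- With 3 rows in A and 2 rows in B, this returns 6.
SELECT COUNT(*) AS ROW_COUNT
  FROM A CROSS JOIN B
```

Running a count like this before selecting all columns is one way to gauge how large a cross join of bigger tables would be.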

Full outer join:


Like the left and right outer joins, a full outer join returns matching rows from both tables. However, a
full outer join also returns nonmatching rows from both tables.
Suppose that you want to find all employees and all of their projects. You want to see those employees
that are not currently in charge of a project as well as any projects that do not have an employee
assigned. The following query returns a list of all employees whose names are greater than 'S', along with
their assigned project numbers:
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE FULL OUTER JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > 'S'

Because there are no projects without an assigned employee, the query returns the same rows as a left
outer join. Here are the results.
EMPNO   LASTNAME   PROJNO
000020  THOMPSON   PL2100
000060  STERN      MA2110
000100  SPENSER    OP2010
000170  YOSHIMURA
000180  SCOUTTEN
000190  WALKER
000250  SMITH      AD3112
000280  SCHNEIDER
000300  SMITH
000310  SETRIGHT
200170  YAMAMOTO
200280  SCHWARTZ
200310  SPRINGER
200330  WONG

Multiple join types in one statement:


Sometimes you need to join more than two tables to produce the result that you want.
If you want to return all the employees, their department names, and the projects they are responsible
for, if any, you need to join the EMPLOYEE table, the DEPARTMENT table, and the PROJECT table to get
the information. The following example shows the query and the results:
SELECT EMPNO, LASTNAME, DEPTNAME, PROJNO
FROM CORPDATA.EMPLOYEE INNER JOIN CORPDATA.DEPARTMENT
ON WORKDEPT = DEPTNO
LEFT OUTER JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > 'S'

EMPNO   LASTNAME   DEPTNAME                PROJNO
000020  THOMPSON   PLANNING                PL2100
000060  STERN      MANUFACTURING SYSTEMS   MA2110
000100  SPENSER    SOFTWARE SUPPORT        OP2010
000170  YOSHIMURA  MANUFACTURING SYSTEMS
000180  SCOUTTEN   MANUFACTURING SYSTEMS
000190  WALKER     MANUFACTURING SYSTEMS
000250  SMITH      ADMINISTRATION SYSTEMS  AD3112
000280  SCHNEIDER  OPERATIONS
000300  SMITH      OPERATIONS
000310  SETRIGHT   OPERATIONS


Using table expressions


You can use table expressions to specify an intermediate result table.
Table expressions can be used in place of a view to avoid creating the view when general use of the view
is not required. Table expressions consist of nested table expressions (also called derived tables) and
common table expressions.
Nested table expressions are specified within parentheses in the FROM clause. For example, suppose you
want a result table that shows the manager number, department number, and maximum salary for each
department. The manager number is in the DEPARTMENT table, the department number is in both the
DEPARTMENT and EMPLOYEE tables, and the salaries are in the EMPLOYEE table. You can use a table
expression in the FROM clause to select the maximum salary for each department. You can also add a
correlation name, T2, following the nested table expression to name the derived table. The outer select
then uses T2 to qualify columns that are selected from the derived table, in this case MAXSAL and
WORKDEPT. Note that the MAX(SALARY) column selected in the nested table expression must be
named in order to be referenced in the outer select. The AS clause is used to do that.
SELECT MGRNO, T1.DEPTNO, MAXSAL
FROM CORPDATA.DEPARTMENT T1,
(SELECT MAX(SALARY) AS MAXSAL, WORKDEPT
FROM CORPDATA.EMPLOYEE E1
GROUP BY WORKDEPT) T2
WHERE T1.DEPTNO = T2.WORKDEPT
ORDER BY DEPTNO

The result of the query follows.


MGRNO   DEPTNO  MAXSAL
000010  A00     52750.00
000020  B01     41250.00
000030  C01     38250.00
000060  D11     32250.00
000070  D21     36170.00
000050  E01     40175.00
000090  E11     29750.00
000100  E21     26150.00

Common table expressions can be specified before the full-select in a SELECT statement, an INSERT
statement, or a CREATE VIEW statement. They can be used when the same result table needs to be
shared in a full-select. Common table expressions are preceded with the keyword WITH.
For example, suppose you want a table that shows the minimum and maximum of the average salary of
a certain set of departments. The first character of the department number has some meaning and you
want to get the minimum and maximum for those departments that start with the letter 'D' and those
that start with the letter 'E'. You can use a common table expression to select the average salary for each
department. Again, you must name the derived table; in this case, the name is DT. You can then specify a
SELECT statement using a WHERE clause to restrict the selection to only the departments that begin with
a certain letter. Specify the minimum and maximum of column AVGSAL from the derived table DT.
Specify a UNION to get the results for the letter 'E' and the results for the letter 'D'.
WITH DT AS (SELECT E.WORKDEPT AS DEPTNO, AVG(SALARY) AS AVGSAL
FROM CORPDATA.DEPARTMENT D , CORPDATA.EMPLOYEE E
WHERE D.DEPTNO = E.WORKDEPT
GROUP BY E.WORKDEPT)
SELECT 'E', MAX(AVGSAL), MIN(AVGSAL) FROM DT
WHERE DEPTNO LIKE 'E%'
UNION
SELECT 'D', MAX(AVGSAL), MIN(AVGSAL) FROM DT
WHERE DEPTNO LIKE 'D%'

The result of the query follows.


MAX(AVGSAL)  MIN(AVGSAL)
40175.00     21020.00
25668.57     25147.27
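
For comparison, the nested table expression example earlier in this topic (maximum salary for each department) could also be written with a common table expression. This is a sketch under the same sample tables; the name T2 is simply reused from that example.

```sql
-- Sketch: the derived table T2 is defined once in a WITH clause
-- and then referenced by name in the FROM clause.
WITH T2 AS (SELECT MAX(SALARY) AS MAXSAL, WORKDEPT
              FROM CORPDATA.EMPLOYEE
              GROUP BY WORKDEPT)
SELECT MGRNO, T1.DEPTNO, MAXSAL
  FROM CORPDATA.DEPARTMENT T1, T2
  WHERE T1.DEPTNO = T2.WORKDEPT
  ORDER BY DEPTNO
```

The common table expression form is most useful when the same intermediate result is referenced more than once in the fullselect, as in the UNION example above.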

Suppose that you want to write a query against your ordering database that will return the top 5 items
(in total quantity ordered) within the last 1000 orders from customers who also ordered item 'XXX'.
WITH X AS (SELECT ORDER_ID, CUST_ID
FROM ORDERS
ORDER BY ORD_DATE DESC
FETCH FIRST 1000 ROWS ONLY),
Y AS (SELECT CUST_ID, LINE_ID, ORDER_QTY
FROM X, ORDERLINE
WHERE X.ORDER_ID = ORDERLINE.ORDER_ID)
SELECT LINE_ID
FROM (SELECT LINE_ID
FROM Y
WHERE Y.CUST_ID IN (SELECT DISTINCT CUST_ID
FROM Y
WHERE LINE_ID = 'XXX')
GROUP BY LINE_ID
ORDER BY SUM(ORDER_QTY) DESC)
FETCH FIRST 5 ROWS ONLY

The first common table expression (X) returns the most recent 1000 order numbers. The result is ordered
by the date in descending order and then only the first 1000 of those ordered rows are returned as the
result table.
The second common table expression (Y) joins the most recent 1000 orders with the line item table and
returns (for each of the 1000 orders) the customer, line item, and quantity of the line item for that order.
The derived table in the main select statement returns the line items for the customers who are in the
most recent 1000 orders and who ordered item XXX. The results for all customers who ordered XXX are
then grouped by the line item, and the groups are ordered by the total quantity of the line item.
Finally, the outer select selects only the first 5 rows from the ordered list that the derived table returned.

Using recursive queries


Some applications work with data that is recursive in nature. To query this type of data, you can use a
hierarchical query or a recursive common table expression.
One example of recursive data is a Bill of Materials (BOM) application that works with the expansion of
parts into their component subparts. For example, a chair might be made of a seat unit and a leg assembly.
The seat unit might consist of a seat and two arms. Each of these parts can be further broken down into
its subparts until there is a list of all the parts needed to build a chair.
DB2 for i provides two ways of defining a recursive query. The first is a hierarchical query, which
uses the CONNECT BY clause to define how a parent row is to be associated with its child rows.
The second is a recursive common table expression, which uses a common table expression
to define the first, or seed, rows and then uses a UNION to define how the child rows are determined.
Each of these methods of defining a recursive query has advantages and disadvantages. The CONNECT
BY syntax is much simpler to understand, but has fewer ways to derive the data in its query. CONNECT
BY can be specified in any subselect anywhere in a query. A recursive common table expression has more
options for how the union is defined to generate the child rows.
There are a few behavioral differences between a connect by recursive query and a recursive
common table expression query. First, they differ in how they handle cyclic data. This difference is
discussed in the examples. Second, connect by allows a sort among siblings. This is also shown in the
examples. Finally, the two implementations differ in how the data is put on a queue that is used to
implement the recursion. By default, a recursive common table expression's data tends to come out in
breadth first order, first in first out. With connect by, the order is designed to come out depth first. This
means that rows in a recursive step immediately follow their parent row. The recursive common table
expression syntax gives you a choice of depth or breadth first hierarchical order by adding the SEARCH
clause. The connect by syntax is always depth first.
In the trip planner examples for these recursive methods, airline flights and train connections are used to
find transportation paths between cities. The following table definitions and data are used in the
examples.

CREATE TABLE FLIGHTS (DEPARTURE CHAR(20),
                      ARRIVAL CHAR(20),
                      CARRIER CHAR(15),
                      FLIGHT_NUMBER CHAR(5),
                      PRICE INT);

INSERT INTO FLIGHTS VALUES('New York', 'Paris', 'Atlantic', '234', 400);
INSERT INTO FLIGHTS VALUES('Chicago', 'Miami', 'NA Air', '2334', 300);
INSERT INTO FLIGHTS VALUES('New York', 'London', 'Atlantic', '5473', 350);
INSERT INTO FLIGHTS VALUES('London', 'Athens', 'Mediterranean', '247', 340);
INSERT INTO FLIGHTS VALUES('Athens', 'Nicosia', 'Mediterranean', '2356', 280);
INSERT INTO FLIGHTS VALUES('Paris', 'Madrid', 'Euro Air', '3256', 380);
INSERT INTO FLIGHTS VALUES('Paris', 'Cairo', 'Euro Air', '63', 480);
INSERT INTO FLIGHTS VALUES('Chicago', 'Frankfurt', 'Atlantic', '37', 480);
INSERT INTO FLIGHTS VALUES('Frankfurt', 'Moscow', 'Asia Air', '2337', 580);
INSERT INTO FLIGHTS VALUES('Frankfurt', 'Beijing', 'Asia Air', '77', 480);
INSERT INTO FLIGHTS VALUES('Moscow', 'Tokyo', 'Asia Air', '437', 680);
INSERT INTO FLIGHTS VALUES('Frankfurt', 'Vienna', 'Euro Air', '59', 200);
INSERT INTO FLIGHTS VALUES('Paris', 'Rome', 'Euro Air', '534', 340);
INSERT INTO FLIGHTS VALUES('Miami', 'Lima', 'SA Air', '5234', 530);
INSERT INTO FLIGHTS VALUES('New York', 'Los Angeles', 'NA Air', '84', 330);
INSERT INTO FLIGHTS VALUES('Los Angeles', 'Tokyo', 'Pacific Air', '824', 530);
INSERT INTO FLIGHTS VALUES('Tokyo', 'Hawaii', 'Asia Air', '94', 330);
INSERT INTO FLIGHTS VALUES('Washington', 'Toronto', 'NA Air', '104', 250);

CREATE TABLE TRAINS(DEPARTURE CHAR(20),
                    ARRIVAL CHAR(20),
                    RAILLINE CHAR(15),
                    TRAIN CHAR(5),
                    PRICE INT);

INSERT INTO TRAINS VALUES('Chicago', 'Washington', 'UsTrack', '323', 90);
INSERT INTO TRAINS VALUES('Madrid', 'Barcelona', 'EuroTrack', '5234', 60);
INSERT INTO TRAINS VALUES('Washington', 'Boston', 'UsTrack', '232', 50);

CREATE TABLE FLIGHTSTATS(FLIGHT# CHAR(5),
                         ON_TIME_PERCENT DECIMAL(5,2),
                         CANCEL_PERCENT DECIMAL(5,2));

INSERT INTO FLIGHTSTATS VALUES('234', 85.0, 0.20);
INSERT INTO FLIGHTSTATS VALUES('2334', 92.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('5473', 86.2, 0.10);
INSERT INTO FLIGHTSTATS VALUES('247', 91.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('2356', 91.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('3256', 92.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('63', 90.5, 0.10);
INSERT INTO FLIGHTSTATS VALUES('37', 87.0, 0.20);
INSERT INTO FLIGHTSTATS VALUES('2337', 80.0, 0.20);
INSERT INTO FLIGHTSTATS VALUES('77', 86.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('437', 81.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('59', 85.0, 01.0);
INSERT INTO FLIGHTSTATS VALUES('534', 87.0, 01.0);
INSERT INTO FLIGHTSTATS VALUES('5234', 88.0, 0.20);
INSERT INTO FLIGHTSTATS VALUES('84', 88.0, 0.1);
INSERT INTO FLIGHTSTATS VALUES('824', 93.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('94', 92.0, 0.10);
INSERT INTO FLIGHTSTATS VALUES('104', 93.0, 0.10);

Using connect by hierarchical queries

Suppose you want to find out what cities you can fly to if you start in Chicago, and how many separate
flights it will take to get there. The following query shows you that information.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival, LEVEL AS flight_count
FROM flights
START WITH departure = 'Chicago'
CONNECT BY PRIOR arrival = departure

This query returns the following information.

Table 18. Results of the previous query
ORIGIN   DEPARTURE  ARRIVAL    FLIGHT_COUNT
Chicago  Chicago    Miami
Chicago  Miami      Lima
Chicago  Chicago    Frankfurt
Chicago  Frankfurt  Vienna
Chicago  Frankfurt  Beijing
Chicago  Frankfurt  Moscow
Chicago  Moscow     Tokyo
Chicago  Tokyo      Hawaii

There are several parts to this hierarchical query. There is an initial selection which defines the initial seed
for the recursion. In this case, it is the rows from the flights table that START WITH a departure from
'Chicago'. The CONNECT BY clause is used to define how the rows that have already been generated are
to be 'connected' to generate more rows for subsequent iterations of the query. The PRIOR unary operator
tells DB2 how to select a new row based on the results of the previous row. The recursive join column
(typically one column but could have several) selected by the result of the START WITH clause is
referenced by the PRIOR keyword. This means that the previous row's ARRIVAL city becomes the new
row's PRIOR value for the DEPARTURE city. This is encapsulated in the clause CONNECT BY PRIOR
arrival = departure.

There are two other connect by features illustrated in this example query. The unary operator
CONNECT_BY_ROOT is used to define a fixed expression value that is determined in the initialization
step and is the same for all the generated recursive result rows. Typically, it is your starting value for that
particular iteration as you might have multiple START WITH values. In this query, it defines in the result
set the ORIGIN of the different destination options from Chicago. If the START WITH clause selected
multiple cities, ORIGIN would indicate which city a row used as its start value.
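
To illustrate multiple start values, the following sketch seeds the recursion from two cities instead of one. It assumes the FLIGHTS sample table defined earlier; the choice of 'Chicago' and 'New York' is only for illustration.

```sql
-- Sketch: two seed cities. ORIGIN (from CONNECT_BY_ROOT) shows
-- which start city each generated row descends from.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival,
       LEVEL AS flight_count
  FROM flights
  START WITH departure IN ('Chicago', 'New York')
  CONNECT BY PRIOR arrival = departure
```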

LEVEL is one of three pseudo columns available when using connect by recursion. The value of LEVEL
reflects the recursion level of the current row. In this example, LEVEL also reflects the number of flights it
would take to get from the city of ORIGIN (Chicago) to the different ARRIVAL cities.
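
As a sketch of how LEVEL can be used for selection, the following query keeps only the destinations that can be reached from Chicago in at most two flights. It assumes the FLIGHTS sample table, and it relies on the WHERE clause being applied after the recursion, as described later for connect by queries.

```sql
-- Sketch: restrict the result to rows at recursion level 1 or 2.
-- The limit of 2 is an arbitrary value chosen for illustration.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival,
       LEVEL AS flight_count
  FROM flights
  WHERE LEVEL <= 2
  START WITH departure = 'Chicago'
  CONNECT BY PRIOR arrival = departure
```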

A hierarchical query is run just like its equivalent recursive common table expression query and generates
the same result set. See Using recursive common table expressions and recursive views. The
only difference is in the order of the returned rows. The connect by query returns the rows in depth first
order; every row of the result set immediately follows its parent row. The recursive common table
expression query returns the rows in breadth first order; all the rows for one level are returned, then all
the rows that were generated from the previous level are returned.
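
For comparison, a recursive common table expression can request depth first order through the SEARCH clause. The following is a sketch only, assuming the FLIGHTS sample table; the sequence column name DEPTH_ORDER is a hypothetical name chosen for illustration.

```sql
-- Sketch: the SEARCH clause generates an ordering column
-- (here named depth_order, a hypothetical name) that the outer
-- select uses to return the rows in depth first order.
WITH destinations (origin, departure, arrival, flight_count) AS
  (SELECT a.departure, a.departure, a.arrival, 1
     FROM flights a
     WHERE a.departure = 'Chicago'
   UNION ALL
   SELECT r.origin, b.departure, b.arrival, r.flight_count + 1
     FROM destinations r, flights b
     WHERE r.arrival = b.departure)
  SEARCH DEPTH FIRST BY arrival SET depth_order
SELECT origin, departure, arrival, flight_count
  FROM destinations
  ORDER BY depth_order
```

Without the ORDER BY on the generated column, the SEARCH clause alone is not expected to change the order in which rows are returned.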

Example: Two tables used for recursion using CONNECT BY

Now, suppose you start in Chicago but want to add in transportation options by rail in addition to flights
and you want to know which cities you can get to and how many connections it would take.

The following connect by query returns that information. Note that in the corresponding recursive
common table expression example, Example: Two tables used for recursion using recursive common
table expressions, we can also distinguish between the number of rail and airline
connections and sum the ongoing ticket cost of the connections to that destination. Those calculations are
examples of derivations allowed by the more complex but more flexible recursive common table
expression syntax. That capability is not available when using the connect by syntax.
SELECT CONNECT_BY_ROOT departure AS departure, arrival, LEVEL - 1 connections
FROM
( SELECT departure, arrival FROM flights
UNION
SELECT departure, arrival FROM trains) t
START WITH departure = 'Chicago'
CONNECT BY PRIOR arrival = departure;

This query returns the following information.

Table 19. Results of the previous query
DEPARTURE  ARRIVAL     CONNECTIONS
Chicago    Miami
Chicago    Lima
Chicago    Frankfurt
Chicago    Vienna
Chicago    Beijing
Chicago    Moscow
Chicago    Tokyo
Chicago    Hawaii
Chicago    Washington
Chicago    Boston
Chicago    Toronto

In this example, there are two data sources feeding the recursion, a list of flights and a list of trains. In
the final results, you see how many connections are needed to travel between the cities.

Example: Sibling ordering using CONNECT BY

One of the drawbacks of recursive common table expressions is that you cannot order the results among
siblings based on a particular column value. You can do this with connect by. For example, if you want to
output destinations from New York but you also want to order your hierarchical data among siblings by
a certain value, such as the cost of a ticket to that destination, you can do that by specifying the ORDER
SIBLINGS BY clause.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival,
LEVEL level, price ticket_price
FROM flights
START WITH departure = 'New York'
CONNECT BY PRIOR arrival = departure
ORDER SIBLINGS BY price ASC

This query returns the following information.

Table 20. Results of the previous query
ORIGIN    DEPARTURE  ARRIVAL  LEVEL  TICKET_PRICE
New York  New York   LA              330
New York  LA         Tokyo           530
New York  Tokyo      Hawaii          330
New York  New York   London          350
New York  London     Athens          340
New York  Athens     Nicosia         280
New York  New York   Paris           400
New York  Paris      Rome            340
New York  Paris      Madrid          380
New York  Paris      Cairo           480

The result table shows all the destinations possible from the origin city of New York. All sibling
destinations (those destinations that originate from the same departure city) are output sorted by ticket
price. For example, the destinations from Paris are Rome, Madrid, and Cairo; they are output ordered by
ascending ticket price. Note that the output shows New York to LA as the first destination directly from
New York because its ticket price (330) is lower than that of the direct connections to London and
Paris, which are 350 and 400 respectively.

Example: Cyclic data checks using CONNECT BY

The key to any recursive process, whether it is a recursive program or a recursive query, is that the
recursion must be finite. If not, you will get into a never ending loop. CONNECT BY is unlike recursive
common table expressions in that it always checks for infinite recursion and terminates that cycle
automatically so you never have to worry about a runaway query.
By default, if connect by encounters cyclic data, it will issue an SQL error, SQ20451: Cycle detected in
hierarchical query. This error causes termination of the query so no results are returned.
If you want results back and just want the infinite cycle to stop, you can specify the NOCYCLE keyword
on the CONNECT BY clause. This means no error will be issued for cyclic data.
Using the NOCYCLE option along with the CONNECT_BY_ISCYCLE pseudo column is a way you can
find cyclic data and correct the data if desired.

Inserting the following row into the FLIGHTS table results in potentially infinite recursion since Paris
goes to Cairo and Cairo goes to Paris.
INSERT INTO flights VALUES ('Cairo', 'Paris', 'Atlantic', '1134', 440);

The following query illustrates the tolerance of the cyclic data by specifying NOCYCLE. In addition, the
CONNECT_BY_ISCYCLE pseudo column is used to identify cyclic rows and the function
SYS_CONNECT_BY_PATH is used to build an Itinerary string of all the connection cities leading up to
the destination. SYS_CONNECT_BY_PATH is implemented as a CLOB data type so you have a large
result column to reflect deep recursions.
SELECT CONNECT_BY_ROOT departure AS origin, arrival,
SYS_CONNECT_BY_PATH(TRIM(arrival), ' : ') itinerary, CONNECT_BY_ISCYCLE cyclic
FROM flights
START WITH departure = 'New York'
CONNECT BY NOCYCLE PRIOR arrival = departure;

This query returns the following information.

Table 21. Results of the previous query
ORIGIN    ARRIVAL  ITINERARY                    CYCLIC
New York  Paris    : Paris
New York  Rome     : Paris : Rome
New York  Cairo    : Paris : Cairo
New York  Paris    : Paris : Cairo : Paris
New York  Madrid   : Paris : Madrid
New York  London   : London
New York  Athens   : London : Athens
New York  Nicosia  : London : Athens : Nicosia
New York  LA       : LA
New York  Tokyo    : LA : Tokyo
New York  Hawaii   : LA : Tokyo : Hawaii


Note that the result set row that reflects cyclic data often depends on where you start in the cycle with
the START WITH clause.

Example: Pseudo column CONNECT_BY_ISLEAF in CONNECT BY

When processing recursive data, you might want to know which rows result in no
further recursion; in other words, which rows are leaf rows that have no children in the hierarchy.
In the following query, you can find out which destinations are final destinations; in other words, which
destinations have no outbound flights. The CONNECT_BY_ISLEAF pseudo column will be 0 if it is not a
leaf and 1 if it is. You can also specify CONNECT_BY_ISLEAF in a WHERE predicate to see only leaf
rows.
SELECT CONNECT_BY_ROOT departure AS origin, arrival,
SYS_CONNECT_BY_PATH(TRIM(arrival), ' : ') itinerary, CONNECT_BY_ISLEAF leaf
FROM flights
START WITH departure = 'New York'
CONNECT BY PRIOR arrival = departure;

This query returns the following information.

Table 22. Results of the previous query
ORIGIN    ARRIVAL  ITINERARY                    LEAF
New York  Paris    : Paris
New York  Rome     : Paris : Rome
New York  Cairo    : Paris : Cairo
New York  Madrid   : Paris : Madrid
New York  London   : London
New York  Athens   : London : Athens
New York  Nicosia  : London : Athens : Nicosia
New York  LA       : LA
New York  Tokyo    : LA : Tokyo
New York  Hawaii   : LA : Tokyo : Hawaii
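
As noted above, CONNECT_BY_ISLEAF can also be used in a WHERE predicate. The following sketch, assuming the same FLIGHTS sample table, returns only the final destinations reachable from New York.

```sql
-- Sketch: keep only leaf rows, that is, destinations that have
-- no outbound flights in the FLIGHTS table.
SELECT CONNECT_BY_ROOT departure AS origin, arrival,
       SYS_CONNECT_BY_PATH(TRIM(arrival), ' : ') itinerary
  FROM flights
  WHERE CONNECT_BY_ISLEAF = 1
  START WITH departure = 'New York'
  CONNECT BY PRIOR arrival = departure;
```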

Example: Join predicates and where clause selection with CONNECT BY

Often the hierarchical nature of your data is reflected in one table, but you need to join those results
to additional tables to fully determine the output of the row.
In a connect by query you can use any type of join supported by DB2 for i including INNER JOIN, LEFT
OUTER JOIN, and LEFT EXCEPTION JOIN. When you explicitly use a JOIN clause, the predicate
specified in the ON clause is applied first, before the connect by operation, and any WHERE clause in the
connect by query is applied after the recursion. The WHERE selection is applied after the connect by so
that the recursive process results don't end too soon.

In the following query, you are looking for all the flight connections starting in New York that have an
ON_TIME_PERCENT greater than 90%.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival,
flight_number, on_time_percent AS onTime
FROM flights INNER JOIN flightstats ON flight_number = flight#
WHERE on_time_percent > 90
START WITH departure = 'New York'
CONNECT BY PRIOR arrival = departure;

This query returns the following information.

Table 23. Results of the previous query
ORIGIN    DEPARTURE  ARRIVAL  FLIGHT#  ONTIME
New York  Paris      Cairo    63       90.50
New York  Paris      Madrid   3256     92.00
New York  London     Athens   247      91.00
New York  Athens     Nicosia  2356     91.00
New York  LA         Tokyo    824      93.00
New York  Tokyo      Hawaii   94       92.00

This query can also be expressed without using the JOIN syntax. The query optimizer will pull out of the
WHERE clause those predicates that are join predicates to be processed first and leave any remaining
WHERE predicates to be evaluated after the recursion.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival, flight_number, on_time_percent AS onTime
FROM flights, flightstats
WHERE flight_number = flight# AND on_time_percent > 90
START WITH departure = 'New York'
CONNECT BY PRIOR arrival = departure;

In this second example, if the WHERE predicates are more complex, you may need to aid the optimizer
by explicitly pulling out the JOIN predicates between the flights and flightstats tables and using both an
ON clause and a WHERE clause.
If you want additional search conditions to be applied as part of the recursion process, for example you
never want to take a flight with an on time percentage of less than 90%, you can also control the join
results by putting the join in a derived table with a join predicate and a WHERE clause.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival, flight_number, on_time_percent AS onTime
FROM (SELECT departure, arrival, flight_number, on_time_percent
FROM flights, flightstats
WHERE flight_number = flight# AND on_time_percent > 90) t1
START WITH departure = 'New York'
CONNECT BY PRIOR arrival = departure;

Another option is to put the selection predicates in the START WITH and CONNECT BY clauses.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival, flight_number, on_time_percent AS onTime
FROM flights, flightstats
WHERE flight_number = flight#
START WITH departure = 'New York' AND on_time_percent > 90
CONNECT BY PRIOR arrival = departure AND on_time_percent > 90

In this case, you would be out of luck as there are no direct flights out of New York with a greater than
90% on time statistic. Since there is nothing to seed the recursion, no rows are returned from the query.

Using recursive common table expressions and recursive views


Suppose you want to find out what cities you can fly to if you start in Chicago, and how many separate
flights it will take to get there. The following query shows you that information.
WITH destinations (origin, departure, arrival, flight_count) AS
(SELECT a.departure, a.departure, a.arrival, 1
FROM flights a
WHERE a.departure = 'Chicago'
UNION ALL
SELECT r.origin, b.departure, b.arrival, r.flight_count + 1
FROM destinations r, flights b
WHERE r.arrival = b.departure)
SELECT origin, departure, arrival, flight_count
FROM destinations

This query returns the following information.


Table 24. Results of the previous query

ORIGIN     DEPARTURE   ARRIVAL     FLIGHT_COUNT
Chicago    Chicago     Miami       1
Chicago    Chicago     Frankfurt   1
Chicago    Miami       Lima        2
Chicago    Frankfurt   Moscow      2
Chicago    Frankfurt   Beijing     2
Chicago    Frankfurt   Vienna      2
Chicago    Moscow      Tokyo       3
Chicago    Tokyo       Hawaii      4

This recursive query is written in two parts. The first part of the common table expression is called the
initialization fullselect. It selects the first rows for the result set of the common table expression. In this
example, it selects the two rows in the flights table that get you directly to another location from Chicago.
It also initializes the number of flight legs to one for each row it selects.
The second part of the recursive query joins the rows from the current result set of the common table
expression with other rows from the original table. It is called the iterative fullselect. This is where the
recursion is introduced. Notice that the rows that have already been selected for the result set are
referenced by using the name of the common table expression as the table name and the common table
expression result column names as the column names.
In this recursive part of the query, any rows from the original table that you can get to from each of the
previously selected arrival cities are selected. A previously selected row's arrival city becomes the new
departure city. Each row from this recursive select increments the flight count to the destination by one
more flight. As these new rows are added to the common table expression result set, they are also fed
into the iterative fullselect to generate more result set rows. In the data for the final result, you can see
that the total number of flights is actually the total number of recursive joins (plus 1) it took to get to that
arrival city.
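
The two-part structure can be tried outside DB2 for i. The following Python sketch is not part of the original example: it runs the same common table expression against an in-memory SQLite database, which requires the RECURSIVE keyword. The rows in FLIGHTS are an assumption, reconstructed from the departure and arrival pairs in Table 24 rather than copied from the actual flights table, and the helper name flight_counts is hypothetical.

```python
import sqlite3

# Assumed sample rows, reconstructed from the pairs shown in Table 24.
FLIGHTS = [
    ("Chicago", "Miami"), ("Chicago", "Frankfurt"), ("Miami", "Lima"),
    ("Frankfurt", "Moscow"), ("Frankfurt", "Beijing"), ("Frankfurt", "Vienna"),
    ("Moscow", "Tokyo"), ("Tokyo", "Hawaii"),
    ("New York", "Paris"),  # not reachable from Chicago, so never selected
]


def flight_counts(origin):
    """Return {arrival city: number of flights} for every city reachable from origin."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE flights (departure TEXT, arrival TEXT)")
    con.executemany("INSERT INTO flights VALUES (?, ?)", FLIGHTS)
    rows = con.execute(
        """
        WITH RECURSIVE destinations (origin, departure, arrival, flight_count) AS
          (SELECT a.departure, a.departure, a.arrival, 1      -- initialization fullselect
             FROM flights a
            WHERE a.departure = ?
           UNION ALL
           SELECT r.origin, b.departure, b.arrival, r.flight_count + 1  -- iterative fullselect
             FROM destinations r, flights b
            WHERE r.arrival = b.departure)
        SELECT arrival, flight_count FROM destinations
        """,
        (origin,),
    ).fetchall()
    con.close()
    return dict(rows)
```

Calling flight_counts("Chicago") returns one entry per reachable city. Each pass of the iterative fullselect adds 1 to the count carried by the row it joined from.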

A recursive view looks very similar to a recursive common table expression. You can write the previous
recursive common table expression as a recursive view like this:
CREATE VIEW destinations (origin, departure, arrival, flight_count) AS
SELECT departure, departure, arrival, 1
FROM flights
WHERE departure = 'Chicago'
UNION ALL
SELECT r.origin, b.departure, b.arrival, r.flight_count + 1
FROM destinations r, flights b
WHERE r.arrival = b.departure

The iterative fullselect part of this view definition refers to the view itself. Selection from this view
returns the same rows as you get from the previous recursive common table expression. For comparison,
note that connect by recursion is allowed anywhere a SELECT is allowed, so it can easily be included in a
view definition.

Example: Two starting cities using recursive common table expressions


Suppose you are willing to fly from either Chicago or New York, and you want to know where you
could go and how much it would cost.
WITH destinations (departure, arrival, connections, cost) AS
(SELECT a.departure, a.arrival, 0, price
FROM flights a
WHERE a.departure = 'Chicago' OR
a.departure = 'New York'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure)
SELECT departure, arrival, connections, cost
FROM destinations

This query returns the following information.


Table 25. Results of the previous query

DEPARTURE   ARRIVAL       CONNECTIONS   COST
Chicago     Miami         0             300
Chicago     Frankfurt     0             480
New York    Paris         0             400
New York    London        0             350
New York    Los Angeles   0             330
Chicago     Lima          1             830
Chicago     Moscow        1             1,060
Chicago     Beijing       1             960
Chicago     Vienna        1             680
New York    Madrid        1             780
New York    Cairo         1             880
New York    Rome          1             740
New York    Athens        1             690
New York    Tokyo         1             860
Chicago     Tokyo         2             1,740
New York    Nicosia       2             970
New York    Hawaii        2             1,190
Chicago     Hawaii        3             2,070

For each returned row, the results show the starting departure city and the final destination city. The
query counts the number of connections needed rather than the total number of flights, and it adds up the
total cost for all the flights.
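
As a rough cross-check of how the connection count and the running cost accumulate, the following Python sketch runs an equivalent query in SQLite. It is an illustration under assumptions: the leg prices in FLIGHTS are inferred from the differences between the COST values in Table 25, the helper name trips is hypothetical, and an IN list stands in for the two OR predicates.

```python
import sqlite3

# Assumed leg prices, inferred from cost differences in Table 25.
FLIGHTS = [
    ("Chicago", "Miami", 300), ("Chicago", "Frankfurt", 480),
    ("New York", "Paris", 400), ("New York", "London", 350),
    ("New York", "Los Angeles", 330), ("Miami", "Lima", 530),
    ("Frankfurt", "Moscow", 580), ("Frankfurt", "Beijing", 480),
    ("Frankfurt", "Vienna", 200), ("Paris", "Madrid", 380),
    ("Paris", "Cairo", 480), ("Paris", "Rome", 340),
    ("London", "Athens", 340), ("Los Angeles", "Tokyo", 530),
    ("Moscow", "Tokyo", 680), ("Athens", "Nicosia", 280),
    ("Tokyo", "Hawaii", 330),
]


def trips(*origins):
    """Return (departure, arrival, connections, cost) rows for the given start cities."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE flights (departure TEXT, arrival TEXT, price INTEGER)")
    con.executemany("INSERT INTO flights VALUES (?, ?, ?)", FLIGHTS)
    marks = ", ".join("?" for _ in origins)  # one parameter marker per start city
    rows = con.execute(
        f"""
        WITH RECURSIVE destinations (departure, arrival, connections, cost) AS
          (SELECT a.departure, a.arrival, 0, a.price
             FROM flights a
            WHERE a.departure IN ({marks})
           UNION ALL
           SELECT r.departure, b.arrival, r.connections + 1, r.cost + b.price
             FROM destinations r, flights b
            WHERE r.arrival = b.departure)
        SELECT departure, arrival, connections, cost FROM destinations
        """,
        origins,
    ).fetchall()
    con.close()
    return rows
```

Because the iterative fullselect carries r.departure forward unchanged, every row keeps its original starting city no matter how many legs were joined.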

Example: Two tables used for recursion using recursive common table expressions
Now, suppose you start in Chicago but add in transportation by railway in addition to the airline flights,
and you want to know which cities you can go to.
The following query returns that information:
WITH destinations (departure, arrival, connections, flights, trains, cost) AS
(SELECT f.departure, f.arrival, 0, 1, 0, price
FROM flights f
WHERE f.departure = 'Chicago'
UNION ALL
SELECT t.departure, t.arrival, 0, 0, 1, price
FROM trains t
WHERE t.departure = 'Chicago'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1 , r.flights + 1, r.trains,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure
UNION ALL
SELECT r.departure, c.arrival, r.connections + 1 ,
r.flights, r.trains + 1, r.cost + c.price
FROM destinations r, trains c
WHERE r.arrival = c.departure)
SELECT departure, arrival, connections, flights, trains, cost
FROM destinations

This query returns the following information.


Table 26. Results of the previous query

DEPARTURE   ARRIVAL      CONNECTIONS   FLIGHTS   TRAINS   COST
Chicago     Miami        0             1         0        300
Chicago     Frankfurt    0             1         0        480
Chicago     Washington   0             0         1        90
Chicago     Lima         1             2         0        830
Chicago     Moscow       1             2         0        1,060
Chicago     Beijing      1             2         0        960
Chicago     Vienna       1             2         0        680
Chicago     Toronto      1             0         2        340
Chicago     Boston       1             0         2        140
Chicago     Tokyo        2             3         0        1,740
Chicago     Hawaii       3             4         0        2,070

In this example, there are two parts of the common table expression that provide initialization values to
the query: one for flights and one for trains. For each of the result rows, there are two recursive
references to get from the previous arrival location to the next possible destination: one for continuing by
air, the other for continuing by train. In the final results, you would see how many connections are
needed and how many airline or train trips can be taken.

Example: DEPTH FIRST and BREADTH FIRST options for recursive common table
expressions
The two examples here show the difference in the result set row order based on whether the recursion is
processed depth first or breadth first.
Note: The search clause is not supported directly for recursive views. You can define a view that contains
a recursive common table expression to get this function.
The option to determine the result using breadth first or depth first is a recursive relationship sort based
on the recursive join column specified for the SEARCH BY clause. When the recursion is handled breadth
first, all children are processed first, then all grandchildren, then all great grandchildren. When the
recursion is handled depth first, the full recursive ancestry chain of one child is processed before going to
the next child.
In both of these cases, you specify an extra column name that is used by the recursive process to keep
track of the depth first or breadth first ordering. This column must be used in the ORDER BY clause of
the outer query to get the rows back in the specified order. If this column is not used in the ORDER BY,
the DEPTH FIRST or BREADTH FIRST processing option is ignored.
The selection of which column to use for the SEARCH BY column is important. To have any meaning in
the result, it must be the column that is used in the iterative fullselect to join from the initialization
fullselect. In this example, ARRIVAL is the column to use.
The following query returns that information:
WITH destinations (departure, arrival, connections, cost) AS
(SELECT f.departure, f.arrival, 0, price
FROM flights f
WHERE f.departure = 'Chicago'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure)
SEARCH DEPTH FIRST BY arrival SET ordcol
SELECT *
FROM destinations
ORDER BY ordcol

This query returns the following information.


Table 27. Results of the previous query

DEPARTURE   ARRIVAL     CONNECTIONS   COST
Chicago     Miami       0             300
Chicago     Lima        1             830
Chicago     Frankfurt   0             480
Chicago     Moscow      1             1,060
Chicago     Tokyo       2             1,740
Chicago     Hawaii      3             2,070
Chicago     Beijing     1             960
Chicago     Vienna      1             680

In this result data, you can see that all destinations that are generated from the Chicago-to-Miami row are
listed before the destinations from the Chicago-to-Frankfurt row.
Next, you can run the same query but request the result to be ordered breadth first.
WITH destinations (departure, arrival, connections, cost) AS
(SELECT f.departure, f.arrival, 0, price
FROM flights f
WHERE f.departure = 'Chicago'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure)
SEARCH BREADTH FIRST BY arrival SET ordcol
SELECT *
FROM destinations
ORDER BY ordcol

This query returns the following information.


Table 28. Results of the previous query

DEPARTURE   ARRIVAL     CONNECTIONS   COST
Chicago     Miami       0             300
Chicago     Frankfurt   0             480
Chicago     Lima        1             830
Chicago     Moscow      1             1,060
Chicago     Beijing     1             960
Chicago     Vienna      1             680
Chicago     Tokyo       2             1,740
Chicago     Hawaii      3             2,070

In this result data, you can see that all the direct connections from Chicago are listed before the
connecting flights. The data is identical to the results from the previous query, but in a breadth first order.
As you can see, there is no ordering done based on any values of the column used for depth or breadth
first processing. To get ordering, the ORDER SIBLINGS BY construct available with the CONNECT BY
form of recursion can be used.
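
The difference between the two orderings can be modeled without a database. The following Python sketch is only a conceptual model, not a description of how DB2 for i implements the SEARCH clause: LEGS is an assumed adjacency list built from the arrival sequences in Tables 27 and 28, and the function names are hypothetical.

```python
from collections import deque

# Assumed connections, listed in the order the rows appear in the result tables.
LEGS = {
    "Chicago": ["Miami", "Frankfurt"],
    "Miami": ["Lima"],
    "Frankfurt": ["Moscow", "Beijing", "Vienna"],
    "Moscow": ["Tokyo"],
    "Tokyo": ["Hawaii"],
}


def depth_first(origin):
    """Follow one child's full chain before moving to the next child."""
    order = []

    def visit(city):
        for nxt in LEGS.get(city, []):
            order.append(nxt)
            visit(nxt)  # descend immediately

    visit(origin)
    return order


def breadth_first(origin):
    """Process all children, then all grandchildren, and so on."""
    order, queue = [], deque([origin])
    while queue:
        city = queue.popleft()
        for nxt in LEGS.get(city, []):
            order.append(nxt)
            queue.append(nxt)  # children wait behind the current level
    return order
```

With this data, depth_first("Chicago") lists the arrival cities in the sequence of Table 27 and breadth_first("Chicago") lists them in the sequence of Table 28.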

Example: Cyclic data using recursive common table expressions


The key to any recursive process, whether it is a recursive programming algorithm or querying recursive
data, is that the recursion must be finite. If not, you will get into a never ending loop. The CYCLE option
allows you to safeguard against cyclic data. Not only will it terminate repeating cycles but it also allows
you to optionally output a cycle mark indicator that may lead you to find cyclic data.
Note: The cycle clause is not supported directly for recursive views. You can define a view that contains
a recursive common table expression to get this function.

For a final example, suppose we have a cycle in the data. By adding one more row to the table, there is
now a flight from Cairo to Paris and one from Paris to Cairo. Without accounting for possible cyclic data
like this, it is quite easy to generate a query that will go into an infinite loop processing the data.
The following statements add that row and then run a query that uses the CYCLE clause to guard against
the loop:
INSERT INTO FLIGHTS VALUES('Cairo', 'Paris', 'Euro Air', '1134', 440)

WITH destinations (departure, arrival, connections, cost, itinerary) AS
(SELECT f.departure, f.arrival, 1, price,
CAST(f.departure CONCAT f.arrival AS VARCHAR(2000))
FROM flights f
WHERE f.departure = 'New York'
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1 ,
r.cost + b.price, CAST(r.itinerary CONCAT b.arrival AS VARCHAR(2000))
FROM destinations r, flights b
WHERE r.arrival = b.departure)
CYCLE arrival SET cyclic_data TO 1 DEFAULT 0
SELECT departure, arrival, itinerary, cyclic_data
FROM destinations

This query returns the following information.


Table 29. Results of the previous query

DEPARTURE   ARRIVAL       ITINERARY                           CYCLIC_DATA
New York    Paris         New York Paris                      0
New York    London        New York London                     0
New York    Los Angeles   New York Los Angeles                0
New York    Madrid        New York Paris Madrid               0
New York    Cairo         New York Paris Cairo                0
New York    Rome          New York Paris Rome                 0
New York    Athens        New York London Athens              0
New York    Tokyo         New York Los Angeles Tokyo          0
New York    Paris         New York Paris Cairo Paris          1
New York    Nicosia       New York London Athens Nicosia      0
New York    Hawaii        New York Los Angeles Tokyo Hawaii   0

In this example, the ARRIVAL column is defined in the CYCLE clause as the column to use for detecting
a cycle in the data. When a cycle is found, a special column, CYCLIC_DATA in this case, is set to the
character value of '1' for the cycling row in the result set. All other rows will contain the default value of
'0'. When a cycle on the ARRIVAL column is found, processing will not proceed any further in the data so
the infinite loop will not happen. To see if your data actually has a cyclic reference, the CYCLIC_DATA
column can be referenced in the outer query. You can choose to exclude cyclic rows by adding a
predicate: WHERE CYCLIC_DATA = 0.
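
The effect of the cycle mark can be approximated in a database that has no CYCLE clause. The following Python sketch uses SQLite and is an emulation under assumptions, not the DB2 for i mechanism: it keeps the visited cities in a slash-delimited itinerary string, flags a row when its arrival city is already in that string, and stops recursing from flagged rows. Unlike the CYCLE clause on ARRIVAL, this approximation also treats the starting city as visited. The four sample rows and the helper name itineraries are hypothetical.

```python
import sqlite3

# Assumed sample rows containing the Paris -> Cairo -> Paris cycle.
FLIGHTS = [
    ("New York", "Paris"), ("Paris", "Cairo"),
    ("Cairo", "Paris"), ("Paris", "Rome"),
]


def itineraries(origin):
    """Return (arrival, itinerary, cyclic_data) rows; cyclic rows are flagged with 1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE flights (departure TEXT, arrival TEXT)")
    con.executemany("INSERT INTO flights VALUES (?, ?)", FLIGHTS)
    rows = con.execute(
        """
        WITH RECURSIVE destinations (arrival, itinerary, cyclic_data) AS
          (SELECT f.arrival, '/' || f.departure || '/' || f.arrival || '/', 0
             FROM flights f
            WHERE f.departure = ?
           UNION ALL
           SELECT b.arrival,
                  r.itinerary || b.arrival || '/',
                  instr(r.itinerary, '/' || b.arrival || '/') > 0  -- 1 if already visited
             FROM destinations r, flights b
            WHERE r.arrival = b.departure
              AND r.cyclic_data = 0)                               -- do not expand cyclic rows
        SELECT arrival, itinerary, cyclic_data FROM destinations
        """,
        (origin,),
    ).fetchall()
    con.close()
    return rows
```

Without the r.cyclic_data = 0 predicate, the Paris and Cairo rows would keep feeding each other and the query would not terminate on its own.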

Using the UNION keyword to combine subselects


Using the UNION keyword, you can combine two or more subselects to form a fullselect.
When SQL encounters the UNION keyword, it processes each subselect to form an interim result table,
then it combines the interim result table of each subselect and deletes duplicate rows to form a combined
result table. You can use different clauses and techniques when coding select-statements.
You can use UNION to eliminate duplicates when merging lists of values obtained from several tables.
For example, you can obtain a combined list of employee numbers that includes:
• People in department D11
• People whose assignments include projects MA2112, MA2113, and AD3111
The combined list is derived from two tables and contains no duplicates. To do this, specify:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
UNION
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

To better understand the results from these SQL statements, imagine that SQL goes through the following
process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'

The query returns the following interim result table.


EMPNO from CORPDATA.EMPLOYEE
000060
000150
000160
000170
000180
000190
000200
000210
000220
200170
200220

Step 2. SQL processes the second SELECT statement:


SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'

The query returns another interim result table.


EMPNO from CORPDATA.EMPPROJACT
000230
000230
000240
000230
000230
000240
000230
000150
000170
000190
000170
000190
000150
000160
000180
000170
000210
000210

Step 3. SQL combines the two interim result tables, removes duplicate rows, and orders the result:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
UNION
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

The query returns a combined result table with values in ascending sequence.
EMPNO
000060
000150
000160
000170
000180
000190
000200
000210
000220
000230
000240
200170
200220
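
The three steps can be reproduced with a small Python sketch that uses SQLite in place of DB2 for i. It is an illustration under assumptions: the two tables are loaded directly with the interim results shown in steps 1 and 2, so the WHERE clauses are omitted, and the table layout and helper name union_empnos are hypothetical.

```python
import sqlite3

# Interim result of step 1 (department D11) and step 2 (project assignments).
D11 = "000060 000150 000160 000170 000180 000190 000200 000210 000220 200170 200220".split()
ASSIGNED = ("000230 000230 000240 000230 000230 000240 000230 000150 000170 "
            "000190 000170 000190 000150 000160 000180 000170 000210 000210").split()


def union_empnos():
    """Combine both interim results, drop duplicates, and order by EMPNO."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (empno TEXT)")
    con.execute("CREATE TABLE empprojact (empno TEXT)")
    con.executemany("INSERT INTO employee VALUES (?)", [(e,) for e in D11])
    con.executemany("INSERT INTO empprojact VALUES (?)", [(e,) for e in ASSIGNED])
    rows = con.execute(
        """
        SELECT empno FROM employee
        UNION
        SELECT empno FROM empprojact
        ORDER BY empno
        """
    ).fetchall()
    con.close()
    return [r[0] for r in rows]
```

The 29 input rows collapse to 13 distinct employee numbers, which is the same set that Python's set(D11) | set(ASSIGNED) would give.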

When you use UNION:
• Any ORDER BY clause must appear after the last subselect that is part of the union. In this example,
  the results are sequenced on the basis of the first selected column, EMPNO. The ORDER BY clause
  specifies that the combined result table is to be in collated sequence. ORDER BY is not allowed in a
  view.
• A name may be specified on the ORDER BY clause if the result columns are named. A result column is
  named if the corresponding columns in each of the unioned select-statements have the same name. An
  AS clause can be used to assign a name to columns in the select list.
SELECT A + B AS X ...
UNION
SELECT X ... ORDER BY X

If the result columns are unnamed, use a positive integer to order the result. The number refers to the
position of the expression in the list of expressions you include in your subselects.
SELECT A + B ...
UNION
SELECT X ... ORDER BY 1

To identify which subselect each row is from, you can include a constant at the end of the select list of
each subselect in the union. When SQL returns your results, the last column contains the constant for the
subselect that is the source of that row. For example, you can specify:
SELECT A, B, 'A1' ...
UNION
SELECT X, Y, 'B2' ...

When a row is returned, it includes a value (either A1 or B2) to indicate the table that is the source of the
row's values. If the column names in the union are different, SQL uses the set of column names specified
in the first subselect when interactive SQL displays or prints the results, or in the SQLDA resulting from
processing an SQL DESCRIBE statement.
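
A minimal sketch of the source-tagging technique, run in SQLite from Python, follows. The tables t1 and t2, their contents, the SOURCE column name, and the helper name tagged_union are all hypothetical. SQLite also takes the result column names from the first subselect, which the sketch reads from the cursor description.

```python
import sqlite3


def tagged_union():
    """Return (column names, rows) for a union whose last column names the source table."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE t1 (a, b);
        CREATE TABLE t2 (x, y);
        INSERT INTO t1 VALUES (1, 'one');
        INSERT INTO t2 VALUES (1, 'one'), (2, 'two');
        """
    )
    cur = con.execute(
        """
        SELECT a, b, 'A1' AS source FROM t1
        UNION
        SELECT x, y, 'B2' FROM t2
        ORDER BY 1, 3
        """
    )
    names = [d[0] for d in cur.description]
    rows = cur.fetchall()
    con.close()
    return names, rows
```

Note that the constant makes otherwise identical rows distinct, so the row (1, 'one') is returned once for each source instead of being merged by UNION.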
Note: Sort sequence is applied after the fields across the UNION pieces are made compatible. The sort
sequence is used for the distinct processing that implicitly occurs during UNION processing.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Creating and using views on page 48
A view can be used to access data in one or more tables or views. You create a view by using a SELECT
statement.
Specifying the UNION ALL keyword:
If you want to keep duplicates in the result of a UNION operation, specify the UNION ALL keyword
instead of just UNION.
This topic uses the same steps and example as Using the UNION keyword to combine subselects on
page 95.
Step 3. SQL combines two interim result tables:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
UNION ALL
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

The query returns an ordered result table that includes duplicates.


EMPNO
000060
000150
000150
000150
000160
000160
000170
000170
000170
000170
000180
000180
000190
000190
000190
000200
000210
000210
000210
000220
000230
000230
000230
000230
000230
000240
000240
200170
200220

The UNION ALL operation is associative, for example:


(SELECT PROJNO FROM CORPDATA.PROJECT
UNION ALL
SELECT PROJNO FROM CORPDATA.PROJECT)
UNION ALL
SELECT PROJNO FROM CORPDATA.EMPPROJACT

This statement can also be written as:


SELECT PROJNO FROM CORPDATA.PROJECT
UNION ALL
(SELECT PROJNO FROM CORPDATA.PROJECT
UNION ALL
SELECT PROJNO FROM CORPDATA.EMPPROJACT)

When you include the UNION ALL in the same SQL statement as a UNION operator, however, the result
of the operation depends on the order of evaluation. Where there are no parentheses, evaluation is from
left to right. Where parentheses are included, the parenthesized subselect is evaluated first, followed,
from left to right, by the other parts of the statement.
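
The following Python sketch shows how the position of UNION relative to UNION ALL changes the row count when no parentheses are used. It runs in SQLite, which also evaluates these operators from left to right; the tables a and b, their values, and the helper name row_counts are hypothetical.

```python
import sqlite3


def row_counts():
    """Return the row counts of two statements that differ only in operator order."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE a (x INTEGER);
        CREATE TABLE b (x INTEGER);
        INSERT INTO a VALUES (1), (2);
        INSERT INTO b VALUES (2), (3);
        """
    )
    # (a UNION ALL a) UNION b: the final UNION removes every duplicate.
    union_last = con.execute(
        "SELECT x FROM a UNION ALL SELECT x FROM a UNION SELECT x FROM b"
    ).fetchall()
    # (a UNION b) UNION ALL a: rows appended by the final UNION ALL are kept.
    union_all_last = con.execute(
        "SELECT x FROM a UNION SELECT x FROM b UNION ALL SELECT x FROM a"
    ).fetchall()
    con.close()
    return len(union_last), len(union_all_last)
```

The first statement returns the three distinct values, while the second returns five rows because the duplicates introduced by the trailing UNION ALL are never removed.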

Using the EXCEPT keyword


The EXCEPT keyword returns the result set of the first subselect minus any matching rows from the
second subselect.
Suppose that you want to find a list of employee numbers that includes people in department D11 minus
those people whose assignments include projects MA2112, MA2113, and AD3111.
This query returns all of the people in department D11 who are not working on projects MA2112,
MA2113, and AD3111:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
EXCEPT
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

To better understand the results from these SQL statements, imagine that SQL goes through the following
process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'

This query returns an interim result table.


EMPNO from CORPDATA.EMPLOYEE
000060
000150
000160
000170
000180
000190
000200
000210
000220
200170
200220

Step 2. SQL processes the second SELECT statement:


SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'

This query returns another interim result table.


EMPNO from CORPDATA.EMPPROJACT
000230
000230
000240
000230
000230
000240
000230
000150
000170
000190
000170
000190
000150
000160
000180
000170
000210
000210

Step 3. SQL takes the first interim result table, removes all of the rows that also appear in the second
interim result table, removes duplicate rows, and orders the result:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
EXCEPT
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

This query returns a combined result table with values in ascending sequence.
EMPNO
000060
000200
000220
200170
200220
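
The EXCEPT steps can be checked with the same kind of SQLite sketch. As before, this is an illustration under assumptions: the tables are loaded with the interim results from steps 1 and 2 instead of being filtered by WHERE clauses, and the helper name except_empnos is hypothetical.

```python
import sqlite3

# Interim result of step 1 (department D11) and step 2 (project assignments).
D11 = "000060 000150 000160 000170 000180 000190 000200 000210 000220 200170 200220".split()
ASSIGNED = ("000230 000230 000240 000230 000230 000240 000230 000150 000170 "
            "000190 000170 000190 000150 000160 000180 000170 000210 000210").split()


def except_empnos():
    """Return D11 employee numbers that do not appear among the project assignments."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (empno TEXT)")
    con.execute("CREATE TABLE empprojact (empno TEXT)")
    con.executemany("INSERT INTO employee VALUES (?)", [(e,) for e in D11])
    con.executemany("INSERT INTO empprojact VALUES (?)", [(e,) for e in ASSIGNED])
    rows = con.execute(
        """
        SELECT empno FROM employee
        EXCEPT
        SELECT empno FROM empprojact
        ORDER BY empno
        """
    ).fetchall()
    con.close()
    return [r[0] for r in rows]
```

The result matches the set difference set(D11) - set(ASSIGNED), which is a convenient way to reason about EXCEPT because duplicates are removed.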

Using the INTERSECT keyword


The INTERSECT keyword returns a combined result set that consists of all of the rows existing in both
result sets.
Suppose that you want to find a list of employee numbers that includes people in department D11 and
people whose assignments include projects MA2112, MA2113, and AD3111.
The INTERSECT operation returns all of the employee numbers that exist in both result sets. In other
words, this query returns all of the people in department D11 who are also working on projects MA2112,
MA2113, and AD3111:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
INTERSECT
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

To better understand the results from these SQL statements, imagine that SQL goes through the following
process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'

This query returns an interim result table.


EMPNO from CORPDATA.EMPLOYEE
000060
000150
000160
000170
000180
000190
000200
000210
000220
200170
200220

Step 2. SQL processes the second SELECT statement:

SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'

This query returns another interim result table.


EMPNO from CORPDATA.EMPPROJACT
000230
000230
000240
000230
000230
000240
000230
000150
000170
000190
000170
000190
000150
000160
000180
000170
000210
000210

Step 3. SQL takes the first interim result table, compares it to the second interim result table, and returns
the rows that exist in both tables minus any duplicate rows, and orders the results.
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = 'D11'
INTERSECT
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = 'MA2112' OR
PROJNO = 'MA2113' OR
PROJNO = 'AD3111'
ORDER BY EMPNO

This query returns a combined result table with values in ascending sequence.
EMPNO
000150
000160
000170
000180
000190
000210
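
A matching sketch for INTERSECT follows, again using SQLite from Python as a stand-in for DB2 for i. The tables hold the interim results from steps 1 and 2 directly, and the helper name intersect_empnos is hypothetical.

```python
import sqlite3

# Interim result of step 1 (department D11) and step 2 (project assignments).
D11 = "000060 000150 000160 000170 000180 000190 000200 000210 000220 200170 200220".split()
ASSIGNED = ("000230 000230 000240 000230 000230 000240 000230 000150 000170 "
            "000190 000170 000190 000150 000160 000180 000170 000210 000210").split()


def intersect_empnos():
    """Return employee numbers that appear in both interim results, without duplicates."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (empno TEXT)")
    con.execute("CREATE TABLE empprojact (empno TEXT)")
    con.executemany("INSERT INTO employee VALUES (?)", [(e,) for e in D11])
    con.executemany("INSERT INTO empprojact VALUES (?)", [(e,) for e in ASSIGNED])
    rows = con.execute(
        """
        SELECT empno FROM employee
        INTERSECT
        SELECT empno FROM empprojact
        ORDER BY empno
        """
    ).fetchall()
    con.close()
    return [r[0] for r in rows]
```

Employee 000170 appears three times in the second interim result but only once in the output, because INTERSECT removes duplicate rows.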

Data retrieval errors


Use this information to understand how SQL handles errors that occur when retrieving data.
If SQL finds that a retrieved character or graphic column is too long to be placed in a host variable, SQL
does the following:
• Truncates the data while assigning the value to the host variable.
• Sets SQLWARN0 and SQLWARN1 in the SQLCA to the value 'W' or sets RETURNED_SQLSTATE to
  '01004' in the SQL diagnostics area.
• Sets the indicator variable, if provided, to the length of the value before truncation.
If SQL finds a data mapping error while running a statement, one of two things occurs:
• If the error occurs on an expression in the SELECT list and an indicator variable is provided for the
  expression in error:
  - SQL returns a -2 for the indicator variable corresponding to the expression in error.
  - SQL returns all valid data for that row.
  - SQL returns a positive SQLCODE.
• If an indicator variable is not provided, SQL returns the corresponding negative SQLCODE.
Data mapping errors include:
• +138 - Argument of the substringing function is not valid.
• +180 - Syntax for a string representation of a date, time, or timestamp is not valid.
• +181 - String representation of a date, time, or timestamp is not a valid value.
• +183 - Invalid result from a date/time expression. The resulting date or timestamp is not within the
  valid range of dates or timestamps.
• +191 - MIXED data is not properly formed.
• +304 - Numeric conversion error (for example, overflow, underflow, or division by zero).
• +331 - Characters cannot be converted.
• +364 - DECFLOAT arithmetic error.
• +420 - Character in the CAST argument is not valid.
• +802 - Data conversion or data mapping error.
For data mapping errors, the SQLCA reports only the last error detected. The indicator variable
corresponding to each result column having an error is set to -2.
For data mapping errors on a multi-row FETCH, each mapping error reported as a warning SQLSTATE
will have a separate condition area in the SQL diagnostics area. Note that SQL stops on the first error, so
only one mapping error that is reported as an error SQLSTATE will be returned in the SQL diagnostics
area.
For all other SQL statements, only the last warning SQLSTATE will be reported in the SQL diagnostics
area.
If the full-select contains DISTINCT in the select list and a column in the select list contains numeric data
that is not valid, the data is considered equal to a null value if the query is completed as a sort. If an
existing index is used, the data is not considered equal to a null.
The impact of data mapping errors on the ORDER BY clause depends on the situation:

• If the data mapping error occurs while data is being assigned to a host variable in a SELECT INTO or
  FETCH statement, and that same expression is used in the ORDER BY clause, the result record is
  ordered based on the value of the expression. It is not ordered as if it were a null (higher than all other
  values). This is because the expression was evaluated before the assignment to the host variable is
  attempted.
• If the data mapping error occurs while an expression in the select-list is being evaluated and the same
  expression is used in the ORDER BY clause, the result column is normally ordered as if it were a null
  value (higher than all other values). If the ORDER BY clause is implemented by using a sort, the result
  column is ordered as if it were a null value. If the ORDER BY clause is implemented by using an
  existing index, in the following cases, the result column is ordered based on the actual value of the
  expression in the index:
  - The expression is a date column with a date format of *MDY, *DMY, *YMD, or *JUL, and a date
    conversion error occurs because the date is not within the valid range for dates.
  - The expression is a character column and a character cannot be converted.
  - The expression is a decimal column and a numeric value that is not valid is detected.

Inserting rows using the INSERT statement


To add a single row or multiple rows to a table or view, use a form of the INSERT statement.
You can use the INSERT statement to add new rows to a table or view in one of the following ways:
• Specifying values in the INSERT statement for columns to be added.
• Including a select-statement in the INSERT statement to tell SQL what data for the new row is
  contained in another table or view.
• Specifying the blocked form of the INSERT statement to add multiple rows.
For every row you insert, you must supply a value for each column defined with the NOT NULL
attribute if that column does not have a default value. The INSERT statement for adding a row to a table
or view may look like this:
INSERT INTO table-name
(column1, column2, ... )
VALUES (value-for-column1, value-for-column2, ... )

The INTO clause names the columns for which you specify values. The VALUES clause specifies a value
for each column named in the INTO clause. The value you specify can be:
• A constant. Inserts the value provided in the VALUES clause.
• A null value. Inserts the null value, using the keyword NULL. The column must be defined as capable
  of containing a null value or an error occurs.
• A host variable. Inserts the contents of a host variable.
• A global variable. Inserts the contents of the global variable.
• A special register. Inserts a special register value; for example, USER.
• An expression. Inserts the value that results from an expression.
• A scalar fullselect. Inserts the value that is the result of running the select statement.
• The DEFAULT keyword. Inserts the default value of the column. The column must have a default
  value defined for it or allow the NULL value, or an error occurs.
You must provide a value in the VALUES clause for each column named in an INSERT statement's
column list. The column name list can be omitted if all columns in the table have a value provided in the
VALUES clause. If a column has a default value, the keyword DEFAULT may be used as a value in the
VALUES clause. This causes the default value for the column to be placed in the column.
It is a good idea to name all columns into which you are inserting values because:
• Your INSERT statement is more descriptive.
• You can verify that you are providing the values in the proper order based on the column names.
• You have better data independence. The order in which the columns are defined in the table does not
  affect your INSERT statement.
If the column is defined to allow null values or to have a default, you do not need to name it in the
column name list or specify a value for it. The default value is used. If the column is defined to have a
default value, the default value is placed in the column. If DEFAULT was specified for the column
definition without an explicit default value, SQL places the default value for that data type in the
column. If the column does not have a default value defined for it, but is defined to allow the null value
(NOT NULL was not specified in the column definition), SQL places the null value in the column.
• For numeric columns, the default value is 0.
• For fixed length character or graphic columns, the default is blanks.
• For fixed length binary columns, the default is hexadecimal zeros.
• For varying length character, graphic, or binary columns and for LOB columns, the default is a zero
  length string.
• For date, time, and timestamp columns, the default value is the current date, time, or timestamp. When
  inserting a block of records, the default date/time value is extracted from the system when the block is
  written. This means that the column will be assigned the same default value for each row in the block.
• For DataLink columns, the default value corresponds to DLVALUE('','URL','').
• For distinct-type columns, the default value is the default value of the corresponding source type.
• For ROWID columns or columns that are defined AS IDENTITY, the database manager generates a
  default value.
• For XML columns, there is no default allowed except the null value.
When your program attempts to insert a row that duplicates another row already in the table, an error
might occur. Multiple null values may or may not be considered duplicate values, depending on the
option used when the index was created.
• If the table has a primary key, unique key, or unique index, the row is not inserted. Instead, SQL
  returns an SQLCODE of -803.
• If the table does not have a primary key, unique key, or unique index, the row can be inserted without
  error.
If SQL finds an error while running the INSERT statement, it stops inserting data. If you specify
COMMIT(*ALL), COMMIT(*CS), COMMIT(*CHG), or COMMIT(*RR), no rows are inserted. Rows already
inserted by this statement, in the case of INSERT with a select-statement or blocked insert, are deleted. If
you specify COMMIT(*NONE), any rows already inserted are not deleted.
A table created by SQL is created with the Reuse Deleted Records parameter of *YES. This allows the
database manager to reuse any rows in the table that were marked as deleted. The CHGPF command can
be used to change the attribute to *NO. This causes INSERT to always add rows to the end of the table.
The order in which rows are inserted does not guarantee the order in which they will be retrieved.
If the row is inserted without error, the SQLERRD(3) field of the SQLCA has a value of 1.
Note: For blocked INSERT or for INSERT with select-statement, more than one row can be inserted. The
number of rows inserted is reflected in SQLERRD(3) in the SQLCA. It is also available from the
ROW_COUNT diagnostics item in the GET DIAGNOSTICS statement.
Related reference:
INSERT

Inserting rows using the VALUES clause


You use the VALUES clause in the INSERT statement to insert a single row or multiple rows into a table.

An example of this is to insert a new row into the DEPARTMENT table. The columns for the new row
are as follows:
v Department number (DEPTNO) is 'E31'
v Department name (DEPTNAME) is 'ARCHITECTURE'
v Manager number (MGRNO) is '00390'
v Reports to (ADMRDEPT) department 'E01'

The INSERT statement for this new row is as follows:


INSERT INTO DEPARTMENT (DEPTNO, DEPTNAME, MGRNO, ADMRDEPT)
VALUES('E31', 'ARCHITECTURE', '00390', 'E01')

You can also insert multiple rows into a table using the VALUES clause. The following example inserts
two rows into the PROJECT table. Values for the Project number (PROJNO), Project name (PROJNAME),
Department number (DEPTNO), and Responsible employee (RESPEMP) are given in the values list. The
value for the Project start date (PRSTDATE) uses the current date. The columns in the table that are
not listed in the column list are assigned their default values.
INSERT INTO PROJECT (PROJNO, PROJNAME, DEPTNO, RESPEMP, PRSTDATE)
VALUES('HG0023', 'NEW NETWORK', 'E11', '200280', CURRENT DATE),
('HG0024', 'NETWORK PGM', 'E11', '200310', CURRENT DATE)

Inserting rows using a select-statement


You can use a select-statement within an INSERT statement to insert zero, one, or more rows into a table
from the result table of the select-statement.
One use for this kind of INSERT statement is to move data into a table you created for summary data.
For example, suppose you want a table that shows each employee's time commitments to projects. Create
a table called EMPTIME with the columns EMPNUMBER, PROJNUMBER, STARTDATE, and ENDDATE
and then use the following INSERT statement to fill the table:
INSERT INTO CORPDATA.EMPTIME
(EMPNUMBER, PROJNUMBER, STARTDATE, ENDDATE)
SELECT EMPNO, PROJNO, EMSTDATE, EMENDATE
FROM CORPDATA.EMPPROJACT

The select-statement embedded in the INSERT statement is no different from the select-statement you use
to retrieve data. With the exception of FOR READ ONLY, FOR UPDATE, or the OPTIMIZE clause, you
can use all the keywords, functions, and techniques used to retrieve data. SQL inserts all the rows that
meet the search conditions into the table you specify. Inserting rows from one table into another table
does not affect any existing rows in either the source table or the target table.
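As an illustration of using ordinary query techniques in the embedded select-statement, the following sketch summarizes the rows before inserting them. The search condition on PROJNO and the use of MIN and MAX are assumptions chosen for this example, not part of the sample application.

```sql
-- Insert one row per employee and project, limited to projects whose
-- number begins with 'MA', with the earliest start date and the latest
-- end date recorded for that combination.
INSERT INTO CORPDATA.EMPTIME
  (EMPNUMBER, PROJNUMBER, STARTDATE, ENDDATE)
  SELECT EMPNO, PROJNO, MIN(EMSTDATE), MAX(EMENDATE)
    FROM CORPDATA.EMPPROJACT
    WHERE PROJNO LIKE 'MA%'
    GROUP BY EMPNO, PROJNO
```

The rows in CORPDATA.EMPPROJACT are read only; they are not changed by this statement.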
You should consider the following when inserting multiple rows into a table:
Notes:
1. The number of columns implicitly or explicitly listed in the INSERT statement must equal the
number of columns listed in the select-statement.
2. The data in the columns you are selecting must be compatible with the columns you are
inserting into when using the INSERT with select-statement.
3. In the event the select-statement embedded in the INSERT returns no rows, an SQLCODE of
100 is returned to alert you that no rows were inserted. If you successfully insert rows, the
SQLERRD(3) field of the SQLCA has an integer representing the number of rows SQL actually
inserted. This value is also available from the ROW_COUNT diagnostics item in the GET
DIAGNOSTICS statement.
4. If SQL finds an error while running the INSERT statement, SQL stops the operation. If you
specify COMMIT(*CHG), COMMIT(*CS), COMMIT(*ALL), or COMMIT(*RR), nothing is
inserted into the table and a negative SQLCODE is returned. If you specify COMMIT(*NONE),
any rows inserted before the error remain in the table.
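The row count described in note 3 can be retrieved in an SQL procedure, as in the following sketch. The procedure name LOAD_EMPTIME and the parameter P_ROWS are hypothetical names used only for this illustration.

```sql
-- Hypothetical procedure; the name and parameter are illustrative.
CREATE PROCEDURE CORPDATA.LOAD_EMPTIME
  (OUT P_ROWS INTEGER)
LANGUAGE SQL
BEGIN
  INSERT INTO CORPDATA.EMPTIME
    (EMPNUMBER, PROJNUMBER, STARTDATE, ENDDATE)
    SELECT EMPNO, PROJNO, EMSTDATE, EMENDATE
      FROM CORPDATA.EMPPROJACT;

  -- ROW_COUNT reports the number of rows inserted by the previous
  -- statement; it is 0 if the select-statement returned no rows.
  GET DIAGNOSTICS P_ROWS = ROW_COUNT;
END
```

A caller could compare P_ROWS with zero to decide whether any data was loaded.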
Inserting multiple rows using the blocked INSERT statement


Using a blocked INSERT statement, you can insert multiple rows into a table with a single INSERT
statement.
The blocked INSERT statement is supported in all of the languages except REXX. The data inserted into
the table must be in a host structure array. If indicator variables are used with a blocked INSERT, they
must also be in a host structure array.
For example, to add ten employees to the CORPDATA.EMPLOYEE table:
INSERT INTO CORPDATA.EMPLOYEE
(EMPNO,FIRSTNME,MIDINIT,LASTNAME,WORKDEPT)
10 ROWS VALUES(:DSTRUCT:ISTRUCT)

DSTRUCT is a host structure array with five elements that is declared in the program. The five elements
correspond to EMPNO, FIRSTNME, MIDINIT, LASTNAME, and WORKDEPT. DSTRUCT has a
dimension of at least ten to accommodate inserting ten rows. ISTRUCT is a host structure array that is
declared in the program. ISTRUCT has a dimension of at least ten small integer fields for the indicators.
Blocked INSERT statements are supported for non-distributed SQL applications and for distributed
applications where both the application server and the application requester are System i products.
Related concepts:
Embedded SQL programming

Inserting data into tables with referential constraints


When you insert data into tables with referential constraints, you need to consider these rules.
If you are inserting data into a parent table with a parent key, SQL does not allow:
v Duplicate values for the parent key
v If the parent key is a primary key, a null value for any column of the primary key
If you are inserting data into a dependent table with foreign keys:
v Each non-null value you insert into a foreign key column must be equal to some value in the
corresponding parent key of the parent table.
v If any column in the foreign key is null, the entire foreign key is considered null. If all foreign keys
that contain the column are null, the INSERT succeeds (as long as there are no unique index
violations).
Alter the sample application project table (PROJECT) to define two foreign keys:
v A foreign key on the department number (DEPTNO) which references the departm