Rbafy-Database SQL Programming 7.1
Rbafy-Database SQL Programming 7.1
IBM i
Database
SQL programming
7.1
IBM
IBM i
Database
SQL programming
7.1
Note
Before using this information and the product it supports, read the information in Notices, on
page 493.
This edition applies to IBM i 7.1 (product number 5770-SS1) and to all subsequent releases and modifications until
otherwise indicated in new editions. This version does not run on all reduced instruction set computer (RISC)
models nor does it run on CISC models.
Copyright IBM Corporation 1998, 2010.
US Government Users Restricted Rights Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
SQL programming . . . . . . . . ..
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
28
29
29
29
30
32
32
33
34
35
36
37
38
38
38
38
41
42
43
44
44
44
45
46
46
47
48
48
50
51
51
52
53
53
54
54
55
55
55
56
56
56
57
58
59
60
60
iii
HAVING clause . . . . . . . . . ..
ORDER BY clause . . . . . . . . ..
Static SELECT statements. . . . . . ..
Handling null values . . . . . . . ..
Special registers in SQL statements . . ..
Casting data types . . . . . . . . ..
Date, time, and timestamp data types . ..
Specifying current date and time values ..
Date/time arithmetic . . . . . . ..
Row change expressions . . . . . . ..
Handling duplicate rows . . . . . . ..
Defining complex search conditions . . ..
Special considerations for LIKE . . . ..
Multiple search conditions within a
WHERE clause . . . . . . . . ..
Using OLAP specifications . . . . . ..
Joining data from more than one table . ..
Inner join . . . . . . . . . . ..
Left outer join . . . . . . . . ..
Right outer join . . . . . . . . ..
Exception join . . . . . . . . ..
Cross join . . . . . . . . . . ..
Full outer join . . . . . . . . ..
Multiple join types in one statement . ..
Using table expressions . . . . . . ..
Using recursive queries . . . . . . ..
Using the UNION keyword to combine
subselects . . . . . . . . . . . ..
Specifying the UNION ALL keyword. ..
Using the EXCEPT keyword . . . . ..
Using the INTERSECT keyword . . . ..
Data retrieval errors . . . . . . . ..
Inserting rows using the INSERT statement ..
Inserting rows using the VALUES clause ..
Inserting rows using a select-statement . ..
Inserting multiple rows using the blocked
INSERT statement . . . . . . . . ..
Inserting data into tables with referential
constraints . . . . . . . . . . ..
Inserting values into an identity column ..
Selecting inserted values. . . . . . ..
Inserting data from a remote database . ..
Changing data in a table using the UPDATE
statement . . . . . . . . . . . . ..
Updating a table using a scalar-subselect ..
Updating a table with rows from another
table . . . . . . . . . . . . ..
Updating tables with referential constraints
Examples: UPDATE rules . . . . ..
Updating an identity column . . . . ..
Updating data as it is retrieved from a table
Removing rows from a table using the DELETE
statement . . . . . . . . . . . . ..
Removing rows from tables with referential
constraints . . . . . . . . . . ..
Example: DELETE rules . . . . . ..
Merging data . . . . . . . . . . ..
Using subqueries . . . . . . . . . ..
Subqueries in SELECT statements . . ..
Subqueries and search conditions. . ..
Usage notes on subqueries . . . . ..
iv
62
63
65
65
66
68
68
68
69
69
69
70
71
72
73
76
76
77
78
78
79
80
81
81
83
95
98
100
102
104
105
106
107
108
108
109
110
110
111
112
112
113
114
114
114
116
117
118
119
120
121
122
122
|
|
123
124
124
125
126
127
127
128
128
129
129
131
131
132
132
132
133
134
134
135
135
135
136
136
138
138
142
144
145
146
146
147
148
148
149
149
149
149
150
151
156
158
158
159
160
161
161
167
|
|
|
|
|
|
|
|
|
|
|
|
168
169
174
180
185
187
187
187
189
189
189
190
190
190
192
193
193
194
194
197
203
203
204
204
204
205
205
205
207
208
214
214
214
215
216
217
218
219
220
221
222
224
225
225
226
227
228
230
231
|
|
|
|
|
|
Contents
231
233
234
234
235
235
236
236
238
239
239
240
241
242
243
243
243
244
244
245
245
245
245
246
246
246
246
247
248
248
249
249
249
250
251
251
252
252
253
253
254
254
254
255
255
257
258
259
261
261
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
vi
261
262
263
265
267
267
268
269
270
271
272
273
273
275
277
278
281
281
282
284
284
284
286
286
287
289
291
292
293
293
294
294
295
296
297
298
298
298
298
304
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Routines . . . . . . . . . . . ..
XML support in SQL procedures . . ..
XML data type support in external
routines . . . . . . . . . . ..
XML data encoding . . . . . . . . ..
Encoding considerations when storing or
passing XML data . . . . . . . . ..
Encoding considerations for input of XML
data to a database . . . . . . . ..
Encoding considerations for retrieval of
XML data from a database . . . . ..
Encoding considerations for passing XML
data in routine parameters . . . . ..
Encoding considerations for XML data in
JDBC and SQLJ applications . . . ..
Effects of XML encoding and serialization on
data conversion . . . . . . . . ..
Encoding scenarios for input of internally
encoded XML data to a database . . ..
Encoding scenarios for input of externally
encoded XML data to a database . . ..
Encoding scenarios for retrieval of XML
data with implicit serialization . . ..
Encoding scenarios for retrieval of XML
data with explicit XMLSERIALIZE . ..
Mappings of encoding names to effective
CCSIDs for stored XML data . . . . ..
Mappings of CCSIDs to encoding names for
serialized XML output data. . . . . ..
Annotated XML schema decomposition . ..
Decomposing XML documents with
annotated XML schemas. . . . . . ..
Registering and enabling XML schemas for
decomposition . . . . . . . . . ..
Sources for annotated XML schema
decomposition . . . . . . . . . ..
XML decomposition annotations . . . ..
Specification and scope of XML
decomposition annotations . . . . ..
Annotations as attributes . . . . ..
Annotations as structured child elements
Global annotations . . . . . . ..
XML decomposition annotations Summary. . . . . . . . . . ..
db2-xdb:defaultSQLSchema decomposition
annotation . . . . . . . . . ..
db2-xdb:rowSet decomposition annotation
db2-xdb:table decomposition annotation
db2-xdb:column decomposition annotation
db2-xdb:locationPath decomposition
annotation . . . . . . . . . ..
db2-xdb:expression decomposition
annotation . . . . . . . . . ..
db2-xdb:condition decomposition
annotation . . . . . . . . . ..
db2-xdb:contentHandling decomposition
annotation . . . . . . . . . ..
db2-xdb:normalization decomposition
annotation . . . . . . . . . ..
db2-xdb:order decomposition annotation
306
307
307
311
312
312
312
312
312
313
313
315
316
319
321
321
321
321
322
322
322
323
323
323
324
324
326
327
331
333
335
338
341
344
348
350
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
db2-xdb:truncate decomposition
annotation . . . . . . . . . ..
db2-xdb:rowSetMapping decomposition
annotation . . . . . . . . . ..
db2-xdb:rowSetOperationOrder
decomposition annotation . . . . ..
Keywords for annotated XML schema
decomposition . . . . . . . . ..
Treatment of CDATA sections in annotated
XML schema decomposition . . . . ..
NULL values and empty strings in annotated
XML schema decomposition . . . . ..
Checklist for annotated XML schema
decomposition . . . . . . . . . ..
Examples of mappings in annotated XML
schema decomposition . . . . . . . ..
Annotations of derived complex types . ..
Decomposition annotation example: Mapping
to an XML column . . . . . . . ..
Decomposition annotation example: A value
mapped to a single table that yields a single
row . . . . . . . . . . . . ..
Decomposition annotation example: A value
mapped to a single table that yields multiple
rows . . . . . . . . . . . . ..
Decomposition annotation example: A value
mapped to multiple tables . . . . . ..
Decomposition annotation example:
Grouping multiple values mapped to a single
table . . . . . . . . . . . . ..
Decomposition annotation example: Multiple
values from different contexts mapped to a
single table . . . . . . . . . . ..
XML schema to SQL types compatibility for
annotated schema decomposition. . . ..
Limits and restrictions for annotated XML
schema decomposition . . . . . . ..
Schema for XML decomposition annotations
Using SQL in different environments . . . ..
Using a cursor . . . . . . . . . . ..
Types of cursors . . . . . . . . ..
Examples: Using a cursor . . . . . ..
Step 1: Defining the cursor . . . . ..
Step 2: Opening the cursor . . . . ..
Step 3: Specifying what to do when the
end of data is reached . . . . . ..
Step 4: Retrieving a row using a cursor
Step 5a: Updating the current row . ..
Step 5b: Deleting the current row. . ..
Step 6: Closing the cursor . . . . ..
Using the multiple-row FETCH statement
Multiple-row FETCH using a host
structure array . . . . . . . . ..
Multiple-row FETCH using a row storage
area . . . . . . . . . . . ..
Unit of work and open cursors . . . ..
Dynamic SQL applications . . . . . . ..
Designing and running a dynamic SQL
application . . . . . . . . . . ..
CCSID of dynamic SQL statements . . ..
Processing non-SELECT statements . . ..
352
354
356
357
358
358
359
360
360
366
367
368
370
371
373
375
381
382
383
383
383
384
386
387
387
388
389
389
389
390
390
392
394
395
395
396
396
396
397
397
398
399
399
402
406
408
410
411
412
412
414
414
414
414
414
415
415
417
418
419
419
419
421
423
423
423
425
427
427
428
429
430
430
430
430
431
431
431
432
432
432
432
432
432
432
433
433
433
434
vii
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
viii
434
435
437
438
439
439
440
442
442
444
444
445
445
447
447
448
448
449
449
450
450
451
452
453
458
459
460
461
462
463
463
463
464
465
465
466
467
468
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Appendix. Notices . . . . . . . ..
Programming interface information .
Trademarks . . . . . . . . .
Terms and conditions. . . . . .
.
.
.
.
.
.
.
.
.
468
469
469
470
471
472
473
474
476
476
477
477
477
478
478
479
479
480
481
481
482
482
483
484
485
487
488
489
489
489
490
490
490
490
491
491
493
..
..
..
494
495
495
SQL programming
The DB2 for IBM i database provides a wide range of support for Structured Query Language (SQL).
The examples of SQL statements shown in this topic collection are based on the sample tables and
assume that the following statements are true:
v They are shown in the interactive SQL environment or they are written in ILE C or in COBOL. EXEC
SQL and END-EXEC are used to delimit an SQL statement in a COBOL program.
v Each SQL example is shown on several lines, with each clause of the statement on a separate line.
v SQL keywords are highlighted.
v Table names provided in the sample tables use the schema CORPDATA. Table names that are not
found in the Sample Tables should use schemas you create.
v Calculated columns are enclosed in parentheses, (), and brackets, [].
v The SQL naming convention is used.
v The APOST and APOSTSQL precompiler options are assumed although they are not the default
options in COBOL. Character string literals within SQL and host language statements are delimited by
single-quotation marks (').
v A sort sequence of *HEX is used, unless otherwise noted.
Whenever the examples vary from these assumptions, it is stated.
Because this topic collection is for the application programmer, most of the examples are shown as if they
were written in an application program. However, many examples can be slightly changed and run
interactively by using interactive SQL. The syntax of an SQL statement, when using interactive SQL,
differs slightly from the format of the same statement when it is embedded in a program.
Note: By using the code examples, you agree to the terms of the Code license and disclaimer
information on page 491.
Related concepts:
Embedded SQL programming
Related reference:
DB2 for i sample tables on page 463
These sample tables are referred to and used in the SQL programming and the SQL reference topic
collections.
DB2 for i5/OS SQL reference
|
Read about new or significantly changed information for the SQL programming topic collection.
|
|
|
You can return a result set from a stored procedure to a program and have the program consume the
result set. For more information, see Writing a program or SQL procedure to receive the result sets from
a stored procedure on page 174.
XML
| The XML data type has been added to SQL. This includes internal handling of XML values and XML
| schema decomposition. For more information, see SQL statements and SQL/XML functions on page
| 257.
|
Field procedures
|
|
|
|
|
A field procedure is a user-written exit routine to transform values in a single column. When values in
the column are changed, or new values inserted, the field procedure is invoked for each value, and can
transform that value (encode it) in any way. The encoded value is then stored. When values are retrieved
from the column, the field procedure is invoked for each value, which is encoded, and must decode it
back to the original value. For more information, see Defining field procedures on page 28.
MERGE statement
| The MERGE statement can be used to either insert new rows or update existing rows in a table or view,
| depending on whether they already exists in the table or view. For more information, see Merging data
| on page 119.
|
Global variables
| Global variables can be created and used in SQL. For more information, see Creating and using global
| variables on page 53.
|
| An array data type has been added for use in SQL procedures. For more information, see Array support
| in SQL procedures on page 226 and Debugging an SQL routine on page 227.
|
|
|
|
|
DB2 provides an application programming interface to the WebSphere MQ message handling system
through a set of external user-defined functions, which are called DB2 MQ functions. You can use these
functions in SQL statements to combine DB2 database access with WebSphere MQ message handling. For
more information, see WebSphere MQ with DB2 on page 448.
| The OR REPLACE option has been added to the CREATE TABLE statement. For more information, see
| Using CREATE OR REPLACE TABLE on page 47.
|
| A pipelined SQL table function is a more flexible version of a table function. For more information, see
| Example: SQL table UDFs on page 193.
|
|
Debugging of SQL routines allows you to display values of SQL variables. For more information, see
Debugging an SQL routine on page 227.
|
|
A trigger can be defined for more than one event. For more information, see Multiple event SQL
triggers on page 221.
|
|
|
When creating a table, view, or index, the system name for the object can be specified on the create
statement. For more information, see Creating a table on page 17 and Creating and using views on
page 48.
|
|
|
|
You can define parameters for SQL and external procedures to have default values. Parameters with
default values can be omitted when calling the procedure. The CALL statement can specify parameter
names for any arguments. For more information, see Defining a procedure with default parameters on
page 156.
|
|
You can create a local table with the definition and data pulled from a non-local table. For more
information, see Creating a table with remote server data on page 23.
|
|
Procedures and functions are tied to system objects that can be administered with CL commands. For
more information, see Managing SQL and external routine objects on page 230.
|
|
The RUNSQLSTM command does not need to generate a listing. For more information, see Using the
SQL statement processor on page 421.
Obfuscation
|
|
The content of an SQL procedure or SQL function can be obfuscated. For more information, see
Obfuscating an SQL routine on page 228.
XMLTABLE
|
|
|
|
The XMLTABLE table function provides a way to work with XML content as a relational table. For more
information, see Using XMLTABLE to reference XML content as a relational table on page 284. In
addition, SQL XML Programming contains all the new and existing XML information that is related to
SQL.
SQL programming
| You can insert into a local table with data pulled from a non-local table. For more information, see
| Inserting data from a remote database on page 110.
|
RUNSQL
| This new CL command runs a single SQL statement. For more information, see Using the RUNSQL CL
| command on page 425.
|
Connect by
| Hierarchical queries can be defined using the CONNECT BY syntax. For more information, see Using
| recursive queries on page 83.
|
| To help you see where technical changes have been made, the information center uses:
| v The
| v The
| In PDF files, you might see revision bars (|) in the left margin of new and changed information.
| To find other information about what's new or changed this release, see the Memo to users.
SQL manages information based on the relational model of data. SQL statements can be embedded in
high-level languages, dynamically prepared and run, or run interactively. For information about
embedded SQL, see Embedded SQL programming.
SQL consists of statements and clauses that describe what you want to do with the data in a database
and under what conditions you want to do it.
SQL can access data in a remote relational database, using the IBM Distributed Relational Database
Architecture (DRDA).
Related concepts:
Distributed database programming
Related reference:
Distributed relational database function and SQL on page 427
A distributed relational database consists of a set of SQL objects that are spread across interconnected
computer systems.
SQL concepts
DB2 for i SQL consists of several main parts, such as SQL runtime support, precompilers, and interactive
SQL.
v SQL runtime support
SQL run time parses SQL statements and runs any SQL statements. This support is part of the IBM i
licensed program, which allows applications that contain SQL statements to be run on systems where
the IBM DB2 Query Manager and SQL Development Kit for i licensed program is not installed.
v SQL precompilers
SQL precompilers support precompiling embedded SQL statements in host languages. The following
languages are supported:
ILE C
ILE C++
ILE COBOL
COBOL
PL/I
RPG III (part of RPG)
ILE RPG
The SQL host language precompilers prepare an application program that contains SQL statements.
The host language compilers then compile the precompiled host source programs. For more
information about precompiling, see Preparing and running a program with SQL statements in the
Embedded SQL programming information. The precompiler support is part of the IBM DB2 Query
Manager and SQL Development Kit for i licensed program.
SQL interactive interface
The SQL interactive interface allows you to create and run SQL statements. For more information about
interactive SQL, see Using interactive SQL on page 410. Interactive SQL is part of the IBM DB2
Query Manager and SQL Development Kit for i licensed program.
Run SQL Scripts
The Run SQL Scripts window in System i Navigator allows you to create, edit, run, and troubleshoot
scripts of SQL statements.
Run SQL Statements (RUNSQLSTM) CL command
The RUNSQLSTM command can be used to run a series of SQL statements that are stored in a source
file or a source stream file. For more information about the RUNSQLSTM command, see Using the
SQL statement processor on page 421.
DB2 Query Manager
SQL programming
DB2 Query Manager provides a prompt-driven interactive interface that allows you to create data, add
data, maintain data, and run reports on the databases. Query Manager is part of the IBM DB2 Query
Manager and SQL Development Kit for i licensed program. For more information, see Query Manager
Use
.
v SQL REXX interface
The SQL REXX interface allows you to run SQL statements in a REXX procedure. For more information
about using SQL statements in REXX procedures, see Coding SQL statements in REXX applications in
the Embedded SQL programming information.
v SQL call level interface
The DB2 for i database supports the SQL call level interface. This allows users of any of the ILE
languages to access SQL functions directly through bound calls to a service program that is provided
by the system. Using the SQL call level interface, you can perform all the SQL functions without the
need to precompile. This is a standard set of procedure calls to prepare SQL statements, run SQL
statements, fetch rows of data, and even perform advanced functions, such as accessing the catalogs
and binding program variables to output columns.
For a complete description of all the available functions and their syntax, see SQL call level interface in
the Database section of the IBM i Information Center.
v Process Extended Dynamic SQL (QSQPRCED) API
This application programming interface (API) provides an extended dynamic SQL capability. You can
prepare SQL statements into an SQL package and run them by using this API. Statements that are
prepared into a package by this API persist until the package or statement is explicitly dropped. For
more information about the QSQPRCED API, see Process Extended Dynamic SQL (QSQPRCED) API.
For general information about APIs, see Application programming interfaces.
v Syntax Check SQL Statement (QSQCHKS) API
This API syntax checks SQL statements. For more information about the QSQCHKS API, see Syntax
Check SQL Statement (QSQCHKS) API. For general information about APIs, see Application
programming interfaces.
v DB2 Multisystem
This feature of the operating system allows your data to be distributed across multiple systems. For
more information, see DB2 Multisystem.
v DB2 Symmetric Multiprocessing
This feature of the operating system provides the query optimizer with additional methods for
retrieving data that include parallel processing. Symmetric multiprocessing (SMP) is a form of
parallelism achieved on a single system where multiple processors (CPU and I/O processors) that
share memory and disk resource work simultaneously toward achieving a single end result. This
parallel processing means that the database manager can have more than one (or all) of the system
processors working on a single query simultaneously. For more information, see Controlling parallel
processing for queries in the Database performance and query optimization topic collection.
SQL terms
SQL terms
Related concepts:
Distributed database programming
Related reference:
Qualification of unqualified object names
SQL programming
| ALTER FUNCTION
| ALTER PROCEDURE
|
|
ALTER SEQUENCE
ALTER TABLE
COMMENT ON
CREATE ALIAS
CREATE FUNCTION
CREATE INDEX
CREATE PROCEDURE
CREATE SCHEMA
CREATE SEQUENCE
CREATE TABLE
CREATE TRIGGER
CREATE TYPE
CREATE VARIABLE
CREATE VIEW
DROP
GRANT
LABEL ON
RENAME
REVOKE
SQL statements can operate on objects that are created by SQL as well as externally described physical
files and single-format logical files. They do not refer to the interactive data definition utility (IDDU)
dictionary definition for program-described files. Program-described files appear as a table with only a
single column.
Related concepts:
Data definition language on page 16
Data definition language (DDL) describes the portion of SQL that creates, alters, and deletes database
objects. These database objects include schemas, tables, views, sequences, catalogs, indexes, and aliases.
Data manipulation language on page 56
Data manipulation language (DML) describes the portion of SQL that manipulates or controls data.
Related reference:
DB2 for i5/OS SQL reference
SQL objects
|
|
|
SQL objects are schemas, journals, catalogs, tables, aliases, views, indexes, constraints, triggers, sequences,
stored procedures, user-defined functions, user-defined types, global variables, and SQL packages. SQL
creates and maintains these objects as system objects.
Schemas
A schema provides a logical grouping of SQL objects. A schema consists of a library, a journal, a journal
receiver, a catalog, and, optionally, a data dictionary.
Tables, views, and system objects (such as programs) can be created, moved, or restored into any system
library. All system files can be created or moved into an SQL schema if the SQL schema does not contain
a data dictionary. If the SQL schema contains a data dictionary then:
v Source physical files or nonsource physical files with one member can be created, moved, or restored
into an SQL schema.
v Logical files cannot be placed in an SQL schema because they cannot be described in the data
dictionary.
You can create and own many schemas.
Journals and journal receivers are used in processing the SQL COMMIT, ROLLBACK, SAVEPOINT, and
RELEASE SAVEPOINT statements. Journals and journal receivers can also be used as audit trails or for
forward or backward recovery.
Related concepts:
Journal management
Commitment control
Catalogs
| An SQL catalog is a collection of tables and views that describe tables, views, indexes, procedures,
| functions, sequences, triggers, variables, constraints, programs, packages, and XSR objects.
This information is contained in a set of cross-reference tables in libraries QSYS and QSYS2. In each SQL
schema there is a set of views built over the catalog tables that contains information about the objects in
the schema.
A catalog is automatically created when you create a schema. You cannot drop or explicitly change the
catalog.
Related reference:
Catalog
Aliases
An alias is an alternate name for a table or view.
You can use an alias to refer to a table or view in those cases where an existing table or view can be
referred to. Additionally, aliases can be used to join table members.
Related reference:
Aliases
Views
A view appears like a table to an application program. However, a view contains no data and only
logically represents one or more tables over which it is created.
10
A view can contain all the columns and rows of the given tables or a subset of them. The columns can be
arranged differently in a view than they are in the tables from which they are taken. A view in SQL is a
special form of a nonkeyed logical file.
Related reference:
Views
Indexes
An SQL index is a subset of the data in the columns of a table that are logically arranged in either
ascending or descending order.
Each index contains a separate arrangement. These arrangements are used for ordering (ORDER BY
clause), grouping (GROUP BY clause), and joining. An SQL index is a keyed logical file.
The index is used by the system for faster data retrieval. Creating an index is optional. You can create
any number of indexes. You can create or drop an index at any time. The index is automatically
maintained by the system. However, because the indexes are maintained by the system, a large number
of indexes can adversely affect the performance of the applications that change the table.
Related concepts:
Creating an index strategy
Constraints
A constraint is a rule enforced by the database manager to limit the values that can be inserted, deleted,
or updated in a table.
DB2 for i supports the following constraints:
v Unique constraints
A unique constraint is the rule that the values of the key are valid only if they are unique. You can
create a unique constraint using the CREATE TABLE or ALTER TABLE statement. Although the
CREATE INDEX statement can create a unique index that also guarantees uniqueness, such an index is
not a constraint.
Unique constraints are enforced during the execution of INSERT and UPDATE statements. A PRIMARY
KEY constraint is a form of the UNIQUE constraint. The difference is that a PRIMARY KEY cannot
contain any nullable columns.
v Referential constraints
A referential constraint is the rule that the values of the foreign key are valid only if one of the following
conditions is met:
They appear as values of a parent key.
Some component of the foreign key is null.
Referential constraints are enforced during the execution of INSERT, UPDATE, and DELETE
statements.
v Check constraints
A check constraint is the rule that limits the values allowed in a column or group of columns. You can
create a check constraint using the CREATE TABLE or ALTER TABLE statement. Check constraints are
enforced during the execution of INSERT and UPDATE statements. To satisfy the constraint, each row
of data inserted or updated in the table must make the specified condition either TRUE or unknown
(because of a null value).
Related reference:
Constraints on page 145
The DB2 for i database supports unique, referential, and check constraints.
SQL programming
11
Triggers
A trigger is a set of actions that runs automatically whenever a specified event occurs to a specified table
or view.
An event can be an insert, an update, a delete, or a read operation. A trigger can run either before or
after the event. DB2 for i supports SQL insert, update, and delete triggers and external triggers.
Related tasks:
Triggering automatic events in your database
Stored procedures
A stored procedure is a program that can be called with the SQL CALL statement.
DB2 for i supports external procedures and SQL procedures. An external procedure can be any system
program, service program, or REXX procedure. It cannot be a System/36 program or procedure. An SQL
procedure is defined entirely in SQL and can contain SQL statements, including SQL control statements.
Related concepts:
Stored procedures on page 149
A procedure (often called a stored procedure) is a program that can be called to perform operations. A
procedure can include both host language statements and SQL statements. Procedures in SQL provide the
same benefits as procedures in a host language.
Sequences
A sequence is a data area object that provides a quick and easy way of generating unique numbers.
You can use a sequence to replace an identity column or a user-generated numeric column. A sequence
has uses similar to these alternatives.
Related reference:
Creating and using sequences on page 26
Sequences are similar to identity columns in that they both generate unique values. However, sequences
are objects that are independent of any tables. You can use sequences to generate values quickly and
easily.
| Global variables
| A global variable is a named variable that can be created, accessed, and modified using SQL.
| A global variable can provide a unique value for a session. The variable can be used as part of any
| expression in places such as a query, a create view, or an insert statement.
User-defined functions
A user-defined function is a program that can be called like any built-in functions.
DB2 for i supports external functions, SQL functions, and sourced functions. An external function can be
any system ILE program or service program. An SQL function is defined entirely in SQL and can contain
SQL statements, including SQL control statements. A sourced function is built over any built-in or any
existing user-defined function. You can create a scalar function or a table function as either an SQL
function or an external function.
Related concepts:
Using user-defined functions on page 190
In writing SQL applications, you can implement some actions or operations as a user-defined function
(UDF) or as a subroutine in your application. Although it might appear easier to implement new
operations as subroutines, you might want to consider the advantages of using a UDF instead.
12
User-defined types
A user-defined type is a data type that you can define independently of the data types that are provided by
the database management system.
|
|
Distinct data types map to built-in types. Array data types are defined using a built-in type as the
element type and a maximum cardinality value.
Related concepts:
User-defined distinct types on page 244
A user-defined distinct type (UDT) is a mechanism to extend DB2 capabilities beyond the built-in data
types that are available.
|
|
|
XSR objects
|
|
You can use an XSR object during validation of an XML document or during annotated XML schema
decomposition.
An XSR object is one or more XML schema documents that have been registered in the XML schema
repository with the same name.
SQL packages
An SQL package is an object that contains the control structure produced when the SQL statements in an
application program are bound to a remote relational database management system (DBMS).
The DBMS uses the control structure to process SQL statements encountered while running the
application program.
SQL packages are created when a relational database name (RDB parameter) is specified on a Create SQL
(CRTSQLxxx) command and a program object is created. Packages can also be created with the Create
SQL Package (CRTSQLPKG) command.
Note: The xxx in this command refers to the host language indicators: CI for ILE C, CPPI for ILE C++,
CBL for COBOL, CBLI for ILE COBOL, PLI for PL/I, RPG for RPG/400, and RPGI for ILE RPG.
SQL packages can also be created with the Process Extended Dynamic SQL (QSQPRCED) API. The SQL
packages mentioned within this topic collection refer exclusively to distributed program SQL packages.
The QSQPRCED API uses SQL packages to provide extended dynamic SQL support.
Related reference:
Distributed relational database function and SQL on page 427
A distributed relational database consists of a set of SQL objects that are spread across interconnected
computer systems.
Process Extended Dynamic SQL (QSQPRCED) API
SQL programming
13
With a nondistributed non-ILE DB2 for i program, you must manage only the original source and the
resulting program. The following figure shows the objects involved and the steps that happen during the
precompile and compile processes for a nondistributed non-ILE DB2 for i program. The user source file
precompiles the source to a temporary source file member. This member is then compiled into a program.
With a nondistributed ILE DB2 for i program, you might need to manage the original source, the
modules, and the resulting program or service program. The following figure shows the objects involved
and the steps that happen during the precompile and compile processes for a nondistributed ILE DB2 for
i program when OBJTYPE(*PGM) is specified on the precompile command. The user source file
precompiles the source to a temporary source file member. This member is then compiled into a module
that binds to a program.
With a distributed non-ILE DB2 for i program, you must manage the original source, the resulting
program, and the resulting package. The following figure shows the objects and the steps that occur
during the precompile and compile processes for a distributed non-ILE DB2 for i program. The user
source file precompiles the source to a temporary source file member. This member is then compiled into
a program. After the program is created, an SQL package is created to hold the program.
With a distributed ILE DB2 for i program, you must manage the original source, module objects, the
resulting program or service program, and the resulting packages. An SQL package can be created for
each distributed module in a distributed ILE program or service program. The following figure shows the
objects and the steps that occur during the precompile and compile processes for a distributed ILE DB2
for i program. The user source file precompiles the source to a temporary source file member. This
member is then compiled into a module that binds to a program. After the program is created, an SQL
package is created to hold the program.
14
Note: The access plans associated with the DB2 for i distributed program object are not created until the
program is run locally.
Related tasks:
Preparing and running a program with SQL statements
Program
A program is an object that is created as a result of the compilation process for non-ILE compilations or as
a result of the bind process for ILE compilations.
An access plan is a set of internal structures and information that tells SQL how to run an embedded SQL
statement most effectively. It is created only when the program has been successfully created. Access
plans are not created during program creation for SQL statements if the statements refer to an object,
such as a table or view, that cannot be found or to which you are not authorized.
The access plans for such statements are created when the program is run. If, at that time, the table or
view still cannot be found or you are still not authorized, a negative SQLCODE is returned. Access plans
are stored and maintained in the program object for non-distributed SQL programs and in the SQL
package for distributed SQL programs.
SQL package
An SQL package contains the access plans for a distributed SQL program.
An SQL package is an object that is created when:
v You successfully create a distributed SQL program by specifying the relational database (RDB)
parameter on the CREATE SQL (CRTSQLxxx) commands.
v You run the Create SQL Package (CRTSQLPKG) command.
SQL programming
15
When a distributed SQL program is created, the name of the SQL package and an internal consistency
token are saved in the program. They are used at run time to find the SQL package and to verify that the
SQL package is correct for this program. Because the name of the SQL package is critical for running
distributed SQL programs, an SQL package cannot be:
v Moved
v Renamed
v Duplicated
v Restored to a different library
Module
A module is an Integrated Language Environment (ILE) object that you create by compiling source code
using the Create Module (CRTxxxMOD) command (or any of the Create Bound Program (CRTBNDxxx)
commands, where xxx is C, CBL, CPP, or RPG).
You can run a module only if you use the Create Program (CRTPGM) command to bind it into a
program. You typically bind several modules together, but you can bind a module by itself. Modules
contain information about the SQL statements; however, the SQL access plans are not created until the
modules are bound into either a program or service program.
Related reference:
Create Program (CRTPGM) command
Service program
A service program is an Integrated Language Environment (ILE) object that provides a means of packaging
externally supported callable routines (functions or procedures) into a separate object.
Bound programs and other service programs can access these routines by resolving their imports to the
exports provided by a service program. The connections to these services are made when the calling
programs are created. This improves call performance to these routines without including the code in the
calling program.
Creating a schema
A schema provides a logical grouping of SQL objects. To create a schema, use the CREATE SCHEMA
statement.
A schema consists of a library, a journal, a journal receiver, a catalog, and optionally, a data dictionary.
Tables, views, and system objects (such as programs) can be created, moved, or restored into any system
libraries. All system files can be created or moved into an SQL schema if the SQL schema does not
contain a data dictionary. If the SQL schema contains a data dictionary then:
v Source physical files or nonsource physical files with one member can be created, moved, or restored
into an SQL schema.
v Logical files cannot be placed in an SQL schema because they cannot be described in the data
dictionary.
16
Related reference:
CREATE SCHEMA
Creating a table
A table can be visualized as a two-dimensional arrangement of data that consists of rows and columns.
To create a table, use the CREATE TABLE statement.
The row is the horizontal part containing one or more columns. The column is the vertical part
containing one or more rows of data of one data type. All data for a column must be of the same type. A
table in SQL is a keyed or non-keyed physical file.
|
|
|
You can create a table using the CREATE TABLE statement. You provide a name for the table. If the table
name is not a valid system object name, you can use the optional FOR SYSTEM NAME clause to specify
a system name.
The definition includes the names and attributes of its columns. The definition can include other
attributes of the table, such as the primary key.
Example: Given that you have administrative authority, create a table named 'INVENTORY' with the
following columns:
v Part number: Integer between 1 and 9999, and must not be null
v Description: Character of length 0 to 24
v Quantity on hand: Integer between 0 and 100000
The primary key is PARTNO.
CREATE TABLE INVENTORY
(PARTNO
SMALLINT
NOT NULL,
DESCR
VARCHAR(24 ),
QONHAND
INT,
PRIMARY KEY(PARTNO))
Related concepts:
Data types
To make this key a unique key, replace the keyword PRIMARY with UNIQUE.
You can remove a constraint using the same ALTER TABLE statement:
ALTER TABLE CORPDATA.DEPARTMENT
DROP PRIMARY KEY (DEPTNO)
SQL programming
17
18
v Delete and update rules that specify the action taken with respect to dependent rows when the parent
row is deleted or updated.
Optionally, you can specify a name for the constraint. If a name is not specified, one is automatically
generated.
After a referential constraint is defined, the system enforces the constraint on every INSERT, DELETE,
and UPDATE operation performed through SQL or any other interface, including System i Navigator, CL
commands, utilities, or high-level language statements.
Related reference:
CREATE TABLE
ALTER TABLE
Example: Adding referential constraints:
You define a referential constraint that every department number in the sample employee table must
appear in the department table. The referential constraint ensures that every employee belongs to an
existing department.
The following SQL statements create the CORPDATA.DEPARTMENT and CORPDATA.EMPLOYEE tables
with those constraint relationships defined.
CREATE TABLE CORPDATA.DEPARTMENT
(DEPTNO
CHAR(3)
NOT NULL PRIMARY KEY,
DEPTNAME VARCHAR(29) NOT NULL,
MGRNO
CHAR(6),
ADMRDEPT CHAR(3)
NOT NULL
CONSTRAINT REPORTS_TO_EXISTS
REFERENCES CORPDATA.DEPARTMENT (DEPTNO)
ON DELETE CASCADE)
CREATE TABLE CORPDATA.EMPLOYEE
(EMPNO
CHAR(6)
NOT NULL PRIMARY KEY,
FIRSTNME VARCHAR(12) NOT NULL,
MIDINIT CHAR(1)
NOT NULL,
LASTNAME VARCHAR(15) NOT NULL,
WORKDEPT CHAR(3)
CONSTRAINT WORKDEPT_EXISTS
REFERENCES CORPDATA.DEPARTMENT (DEPTNO)
ON DELETE SET NULL ON UPDATE RESTRICT,
PHONENO CHAR(4),
HIREDATE DATE,
JOB
CHAR(8),
EDLEVEL SMALLINT
NOT NULL,
SEX
CHAR(1),
BIRTHDATE DATE,
SALARY
DECIMAL(9,2),
BONUS
DECIMAL(9,2),
COMM
DECIMAL(9,2),
CONSTRAINT UNIQUE_LNAME_IN_DEPT UNIQUE (WORKDEPT, LASTNAME))
In this case, the DEPARTMENT table has a column of unique department numbers (DEPTNO) which
functions as a primary key, and is a parent table in two constraint relationships:
REPORTS_TO_EXISTS
is a self-referencing constraint in which the DEPARTMENT table is both the parent and the
dependent in the same relationship. Every non-null value of ADMRDEPT must match a value of
DEPTNO. A department must report to an existing department in the database. The DELETE
CASCADE rule indicates that if a row with a DEPTNO value n is deleted, every row in the table
for which the ADMRDEPT is n is also deleted.
SQL programming
19
WORKDEPT_EXISTS
establishes the EMPLOYEE table as a dependent table, and the column of employee department
assignments (WORKDEPT) as a foreign key. Thus, every value of WORKDEPT must match a
value of DEPTNO. The DELETE SET NULL rule says that if a row is deleted from
DEPARTMENT in which the value of DEPTNO is n, then the value of WORKDEPT in
EMPLOYEE is set to null in every row in which the value was n. The UPDATE RESTRICT rule
says that a value of DEPTNO in DEPARTMENT cannot be updated if there are values of
WORKDEPT in EMPLOYEE that match the current DEPTNO value.
Constraint UNIQUE_LNAME_IN_DEPT in the EMPLOYEE table causes LASTNAME to be unique within
a department. While this constraint is unlikely, it illustrates how a constraint made up of several columns
can be defined at the table level.
Check pending
Referential constraints and check constraints can be in a check pending state, where potential violations
of the constraints exist.
For referential constraints, a violation occurs when potential mismatches exist between parent and foreign
keys. For check constraints, a violation occurs when potential values exist in columns that are limited by
the check constraint. When the system determines that a constraint might have been violated (such as
after a restore operation), the constraint is marked as check pending. When this happens, restrictions are
placed on the use of tables involved in the constraint. For referential constraints, the following restrictions
apply:
v No input or output operations are allowed on the dependent file.
v Only read and insert operations are allowed on the parent file.
When a check constraint is in check pending, the following restrictions apply:
v Read operations are not allowed on the file.
v Insert and update operations are allowed and the constraint is enforced.
To get a constraint out of check pending, follow these steps:
1. Disable the relationship with the Change Physical File Constraint (CHGPFCST) CL command.
2. Correct the key (foreign, parent, or both) data for referential constraints or column data for check
constraints.
3. Enable the constraint again with the CHGPFCST CL command.
You can identify the rows that are in violation of the constraint with the Display Check Pending
Constraint (DSPCPCST) CL command.
Related concepts:
Check pending status in referential constraints
20
Related tasks:
Working with constraints that are in check pending status
Related reference:
CREATE TABLE
If the specified table or view contains an identity column, you must specify the option INCLUDING
IDENTITY on the CREATE TABLE statement if you want the identity column to exist in the new table.
The default behavior for CREATE TABLE is EXCLUDING IDENTITY. There are similar options to include
the default value, the hidden attribute, and the row change timestamp attribute. The WITH NO DATA
clause indicates that the column definitions are to be copied without the data. If you want to include the
SQL programming
21
data in the new table EMPLOYEE3, include the WITH DATA clause. If the specified query includes a
non-SQL-created physical file or logical file, any non-SQL result attributes are removed.
Related concepts:
Retrieving data using the SELECT statement on page 56
The SELECT statement tailors your query to gather data. You can use the SELECT statement to retrieve a
specific row or retrieve data in a specific way.
Related reference:
CREATE TABLE
This materialized query table specifies that the table is not populated at the time that it is created by
using the DATA INITIALLY DEFERRED clause. REFRESH DEFERRED indicates that changes made to
TRANS are not reflected in STRANS. Additionally, this table is maintained by the user, enabling the user
to use ALTER, INSERT, DELETE, and UPDATE statements.
To populate the materialized query table or refresh the table after it has been populated, use the
REFRESH TABLE statement. This causes the query associated with the materialized query table to be run
and causes the table to be filled with the results of the query. To populate the STRANS table, run the
following statement:
REFRESH TABLE STRANS
You can create a materialized query table from an existing base table as long as the result of the
select-statement provides a set of columns that match the columns in the existing table (same number of
columns and compatible column definitions). For example, create a table TRANSCOUNT. Then, change
the base table TRANSCOUNT into a materialized query table:
To create the table:
CREATE TABLE TRANSCOUNT
(ACCTID SMALLINT NOT NULL,
LOCID SMALLINT,
YEAR DATE
CNT INTEGER)
22
Finally, you can change a materialized query table back to a base table. For example:
ALTER TABLE TRANSCOUNT
DROP MATERIALIZED QUERY
In this example, the table TRANSCOUNT is not dropped, but it is no longer a materialized query table.
Related concepts:
Tables, rows, and columns on page 10
A table is a two-dimensional arrangement of data that consists of rows and columns.
This table is created in QTEMP. To reference the table using a schema name, use either SESSION or
QTEMP. You can issue SELECT, INSERT, UPDATE, and DELETE statements against this table, the same
as any other table. You can drop this table by issuing the DROP TABLE statement:
DROP TABLE ORDERS
Related reference:
DECLARE GLOBAL TEMPORARY TABLE
|
You can create a table on the local server that references one or more tables on a remote server.
|
|
|
Along with the select-statement, you can specify copy options to get attributes such as the default values
or identity column information copied for the new table. The WITH DATA or WITH NO DATA clause
must be specified to indicate whether to populate the table from the remote system.
|
|
|
|
|
|
For example, create a table named EMPLOYEE4 that includes column definitions from the EMPLOYEE
table on remote server REMOTESYS. Include the data from the remote system as well.
CREATE TABLE EMPLOYEE4 AS
(SELECT PROJNO, PROJNAME, DEPTNO
FROM REMOTESYS.TESTSCHEMA.EMPLOYEE
WHERE DEPTNO = D11) WITH DATA
SQL programming
23
|
|
|
|
|
|
|
You can also create this table as a global temporary table, which will create it in QTEMP. In this example,
different column names are provided for the new table. The table definition will pick up the default
values for its columns from the remote server.
|
|
|
|
|
The following restrictions apply to using a remote server as the source for the new table:
v The materialized query table clauses are not allowed.
v A column with a FIELDPROC cannot be listed in the select list.
v The copy options cannot be specified if the remote server is DB2 for LUW or DB2 for z/OS.
Related reference:
When a row is inserted into the ORDERS table, the CHANGE_TS column for the row is set to the
timestamp of the insert operation. Any time a row in ORDERS is updated, the CHANGE_TS column for
the row is modified to reflect the timestamp of the update operation.
You can drop the row change timestamp attribute from a column:
ALTER TABLE ORDER
ALTER COLUMN CHANGE_TS
DROP ROW CHANGE TIMESTAMP
The column CHANGE_TS remains as a TIMESTAMP column in the table, but the system no longer
automatically updates timestamp values for this column.
24
When you create a table, you can define a column in the table to be an identity column. For example,
create a table ORDERS with three columns called ORDERNO, SHIPPED_TO, and ORDER_DATE. Define
ORDERNO as an identity column.
CREATE TABLE ORDERS
(ORDERNO SMALLINT NOT NULL
GENERATED ALWAYS AS IDENTITY
(START WITH 500
INCREMENT BY 1
CYCLE),
SHIPPED_TO VARCHAR (36) ,
ORDER_DATE DATE)
This column is defined with a starting value of 500, incremented by 1 for every new row inserted, and
will recycle when the maximum value is reached. In this example, the maximum value for the identity
column is the maximum value for the data type. Because the data type is defined as SMALLINT, the
range of values that can be assigned to ORDERNO is from 500 to 32 767. When this column value
reaches 32 767, it will restart at 500 again. If 500 is still assigned to a column, and a unique key is
specified on the identity column, a duplicate key error is returned. The next insert operation will attempt
to use 501. If you do not have a unique key specified for the identity column, 500 is used again,
regardless of how many times it appears in the table.
For a larger range of values, specify the column to be data type INTEGER or even BIGINT. If you want
the value of the identity column to decrease, specify a negative value for the INCREMENT option. It is
also possible to specify the exact range of numbers by using MINVALUE and MAXVALUE.
You can modify the attributes of an existing identity column using the ALTER TABLE statement. For
example, you want to restart the identity column with a new value:
ALTER TABLE ORDER
ALTER COLUMN ORDERNO
RESTART WITH 1
The column ORDERNO remains as a SMALLINT column, but the identity attribute is dropped. The
system will no longer generate values for this column.
Related reference:
Comparison of identity columns and sequences on page 27
While identity columns and sequences are similar in many ways, there are also differences.
Inserting values into an identity column on page 109
You can insert a value into an identity column or allow the system to insert a value for you.
Updating an identity column on page 114
You can update the value in an identity column to a specified value or have the system generate a new
value.
Using ROWID
Using ROWID is another way to have the system assign a unique value to a column. ROWID is similar
to identity columns. But rather than being an attribute of a numeric column, it is a separate data type.
To create a table similar to the identity column example:
SQL programming
25
This sequence is defined with a starting value of 500, incremented by 1 for every use, and recycles when
the maximum value is reached. In this example, the maximum value for the sequence is 1000. When this
value reaches 1000, it will restart at 500 again.
After this sequence is created, you can insert values into a column using the sequence. For example,
insert the next value of the sequence ORDER_SEQ into a table ORDERS with columns ORDERNO and
CUSTNO.
First, create the table ORDERS:
CREATE TABLE ORDERS
(ORDERNO SMALLINT NOT NULL,
CUSTNO SMALLINT);
CUSTNO
500
12
In this example, the next value for sequence ORDER is inserted into the ORDERNO column. Issue the
INSERT statement again. Then run the SELECT statement.
Table 3. Results for SELECT from table ORDERS
ORDERNO
CUSTNO
500
12
501
12
26
You can also insert the previous value for the sequence ORDER by using the PREVIOUS VALUE
expression. You can use NEXT VALUE and PREVIOUS VALUE in the following expressions:
v Within the select-clause of a SELECT statement or SELECT INTO statement as long as the statement
does not contain a DISTINCT keyword, a GROUP BY clause, an ORDER BY clause, a UNION
keyword, an INTERSECT keyword, or an EXCEPT keyword
v Within a VALUES clause of an INSERT statement
v Within the select-clause of the fullselect of an INSERT statement
v Within the SET clause of a searched or positioned UPDATE statement, though NEXT VALUE cannot be
specified in the select-clause of the subselect of an expression in the SET clause
You can alter a sequence by issuing the ALTER SEQUENCE statement. Sequences can be altered in the
following ways:
v Restarting the sequence
v Changing the increment between future sequence values
v Setting or eliminating the minimum or maximum values
v Changing the number of cached sequence numbers
v Changing the attribute that determines whether the sequence can cycle or not
v Changing whether sequence numbers must be generated in order of request
For example, change the increment of values of sequence ORDER from 1 to 5:
ALTER SEQUENCE ORDER_SEQ
INCREMENT BY 5
After this change is complete, run the INSERT statement again and then the SELECT statement. Now the
table contains the following columns.
Table 4. Results for SELECT from table ORDERS
ORDERNO
CUSTNO
500
12
501
12
528
12
Notice that the next value that the sequence uses is a 528. At first glance, this number appears to be
incorrect. However, look at the events that lead up to this assignment. First, when the sequence was
originally create, a cache value of 24 was assigned. The system assigns the first 24 values for this cache.
Next, the sequence was altered. When the ALTER SEQUENCE statement is issued, the system drops the
assigned values and starts up again with the next available value; in this case the original 24 that was
cached, plus the next increment, 5. If the original CREATE SEQUENCE statement did not have the
CACHE clause, the system automatically assigns a default cache value of 20. If that sequence was altered,
then the next available value is 25.
Related concepts:
Sequences on page 12
A sequence is a data area object that provides a quick and easy way of generating unique numbers.
SQL programming
27
|
|
| Field procedures are assigned to a table by the FIELDPROC clause of the CREATE TABLE and ALTER
| TABLE statements. A field procedure is a user-written exit routine that transforms values in a single
| column.
|
|
|
|
|
When values in the column are changed, or new values inserted, the field procedure is invoked for each
value, and can transform that value (encode it) in any way. The encoded value is then stored. When
values are retrieved from the column, the field procedure is invoked for each value, which is encoded,
and must decode it back to the original value. Any indexes defined on a non-derived column that uses a
field procedure are built with encoded values.
|
|
|
|
|
|
|
|
The transformation your field procedure performs on a value is called field-encoding. The same routine is
used to undo the transformation when values are retrieved; that operation is called field-decoding. Values
in columns with a field procedure are described to DB2 in two ways:
1. The description of the column as defined in CREATE TABLE or ALTER TABLE appears in the catalog
table QSYS2.SYSCOLUMNS. That is the description of the field-decoded value, and is called the
column description.
2. The description of the encoded value, as it is stored in the database, appears in the catalog table
QSYS2.SYSFIELDS. That is the description of the field-encoded value, and is called the field description.
| Important: The field-decoding function must be the exact inverse of the field-encoding function. For
| example, if a routine encodes 'ALABAMA' to '01', it must decode '01' to 'ALABAMA'. A violation of this
| rule can lead to unpredictable results. See General guidelines for writing field procedures on page 37.
28
|
|
|
|
Field procedures can also perform masking of data when decoded (retrieved). In this case, the field
procedure would decode '01' to 'ALABAMA for certain users or environments and for other users or
environments may return a masked value such as XXXXXXXX instead. See Guidelines for writing field
procedures that mask data on page 38.
|
|
|
|
|
|
The data type of the encoded value can be any valid SQL data type except ROWID or DATALINK. Also a
field procedure cannot be associated with any column having values generated by IDENTITY or ROW
CHANGE TIMESTAMP.
|
|
|
If a DDS-created physical file is altered to add a field procedure, the encoded attribute data type cannot
be a LOB type or DataLink. If an SQL table is altered to add a field procedure, the encoded attribute
precision field must be 0 if the encoded attribute data type is any of the integer types.
|
|
A field procedure may not be added to a column that has a default value of CURRENT DATE,
CURRENT TIME, CURRENT TIMESTAMP, or USER.
|
|
|
A column defined with a user-defined data type can have a field procedure if the source type of the
user-defined data type is any of the allowed SQL data types. DB2 casts the value of the column to the
source type before it passes it to the field procedure.
|
|
|
|
|
|
You cannot use a field procedure on a ROWID or DATALINK column of a table or on any column having
values generated by IDENTITY or ROW CHANGE TIMESTAMP. However, you can specify it for other
columns in the same table.
|
|
|
|
|
|
|
The optional parameter list that follows the procedure name is a list of constants, enclosed in parentheses,
called the literal list. The literal list is converted by DB2 into a data structure called the field procedure
parameter value list (FPPVL). The FPPVL is passed to the field procedure during the field-definition
operation. At that time, the procedure can modify it or return it unchanged. The output form of the
FPPVL is called the modified FPPVL. It is stored in the DB2 QSYS2.SYSFIELDS catalog table as part of the
column description. The modified FPPVL is passed again to the field procedure whenever that procedure
is invoked for field-encoding or field-decoding.
|
|
|
|
|
|
|
|
|
|
v For field-encoding, when a column value is to be encoded. Encoding occurs for any value that:
Is inserted in the column by an SQL INSERT statement, SQL MERGE statement, or native write
operation.
Is changed by an SQL UPDATE statement, SQL MERGE statement, or native update operation.
The field procedure is also invoked when the table is created or altered, to define the data type and
attributes of an encoded value to DB2. That operation is called field-definition.
To name a field procedure for a column, use the FIELDPROC clause of the CREATE TABLE or ALTER
TABLE statement, followed by the name of the procedure and, optionally, a list of parameters.
A field procedure that is specified for a column is invoked in three general situations.
v For field-definition, when the CREATE TABLE or ALTER TABLE statement that names the procedure is
executed. During this invocation, the procedure is expected to:
Determine whether the data type and attributes of the column are valid.
Verify the literal list, and change it if desired.
Provide the field description of the column.
SQL programming
29
|
If the data needs to be copied and the target column has a field procedure, it is possible that the
|
field procedure may be invoked to encode the copied data. Examples include the SQL statements
|
ALTER TABLE or CREATE TABLE (with a LIKE or as-result-table clause) and the CL commands
|
CPYF or RGZPFM.
|
Is compared to a column with a field procedure. The QAQQINI option
|
FIELDPROC_ENCODED_COMPARISON is used by the optimizer to decide if the column value is
|
decoded or if the variable, constant, or join column is encoded.
|
At CREATE or ALTER TABLE time for the DEFAULT value, if the column has a field procedure.
|
If there are any after or read triggers, the field procedure is invoked before any of these triggers. For
|
before triggers, there may be multiple invocations of the field procedure with encode and decode
|
operations. The number of calls to the field procedure depends on many factors including the type of
|
trigger and if the trigger changes the data in the trigger buffer. The database manager will ensure that
|
the field procedure is called to encode the data that will be inserted into the table.
| v For field-decoding, when a stored value is to be field-decoded back into its original value. This occurs
|
for any value that is:
|
Retrieved by an SQL SELECT or FETCH statement, or by a native read operation.
|
If the data needs to be copied and the source column has a field procedure, it is possible that the
|
field procedure may be invoked to decode the data prior to making the copy. Examples include the
|
SQL statements ALTER TABLE or CREATE TABLE (with a LIKE or as-result-table clause) and the CL
|
commands CPYF or RGZPFM.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Recommendation: Avoid encoding blanks in a field procedure. When DB2 compares the values of two
strings with different lengths, it temporarily pads the shorter string with the appropriate blank characters
(for example, EBCDIC or double-byte blanks) up to the length of the longer string. If the shorter string is
the value of a column with a field procedure, padding is done to the encoded value, but the pad
character is not encoded. Therefore, if the procedure changes blanks to some other character, encoded
blanks at the end of the longer string are not equal to padded blanks at the end of the shorter string.
That situation can lead to errors; for example, some strings that ought to be equal might not be
recognized as such. Therefore, encoding blanks in a field procedure is not recommended.
When defining and using the parameters in the field procedure, care should be taken to ensure that no
more storage is referenced for a given parameter than is defined for that parameter. The parameters are
all stored in the same space and exceeding a given parameter's storage space can overwrite another
parameter's value. This, in turn, can cause the field procedure to see invalid input data or cause the value
returned to the database to be invalid.
| Parameter 1
|
A small (2 byte) integer that describes the function to be performed. This parameter is input only.
|
Supported values are:
v 0 field-encoding
v 4 field-decoding
|
|
30
v 8 field-definition
|
|
|
|
|
Parameter 2
A structure that defines the field procedure parameter value list (FPPVL).
v For function code 8, this parameter is input/output.
v For function code 0 and 4, this parameter contains the output of the function code 8 call. This
parameter is input only
|
|
|
|
Parameter 3
The decoded data attribute that is defined by the Column Value Descriptor (CVD). This is the
column attributes that were specified at CREATE TABLE or ALTER TABLE time. This parameter
is input only.
|
|
|
|
Parameter 4
The decoded data. The exact structure is dependent on function code.
v If function code 8, then the NULL value. This parameter is input only.
v If function code 0, then the data to be encoded. This parameter is input only.
v If function code 4, then the location to place the decoded data. This parameter is output only.
|
|
|
|
|
|
Parameter 5
The encoded data attribute that is defined by the Field Value Descriptor (FVD).
v If function code 8, then the structure containing the encoded data attributes. This parameter is
output only.
v If function code 0 or 4 , then a structure containing the encoded data attributes that was
returned by the function 8 call. This parameter is input only.
|
|
|
|
|
Parameter 6
The encoded data that is defined by the Field Value Descriptor (FVD). The exact structure is
dependent on function code.
v If function code 8, then the NULL value. This parameter is input only.
v If function code 0, then the location to place the encoded data. This parameter is output only.
|
|
|
v If function code 4, then the encoded form of the data. This parameter is input only.
Parameter 7
The SQLSTATE (character(5)). This parameter is input/output.
|
|
|
|
|
This parameter is set by DB2 to '00000' before calling the field procedure. It can be set by the field
procedure. While normally the SQLSTATE is not set by a field procedure, it can be used to signal
an error to the database as follows:
v If the field procedure detects an error, it should set the SQLSTATE to '38xxx', where xxx may
be one of several possible strings. For more information, see DB2 Messages and Codes.
|
|
|
|
|
|
|
|
|
|
|
|
Parameter 8
The message text area (varchar(1000)). This parameter is input/output.
This argument is set by DB2 to the empty string before calling the field procedure. It is a
VARCHAR(1000) value that can be used by the field procedure to send message text back when
an SQLSTATE error is signaled by the field procedure. Message text is ignored by DB2 unless the
SQLSTATE parameter is set by the field procedure. The message text is assumed to be in the job
CCSID.
Parameter 9
A 128-byte structure containing additional information for the field procedure. This parameter is
input only.
This structure is set by DB2 before calling the field procedure. For field procedures that mask
data, it indicates that the caller is a system function that requires that the data be decoded
SQL programming
31
without masking. For example, in some cases, RGZPFM and ALTER TABLE may need to copy
data. If the field procedure ignores this parameter and masks data when these operations are
performed, the column data will be lost. Hence, it is critical that a field procedure that masks
data properly handle this parameter.
|
|
|
|
Table 5. sqlfpFieldProcedureParameterList_T
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Table 6. sqlfpOptionalParameterValueDescriptor_T
Name
Offset
Data Type
Description
sqlfpOptParmValueListLength
4-byte integer
sqlfpNumberOfOptionalParms
4-byte integer
sqlfpParmList
structure
A list containing
sqlfpOptionalParameterValueDescriptor_T sqlfpNumberOfOptionalParms count of
sqlfpOptionalParameterValueDescriptor_T
items.
Name
Offset
Data Type
Description
sqlfpOptDescLength
4-byte integer
sqlfpParmDesc
structure sqlfpParameterDescription_T
Parameter description
reserved2
38
character(12)
Not used
sqlfpParmData
40
32
|
|
|
|
The column value descriptor (CVD) contains a description of a column value. During field-encoding, the
CVD describes the value to be encoded. During field-decoding, it describes the decoded value to be
supplied by the field procedure. During field-definition, it describes the column as defined in the
CREATE TABLE or ALTER TABLE statement.
|
|
|
|
The field value descriptor (FVD) contains a description of a field value. During field-encoding, the FVD
describes the encoded value to be returned from the field procedure. During field-decoding, it describes
the value to be decoded. During field-definition a description of the encoded value must put into the
FVD.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Table 7. sqlfpParameterDescription_T
The input provided to the field-definition operation, and the output required, are as follows:
|
|
v Parameter 1
Input - A small (2 byte) integer that describes the function to be performed (8 - field-definition).
|
|
|
|
v Parameter 2
Input/Output - A structure that defines the field procedure parameter value list (FPPVL). This is
an auto-extendable space. The minimum length of this structure is 8 bytes. The maximum
returned length of this structure is 32K.
|
|
v Parameter 3
Input - The structure sqlfpParameterDescription_T containing the decoded data attributes.
|
|
v Parameter 4
Not used.
Name
Offset
Data Type
Description
sqlfpSqlType
2-byte integer
sqlfpByteLength
sqlfpLength
sqlfpPrecision
10
2-byte integer
sqlfpScale
12
2-byte integer
sqlfpCcsid
14
sqlfpAllocatedLength
16
reserved1
18
character(14)
Reserved.
SQL programming
33
| v Parameter 5
|
Output - The structure sqlfpParameterDescription_T containing the encoded data attributes. The
|
output sqlfpParameterDescription_T must be valid with the appropriate CCSID, length, precision,
|
and scale fields set.
| v Parameter 6
|
Not used.
| v Parameter 7
|
Input/Output - The SQLSTATE (character(5)).
| v Parameter 8
|
Input/Output - The message text area (varchar(1000)).
| v Parameter 9
|
Input - Reserved.
|
|
|
|
|
Errors returned by a field procedure result in SQLCODE -681 (SQLSTATE '23507'), which is set in the SQL
communication area (SQLCA) and the DB2_RETURNED_SQLCODE and RETURNED_SQLSTATE
condition area item of the SQL diagnostics area. The contents of Parameter 7 and 8 are placed into the
tokens, in SQLCA, as field SQLERRMT and in the SQL Diagnostic area condition area item
MESSAGE_TEXT. The meaning of the error message is determined by the field procedure.
The modified FPPVL is recorded in the catalog table QSYS2.SYSFIELDS, and is passed to the field
procedure during field-encoding and field-decoding. The modified FPPVL need not have the format of a
field procedure parameter list, and it need not describe constants by optional parameter value
descriptors.
34
|
|
|
|
|
v Parameter 4
Input Data to be encoded.
If the value is a varying-length string, the first 2 bytes contains its length. If the value is a LOB or
XML, then the first 4 bytes contains the length. If the value is numeric, the internal numeric
representation of the data. If a datetime value, the value is in *ISO format.
|
|
|
v Parameter 5
Input - A structure described by sqlfpParameterDescription_T containing the encoded data
attributes.
|
|
v Parameter 6
Output Location to place the encoded data.
|
|
|
|
If the encoded value is a varying-length string, the first 2 bytes must contain the length. If the
encoded value is a LOB or XML, then the first 4 bytes must contain the length. If the value is
numeric, the internal numeric representation of the data. If a datetime value, the value must be in
*ISO format.
|
|
v Parameter 7
Input/Output - The SQLSTATE (character(5)).
|
|
v Parameter 8
Input/Output - The message text area (varchar(1000)).
|
|
v Parameter 9
Input - Reserved.
|
|
|
|
|
|
Errors returned by a field procedure result in SQLCODE -681 (SQLSTATE '23507'), which is set in the SQL
communication area (SQLCA) and the DB2_RETURNED_SQLCODE and RETURNED_SQLSTATE
condition area item of the SQL diagnostics area. The contents of Parameter 7 and 8 are placed into the
tokens, in SQLCA, as field SQLERRMT and in the SQL Diagnostic area condition area item
MESSAGE_TEXT. If the database manager is unable to invoke the field procedure then SQLCODE -682
(SQLSTATE '57010') is returned.
The input provided to the field-decoding operation, and the output required, are as follows:
|
|
v Parameter 1
Input - A small (2 byte) integer that describes the function to be performed (4 - field-decoding).
|
|
v Parameter 2
Input - A structure that defines the modified field procedure parameter value list (FPPVL).
|
|
|
v Parameter 3
Input - A structure described by sqlfpParameterDescription_T containing the decoded data
attributes.
|
|
v Parameter 4
Output Location to place the decoded data.
|
|
|
|
If the decoded value is a varying-length string, the first 2 bytes must contain the length. If the
decoded value is a LOB or XML, then the first 4 bytes must contain the length. If the value is
numeric, the internal numeric representation of the data. If a datetime value, the value must be in
*ISO format.
|
|
|
v Parameter 5
Input - A structure described by sqlfpParameterDescription_T containing the encoded data
attributes.
|
|
v Parameter 6
Input - encoded data
SQL programming
35
If the value is a varying-length string, the first 2 bytes contains its length. If the value is a LOB or
XML, then the first 4 bytes contains the length. If the value is numeric, the internal numeric
representation of the data. If a datetime value, the value is in *ISO format.
|
|
|
| v Parameter 7
|
Input/Output - The SQLSTATE (character(5)).
| v Parameter 8
|
Input/Output - The message text area (varchar(1000)).
| v Parameter 9
|
Input - Indicates that the caller is a system function that requires that the data be decoded
|
without masking.
|
|
|
|
|
|
Errors returned by a field procedure result in SQLCODE -681 (SQLSTATE '23507'), which is set in the SQL
communication area (SQLCA) and the DB2_RETURNED_SQLCODE and RETURNED_SQLSTATE
condition area item of the SQL diagnostics area. The contents of Parameter 7 and 8 are placed into the
tokens, in SQLCA, as field SQLERRMT and in the SQL Diagnostic area condition area item
MESSAGE_TEXT. If the database manager is unable to invoke the field procedure then SQLCODE -682
(SQLSTATE '57010') is returned.
Add field procedure FP1 to column C1. The Field Procedure FP1 takes one parameter which indicates the
number of bytes of the column the field procedure should operate on.
ALTER TABLE TESTTAB ALTER C1 SET FIELDPROC FP1(10)
#include "string.h"
#include <QSYSINC/H/SQLFP>
void reverse(char *in, char *out, long length);
main(int argc, void *argv[])
{
short *funccode = argv[1];
sqlfpFieldProcedureParameterList_T *optionalParms = argv[2];
char *sqlstate = argv[7];
sqlfpMessageText_T *msgtext = argv[8];
int bytesToProcess;
sqlfpOptionalParameterValueDescriptor_T *optionalParmPtr;
if (optionalParms->sqlfpNumberOfOptionalParms != 1)
{
memcpy(sqlstate,"38001",5);
return;
}
optionalParmPtr = (void *)&(optionalParms->sqlfpParmList);
bytesToProcess = *((int *)&optionalParmPtr->sqlfpParmData);
if (*funccode == 8)
/* create time */
{
sqlfpParameterDescription_T *inDataType = argv[3];
sqlfpParameterDescription_T *outDataType = argv[5];
if (inDataType->sqlfpSqlType !=452 &&
inDataType->sqlfpSqlType !=453 )
/* only support fixed length char */
{
memcpy(sqlstate,"38002",5);
return;
}
/* do something here to determine the result data type */
/* ..... */
/* in this example input and output types are exactly the same */
/* so just copy */
memcpy(outDataType, inDataType, sizeof(sqlfpParameterDescription_T));
}
else if (*funccode == 0)
/* encode */
{
36
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v The field procedure will not be called if the data to be encoded or decoded is the null value.
v On an encode operation, packed decimal and zoned decimal values will be converted to the preferred
sign prior to calling the user field procedure program.
v The field procedure must be deterministic. For SQE, caching of results will occur based on the
QAQQINI FIELDPROC_ENCODED_COMPARISON.
v The field procedure must be parallel capable and capable of running in a multi-threaded environment.
For RPG, this means the THREAD(*SERIALIZE) control specification must be specified.
v Must be capable of running in both a fenced and non-fenced environment.
v The program cannot be created with ACTGRP(*NEW). If the program is created with
ACTGRP(*CALLER), the program will run in the default activation group.
v Field procedure programs are expected to be short running. It is recommended that the field procedure
program avoid commitment control and native database operations.
v Create the program in the physical file's library.
v If an error occurs or is detected in the field procedure program, the field procedure program should set
the SQLSTATE and message text parameters. If the SQLSTATE parameter is not set to indicate an error,
database assumes that the field procedure ran successfully. This might cause the user data to end up in
an inconsistent state.
|
|
Warning: Field procedures are a productive way both to provide application functions and to manage
information. However, field procedure programs could provide the ability for someone with devious
SQL programming
37
| intentions to create a "Trojan horse"1 on your system. This is why it is important to restrict who has the
| authority to alter a table. If you are managing object authority carefully, the typical user will not have
| sufficient authority to add a field procedure program.
| Index considerations:
| Indexes may be recovered at IPL time based on the RECOVER parameter of CRTPF, CRTLF, CHGPF, or
| CHGLF commands. Indexes that are based on a column that has a field procedure have special
| considerations.
|
|
|
|
|
Use of PASE(QSH) and JAVA within field procedures must be avoided if the index keys are built over
expressions that contain columns with field procedures or the sparse index criteria references a column
with an associated field procedure. If use of PASE or JAVA is required, consider changing indexes to
RECOVER(*NO) so that they are not recovered during the IPL process and recovered during an open
operation instead.
| Thread considerations:
| A field procedure runs in the same job as the operation that initiated the field procedure. However, the
| field procedure may or may not run in a different system thread (fenced or not fenced) which are
| separate from the thread from the initiating request.
|
|
|
|
|
|
|
|
Because the field procedure runs in the same job, it shares much of the same environment as the
initiating request. However, because it may run under a separate thread, the following threads
considerations apply:
v Field procedures do not inherit any program adopted authority that may have been active at the time
the request was initiated. Field procedure authority comes from the authority associated with the field
procedure program or from the authority of the user running.
v The field procedure cannot perform any operation that is blocked from being run in a secondary
thread.
| v The field procedure program must be created such that it either runs under a named activation group
|
or in the activation group of its caller (ACTGRP parameter). Programs that specify *CALLER will run
|
in the default activation group.
| Debug considerations:
| There are some things to keep in mind when debugging field procedures.
| Since field procedures can run in a secondary thread, it is recommended that debugging should be done
| using STRSRVJOB or the graphical debugger.
|
|
|
|
For natively run field procedures, the database manager uses the job default wait time. If the field
procedure does not return within that specified time, an error is returned. This default wait time may
need to be increased when debugging field procedures. For example, to change the default wait time to 5
minutes: CHGJOB DFTWAIT(300)
1. In history, the Trojan horse was a large hollow wooden horse that was filled with Greek soldiers. After the horse was introduced
within the walls of Troy, the soldiers climbed out of the horse and fought the Trojans. In the computer world, a program that
hides destructive functions is often called a Trojan horse.
38
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v Field-decoding
Masking must only be performed for field-decoding. It must not be performed for field-encoding. If
masking was performed for field-encoding, the masked data would be stored in the table and the
actual value would be lost.
In some cases, system code needs to copy data internally (the data is not being returned to the user
in these cases). For example, in some cases, RGZPFM, ALTER TABLE, and CRTDUPOBJ must copy
data internally. Likewise, data passed internally to triggers must not be masked. During these
operations, when the data is read, field-decoding will occur and when the data is written,
field-encoding will occur. If masking is performed in these cases during field-decoding, the mask
data will then be written and the actual data will be lost.
To prevent corruption, the ninth parameter to the field procedure indicates whether this is a system
operation where masking must not be performed. It is critical that the field procedure be written to
check this parameter during field-decoding and if the parameter indicates that masking must not be
performed, the field procedure must not mask regardless of the user or environment.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
v Field-encoding
For native update and insert operations, the field procedure must be able to identify when masked
data is being passed to the field procedure and take special actions. For example, a field procedure
might be written to mask a credit card number column. That same user may be authorized to read
and update the table through an RPG application that performs READ and UPDATE operations.
When the READ is performed, the credit card number is masked to prevent the user from seeing it,
but when the user performs the UPDATE, the masked data will be passed back to database on the
UPDATE operation and the field procedure will be called to encode the data. If the field procedure
does not recognize that the value being passed is masked, the masked data would be encoded and
stored in the table and the original value in the row would be corrupted with an encoded masked
data.
To prevent corruption, the field procedure must recognize on field-encoding that the data is masked.
Instead of encoding the data, the field procedure must return a warning SQLSTATE value of 09501
in the seventh parameter.
- For an UPDATE operation, 09501 indicates to DB2 that the current value for the column should
be used.
- For an INSERT operation, 09501 indicates to DB2 that the default value should be used for the
associated column value.
|
|
|
|
|
|
|
|
|
|
|
Query Considerations: There are several considerations that apply to queries that reference a column of a
table that has a field procedure that masks data:
v Depending on how the optimizer implements a query, the same query may return different rows and
values for different users or environments. This will occur in cases where optimizer must decode the
data in order to perform comparisons or evaluate expressions in a query. If masking is performed for
one user but not for another user, the result of the decode operation will be very different, so the
resulting rows and values can also be quite different for the two users.
For example, assume that a field procedure returns (decodes) data for user profile MAIN without
masking and returns (decodes) data for user profile QUSER with masking. An application contains the
following query:
SELECT * FROM orders WHERE cardnum = 112233
|
|
By default, the optimizer will try to implement the search condition (logically) as follows:
|
|
|
|
|
This is the best performing implementation since it allows DB2 to compare the encoded version of the
constant 112233 with the encoded version of the CARDNUM values that are stored in the orders
table. Since the optimizer did not decode the data to perform the comparison, the query will return the
same rows for the MAIN and QUSER user profiles. The only difference will be that QUSER will see
masked values in the result rows for the CARDNUM column.
SQL programming
39
|
|
|
|
|
|
|
|
The implementation of queries that reference a field procedure column can be controlled by the
QAQQINI FIELDPROC_ENCODED_COMPARISON option. The default value for this option is
*ALLOW_EQUAL. This option enables the optimizer to implement the comparison using the encoded
values.
In the previous example, if the FIELDPROC_ENCODED_COMPARISON option was changed to
*NONE, the query would return different rows for the two users. When the value is *NONE, an equal
comparison will be implemented internally by DB2 as follows:
|
|
|
|
|
|
In this case, DB2 has to decode the CARDNUM values for every row in the table to compare against
the original constant '112233'. This means that the comparison for the MAIN user profile will compare
the decoded and unmasked card number values (112233, 332211, etc) to 112233. The MAIN user
profile will find the orders associated with the specified card number (112233). However, the query will
not return any rows for the QUSER user profile. That is because the comparison for QUSER will be
comparing the masked value of the card numbers (****33, ****11, etc) with the constant 112233.
|
|
|
Verify that only authorized users can execute the CHGQRYA command. By default only users with job
control (*JOBCTL) special authority or have the QIBM_DB_SQLADM function usage are authorized to
the CHGQRYA command.
QUSRSYS/QAQQINI file
40
|
|
|
Verify that only authorized users can create the QUSRSYS/QAQQINI file or update it if it already
exists. By default *PUBLIC has *USE authority to QUSRSYS which is not be enough authority to create
a new QUSRSYS.QAQQINI file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Add field procedure FP1 to column C1. The Field Procedure FP1 takes one additional parameter which
indicates the number of bytes of the column the field procedure should operate on.
ALTER TABLE TESTTAB ALTER C1 SET FIELDPROC FP1(10)
#include "string.h"
#include <QSYSINC/H/SQLFP>
void reverse(char *in, char *out, long length);
main(int argc, void *argv[])
{
short *funccode = argv[1];
sqlfpFieldProcedureParameterList_T *optionalParms = argv[2];
char *sqlstate = argv[7];
sqlfpMessageText_T *msgtext = argv[8];
int bytesToProcess;
sqlfpOptionalParameterValueDescriptor_T *optionalParmPtr;
sqlfpInformation_T *info = argv[9];
int masked;
if (optionalParms->sqlfpNumberOfOptionalParms != 1)
{
memcpy(sqlstate,"38001",5);
return;
}
optionalParmPtr = (void *)&(optionalParms->sqlfpParmList);
bytesToProcess = *((int *)&optionalParmPtr->sqlfpParmData);
/*******************************************************************/
/* CREATE CALL
*/
/*******************************************************************/
if (*funccode == 8)
/* create time */
{
sqlfpParameterDescription_T *inDataType = argv[3];
sqlfpParameterDescription_T *outDataType = argv[5];
if (inDataType->sqlfpSqlType !=452 &&
inDataType->sqlfpSqlType !=453 )
/* only support fixed length char */
{
memcpy(sqlstate,"38002",5);
return;
}
/* do something here to determine the result data type */
/* ..... */
/* in this example input and output types are exactly the same */
/* so just copy */
memcpy(outDataType, inDataType, sizeof(sqlfpParameterDescription_T));
}
/*******************************************************************/
/* ENCODE (WRITE) CALL
*/
/*******************************************************************/
else if (*funccode == 0)
/* encode */
{
char *decodedData = argv[4];
char *encodedData = argv[6];
/*
/*
/*
/*
if
41
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
{
memcpy(sqlstate,"09501",5);
}
else
{
reverse(decodedData, encodedData, bytesToProcess);
}
}
/*******************************************************************/
/* DECODE (READ) CALL
*/
/*******************************************************************/
else if (*funccode == 4)
/* decode */
{
char *decodedData = argv[4];
char *encodedData = argv[6];
/* The 9th paramter indicates that the column must not be
/* masked. For exmaple, during ALTER TABLE or RGZPFM.
if ( info->sqlfpNoMask == 1 )
{
reverse(encodedData, decodedData, bytesToProcess);
return;
}
else
{
reverse(encodedData, decodedData, bytesToProcess);
/* Mask the data when appropriate
/* Assume mask is set to 0 when it should not be masked
/* and 1 when it shoulbe be masked
if (masked == 1)
{
memcpy(decodedData, "XXXXXXXXXXXX", 12);
}
}
*/
*/
*/
*/
*/
return;
}
/*******************************************************************/
/* ERROR- UNSUPPORTED OPTION
*/
/*******************************************************************/
else /* unsupported option -- error */
memcpy(sqlstate, "38003",5);
}
/*******************************************************************/
/* REVERSE
*/
/*******************************************************************/
void reverse(char *in, char *out, long length)
{
int i;
for (i=0;i<length; ++i) {
out[length - (i+1)] = in[i];
}
}
42
These labels can be seen in the SQL catalog in the LABEL column.
The LABEL ON statement looks like this:
LABEL ON
TABLE CORPDATA.DEPARTMENT IS Department Structure Table
LABEL ON
COLUMN CORPDATA.DEPARTMENT.ADMRDEPT IS Reports to Dept.
After these statements are run, the table named DEPARTMENT displays the text description as
Department Structure Table and the column named ADMRDEPT displays the heading Reports to Dept. The
label for an object or a column cannot be more than 50 bytes and the label for a column heading cannot
be more than 60 bytes (blanks included). Here are the examples of LABEL ON statements for column
headings:
This LABEL ON statement provides column heading 1 and column heading 2:
*...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.EMPNO IS
Employee
Number
This LABEL ON statement provides three levels of column headings for the SALARY column:
*...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.SALARY IS
Yearly
Salary
(in dollars)
This LABEL ON statement provides a DBCS column heading with two levels specified:
*...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.SALARY IS
<AABBCCDD>
<EEFFGG>
This LABEL ON statement provides the column text for the EDLEVEL column:
*...+....1....+....2....+....3....+....4....+....5....+....6..*
LABEL ON COLUMN CORPDATA.EMPLOYEE.EDLEVEL TEXT IS
Number of years of formal education
Related reference:
LABEL
SQL programming
43
Related reference:
COMMENT
Adding a column
When you add a new column to a table, the column is initialized with its default value for all existing
rows. If NOT NULL is specified, a default value must also be specified.
You can add a column to a table using the ADD COLUMN clause of the SQL ALTER TABLE statement.
The altered table may consist of up to 8000 columns. The sum of the byte counts of the columns must not
be greater than 32766 or, if a VARCHAR or VARGRAPHIC column is specified, 32740. If a LOB column is
specified, the sum of record data byte counts of the columns must not be greater than 15 728 640.
Related reference:
ALTER TABLE
Changing a column
You can change a column definition in a table using the ALTER COLUMN clause of the ALTER TABLE
statement.
When you change the data type of an existing column, the old and new attributes must be compatible.
You can always change a character, graphic, or binary column from fixed length to varying length or
LOB; or from varying length or LOB to fixed length.
When you convert to a data type with a longer length, data is padded with the appropriate pad
character. When you convert to a data type with a shorter length, data might be lost because of
truncation. An inquiry message prompts you to confirm the request.
44
If you have a column that does not allow the null value and you want to change it to now allow the null
value, use the DROP NOT NULL clause. If you have a column that allows the null value and you want
to prevent the use of null values, use the SET NOT NULL clause. If any of the existing values in that
column are the null value, the ALTER TABLE will not be performed and an SQLCODE of -190 will result.
Related reference:
Allowable conversions of data types
When you change the data type of an existing column, the old and new attributes must be compatible.
Related information:
ALTER TABLE
To data type
Decimal
Numeric
Decimal
Decimal
Decfloat
Decimal
Float
Numeric
Decimal
Numeric
Numeric
Decfloat
Numeric
Float
Decimal
Numeric
Decfloat
Float
Float
Decimal
Float
Numeric
Float
Float
Decfloat
Character
DBCS-open
Character
DBCS-open
Character
DBCS-open
DBCS-either
Character
DBCS-either
DBCS-open
DBCS-either
DBCS-only
DBCS-open
DBCS-only
DBCS graphic
DBCS-only
DBCS graphic
Character
DBCS-open
DBCS graphic
SQL programming
45
To data type
distinct type
source type
source type
distinct type
When you change an existing column, only the attributes that you specify are changed. All other
attributes remain unchanged. For example, you have a table with the following table definition:
CREATE TABLE EX1 (COL1 CHAR(10) DEFAULT COL1,
COL2 VARCHAR(20) ALLOCATE(10) CCSID 937,
COL3 VARGRAPHIC(20) ALLOCATE(10)
NOT NULL WITH DEFAULT)
After you run the following ALTER TABLE statement, COL2 still has an allocated length of 10 and
CCSID 937, and COL3 still has an allocated length of 10.
ALTER TABLE EX1 ALTER COLUMN COL2 SET DATA TYPE VARCHAR(30)
ALTER COLUMN COL3 DROP NOT NULL
Related reference:
Changing a column on page 44
You can change a column definition in a table using the ALTER COLUMN clause of the ALTER TABLE
statement.
Deleting a column
You can delete a column using the DROP COLUMN clause of the ALTER TABLE statement.
Dropping a column deletes that column from the table definition. If CASCADE is specified, any views,
indexes, and constraints dependent on that column will also be dropped. If RESTRICT is specified, and
any views, indexes, or constraints are dependent on the column, the column will not be dropped and
SQLCODE of -196 will be issued.
ALTER TABLE DEPT
DROP COLUMN NUMDEPT
Related reference:
ALTER TABLE
46
|
|
|
|
|
|
|
|
Using CREATE OR REPLACE TABLE lets you consolidate the master definition of a table into one
statement. You do not need to maintain the source for the original CREATE TABLE statement plus a
complex list of ALTER TABLE statements needed to recreate the most current version of a table. This
CREATE TABLE statement can be executed to deploy the current definition of the table either as a new
table or to replace a prior version of the table.
|
|
|
|
There are options to either keep the existing data in the table or to clear the data from the table during
the replace. The default is to keep all data. If you elect to clear all the data, your new table definition
does not need to be compatible with the original version. In all cases, other objects that depend on the
table, such as referential constraints, triggers, and views, must remain satisfied or the replace will fail.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Perhaps over time, you have updated the column names to be more descriptive, changed the DESCR
column to be a longer Unicode column, and added a timestamp column for when the row was last
updated. The following statement reflects all of these changes and can be executed against any prior
version of the table, as long as the column names can be matched to the prior column names and the
data types are compatible.
|
|
Partitioned tables can be modified using CREATE OR REPLACE TABLE. The following example
demonstrates splitting a single partition into multiple partitions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
To break the second partition into 3 pieces, modify the original CREATE TABLE statement to redefine the
partitions.
The OR REPLACE option on the CREATE TABLE statement can be used to change an existing table
definition.
SQL programming
47
| Now the table will have 5 partitions with the data spread among them according to the new definition.
|
|
|
|
|
|
This example uses the default of ON REPLACE PRESERVE ALL ROWS. That means that all data for all
rows is guaranteed to be kept. If data from an existing partition doesn't fit in any new partition, the
statement fails. To remove a partition and the data from that partition, omit the partition definition from
the CREATE OR REPLACE TABLE statement and use ON REPLACE PRESERVE ROWS. This will
preserve all the data that can be assigned to the remaining partitions and discard any that no longer has
a defined partition.
When alias MYLIB.MYMBR2_ALIAS is specified on the following insert statement, the values are inserted
into member MBR2 in MYLIB.MYFILE:
INSERT INTO MYLIB.MYMBR2_ALIAS VALUES(ABC, 6)
Alias names can also be specified on DDL statements. Assume that MYLIB.MYALIAS is an alias for table
MYLIB.MYTABLE. The following DROP statement drops table MYLIB.MYTABLE:
DROP TABLE MYLIB.MYALIAS
If you really want to drop the alias name instead, specify the ALIAS keyword on the drop statement:
DROP ALIAS MYLIB.MYALIAS
Related reference:
CREATE ALIAS
| Since the view name, EMP_MANAGERS, is too long for a system object name, the FOR SYSTEM NAME
| clause can be used to provide the system name. Without adding this clause, a name like EMP_M00001
| will be generated for the system object.
48
After you create the view, you can use it in SQL statements just like a table. You can also change the data
in the base table through the view. The following SELECT statement displays the contents of
EMP_MANAGERS:
SELECT *
FROM CORPDATA.EMP_MANAGERS
WORKDEPT
THOMPSON
B01
KWAN
C01
GEYER
E01
STERN
D11
PULASKI
D21
HENDERSON
E11
SPENSER
E21
If the select list contains elements other than columns such as expressions, functions, constants, or special
registers, and the AS clause was not used to name the columns, a column list must be specified for the
view. In the following example, the columns of the view are LASTNAME and YEARSOFSERVICE.
CREATE VIEW CORPDATA.EMP_YEARSOFSERVICE
(LASTNAME, YEARSOFSERVICE) AS
SELECT LASTNAME, YEAR (CURRENT DATE - HIREDATE)
FROM CORPDATA.EMPLOYEE
Because the results of querying this view change as the current year changes, they are not included here.
You can also define the previous view by using the AS clause in the select list to name the columns in the
view. For example:
CREATE VIEW CORPDATA.EMP_YEARSOFSERVICE AS
SELECT LASTNAME,
YEARS (CURRENT_DATE - HIREDATE) AS YEARSOFSERVICE
FROM CORPDATA.EMPLOYEE
Using the UNION keyword, you can combine two or more subselects to form a single view. For example:
CREATE VIEW D11_EMPS_PROJECTS AS
(SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
UNION
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO = MA2112 OR
PROJNO = MA2113 OR
PROJNO = AD3111)
SQL programming
49
Views are created with the sort sequence in effect at the time the CREATE VIEW statement is run. The
sort sequence applies to all character, or UCS-2 or UTF-16 graphic comparisons in the CREATE VIEW
statement subselect.
You can also create views using the WITH CHECK OPTION clause to specify the level of checking when
data is inserted or updated through the view.
Related concepts:
Retrieving data using the SELECT statement on page 56
The SELECT statement tailors your query to gather data. You can use the SELECT statement to retrieve a
specific row or retrieve data in a specific way.
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Using the UNION keyword to combine subselects on page 95
Using the UNION keyword, you can combine two or more subselects to form a fullselect.
CREATE VIEW
50
Because no WITH CHECK OPTION is specified, the following INSERT statement is successful even
though the value being inserted does not meet the search condition of the view.
INSERT INTO V1 VALUES (5)
Create another view over V1, specifying the WITH CASCADED CHECK OPTION clause:
CREATE VIEW V2 AS SELECT COL1
FROM V1 WITH CASCADED CHECK OPTION
The following INSERT statement fails because it produces a row that does not conform to the definition
of V2:
INSERT INTO V2 VALUES (5)
The following INSERT statement fails only because V3 is dependent on V2, and V2 has a WITH
CASCADED CHECK OPTION.
INSERT INTO V3 VALUES (5)
However, the following INSERT statement is successful because it conforms to the definition of V2.
Because V3 does not have a WITH CASCADED CHECK OPTION, it does not matter that the statement
does not conform to the definition of V3.
INSERT INTO V3 VALUES (200)
Create second view over V1, this time specifying WITH LOCAL CHECK OPTION:
CREATE VIEW V2 AS SELECT COL1
FROM V1 WITH LOCAL CHECK OPTION
The same INSERT statement that failed in the previous CASCADED CHECK OPTION example succeeds
now because V2 does not have any search conditions, and the search conditions of V1 do not need to be
checked since V1 does not specify a check option.
SQL programming
51
The following INSERT is successful again because the search condition on V1 is not checked due to the
WITH LOCAL CHECK OPTION on V2, versus the WITH CASCADED CHECK OPTION in the previous
example.
INSERT INTO V3 VALUES (5)
The difference between LOCAL and CASCADED CHECK OPTION lies in how many of the dependent
views' search conditions are checked when a row is inserted or updated.
v WITH LOCAL CHECK OPTION specifies that the search conditions of only those dependent views
that have the WITH LOCAL CHECK OPTION or WITH CASCADED CHECK OPTION are checked
when a row is inserted or updated.
v WITH CASCADED CHECK OPTION specifies that the search conditions of all dependent views are
checked when a row is inserted or updated.
Example: Cascaded check option:
This example shows how the check option is enforced on a number of dependent views that are defined
with or without a check option.
Use the following table and views:
CREATE TABLE T1 (COL1 CHAR(10))
CREATE VIEW V1 AS SELECT COL1
FROM T1 WHERE COL1 LIKE A%
CREATE VIEW V2 AS SELECT COL1
FROM V1 WHERE COL1 LIKE %Z
WITH LOCAL CHECK OPTION
CREATE VIEW V3 AS SELECT COL1
FROM V2 WHERE COL1 LIKE AB%
CREATE VIEW V4 AS SELECT COL1
FROM V3 WHERE COL1 LIKE %YZ
WITH CASCADED CHECK OPTION
CREATE VIEW V5 AS SELECT COL1
FROM V4 WHERE COL1 LIKE ABC%
Different search conditions are going to be checked depending on which view is being operated on with
an INSERT or UPDATE statement.
v If V1 is operated on, no conditions are checked because V1 does not have a WITH CHECK OPTION
specified.
v If V2 is operated on,
COL1 must end in the letter Z, but it doesn't need to start with the letter A. This is because the
check option is LOCAL, and view V1 does not have a check option specified.
v If V3 is operated on,
COL1 must end in the letter Z, but it does not need to start with the letter A. V3 does not have a
check option specified, so its own search condition must not be met. However, the search condition
for V2 must be checked because V3 is defined on V2, and V2 has a check option.
v If V4 is operated on,
52
COL1 must start with 'AB' and must end with 'YZ'. Because V4 has the WITH CASCADED CHECK
OPTION specified, every search condition for every view on which V4 is dependent must be
checked.
v If V5 is operated on,
COL1 must start with 'AB', but not necessarily 'ABC'. This is because V5 does not specify a check
option, so its own search condition does not need to be checked. However, because V5 is defined on
V4, and V4 had a cascaded check option, every search condition for V4, V3, V2, and V1 must be
checked. That is, COL1 must start with 'AB' and end with 'YZ'.
If V5 were created WITH LOCAL CHECK OPTION, operating on V5 means that COL1 must start with
'ABC' and end with 'YZ'. The LOCAL CHECK OPTION adds the additional requirement that the third
character must be a 'C'.
Creating indexes
You can use indexes to sort and select data. In addition, indexes help the system retrieve data faster for
better query performance.
Use the CREATE INDEX statement to create indexes. The following example creates an index over the
column LASTNAME in the CORPDATA.EMPLOYEE table:
CREATE INDEX CORPDATA.INX1 ON CORPDATA.EMPLOYEE (LASTNAME)
You can also create an index that does not exactly match the data for a column in a table. For example,
you can create an index that uses the uppercase version of an employee name:
CREATE INDEX CORPDATA.INX2 ON CORPDATA.EMPLOYEE (UPPER(LASTNAME))
Most expressions allowed by SQL can be used in the definition of the key columns.
You can create any number of indexes. However, because the indexes are maintained by the system, a
large number of indexes can adversely affect performance. One type of index, the encoded vector index
(EVI), allows for faster scans that can be more easily processed in parallel.
If an index is created that has exactly the same attributes as an existing index, the new index shares the
existing indexes' binary tree. Otherwise, another binary tree is created. If the attributes of the new index
are exactly the same as another index, except that the new index has fewer columns, another binary tree
is still created. It is still created because the extra columns prevent the index from being used by cursors
or UPDATE statements that update those extra columns.
Indexes are created with the sort sequence in effect at the time the CREATE INDEX statement is run. The
sort sequence applies to all SBCS character fields, or UCS-2 or UTF-16 graphic fields of the index.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Creating an index strategy
Related reference:
CREATE INDEX
|
You can use global variables to assign specific variable values for a session.
|
|
|
Use the CREATE VARIABLE statement to create a global variable. The following example creates a global
variable that defines a user class.
CREATE VARIABLE USER_CLASS INT DEFAULT (CLASS_FUNC(USER))
SQL programming
53
| This variable will have its initial value set based on the result of invoking a function called
| CLASS_FUNC. This function is assumed to assign a class value such as administrator or clerk based on
| the USER special register value.
| A global variable is instantiated for a session the first time it is referenced. Once it is set, it will maintain
| its value unless explicitly changed within the session.
|
|
|
|
|
|
A global variable can be used in a query to determine what results will be returned. In the following
example, a list of all employees from department A00 are listed. Only a session that has a global variable
with a USER_CLASS value of 1 will see the salaries for these employees.
SELECT EMPNO, LASTNAME, CASE WHEN USER_CLASS = 1 THEN SALARY ELSE NULL END
FROM EMPLOYEE
WHERE WORKDEPT = A00
| Global variables can be used in any context where an expression is allowed. Unlike a host variable, a
| global variable can be used in a CREATE VIEW statement.
|
| You can replace an existing object using a CREATE statement rather than always needing to drop the
| object first.
|
|
|
|
|
For many SQL objects, you can optionally replace an existing object when using the CREATE SQL
statement. The existing object is effectively dropped before the new object is created. The following SQL
statements have that option:
v CREATE ALIAS
v CREATE FUNCTION
| v CREATE PROCEDURE
| v CREATE SEQUENCE
| v CREATE TRIGGER
| v CREATE VARIABLE
| v CREATE VIEW
| When the replace option is used for any of these statements, the privileges for the existing object are
| kept. The object definition is replaced by the new definition.
|
| To create a sequence called MYSEQUENCE, or replace a sequence of that name if it exists, use the
| following SQL statement
| CREATE OR REPLACE SEQUENCE MYSEQUENCE AS BIGINT
| The sequence will be created if it does not already exist. If it does exist, the privileges from the existing
| sequence will be transferred to the new sequence.
54
As the following examples show, you can display catalog information. You cannot insert, delete, or
update catalog information. You must have SELECT privileges on the catalog views to run the following
examples.
Related reference:
DB2 for i5/OS catalog views
The result of the previous example statement is a row of information for each column in the table.
For specific information about each column, specify a select-statement like this:
SELECT COLUMN_NAME, TABLE_NAME, DATA_TYPE, LENGTH, HAS_DEFAULT
FROM CORPDATA.SYSCOLUMNS
WHERE TABLE_NAME = DEPARTMENT
In addition to the column name for each column, the select-statement shows:
v The name of the table that contains the column
v The data type of the column
v The length attribute of the column
v If the column allows default values
The result looks like this.
COLUMN_NAME
TABLE_NAME
DATA_TYPE
LENGTH
HAS_DEFAULT
DEPTNO
DEPARTMENT
CHAR
DEPTNAME
DEPARTMENT
VARCHAR
29
MGRNO
DEPARTMENT
CHAR
ADMRDEPT
DEPARTMENT
CHAR
55
Related reference:
DROP
WHERE ADMRDEPT=A00
Comparisons might not be case sensitive if a shared-weight sort sequence is used where
uppercase and lowercase characters are treated as the same characters.
A SELECT statement can include the following:
1. The name of each column you want to include in the result.
2. The name of the table or view that contains the data.
3. A search condition to identify the rows that contain the information you want.
4. The name of each column used to group your data.
56
5. A search condition that uniquely identifies a group that contains the information you want.
6. The order of the results so a specific row among duplicates can be returned.
A SELECT statement looks like this:
SELECT column names
FROM table or view name
WHERE search condition
GROUP BY column names
HAVING search condition
ORDER BY column-name
The SELECT and FROM clauses must be specified. The other clauses are optional.
With the SELECT clause, you specify the name of each column you want to retrieve. For example:
SELECT EMPNO, LASTNAME, WORKDEPT
You can specify that only one column be retrieved, or as many as 8000 columns. The value of each
column you name is retrieved in the order specified in the SELECT clause.
If you want to retrieve all columns (in the same order as they appear in the table's definition), use an
asterisk (*) instead of naming the columns:
SELECT *
The FROM clause specifies the table that you want to select data from. You can select columns from more
than one table. When issuing a SELECT, you must specify a FROM clause. Issue the following statement:
SELECT *
FROM EMPLOYEE
The result is all of the columns and rows from the table EMPLOYEE.
The SELECT list can also contain expressions, including constants, special registers, and scalar fullselects.
An AS clause can be used to give the resulting column a name. For example, issue the following
statement:
SELECT LASTNAME, SALARY * .05 AS RAISE
FROM EMPLOYEE
WHERE EMPNO = 200140
RAISE
NATZ
1421
SQL programming
57
numeric values are not. This applies to all constant values wherever they are coded within an SQL
statement. For example, to specify that you are interested in the rows where the department number is
C01, issue the following statement:
... WHERE WORKDEPT = C01
In this case, the search condition consists of one predicate: WORKDEPT = 'C01'.
To further illustrate WHERE, put it into a SELECT statement. Assume that each department listed in the
CORPDATA.DEPARTMENT table has a unique department number. You want to retrieve the department
name and manager number from the CORPDATA.DEPARTMENT table for department C01. Issue the
following statement:
SELECT DEPTNAME, MGRNO
FROM CORPDATA.DEPARTMENT
WHERE DEPTNO = C01
MGRNO
INFORMATION CENTER
000030
If the search condition contains character, or UCS-2 or UTF-16 graphic column predicates, the sort
sequence that is in effect when the query is run is applied to those predicates. If a sort sequence is not
being used, character constants must be specified in uppercase or lowercase to match the column or
expression they are being compared to.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Defining complex search conditions on page 70
In addition to the basic comparison predicates, such as = and >, a search condition can contain any of
these predicates: BETWEEN, IN, EXISTS, IS NULL, and LIKE.
Multiple search conditions within a WHERE clause on page 72
You can qualify your request further by coding a search condition that includes several predicates.
Expressions in the WHERE clause:
An expression in a WHERE clause names or specifies something that you want to compare to something
else.
The expressions you specify can be:
v A column name names a column. For example:
... WHERE EMPNO = 000200
58
A function
A special register
A scalar fullselect
Another expression
For example:
... WHERE INTEGER(PRENDATE - PRSTDATE) > 100
When the order of evaluation is not specified by parentheses, the expression is evaluated in the
following order:
1. Prefix operators
2. Exponentiation
3. Multiplication, division, and concatenation
4. Addition and subtraction
Operators on the same precedence level are applied from left to right.
v A constant specifies a literal value for the expression. For example:
... WHERE 40000 < SALARY
SALARY names a column that is defined as a 9-digit packed decimal value (DECIMAL(9,2)). It is
compared to the numeric constant 40000.
v A host variable identifies a variable in an application program. For example:
... WHERE EMPNO = :EMP
v A special register identifies a special value defined by the database manager. For example:
... WHERE LASTNAME = USER
A search condition can specify many predicates separated by AND and OR. No matter how complex the
search condition, it supplies a TRUE or FALSE value when evaluated against a row. There is also an
unknown truth value, which is effectively false. That is, if the value of a row is null, this null value is not
returned as a result of a search because it is not less than, equal to, or greater than the value specified in
the search condition.
To fully understand the WHERE clause, you need to know the order SQL evaluates search conditions and
predicates, and compares the values of expressions. This topic is discussed in the DB2 for i SQL reference
topic collection.
Related concepts:
Using subqueries on page 120
You can use subqueries in a search condition as another way to select data. Subqueries can be used
anywhere an expression can be used.
Related reference:
Defining complex search conditions on page 70
In addition to the basic comparison predicates, such as = and >, a search condition can contain any of
these predicates: BETWEEN, IN, EXISTS, IS NULL, and LIKE.
Expressions
Comparison operators:
SQL supports several comparison operators.
Comparison operator
Description
<> or = or !=
=
Not equal to
Equal to
SQL programming
59
Comparison operator
Description
<
>
<= or > or !>
> = or < or !<
Less than
Greater than
Less than or equal to (or not greater than)
Greater than or equal to (or not less than)
NOT keyword:
You can precede a predicate with the NOT keyword to specify that you want the opposite of the
predicate's value (that is, TRUE if the predicate is FALSE).
NOT applies only to the predicate it precedes, not to all predicates in the WHERE clause. For example, to
indicate that you are interested in all employees except those working in the department C01, you can
say:
... WHERE NOT WORKDEPT = C01
GROUP BY clause
The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual
rows.
When you specify a GROUP BY clause, SQL divides the selected rows into groups such that the rows of
each group have matching values in one or more columns or expressions. Next, SQL processes each
group to produce a single-row result for the group. You can specify one or more columns or expressions
in the GROUP BY clause to group the rows. The items you specify in the SELECT statement are
properties of each group of rows, not properties of individual rows in a table or view.
Without a GROUP BY clause, the application of SQL aggregate functions returns one row. When GROUP
BY is used, the function is applied to each group, thereby returning as many rows as there are groups.
For example, the CORPDATA.EMPLOYEE table has several sets of rows, and each set consists of rows
describing members of a specific department. To find the average salary of people in each department,
you can issue:
SELECT WORKDEPT, DECIMAL (AVG(SALARY),5,0)
FROM CORPDATA.EMPLOYEE
GROUP BY WORKDEPT
AVG-SALARY
A00
40850
B01
41250
C01
29722
D11
25147
D21
25668
E01
40175
E11
21020
E21
24086
60
Notes:
1. Grouping the rows does not mean ordering them. Grouping puts each selected row in a
group, which SQL then processes to derive characteristics of the group. Ordering the rows
puts all the rows in the results table in ascending or descending collating sequence.
Depending on the implementation selected by the database manager, the resulting groups
might appear to be ordered.
2. If there are null values in the column you specify in the GROUP BY clause, a single-row result
is produced for the data in the rows with null values.
3. If the grouping occurs over character, or UCS-2 or UTF-16 graphic columns, the sort sequence
in effect when the query is run is applied to the grouping.
When you use GROUP BY, you list the columns or expressions you want SQL to use to group the rows.
For example, suppose that you want a list of the number of people working on each major project
described in the CORPDATA.PROJECT table. You can issue:
SELECT SUM(PRSTAFF), MAJPROJ
FROM CORPDATA.PROJECT
GROUP BY MAJPROJ
The result is a list of the company's current major projects and the number of people working on each
project.
SUM(PRSTAFF)
MAJPROJ
AD3100
AD3110
10
MA2100
MA2110
OP1000
OP2000
OP2010
32.5
You can also specify that you want the rows grouped by more than one column or expression. For
example, you can issue a select statement to find the average salary for men and women in each
department, using the CORPDATA.EMPLOYEE table. To do this, you can issue:
SELECT WORKDEPT, SEX, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES
FROM CORPDATA.EMPLOYEE
GROUP BY WORKDEPT, SEX
SEX
AVG_WAGES
A00
49625
A00
35000
B01
41250
C01
29722
D11
25817
D11
24764
D21
26933
D21
24720
SQL programming
61
WORKDEPT
SEX
AVG_WAGES
E01
40175
E11
22810
E11
16545
E21
25370
E21
23830
Because you did not include a WHERE clause in this example, SQL examines and processes all rows in
the CORPDATA.EMPLOYEE table. The rows are grouped first by department number and next (within
each department) by sex before SQL derives the average SALARY value for each group.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
ORDER BY clause on page 63
The ORDER BY clause specifies the particular order in which you want selected rows returned. The order
is sorted by ascending or descending collating sequence of a column's or an expression's value.
HAVING clause
The HAVING clause specifies a search condition for the groups selected by the GROUP BY clause.
The HAVING clause says that you want only those groups that satisfy the condition in that clause.
Therefore, the search condition you specify in the HAVING clause must test properties of each group
rather than properties of individual rows in the group.
The HAVING clause follows the GROUP BY clause and can contain the same kind of search condition as
you can specify in a WHERE clause. In addition, you can specify aggregate functions in a HAVING
clause. For example, suppose that you want to retrieve the average salary of women in each department.
To do this, use the AVG aggregate function and group the resulting rows by WORKDEPT and specify a
WHERE clause of SEX = F.
To specify that you want this data only when all the female employees in the selected department have
an education level equal to or greater than 16 (a college graduate), use the HAVING clause. The HAVING
clause tests a property of the group. In this case, the test is on MIN(EDLEVEL), which is a group
property:
SELECT WORKDEPT, DECIMAL(AVG(SALARY),5,0) AS AVG_WAGES, MIN(EDLEVEL) AS MIN_EDUC
FROM CORPDATA.EMPLOYEE
WHERE SEX=F
GROUP BY WORKDEPT
HAVING MIN(EDLEVEL)>=16
AVG_WAGES
MIN_EDUC
A00
49625
18
C01
29722
16
D11
25817
17
You can use multiple predicates in a HAVING clause by connecting them with AND and OR, and you
can use NOT for any predicate of a search condition.
62
Note: If you intend to update a column or delete a row, you cannot include a GROUP BY or HAVING
clause in the SELECT statement within a DECLARE CURSOR statement. These clauses make it a
read-only cursor.
Predicates with arguments that are not aggregate functions can be coded in either WHERE or HAVING
clauses. It is typically more efficient to code the selection criteria in the WHERE clause because it is
handled earlier in the query processing. The HAVING selection is performed in post processing of the
result table.
If the search condition contains predicates involving character, or UCS-2 or UTF-16 graphic columns, the
sort sequence in effect when the query is run is applied to those predicates.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Using a cursor on page 383
When SQL runs a SELECT statement, the resulting rows comprise the result table. A cursor provides a
way to access a result table.
ORDER BY clause
The ORDER BY clause specifies the particular order in which you want selected rows returned. The order
is sorted by ascending or descending collating sequence of a column's or an expression's value.
For example, to retrieve the names and department numbers of female employees listed in the
alphanumeric order of their department numbers, you can use this select-statement:
SELECT LASTNAME,WORKDEPT
FROM CORPDATA.EMPLOYEE
WHERE SEX=F
ORDER BY WORKDEPT
WORKDEPT
HAAS
A00
HEMMINGER
A00
KWAN
C01
QUINTANA
C01
NICHOLLS
C01
NATZ
C01
PIANKA
D11
SCOUTTEN
D11
LUTZ
D11
JOHN
D11
PULASKI
D21
JOHNSON
D21
PEREZ
D21
HENDERSON
E11
SCHNEIDER
E11
SETRIGHT
D11
SQL programming
63
LASTNAME
WORKDEPT
SCHWARTZ
E11
SPRINGER
E11
WONG
E21
If an AS clause is specified to name a result column in the select-list, this name can be specified in the
ORDER BY clause. The name specified in the AS clause must be unique in the select-list. For example, to
retrieve the full names of employees listed in alphabetic order, you can use this select-statement:
SELECT LASTNAME CONCAT FIRSTNME AS FULLNAME
FROM CORPDATA.EMPLOYEE
ORDER BY FULLNAME
Instead of naming the columns to order the results, you can use a number. For example, ORDER BY 3
specifies that you want the results ordered by the third column of the results table, as specified by the
select-list. Use a number to order the rows of the results table when the sequencing value is not a named
column.
You can also specify whether you want SQL to collate the rows in ascending (ASC) or descending (DESC)
sequence. An ascending collating sequence is the default. In the previous select-statement, SQL first
returns the row with the lowest FULLNAME expression (alphabetically and numerically), followed by
rows with higher values. To order the rows in descending collating sequence based on this name, specify:
... ORDER BY FULLNAME DESC
You can specify a secondary ordering sequence (or several levels of ordering sequences) as well as a
primary one. In the previous example, you might want the rows ordered first by department number,
and within each department, ordered by employee name. To do this, specify:
... ORDER BY WORKDEPT, FULLNAME
If character columns, or UCS-2 or UTF-16 graphic columns are used in the ORDER BY clause, ordering
for these columns is based on the sort sequence in effect when the query is run.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
GROUP BY clause on page 60
The GROUP BY clause allows you to find the characteristics of groups of rows rather than individual
rows.
64
DEPTNAME
ADMRDEPT
D01
DEVELOPMENT CENTER
A00
F22
BRANCH OFFICE F2
E01
G22
BRANCH OFFICE G2
E01
H22
BRANCH OFFICE H2
E01
SQL programming
65
DEPTNO
DEPTNAME
ADMRDEPT
I22
BRANCH OFFICE I2
E01
J22
BRANCH OFFICE J2
E01
To get the rows that do not have a null value for the manager number, you can change the WHERE
clause like this:
WHERE MGRNO IS NOT NULL
Another predicate that is useful for comparing values that can contain the NULL value is the DISTINCT
predicate. Comparing two columns using a normal equal comparison (COL1 = COL2) will be true if both
columns contain an equal non-null value. If both columns are null, the result will be false because null is
never equal to any other value, not even another null value. Using the DISTINCT predicate, null values
are considered equal. So COL1 is NOT DISTINCT from COL2 will be true if both columns contain an
equal non-null value and also when both columns are the null value.
For example, suppose that you want to select information from two tables that contain null values. The
first table T1 has a column C1 with the following values.
C1
2
1
null
C2
For more information about the use of null values, see the DB2 for i SQL reference topic collection.
66
|
|
Special registers
Contents
CURRENT CLIENT_ACCTNG
CLIENT ACCTNG
CURRENT CLIENT_APPLNAME
CLIENT APPLNAME
CURRENT CLIENT_PROGRAMID
CLIENT PROGRAMID
CURRENT CLIENT_USERID
CLIENT USERID
CURRENT CLIENT_WRKSTNNAME
CLIENT WRKSTNNAME
CURRENT DATE
CURRENT_DATE
CURRENT DEGREE
CURRENT PATH
CURRENT_PATH
CURRENT FUNCTION PATH
CURRENT SCHEMA
CURRENT SERVER
CURRENT_SERVER
CURRENT TIME
CURRENT_TIME
CURRENT TIMESTAMP
CURRENT_TIMESTAMP
CURRENT TIMEZONE
CURRENT_TIMEZONE
SESSION_USER
USER
SYSTEM_USER
If a single statement contains more than one reference to any of CURRENT DATE, CURRENT TIME, or
CURRENT TIMESTAMP special registers, or the CURDATE, CURTIME, or NOW scalar functions, all
values are based on a single clock reading.
For remotely run SQL statements, the values for special registers are determined at the remote system.
SQL programming
67
When a query over a distributed table references a special register, the contents of the special register on
the system that requests the query are used. For more information about distributed tables, see the DB2
Multisystem topic collection.
You can also use the CAST specification to cast data types directly:
SELECT CAST(BIRTHDATE AS CHAR(10))
FROM CORPDATA.EMPLOYEE
Related reference:
Casting between data types
68
The CURRENT TIMEZONE special register allows a local time to be converted to Universal Time
Coordinated (UTC). For example, if you have a table named DATETIME that contains a time column type
with a name of STARTT, and you want to convert STARTT to UTC, you can use the following statement:
SELECT STARTT - CURRENT TIMEZONE
FROM DATETIME
Date/time arithmetic:
Addition and subtraction are the only arithmetic operators applicable to date, time, and timestamp
values.
You can increment and decrement a date, time, or timestamp by a duration; or subtract a date from a
date, a time from a time, or a timestamp from a timestamp.
Related reference:
Datetime arithmetic in SQL
The ROW CHANGE TOKEN expression can be used for both tables that have a row change timestamp
and tables that do not. It represents a modification point for a row. If a table has a row change
timestamp, it is derived from the timestamp. If a table does not have a row change timestamp, it is based
on an internal modification time that is not row-based, so it is not as accurate as for a table that has a
row change timestamp.
DISTINCT means that you want to select only the unique rows. If a selected row duplicates another row
in the result table, the duplicate row is ignored (it is not put into the result table). For example, suppose
you want a list of employee job codes. You do not need to know which employee has what job code.
Because it is probable that several people in a department have the same job code, you can use
DISTINCT to ensure that the result table has only unique values.
The following example shows how to do this:
SQL programming
69
If you do not include DISTINCT in a SELECT clause, you might find duplicate rows in your result,
because SQL returns the JOB column's value for each row that satisfies the search condition. Null values
are treated as duplicate rows for DISTINCT.
If you include DISTINCT in a SELECT clause and you also include a shared-weight sort sequence, fewer
values might be returned. The sort sequence causes values that contain the same characters to be
weighted the same. If 'MGR', 'Mgr', and 'mgr' are all in the same table, only one of these values is
returned.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
The BETWEEN keyword is inclusive. A more complex, but explicit, search condition that produces the
same result is:
... WHERE HIREDATE >= 1987-01-01 AND HIREDATE <= 1987-12-31
v IN says you are interested in rows in which the value of the specified expression is among the values
you listed. For example, to find the names of all employees in departments A00, C01, and E21, you can
specify:
... WHERE WORKDEPT IN (A00, C01, E21)
v EXISTS says you are interested in testing for the existence of certain rows. For example, to find out if
there are any employees that have a salary greater than 60000, you can specify:
EXISTS (SELECT * FROM EMPLOYEE WHERE SALARY > 60000)
v IS NULL says that you are interested in testing for null values. For example, to find out if there are
any employees without a phone listing, you can specify:
... WHERE EMPLOYEE.PHONE IS NULL
70
v LIKE says you are interested in rows in which an expression is similar to the value you supply. When
you use LIKE, SQL searches for a character string similar to the one you specify. The degree of
similarity is determined by two special characters used in the string that you include in the search
condition:
_
A percent sign stands for an unknown string of 0 or more characters. If the percent sign starts
the search string, then SQL allows 0 or more character(s) to precede the matching value in the
column. Otherwise, the search string must begin in the first position of the column.
Note: If you are operating on MIXED data, the following distinction applies: an SBCS underline
character refers to one SBCS character. No such restriction applies to the percent sign; that is, a
percent sign refers to any number of SBCS or DBCS characters. See the DB2 for i SQL reference
topic collection for more information about the LIKE predicate and MIXED data.
Use the underline character or percent sign either when you do not know or do not care about all the
characters of the column's value. For example, to find out which employees live in Minneapolis, you
can specify:
... WHERE ADDRESS LIKE %MINNEAPOLIS%
SQL returns any row with the string MINNEAPOLIS in the ADDRESS column, no matter where the
string occurs.
In another example, to list the towns whose names begin with 'SAN', you can specify:
... WHERE TOWN LIKE SAN%
If you want to find any addresses where the street name isn't in your master street name list, you can
use an expression in the LIKE expression. In this example, the STREET column in the table is assumed
to be upper case.
... WHERE UCASE (:address_variable) NOT LIKE %||STREET||%
If you want to search for a character string that contains either the underscore or percent character, use
the ESCAPE clause to specify an escape character. For example, to see all businesses that have a
percent in their name, you can specify:
... WHERE BUSINESS_NAME LIKE %@%% ESCAPE @
The first and last percent characters in the LIKE string are interpreted as the normal LIKE percent
characters. The combination '@%' is taken as the actual percent character.
Related concepts:
Using subqueries on page 120
You can use subqueries in a search condition as another way to select data. Subqueries can be used
anywhere an expression can be used.
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Specifying a search condition using the WHERE clause on page 57
The WHERE clause specifies a search condition that identifies the row or rows that you want to retrieve,
update, or delete.
Expressions in the WHERE clause on page 58
An expression in a WHERE clause names or specifies something that you want to compare to something
else.
Predicates
Special considerations for LIKE:
Here are some considerations for using the LIKE predicate.
SQL programming
71
v When host variables are used in place of string constants in a search pattern, you should consider
using varying length host variables. This allows you to:
Assign previously used string constants to host variables without any changes.
Obtain the same selection criteria and results as if a string constant were used.
v When fixed-length host variables are used in place of string constants in a search pattern, you should
ensure that the value specified in the host variable matches the pattern previously used by the string
constants. All characters in a host variable that are not assigned a value are initialized with a blank.
For example, if you do a search using the string pattern 'ABC%' in a varying length host variable, these
are some of the values that can be returned:
ABCD
ABCDE
ABCxxx
ABC
However, if you do a search using the search pattern 'ABC%' contained in a host variable with a fixed
length of 10, these values can be returned, assuming that the column has a length of 12:
ABCDE
ABCD
ABCxxx
ABC
Note: All returned values start with 'ABC' and end with at least 6 blanks. Blanks are used because the
last 6 characters in the host variable are not assigned a specific value.
If you want to do a search using a fixed-length host variable where the last 7 characters can be
anything, search for 'ABC%%%%%%%'. These are some of the values that can be returned:
ABCDEFGHIJ
ABCXXXXXXX
ABCDE
ABCDD
v OR says that, for a row to qualify, the row can satisfy the condition set by either or both predicates of
the search condition. For example, to find out which employees are in either department C01 or D11,
you can specify :
...
WHERE WORKDEPT = C01 OR WORKDEPT = D11
Note: You can also use IN to specify this request: WHERE WORKDEPT IN (C01, D11).
v NOT says that, to qualify, a row must not meet the criteria set by the search condition or predicate that
follows the NOT. For example, to find all employees in the department E11 except those with a job
code equal to analyst, you can specify:
...
WHERE WORKDEPT = E11 AND NOT JOB = ANALYST
When SQL evaluates search conditions that contain these connectors, it does so in a specific order. SQL
first evaluates the NOT clauses, next evaluates the AND clauses, and then the OR clauses.
You can change the order of evaluation by using parentheses. The search conditions enclosed in
parentheses are evaluated first. For example, to select all employees in departments E11 and E21 who
have education levels greater than 12, you can specify:
72
...
WHERE EDLEVEL > 12 AND
(WORKDEPT = E11 OR WORKDEPT = E21)
The parentheses determine the meaning of the search condition. In this example, you want all rows that
have a:
v WORKDEPT value of E11 or E21, and
v EDLEVEL value greater than 12
If you did not use parentheses:
...
WHERE EDLEVEL > 12 AND WORKDEPT = E11
OR WORKDEPT = E21
Your result is different. The selected rows are rows that have:
v WORKDEPT = E11 and EDLEVEL > 12, or
v WORKDEPT = E21, regardless of the EDLEVEL value
If you are combining multiple equal comparisons, you can write the predicate with the ANDs as shown
in the following example:
...
WHERE WORKDEPT = E11 AND EDLEVEL = 12 AND JOB = CLERK
When two lists are used, the first item in the first list is compared to the first item in the second list, and
so on through both lists. Thus, each list must contain the same number of entries. Using lists is identical
to writing the query with AND. Lists can only be used with the equal and not equal comparison
operators.
Related reference:
Specifying a search condition using the WHERE clause on page 57
The WHERE clause specifies a search condition that identifies the row or rows that you want to retrieve,
update, or delete.
SQL programming
73
SALARY
RANK
DENSE_RANK
ROW_NUMBER
000010
52,750.00
000110
46,500.00
200010
46,500.00
000020
41,250.00
000050
40,175.00
000030
38,250.00
000070
36,170.00
000060
32,250.00
000220
29,840.00
200220
29,840.00
10
In this example, the SALARY descending order with the top 10 returned. The RANK column shows the
relative ranking of each salary. Notice that there are two rows with the same salary at position 2. Each of
those rows is assigned the same rank value. The following row is assigned the value of 4. RANK returns
a value for a row that is one more than the total number of rows that precede that row. There are gaps in
the numbering sequence whenever there are duplicates.
In contrast, the DENSE_RANK column shows a value of 3 for the row directly after the duplicate rows.
DENSE_RANK returns a value for a row that is one more than the number of distinct row values that
precede it. There will never be gaps in the numbering sequence.
ROW_NUMBER returns a unique number for each row. For rows that contain duplicate values according
to the specified ordering, the assignment of a row number is arbitrary; the row numbers could be
assigned in a different order for the duplicate rows when the query is run another time.
AVERAGE
AVG_SALARY
B01
41,250
A00
40,850
E01
40,175
C01
29,722
D21
25,668
D11
25,147
E21
24,086
E11
21,020
74
WORKDEPT
BONUS
BONUS_RANK_in_DEPT
GEYER
E01
800.00
HENDERSON
E11
600.00
SCHNEIDER
E11
500.00
SCHWARTZ
E11
500.00
SMITH
E11
400.00
PARKER
E11
300.00
SETRIGHT
E11
300.00
SPRINGER
E11
300.00
SPENSER
E21
500.00
LEE
E21
500.00
GOUNOT
E21
500.00
WONG
E21
500.00
ALONZO
E21
500.00
MENTA
E21
400.00
EMPNO
SALARY
DEPTNO
DEPTNAME
000010
52,750.00
A00
SPIFFY COMPUTER
SERVICE DIV.
SQL programming
75
EMPNO
SALARY
DEPTNO
DEPTNAME
000110
46,500.00
A00
SPIFFY COMPUTER
SERVICE DIV.
200010
46,500.00
A00
SPIFFY COMPUTER
SERVICE DIV.
000020
41,250.00
B01
PLANNING
000050
40,175.00
E01
SUPPORT SERVICES
76
the AND keyword. Any additional conditions that do not relate to the actual join are specified in either
the WHERE clause or as part of the actual join in the ON clause.
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE INNER JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > S
In this example, the join is done on the two tables using the EMPNO and RESPEMP columns from the
tables. Since only employees that have last names starting with at least 'S' are to be returned, this
additional condition is provided in the WHERE clause.
This query returns the following output.
EMPNO
LASTNAME
PROJNO
000250
SMITH
AD3112
000060
STERN
MA2110
000100
SPENSER
OP2010
000020
THOMPSON
PL2100
The syntax in this statement is valid and equivalent to the join condition in the following statement:
SELECT EMPNO, ACSTDATE
FROM CORPDATA.PROJACT INNER JOIN CORPDATA.EMPPROJACT
ON CORPDATA.PROJACT.PROJNO = CORPDATA.EMPPROJACT.PROJNO AND
CORPDATA.PROJACT.ACTNO = CORPDATA.EMPPROJACT.ACTNO
WHERE ACSTDATE > 1982-12-31;
77
Suppose you want to find all employees and the projects they are currently responsible for. You want to
see those employees that are not currently in charge of a project as well. The following query will return
a list of all employees whose names are greater than 'S', along with their assigned project numbers.
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.EMPLOYEE LEFT OUTER JOIN CORPDATA.PROJECT
ON EMPNO = RESPEMP
WHERE LASTNAME > S
The result of this query contains some employees that do not have a project number. They are listed in
the query, but have the null value returned for their project number.
EMPNO
LASTNAME
PROJNO
000020
THOMPSON
PL2100
000060
STERN
MA2110
000100
SPENSER
OP2010
000170
YOSHIMURA
000180
SCOUTTEN
000190
WALKER
000250
SMITH
AD3112
000280
SCHNEIDER
000300
SMITH
000310
SETRIGHT
200170
YAMAMOTO
200280
SCHWARTZ
200310
SPRINGER
200330
WONG
Note: Using the RRN scalar function to return the relative record number for a column in the table on
the right in a left outer join or exception join will return a value of 0 for the unmatched rows.
Right outer join:
A right outer join returns all the rows that an inner join returns plus one row for each of the other rows
in the second table that do not have a match in the first table. It is the same as a left outer join with the
tables specified in the opposite order.
The query that was used as the left outer join example can be rewritten as a right outer join as follows:
SELECT EMPNO, LASTNAME, PROJNO
FROM CORPDATA.PROJECT RIGHT OUTER JOIN CORPDATA.EMPLOYEE
ON EMPNO = RESPEMP
WHERE LASTNAME > S
The results of this query are identical to the results from the left outer join query.
Exception join:
A left exception join returns only the rows from the first table that do not have a match in the second
table.
Using the same tables as before, return those employees that are not responsible for any projects.
78
LASTNAME
PROJNO
000170
YOSHIMURA
000180
SCOUTTEN
000190
WALKER
000280
SCHNEIDER
000300
SMITH
000310
SETRIGHT
200170
YAMAMOTO
200280
SCHWARTZ
200310
SPRINGER
200330
WONG
An exception join can also be written as a subquery using the NOT EXISTS predicate. The previous query
can be rewritten in the following way:
SELECT EMPNO, LASTNAME
FROM CORPDATA.EMPLOYEE
WHERE LASTNAME > S
AND NOT EXISTS
(SELECT * FROM CORPDATA.PROJECT
WHERE EMPNO = RESPEMP)
The only difference in this query is that it cannot return values from the PROJECT table.
There is a right exception join, too, that works just like a left exception join but with the tables reversed.
Cross join:
A cross join, also known as a Cartesian Product join, returns a result table where each row from the first
table is combined with each row from the second table.
The number of rows in the result table is the product of the number of rows in each table. If the tables
involved are large, this join can take a very long time.
A cross join can be specified in two ways: using the JOIN syntax or by listing the tables in the FROM
clause separated by commas without using a WHERE clause to supply join criteria.
Suppose that the following tables exist.
Table 16. Table A
ACOL1
ACOL2
A1
AA1
A2
AA2
A3
AA3
SQL programming
79
BCOL2
B1
BB1
B2
BB2
The result table for either of these SELECT statements looks like this.
ACOL1
ACOL2
BCOL1
BCOL2
A1
AA1
B1
BB1
A1
AA1
B2
BB2
A2
AA2
B1
BB1
A2
AA2
B2
BB2
A3
AA3
B1
BB1
A3
AA3
B2
BB2
Because there are no projects without an assigned employee, the query returns the same rows as a left
outer join. Here are the results.
EMPNO
LASTNAME
PROJNO
000020
THOMPSON
PL2100
000060
STERN
MA2110
000100
SPENSER
OP2010
000170
YOSHIMURA
000180
SCOUTTEN
000190
WALKER
000250
SMITH
AD3112
000280
SCHNEIDER
000300
SMITH
000310
SETRIGHT
200170
YAMAMOTO
80
EMPNO
LASTNAME
PROJNO
200280
SCHWARTZ
200310
SPRINGER
200330
WONG
LASTNAME
DEPTNAME
PROJNO
000020
THOMPSON
PLANNING
PL2100
000060
STERN
MANUFACTURING SYSTEMS
MA2110
000100
SPENSER
SOFTWARE SUPPORT
OP2010
000170
YOSHIMURA
MANUFACTURING SYSTEMS
000180
SCOUTTEN
MANUFACTURING SYSTEMS
000190
WALKER
MANUFACTURING SYSTEMS
000250
SMITH
ADMINISTRATION SYSTEMS
AD3112
000280
SCHNEIDER
OPERATIONS
000300
SMITH
OPERATIONS
000310
SETRIGHT
OPERATIONS
SQL programming
81
FROM CORPDATA.EMPLOYEE E1
GROUP BY WORKDEPT) T2
WHERE T1.DEPTNO = T2.WORKDEPT
ORDER BY DEPTNO
DEPTNO
MAXSAL
000010
A00
52750.00
000020
B01
41250.00
000030
C01
38250.00
000060
D11
32250.00
000070
D21
36170.00
000050
E01
40175.00
000090
E11
29750.00
000100
E21
26150.00
Common table expressions can be specified before the full-select in a SELECT statement, an INSERT
statement, or a CREATE VIEW statement. They can be used when the same result table needs to be
shared in a full-select. Common table expressions are preceded with the keyword WITH.
For example, suppose you want a table that shows the minimum and maximum of the average salary of
a certain set of departments. The first character of the department number has some meaning and you
want to get the minimum and maximum for those departments that start with the letter 'D' and those
that start with the letter 'E'. You can use a common table expression to select the average salary for each
department. Again, you must name the derived table; in this case, the name is DT. You can then specify a
SELECT statement using a WHERE clause to restrict the selection to only the departments that begin with
a certain letter. Specify the minimum and maximum of column AVGSAL from the derived table DT.
Specify a UNION to get the results for the letter 'E' and the results for the letter 'D'.
WITH DT AS (SELECT E.WORKDEPT AS DEPTNO, AVG(SALARY) AS AVGSAL
FROM CORPDATA.DEPARTMENT D , CORPDATA.EMPLOYEE E
WHERE D.DEPTNO = E.WORKDEPT
GROUP BY E.WORKDEPT)
SELECT E, MAX(AVGSAL), MIN(AVGSAL) FROM DT
WHERE DEPTNO LIKE E%
UNION
SELECT D, MAX(AVGSAL), MIN(AVGSAL) FROM DT
WHERE DEPTNO LIKE D%
MIN(AVGSAL)
40175.00
21020.00
25668.57
25147.27
Suppose that you want to write a query against your ordering database that will return the top 5 items
(in total quantity ordered) within the last 1000 orders from customers who also ordered item 'XXX'.
WITH X AS (SELECT ORDER_ID, CUST_ID
FROM ORDERS
ORDER BY ORD_DATE DESC
FETCH FIRST 1000 ROWS ONLY),
Y AS (SELECT CUST_ID, LINE_ID, ORDER_QTY
FROM X, ORDERLINE
WHERE X.ORDER_ID = ORDERLINE.ORDER_ID)
82
SELECT LINE_ID
FROM (SELECT LINE_ID
FROM Y
WHERE Y.CUST_ID IN (SELECT DISTINCT CUST_ID
FROM Y
WHERE LINE.ID = XXX )
GROUP BY LINE_ID
ORDER BY SUM(ORDER_QTY) DESC)
FETCH FIRST 5 ROWS ONLY
The first common table expression (X) returns the most recent 1000 order numbers. The result is ordered
by the date in descending order and then only the first 1000 of those ordered rows are returned as the
result table.
The second common table expression (Y) joins the most recent 1000 orders with the line item table and
returns (for each of the 1000 orders) the customer, line item, and quantity of the line item for that order.
The derived table in the main select statement returns the line items for the customers who are in the top
1000 orders who ordered item XXX. The results for all customers who ordered XXX are then grouped by
the line item and the groups are ordered by the total quantity of the line item.
Finally, the outer select selects only the first 5 rows from the ordered list that the derived table returned.
Some applications work with data that is recursive in nature. To query this type of data, you can use a
hierarchical query or a recursive common table expression.
One example of recursive data is a Bill of Materials (BOM) application that works with the expansion of
parts and its component subparts. For example, a chair might be made of a seat unit and a leg assembly.
The seat unit might consist of a seat and two arms. Each of these parts can be further broken down into
its subparts until there is a list of all the parts needed to build a chair.
|
|
|
|
DB2 for i provides two ways of defining a recursive query. The first one is called a hierarchical query
which uses the CONNECT BY clause to define how a parent row is to be associated with its child rows.
The second method is to use a recursive common table expression. This uses a common table expression
to define the first, or seed, rows and then uses a UNION to define how the child rows are determined.
|
|
|
|
Each of these methods of defining a recursive query has advantages and disadvantages. The CONNECT
BY syntax is much simpler to understand, but has fewer ways to derive the data in its query. CONNECT
BY can be specified in any subselect anywhere in a query. A recursive common table expression has more
options for how the union is defined to generate the child rows.
|
|
|
|
|
|
|
|
|
There are a couple of behavioral differences between a connect by recursive query and a recursive
common table expression query. First, they differ in how they handle cyclic data. This difference is
discussed in the examples. Second, connect by allows a sort among siblings. This is also shown in the
examples. Finally, the two implementations differ in how the data is put on a queue that is used to
implement the recursion. By default a recursive common table expression's data tends to come out in
breadth first order, first in first out. With connect by, the order is designed to come out depth first. This
means that rows in a recursive step immediately follow their parent row. The recursive common table
expression syntax gives you a choice of depth or breadth first hierarchical order by adding the SEARCH
clause. The connect by syntax is always depth first.
In the trip planner examples for these recursive methods, airline flights and train connections are used to
find transportation paths between cities. The following table definitions and data are used in the
examples.
SQL programming
83
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
FLIGHTS
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
INTO
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
FLIGHTSTATS
84
|
|
|
|
ORIGIN
DEPARTURE
ARRIVAL
FLIGHT_COUNT
Chicago
Chicago
Miami
Chicago
Miami
Lima
Chicago
Chicago
Frankfurt
Chicago
Frankfurt
Vienna
Chicago
Frankfurt
Beijing
Chicago
Frankfurt
Moscow
Chicago
Moscow
Tokyo
|
|
Chicago
Tokyo
Hawaii
|
|
|
|
|
|
|
|
|
There are several parts to this hierarchical query. There is an initial selection which defines the initial seed
for the recursion. In this case, it is the rows from the flights table that START WITH a departure from
Chicago'. The CONNECT BY clause is used to define how the rows that have already been generated are
to be 'connected' to generate more rows for subsequent iterations of the query. The PRIOR unary operator
tells DB2 how to select a new row based on the results of the previous row. The recursive join column
(typically one column but could have several) selected by the result of the START WITH clause is
referenced by the PRIOR keyword. This means that the previous row's ARRIVAL city becomes the new
row's PRIOR value for the DEPARTURE city. This is encapsulated in the clause CONNECT BY PRIOR
arrival = departure.
|
|
|
|
|
|
There are two other connect by features illustrated in this example query. The unary operator
CONNECT_BY_ROOT is used to define a fixed expression value that is determined in the initialization
step and is the same for all the generated recursive result rows. Typically, it is your starting value for that
particular iteration as you might have multiple START WITH values. In this query, it defines in the result
set the ORIGIN of the different destination options from Chicago. If the START WITH clause selected
multiple cities, ORIGIN would indicate which city a row used as its start value.
|
|
|
LEVEL is one of three pseudo columns available when using connect by recursion. The value of LEVEL
reflects the recursion level of the current row. In this example, LEVEL also reflects the number of flights it
would take to get from the city of ORIGIN (Chicago) to the different ARRIVAL cities.
|
|
|
|
|
|
A hierarchical query is run just like its equivalent recursive common table expression query and generates
the same result set. See Using recursive common table expressions and recursive views on page 90. The
only difference is in the order of the returned rows. The connect by query returns the rows in depth first
order; every row of the result set immediately follows its parent row. The recursive common table
expression query returns the rows in breadth first order; all the rows for one level are returned, then all
the rows that were generated from the previous level are returned.
|
|
Now, suppose you start in Chicago but want to add in transportation options by rail in addition to flights
and you want to know which cities you can get to and how many connections it would take.
|
|
The following connect by query returns that information. Note that in the corresponding recursive
common table expression example, Example: Two tables used for recursion using recursive common
SQL programming
85
|
|
|
|
|
|
|
|
|
|
|
table expressions on page 92, we can also distinguish between the number of rail vs number of airline
connections and sum the ongoing ticket cost of the connections to that destination. Those calculations are
examples of derivations allowed using the more complex but more flexible recursive common table
expression syntax. That capability is not available when using the connect by syntax.
SELECT CONNECT_BY_ROOT departure AS departure, arrival, LEVEL - 1 connections
FROM
( SELECT departure, arrival FROM flights
UNION
SELECT departure, arrival FROM trains) t
START WITH departure = Chicago
CONNECT BY PRIOR arrival = departure;
DEPARTURE
ARRIVAL
CONNECTIONS
Chicago
Miami
Chicago
Lima
Chicago
Frankfurt
Chicago
Vienna
Chicago
Beijing
Chicago
Moscow
Chicago
Tokyo
Chicago
Hawaii
Chicago
Washington
Chicago
Boston
|
|
Chicago
Toronto
| In this example, there are two data sources feeding the recursion, a list of flights and a list of trains. In
| the final results, you see how many connections are needed to travel between the cities.
| Example: Sibling ordering using CONNECT BY
|
|
|
|
|
|
|
|
|
|
|
One of the drawbacks of recursive common table expressions is that you cannot order the results among
siblings based on a particular column value. You can do this with connect by. For example, if you want to
output destinations from New York but you also want to order your hierarchical data among siblings by
a certain value, such as the cost of a ticket to that destination, you can do that by specifying the ORDER
SIBLINGS BY clause.
SELECT CONNECT_BY_ROOT departure AS origin, departure, arrival,
LEVEL level, price ticket_price
FROM flights
START WITH departure = New York
CONNECT BY PRIOR arrival = departure
ORDER SIBLINGS BY price ASC
ORIGIN
DEPARTURE
ARRIVAL
LEVEL
TICKET_PRICE
New York
New York
LA
330
New York
LA
Tokyo
530
New York
Tokyo
Hawaii
330
86
ORIGIN
DEPARTURE
ARRIVAL
LEVEL
TICKET_PRICE
New York
New York
London
350
New York
London
Athens
340
New York
Athens
Nicosia
280
New York
New York
Paris
400
New York
Paris
Rome
340
New York
Paris
Madrid
380
|
|
New York
Paris
Cairo
480
|
|
|
|
|
|
The result table shows all the destinations possible from the origin city of New York. All sibling
destinations (those destinations that originate from the same departure city) are output sorted by ticket
price. For example, the destinations from Paris are Rome, Madrid and Cairo; they are output ordered by
ascending ticket price. Note that the output shows New York to LA as the first destination directly from
New York because it has a less expensive ticket price (330) than did the direct connects to London or
Paris which are 350 and 400 respectively.
|
|
|
|
The key to any recursive process, whether is it a recursive program or a recursive query, is that the
recursion must be finite. If not, you will get into a never ending loop. CONNECT BY is unlike recursive
common table expressions in that it always checks for infinite recursion and terminates that cycle
automatically so you never have to worry about a runaway query.
|
|
By default, if connect by encounters cyclic data, it will issue an SQL error, SQ20451: Cycle detected in
hierarchical query. This error causes termination of the query so no results are returned.
|
|
If you want results back and just want the infinite cycle to stop, you can specify the NOCYCLE keyword
on the CONNECT BY clause. This means no error will be issued for cyclic data.
|
|
Using the NOCYCLE option along with the CONNECT_BY_ISCYCLE pseudo column is a way you can
find cyclic data and correct the data if desired.
|
|
|
Inserting the following row into the FLIGHTS table results in potentially infinite recursion since Paris
goes to Cairo and Cairo goes to Paris.
|
|
|
|
|
|
|
|
|
|
The following query illustrates the tolerance of the cyclic data by specifying NOCYCLE. In addition, the
CONNECT_BY_ISCYCLE pseudo column is used to identify cyclic rows and the function
SYS_CONNECT_BY_PATH is used to build an Itinerary string of all the connection cities leading up to
the destination. SYS_CONNECT_BY_PATH is implemented as a CLOB data type so you have a large
result column to reflect deep recursions.
SQL programming
87
ORIGIN
ARRIVAL
ITINERARY
CYCLIC
New York
Paris
: Paris
New York
Rome
: Paris : Rome
New York
Cairo
: Paris : Cairo
New York
Paris
New York
Madrid
: Paris : Madrid
New York
London
: London
New York
Athens
: London : Athens
New York
Nicosia
New York
LA
: LA
New York
Tokyo
: LA : Tokyo
|
|
New York
Hawaii
: LA : Tokyo : Hawaii
| Note that the result set row that reflects cyclic data often depends on where you start in the cycle with
| the START WITH clause.
| Example: Pseudo column CONNECT_BY_ISLEAF in CONNECT BY
| There may be times when processing recursive data that you may want to know which rows result in no
| further recursion. In other words, which rows are leaf rows or have no children in the hierarchy.
|
|
|
|
|
|
|
|
|
In the following query, you can find out which destinations are final destinations; in other words, which
destinations have no outbound flights. The CONNECT_BY_ISLEAF pseudo column will be 0 if it is not a
leaf and 1 if it is. You can also specify CONNECT_BY_ISLEAF in a WHERE predicate to see only leaf
rows.
SELECT CONNECT_BY_ROOT departure AS origin, arrival,
SYS_CONNECT_BY_PATH(TRIM(arrival), : ) itinerary, CONNECT_BY_ISLEAF leaf
FROM flights
START WITH departure = New York
CONNECT BY PRIOR arrival = departure;
ORIGIN
ARRIVAL
ITINERARY
LEAF
New York
Paris
: Paris
New York
Rome
: Paris : Rome
New York
Cairo
: Paris : Cairo
New York
Madrid
: Paris : Madrid
New York
London
: London
New York
Athens
: London : Athens
New York
Nicosia
New York
LA
: LA
New York
Tokyo
: LA : Tokyo
|
|
New York
Hawaii
: LA : Tokyo : Hawaii
88
|
|
Often times the hierarchical nature of your data is reflected in one table but you need to join those results
to additional tables to fully determine the output of the row.
|
|
|
|
|
In a connect by query you can use any type of join supported by DB2 for i including INNER JOIN, LEFT
OUTER JOIN, and LEFT EXCEPTION JOIN. When you explicitly use a JOIN clause, the predicate
specified in the ON clause is applied first, before the connect by operation, and any WHERE clause in the
connect by query is applied after the recursion. The WHERE selection is applied after the connect by so
that the recursive process results don't end too soon.
|
|
|
|
|
|
|
|
In the following query, you are looking for all the flight connections starting in New York that have an
ON_TIME_PERCENT greater than 90%.
ORIGIN
DEPARTURE
ARRIVAL
FLIGHT#
ONTIME
New York
Paris
Cairo
63
90.50
New York
Paris
Madrid
3256
92.00
New York
London
Athens
247
91.00
New York
Athens
Nicosia
2356
91.00
New York
LA
Tokyo
824
93.00
|
|
New York
Tokyo
Hawaii
94
92.00
|
|
|
|
|
|
|
|
This query can also be expressed without using the JOIN syntax. The query optimizer will pull out of the
WHERE clause those predicates that are join predicates to be processed first and leave any remaining
WHERE predicates to be evaluated after the recursion.
|
|
|
In this second example, if the WHERE predicates are more complex, you may need to aid the optimizer
by explicitly pulling out the JOIN predicates between the flights and flightstats tables and using both an
ON clause and a WHERE clause.
|
|
|
|
|
|
|
|
|
If you want additional search conditions to be applied as part of the recursion process, for example you
never want to take a flight with an on time percentage of less than 90%, you can also control the join
results by putting the join in a derived table with a join predicate and a WHERE clause.
Another option is to put the selection predicates in the START WITH and CONNECT BY clauses.
SQL programming
89
DEPARTURE
ARRIVAL
FLIGHT_COUNT
Chicago
Chicago
Miami
Chicago
Chicago
Frankfurt
Chicago
Miami
Lima
Chicago
Frankfurt
Moscow
Chicago
Frankfurt
Beijing
Chicago
Frankfurt
Vienna
Chicago
Moscow
Tokyo
Chicago
Tokyo
Hawaii
This recursive query is written in two parts. The first part of the common table expression is called the
intialization fullselect. It selects the first rows for the result set of the common table expression. In this
example, it selects the two rows in the flights table that get you directly to another location from Chicago.
It also initializes the number of flight legs to one for each row it selects.
The second part of the recursive query joins the rows from the current result set of the common table
expression with other rows from the original table. It is called the iterative fullselect. This is where the
recursion is introduced. Notice that the rows that have already been selected for the result set are
referenced by using the name of the common table expression as the table name and the common table
expression result column names as the column names.
In this recursive part of the query, any rows from the original table that you can get to from each of the
previously selected arrival cities are selected. A previously selected row's arrival city becomes the new
departure city. Each row from this recursive select increments the flight count to the destination by one
more flight. As these new rows are added to the common table expression result set, they are also fed
into the iterative fullselect to generate more result set rows. In the data for the final result, you can see
that the total number of flights is actually the total number of recursive joins (plus 1) it took to get to that
arrival city.
90
A recursive view looks very similar to a recursive common table expression. You can write the previous
recursive common table expression as a recursive view like this:
CREATE VIEW destinations (origin, departure, arrival, flight_count) AS
SELECT departure, departure, arrival, 1
FROM flights
WHERE departure = Chicago
UNION ALL
SELECT r.origin, b.departure, b.arrival, r.flight_count + 1
FROM destinations r, flights b
WHERE r.arrival = b.departure)
|
|
|
The iterative fullselect part of this view definition refers to the view itself. Selection from this view
returns the same rows as you get from the previous recursive common table expression. For comparison,
note that connect by recursion is allowed anywhere a SELECT is allowed, so it can easily be included in a
view definition.
ARRIVAL
CONNECTIONS
COST
Chicago
Miami
300
Chicago
Frankfurt
480
New York
Paris
400
New York
London
350
New York
Los Angeles
330
Chicago
Lima
830
Chicago
Moscow
1,060
Chicago
Beijing
960
Chicago
Vienna
680
New York
Madrid
780
New York
Cairo
880
New York
Rome
740
New York
Athens
690
New York
Tokyo
860
Chicago
Tokyo
1,740
New York
Nicosia
970
SQL programming
91
ARRIVAL
CONNECTIONS
COST
New York
Hawaii
1,190
Chicago
Hawaii
2,070
For each returned row, the results show the starting departure city and the final destination city. It counts
the number of connections needed rather than the total number of flight and adds up the total cost for all
the flights.
Example: Two tables used for recursion using recursive common table expressions
Now, suppose you start in Chicago but add in transportation by railway in addition to the airline flights,
and you want to know which cities you can go to.
The following query returns that information:
WITH destinations (departure, arrival, connections, flights, trains, cost) AS
(SELECT f.departure, f.arrival, 0, 1, 0, price
FROM flights f
WHERE f.departure = Chicago
UNION ALL
SELECT t.departure, t.arrival, 0, 0, 1, price
FROM trains t
WHERE t.departure = Chicago
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1 , r.flights + 1, r.trains,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure
UNION ALL
SELECT r.departure, c.arrival, r.connections + 1 ,
r.flights, r.trains + 1, r.cost + c.price
FROM destinations r, trains c
WHERE r.arrival = c.departure)
SELECT departure, arrival, connections, flights, trains, cost
FROM destinations
ARRIVAL
CONNECTIONS
FLIGHTS
TRAINS
COST
Chicago
Miami
300
Chicago
Frankfurt
480
Chicago
Washington
90
Chicago
Lima
830
Chicago
Moscow
1,060
Chicago
Beijing
960
Chicago
Vienna
680
Chicago
Toronto
340
Chicago
Boston
140
Chicago
Tokyo
1,740
Chicago
Hawaii
2,070
92
In this example, there are two parts of the common table expression that provide initialization values to
the query: one for flights and one for trains. For each of the result rows, there are two recursive
references to get from the previous arrival location to the next possible destination: one for continuing by
air, the other for continuing by train. In the final results, you would see how many connections are
needed and how many airline or train trips can be taken.
Example: DEPTH FIRST and BREADTH FIRST options for recursive common table
expressions
The two examples here show the difference in the result set row order based on whether the recursion is
processed depth first or breadth first.
Note: The search clause is not supported directly for recursive views. You can define a view that contains
a recursive common table expression to get this function.
The option to determine the result using breadth first or depth first is a recursive relationship sort based
on the recursive join column specified for the SEARCH BY clause. When the recursion is handled breadth
first, all children are processed first, then all grandchildren, then all great grandchildren. When the
recursion is handled depth first, the full recursive ancestry chain of one child is processed before going to
the next child.
In both of these cases, you specify an extra column name that is used by the recursive process to keep
track of the depth first or breadth first ordering. This column must be used in the ORDER BY clause of
the outer query to get the rows back in the specified order. If this column is not used in the ORDER BY,
the DEPTH FIRST or BREADTH FIRST processing option is ignored.
The selection of which column to use for the SEARCH BY column is important. To have any meaning in
the result, it must be the column that is used in the iterative fullselect to join from the initialization
fullselect. In this example, ARRIVAL is the column to use.
The following query returns that information:
WITH destinations (departure, arrival, connections, cost) AS
(SELECT f.departure, f.arrival, 0, price
FROM flights f
WHERE f.departure = Chicago
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure)
SEARCH DEPTH FIRST BY arrival SET ordcol
SELECT *
FROM destinations
ORDER BY ordcol
ARRIVAL
CONNECTIONS
COST
Chicago
Miami
300
Chicago
Lima
830
Chicago
Frankfurt
480
Chicago
Moscow
1,060
Chicago
Tokyo
1,740
Chicago
Hawaii
2,070
SQL programming
93
ARRIVAL
CONNECTIONS
COST
Chicago
Beijing
960
Chicago
Vienna
680
In this result data, you can see that all destinations that are generated from the Chicago-to-Miami row are
listed before the destinations from the Chicago-to-Frankfort row.
Next, you can run the same query but request the result to be ordered breadth first.
WITH destinations (departure, arrival, connections, cost) AS
(SELECT f.departure, f.arrival, 0, price
FROM flights f
WHERE f.departure = Chicago
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1,
r.cost + b.price
FROM destinations r, flights b
WHERE r.arrival = b.departure)
SEARCH BREADTH FIRST BY arrival SET ordcol
SELECT *
FROM destinations
ORDER BY ordcol
|
|
|
|
|
DEPARTURE
ARRIVAL
CONNECTIONS
COST
Chicago
Miami
300
Chicago
Frankfurt
480
Chicago
Lima
830
Chicago
Moscow
1,060
Chicago
Beijing
960
Chicago
Vienna
680
Chicago
Tokyo
1,740
Chicago
Hawaii
2,070
In this result data, you can see that all the direct connections from Chicago are listed before the
connecting flights. The data is identical to the results from the previous query, but in a breadth first order.
As you can see, there is no ordering done based on any values of the column used for depth or breadth
first processing. To get ordering, the ORDER SIBLINGS BY construct available with the CONNECT BY
form of recursion can be used.
94
For a final example, suppose we have a cycle in the data. By adding one more row to the table, there is
now a flight from Cairo to Paris and one from Paris to Cairo. Without accounting for possible cyclic data
like this, it is quite easy to generate a query that will go into an infinite loop processing the data.
The following query returns that information:
INSERT INTO FLIGHTS VALUES(Cairo, Paris, Euro Air, 1134, 440)
WITH destinations (departure, arrival, connections, cost, itinerary) AS
(SELECT f.departure, f.arrival, 1, price,
CAST(f.departure CONCAT f.arrival AS VARCHAR(2000))
FROM flights f
WHERE f.departure = New York
UNION ALL
SELECT r.departure, b.arrival, r.connections + 1 ,
r.cost + b.price, CAST(r.itinerary CONCAT b.arrival AS VARCHAR(2000))
FROM destinations r, flights b
WHERE r.arrival = b.departure)
CYCLE arrival SET cyclic_data TO 1 DEFAULT 0
SELECT departure, arrival, itinerary, cyclic_data
FROM destinations
ARRIVAL
ITINERARY
CYCLIC_DATA
New York
Paris
New York
Paris
New York
London
New York
London
New York
Los Angeles
New York
Los Angeles
New York
Madrid
New York
Paris
Madrid
New York
Cairo
New York
Paris
Cairo
New York
Rome
New York
Paris
Rome
New York
Athens
New York
London
New York
Tokyo
New York
Los Angeles
New York
Paris
New York
Paris
New York
Nicosia
New York
London
New York
Hawaii
New York
Los Angeles
Athens
Tokyo
Cairo
Paris
Athens
Tokyo
Nicosia
Hawaii
In this example, the ARRIVAL column is defined in the CYCLE clause as the column to use for detecting
a cycle in the data. When a cycle is found, a special column, CYCLIC_DATA in this case, is set to the
character value of '1' for the cycling row in the result set. All other rows will contain the default value of
'0'. When a cycle on the ARRIVAL column is found, processing will not proceed any further in the data so
the infinite loop will not happen. To see if your data actually has a cyclic reference, the CYCLIC_DATA
| column can be referenced in the outer query. You can choose to exclude cyclic rows by adding a
| predicate: WHERE CYCLIC_DATA = 0.
SQL programming
95
To better understand the results from these SQL statements, imagine that SQL goes through the following
process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
96
Step 3. SQL combines the two interim result tables, removes duplicate rows, and orders the result:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
UNION
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO=MA2112 OR
PROJNO= MA2113 OR
PROJNO= AD3111
ORDER BY EMPNO
The query returns a combined result table with values in ascending sequence.
EMPNO
000060
000150
000160
000170
000180
000190
000200
000210
000220
000230
000240
200170
200220
97
v Any ORDER BY clause must appear after the last subselect that is part of the union. In this example,
the results are sequenced on the basis of the first selected column, EMPNO. The ORDER BY clause
specifies that the combined result table is to be in collated sequence. ORDER BY is not allowed in a
view.
v A name may be specified on the ORDER BY clause if the result columns are named. A result column is
named if the corresponding columns in each of the unioned select-statements have the same name. An
AS clause can be used to assign a name to columns in the select list.
SELECT A + B AS X ...
UNION
SELECT X ... ORDER BY X
If the result columns are unnamed, use a positive integer to order the result. The number refers to the
position of the expression in the list of expressions you include in your subselects.
SELECT A + B ...
UNION
SELECT X ... ORDER BY 1
To identify which subselect each row is from, you can include a constant at the end of the select list of
each subselect in the union. When SQL returns your results, the last column contains the constant for the
subselect that is the source of that row. For example, you can specify:
SELECT A, B, A1 ...
UNION
SELECT X, Y, B2...
When a row is returned, it includes a value (either A1 or B2) to indicate the table that is the source of the
row's values. If the column names in the union are different, SQL uses the set of column names specified
in the first subselect when interactive SQL displays or prints the results, or in the SQLDA resulting from
processing an SQL DESCRIBE statement.
Note: Sort sequence is applied after the fields across the UNION pieces are made compatible. The sort
sequence is used for the distinct processing that implicitly occurs during UNION processing.
Related concepts:
Sort sequences and normalization in SQL on page 128
A sort sequence defines how characters in a character set relate to each other when they are compared or
ordered. Normalization allows you to compare strings that contain combining characters.
Related reference:
Creating and using views on page 48
A view can be used to access data in one or more tables or views. You create a view by using a SELECT
statement.
Specifying the UNION ALL keyword:
If you want to keep duplicates in the result of a UNION operation, specify the UNION ALL keyword
instead of just UNION.
This topic uses the same steps and example as Using the UNION keyword to combine subselects on
page 95.
Step 3. SQL combines two interim result tables:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
UNION ALL
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
98
WHERE PROJNO=MA2112 OR
PROJNO= MA2113 OR
PROJNO= AD3111
ORDER BY EMPNO
99
When you include the UNION ALL in the same SQL statement as a UNION operator, however, the result
of the operation depends on the order of evaluation. Where there are no parentheses, evaluation is from
left to right. Where parentheses are included, the parenthesized subselect is evaluated first, followed,
from left to right, by the other parts of the statement.
To better understand the results from these SQL statements, imagine that SQL goes through the following
process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
100
Step 3. SQL takes the first interim result table, removes all of the rows that also appear in the second
interim result table, removes duplicate rows, and orders the result:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
EXCEPT
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO=MA2112 OR
PROJNO= MA2113 OR
PROJNO= AD3111
ORDER BY EMPNO
This query returns a combined result table with values in ascending sequence.
EMPNO
000060
000200
000220
200170
SQL programming
101
EMPNO
200220
To better understand the results from these SQL statements, imagine that SQL goes through the following
process:
Step 1. SQL processes the first SELECT statement:
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
102
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO=MA2112 OR
PROJNO= MA2113 OR
PROJNO= AD3111
Step 3. SQL takes the first interim result table, compares it to the second interim result table, and returns
the rows that exist in both tables minus any duplicate rows, and orders the results.
SELECT EMPNO
FROM CORPDATA.EMPLOYEE
WHERE WORKDEPT = D11
INTERSECT
SELECT EMPNO
FROM CORPDATA.EMPPROJACT
WHERE PROJNO=MA2112 OR
PROJNO= MA2113 OR
PROJNO= AD3111
ORDER BY EMPNO
This query returns a combined result table with values in ascending sequence.
EMPNO
000150
000160
000170
000180
000190
SQL programming
103
EMPNO
000210
104
v If the data mapping error occurs while data is being assigned to a host variable in a SELECT INTO or
FETCH statement, and that same expression is used in the ORDER BY clause, the result record is
ordered based on the value of the expression. It is not ordered as if it were a null (higher than all other
values). This is because the expression was evaluated before the assignment to the host variable is
attempted.
v If the data mapping error occurs while an expression in the select-list is being evaluated and the same
expression is used in the ORDER BY clause, the result column is normally ordered as if it were a null
value (higher than all other values). If the ORDER BY clause is implemented by using a sort, the result
column is ordered as if it were a null value. If the ORDER BY clause is implemented by using an
existing index, in the following cases, the result column is ordered based on the actual value of the
expression in the index:
The expression is a date column with a date format of *MDY, *DMY, *YMD, or *JUL, and a date
conversion error occurs because the date is not within the valid range for dates.
The expression is a character column and a character cannot be converted.
The expression is a decimal column and a numeric value that is not valid is detected.
The INTO clause names the columns for which you specify values. The VALUES clause specifies a value
for each column named in the INTO clause. The value you specify can be:
v A constant. Inserts the value provided in the VALUES clause.
v A null value. Inserts the null value, using the keyword NULL. The column must be defined as capable
of containing a null value or an error occurs.
v A host variable. Inserts the contents of a host variable.
|
105
v You can verify that you are providing the values in the proper order based on the column names.
v You have better data independence. The order in which the columns are defined in the table does not
affect your INSERT statement.
If the column is defined to allow null values or to have a default, you do not need to name it in the
column name list or specify a value for it. The default value is used. If the column is defined to have a
default value, the default value is placed in the column. If DEFAULT was specified for the column
definition without an explicit default value, SQL places the default value for that data type in the
column. If the column does not have a default value defined for it, but is defined to allow the null value
(NOT NULL was not specified in the column definition), SQL places the null value in the column.
v For numeric columns, the default value is 0.
v For fixed length character or graphic columns, the default is blanks.
v For fixed length binary columns, the default is hexadecimal zeros.
v For varying length character, graphic, or binary columns and for LOB columns, the default is a zero
length string.
v For date, time, and timestamp columns, the default value is the current date, time, or timestamp. When
inserting a block of records, the default date/time value is extracted from the system when the block is
written. This means that the column will be assigned the same default value for each row in the block.
v For DataLink columns, the default value corresponds to DLVALUE('','URL','').
v For distinct-type columns, the default value is the default value of the corresponding source type.
v For ROWID columns or columns that are defined AS IDENTITY, the database manager generates a
default value.
v For XML columns, there is no default allowed except the null value.
When your program attempts to insert a row that duplicates another row already in the table, an error
might occur. Multiple null values may or may not be considered duplicate values, depending on the
option used when the index was created.
v If the table has a primary key, unique key, or unique index, the row is not inserted. Instead, SQL
returns an SQLCODE of -803.
v If the table does not have a primary key, unique key, or unique index, the row can be inserted without
error.
If SQL finds an error while running the INSERT statement, it stops inserting data. If you specify
COMMIT(*ALL), COMMIT(*CS), COMMIT(*CHG), or COMMIT(*RR), no rows are inserted. Rows already
inserted by this statement, in the case of INSERT with a select-statement or blocked insert, are deleted. If
you specify COMMIT(*NONE), any rows already inserted are not deleted.
A table created by SQL is created with the Reuse Deleted Records parameter of *YES. This allows the
database manager to reuse any rows in the table that were marked as deleted. The CHGPF command can
be used to change the attribute to *NO. This causes INSERT to always add rows to the end of the table.
The order in which rows are inserted does not guarantee the order in which they will be retrieved.
If the row is inserted without error, the SQLERRD(3) field of the SQLCA has a value of 1.
Note: For blocked INSERT or for INSERT with select-statement, more than one row can be inserted. The
number of rows inserted is reflected in SQLERRD(3) in the SQLCA. It is also available from the
ROW_COUNT diagnostics item in the GET DIAGNOSTICS statement.
Related reference:
INSERT
106
An example of this is to insert a new row into the DEPARTMENT table. The columns for the new row
are as follows:
v
v
v
v
You can also insert multiple rows into a table using the VALUES clause. The following example inserts
two rows into the PROJECT table. Values for the Project number (PROJNO) , Project name (PROJNAME),
Department number (DEPTNO), and Responsible employee (RESPEMP) are given in the values list. The
value for the Project start date (PRSTDATE) uses the current date. The rest of the columns in the table
that are not listed in the column list are assigned their default value.
INSERT INTO PROJECT (PROJNO, PROJNAME, DEPTNO, RESPEMP, PRSTDATE)
VALUES(HG0023, NEW NETWORK, E11, 200280, CURRENT DATE),
(HG0024, NETWORK PGM, E11", 200310, CURRENT DATE)
The select-statement embedded in the INSERT statement is no different from the select-statement you use
to retrieve data. With the exception of FOR READ ONLY, FOR UPDATE, or the OPTIMIZE clause, you
can use all the keywords, functions, and techniques used to retrieve data. SQL inserts all the rows that
meet the search conditions into the table you specify. Inserting rows from one table into another table
does not affect any existing rows in either the source table or the target table.
You should consider the following when inserting multiple rows into a table:
Notes:
1. The number of columns implicitly or explicitly listed in the INSERT statement must equal the
number of columns listed in the select-statement.
2. The data in the columns you are selecting must be compatible with the columns you are
inserting into when using the INSERT with select-statement.
3. In the event the select-statement embedded in the INSERT returns no rows, an SQLCODE of
100 is returned to alert you that no rows were inserted. If you successfully insert rows, the
SQLERRD(3) field of the SQLCA has an integer representing the number of rows SQL actually
inserted. This value is also available from the ROW_COUNT diagnostics item in the GET
DIAGNOSTICS statement.
4. If SQL finds an error while running the INSERT statement, SQL stops the operation. If you
specify COMMIT (*CHG), COMMIT(*CS), COMMIT (*ALL), or COMMIT(*RR), nothing is
inserted into the table and a negative SQLCODE is returned. If you specify COMMIT(*NONE),
any rows inserted before the error remain in the table.
SQL programming
107
DSTRUCT is a host structure array with five elements that is declared in the program. The five elements
correspond to EMPNO, FIRSTNME, MIDINIT, LASTNAME, and WORKDEPT. DSTRUCT has a
dimension of at least ten to accommodate inserting ten rows. ISTRUCT is a host structure array that is
declared in the program. ISTRUCT has a dimension of at least ten small integer fields for the indicators.
Blocked INSERT statements are supported for non-distributed SQL applications and for distributed
applications where both the application server and the application requester are System i products.
Related concepts:
Embedded SQL programming