PostgreSQL 8.0.0 Documentation
Legal Notice
PostgreSQL is Copyright © 1996-2005 by the PostgreSQL Global Development Group and is distributed under the terms of the license of the University of California below.

Postgres95 is Copyright © 1994-5 by the Regents of the University of California.

Permission to use, copy, modify, and distribute this software and its documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies.

IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN “AS-IS” BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
Table of Contents
Preface
    1. What is PostgreSQL?
    2. A Brief History of PostgreSQL
        2.1. The Berkeley POSTGRES Project
        2.2. Postgres95
        2.3. PostgreSQL
    3. Conventions
    4. Further Information
    5. Bug Reporting Guidelines
        5.1. Identifying Bugs
        5.2. What to report
        5.3. Where to report bugs
I. Tutorial
    1. Getting Started
        1.1. Installation
        1.2. Architectural Fundamentals
        1.3. Creating a Database
        1.4. Accessing a Database
    2. The SQL Language
        2.1. Introduction
        2.2. Concepts
        2.3. Creating a New Table
        2.4. Populating a Table With Rows
        2.5. Querying a Table
        2.6. Joins Between Tables
        2.7. Aggregate Functions
        2.8. Updates
        2.9. Deletions
    3. Advanced Features
        3.1. Introduction
        3.2. Views
        3.3. Foreign Keys
        3.4. Transactions
        3.5. Inheritance
        3.6. Conclusion
II. The SQL Language
    4. SQL Syntax
        4.1. Lexical Structure
            4.1.1. Identifiers and Key Words
            4.1.2. Constants
                4.1.2.1. String Constants
                4.1.2.2. Dollar-Quoted String Constants
                4.1.2.3. Bit-String Constants
                4.1.2.4. Numeric Constants
                4.1.2.5. Constants of Other Types
            4.1.3. Operators
            4.1.4. Special Characters
            4.1.5. Comments
            4.1.6. Lexical Precedence
        4.2. Value Expressions
            4.2.1. Column References
            4.2.2. Positional Parameters
            4.2.3. Subscripts
            4.2.4. Field Selection
            4.2.5. Operator Invocations
            4.2.6. Function Calls
            4.2.7. Aggregate Expressions
            4.2.8. Type Casts
            4.2.9. Scalar Subqueries
            4.2.10. Array Constructors
            4.2.11. Row Constructors
            4.2.12. Expression Evaluation Rules
    5. Data Definition
        5.1. Table Basics
        5.2. Default Values
        5.3. Constraints
            5.3.1. Check Constraints
            5.3.2. Not-Null Constraints
            5.3.3. Unique Constraints
            5.3.4. Primary Keys
            5.3.5. Foreign Keys
        5.4. System Columns
        5.5. Inheritance
        5.6. Modifying Tables
            5.6.1. Adding a Column
            5.6.2. Removing a Column
            5.6.3. Adding a Constraint
            5.6.4. Removing a Constraint
            5.6.5. Changing a Column’s Default Value
            5.6.6. Changing a Column’s Data Type
            5.6.7. Renaming a Column
            5.6.8. Renaming a Table
        5.7. Privileges
        5.8. Schemas
            5.8.1. Creating a Schema
            5.8.2. The Public Schema
            5.8.3. The Schema Search Path
            5.8.4. Schemas and Privileges
            5.8.5. The System Catalog Schema
            5.8.6. Usage Patterns
            5.8.7. Portability
        5.9. Other Database Objects
        5.10. Dependency Tracking
    6. Data Manipulation
        6.1. Inserting Data
        6.2. Updating Data
        6.3. Deleting Data
    7. Queries
        7.1. Overview
        7.2. Table Expressions
            7.2.1. The FROM Clause
                7.2.1.1. Joined Tables
                7.2.1.2. Table and Column Aliases
                7.2.1.3. Subqueries
                7.2.1.4. Table Functions
            7.2.2. The WHERE Clause
            7.2.3. The GROUP BY and HAVING Clauses
        7.3. Select Lists
            7.3.1. Select-List Items
            7.3.2. Column Labels
            7.3.3. DISTINCT
        7.4. Combining Queries
        7.5. Sorting Rows
        7.6. LIMIT and OFFSET
    8. Data Types
        8.1. Numeric Types
            8.1.1. Integer Types
            8.1.2. Arbitrary Precision Numbers
            8.1.3. Floating-Point Types
            8.1.4. Serial Types
        8.2. Monetary Types
        8.3. Character Types
        8.4. Binary Data Types
        8.5. Date/Time Types
            8.5.1. Date/Time Input
                8.5.1.1. Dates
                8.5.1.2. Times
                8.5.1.3. Time Stamps
                8.5.1.4. Intervals
                8.5.1.5. Special Values
            8.5.2. Date/Time Output
            8.5.3. Time Zones
            8.5.4. Internals
        8.6. Boolean Type
        8.7. Geometric Types
            8.7.1. Points
            8.7.2. Line Segments
            8.7.3. Boxes
            8.7.4. Paths
            8.7.5. Polygons
            8.7.6. Circles
        8.8. Network Address Types
            8.8.1. inet
            8.8.2. cidr
            8.8.3. inet vs. cidr
            8.8.4. macaddr
        8.9. Bit String Types
        8.10. Arrays
            8.10.1. Declaration of Array Types
            8.10.2. Array Value Input
            8.10.3. Accessing Arrays
            8.10.4. Modifying Arrays
            8.10.5. Searching in Arrays
            8.10.6. Array Input and Output Syntax
        8.11. Composite Types
            8.11.1. Declaration of Composite Types
            8.11.2. Composite Value Input
            8.11.3. Accessing Composite Types
            8.11.4. Modifying Composite Types
            8.11.5. Composite Type Input and Output Syntax
        8.12. Object Identifier Types
        8.13. Pseudo-Types
    9. Functions and Operators
        9.1. Logical Operators
        9.2. Comparison Operators
        9.3. Mathematical Functions and Operators
        9.4. String Functions and Operators
        9.5. Binary String Functions and Operators
        9.6. Bit String Functions and Operators
        9.7. Pattern Matching
            9.7.1. LIKE
            9.7.2. SIMILAR TO Regular Expressions
            9.7.3. POSIX Regular Expressions
                9.7.3.1. Regular Expression Details
                9.7.3.2. Bracket Expressions
                9.7.3.3. Regular Expression Escapes
                9.7.3.4. Regular Expression Metasyntax
                9.7.3.5. Regular Expression Matching Rules
                9.7.3.6. Limits and Compatibility
                9.7.3.7. Basic Regular Expressions
        9.8. Data Type Formatting Functions
        9.9. Date/Time Functions and Operators
            9.9.1. EXTRACT, date_part
            9.9.2. date_trunc
            9.9.3. AT TIME ZONE
            9.9.4. Current Date/Time
        9.10. Geometric Functions and Operators
        9.11. Network Address Functions and Operators
        9.12. Sequence Manipulation Functions
        9.13. Conditional Expressions
            9.13.1. CASE
            9.13.2. COALESCE
            9.13.3. NULLIF
        9.14. Array Functions and Operators
        9.15. Aggregate Functions
        9.16. Subquery Expressions
            9.16.1. EXISTS
            9.16.2. IN
            9.16.3. NOT IN
            9.16.4. ANY/SOME
            9.16.5. ALL
            9.16.6. Row-wise Comparison
        9.17. Row and Array Comparisons
            9.17.1. IN
            9.17.2. NOT IN
            9.17.3. ANY/SOME (array)
            9.17.4. ALL (array)
            9.17.5. Row-wise Comparison
        9.18. Set Returning Functions
        9.19. System Information Functions
        9.20. System Administration Functions
    10. Type Conversion
        10.1. Overview
        10.2. Operators
        10.3. Functions
        10.4. Value Storage
        10.5. UNION, CASE, and ARRAY Constructs
    11. Indexes
        11.1. Introduction
        11.2. Index Types
        11.3. Multicolumn Indexes
        11.4. Unique Indexes
        11.5. Indexes on Expressions
        11.6. Operator Classes
        11.7. Partial Indexes
        11.8. Examining Index Usage
    12. Concurrency Control
        12.1. Introduction
        12.2. Transaction Isolation
            12.2.1. Read Committed Isolation Level
            12.2.2. Serializable Isolation Level
                12.2.2.1. Serializable Isolation versus True Serializability
        12.3. Explicit Locking
            12.3.1. Table-Level Locks
            12.3.2. Row-Level Locks
            12.3.3. Deadlocks
        12.4. Data Consistency Checks at the Application Level
        12.5. Locking and Indexes
    13. Performance Tips
        13.1. Using EXPLAIN
        13.2. Statistics Used by the Planner
        13.3. Controlling the Planner with Explicit JOIN Clauses
        13.4. Populating a Database
            13.4.1. Disable Autocommit
            13.4.2. Use COPY
            13.4.3. Remove Indexes
            13.4.4. Increase maintenance_work_mem
            13.4.5. Increase checkpoint_segments
            13.4.6. Run ANALYZE Afterwards
III. Server Administration
    14. Installation Instructions
        14.1. Short Version
        14.2. Requirements
        14.3. Getting The Source
        14.4. If You Are Upgrading
        14.5. Installation Procedure
        14.6. Post-Installation Setup
            14.6.1. Shared Libraries
14.6.2. Environment Variables....................................................................................238
14.7. Supported Platforms ....................................................................................................239
15. Client-Only Installation on Windows.......................................................................................245
16. Server Run-time Environment .................................................................................................246
16.1. The PostgreSQL User Account ...................................................................................246
16.2. Creating a Database Cluster ........................................................................................246
16.3. Starting the Database Server........................................................................................247
16.3.1. Server Start-up Failures ..................................................................................248
16.3.2. Client Connection Problems ...........................................................................249
16.4. Run-time Configuration...............................................................................................250
16.4.1. File Locations..................................................................................................251
16.4.2. Connections and Authentication .....................................................................252
16.4.2.1. Connection Settings............................................................................252
16.4.2.2. Security and Authentication ...............................................................253
16.4.3. Resource Consumption ...................................................................................254
16.4.3.1. Memory ..............................................................................................254
16.4.3.2. Free Space Map ..................................................................................255
16.4.3.3. Kernel Resource Usage ......................................................................255
16.4.3.4. Cost-Based Vacuum Delay.................................................................256
16.4.3.5. Background Writer .............................................................................257
16.4.4. Write Ahead Log.............................................................................................258
16.4.4.1. Settings ...............................................................................................258
16.4.4.2. Checkpoints........................................................................................259
16.4.4.3. Archiving............................................................................................259
16.4.5. Query Planning ...............................................................................................260
16.4.5.1. Planner Method Configuration ...........................................................260
16.4.5.2. Planner Cost Constants.......................................................................261
16.4.5.3. Genetic Query Optimizer ...................................................................261
16.4.5.4. Other Planner Options ........................................................................262
16.4.6. Error Reporting and Logging..........................................................................263
16.4.6.1. Where to log .......................................................................................263
16.4.6.2. When To Log......................................................................................264
16.4.6.3. What To Log.......................................................................................266
16.4.7. Runtime Statistics ...........................................................................................268
16.4.7.1. Statistics Monitoring ..........................................................................268
16.4.7.2. Query and Index Statistics Collector..................................................268
16.4.8. Client Connection Defaults.............................................................................269
16.4.8.1. Statement Behavior ............................................................................269
16.4.8.2. Locale and Formatting........................................................................270
16.4.8.3. Other Defaults ....................................................................................271
16.4.9. Lock Management ..........................................................................................272
16.4.10. Version and Platform Compatibility .............................................................273
16.4.10.1. Previous PostgreSQL Versions.........................................................273
16.4.10.2. Platform and Client Compatibility ...................................................273
16.4.11. Preset Options ...............................................................................................274
16.4.12. Customized Options......................................................................................275
16.4.13. Developer Options ........................................................................................276
16.4.14. Short Options ................................................................................................277
16.5. Managing Kernel Resources........................................................................................277
16.5.1. Shared Memory and Semaphores ...................................................................277
16.5.2. Resource Limits ..............................................................................................282
16.5.3. Linux Memory Overcommit ...........................................................................283
16.6. Shutting Down the Server............................................................................................284
16.7. Secure TCP/IP Connections with SSL ........................................................................284
16.8. Secure TCP/IP Connections with SSH Tunnels ..........................................................285
17. Database Users and Privileges .................................................................................................287
17.1. Database Users ............................................................................................................287
17.2. User Attributes.............................................................................................................288
17.3. Groups .........................................................................................................................288
17.4. Privileges .....................................................................................................................289
17.5. Functions and Triggers ................................................................................................289
18. Managing Databases ................................................................................................................291
18.1. Overview .....................................................................................................................291
18.2. Creating a Database.....................................................................................................291
18.3. Template Databases .....................................................................................................292
18.4. Database Configuration ...............................................................................................293
18.5. Destroying a Database .................................................................................................294
18.6. Tablespaces..................................................................................................................294
19. Client Authentication ...............................................................................................................297
19.1. The pg_hba.conf file ................................................................................................297
19.2. Authentication methods...............................................................................................302
19.2.1. Trust authentication.........................................................................................302
19.2.2. Password authentication..................................................................................302
19.2.3. Kerberos authentication ..................................................................................302
19.2.4. Ident-based authentication ..............................................................................303
19.2.4.1. Ident Authentication over TCP/IP ......................................................303
19.2.4.2. Ident Authentication over Local Sockets ...........................................304
19.2.4.3. Ident Maps..........................................................................................304
19.2.5. PAM authentication.........................................................................................305
19.3. Authentication problems .............................................................................................305
20. Localization..............................................................................................................................307
20.1. Locale Support.............................................................................................................307
20.1.1. Overview.........................................................................................................307
20.1.2. Behavior ..........................................................................................................308
20.1.3. Problems .........................................................................................................309
20.2. Character Set Support..................................................................................................309
20.2.1. Supported Character Sets................................................................................309
20.2.2. Setting the Character Set.................................................................................310
20.2.3. Automatic Character Set Conversion Between Server and Client..................311
20.2.4. Further Reading ..............................................................................................314
21. Routine Database Maintenance Tasks......................................................................................315
21.1. Routine Vacuuming .....................................................................................................315
21.1.1. Recovering disk space.....................................................................................315
21.1.2. Updating planner statistics..............................................................................316
21.1.3. Preventing transaction ID wraparound failures ..............................................317
21.2. Routine Reindexing .....................................................................................................319
21.3. Log File Maintenance..................................................................................................319
22. Backup and Restore .................................................................................................................321
22.1. SQL Dump...................................................................................................................321
22.1.1. Restoring the dump .........................................................................................321
22.1.2. Using pg_dumpall...........................................................................................322
22.1.3. Handling large databases ................................................................................322
22.1.4. Caveats ............................................................................................................323
22.2. File system level backup..............................................................................................323
22.3. On-line backup and point-in-time recovery (PITR) ....................................................324
22.3.1. Setting up WAL archiving...............................................................................325
22.3.2. Making a Base Backup ...................................................................................327
22.3.3. Recovering with an On-line Backup...............................................................328
22.3.3.1. Recovery Settings...............................................................................330
22.3.4. Timelines.........................................................................................................331
22.3.5. Caveats ............................................................................................................332
22.4. Migration Between Releases .......................................................................................332
23. Monitoring Database Activity..................................................................................................334
23.1. Standard Unix Tools ....................................................................................................334
23.2. The Statistics Collector................................................................................................334
23.2.1. Statistics Collection Configuration .................................................................335
23.2.2. Viewing Collected Statistics ...........................................................................335
23.3. Viewing Locks.............................................................................................................339
24. Monitoring Disk Usage ............................................................................................................341
24.1. Determining Disk Usage .............................................................................................341
24.2. Disk Full Failure..........................................................................................................342
25. Write-Ahead Logging (WAL) ..................................................................................................343
25.1. Benefits of WAL ..........................................................................................................343
25.2. WAL Configuration .....................................................................................................343
25.3. Internals .......................................................................................................................345
26. Regression Tests.......................................................................................................................347
26.1. Running the Tests ........................................................................................................347
26.2. Test Evaluation ............................................................................................................348
26.2.1. Error message differences...............................................................................348
26.2.2. Locale differences ...........................................................................................349
26.2.3. Date and time differences ...............................................................................349
26.2.4. Floating-point differences ...............................................................................349
26.2.5. Row ordering differences................................................................................350
26.2.6. The “random” test ...........................................................................................350
26.3. Platform-specific comparison files ..............................................................................350
IV. Client Interfaces ..............................................................................................................................352
27. libpq - C Library ......................................................................................................................354
27.1. Database Connection Control Functions .....................................................................354
27.2. Connection Status Functions .......................................................................................360
27.3. Command Execution Functions ..................................................................................364
27.3.1. Main Functions ...............................................................................................364
27.3.2. Retrieving Query Result Information .............................................................370
27.3.3. Retrieving Result Information for Other Commands .....................................373
27.3.4. Escaping Strings for Inclusion in SQL Commands ........................................374
27.3.5. Escaping Binary Strings for Inclusion in SQL Commands ............................375
27.4. Asynchronous Command Processing ..........................................................................376
27.5. Cancelling Queries in Progress ...................................................................................380
27.6. The Fast-Path Interface................................................................................................381
27.7. Asynchronous Notification..........................................................................................382
27.8. Functions Associated with the COPY Command .........................................................383
27.8.1. Functions for Sending COPY Data...................................................................384
27.8.2. Functions for Receiving COPY Data................................................................385
27.8.3. Obsolete Functions for COPY ..........................................................................386
27.9. Control Functions ........................................................................................................388
27.10. Notice Processing ......................................................................................................388
27.11. Environment Variables ..............................................................................................390
27.12. The Password File .....................................................................................................391
27.13. SSL Support...............................................................................................................392
27.14. Behavior in Threaded Programs ................................................................................392
27.15. Building libpq Programs............................................................................................392
27.16. Example Programs.....................................................................................................394
28. Large Objects ...........................................................................................................................403
28.1. History .........................................................................................................................403
28.2. Implementation Features .............................................................................................403
28.3. Client Interfaces...........................................................................................................403
28.3.1. Creating a Large Object ..................................................................................403
28.3.2. Importing a Large Object................................................................................404
28.3.3. Exporting a Large Object................................................................................404
28.3.4. Opening an Existing Large Object..................................................................404
28.3.5. Writing Data to a Large Object.......................................................................405
28.3.6. Reading Data from a Large Object .................................................................405
28.3.7. Seeking in a Large Object...............................................................................405
28.3.8. Obtaining the Seek Position of a Large Object...............................................405
28.3.9. Closing a Large Object Descriptor .................................................................405
28.3.10. Removing a Large Object .............................................................................406
28.4. Server-Side Functions..................................................................................................406
28.5. Example Program ........................................................................................................406
29. ECPG - Embedded SQL in C...................................................................................................412
29.1. The Concept.................................................................................................................412
29.2. Connecting to the Database Server..............................................................................412
29.3. Closing a Connection ..................................................................................................413
29.4. Running SQL Commands............................................................................................414
29.5. Choosing a Connection................................................................................................415
29.6. Using Host Variables ...................................................................................................415
29.6.1. Overview.........................................................................................................415
29.6.2. Declare Sections..............................................................................................416
29.6.3. SELECT INTO and FETCH INTO ...................................................................416
29.6.4. Indicators.........................................................................................................417
29.7. Dynamic SQL..............................................................................................................418
29.8. Using SQL Descriptor Areas.......................................................................................419
29.9. Error Handling.............................................................................................................421
29.9.1. Setting Callbacks ............................................................................................421
29.9.2. sqlca ................................................................................................................423
29.9.3. SQLSTATE vs SQLCODE...................................................................................424
29.10. Including Files ...........................................................................................................426
29.11. Processing Embedded SQL Programs.......................................................................427
29.12. Library Functions ......................................................................................................428
29.13. Internals .....................................................................................................................428
30. The Information Schema..........................................................................................................431
30.1. The Schema .................................................................................................................431
30.2. Data Types ...................................................................................................................431
30.3. information_schema_catalog_name ..................................................................432
30.4. applicable_roles...................................................................................................432
30.5. check_constraints ................................................................................................432
30.6. column_domain_usage ............................................................................................433
30.7. column_privileges ................................................................................................433
30.8. column_udt_usage...................................................................................................434
30.9. columns ......................................................................................................................435
30.10. constraint_column_usage .................................................................................439
30.11. constraint_table_usage....................................................................................439
30.12. data_type_privileges ........................................................................................440
30.13. domain_constraints ............................................................................................441
30.14. domain_udt_usage.................................................................................................441
30.15. domains ....................................................................................................................442
30.16. element_types .......................................................................................................445
30.17. enabled_roles .......................................................................................................447
30.18. key_column_usage.................................................................................................448
30.19. parameters..............................................................................................................448
30.20. referential_constraints .................................................................................451
30.21. role_column_grants ............................................................................................452
30.22. role_routine_grants ..........................................................................................452
30.23. role_table_grants ..............................................................................................453
30.24. role_usage_grants ..............................................................................................454
30.25. routine_privileges ............................................................................................454
30.26. routines ..................................................................................................................455
30.27. schemata ..................................................................................................................459
30.28. sql_features .........................................................................................................460
30.29. sql_implementation_info .................................................................................460
30.30. sql_languages .......................................................................................................461
30.31. sql_packages .........................................................................................................462
30.32. sql_sizing..............................................................................................................462
30.33. sql_sizing_profiles ..........................................................................................463
30.34. table_constraints ..............................................................................................463
30.35. table_privileges.................................................................................................464
30.36. tables ......................................................................................................................465
30.37. triggers ..................................................................................................................465
30.38. usage_privileges.................................................................................................467
30.39. view_column_usage ..............................................................................................467
30.40. view_table_usage.................................................................................................468
30.41. views ........................................................................................................................469
V. Server Programming ........................................................................................................................470
31. Extending SQL.........................................................................................................................472
31.1. How Extensibility Works.............................................................................................472
31.2. The PostgreSQL Type System.....................................................................................472
31.2.1. Base Types ......................................................................................................472
31.2.2. Composite Types.............................................................................................472
31.2.3. Domains ..........................................................................................................473
31.2.4. Pseudo-Types ..................................................................................................473
31.2.5. Polymorphic Types .........................................................................................473
31.3. User-Defined Functions...............................................................................................474
31.4. Query Language (SQL) Functions ..............................................................................474
31.4.1. SQL Functions on Base Types ........................................................................475
31.4.2. SQL Functions on Composite Types ..............................................................476
31.4.3. SQL Functions as Table Sources ....................................................................479
31.4.4. SQL Functions Returning Sets .......................................................................480
31.4.5. Polymorphic SQL Functions ..........................................................................481
31.5. Function Overloading ..................................................................................................482
31.6. Function Volatility Categories .....................................................................................483
31.7. Procedural Language Functions ..................................................................................484
31.8. Internal Functions........................................................................................................484
31.9. C-Language Functions.................................................................................................485
31.9.1. Dynamic Loading............................................................................................485
31.9.2. Base Types in C-Language Functions.............................................................486
31.9.3. Calling Conventions Version 0 for C-Language Functions ............................489
31.9.4. Calling Conventions Version 1 for C-Language Functions ............................491
31.9.5. Writing Code...................................................................................................494
31.9.6. Compiling and Linking Dynamically-Loaded Functions ...............................494
31.9.7. Extension Building Infrastructure...................................................................497
31.9.8. Composite-Type Arguments in C-Language Functions..................................499
31.9.9. Returning Rows (Composite Types) from C-Language Functions.................500
31.9.10. Returning Sets from C-Language Functions.................................................501
31.9.11. Polymorphic Arguments and Return Types ..................................................506
31.10. User-Defined Aggregates ..........................................................................................507
31.11. User-Defined Types ...................................................................................................509
31.12. User-Defined Operators.............................................................................................512
31.13. Operator Optimization Information...........................................................................513
31.13.1. COMMUTATOR .................................................................................................514
31.13.2. NEGATOR .......................................................................................................514
31.13.3. RESTRICT .....................................................................................................515
31.13.4. JOIN ..............................................................................................................516
31.13.5. HASHES..........................................................................................................516
31.13.6. MERGES (SORT1, SORT2, LTCMP, GTCMP).....................................................517
31.14. Interfacing Extensions To Indexes.............................................................................518
31.14.1. Index Methods and Operator Classes ...........................................................518
31.14.2. Index Method Strategies ...............................................................................519
31.14.3. Index Method Support Routines ...................................................................520
31.14.4. An Example ..................................................................................................521
31.14.5. Cross-Data-Type Operator Classes ...............................................................524
31.14.6. System Dependencies on Operator Classes ..................................................525
31.14.7. Special Features of Operator Classes............................................................525
32. Triggers ....................................................................................................................................527
32.1. Overview of Trigger Behavior.....................................................................................527
32.2. Visibility of Data Changes...........................................................................................528
32.3. Writing Trigger Functions in C ...................................................................................529
32.4. A Complete Example ..................................................................................................531
33. The Rule System ......................................................................................................................535
33.1. The Query Tree............................................................................................................535
33.2. Views and the Rule System .........................................................................................537
33.2.1. How SELECT Rules Work ...............................................................................537
33.2.2. View Rules in Non-SELECT Statements .........................................................542
33.2.3. The Power of Views in PostgreSQL ...............................................................543
33.2.4. Updating a View..............................................................................................544
33.3. Rules on INSERT, UPDATE, and DELETE ....................................................................544
33.3.1. How Update Rules Work ................................................................................544
33.3.1.1. A First Rule Step by Step...................................................................545
33.3.2. Cooperation with Views..................................................................................549
33.4. Rules and Privileges ....................................................................................................554
33.5. Rules and Command Status.........................................................................................555
33.6. Rules versus Triggers ..................................................................................................556
34. Procedural Languages ..............................................................................................................559
34.1. Installing Procedural Languages .................................................................................559
35. PL/pgSQL - SQL Procedural Language ..................................................................................561
35.1. Overview .....................................................................................................................561
35.1.1. Advantages of Using PL/pgSQL ....................................................................562
35.1.2. Supported Argument and Result Data Types ..................................................562
35.2. Tips for Developing in PL/pgSQL...............................................................................563
35.2.1. Handling of Quotation Marks .........................................................................563
35.3. Structure of PL/pgSQL................................................................................................565
35.4. Declarations.................................................................................................................566
35.4.1. Aliases for Function Parameters .....................................................................567
35.4.2. Copying Types ................................................................................................568
35.4.3. Row Types.......................................................................................................569
35.4.4. Record Types ..................................................................................................569
35.4.5. RENAME............................................................................................................570
35.5. Expressions..................................................................................................................570
35.6. Basic Statements..........................................................................................................571
35.6.1. Assignment .....................................................................................................571
35.6.2. SELECT INTO .................................................................................................572
35.6.3. Executing an Expression or Query With No Result........................................573
35.6.4. Doing Nothing At All .....................................................................................573
35.6.5. Executing Dynamic Commands .....................................................................574
35.6.6. Obtaining the Result Status.............................................................................575
35.7. Control Structures........................................................................................................576
35.7.1. Returning From a Function.............................................................................576
35.7.1.1. RETURN ...............................................................................................576
35.7.1.2. RETURN NEXT ....................................................................................576
35.7.2. Conditionals ....................................................................................................577
35.7.2.1. IF-THEN .............................................................................................577
35.7.2.2. IF-THEN-ELSE ..................................................................................578
35.7.2.3. IF-THEN-ELSE IF............................................................................578
35.7.2.4. IF-THEN-ELSIF-ELSE .....................................................................579
35.7.2.5. IF-THEN-ELSEIF-ELSE ...................................................................579
35.7.3. Simple Loops ..................................................................................................579
35.7.3.1. LOOP ...................................................................................................579
35.7.3.2. EXIT ...................................................................................................580
35.7.3.3. WHILE .................................................................................................580
35.7.3.4. FOR (integer variant)...........................................................................581
35.7.4. Looping Through Query Results ....................................................................581
35.7.5. Trapping Errors ...............................................................................................582
35.8. Cursors.........................................................................................................................584
35.8.1. Declaring Cursor Variables .............................................................................584
35.8.2. Opening Cursors .............................................................................................584
35.8.2.1. OPEN FOR SELECT............................................................................585
35.8.2.2. OPEN FOR EXECUTE .........................................................................585
35.8.2.3. Opening a Bound Cursor....................................................................585
35.8.3. Using Cursors..................................................................................................586
35.8.3.1. FETCH .................................................................................................586
35.8.3.2. CLOSE .................................................................................................586
35.8.3.3. Returning Cursors ..............................................................................586
35.9. Errors and Messages....................................................................................................588
35.10. Trigger Procedures ....................................................................................................589
35.11. Porting from Oracle PL/SQL.....................................................................................594
35.11.1. Porting Examples ..........................................................................................594
35.11.2. Other Things to Watch For............................................................................600
35.11.2.1. Implicit Rollback after Exceptions...................................................601
35.11.2.2. EXECUTE ...........................................................................................601
35.11.2.3. Optimizing PL/pgSQL Functions.....................................................601
35.11.3. Appendix.......................................................................................................601
36. PL/Tcl - Tcl Procedural Language...........................................................................................605
36.1. Overview .....................................................................................................................605
36.2. PL/Tcl Functions and Arguments................................................................................605
36.3. Data Values in PL/Tcl..................................................................................................606
36.4. Global Data in PL/Tcl .................................................................................................607
36.5. Database Access from PL/Tcl .....................................................................................607
36.6. Trigger Procedures in PL/Tcl ......................................................................................609
36.7. Modules and the unknown command..........................................................................611
36.8. Tcl Procedure Names ..................................................................................................611
37. PL/Perl - Perl Procedural Language.........................................................................................612
37.1. PL/Perl Functions and Arguments...............................................................................612
37.2. Database Access from PL/Perl ....................................................................................614
37.3. Data Values in PL/Perl.................................................................................................615
37.4. Global Values in PL/Perl .............................................................................................616
37.5. Trusted and Untrusted PL/Perl ....................................................................................617
37.6. PL/Perl Triggers ..........................................................................................................617
37.7. Limitations and Missing Features ...............................................................................619
38. PL/Python - Python Procedural Language...............................................................................620
38.1. PL/Python Functions ...................................................................................................620
38.2. Trigger Functions ........................................................................................................621
38.3. Database Access ..........................................................................................................621
39. Server Programming Interface .................................................................................................623
39.1. Interface Functions ......................................................................................................623
SPI_connect ................................................................................................................623
SPI_finish....................................................................................................................625
SPI_push .....................................................................................................................626
SPI_pop.......................................................................................................................627
SPI_execute.................................................................................................................628
SPI_exec......................................................................................................................631
SPI_prepare.................................................................................................................632
SPI_getargcount ..........................................................................................................634
SPI_getargtypeid.........................................................................................................635
SPI_is_cursor_plan .....................................................................................................636
SPI_execute_plan........................................................................................................637
SPI_execp....................................................................................................................639
SPI_cursor_open .........................................................................................................640
SPI_cursor_find...........................................................................................................642
SPI_cursor_fetch.........................................................................................................643
SPI_cursor_move ........................................................................................................644
SPI_cursor_close.........................................................................................................645
SPI_saveplan...............................................................................................................646
39.2. Interface Support Functions ........................................................................................647
SPI_fname...................................................................................................................647
SPI_fnumber ...............................................................................................................648
SPI_getvalue ...............................................................................................................649
SPI_getbinval ..............................................................................................................650
SPI_gettype .................................................................................................................651
SPI_gettypeid..............................................................................................................652
SPI_getrelname ...........................................................................................................653
39.3. Memory Management .................................................................................................654
SPI_palloc ...................................................................................................................654
SPI_repalloc................................................................................................................656
SPI_pfree.....................................................................................................................657
SPI_copytuple .............................................................................................................658
SPI_returntuple ...........................................................................................................659
SPI_modifytuple .........................................................................................................660
SPI_freetuple...............................................................................................................662
SPI_freetuptable..........................................................................................................663
SPI_freeplan................................................................................................................664
39.4. Visibility of Data Changes...........................................................................................665
39.5. Examples .....................................................................................................................665
VI. Reference..........................................................................................................................................669
I. SQL Commands..........................................................................................................................671
ABORT.................................................................................................................................672
ALTER AGGREGATE.........................................................................................................674
ALTER CONVERSION.......................................................................................................676
ALTER DATABASE ............................................................................................................678
ALTER DOMAIN ................................................................................................................680
ALTER FUNCTION ............................................................................................................683
ALTER GROUP ...................................................................................................................685
ALTER INDEX ....................................................................................................................687
ALTER LANGUAGE...........................................................................................................689
ALTER OPERATOR ............................................................................................................690
ALTER OPERATOR CLASS...............................................................................................692
ALTER SCHEMA ................................................................................................................693
ALTER SEQUENCE............................................................................................................694
ALTER TABLE ....................................................................................................................696
ALTER TABLESPACE ........................................................................................................703
ALTER TRIGGER ...............................................................................................................705
ALTER TYPE.......................................................................................................................706
ALTER USER ......................................................................................................................707
ANALYZE............................................................................................................................710
BEGIN..................................................................................................................................712
CHECKPOINT.....................................................................................................................714
CLOSE .................................................................................................................................715
CLUSTER ............................................................................................................................717
COMMENT..........................................................................................................................720
COMMIT..............................................................................................................................723
COPY ...................................................................................................................................725
CREATE AGGREGATE ......................................................................................................733
CREATE CAST....................................................................................................................736
CREATE CONSTRAINT TRIGGER ..................................................................................740
CREATE CONVERSION ....................................................................................................741
CREATE DATABASE..........................................................................................................743
CREATE DOMAIN..............................................................................................................746
CREATE FUNCTION..........................................................................................................749
CREATE GROUP.................................................................................................................754
CREATE INDEX..................................................................................................................756
CREATE LANGUAGE ........................................................................................................759
CREATE OPERATOR .........................................................................................................762
CREATE OPERATOR CLASS ............................................................................................766
CREATE RULE....................................................................................................................769
CREATE SCHEMA .............................................................................................................772
CREATE SEQUENCE .........................................................................................................775
CREATE TABLE .................................................................................................................779
CREATE TABLE AS ...........................................................................................................789
CREATE TABLESPACE......................................................................................................791
CREATE TRIGGER.............................................................................................................793
CREATE TYPE ....................................................................................................................796
CREATE USER....................................................................................................................802
CREATE VIEW....................................................................................................................805
DEALLOCATE ....................................................................................................................808
DECLARE............................................................................................................................809
DELETE ...............................................................................................................................812
DROP AGGREGATE...........................................................................................................814
DROP CAST ........................................................................................................................816
DROP CONVERSION.........................................................................................................818
DROP DATABASE ..............................................................................................................819
DROP DOMAIN ..................................................................................................................820
DROP FUNCTION ..............................................................................................................821
DROP GROUP .....................................................................................................................823
DROP INDEX ......................................................................................................................824
DROP LANGUAGE.............................................................................................................825
DROP OPERATOR ..............................................................................................................826
DROP OPERATOR CLASS.................................................................................................828
DROP RULE ........................................................................................................................830
DROP SCHEMA ..................................................................................................................832
DROP SEQUENCE..............................................................................................................834
DROP TABLE ......................................................................................................................835
DROP TABLESPACE ..........................................................................................................837
DROP TRIGGER .................................................................................................................838
DROP TYPE.........................................................................................................................840
DROP USER ........................................................................................................................841
DROP VIEW ........................................................................................................................843
END......................................................................................................................................844
EXECUTE............................................................................................................................846
EXPLAIN .............................................................................................................................848
FETCH .................................................................................................................................851
GRANT ................................................................................................................................855
INSERT ................................................................................................................................860
LISTEN ................................................................................................................................863
LOAD ...................................................................................................................................865
LOCK ...................................................................................................................................866
MOVE...................................................................................................................................869
NOTIFY................................................................................................................................871
PREPARE .............................................................................................................................873
REINDEX.............................................................................................................................875
RELEASE SAVEPOINT......................................................................................................878
RESET..................................................................................................................................880
REVOKE ..............................................................................................................................881
ROLLBACK .........................................................................................................................884
ROLLBACK TO SAVEPOINT ............................................................................................886
SAVEPOINT ........................................................................................................................888
SELECT ...............................................................................................................................890
SELECT INTO .....................................................................................................................902
SET .......................................................................................................................................904
SET CONSTRAINTS ..........................................................................................................907
SET SESSION AUTHORIZATION.....................................................................................908
SET TRANSACTION ..........................................................................................................910
SHOW ..................................................................................................................................912
START TRANSACTION .....................................................................................................915
TRUNCATE .........................................................................................................................916
UNLISTEN...........................................................................................................................917
UPDATE ...............................................................................................................................919
VACUUM .............................................................................................................................922
II. PostgreSQL Client Applications ...............................................................................................925
clusterdb ...............................................................................................................................926
createdb.................................................................................................................................929
createlang..............................................................................................................................932
createuser..............................................................................................................................935
dropdb...................................................................................................................................938
droplang................................................................................................................................941
dropuser ................................................................................................................................944
ecpg.......................................................................................................................................947
pg_config ..............................................................................................................................949
pg_dump ...............................................................................................................................951
pg_dumpall ...........................................................................................................................958
pg_restore .............................................................................................................................962
psql .......................................................................................................................................969
vacuumdb..............................................................................................................................993
III. PostgreSQL Server Applications .............................................................................................996
initdb.....................................................................................................................................997
ipcclean...............................................................................................................................1000
pg_controldata ....................................................................................................................1001
pg_ctl ..................................................................................................................................1002
pg_resetxlog .......................................................................................................................1006
postgres...............................................................................................................................1008
postmaster...........................................................................................................................1012
VII. Internals........................................................................................................................................1018
40. Overview of PostgreSQL Internals ........................................................................................1020
40.1. The Path of a Query...................................................................................................1020
40.2. How Connections are Established .............................................................................1020
40.3. The Parser Stage ........................................................................................................1021
40.3.1. Parser.............................................................................................................1021
40.3.2. Transformation Process.................................................................................1022
40.4. The PostgreSQL Rule System ...................................................................................1022
40.5. Planner/Optimizer......................................................................................................1023
40.5.1. Generating Possible Plans.............................................................................1023
40.6. Executor.....................................................................................................................1024
41. System Catalogs .....................................................................................................................1026
41.1. Overview ...................................................................................................................1026
41.2. pg_aggregate .........................................................................................................1027
41.3. pg_am ........................................................................................................................1028
41.4. pg_amop ....................................................................................................................1029
41.5. pg_amproc ................................................................................................................1029
41.6. pg_attrdef..............................................................................................................1030
41.7. pg_attribute .........................................................................................................1030
41.8. pg_cast ....................................................................................................................1033
41.9. pg_class ..................................................................................................................1034
41.10. pg_constraint .....................................................................................................1037
41.11. pg_conversion .....................................................................................................1038
41.12. pg_database .........................................................................................................1039
41.13. pg_depend ..............................................................................................................1040
41.14. pg_description ...................................................................................................1042
41.15. pg_group ................................................................................................................1042
41.16. pg_index ................................................................................................................1043
41.17. pg_inherits .........................................................................................................1045
41.18. pg_language .........................................................................................................1045
41.19. pg_largeobject ...................................................................................................1046
41.20. pg_listener .........................................................................................................1047
41.21. pg_namespace .......................................................................................................1047
41.22. pg_opclass............................................................................................................1048
41.23. pg_operator .........................................................................................................1049
41.24. pg_proc ..................................................................................................................1050
41.25. pg_rewrite............................................................................................................1052
41.26. pg_shadow ..............................................................................................................1053
41.27. pg_statistic .......................................................................................................1053
41.28. pg_tablespace .....................................................................................................1055
41.29. pg_trigger............................................................................................................1056
41.30. pg_type ..................................................................................................................1057
41.31. System Views ..........................................................................................................1063
41.32. pg_indexes............................................................................................................1064
41.33. pg_locks ................................................................................................................1065
41.34. pg_rules ................................................................................................................1066
41.35. pg_settings .........................................................................................................1066
41.36. pg_stats ................................................................................................................1067
41.37. pg_tables ..............................................................................................................1069
41.38. pg_user ..................................................................................................................1070
41.39. pg_views ................................................................................................................1071
42. Frontend/Backend Protocol....................................................................................................1072
42.1. Overview ...................................................................................................................1072
42.1.1. Messaging Overview.....................................................................................1072
42.1.2. Extended Query Overview............................................................................1073
42.1.3. Formats and Format Codes ...........................................................................1073
42.2. Message Flow ............................................................................................................1074
42.2.1. Start-Up.........................................................................................................1074
42.2.2. Simple Query ................................................................................................1076
42.2.3. Extended Query ............................................................................................1077
42.2.4. Function Call.................................................................................................1080
42.2.5. COPY Operations .........................................................................................1081
42.2.6. Asynchronous Operations.............................................................................1082
42.2.7. Cancelling Requests in Progress...................................................................1082
42.2.8. Termination ...................................................................................................1083
42.2.9. SSL Session Encryption................................................................................1083
42.3. Message Data Types ..................................................................................................1084
42.4. Message Formats .......................................................................................................1085
42.5. Error and Notice Message Fields ..............................................................................1101
42.6. Summary of Changes since Protocol 2.0...................................................................1103
43. PostgreSQL Coding Conventions ..........................................................................................1105
43.1. Formatting .................................................................................................................1105
43.2. Reporting Errors Within the Server...........................................................................1105
43.3. Error Message Style Guide........................................................................................1107
43.3.1. What goes where...........................................................................................1108
43.3.2. Formatting.....................................................................................................1108
43.3.3. Quotation marks............................................................................................1108
43.3.4. Use of quotes.................................................................................................1109
43.3.5. Grammar and punctuation.............................................................................1109
43.3.6. Upper case vs. lower case .............................................................................1109
43.3.7. Avoid passive voice.......................................................................................1109
43.3.8. Present vs past tense......................................................................................1109
43.3.9. Type of the object..........................................................................................1110
43.3.10. Brackets.......................................................................................................1110
43.3.11. Assembling error messages.........................................................................1110
43.3.12. Reasons for errors .......................................................................................1110
43.3.13. Function names ...........................................................................................1111
43.3.14. Tricky words to avoid .................................................................................1111
43.3.15. Proper spelling ............................................................................................1111
43.3.16. Localization.................................................................................................1112
44. Native Language Support.......................................................................................................1113
44.1. For the Translator ......................................................................................................1113
44.1.1. Requirements ................................................................................................1113
44.1.2. Concepts........................................................................................................1113
44.1.3. Creating and maintaining message catalogs .................................................1114
44.1.4. Editing the PO files .......................................................................................1115
44.2. For the Programmer...................................................................................................1116
44.2.1. Mechanics .....................................................................................................1116
44.2.2. Message-writing guidelines ..........................................................................1117
45. Writing A Procedural Language Handler ..............................................................................1119
46. Genetic Query Optimizer .......................................................................................................1121
46.1. Query Handling as a Complex Optimization Problem..............................................1121
46.2. Genetic Algorithms ...................................................................................................1121
46.3. Genetic Query Optimization (GEQO) in PostgreSQL ..............................................1122
46.3.1. Future Implementation Tasks for PostgreSQL GEQO .................................1123
46.4. Further Reading .........................................................................................................1123
47. Index Cost Estimation Functions ...........................................................................................1125
48. GiST Indexes..........................................................................................................................1128
48.1. Introduction ...............................................................................................................1128
48.2. Extensibility...............................................................................................................1128
48.3. Implementation..........................................................................................................1128
48.4. Limitations.................................................................................................................1129
48.5. Examples ...................................................................................................................1129
49. Database Physical Storage .....................................................................................................1131
49.1. Database File Layout.................................................................................................1131
49.2. TOAST ......................................................................................................................1132
49.3. Database Page Layout ...............................................................................................1134
50. BKI Backend Interface...........................................................................................................1137
50.1. BKI File Format ........................................................................................................1137
50.2. BKI Commands .........................................................................................................1137
50.3. Example.....................................................................................................................1138
VIII. Appendixes..................................................................................................................................1139
A. PostgreSQL Error Codes.........................................................................................................1140
B. Date/Time Support ..................................................................................................................1147
B.1. Date/Time Input Interpretation ...................................................................................1147
B.2. Date/Time Key Words.................................................................................................1148
B.3. History of Units ..........................................................................................................1164
C. SQL Key Words.......................................................................................................................1166
D. SQL Conformance ..................................................................................................................1186
D.1. Supported Features .....................................................................................................1187
D.2. Unsupported Features .................................................................................................1198
E. Release Notes ..........................................................................................................................1206
E.1. Release 8.0 ..................................................................................................................1206
E.1.1. Overview ........................................................................................................1206
E.1.2. Migration to version 8.0 .................................................................................1207
E.1.3. Deprecated Features .......................................................................................1208
E.1.4. Changes ..........................................................................................................1209
E.1.4.1. Performance Improvements ...............................................................1209
E.1.4.2. Server Changes ..................................................................................1211
E.1.4.3. Query Changes...................................................................................1213
E.1.4.4. Object Manipulation Changes ...........................................................1214
E.1.4.5. Utility Command Changes.................................................................1215
E.1.4.6. Data Type and Function Changes ......................................................1216
E.1.4.7. Server-Side Language Changes .........................................................1218
E.1.4.8. psql Changes ......................................................................................1219
E.1.4.9. pg_dump Changes..............................................................................1220
E.1.4.10. libpq Changes ..................................................................................1221
E.1.4.11. Source Code Changes ......................................................................1221
E.1.4.12. Contrib Changes ..............................................................................1222
E.2. Release 7.4.6 ...............................................................................................................1223
E.2.1. Migration to version 7.4.6 ..............................................................................1223
E.2.2. Changes ..........................................................................................................1223
E.3. Release 7.4.5 ...............................................................................................................1224
E.3.1. Migration to version 7.4.5 ..............................................................................1224
E.3.2. Changes ..........................................................................................................1224
E.4. Release 7.4.4 ...............................................................................................................1225
E.4.1. Migration to version 7.4.4 ..............................................................................1225
E.4.2. Changes ..........................................................................................................1225
E.5. Release 7.4.3 ...............................................................................................................1225
E.5.1. Migration to version 7.4.3 ..............................................................................1226
E.5.2. Changes ..........................................................................................................1226
E.6. Release 7.4.2 ...............................................................................................................1226
E.6.1. Migration to version 7.4.2 ..............................................................................1227
E.6.2. Changes ..........................................................................................................1228
E.7. Release 7.4.1 ...............................................................................................................1229
E.7.1. Migration to version 7.4.1 ..............................................................................1229
E.7.2. Changes ..........................................................................................................1229
E.8. Release 7.4 ..................................................................................................................1230
E.8.1. Overview ........................................................................................................1230
E.8.2. Migration to version 7.4 .................................................................................1232
E.8.3. Changes ..........................................................................................................1233
E.8.3.1. Server Operation Changes .................................................................1233
E.8.3.2. Performance Improvements ...............................................................1235
E.8.3.3. Server Configuration Changes ...........................................................1236
E.8.3.4. Query Changes...................................................................................1238
E.8.3.5. Object Manipulation Changes ...........................................................1238
E.8.3.6. Utility Command Changes.................................................................1240
E.8.3.7. Data Type and Function Changes ......................................................1241
E.8.3.8. Server-Side Language Changes .........................................................1243
E.8.3.9. psql Changes ......................................................................................1244
E.8.3.10. pg_dump Changes............................................................................1244
E.8.3.11. libpq Changes ..................................................................................1245
E.8.3.12. JDBC Changes.................................................................................1246
E.8.3.13. Miscellaneous Interface Changes ....................................................1246
E.8.3.14. Source Code Changes ......................................................................1246
E.8.3.15. Contrib Changes ..............................................................................1247
E.9. Release 7.3.8 ...............................................................................................................1248
E.9.1. Migration to version 7.3.8 ..............................................................................1248
E.9.2. Changes ..........................................................................................................1248
E.10. Release 7.3.7 .............................................................................................................1249
E.10.1. Migration to version 7.3.7 ............................................................................1249
E.10.2. Changes ........................................................................................................1249
E.11. Release 7.3.6 .............................................................................................................1249
E.11.1. Migration to version 7.3.6 ............................................................................1249
E.11.2. Changes ........................................................................................................1250
E.12. Release 7.3.5 .............................................................................................................1250
E.12.1. Migration to version 7.3.5 ............................................................................1250
E.12.2. Changes ........................................................................................................1251
E.13. Release 7.3.4 .............................................................................................................1251
E.13.1. Migration to version 7.3.4 ............................................................................1251
E.13.2. Changes ........................................................................................................1252
E.14. Release 7.3.3 .............................................................................................................1252
E.14.1. Migration to version 7.3.3 ............................................................................1252
E.14.2. Changes ........................................................................................................1252
E.15. Release 7.3.2 .............................................................................................................1254
E.15.1. Migration to version 7.3.2 ............................................................................1254
E.15.2. Changes ........................................................................................................1255
E.16. Release 7.3.1 .............................................................................................................1256
E.16.1. Migration to version 7.3.1 ............................................................................1256
E.16.2. Changes ........................................................................................................1256
E.17. Release 7.3 ................................................................................................................1256
E.17.1. Overview ......................................................................................................1256
E.17.2. Migration to version 7.3 ...............................................................................1257
E.17.3. Changes ........................................................................................................1258
E.17.3.1. Server Operation ..............................................................................1258
E.17.3.2. Performance .....................................................................................1258
E.17.3.3. Privileges..........................................................................................1259
E.17.3.4. Server Configuration........................................................................1259
E.17.3.5. Queries .............................................................................................1260
E.17.3.6. Object Manipulation ........................................................................1261
E.17.3.7. Utility Commands............................................................................1261
E.17.3.8. Data Types and Functions................................................................1263
E.17.3.9. Internationalization ..........................................................................1264
E.17.3.10. Server-side Languages ...................................................................1264
E.17.3.11. psql.................................................................................................1265
E.17.3.12. libpq ...............................................................................................1265
E.17.3.13. JDBC..............................................................................................1265
E.17.3.14. Miscellaneous Interfaces................................................................1266
E.17.3.15. Source Code ...................................................................................1266
E.17.3.16. Contrib ...........................................................................................1267
E.18. Release 7.2.6 .............................................................................................................1268
E.18.1. Migration to version 7.2.6 ............................................................................1268
E.18.2. Changes ........................................................................................................1268
E.19. Release 7.2.5 .............................................................................................................1269
E.19.1. Migration to version 7.2.5 ............................................................................1269
E.19.2. Changes ........................................................................................................1269
E.20. Release 7.2.4 .............................................................................................................1270
E.20.1. Migration to version 7.2.4 ............................................................................1270
E.20.2. Changes ........................................................................................................1270
E.21. Release 7.2.3 .............................................................................................................1270
E.21.1. Migration to version 7.2.3 ............................................................................1270
E.21.2. Changes ........................................................................................................1271
E.22. Release 7.2.2 .............................................................................................................1271
E.22.1. Migration to version 7.2.2 ............................................................................1271
E.22.2. Changes ........................................................................................................1271
E.23. Release 7.2.1 .............................................................................................................1272
E.23.1. Migration to version 7.2.1 ............................................................................1272
E.23.2. Changes ........................................................................................................1272
E.24. Release 7.2 ................................................................................................................1272
E.24.1. Overview ......................................................................................................1273
E.24.2. Migration to version 7.2 ...............................................................................1273
E.24.3. Changes ........................................................................................................1274
E.24.3.1. Server Operation ..............................................................................1274
E.24.3.2. Performance .....................................................................................1275
E.24.3.3. Privileges..........................................................................................1275
E.24.3.4. Client Authentication .......................................................................1275
E.24.3.5. Server Configuration........................................................................1275
E.24.3.6. Queries .............................................................................................1276
E.24.3.7. Schema Manipulation ......................................................................1276
E.24.3.8. Utility Commands............................................................................1277
E.24.3.9. Data Types and Functions................................................................1277
E.24.3.10. Internationalization ........................................................................1278
E.24.3.11. PL/pgSQL ......................................................................................1279
E.24.3.12. PL/Perl ...........................................................................................1279
E.24.3.13. PL/Tcl ............................................................................................1279
E.24.3.14. PL/Python ......................................................................................1279
E.24.3.15. psql.................................................................................................1279
E.24.3.16. libpq ...............................................................................................1280
E.24.3.17. JDBC..............................................................................................1280
E.24.3.18. ODBC ............................................................................................1281
E.24.3.19. ECPG .............................................................................................1281
E.24.3.20. Misc. Interfaces..............................................................................1281
E.24.3.21. Build and Install.............................................................................1282
E.24.3.22. Source Code ...................................................................................1282
E.24.3.23. Contrib ...........................................................................................1283
E.25. Release 7.1.3 .............................................................................................................1283
E.25.1. Migration to version 7.1.3 ............................................................................1283
E.25.2. Changes ........................................................................................................1283
E.26. Release 7.1.2 .............................................................................................................1284
E.26.1. Migration to version 7.1.2 ............................................................................1284
E.26.2. Changes ........................................................................................................1284
E.27. Release 7.1.1 .............................................................................................................1284
E.27.1. Migration to version 7.1.1 ............................................................................1284
E.27.2. Changes ........................................................................................................1284
E.28. Release 7.1 ................................................................................................................1285
E.28.1. Migration to version 7.1 ...............................................................................1285
E.28.2. Changes ........................................................................................................1286
E.29. Release 7.0.3 .............................................................................................................1289
E.29.1. Migration to version 7.0.3 ............................................................................1289
E.29.2. Changes ........................................................................................................1289
E.30. Release 7.0.2 .............................................................................................................1290
E.30.1. Migration to version 7.0.2 ............................................................................1291
E.30.2. Changes ........................................................................................................1291
E.31. Release 7.0.1 .............................................................................................................1291
E.31.1. Migration to version 7.0.1 ............................................................................1291
E.31.2. Changes ........................................................................................................1291
E.32. Release 7.0 ................................................................................................................1292
E.32.1. Migration to version 7.0 ...............................................................................1292
E.32.2. Changes ........................................................................................................1293
E.33. Release 6.5.3 .............................................................................................................1299
E.33.1. Migration to version 6.5.3 ............................................................................1299
E.33.2. Changes ........................................................................................................1299
E.34. Release 6.5.2 .............................................................................................................1299
E.34.1. Migration to version 6.5.2 ............................................................................1300
E.34.2. Changes ........................................................................................................1300
E.35. Release 6.5.1 .............................................................................................................1300
E.35.1. Migration to version 6.5.1 ............................................................................1301
E.35.2. Changes ........................................................................................................1301
E.36. Release 6.5 ................................................................................................................1301
E.36.1. Migration to version 6.5 ...............................................................................1302
E.36.1.1. Multiversion Concurrency Control ..................................................1303
E.36.2. Changes ........................................................................................................1303
E.37. Release 6.4.2 .............................................................................................................1306
E.37.1. Migration to version 6.4.2 ............................................................................1307
E.37.2. Changes ........................................................................................................1307
E.38. Release 6.4.1 .............................................................................................................1307
E.38.1. Migration to version 6.4.1 ............................................................................1307
E.38.2. Changes ........................................................................................................1307
E.39. Release 6.4 ................................................................................................................1308
E.39.1. Migration to version 6.4 ...............................................................................1309
E.39.2. Changes ........................................................................................................1309
E.40. Release 6.3.2 .............................................................................................................1312
E.40.1. Changes ........................................................................................................1313
E.41. Release 6.3.1 .............................................................................................................1313
E.41.1. Changes ........................................................................................................1314
E.42. Release 6.3 ................................................................................................................1315
E.42.1. Migration to version 6.3 ...............................................................................1316
E.42.2. Changes ........................................................................................................1316
E.43. Release 6.2.1 .............................................................................................................1319
E.43.1. Migration from version 6.2 to version 6.2.1.................................................1320
E.43.2. Changes ........................................................................................................1320
E.44. Release 6.2 ................................................................................................................1320
E.44.1. Migration from version 6.1 to version 6.2....................................................1320
E.44.2. Migration from version 1.x to version 6.2 ...................................................1321
E.44.3. Changes ........................................................................................................1321
E.45. Release 6.1.1 .............................................................................................................1323
E.45.1. Migration from version 6.1 to version 6.1.1.................................................1323
E.45.2. Changes ........................................................................................................1323
E.46. Release 6.1 ................................................................................................................1324
E.46.1. Migration to version 6.1 ...............................................................................1324
E.46.2. Changes ........................................................................................................1324
E.47. Release 6.0 ................................................................................................................1326
E.47.1. Migration from version 1.09 to version 6.0..................................................1326
E.47.2. Migration from pre-1.09 to version 6.0 ........................................................1327
E.47.3. Changes ........................................................................................................1327
E.48. Release 1.09 ..............................................................................................................1329
E.49. Release 1.02 ..............................................................................................................1329
E.49.1. Migration from version 1.02 to version 1.02.1.............................................1329
E.49.2. Dump/Reload Procedure ..............................................................................1330
E.49.3. Changes ........................................................................................................1330
E.50. Release 1.01 ..............................................................................................................1331
E.50.1. Migration from version 1.0 to version 1.01..................................................1331
E.50.2. Changes ........................................................................................................1332
E.51. Release 1.0 ................................................................................................................1333
E.51.1. Changes ........................................................................................................1333
E.52. Postgres95 Release 0.03............................................................................................1334
E.52.1. Changes ........................................................................................................1334
E.53. Postgres95 Release 0.02............................................................................................1336
E.53.1. Changes ........................................................................................................1337
E.54. Postgres95 Release 0.01............................................................................................1337
F. The CVS Repository ................................................................................................................1339
F.1. Getting The Source Via Anonymous CVS ..................................................................1339
F.2. CVS Tree Organization ...............................................................................................1340
F.3. Getting The Source Via CVSup...................................................................................1342
F.3.1. Preparing A CVSup Client System.................................................................1342
F.3.2. Running a CVSup Client ................................................................................1342
F.3.3. Installing CVSup.............................................................................................1344
F.3.4. Installation from Sources ................................................................................1345
G. Documentation ........................................................................................................................1348
G.1. DocBook.....................................................................................................................1348
G.2. Tool Sets .....................................................................................................................1348
G.2.1. Linux RPM Installation..................................................................................1349
G.2.2. FreeBSD Installation......................................................................................1349
G.2.3. Debian Packages ............................................................................................1350
G.2.4. Manual Installation from Source....................................................................1350
G.2.4.1. Installing OpenJade ...........................................................................1350
G.2.4.2. Installing the DocBook DTD Kit ......................................................1351
G.2.4.3. Installing the DocBook DSSSL Style Sheets ....................................1352
G.2.4.4. Installing JadeTeX .............................................................................1352
G.2.5. Detection by configure ...............................................................................1352
G.3. Building The Documentation .....................................................................................1353
G.3.1. HTML ............................................................................................................1353
G.3.2. Manpages .......................................................................................................1353
G.3.3. Print Output via JadeTex................................................................................1354
G.3.4. Print Output via RTF......................................................................................1354
G.3.5. Plain Text Files...............................................................................................1356
G.3.6. Syntax Check .................................................................................................1356
G.4. Documentation Authoring ..........................................................................................1356
G.4.1. Emacs/PSGML...............................................................................................1356
G.4.2. Other Emacs modes .......................................................................................1358
G.5. Style Guide .................................................................................................................1358
G.5.1. Reference Pages .............................................................................................1358
H. External Projects .....................................................................................................................1361
H.1. Externally Developed Interfaces.................................................................................1361
H.2. Extensions...................................................................................................................1362
Bibliography .........................................................................................................................................1363
Index......................................................................................................................................................1365
List of Tables
4-1. Operator Precedence (decreasing)......................................................................................................30
8-1. Data Types ..........................................................................................................................................81
8-2. Numeric Types....................................................................................................................................82
8-3. Monetary Types ..................................................................................................................................86
8-4. Character Types ..................................................................................................................................86
8-5. Special Character Types .....................................................................................................................88
8-6. Binary Data Types ..............................................................................................................................88
8-7. bytea Literal Escaped Octets ............................................................................................................89
8-8. bytea Output Escaped Octets............................................................................................................90
8-9. Date/Time Types.................................................................................................................................90
8-10. Date Input .........................................................................................................................................92
8-11. Time Input ........................................................................................................................................92
8-12. Time Zone Input ...............................................................................................................................93
8-13. Special Date/Time Inputs .................................................................................................................94
8-14. Date/Time Output Styles ..................................................................................................................95
8-15. Date Order Conventions ...................................................................................................................95
8-16. Geometric Types...............................................................................................................................98
8-17. Network Address Types .................................................................................................................100
8-18. cidr Type Input Examples ............................................................................................................101
8-19. Object Identifier Types ...................................................................................................................116
8-20. Pseudo-Types..................................................................................................................................117
9-1. Comparison Operators......................................................................................................................119
9-2. Mathematical Operators ...................................................................................................................121
9-3. Mathematical Functions ...................................................................................................................122
9-4. Trigonometric Functions ..................................................................................................................123
9-5. SQL String Functions and Operators ...............................................................................................124
9-6. Other String Functions .....................................................................................................................125
9-7. Built-in Conversions.........................................................................................................................129
9-8. SQL Binary String Functions and Operators ...................................................................................132
9-9. Other Binary String Functions .........................................................................................................133
9-10. Bit String Operators........................................................................................................................134
9-11. Regular Expression Match Operators.............................................................................................137
9-12. Regular Expression Atoms .............................................................................................................138
9-13. Regular Expression Quantifiers......................................................................................................139
9-14. Regular Expression Constraints .....................................................................................................140
9-15. Regular Expression Character-Entry Escapes ................................................................................142
9-16. Regular Expression Class-Shorthand Escapes ...............................................................................143
9-17. Regular Expression Constraint Escapes .........................................................................................143
9-18. Regular Expression Back References.............................................................................................143
9-19. ARE Embedded-Option Letters .....................................................................................................144
9-20. Formatting Functions .....................................................................................................................147
9-21. Template Patterns for Date/Time Formatting .................................................................................148
9-22. Template Pattern Modifiers for Date/Time Formatting ..................................................................150
9-23. Template Patterns for Numeric Formatting ....................................................................................151
9-24. to_char Examples ........................................................................................................................152
9-25. Date/Time Operators ......................................................................................................................153
9-26. Date/Time Functions ......................................................................................................................154
9-27. AT TIME ZONE Variants ................................................................................................................160
9-28. Geometric Operators ......................................................................................................................162
9-29. Geometric Functions ......................................................................................................................163
9-30. Geometric Type Conversion Functions ..........................................................................................164
9-31. cidr and inet Operators ..............................................................................................................166
9-32. cidr and inet Functions ..............................................................................................................166
9-33. macaddr Functions ........................................................................................................................167
9-34. Sequence Functions........................................................................................................................167
9-35. array Operators ............................................................................................................................171
9-36. array Functions ............................................................................................................................172
9-37. Aggregate Functions.......................................................................................................................173
9-38. Series Generating Functions...........................................................................................................180
9-39. Session Information Functions.......................................................................................................181
9-40. Access Privilege Inquiry Functions................................................................................................182
9-41. Schema Visibility Inquiry Functions..............................................................................................184
9-42. System Catalog Information Functions..........................................................................................184
9-43. Comment Information Functions ...................................................................................................186
9-44. Configuration Settings Functions ...................................................................................................186
9-45. Backend Signalling Functions........................................................................................................187
9-46. Backup Control Functions..............................................................................................................187
12-1. SQL Transaction Isolation Levels ..................................................................................................208
16-1. Short option key .............................................................................................................................277
16-2. System V IPC parameters...............................................................................................................278
20-1. Server Character Sets .....................................................................................................................309
20-2. Client/Server Character Set Conversions .......................................................................................312
23-1. Standard Statistics Views ...............................................................................................................336
23-2. Statistics Access Functions ............................................................................................................337
30-1. information_schema_catalog_name Columns......................................................................432
30-2. applicable_roles Columns ......................................................................................................432
30-3. check_constraints Columns....................................................................................................432
30-4. column_domain_usage Columns ...............................................................................................433
30-5. column_privileges Columns....................................................................................................433
30-6. column_udt_usage Columns ......................................................................................................434
30-7. columns Columns .........................................................................................................................435
30-8. constraint_column_usage Columns.......................................................................................439
30-9. constraint_table_usage Columns .........................................................................................439
30-10. data_type_privileges Columns ...........................................................................................440
30-11. domain_constraints Columns................................................................................................441
30-12. domain_udt_usage Columns ....................................................................................................442
30-13. domains Columns .......................................................................................................................442
30-14. element_types Columns ..........................................................................................................445
30-15. enabled_roles Columns ..........................................................................................................447
30-16. key_column_usage Columns ....................................................................................................448
30-17. parameters Columns .................................................................................................................448
30-18. referential_constraints Columns.....................................................................................451
30-19. role_column_grants Columns................................................................................................452
30-20. role_routine_grants Columns .............................................................................................452
30-21. role_table_grants Columns..................................................................................................453
30-22. role_usage_grants Columns..................................................................................................454
30-23. routine_privileges Columns................................................................................................454
30-24. routines Columns .....................................................................................................................455
30-25. schemata Columns .....................................................................................................................459
30-26. sql_features Columns.............................................................................................................460
30-27. sql_implementation_info Columns.....................................................................................461
30-28. sql_languages Columns ..........................................................................................................461
30-29. sql_packages Columns.............................................................................................................462
30-30. sql_sizing Columns .................................................................................................................462
30-31. sql_sizing_profiles Columns .............................................................................................463
30-32. table_constraints Columns..................................................................................................463
30-33. table_privileges Columns ....................................................................................................464
30-34. tables Columns..........................................................................................................................465
30-35. triggers Columns .....................................................................................................................466
30-36. usage_privileges Columns ....................................................................................................467
30-37. view_column_usage Columns..................................................................................................467
30-38. view_table_usage Columns ....................................................................................................468
30-39. views Columns............................................................................................................................469
31-1. Equivalent C Types for Built-In SQL Types ..................................................................................488
31-2. B-tree Strategies .............................................................................................................................519
31-3. Hash Strategies ...............................................................................................................................519
31-4. R-tree Strategies .............................................................................................................................520
31-5. B-tree Support Functions................................................................................................................520
31-6. Hash Support Functions .................................................................................................................521
31-7. R-tree Support Functions................................................................................................................521
31-8. GiST Support Functions.................................................................................................................521
41-1. System Catalogs ...........................................................................................................................1026
41-2. pg_aggregate Columns.............................................................................................................1027
41-3. pg_am Columns............................................................................................................................1028
41-4. pg_amop Columns .......................................................................................................................1029
41-5. pg_amproc Columns ...................................................................................................................1029
41-6. pg_attrdef Columns .................................................................................................................1030
41-7. pg_attribute Columns.............................................................................................................1030
41-8. pg_cast Columns .......................................................................................................................1033
41-9. pg_class Columns .....................................................................................................................1034
41-10. pg_constraint Columns ........................................................................................................1037
41-11. pg_conversion Columns ........................................................................................................1038
41-12. pg_database Columns.............................................................................................................1039
41-13. pg_depend Columns .................................................................................................................1041
41-14. pg_description Columns ......................................................................................................1042
41-15. pg_group Columns ...................................................................................................................1043
41-16. pg_index Columns ...................................................................................................................1043
41-17. pg_inherits Columns.............................................................................................................1045
41-18. pg_language Columns.............................................................................................................1045
41-19. pg_largeobject Columns ......................................................................................................1046
41-20. pg_listener Columns.............................................................................................................1047
41-21. pg_namespace Columns...........................................................................................................1048
41-22. pg_opclass Columns ...............................................................................................................1048
41-23. pg_operator Columns.............................................................................................................1049
41-24. pg_proc Columns .....................................................................................................................1050
41-25. pg_rewrite Columns ...............................................................................................................1052
41-26. pg_shadow Columns .................................................................................................................1053
41-27. pg_statistic Columns...........................................................................................................1054
41-28. pg_tablespace Columns ........................................................................................................1055
41-29. pg_trigger Columns ...............................................................................................................1056
41-30. pg_type Columns .....................................................................................................................1057
41-31. System Views .............................................................................................................................1064
41-32. pg_indexes Columns ...............................................................................................................1064
41-33. pg_locks Columns ...................................................................................................................1065
41-34. pg_rules Columns ...................................................................................................................1066
41-35. pg_settings Columns.............................................................................................................1066
41-36. pg_stats Columns ...................................................................................................................1067
41-37. pg_tables Columns .................................................................................................................1070
41-38. pg_user Columns .....................................................................................................................1070
41-39. pg_views Columns ...................................................................................................................1071
49-1. Contents of PGDATA .....................................................................................................................1131
49-2. Overall Page Layout .....................................................................................................................1134
49-3. PageHeaderData Layout...............................................................................................................1135
49-4. HeapTupleHeaderData Layout .....................................................................................................1136
A-1. PostgreSQL Error Codes ...............................................................................................................1140
B-1. Month Names.................................................................................................................................1148
B-2. Day of the Week Names ................................................................................................................1148
B-3. Date/Time Field Modifiers.............................................................................................................1149
B-4. Time Zone Abbreviations for Input ...............................................................................................1149
B-5. Australian Time Zone Abbreviations for Input .............................................................................1152
B-6. Time Zone Names for Setting timezone .....................................................................................1153
C-1. SQL Key Words.............................................................................................................................1166
List of Figures
46-1. Structured Diagram of a Genetic Algorithm ................................................................................1122
List of Examples
8-1. Using the character types ...................................................................................................................88
8-2. Using the boolean type.....................................................................................................................97
8-3. Using the bit string types..................................................................................................................103
10-1. Exponentiation Operator Type Resolution .....................................................................................192
10-2. String Concatenation Operator Type Resolution............................................................................192
10-3. Absolute-Value and Negation Operator Type Resolution ..............................................................192
10-4. Rounding Function Argument Type Resolution.............................................................................194
10-5. Substring Function Type Resolution ..............................................................................................195
10-6. character Storage Type Conversion ...........................................................................................196
10-7. Type Resolution with Underspecified Types in a Union ................................................................197
10-8. Type Resolution in a Simple Union................................................................................................197
10-9. Type Resolution in a Transposed Union.........................................................................................197
11-1. Setting up a Partial Index to Exclude Common Values..................................................................204
11-2. Setting up a Partial Index to Exclude Uninteresting Values...........................................................205
11-3. Setting up a Partial Unique Index...................................................................................................206
19-1. Example pg_hba.conf entries .....................................................................................................300
19-2. An example pg_ident.conf file .................................................................................................305
27-1. libpq Example Program 1...............................................................................................................394
27-2. libpq Example Program 2...............................................................................................................396
27-3. libpq Example Program 3...............................................................................................................399
28-1. Large Objects with libpq Example Program ..................................................................................407
34-1. Manual Installation of PL/pgSQL ..................................................................................................560
35-1. A PL/pgSQL Trigger Procedure.....................................................................................................590
35-2. A PL/pgSQL Trigger Procedure For Auditing ...............................................................................591
35-3. A PL/pgSQL Trigger Procedure For Maintaining A Summary Table ...........................................592
35-4. Porting a Simple Function from PL/SQL to PL/pgSQL ................................................................595
35-5. Porting a Function that Creates Another Function from PL/SQL to PL/pgSQL ...........................595
35-6. Porting a Procedure With String Manipulation and OUT Parameters from PL/SQL to PL/pgSQL ........597
35-7. Porting a Procedure from PL/SQL to PL/pgSQL...........................................................................599
Preface
This book is the official documentation of PostgreSQL. It is being written by the PostgreSQL developers
and other volunteers in parallel to the development of the PostgreSQL software. It describes all the
functionality that the current version of PostgreSQL officially supports.
To make the large amount of information about PostgreSQL manageable, this book has been organized
in several parts. Each part is targeted at a different class of users, or at users in different stages of their
PostgreSQL experience:
1. What is PostgreSQL?
PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES,
Version 4.2 [1], developed at the University of California at Berkeley Computer Science Department.
POSTGRES pioneered many concepts that only became available in some commercial database systems
much later.
PostgreSQL is an open-source descendant of this original Berkeley code. It supports a large part of the
SQL:2003 standard and offers many modern features:
• complex queries
• foreign keys
• triggers
• views
• transactional integrity
• multiversion concurrency control
Also, PostgreSQL can be extended by the user in many ways, for example by adding new
• data types
• functions
[1] http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/postgres.html
• operators
• aggregate functions
• index methods
• procedural languages
And because of the liberal license, PostgreSQL can be used, modified, and distributed by everyone free
of charge for any purpose, be it private, commercial, or academic.
2.2. Postgres95
In 1994, Andrew Yu and Jolly Chen added a SQL language interpreter to POSTGRES. Under a new
name, Postgres95 was subsequently released to the web to find its own way in the world as an open-
source descendant of the original POSTGRES Berkeley code.
The Postgres95 code was completely ANSI C and trimmed in size by 25%. Many internal changes improved
performance and maintainability. Postgres95 release 1.0.x ran about 30-50% faster on the Wisconsin
Benchmark compared to POSTGRES, Version 4.2. Apart from bug fixes, the following were the major
enhancements:
• The query language PostQUEL was replaced with SQL (implemented in the server). Subqueries were
not supported until PostgreSQL (see below), but they could be imitated in Postgres95 with user-defined
SQL functions. Aggregate functions were re-implemented. Support for the GROUP BY query clause was
also added.
• A new program (psql) was provided for interactive SQL queries, which used GNU Readline. This
largely superseded the old monitor program.
• A new front-end library, libpgtcl, supported Tcl-based clients. A sample shell, pgtclsh, provided
new Tcl commands to interface Tcl programs with the Postgres95 server.
• The large-object interface was overhauled. The inversion large objects were the only mechanism for
storing large objects. (The inversion file system was removed.)
• The instance-level rule system was removed. Rules were still available as rewrite rules.
• A short tutorial introducing regular SQL features as well as those of Postgres95 was distributed with
the source code.
• GNU make (instead of BSD make) was used for the build. Also, Postgres95 could be compiled with an
unpatched GCC (data alignment of doubles was fixed).
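The subquery workaround mentioned in the first item above can be sketched as follows. This is an
illustration only: the table and function names are invented, and the syntax shown is that of modern
PostgreSQL rather than Postgres95.

```sql
-- A query that would need a subquery, such as
--   SELECT name FROM emp WHERE salary = (SELECT max(salary) FROM emp);
-- could instead call a user-defined SQL function wrapping the inner query:
CREATE FUNCTION max_salary() RETURNS numeric AS
    'SELECT max(salary) FROM emp;'
    LANGUAGE SQL;

SELECT name FROM emp WHERE salary = max_salary();
```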
2.3. PostgreSQL
By 1996, it became clear that the name “Postgres95” would not stand the test of time. We chose a new
name, PostgreSQL, to reflect the relationship between the original POSTGRES and the more recent
versions with SQL capability. At the same time, we set the version numbering to start at 6.0, putting the
numbers back into the sequence originally begun by the Berkeley POSTGRES project.
The emphasis during development of Postgres95 was on identifying and understanding existing problems
in the server code. With PostgreSQL, the emphasis has shifted to augmenting features and capabilities,
although work continues in all areas.
Details about what has happened in PostgreSQL since then can be found in Appendix E.
3. Conventions
This book uses the following typographical conventions to mark certain portions of text: new terms,
foreign phrases, and other important passages are emphasized in italics. Everything that represents
input or output of the computer, in particular commands, program code, and screen output, is shown in a
monospaced font (example). Within such passages, italics (example) indicate placeholders; you must
insert an actual value instead of the placeholder. On occasion, parts of program code are emphasized in
bold face (example), if they have been added or changed since the preceding example.
The following conventions are used in the synopsis of a command: brackets ([ and ]) indicate optional
parts. (In the synopsis of a Tcl command, question marks (?) are used instead, as is usual in Tcl.) Braces
({ and }) and vertical lines (|) indicate that you must choose one alternative. Dots (...) mean that the
preceding element can be repeated.
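For example, under these conventions the synopsis of a hypothetical command (made up for illustration)
might read:

```
DROP THING name [, ...] { RESTRICT | CASCADE }
```

Here name is required and may be repeated, and exactly one of RESTRICT or CASCADE must be chosen.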
Where it enhances the clarity, SQL commands are preceded by the prompt =>, and shell commands are
preceded by the prompt $. Normally, prompts are not shown, though.
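Put together, a short session using both prompts might look like this (the database name is arbitrary):

```
$ psql mydb
=> SELECT 2 + 2;
```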
An administrator is generally a person who is in charge of installing and running the server. A user
could be anyone who is using, or wants to use, any part of the PostgreSQL system. These terms should
not be interpreted too narrowly; this book does not have fixed presumptions about system administration
procedures.
4. Further Information
Besides the documentation, that is, this book, there are other resources about PostgreSQL:
FAQs
The FAQ list contains continuously updated answers to frequently asked questions.
READMEs
README files are available for most contributed packages.
Web Site
The PostgreSQL web site [5] carries details on the latest release and other information to make your
work or play with PostgreSQL more productive.
Mailing Lists
The mailing lists are a good place to have your questions answered, to share experiences with other
users, and to contact the developers. Consult the PostgreSQL web site for details.
Yourself!
PostgreSQL is an open-source project. As such, it depends on the user community for ongoing support.
As you begin to use PostgreSQL, you will rely on others for help, either through the documentation
or through the mailing lists. Consider contributing your knowledge back. Read the mailing
lists and answer questions. If you learn something which is not in the documentation, write it up and
contribute it. If you add features to the code, contribute them.
[5] http://www.postgresql.org
• A program terminates with a fatal signal or an operating system error message that would point to a
problem in the program. (A counterexample might be a “disk full” message, since you have to fix that
yourself.)
• A program produces the wrong output for any given input.
• A program refuses to accept valid input (as defined in the documentation).
• A program accepts invalid input without a notice or error message. But keep in mind that your idea of
invalid input might be our idea of an extension or compatibility with traditional practice.
• PostgreSQL fails to compile, build, or install according to the instructions on supported platforms.
Here “program” refers to any executable, not only the backend server.
Being slow or resource-hogging is not necessarily a bug. Read the documentation or ask on one of the
mailing lists for help in tuning your applications. Failing to comply with the SQL standard is not necessarily
a bug either, unless compliance for the specific feature is explicitly claimed.
Before you continue, check on the TODO list and in the FAQ to see if your bug is already known. If you
cannot decode the information on the TODO list, report your problem. The least we can do is make the
TODO list clearer.
straightforward (you can probably copy and paste them from the screen) but all too often important details
are left out because someone thought they did not matter or that the report would be understood anyway.
The following items should be contained in every bug report:
• The exact sequence of steps from program start-up necessary to reproduce the problem. This should
be self-contained; it is not enough to send in a bare SELECT statement without the preceding CREATE
TABLE and INSERT statements, if the output should depend on the data in the tables. We do not have
the time to reverse-engineer your database schema, and if we are supposed to make up our own data we
would probably miss the problem.
The best format for a test case for SQL-related problems is a file that can be run through the psql
frontend that shows the problem. (Be sure to not have anything in your ~/.psqlrc start-up file.) An
easy start at this file is to use pg_dump to dump out the table declarations and data needed to set the
scene, then add the problem query. You are encouraged to minimize the size of your example, but this
is not absolutely necessary. If the bug is reproducible, we will find it either way.
If your application uses some other client interface, such as PHP, then please try to isolate the offending
queries. We will probably not set up a web server to reproduce your problem. In any case remember
to provide the exact input files; do not guess that the problem happens for “large files” or “midsize
databases”, etc. since this information is too inexact to be of use.
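The pg_dump approach described above might look like the following session. The database, table, and
file names are placeholders; adjust them to your own setup.

```
$ pg_dump --table=mytable mydb > testcase.sql
$ echo 'SELECT count(*) FROM mytable;' >> testcase.sql
```

The resulting file can then be replayed with psql -f testcase.sql in a fresh database to demonstrate
the problem.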
• The output you got. Please do not say that it “didn’t work” or “crashed”. If there is an error message,
show it, even if you do not understand it. If the program terminates with an operating system error,
say which. If nothing at all happens, say so. Even if the result of your test case is a program crash or
otherwise obvious it might not happen on our platform. The easiest thing is to copy the output from the
terminal, if possible.
Note: If you are reporting an error message, please obtain the most verbose form of the message.
In psql, say \set VERBOSITY verbose beforehand. If you are extracting the message from the
server log, set the run-time parameter log_error_verbosity to verbose so that all details are logged.
Note: In case of fatal errors, the error message reported by the client might not contain all the
information available. Please also look at the log output of the database server. If you do not keep
your server’s log output, this would be a good time to start doing so.
• The output you expected is very important to state. If you just write “This command gives me that
output.” or “This is not what I expected.”, we might run it ourselves, scan the output, and think it
looks OK and is exactly what we expected. We should not have to spend the time to decode the exact
semantics behind your commands. Especially refrain from merely saying that “This is not what SQL
says/Oracle does.” Digging out the correct behavior from SQL is not a fun undertaking, nor do we all
know how all the other relational databases out there behave. (If your problem is a program crash, you
can obviously omit this item.)
• Any command line options and other start-up options, including any relevant environment variables or
configuration files that you changed from the default. Again, please provide exact information. If you
are using a prepackaged distribution that starts the database server at boot time, you should try to find
out how that is done.
• Anything you did at all differently from the installation instructions.
• The PostgreSQL version. You can run the command SELECT version(); to find out the version of
the server you are connected to. Most executable programs also support a --version option; at least
postmaster --version and psql --version should work. If the function or the options do not
exist then your version is more than old enough to warrant an upgrade. If you run a prepackaged version,
such as RPMs, say so, including any subversion the package may have. If you are talking about a CVS
snapshot, mention that, including its date and time.
If your version is older than 8.0.0 we will almost certainly tell you to upgrade. There are many bug
fixes and improvements in each new release, so it is quite possible that a bug you have encountered in
an older release of PostgreSQL has already been fixed. We can only provide limited support for sites
using older releases of PostgreSQL; if you require more than we can provide, consider acquiring a
commercial support contract.
• Platform information. This includes the kernel name and version, C library, processor, memory infor-
mation, and so on. In most cases it is sufficient to report the vendor and version, but do not assume
everyone knows what exactly “Debian” contains or that everyone runs on Pentiums. If you have instal-
lation problems then information about the toolchain on your machine (compiler, make, and so on) is
also necessary.
Do not be afraid if your bug report becomes rather lengthy. That is a fact of life. It is better to report
everything the first time than to have us squeeze the facts out of you. On the other hand, if your input
files are huge, it is fair to ask first whether somebody is interested in looking into it.
Do not spend all your time trying to figure out which changes in the input make the problem go away. This
will probably not help solve it. If it turns out that the bug cannot be fixed right away, you will still have time
to find and share your work-around. Also, once again, do not waste your time guessing why the bug exists.
We will find that out soon enough.
When writing a bug report, please avoid confusing terminology. The software package in total is called
“PostgreSQL”, sometimes “Postgres” for short. If you are specifically talking about the backend server,
mention that; do not just say “PostgreSQL crashes”. A crash of a single backend server process is quite
different from a crash of the parent “postmaster” process; please don’t say “the postmaster crashed” when
you mean a single backend process went down, nor vice versa. Also, client programs such as the interactive
frontend “psql” are completely separate from the backend. Please try to be specific about whether the
problem is on the client or server side.
Note: Due to the unfortunate amount of spam going around, all of the above email addresses are
closed mailing lists. That is, you need to be subscribed to a list to be allowed to post on it. (You need
not be subscribed to use the bug-report web form, however.) If you would like to send mail but do not
want to receive list traffic, you can subscribe and set your subscription option to nomail. For more
information send mail to <[email protected]> with the single word help in the body of the
message.
I. Tutorial
Welcome to the PostgreSQL Tutorial. The following few chapters are intended to give a simple introduc-
tion to PostgreSQL, relational database concepts, and the SQL language to those who are new to any one
of these aspects. We only assume some general knowledge about how to use computers. No particular
Unix or programming experience is required. This part is mainly intended to give you some hands-on
experience with important aspects of the PostgreSQL system. It makes no attempt to be a complete or
thorough treatment of the topics it covers.
After you have worked through this tutorial you might want to move on to reading Part II to gain a more
formal knowledge of the SQL language, or Part IV for information about developing applications for
PostgreSQL. Those who set up and manage their own server should also read Part III.
Chapter 1. Getting Started
1.1. Installation
Before you can use PostgreSQL you need to install it, of course. It is possible that PostgreSQL is already
installed at your site, either because it was included in your operating system distribution or because
the system administrator already installed it. If that is the case, you should obtain information from the
operating system documentation or your system administrator about how to access PostgreSQL.
If you are not sure whether PostgreSQL is already available or whether you can use it for your experimen-
tation then you can install it yourself. Doing so is not hard and it can be a good exercise. PostgreSQL can
be installed by any unprivileged user; no superuser (root) access is required.
If you are installing PostgreSQL yourself, then refer to Chapter 14 for instructions on installation, and
return to this guide when the installation is complete. Be sure to follow closely the section about setting
up the appropriate environment variables.
If your site administrator has not set things up in the default way, you may have some more work to do. For
example, if the database server machine is a remote machine, you will need to set the PGHOST environment
variable to the name of the database server machine. The environment variable PGPORT may also have to
be set. The bottom line is this: if you try to start an application program and it complains that it cannot
connect to the database, you should consult your site administrator or, if that is you, the documentation
to make sure that your environment is properly set up. If you did not understand the preceding paragraph
then read the next section.
• A server process, which manages the database files, accepts connections to the database from client
applications, and performs actions on the database on behalf of the clients. The database server program
is called postmaster.
• The user’s client (frontend) application that wants to perform database operations. Client applications
can be very diverse in nature: a client could be a text-oriented tool, a graphical application, a web server
that accesses the database to display web pages, or a specialized database maintenance tool. Some client
applications are supplied with the PostgreSQL distribution; most are developed by users.
As is typical of client/server applications, the client and the server can be on different hosts. In that case
they communicate over a TCP/IP network connection. You should keep this in mind, because the files that
can be accessed on a client machine might not be accessible (or might only be accessible using a different
file name) on the database server machine.
The PostgreSQL server can handle multiple concurrent connections from clients. For that purpose it starts
(“forks”) a new process for each connection. From that point on, the client and the new server process
communicate without intervention by the original postmaster process. Thus, the postmaster is always
running, waiting for client connections, whereas client and associated server processes come and go. (All
of this is of course invisible to the user. We only mention it here for completeness.)
$ createdb mydb
CREATE DATABASE
If so, this step was successful and you can skip over the remainder of this section.
If you see a message similar to

createdb: command not found

then PostgreSQL was not installed properly. Either it was not installed at all or the search path was not set
correctly. Try calling the command with an absolute path instead:
$ /usr/local/pgsql/bin/createdb mydb
The path at your site might be different. Contact your site administrator or check back in the installation
instructions to correct the situation.
Another response could be this:
createdb: could not connect to database template1: could not connect to server:
No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
This means that the server was not started, or it was not started where createdb expected it. Again, check
the installation instructions or consult the administrator.
Another response could be this:
createdb: could not connect to database template1: FATAL: user "joe" does not
exist
where your own login name is mentioned. This will happen if the administrator has not created a Post-
greSQL user account for you. (PostgreSQL user accounts are distinct from operating system user
accounts.) If you are the administrator, see Chapter 17 for help creating accounts. You will need to become
the operating system user under which PostgreSQL was installed (usually postgres) to create the first
user account. It could also be that you were assigned a PostgreSQL user name that is different from your
operating system user name; in that case you need to use the -U switch or set the PGUSER environment
variable to specify your PostgreSQL user name.
If you have a user account but it does not have the privileges required to create a database, you will see
the following:

createdb: database creation failed: ERROR:  permission denied to create database
Not every user has authorization to create new databases. If PostgreSQL refuses to create databases for
you then the site administrator needs to grant you permission to create databases. Consult your site ad-
ministrator if this occurs. If you installed PostgreSQL yourself then you should log in for the purposes of
this tutorial under the user account that you started the server as. 1
You can also create databases with other names. PostgreSQL allows you to create any number of databases
at a given site. Database names must have an alphabetic first character and are limited to 63 characters in
length. A convenient choice is to create a database with the same name as your current user name. Many
tools assume that database name as the default, so it can save you some typing. To create that database,
simply type
$ createdb
If you do not want to use your database anymore you can remove it. For example, if you are the owner
(creator) of the database mydb, you can destroy it using the following command:
$ dropdb mydb
(For this command, the database name does not default to the user account name. You always need to
specify it.) This action physically removes all files associated with the database and cannot be undone, so
this should only be done with a great deal of forethought.
More about createdb and dropdb may be found in createdb and dropdb respectively.
• Running the PostgreSQL interactive terminal program, called psql, which allows you to interactively
enter, edit, and execute SQL commands.
• Using an existing graphical frontend tool like PgAccess or an office suite with ODBC support to create
and manipulate a database. These possibilities are not covered in this tutorial.
1. As an explanation for why this works: PostgreSQL user names are separate from operating system user accounts. If you
connect to a database, you can choose what PostgreSQL user name to connect as; if you don’t, it will default to the same name
as your current operating system account. As it happens, there will always be a PostgreSQL user account that has the same name
as the operating system user that started the server, and it also happens that that user always has permission to create databases.
Instead of logging in as that user you can also specify the -U option everywhere to select a PostgreSQL user name to connect as.
• Writing a custom application, using one of the several available language bindings. These possibilities
are discussed further in Part IV.
You probably want to start up psql, to try out the examples in this tutorial. It can be activated for the
mydb database by typing the command:
$ psql mydb
If you leave off the database name then it will default to your user account name. You already discovered
this scheme in the previous section.
In psql, you will be greeted with a welcome message and a prompt:

Welcome to psql 8.0.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

mydb=>

The last line could also be

mydb=#
That would mean you are a database superuser, which is most likely the case if you installed PostgreSQL
yourself. Being a superuser means that you are not subject to access controls. For the purpose of this
tutorial this is not of importance.
If you encounter problems starting psql then go back to the previous section. The diagnostics of
createdb and psql are similar, and if the former worked the latter should work as well.
The last line printed out by psql is the prompt, and it indicates that psql is listening to you and that you
can type SQL queries into a work space maintained by psql. Try out these commands:
mydb=> SELECT 2 + 2;
?column?
----------
4
(1 row)
The psql program has a number of internal commands that are not SQL commands. They begin with
the backslash character, “\”. Some of these commands were listed in the welcome message. For example,
you can get help on the syntax of various PostgreSQL SQL commands by typing:
mydb=> \h

To get out of psql, type

mydb=> \q
and psql will quit and return you to your command shell. (For more internal commands, type \? at the
psql prompt.) The full capabilities of psql are documented in psql. If PostgreSQL is installed correctly
you can also type man psql at the operating system shell prompt to see the documentation. In this tutorial
we will not use these features explicitly, but you can use them yourself when you see fit.
Chapter 2. The SQL Language
2.1. Introduction
This chapter provides an overview of how to use SQL to perform simple operations. This tutorial is only
intended to give you an introduction and is in no way a complete tutorial on SQL. Numerous books have
been written on SQL, including Understanding the New SQL and A Guide to the SQL Standard. You
should be aware that some PostgreSQL language features are extensions to the standard.
In the examples that follow, we assume that you have created a database named mydb, as described in the
previous chapter, and have started psql.
Examples in this manual can also be found in the PostgreSQL source distribution in the directory
src/tutorial/. To use those files, first change to that directory and run make:
$ cd ..../src/tutorial
$ make
This creates the scripts and compiles the C files containing user-defined functions and types. (You must
use GNU make for this — it may be named something different on your system, often gmake.) Then, to
start the tutorial, do the following:
$ cd ..../src/tutorial
$ psql -s mydb
...
mydb=> \i basics.sql
The \i command reads in commands from the specified file. The -s option puts you in single step mode
which pauses before sending each statement to the server. The commands used in this section are in the
file basics.sql.
2.2. Concepts
PostgreSQL is a relational database management system (RDBMS). That means it is a system for man-
aging data stored in relations. Relation is essentially a mathematical term for table. The notion of storing
data in tables is so commonplace today that it might seem inherently obvious, but there are a number of
other ways of organizing databases. Files and directories on Unix-like operating systems form an example
of a hierarchical database. A more modern development is the object-oriented database.
Each table is a named collection of rows. Each row of a given table has the same set of named columns,
and each column is of a specific data type. Whereas columns have a fixed order in each row, it is important
to remember that SQL does not guarantee the order of the rows within the table in any way (although they
can be explicitly sorted for display).
Tables are grouped into databases, and a collection of databases managed by a single PostgreSQL server
instance constitutes a database cluster.
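The CREATE TABLE statement that the following paragraph refers to did not survive extraction. In the standard tutorial (the “Creating a New Table” section) it defines the weather table, roughly as follows:

```sql
CREATE TABLE weather (
    city      varchar(80),
    temp_lo   int,      -- low temperature
    temp_hi   int,      -- high temperature
    prcp      real,     -- precipitation
    date      date
);
```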
You can enter this into psql with the line breaks. psql will recognize that the command is not terminated
until the semicolon.
White space (i.e., spaces, tabs, and newlines) may be used freely in SQL commands. That means you can
type the command aligned differently than above, or even all on one line. Two dashes (“--”) introduce
comments. Whatever follows them is ignored up to the end of the line. SQL is case insensitive about key
words and identifiers, except when identifiers are double-quoted to preserve the case (not done above).
varchar(80) specifies a data type that can store arbitrary character strings up to 80 characters in length.
int is the normal integer type. real is a type for storing single precision floating-point numbers. date
should be self-explanatory. (Yes, the column of type date is also named date. This may be convenient
or confusing — you choose.)
PostgreSQL supports the standard SQL types int, smallint, real, double precision, char(N),
varchar(N), date, time, timestamp, and interval, as well as other types of general utility and a
rich set of geometric types. PostgreSQL can be customized with an arbitrary number of user-defined data
types. Consequently, type names are not syntactical key words, except where required to support special
cases in the SQL standard.
The second example will store cities and their associated geographical location:
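The table definition lost here presumably resembled the tutorial's cities table:

```sql
CREATE TABLE cities (
    name      varchar(80),
    location  point       -- a PostgreSQL-specific geometric type
);
```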
INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
Note that all data types use rather obvious input formats. Constants that are not simple numeric values
usually must be surrounded by single quotes ('), as in the example. The date type is actually quite flexible
in what it accepts, but for this tutorial we will stick to the unambiguous format shown here.
The point type requires a coordinate pair as input, as shown here:
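The example lost here presumably inserted a row into the cities table, with the coordinate pair written as a quoted string (the coordinates are illustrative):

```sql
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
```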
The syntax used so far requires you to remember the order of the columns. An alternative syntax allows
you to list the columns explicitly:
You can list the columns in a different order if you wish or even omit some columns, e.g., if the precipi-
tation is unknown:
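The two INSERT variants described above were lost in extraction; they presumably resembled:

```sql
-- Columns listed explicitly, so the order of values is unambiguous:
INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
    VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');

-- Some columns omitted (precipitation unknown, so prcp will be null):
INSERT INTO weather (date, city, temp_hi, temp_lo)
    VALUES ('1994-11-29', 'Hayward', 54, 37);
```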
Many developers consider explicitly listing the columns better style than relying implicitly on the column order.
Please enter all the commands shown above so you have some data to work with in the following sections.
You could also have used COPY to load large amounts of data from flat-text files. This is usually faster
because the COPY command is optimized for this application while allowing less flexibility than INSERT.
An example would be:
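The COPY example lost here presumably resembled the following (the file path is illustrative; the file must be readable by the server process):

```sql
COPY weather FROM '/home/user/weather.txt';
```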
where the file name for the source file must be available to the backend server machine, not the client,
since the backend server reads the file directly. You can read more about the COPY command in COPY.
Here * is a shorthand for “all columns”. 1 So the same result would be had with:
1. While SELECT * is useful for off-the-cuff queries, it is widely considered bad style in production code, since adding a column
to the table would change the results.
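The two equivalent queries that this paragraph compares were lost in extraction; presumably:

```sql
SELECT * FROM weather;
-- equivalent to listing all the columns explicitly:
SELECT city, temp_lo, temp_hi, prcp, date FROM weather;
```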
---------------+---------+---------+------+------------
San Francisco | 46 | 50 | 0.25 | 1994-11-27
San Francisco | 43 | 57 | 0 | 1994-11-29
Hayward | 37 | 54 | | 1994-11-29
(3 rows)
You can write expressions, not just simple column references, in the select list. For example, you can do:
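The expression example lost here presumably resembled:

```sql
SELECT city, (temp_hi + temp_lo) / 2 AS temp_avg, date FROM weather;
```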
Notice how the AS clause is used to relabel the output column. (The AS clause is optional.)
A query can be “qualified” by adding a WHERE clause that specifies which rows are wanted. The WHERE
clause contains a Boolean (truth value) expression, and only rows for which the Boolean expression is
true are returned. The usual Boolean operators (AND, OR, and NOT) are allowed in the qualification. For
example, the following retrieves the weather of San Francisco on rainy days:
Result:
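The query and its result were lost in extraction; they presumably were:

```sql
SELECT * FROM weather
    WHERE city = 'San Francisco' AND prcp > 0.0;
```

     city      | temp_lo | temp_hi | prcp |    date
---------------+---------+---------+------+------------
 San Francisco |      46 |      50 | 0.25 | 1994-11-27
(1 row)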
You can request that the results of a query be returned in sorted order:
In this example, the sort order isn’t fully specified, and so you might get the San Francisco rows in either
order. But you’d always get the results shown above if you do
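The ORDER BY examples referred to above were presumably:

```sql
SELECT * FROM weather ORDER BY city;

-- a fully specified sort order:
SELECT * FROM weather ORDER BY city, temp_lo;
```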
You can request that duplicate rows be removed from the result of a query:
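The DISTINCT query producing the result below was lost in extraction; presumably:

```sql
SELECT DISTINCT city FROM weather;
```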
city
---------------
Hayward
San Francisco
(2 rows)
Here again, the result row ordering might vary. You can ensure consistent results by using DISTINCT and
ORDER BY together: 2
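The combined form was presumably:

```sql
SELECT DISTINCT city FROM weather ORDER BY city;
```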
Note: This is only a conceptual model. The join is usually performed in a more efficient manner than
actually comparing each possible pair of rows, but this is invisible to the user.
SELECT *
FROM weather, cities
WHERE city = name;
2. In some database systems, including older versions of PostgreSQL, the implementation of DISTINCT automatically orders the
rows and so ORDER BY is redundant. But this is not required by the SQL standard, and current PostgreSQL doesn’t guarantee that
DISTINCT causes the rows to be ordered.
• There is no result row for the city of Hayward. This is because there is no matching entry in the cities
table for Hayward, so the join ignores the unmatched rows in the weather table. We will see shortly how
this can be fixed.
• There are two columns containing the city name. This is correct because the lists of columns of the
weather and the cities table are concatenated. In practice this is undesirable, though, so you will
probably want to list the output columns explicitly rather than using *:
SELECT city, temp_lo, temp_hi, prcp, date, location
FROM weather, cities
WHERE city = name;
Exercise: Attempt to find out the semantics of this query when the WHERE clause is omitted.
Since the columns all had different names, the parser automatically found out which table they belong to,
but it is good style to fully qualify column names in join queries:
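The fully qualified form of the join, lost in extraction, presumably resembled:

```sql
SELECT weather.city, weather.temp_lo, weather.temp_hi,
       weather.prcp, weather.date, cities.location
    FROM weather, cities
    WHERE cities.name = weather.city;
```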
Join queries of the kind seen thus far can also be written in this alternative form:
SELECT *
FROM weather INNER JOIN cities ON (weather.city = cities.name);
This syntax is not as commonly used as the one above, but we show it here to help you understand the
following topics.
Now we will figure out how we can get the Hayward records back in. What we want the query to do is to
scan the weather table and for each row to find the matching cities row. If no matching row is found
we want some “empty values” to be substituted for the cities table’s columns. This kind of query is
called an outer join. (The joins we have seen so far are inner joins.) The command looks like this:
SELECT *
FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
This query is called a left outer join because the table mentioned on the left of the join operator will have
each of its rows in the output at least once, whereas the table on the right will only have those rows output
that match some row of the left table. When outputting a left-table row for which there is no right-table
match, empty (null) values are substituted for the right-table columns.
Exercise: There are also right outer joins and full outer joins. Try to find out what those do.
We can also join a table against itself. This is called a self join. As an example, suppose we wish to find
all the weather records that are in the temperature range of other weather records. So we need to compare
the temp_lo and temp_hi columns of each weather row to the temp_lo and temp_hi columns of all
other weather rows. We can do this with the following query:
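The self-join query lost here presumably resembled:

```sql
SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
       W2.city, W2.temp_lo AS low, W2.temp_hi AS high
    FROM weather W1, weather W2
    WHERE W1.temp_lo < W2.temp_lo
      AND W1.temp_hi > W2.temp_hi;
```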
Here we have relabeled the weather table as W1 and W2 to be able to distinguish the left and right side of
the join. You can also use these kinds of aliases in other queries to save some typing, e.g.:
SELECT *
FROM weather w, cities c
WHERE w.city = c.name;
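The aggregate query producing the result shown below (the start of the tutorial's discussion of aggregate functions) was lost in extraction; presumably:

```sql
SELECT max(temp_lo) FROM weather;
```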
max
-----
46
(1 row)
If we wanted to know what city (or cities) that reading occurred in, we might try
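The incorrect attempt described here was presumably:

```sql
SELECT city FROM weather WHERE temp_lo = max(temp_lo);     -- WRONG
```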
but this will not work since the aggregate max cannot be used in the WHERE clause. (This restriction
exists because the WHERE clause determines the rows that will go into the aggregation stage; so it has to
be evaluated before aggregate functions are computed.) However, as is often the case the query can be
restated to accomplish the intended result, here by using a subquery:
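The restated query was presumably:

```sql
SELECT city FROM weather
    WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
```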
city
---------------
San Francisco
(1 row)
This is OK because the subquery is an independent computation that computes its own aggregate sepa-
rately from what is happening in the outer query.
Aggregates are also very useful in combination with GROUP BY clauses. For example, we can get the
maximum low temperature observed in each city with
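The GROUP BY query producing the result below was presumably:

```sql
SELECT city, max(temp_lo)
    FROM weather
    GROUP BY city;
```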
city | max
---------------+-----
Hayward | 37
San Francisco | 46
(2 rows)
which gives us one output row per city. Each aggregate result is computed over the table rows matching
that city. We can filter these grouped rows using HAVING:
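The HAVING query was presumably:

```sql
SELECT city, max(temp_lo)
    FROM weather
    GROUP BY city
    HAVING max(temp_lo) < 40;
```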
city | max
---------+-----
Hayward | 37
(1 row)
which gives us the same results for only the cities that have all temp_lo values below 40. Finally, if we
only care about cities whose names begin with “S”, we might do
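The query with the city-name restriction was presumably:

```sql
SELECT city, max(temp_lo)
    FROM weather
    WHERE city LIKE 'S%'
    GROUP BY city
    HAVING max(temp_lo) < 40;
```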
The LIKE operator does pattern matching and is explained in Section 9.7.
It is important to understand the interaction between aggregates and SQL’s WHERE and HAVING clauses.
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups
and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas
HAVING selects group rows after groups and aggregates are computed. Thus, the WHERE clause must not
contain aggregate functions; it makes no sense to try to use an aggregate to determine which rows will
be inputs to the aggregates. On the other hand, the HAVING clause always contains aggregate functions.
(Strictly speaking, you are allowed to write a HAVING clause that doesn’t use aggregates, but it’s wasteful.
The same condition could be used more efficiently at the WHERE stage.)
In the previous example, we can apply the city name restriction in WHERE, since it needs no aggregate.
This is more efficient than adding the restriction to HAVING, because we avoid doing the grouping and
aggregate calculations for all rows that fail the WHERE check.
2.8. Updates
You can update existing rows using the UPDATE command. Suppose you discover the temperature readings
are all off by 2 degrees as of November 28. You may update the data as follows:
UPDATE weather
SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
WHERE date > '1994-11-28';
2.9. Deletions
Rows can be removed from a table using the DELETE command. Suppose you are no longer interested in
the weather of Hayward. Then you can do the following to delete those rows from the table:
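The DELETE command lost here was presumably:

```sql
DELETE FROM weather WHERE city = 'Hayward';
```

A subsequent SELECT * FROM weather would then return no Hayward rows.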
Without a qualification, DELETE will remove all rows from the given table, leaving it empty. The system
will not request confirmation before doing this!
Chapter 3. Advanced Features
3.1. Introduction
In the previous chapter we have covered the basics of using SQL to store and access your data in Post-
greSQL. We will now discuss some more advanced features of SQL that simplify management and prevent
loss or corruption of your data. Finally, we will look at some PostgreSQL extensions.
This chapter will on occasion refer to examples found in Chapter 2 to change or improve them, so it will
be to your advantage if you have read that chapter. Some examples from this chapter can also be found in
advanced.sql in the tutorial directory. This file also contains some example data to load, which is not
repeated here. (Refer to Section 2.1 for how to use the file.)
3.2. Views
Refer back to the queries in Section 2.6. Suppose the combined listing of weather records and city location
is of particular interest to your application, but you do not want to type the query each time you need it.
You can create a view over the query, which gives a name to the query that you can refer to like an ordinary
table.
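The view definition lost here presumably resembled:

```sql
CREATE VIEW myview AS
    SELECT city, temp_lo, temp_hi, prcp, date, location
        FROM weather, cities
        WHERE city = name;

SELECT * FROM myview;
```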
Making liberal use of views is a key aspect of good SQL database design. Views allow you to encapsulate
the details of the structure of your tables, which may change as your application evolves, behind consistent
interfaces.
Views can be used in almost any place a real table can be used. Building views upon other views is not
uncommon.
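The table definitions and the failing INSERT that produce the error below were lost in extraction. In the standard tutorial they introduce a foreign key, roughly:

```sql
CREATE TABLE cities (
    city     varchar(80) primary key,
    location point
);

CREATE TABLE weather (
    city     varchar(80) references cities(city),
    temp_lo  int,
    temp_hi  int,
    prcp     real,
    date     date
);

-- Fails: no 'Berkeley' row exists in cities.
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
```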
ERROR: insert or update on table "weather" violates foreign key constraint "weather_cit
DETAIL: Key (city)=(Berkeley) is not present in table "cities".
The behavior of foreign keys can be finely tuned to your application. We will not go beyond this simple
example in this tutorial, but just refer you to Chapter 5 for more information. Making correct use of foreign
keys will definitely improve the quality of your database applications, so you are strongly encouraged to
learn about them.
3.4. Transactions
Transactions are a fundamental concept of all database systems. The essential point of a transaction is that
it bundles multiple steps into a single, all-or-nothing operation. The intermediate states between the steps
are not visible to other concurrent transactions, and if some failure occurs that prevents the transaction
from completing, then none of the steps affect the database at all.
For example, consider a bank database that contains balances for various customer accounts, as well as
total deposit balances for branches. Suppose that we want to record a payment of $100.00 from Alice’s
account to Bob’s account. Simplifying outrageously, the SQL commands for this might look like
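The command sequence lost here presumably resembled the following, using hypothetical accounts and branches tables:

```sql
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
UPDATE branches SET balance = balance - 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
UPDATE branches SET balance = balance + 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');
```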
The details of these commands are not important here; the important point is that there are several separate
updates involved to accomplish this rather simple operation. Our bank’s officers will want to be assured
that either all these updates happen, or none of them happen. It would certainly not do for a system failure
to result in Bob receiving $100.00 that was not debited from Alice. Nor would Alice long remain a happy
customer if she was debited without Bob being credited. We need a guarantee that if something goes
wrong partway through the operation, none of the steps executed so far will take effect. Grouping the
updates into a transaction gives us this guarantee. A transaction is said to be atomic: from the point of
view of other transactions, it either happens completely or not at all.
We also want a guarantee that once a transaction is completed and acknowledged by the database system,
it has indeed been permanently recorded and won’t be lost even if a crash ensues shortly thereafter. For
example, if we are recording a cash withdrawal by Bob, we do not want any chance that the debit to
his account will disappear in a crash just after he walks out the bank door. A transactional database
guarantees that all the updates made by a transaction are logged in permanent storage (i.e., on disk) before
the transaction is reported complete.
Another important property of transactional databases is closely related to the notion of atomic updates:
when multiple transactions are running concurrently, each one should not be able to see the incomplete
changes made by others. For example, if one transaction is busy totalling all the branch balances, it would
not do for it to include the debit from Alice’s branch but not the credit to Bob’s branch, nor vice versa.
So transactions must be all-or-nothing not only in terms of their permanent effect on the database, but
also in terms of their visibility as they happen. The updates made so far by an open transaction are in-
visible to other transactions until the transaction completes, whereupon all the updates become visible
simultaneously.
In PostgreSQL, a transaction is set up by surrounding the SQL commands of the transaction with BEGIN
and COMMIT commands. So our banking transaction would actually look like
BEGIN;
UPDATE accounts SET balance = balance - 100.00
WHERE name = 'Alice';
-- etc etc
COMMIT;
If, partway through the transaction, we decide we do not want to commit (perhaps we just noticed that
Alice’s balance went negative), we can issue the command ROLLBACK instead of COMMIT, and all our
updates so far will be canceled.
PostgreSQL actually treats every SQL statement as being executed within a transaction. If you do not is-
sue a BEGIN command, then each individual statement has an implicit BEGIN and (if successful) COMMIT
wrapped around it. A group of statements surrounded by BEGIN and COMMIT is sometimes called a trans-
action block.
Note: Some client libraries issue BEGIN and COMMIT commands automatically, so that you may get the
effect of transaction blocks without asking. Check the documentation for the interface you are using.
It’s possible to control the statements in a transaction in a more granular fashion through the use of save-
points. Savepoints allow you to selectively discard parts of the transaction, while committing the rest.
After defining a savepoint with SAVEPOINT, you can if needed roll back to the savepoint with ROLLBACK
TO. All the transaction’s database changes between defining the savepoint and rolling back to it are dis-
carded, but changes earlier than the savepoint are kept.
Chapter 3. Advanced Features
After rolling back to a savepoint, that savepoint continues to be defined, so you can roll back to it several times.
Conversely, if you are sure you won’t need to roll back to a particular savepoint again, it can be released,
so the system can free some resources. Keep in mind that either releasing or rolling back to a savepoint
will automatically release all savepoints that were defined after it.
All this is happening within the transaction block, so none of it is visible to other database sessions. When
and if you commit the transaction block, the committed actions become visible as a unit to other sessions,
while the rolled-back actions never become visible at all.
Remembering the bank database, suppose we debit $100.00 from Alice’s account, and credit Bob’s ac-
count, only to find later that we should have credited Wally’s account. We could do it using savepoints
like this:
BEGIN;
UPDATE accounts SET balance = balance - 100.00
WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts SET balance = balance + 100.00
WHERE name = ’Bob’;
-- oops ... forget that and use Wally's account
ROLLBACK TO my_savepoint;
UPDATE accounts SET balance = balance + 100.00
WHERE name = ’Wally’;
COMMIT;
This example is, of course, oversimplified, but there’s a lot of control to be had over a transaction block
through the use of savepoints. Moreover, ROLLBACK TO is the only way to regain control of a transaction
block that was put in aborted state by the system due to an error, short of rolling it back completely and
starting again.
3.5. Inheritance
Inheritance is a concept from object-oriented databases. It opens up interesting new possibilities of
database design.
Let’s create two tables: a table cities and a table capitals. Naturally, capitals are also cities, so you
want some way to show the capitals implicitly when you list all cities. If you’re really clever you might
invent some scheme like this:
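The scheme itself was lost in extraction; in keeping with the surrounding discussion (two plain tables whose shared columns are duplicated, unified by a view), it was presumably something like:

```sql
CREATE TABLE capitals (
  name       text,
  population real,
  altitude   int,    -- (in ft)
  state      char(2)
);

CREATE TABLE non_capitals (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE VIEW cities AS
  SELECT name, population, altitude FROM capitals
    UNION
  SELECT name, population, altitude FROM non_capitals;
```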
This works OK as far as querying goes, but it gets ugly when you need to update several rows, for one
thing.
A better solution is this:
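The table definitions were lost here; reconstructed from the description that follows (capitals inherits name, population, and altitude from cities and adds a state column):

```sql
CREATE TABLE cities (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE TABLE capitals (
  state      char(2)
) INHERITS (cities);
```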
In this case, a row of capitals inherits all columns (name, population, and altitude) from its
parent, cities. The type of the column name is text, a native PostgreSQL type for variable length
character strings. State capitals have an extra column, state, that shows their state. In PostgreSQL, a table
can inherit from zero or more other tables.
For example, the following query finds the names of all cities, including state capitals, that are located at
an altitude over 500 ft.:
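The query was elided; given the description and the result shown next, it was presumably:

```sql
SELECT name, altitude
  FROM cities
  WHERE altitude > 500;
```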
which returns:
name | altitude
-----------+----------
Las Vegas | 2174
Mariposa | 1953
Madison | 845
(3 rows)
On the other hand, the following query finds all the cities that are not state capitals and are situated at an
altitude of 500 ft. or higher:
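The query was elided; given the explanation of ONLY below, it was presumably:

```sql
SELECT name, altitude
  FROM ONLY cities
  WHERE altitude > 500;
```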
name | altitude
-----------+----------
Las Vegas | 2174
Mariposa | 1953
(2 rows)
Here the ONLY before cities indicates that the query should be run over only the cities table, and not
tables below cities in the inheritance hierarchy. Many of the commands that we have already discussed
— SELECT, UPDATE, and DELETE — support this ONLY notation.
Note: Although inheritance is frequently useful, it has not been integrated with unique constraints or
foreign keys, which limits its usefulness. See Section 5.5 for more detail.
3.6. Conclusion
PostgreSQL has many features not touched upon in this tutorial introduction, which has been oriented
toward newer users of SQL. These features are discussed in more detail in the remainder of this book.
If you feel you need more introductory material, please visit the PostgreSQL web site1 for links to more
resources.
1. http://www.postgresql.org
21
II. The SQL Language
This part describes the use of the SQL language in PostgreSQL. We start with describing the general
syntax of SQL, then explain how to create the structures to hold data, how to populate the database, and
how to query it. The middle part lists the available data types and functions for use in SQL commands.
The rest treats several aspects that are important for tuning a database for optimal performance.
The information in this part is arranged so that a novice user can follow it start to end to gain a full un-
derstanding of the topics without having to refer forward too many times. The chapters are intended to be
self-contained, so that advanced users can read the chapters individually as they choose. The information
in this part is presented in a narrative fashion in topical units. Readers looking for a complete description
of a particular command should look into Part VI.
Readers of this part should know how to connect to a PostgreSQL database and issue SQL commands.
Readers that are unfamiliar with these issues are encouraged to read Part I first. SQL commands are
typically entered using the PostgreSQL interactive terminal psql, but other programs that have similar
functionality can be used as well.
Chapter 4. SQL Syntax
This chapter describes the syntax of SQL. It forms the foundation for understanding the following chapters
which will go into detail about how the SQL commands are applied to define and modify data.
We also advise users who are already familiar with SQL to read this chapter carefully because there are
several rules and concepts that are implemented inconsistently among SQL databases or that are specific
to PostgreSQL.
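4.1. Lexical Structure

The command sequence that the next paragraph refers to was lost; judging from the discussion of SELECT, UPDATE ... SET, and INSERT ... VALUES, it was of this form (MY_TABLE and its columns are placeholders):

```sql
SELECT * FROM MY_TABLE;
UPDATE MY_TABLE SET A = 5;
INSERT INTO MY_TABLE VALUES (3, 'hi there');
```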
This is a sequence of three commands, one per line (although this is not required; more than one command
can be on a line, and commands can usefully be split across lines).
The SQL syntax is not very consistent regarding what tokens identify commands and which are operands
or parameters. The first few tokens are generally the command name, so in the above example we would
usually speak of a “SELECT”, an “UPDATE”, and an “INSERT” command. But for instance the UPDATE
command always requires a SET token to appear in a certain position, and this particular variation of
INSERT also requires a VALUES in order to be complete. The precise syntax rules for each command are
described in Part VI.
The SQL standard will not define a key word that contains digits or starts or ends with an underscore, so identifiers of this
form are safe against possible conflict with future extensions of the standard.
The system uses no more than NAMEDATALEN-1 characters of an identifier; longer names can be written
in commands, but they will be truncated. By default, NAMEDATALEN is 64 so the maximum identifier
length is 63. If this limit is problematic, it can be raised by changing the NAMEDATALEN constant in
src/include/postgres_ext.h.
A convention often used is to write key words in upper case and names in lower case, e.g.,
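The elided example was presumably:

```sql
UPDATE my_table SET a = 5;
```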
There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing
an arbitrary sequence of characters in double-quotes ("). A delimited identifier is always an identifier,
never a key word. So "select" could be used to refer to a column or table named “select”, whereas an
unquoted select would be taken as a key word and would therefore provoke a parse error when used
where a table or column name is expected. The example can be written with quoted identifiers like this:
Quoted identifiers can contain any character other than a double quote itself. (To include a double quote,
write two double quotes.) This allows constructing table or column names that would otherwise not be
possible, such as ones containing spaces or ampersands. The length limitation still applies.
Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower
case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but
"Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower
case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be
folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If
you want to write portable applications you are advised to always quote a particular name or never quote
it.)
4.1.2. Constants
There are three kinds of implicitly-typed constants in PostgreSQL: strings, bit strings, and numbers. Con-
stants can also be specified with explicit types, which can enable more accurate representation and more
efficient handling by the system. These alternatives are discussed in the following subsections.
Another PostgreSQL extension is that C-style backslash escapes are available: \b is a backspace, \f is a
form feed, \n is a newline, \r is a carriage return, \t is a tab, and \xxx , where xxx is an octal number,
is a byte with the corresponding code. (It is your responsibility that the byte sequences you create are
valid characters in the server character set encoding.) Any other character following a backslash is taken
literally. Thus, to include a backslash in a string constant, write two backslashes.
The character with the code zero cannot be in a string constant.
Two string constants that are only separated by whitespace with at least one newline are concatenated and
effectively treated as if the string had been written in one constant. For example:
SELECT 'foo'
'bar';
is equivalent to
SELECT 'foobar';
but
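The invalid counter-example was elided; presumably it put both constants on one line:

```sql
SELECT 'foo'      'bar';
```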
is not valid syntax. (This slightly bizarre behavior is specified by SQL; PostgreSQL is following the
standard.)
An alternative to single quotes is “dollar quoting”: a dollar sign ($), an optional tag of zero or more characters, another dollar sign, an arbitrary sequence of characters that makes up the string content, another dollar sign, the same tag, and a closing dollar sign. For example, here are two ways to specify the string “Dianne's horse” using dollar quoting:
$$Dianne's horse$$
$SomeTag$Dianne's horse$SomeTag$
Notice that inside the dollar-quoted string, single quotes can be used without needing to be escaped.
Indeed, no characters inside a dollar-quoted string are ever escaped: the string content is always written
literally. Backslashes are not special, and neither are dollar signs, unless they are part of a sequence
matching the opening tag.
It is possible to nest dollar-quoted string constants by choosing different tags at each nesting level. This is
most commonly used in writing function definitions. For example:
$function$
BEGIN
RETURN ($1 ~ $q$[\t\r\n\v\\]$q$);
END;
$function$
A dollar-quoted string that follows a keyword or identifier must be separated from it by whitespace;
otherwise the dollar quoting delimiter would be taken as part of the preceding identifier.
Dollar quoting is not part of the SQL standard, but it is often a more convenient way to write complicated
string literals than the standard-compliant single quote syntax. It is particularly useful when representing
string constants inside other constants, as is often needed in procedural function definitions. With single-
quote syntax, each backslash in the above example would have to be written as four backslashes, which
would be reduced to two backslashes in parsing the original string constant, and then to one when the
inner string constant is re-parsed during function execution.
digits
digits.[digits][e[+-]digits]
[digits].digits[e[+-]digits]
digitse[+-]digits
where digits is one or more decimal digits (0 through 9). At least one digit must be before or after the
decimal point, if one is used. At least one digit must follow the exponent marker (e), if one is present.
There may not be any spaces or other characters embedded in the constant. Note that any leading plus or
minus sign is not actually considered part of the constant; it is an operator applied to the constant.
Some examples of valid numeric constants are:
42
3.5
4.
.001
5e2
1.925e-3
A numeric constant that contains neither a decimal point nor an exponent is initially presumed to be type
integer if its value fits in type integer (32 bits); otherwise it is presumed to be type bigint if its value
fits in type bigint (64 bits); otherwise it is taken to be type numeric. Constants that contain decimal
points and/or exponents are always initially presumed to be type numeric.
The initially assigned data type of a numeric constant is just a starting point for the type resolution algo-
rithms. In most cases the constant will be automatically coerced to the most appropriate type depending
on context. When necessary, you can force a numeric value to be interpreted as a specific data type by
casting it. For example, you can force a numeric value to be treated as type real (float4) by writing
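The elided example presumably showed the two notations side by side:

```sql
REAL '1.23'  -- string style
1.23::REAL   -- PostgreSQL (historical) style
```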
These are actually just special cases of the general casting notations discussed next.
type 'string'
'string'::type
CAST ( 'string' AS type )
The string constant’s text is passed to the input conversion routine for the type called type. The result is
a constant of the indicated type. The explicit type cast may be omitted if there is no ambiguity as to the
type the constant must be (for example, when it is assigned directly to a table column), in which case it is
automatically coerced.
The string constant can be written using either regular SQL notation or dollar-quoting.
It is also possible to specify a type coercion using a function-like syntax:
typename ( 'string' )
but not all type names may be used in this way; see Section 4.2.8 for details.
The ::, CAST(), and function-call syntaxes can also be used to specify run-time type conversions of
arbitrary expressions, as discussed in Section 4.2.8. But the form type 'string' can only be used to
specify the type of a literal constant. Another restriction on type 'string' is that it does not work for
array types; use :: or CAST() to specify the type of an array constant.
4.1.3. Operators
An operator name is a sequence of up to NAMEDATALEN-1 (63 by default) characters from the following
list:
+ - * / < > = ~ ! @ # % ^ & | ` ?
• -- and /* cannot appear anywhere in an operator name, since they will be taken as the start of a
comment.
• A multiple-character operator name cannot end in + or -, unless the name also contains at least one of
these characters:
~ ! @ # % ^ & | ` ?
For example, @- is an allowed operator name, but *- is not. This restriction allows PostgreSQL to parse
SQL-compliant queries without requiring spaces between tokens.
When working with non-SQL-standard operator names, you will usually need to separate adjacent opera-
tors with spaces to avoid ambiguity. For example, if you have defined a left unary operator named @, you
cannot write X*@Y; you must write X* @Y to ensure that PostgreSQL reads it as two operator names not
one.
4.1.4. Special Characters
Some characters that are not alphanumeric have a special meaning that is different from being an operator. Details on the usage can be found at the location where the respective syntax element is described. This section only summarizes the purposes of these characters.
• A dollar sign ($) followed by digits is used to represent a positional parameter in the body of a function
definition or a prepared statement. In other contexts the dollar sign may be part of an identifier or a
dollar-quoted string constant.
• Parentheses (()) have their usual meaning to group expressions and enforce precedence. In some cases
parentheses are required as part of the fixed syntax of a particular SQL command.
• Brackets ([]) are used to select the elements of an array. See Section 8.10 for more information on
arrays.
• Commas (,) are used in some syntactical constructs to separate the elements of a list.
• The semicolon (;) terminates an SQL command. It cannot appear anywhere within a command, except
within a string constant or quoted identifier.
• The colon (:) is used to select “slices” from arrays. (See Section 8.10.) In certain SQL dialects (such
as Embedded SQL), the colon is used to prefix variable names.
• The asterisk (*) is used in some contexts to denote all the fields of a table row or composite value. It
also has a special meaning when used as the argument of the COUNT aggregate function.
• The period (.) is used in numeric constants, and to separate schema, table, and column names.
4.1.5. Comments
A comment is an arbitrary sequence of characters beginning with double dashes and extending to the end
of the line, e.g.:
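The example itself was elided; a standard SQL comment looks like this:

```sql
-- This is a standard SQL comment
```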
Alternatively, C-style block comments can be used:
/* multiline comment
* with nesting: /* nested block comment */
*/
where the comment begins with /* and extends to the matching occurrence of */. These block comments
nest, as specified in the SQL standard but unlike C, so that one can comment out larger blocks of code
that may contain existing block comments.
A comment is removed from the input stream before further syntax analysis and is effectively replaced by
whitespace.
4.1.6. Lexical Precedence
The precedence and associativity of the operators is hard-wired into the parser. This can sometimes result in non-intuitive behavior; for example, the expression
SELECT 5 ! - 6;
will be parsed as
SELECT 5 ! (- 6);
because the parser has no idea — until it is too late — that ! is defined as a postfix operator, not an infix
one. To get the desired behavior in this case, you must write
SELECT (5 !) - 6;
Note that the operator precedence rules also apply to user-defined operators that have the same names as
the built-in operators mentioned above. For example, if you define a “+” operator for some custom data
type it will have the same precedence as the built-in “+” operator, no matter what yours does.
When a schema-qualified operator name is used in the OPERATOR syntax, as for example in
SELECT 3 OPERATOR(pg_catalog.+) 4;
the OPERATOR construct is taken to have the default precedence shown in Table 4-1 for “any other” oper-
ator. This is true no matter which specific operator name appears inside OPERATOR().
4.2. Value Expressions
Value expressions are used in a variety of contexts, such as in the target list of the SELECT command, as new column values in INSERT or UPDATE, or in search conditions in a number of commands. A value expression is one of the following: a constant or literal value; a column reference; a positional parameter reference; a subscripted expression; a field selection expression; an operator invocation; a function call; an aggregate expression; a type cast; a scalar subquery; an array constructor; a row constructor; or another value expression in parentheses.
In addition to this list, there are a number of constructs that can be classified as an expression but do
not follow any general syntax rules. These generally have the semantics of a function or operator and are
explained in the appropriate location in Chapter 9. An example is the IS NULL clause.
We have already discussed constants in Section 4.1.2. The following sections discuss the remaining op-
tions.
4.2.1. Column References
A column can be referenced in the form:
correlation.columnname
correlation is the name of a table (possibly qualified with a schema name), or an alias for a table
defined by means of a FROM clause, or one of the key words NEW or OLD. (NEW and OLD can only appear in
rewrite rules, while other correlation names can be used in any SQL statement.) The correlation name and
separating dot may be omitted if the column name is unique across all the tables being used in the current
query. (See also Chapter 7.)
4.2.2. Positional Parameters
A positional parameter reference is used to indicate a value that is supplied externally to an SQL statement. Parameters are used in SQL function definitions and in prepared queries. A parameter reference is of the form:
$number
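The function-definition example was lost; a sketch consistent with the sentence that follows (the dept function name, table, and return type are assumptions):

```sql
CREATE FUNCTION dept(text) RETURNS dept
    AS $$ SELECT * FROM dept WHERE name = $1 $$
    LANGUAGE SQL;
```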
Here the $1 will be replaced by the first function argument when the function is invoked.
4.2.3. Subscripts
If an expression yields a value of an array type, then a specific element of the array value can be extracted
by writing
expression[subscript]
expression[lower_subscript:upper_subscript]
(Here, the brackets [ ] are meant to appear literally.) Each subscript is itself an expression, which
must yield an integer value.
In general the array expression must be parenthesized, but the parentheses may be omitted when the
expression to be subscripted is just a column reference or positional parameter. Also, multiple subscripts
can be concatenated when the original array is multi-dimensional. For example,
mytable.arraycolumn[4]
mytable.two_d_column[17][34]
$1[10:42]
(arrayfunction(a,b))[42]
The parentheses in the last example are required. See Section 8.10 for more about arrays.
4.2.4. Field Selection
If an expression yields a value of a composite type (row type), then a specific field of the row can be extracted by writing
expression.fieldname
In general the row expression must be parenthesized, but the parentheses may be omitted when the
expression to be selected from is just a table reference or positional parameter. For example,
mytable.mycolumn
$1.somecolumn
(rowfunction(a,b)).col3
(Thus, a qualified column reference is actually just a special case of the field selection syntax.)
4.2.5. Operator Invocations
There are three possible syntaxes for an operator invocation: expression operator expression (binary infix operator); operator expression (unary prefix operator); expression operator (unary postfix operator). The operator can also be a schema-qualified operator name, written as:
OPERATOR(schema.operatorname)
Which particular operators exist and whether they are unary or binary depends on what operators have
been defined by the system or the user. Chapter 9 describes the built-in operators.
4.2.6. Function Calls
The syntax for a function call is the name of a function (possibly qualified with a schema name), followed by its argument list enclosed in parentheses. For example, the following computes the square root of 2:
sqrt(2)
The list of built-in functions is in Chapter 9. Other functions may be added by the user.
4.2.7. Aggregate Expressions
An aggregate expression represents the application of an aggregate function across the rows selected by a query. An aggregate function reduces multiple inputs to a single output value, such as the sum or average of the inputs. The syntax of an aggregate expression is one of the following:
aggregate_name (expression)
aggregate_name (ALL expression)
aggregate_name (DISTINCT expression)
aggregate_name ( * )
where aggregate_name is a previously defined aggregate (possibly qualified with a schema name),
and expression is any value expression that does not itself contain an aggregate expression.
The first form of aggregate expression invokes the aggregate across all input rows for which the given
expression yields a non-null value. (Actually, it is up to the aggregate function whether to ignore null
values or not — but all the standard ones do.) The second form is the same as the first, since ALL is the
default. The third form invokes the aggregate for all distinct non-null values of the expression found in
the input rows. The last form invokes the aggregate once for each input row regardless of null or non-null
values; since no particular input value is specified, it is generally only useful for the count() aggregate
function.
For example, count(*) yields the total number of input rows; count(f1) yields the number of input
rows in which f1 is non-null; count(distinct f1) yields the number of distinct non-null values of
f1.
The predefined aggregate functions are described in Section 9.15. Other aggregate functions may be added
by the user.
An aggregate expression may only appear in the result list or HAVING clause of a SELECT command. It is
forbidden in other clauses, such as WHERE, because those clauses are logically evaluated before the results
of aggregates are formed.
When an aggregate expression appears in a subquery (see Section 4.2.9 and Section 9.16), the aggregate
is normally evaluated over the rows of the subquery. But an exception occurs if the aggregate’s argument
contains only outer-level variables: the aggregate then belongs to the nearest such outer level, and is
evaluated over the rows of that query. The aggregate expression as a whole is then an outer reference for
the subquery it appears in, and acts as a constant over any one evaluation of that subquery. The restriction
about appearing only in the result list or HAVING clause applies with respect to the query level that the
aggregate belongs to.
4.2.8. Type Casts
A type cast specifies a conversion from one data type to another. PostgreSQL accepts two equivalent syntaxes for explicit type casts:
CAST ( expression AS type )
expression::type
The CAST syntax conforms to SQL; the syntax with :: is historical PostgreSQL usage.
When a cast is applied to a value expression of a known type, it represents a run-time type conversion. The
cast will succeed only if a suitable type conversion operation has been defined. Notice that this is subtly
different from the use of casts with constants, as shown in Section 4.1.2.5. A cast applied to an unadorned
string literal represents the initial assignment of a type to a literal constant value, and so it will succeed
for any type (if the contents of the string literal are acceptable input syntax for the data type).
An explicit type cast may usually be omitted if there is no ambiguity as to the type that a value expression
must produce (for example, when it is assigned to a table column); the system will automatically apply
a type cast in such cases. However, automatic casting is only done for casts that are marked “OK to
apply implicitly” in the system catalogs. Other casts must be invoked with explicit casting syntax. This
restriction is intended to prevent surprising conversions from being applied silently.
It is also possible to specify a type cast using a function-like syntax:
typename ( expression )
However, this only works for types whose names are also valid as function names. For example, double
precision can’t be used this way, but the equivalent float8 can. Also, the names interval, time,
and timestamp can only be used in this fashion if they are double-quoted, because of syntactic conflicts.
Therefore, the use of the function-like cast syntax leads to inconsistencies and should probably be avoided
in new applications. (The function-like syntax is in fact just a function call. When one of the two standard
cast syntaxes is used to do a run-time conversion, it will internally invoke a registered function to perform
the conversion. By convention, these conversion functions have the same name as their output type, and
thus the “function-like syntax” is nothing more than a direct invocation of the underlying conversion
function. Obviously, this is not something that a portable application should rely on.)
4.2.10. Array Constructors
An array constructor is an expression that builds an array value from values for its member elements. A simple array constructor consists of the key word ARRAY, a left square bracket [, one or more expressions (separated by commas) for the array element values, and finally a right square bracket ]. For example:
SELECT ARRAY[1,2,3+4];
array
---------
{1,2,7}
(1 row)
The array element type is the common type of the member expressions, determined using the same rules
as for UNION or CASE constructs (see Section 10.5).
Multidimensional array values can be built by nesting array constructors. In the inner constructors, the
key word ARRAY may be omitted. For example, these produce the same result:
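The first of the two equivalent commands was elided; it spells out the inner constructors explicitly:

```sql
SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
```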
SELECT ARRAY[[1,2],[3,4]];
array
---------------
{{1,2},{3,4}}
(1 row)
Since multidimensional arrays must be rectangular, inner constructors at the same level must produce
sub-arrays of identical dimensions.
Multidimensional array constructor elements can be anything yielding an array of the proper kind, not
only a sub-ARRAY construct. For example:
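The example was elided; the original was presumably along these lines (table and column names are assumptions):

```sql
CREATE TABLE arr(f1 int[], f2 int[]);
INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], ARRAY[[5,6],[7,8]]);
SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
```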
It is also possible to construct an array from the results of a subquery. In this form, the array constructor
is written with the key word ARRAY followed by a parenthesized (not bracketed) subquery. For example:
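The example was elided; presumably something like:

```sql
SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
```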
The subquery must return a single column. The resulting one-dimensional array will have an element for
each row in the subquery result, with an element type matching that of the subquery’s output column.
The subscripts of an array value built with ARRAY always begin with one. For more information about
arrays, see Section 8.10.
4.2.11. Row Constructors
A row constructor is an expression that builds a row value (also called a composite value) from values for its member fields. A row constructor consists of the key word ROW, a left parenthesis, zero or more expressions (separated by commas) for the row field values, and finally a right parenthesis.
The key word ROW is optional when there is more than one expression in the list.
By default, the value created by a ROW expression is of an anonymous record type. If necessary, it can be
cast to a named composite type — either the row type of a table, or a composite type created with CREATE
TYPE AS. An explicit cast may be needed to avoid ambiguity. For example:
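The example was elided; a sketch consistent with the getf1 output shown next (the table, function, and field names are reconstructions):

```sql
CREATE TABLE mytable(f1 int, f2 float, f3 text);

CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;

SELECT getf1(ROW(1,2.5,'this is a test'));
```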
getf1
-------
1
(1 row)
Row constructors can be used to build composite values to be stored in a composite-type table column,
or to be passed to a function that accepts a composite parameter. Also, it is possible to compare two row
values or test a row with IS NULL or IS NOT NULL, for example
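The comparison example was elided; presumably of this form:

```sql
SELECT ROW(1,2.5,'this is a test') = ROW(1, 3, 'not the same');
```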
For more detail see Section 9.17. Row constructors can also be used in connection with subqueries, as
discussed in Section 9.16.
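4.2.12. Expression Evaluation Rules

The beginning of this subsection was lost. Its point is that the order of evaluation of subexpressions is not defined, so given a query such as the following sketch (somefunc being a hypothetical function):

```sql
SELECT true OR somefunc();
```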
then somefunc() would (probably) not be called at all. The same would be the case if one wrote
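The companion example, with the operands reversed, was presumably:

```sql
SELECT somefunc() OR true;
```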
Note that this is not the same as the left-to-right “short-circuiting” of Boolean operators that is found in
some programming languages.
As a consequence, it is unwise to use functions with side effects as part of complex expressions. It is
particularly dangerous to rely on side effects or evaluation order in WHERE and HAVING clauses, since
those clauses are extensively reprocessed as part of developing an execution plan. Boolean expressions
(AND/OR/NOT combinations) in those clauses may be reorganized in any manner allowed by the laws of
Boolean algebra.
When it is essential to force evaluation order, a CASE construct (see Section 9.13) may be used. For
example, this is an untrustworthy way of trying to avoid division by zero in a WHERE clause:
SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false END;
A CASE construct used in this fashion will defeat optimization attempts, so it should only be done when
necessary. (In this particular example, it would doubtless be best to sidestep the problem by writing y >
1.5*x instead.)
Chapter 5. Data Definition
This chapter covers how one creates the database structures that will hold one’s data. In a relational
database, the raw data is stored in tables, so the majority of this chapter is devoted to explaining how
tables are created and modified and what features are available to control what data is stored in the tables.
Subsequently, we discuss how tables can be organized into schemas, and how privileges can be assigned to
tables. Finally, we will briefly look at other features that affect the data storage, such as views, functions,
and triggers.
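5.1. Table Basics

The CREATE TABLE example that the next paragraph describes was lost; from that description it must have been:

```sql
CREATE TABLE my_first_table (
    first_column text,
    second_column integer
);
```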
This creates a table named my_first_table with two columns. The first column is named
first_column and has a data type of text; the second column has the name second_column and the
type integer. The table and column names follow the identifier syntax explained in Section 4.1.1.
The type names are usually also identifiers, but there are some exceptions. Note that the column list is
comma-separated and surrounded by parentheses.
Of course, the previous example was heavily contrived. Normally, you would give names to your tables
and columns that convey what kind of data they store. So let’s look at a more realistic example:
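The example table was elided; from the surrounding discussion it was presumably:

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric
);
```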
(The numeric type can store fractional components, as would be typical of monetary amounts.)
Tip: When you create many interrelated tables it is wise to choose a consistent naming pattern for the
tables and columns. For instance, there is a choice of using singular or plural nouns for table names,
both of which are favored by some theorist or other.
There is a limit on how many columns a table can contain. Depending on the column types, it is between
250 and 1600. However, defining a table with anywhere near this many columns is highly unusual and
often a questionable design.
If you no longer need a table, you can remove it using the DROP TABLE command. For example:
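The elided example presumably dropped the tables discussed earlier:

```sql
DROP TABLE my_first_table;
DROP TABLE products;
```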
Attempting to drop a table that does not exist is an error. Nevertheless, it is common in SQL script files to
unconditionally try to drop each table before creating it, ignoring the error messages.
If you need to modify a table that already exists look into Section 5.6 later in this chapter.
With the tools discussed so far you can create fully functional tables. The remainder of this chapter is
concerned with adding features to the table definition to ensure data integrity, security, or convenience. If
you are eager to fill your tables with data now you can skip ahead to Chapter 6 and read the rest of this
chapter later.
5.2. Default Values
A column can be assigned a default value. When a new row is created and no values are specified for some of the columns, those columns will be filled with their respective default values. If no default value is declared explicitly, the default value is the null value.
The default value may be an expression, which will be evaluated whenever the default value is inserted
(not when the table is created). A common example is that a timestamp column may have a default of
now(), so that it gets set to the time of row insertion. Another common example is generating a “serial
number” for each row. In PostgreSQL this is typically done by something like
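The example was elided; presumably (the sequence name follows PostgreSQL's tablename_colname_seq convention and is an assumption):

```sql
CREATE TABLE products (
    product_no integer DEFAULT nextval('products_product_no_seq'),
    name text,
    price numeric
);
```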
where the nextval() function supplies successive values from a sequence object (see Section 9.12). This
arrangement is sufficiently common that there’s a special shorthand for it:
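Presumably the shorthand example was:

```sql
CREATE TABLE products (
    product_no SERIAL,
    name text,
    price numeric
);
```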
5.3. Constraints
Data types are a way to limit the kind of data that can be stored in a table. For many applications, however,
the constraint they provide is too coarse. For example, a column containing a product price should prob-
ably only accept positive values. But there is no data type that accepts only positive numbers. Another
issue is that you might want to constrain column data with respect to other columns or rows. For example,
in a table containing product information, there should only be one row for each product number.
To that end, SQL allows you to define constraints on columns and tables. Constraints give you as much
control over the data in your tables as you wish. If a user attempts to store data in a column that would
violate a constraint, an error is raised. This applies even if the value came from the default value definition.
5.3.1. Check Constraints
A check constraint is the most generic constraint type. It allows you to specify that the value in a certain
column must satisfy a Boolean (truth-value) expression. For instance, you could require product prices to
be positive by attaching CHECK (price > 0) to the price column's definition.
As you see, the constraint definition comes after the data type, just like default value definitions. Default
values and constraints can be listed in any order. A check constraint consists of the key word CHECK
followed by an expression in parentheses. The check constraint expression should involve the column
thus constrained, otherwise the constraint would not make much sense.
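A sketch of such a definition, using the products table assumed throughout this chapter:

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0)
);
```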
You can also give the constraint a separate name. This clarifies error messages and allows you to refer to
the constraint when you need to change it. The syntax is:
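A sketch, with positive_price as an illustrative constraint name:

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CONSTRAINT positive_price CHECK (price > 0)
);
```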
So, to specify a named constraint, use the key word CONSTRAINT followed by an identifier followed by
the constraint definition. (If you don’t specify a constraint name in this way, the system chooses a name
for you.)
A check constraint can also refer to several columns. Say you store a regular price and a discounted price,
and you want to ensure that the discounted price is lower than the regular price.
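A sketch of such a table; discounted_price is an assumed column name:

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0),
    discounted_price numeric CHECK (discounted_price > 0),
    CHECK (price > discounted_price)
);
```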
The first two constraints should look familiar. The third one uses a new syntax. It is not attached to a
particular column; instead, it appears as a separate item in the comma-separated column list. Column
definitions and these constraint definitions can be listed in mixed order.
We say that the first two constraints are column constraints, whereas the third one is a table constraint
because it is written separately from any one column definition. Column constraints can also be written
as table constraints, while the reverse is not necessarily possible, since a column constraint is supposed to
refer to only the column it is attached to. (PostgreSQL doesn’t enforce that rule, but you should follow it
if you want your table definitions to work with other database systems.) The above example could also be
written as
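A sketch of the all-table-constraints form:

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric,
    CHECK (price > 0),
    discounted_price numeric,
    CHECK (discounted_price > 0),
    CHECK (price > discounted_price)
);
```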
or even
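A sketch of a condensed form that folds two of the conditions into one table constraint:

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric CHECK (price > 0),
    discounted_price numeric,
    CHECK (discounted_price > 0 AND price > discounted_price)
);
```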
It should be noted that a check constraint is satisfied if the check expression evaluates to true or the null
value. Since most expressions will evaluate to the null value if any operand is null, they will not prevent
null values in the constrained columns. To ensure that a column does not contain null values, the not-null
constraint described in the next section can be used.
5.3.2. Not-Null Constraints
A not-null constraint simply specifies that a column must not assume the null value. A not-null constraint
is always written as a column constraint. It is functionally equivalent to creating a check constraint
CHECK (column_name IS NOT NULL), but in PostgreSQL creating an explicit not-null constraint is more
efficient. The drawback is that you cannot give explicit names to not-null constraints created that way.
Of course, a column can have more than one constraint. Just write the constraints one after another:
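A sketch of a column carrying both a not-null and a check constraint:

```sql
CREATE TABLE products (
    product_no integer NOT NULL,
    name text NOT NULL,
    price numeric NOT NULL CHECK (price > 0)
);
```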
The order doesn’t matter. It does not necessarily determine in which order the constraints are checked.
The NOT NULL constraint has an inverse: the NULL constraint. This does not mean that the column must
be null, which would surely be useless. Instead, this simply selects the default behavior that the column
may be null. The NULL constraint is not defined in the SQL standard and should not be used in portable
applications. (It was only added to PostgreSQL to be compatible with some other database systems.) Some
users, however, like it because it makes it easy to toggle the constraint in a script file. For example, you
could start with
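A sketch of such a starting point (every column explicitly marked NULL):

```sql
CREATE TABLE products (
    product_no integer NULL,
    name text NULL,
    price numeric NULL
);
```

and then insert the NOT key word where desired.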
Tip: In most database designs the majority of columns should be marked not null.
5.3.3. Unique Constraints
Unique constraints ensure that the data contained in a column, or a group of columns, is unique with
respect to all the rows in the table. Written as a table constraint over a group of columns, this specifies
that the combination of values in the indicated columns is unique across the whole table, though any one
of the columns need not be (and ordinarily isn't) unique.
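Such a multicolumn unique constraint might look like this (table and column names are illustrative):

```sql
CREATE TABLE example (
    a integer,
    b integer,
    c integer,
    UNIQUE (a, c)
);
```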
You can assign your own name for a unique constraint, in the usual way:
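A sketch, with must_be_different as an illustrative constraint name:

```sql
CREATE TABLE products (
    product_no integer CONSTRAINT must_be_different UNIQUE,
    name text,
    price numeric
);
```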
In general, a unique constraint is violated when there are two or more rows in the table where the values
of all of the columns included in the constraint are equal. However, null values are not considered equal in
this comparison. That means even in the presence of a unique constraint it is possible to store an unlimited
number of rows that contain a null value in at least one of the constrained columns. This behavior conforms
to the SQL standard, but we have heard that other SQL databases may not follow this rule. So be careful
when developing applications that are intended to be portable.
5.3.4. Primary Keys
Technically, a primary key constraint is simply the combination of a unique constraint and a not-null
constraint. Primary keys can also constrain more than one column; the syntax is similar to unique constraints:
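A sketch with illustrative names:

```sql
CREATE TABLE example (
    a integer,
    b integer,
    c integer,
    PRIMARY KEY (a, c)
);
```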
A primary key indicates that a column or group of columns can be used as a unique identifier for rows in
the table. (This is a direct consequence of the definition of a primary key. Note that a unique constraint
does not, by itself, provide a unique identifier because it does not exclude null values.) This is useful
both for documentation purposes and for client applications. For example, a GUI application that allows
modifying row values probably needs to know the primary key of a table to be able to identify rows
uniquely.
A table can have at most one primary key (while it can have many unique and not-null constraints).
Relational database theory dictates that every table must have a primary key. This rule is not enforced by
PostgreSQL, but it is usually best to follow it.
5.3.5. Foreign Keys
A foreign key constraint specifies that the values in a column (or a group of columns) must match the
values appearing in some row of another table. We say this maintains the referential integrity between
two related tables. Say you have the products table we have used several times already. Let's also assume
you have a table storing orders of those products. We want to ensure that the orders table
only contains orders of products that actually exist. So we define a foreign key constraint in the orders
table that references the products table:
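A sketch, assuming products has a primary key column product_no:

```sql
CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    product_no integer REFERENCES products (product_no),
    quantity integer
);
```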
Now it is impossible to create orders with product_no entries that do not appear in the products table.
We say that in this situation the orders table is the referencing table and the products table is the referenced
table. Similarly, there are referencing and referenced columns.
You can also shorten the above command to
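The shortened form of the same sketch:

```sql
CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    product_no integer REFERENCES products,
    quantity integer
);
```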
because in absence of a column list the primary key of the referenced table is used as the referenced
column(s).
A foreign key can also constrain and reference a group of columns. As usual, it then needs to be written
in table constraint form. Here is a contrived syntax example:
CREATE TABLE t1 (
a integer PRIMARY KEY,
b integer,
c integer,
FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)
);
Of course, the number and type of the constrained columns need to match the number and type of the
referenced columns.
You can assign your own name for a foreign key constraint, in the usual way.
A table can contain more than one foreign key constraint. This is used to implement many-to-many rela-
tionships between tables. Say you have tables about products and orders, but now you want to allow one
order to contain possibly many products (which the structure above did not allow). You could use this
table structure:
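A sketch of such a schema, with an order_items table linking the other two (column names assumed):

```sql
CREATE TABLE products (
    product_no integer PRIMARY KEY,
    name text,
    price numeric
);

CREATE TABLE orders (
    order_id integer PRIMARY KEY,
    shipping_address text
);

CREATE TABLE order_items (
    product_no integer REFERENCES products,
    order_id integer REFERENCES orders,
    quantity integer,
    PRIMARY KEY (product_no, order_id)
);
```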
Notice that the primary key overlaps with the foreign keys in the last table.
We know that the foreign keys disallow creation of orders that do not relate to any products. But what if
a product is removed after an order is created that references it? SQL allows you to handle that as well.
Intuitively, we have a few options:
To illustrate this, let’s implement the following policy on the many-to-many relationship example above:
when someone wants to remove a product that is still referenced by an order (via order_items), we
disallow it. If someone removes an order, the order items are removed as well.
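A sketch of the order_items table with these referential actions spelled out:

```sql
CREATE TABLE order_items (
    product_no integer REFERENCES products ON DELETE RESTRICT,
    order_id integer REFERENCES orders ON DELETE CASCADE,
    quantity integer,
    PRIMARY KEY (product_no, order_id)
);
```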
Restricting and cascading deletes are the two most common options. RESTRICT prevents deletion of a
referenced row. NO ACTION means that if any referencing rows still exist when the constraint is checked,
an error is raised; this is the default behavior if you do not specify anything. (The essential difference
between these two choices is that NO ACTION allows the check to be deferred until later in the transaction,
whereas RESTRICT does not.) CASCADE specifies that when a referenced row is deleted, row(s) referencing
it should be automatically deleted as well. There are two other options: SET NULL and SET DEFAULT.
These cause the referencing columns to be set to nulls or default values, respectively, when the referenced
row is deleted. Note that these do not excuse you from observing any constraints. For example, if an action
specifies SET DEFAULT but the default value would not satisfy the foreign key, the operation will fail.
Analogous to ON DELETE there is also ON UPDATE which is invoked when a referenced column is
changed (updated). The possible actions are the same.
More information about updating and deleting data is in Chapter 6.
Finally, we should mention that a foreign key must reference columns that either are a primary key or
form a unique constraint. If the foreign key references a unique constraint, there are some additional
possibilities regarding how null values are matched. These are explained in the reference documentation
for CREATE TABLE.
5.4. System Columns
Every table has several columns that are implicitly defined by the system. Therefore, these names cannot
be used as names of user-defined columns. These system columns are the following:
oid
The object identifier (object ID) of a row. This is a serial number that is automatically added by
PostgreSQL to all table rows (unless the table was created using WITHOUT OIDS, in which case this
column is not present). This column is of type oid (same name as the column); see Section 8.12 for
more information about the type.
tableoid
The OID of the table containing this row. This column is particularly handy for queries that select
from inheritance hierarchies, since without it, it’s difficult to tell which individual table a row came
from. The tableoid can be joined against the oid column of pg_class to obtain the table name.
xmin
The identity (transaction ID) of the inserting transaction for this row version. (A row version is an
individual state of a row; each update of a row creates a new row version for the same logical row.)
cmin
The command identifier (starting at zero) within the inserting transaction.
xmax
The identity (transaction ID) of the deleting transaction, or zero for an undeleted row version. It
is possible for this column to be nonzero in a visible row version. That usually indicates that the
deleting transaction hasn’t committed yet, or that an attempted deletion was rolled back.
cmax
The command identifier within the deleting transaction, or zero.
ctid
The physical location of the row version within its table. Note that although the ctid can be used to
locate the row version very quickly, a row’s ctid will change each time it is updated or moved by
VACUUM FULL. Therefore ctid is useless as a long-term row identifier. The OID, or even better a
user-defined serial number, should be used to identify logical rows.
OIDs are 32-bit quantities and are assigned from a single cluster-wide counter. In a large or long-lived
database, it is possible for the counter to wrap around. Hence, it is bad practice to assume that OIDs are
unique, unless you take steps to ensure that this is the case. If you need to identify the rows in a table,
using a sequence generator is strongly recommended. However, OIDs can be used as well, provided that
a few additional precautions are taken:
• A unique constraint should be created on the OID column of each table for which the OID will be used
to identify rows.
• OIDs should never be assumed to be unique across tables; use the combination of tableoid and row
OID if you need a database-wide identifier.
• The tables in question should be created using WITH OIDS to ensure forward compatibility with future
releases of PostgreSQL. It is planned that WITHOUT OIDS will become the default.
Transaction identifiers are also 32-bit quantities. In a long-lived database it is possible for transaction IDs
to wrap around. This is not a fatal problem given appropriate maintenance procedures; see Chapter 21 for
details. It is unwise, however, to depend on the uniqueness of transaction IDs over the long term (more
than one billion transactions).
Command identifiers are also 32-bit quantities. This creates a hard limit of 2^32 (4 billion) SQL commands
within a single transaction. In practice this limit is not a problem — note that the limit is on number of
SQL commands, not number of rows processed.
5.5. Inheritance
Let’s create two tables. The capitals table contains state capitals which are also cities. Naturally, the
capitals table should inherit from cities.
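A sketch of the two tables:

```sql
CREATE TABLE cities (
    name        text,
    population  float,
    altitude    int     -- in feet
);

CREATE TABLE capitals (
    state       char(2)
) INHERITS (cities);
```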
In this case, a row of capitals inherits all attributes (name, population, and altitude) from its parent, cities.
State capitals have an extra attribute, state, that shows their state. In PostgreSQL, a table can inherit from
zero or more other tables, and a query can reference either all rows of a table or all rows of a table plus all
of its descendants.
For example, the following query finds the names of all cities, including state capitals, that are located at
an altitude over 500ft:
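A sketch of that query:

```sql
SELECT name, altitude
    FROM cities
    WHERE altitude > 500;
```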
which returns:
name | altitude
-----------+----------
Las Vegas | 2174
Mariposa | 1953
Madison | 845
On the other hand, the following query finds all the cities that are not state capitals and are situated at an
altitude over 500ft:
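A sketch of that query, using ONLY to exclude descendant tables:

```sql
SELECT name, altitude
    FROM ONLY cities
    WHERE altitude > 500;
```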
name | altitude
-----------+----------
Las Vegas | 2174
Mariposa | 1953
Here the “ONLY” before cities indicates that the query should be run over only cities and not tables below
cities in the inheritance hierarchy. Many of the commands that we have already discussed -- SELECT,
UPDATE and DELETE -- support this “ONLY” notation.
Deprecated: In previous versions of PostgreSQL, the default behavior was not to include child tables
in queries. This was found to be error prone and is also in violation of the SQL:1999 standard. Under
the old syntax, to get the sub-tables you append * to the table name. For example
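A sketch of the old, now-deprecated decoration:

```sql
SELECT * FROM cities*;
```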
You can still explicitly specify scanning child tables by appending *, as well as explicitly specify not
scanning child tables by writing “ONLY”. But beginning in version 7.1, the default behavior for an
undecorated table name is to scan its child tables too, whereas before the default was not to do so.
To get the old default behavior, set the configuration option SQL_Inheritance to off, e.g.,
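For example:

```sql
SET SQL_Inheritance TO OFF;
```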
In some cases you may wish to know which table a particular row originated from. There is a system
column called tableoid in each table which can tell you the originating table:
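A sketch of such a query over the cities example:

```sql
SELECT c.tableoid, c.name, c.altitude
    FROM cities c
    WHERE c.altitude > 500;
```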
which returns:
(If you try to reproduce this example, you will probably get different numeric OIDs.) By doing a join with
pg_class you can see the actual table names:
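A sketch of the join (the OID values in your output will differ):

```sql
SELECT p.relname, c.name, c.altitude
    FROM cities c, pg_class p
    WHERE c.altitude > 500 AND c.tableoid = p.oid;
```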
which returns:
A table can inherit from more than one parent table, in which case it has the union of the columns defined
by the parent tables (plus any columns declared specifically for the child table).
A serious limitation of the inheritance feature is that indexes (including unique constraints) and foreign
key constraints only apply to single tables, not to their inheritance children. This is true on both the
referencing and referenced sides of a foreign key constraint. Thus, in the terms of the above example:
• If we declared cities.name to be UNIQUE or a PRIMARY KEY, this would not stop the capitals
table from having rows with names duplicating rows in cities. And those duplicate rows would by
default show up in queries from cities. In fact, by default capitals would have no unique constraint
at all, and so could contain multiple rows with the same name. You could add a unique constraint to
capitals, but this would not prevent duplication compared to cities.
• Similarly, if we were to specify that cities.name REFERENCES some other table, this constraint would
not automatically propagate to capitals. In this case you could work around it by manually adding
the same REFERENCES constraint to capitals.
• Specifying that another table’s column REFERENCES cities(name) would allow the other table to
contain city names, but not capital names. There is no good workaround for this case.
These deficiencies will probably be fixed in some future release, but in the meantime considerable care is
needed in deciding whether inheritance is useful for your problem.
5.6. Modifying Tables
When you create a table and you realize that you made a mistake, or the requirements of the application
change, you can drop the table and create it again. But this is not a convenient option if the table is already
filled with data, or if the table is referenced by other database objects (for instance a foreign key
constraint). Therefore PostgreSQL provides a family of commands to make modifications to existing
tables. You can:
• Add columns,
• Remove columns,
• Add constraints,
• Remove constraints,
• Change default values,
• Change column data types,
• Rename columns,
• Rename tables.
All these actions are performed using the ALTER TABLE command.
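To add a column, use a command like the following sketch (description is an illustrative column name):

```sql
ALTER TABLE products ADD COLUMN description text;
```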
The new column is initially filled with whatever default value is given (null if you don’t specify a DEFAULT
clause).
You can also define constraints on the column at the same time, using the usual syntax:
ALTER TABLE products ADD COLUMN description text CHECK (description <> '');
In fact all the options that can be applied to a column description in CREATE TABLE can be used here.
Keep in mind however that the default value must satisfy the given constraints, or the ADD will fail.
Alternatively, you can add constraints later (see below) after you’ve filled in the new column correctly.
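To remove a column, use a command like this sketch:

```sql
ALTER TABLE products DROP COLUMN description;
```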
Whatever data was in the column disappears. Table constraints involving the column are dropped, too.
However, if the column is referenced by a foreign key constraint of another table, PostgreSQL will not
silently drop that constraint. You can authorize dropping everything that depends on the column by adding
CASCADE:
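A sketch:

```sql
ALTER TABLE products DROP COLUMN description CASCADE;
```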
See Section 5.10 for a description of the general mechanism behind this.
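To add a constraint to an existing table, the table constraint syntax is used; for example (some_name is an illustrative constraint name):

```sql
ALTER TABLE products ADD CHECK (name <> '');
ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no);
```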
To add a not-null constraint, which cannot be written as a table constraint, use this syntax:
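A sketch of that syntax:

```sql
ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;
```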
The constraint will be checked immediately, so the table data must satisfy the constraint before it can be
added.
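To remove a constraint you need to know its name; if you did not name it yourself, the system assigned a generated name that you must look up first. The command is sketched below (some_name is a placeholder):

```sql
ALTER TABLE products DROP CONSTRAINT some_name;
```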
(If you are dealing with a generated constraint name like $2, don’t forget that you’ll need to double-quote
it to make it a valid identifier.)
As with dropping a column, you need to add CASCADE if you want to drop a constraint that something else
depends on. An example is that a foreign key constraint depends on a unique or primary key constraint on
the referenced column(s).
This works the same for all constraint types except not-null constraints. To drop a not null constraint use
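Sketches of the two commands discussed here, dropping a not-null constraint and setting a new column default (the value 7.77 is illustrative):

```sql
ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;

ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
```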
Note that this doesn’t affect any existing rows in the table, it just changes the default for future INSERT
commands.
To remove any default value, use
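A sketch:

```sql
ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
```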
This is effectively the same as setting the default to null. As a consequence, it is not an error to drop a
default where one hadn’t been defined, because the default is implicitly the null value.
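To convert a column to a different data type, use a command like this sketch:

```sql
ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
```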
This will succeed only if each existing entry in the column can be converted to the new type by an implicit
cast. If a more complex conversion is needed, you can add a USING clause that specifies how to compute
the new values from the old.
PostgreSQL will attempt to convert the column’s default value (if any) to the new type, as well as any
constraints that involve the column. But these conversions may fail, or may produce surprising results.
It’s often best to drop any constraints on the column before altering its type, and then add back suitably
modified constraints afterwards.
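Renaming, the remaining items from the list at the start of this section, can be sketched as follows (the new names are illustrative):

```sql
ALTER TABLE products RENAME COLUMN product_no TO product_number;

ALTER TABLE products RENAME TO items;
```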
5.7. Privileges
When you create a database object, you become its owner. By default, only the owner of an object can
do anything with the object. In order to allow other users to use it, privileges must be granted. (However,
users that have the superuser attribute can always access any object.)
There are several different privileges: SELECT, INSERT, UPDATE, DELETE, RULE, REFERENCES,
TRIGGER, CREATE, TEMPORARY, EXECUTE, and USAGE. The privileges applicable to a particular object
vary depending on the object’s type (table, function, etc). For complete information on the different types
of privileges supported by PostgreSQL, refer to the GRANT reference page. The following sections and
chapters will also show you how those privileges are used.
The right to modify or destroy an object is always the privilege of the owner only.
Note: To change the owner of a table, index, sequence, or view, use the ALTER TABLE command.
There are corresponding ALTER commands for other object types.
To assign privileges, the GRANT command is used. For example, if joe is an existing user, and accounts
is an existing table, the privilege to update the table can be granted with
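A sketch:

```sql
GRANT UPDATE ON accounts TO joe;
```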
The special “user” name PUBLIC can be used to grant a privilege to every user on the system. Writing
ALL in place of a specific privilege grants all privileges that are relevant for the object type.
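To revoke a privilege, use the REVOKE command, for example:

```sql
REVOKE ALL ON accounts FROM PUBLIC;
```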
The special privileges of the object owner (i.e., the right to do DROP, GRANT, REVOKE, etc.) are always
implicit in being the owner, and cannot be granted or revoked. But the object owner can choose to revoke
his own ordinary privileges, for example to make a table read-only for himself as well as others.
Ordinarily, only the object’s owner (or a superuser) can grant or revoke privileges on an object. However,
it is possible to grant a privilege “with grant option”, which gives the recipient the right to grant it in
turn to others. If the grant option is subsequently revoked then all who received the privilege from that
recipient (directly or through a chain of grants) will lose the privilege. For details see the GRANT and
REVOKE reference pages.
5.8. Schemas
A PostgreSQL database cluster contains one or more named databases. Users and groups of users are
shared across the entire cluster, but no other data is shared across databases. Any given client connection
to the server can access only the data in a single database, the one specified in the connection request.
Note: Users of a cluster do not necessarily have the privilege to access every database in the cluster.
Sharing of user names means that there cannot be different users named, say, joe in two databases in
the same cluster; but the system can be configured to allow joe access to only some of the databases.
A database contains one or more named schemas, which in turn contain tables. Schemas also contain
other kinds of named objects, including data types, functions, and operators. The same object name can
be used in different schemas without conflict; for example, both schema1 and myschema may contain
tables named mytable. Unlike databases, schemas are not rigidly separated: a user may access objects in
any of the schemas in the database he is connected to, if he has privileges to do so.
There are several reasons why one might want to use schemas:
• To allow many users to use one database without interfering with each other.
• To organize database objects into logical groups to make them more manageable.
• Third-party applications can be put into separate schemas so they cannot collide with the names of
other objects.
Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.
To create or access objects in a schema, write a qualified name consisting of the schema name and table
name separated by a dot:
schema.table
This works anywhere a table name is expected, including the table modification commands and the data
access commands discussed in the following chapters. (For brevity we will speak of tables only, but the
same ideas apply to other kinds of named objects, such as types and functions.)
Actually, the even more general syntax
database.schema.table
can be used too, but at present this is just for pro forma compliance with the SQL standard. If you write a
database name, it must be the same as the database you are connected to.
5.8.1. Creating a Schema
To create a schema, use the CREATE SCHEMA command, giving the schema a name of your choice. So
to create a schema and a table in the new schema, use commands like the following:
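A sketch, using the myschema and mytable names from earlier in this section (the column definition is illustrative):

```sql
CREATE SCHEMA myschema;

CREATE TABLE myschema.mytable (
    id integer
);
```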
To drop a schema if it’s empty (all objects in it have been dropped), use
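Sketches of both forms, for an empty schema and for a schema with its contents:

```sql
DROP SCHEMA myschema;

-- To drop a schema including all contained objects:
DROP SCHEMA myschema CASCADE;
```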
See Section 5.10 for a description of the general mechanism behind this.
Often you will want to create a schema owned by someone else (since this is one of the ways to restrict
the activities of your users to well-defined namespaces). The syntax for that is:
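In outline (schemaname and username are placeholders):

```sql
CREATE SCHEMA schemaname AUTHORIZATION username;
```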
You can even omit the schema name, in which case the schema name will be the same as the user name.
See Section 5.8.6 for how this can be useful.
Schema names beginning with pg_ are reserved for system purposes and may not be created by users.
5.8.2. The Public Schema
In the previous sections we created tables without specifying any schema names. By default, such tables
(and other objects) are automatically put into a schema named “public”. Every new database contains
such a schema.
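Thus, for example, the following two commands are equivalent (sketch of a minimal products table):

```sql
CREATE TABLE products (product_no integer);

CREATE TABLE public.products (product_no integer);
```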
5.8.3. The Schema Search Path
Qualified names are tedious to write, and it is often best not to wire a particular schema name into
applications anyway. Therefore tables are often referred to by unqualified names, which consist of just
the table name. The system determines which table is meant by following a search path, which is a list
of schemas to look in. The first matching table in the search path is taken to be the one wanted.
To show the current search path, use the following command:
SHOW search_path;
In the default setup this returns:
search_path
--------------
$user,public
The first element specifies that a schema with the same name as the current user is to be searched. If no
such schema exists, the entry is ignored. The second element refers to the public schema that we have
seen already.
The first schema in the search path that exists is the default location for creating new objects. That is
the reason that by default objects are created in the public schema. When objects are referenced in any
other context without schema qualification (table modification, data modification, or query commands)
the search path is traversed until a matching object is found. Therefore, in the default configuration, any
unqualified access again can only refer to the public schema.
To put our new schema in the path, we use
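A sketch:

```sql
SET search_path TO myschema,public;
```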
(We omit the $user here because we have no immediate need for it.) And then we can access the table
without schema qualification:
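For instance, an unqualified name now resolves in myschema first, so a command like this sketch operates on myschema.mytable:

```sql
DROP TABLE mytable;
```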
Also, since myschema is the first element in the path, new objects would by default be created in it.
We could also have written
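A sketch:

```sql
SET search_path TO myschema;
```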
Then we no longer have access to the public schema without explicit qualification. There is nothing special
about the public schema except that it exists by default. It can be dropped, too.
See also Section 9.19 for other ways to manipulate the schema search path.
The search path works in the same way for data type names, function names, and operator names as it
does for table names. Data type and function names can be qualified in exactly the same way as table
names. If you need to write a qualified operator name in an expression, there is a special provision: you
must write
OPERATOR(schema.operator )
SELECT 3 OPERATOR(pg_catalog.+) 4;
In practice one usually relies on the search path for operators, so as not to have to write anything so ugly
as that.
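For example, a superuser could withdraw every user's right to create objects in the public schema with a command like:

```sql
REVOKE CREATE ON SCHEMA public FROM PUBLIC;
```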
(The first “public” is the schema, the second “public” means “every user”. In the first sense it is an
identifier, in the second sense it is a key word, hence the different capitalization; recall the guidelines
from Section 4.1.1.)
5.8.6. Usage Patterns
Schemas can be used to organize your data in many ways. There are a few usage patterns that are
recommended and easily supported by the default configuration:
• If you do not create any schemas then all users access the public schema implicitly. This simulates the
situation where schemas are not available at all. This setup is mainly recommended when there is only
a single user or a few cooperating users in a database. This setup also allows smooth transition from the
non-schema-aware world.
• You can create a schema for each user with the same name as that user. Recall that the default search
path starts with $user, which resolves to the user name. Therefore, if each user has a separate schema,
they access their own schemas by default.
If you use this setup then you might also want to revoke access to the public schema (or drop it alto-
gether), so users are truly constrained to their own schemas.
• To install shared applications (tables to be used by everyone, additional functions provided by third
parties, etc.), put them into separate schemas. Remember to grant appropriate privileges to allow the
other users to access them. Users can then refer to these additional objects by qualifying the names with
a schema name, or they can put the additional schemas into their search path, as they choose.
5.8.7. Portability
In the SQL standard, the notion of objects in the same schema being owned by different users does not
exist. Moreover, some implementations do not allow you to create schemas that have a different name
than their owner. In fact, the concepts of schema and user are nearly equivalent in a database system
that implements only the basic schema support specified in the standard. Therefore, many users consider
qualified names to really consist of username.tablename. This is how PostgreSQL will effectively
behave if you create a per-user schema for every user.
Also, there is no concept of a public schema in the SQL standard. For maximum conformance to the
standard, you should not use (perhaps even remove) the public schema.
Of course, some SQL database systems might not implement schemas at all, or provide namespace sup-
port by allowing (possibly limited) cross-database access. If you need to work with those systems, then
maximum portability would be achieved by not using schemas at all.
5.9. Other Database Objects
Tables are the central objects in a relational database structure, because they hold your data. But they are
not the only objects that exist in a database. Many other kinds of objects can be created to make the use
and management of the data more efficient or convenient. They are not discussed in this chapter, but we
give you a list here so that you are aware of what is possible:
• Views
• Functions and operators
• Data types and domains
• Triggers and rewrite rules
Detailed information on these topics appears in Part V.
5.10. Dependency Tracking
When you create complex database structures involving many tables with foreign key constraints, views,
triggers, functions, etc., you implicitly create a net of dependencies between the objects. To ensure the
integrity of the entire database structure, PostgreSQL makes sure that you cannot drop objects that other
objects still depend on. For example, attempting to drop the products table we considered in Section 5.3.5,
with the orders table depending on it, would result in an error message such as this:
The error message contains a useful hint: if you do not want to bother deleting all the dependent objects
individually, you can run
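A sketch:

```sql
DROP TABLE products CASCADE;
```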
and all the dependent objects will be removed. In this case, it doesn’t remove the orders table, it only
removes the foreign key constraint. (If you want to check what DROP ... CASCADE will do, run DROP
without CASCADE and read the NOTICE messages.)
All drop commands in PostgreSQL support specifying CASCADE. Of course, the nature of the possible
dependencies varies with the type of the object. You can also write RESTRICT instead of CASCADE to get
the default behavior, which is to prevent drops of objects that other objects depend on.
Note: According to the SQL standard, specifying either RESTRICT or CASCADE is required. No database
system actually enforces that rule, but whether the default behavior is RESTRICT or CASCADE varies
across systems.
Note: Foreign key constraint dependencies and serial column dependencies from PostgreSQL ver-
sions prior to 7.3 are not maintained or created during the upgrade process. All other dependency
types will be properly created during an upgrade from a pre-7.3 database.
Chapter 6. Data Manipulation
The previous chapter discussed how to create tables and other structures to hold your data. Now it is
time to fill the tables with data. This chapter covers how to insert, update, and delete table data. We also
introduce ways to effect automatic data changes when certain events occur: triggers and rewrite rules. The
chapter after this will finally explain how to extract your long-lost data back out of the database.
6.1. Inserting Data
When a table is created, it contains no data. The first thing to do before a database can be of much use
is to insert data. Data is inserted one row at a time. To create a new row, use the INSERT command;
it requires the table name and a value for each of the columns of the table. The data values are listed
in the order in which the columns appear in the table, separated by commas. Usually, the data values
will be literals (constants), but scalar expressions are also allowed.
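For example, consider the products table from Chapter 5 (its column definitions repeated here as a sketch):

```sql
CREATE TABLE products (
    product_no integer,
    name text,
    price numeric
);

INSERT INTO products VALUES (1, 'Cheese', 9.99);
```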
The above syntax has the drawback that you need to know the order of the columns in the table. To avoid
that you can also list the columns explicitly. For example, both of the following commands have the same
effect as the one above:
INSERT INTO products (product_no, name, price) VALUES (1, 'Cheese', 9.99);
INSERT INTO products (name, price, product_no) VALUES ('Cheese', 9.99, 1);
Many users consider it good practice to always list the column names.
If you don’t have values for all the columns, you can omit some of them. In that case, the columns will be
filled with their default values. For example,
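Both of the following sketches omit the price column, which will then receive its default value:

```sql
INSERT INTO products (product_no, name) VALUES (1, 'Cheese');
INSERT INTO products VALUES (1, 'Cheese');
```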
The second form is a PostgreSQL extension. It fills the columns from the left with as many values as are
given, and the rest will be defaulted.
For clarity, you can also request default values explicitly, for individual columns or for the entire row:
INSERT INTO products (product_no, name, price) VALUES (1, 'Cheese', DEFAULT);
INSERT INTO products DEFAULT VALUES;
Tip: To do “bulk loads”, that is, inserting a lot of data, take a look at the COPY command. It is not as
flexible as the INSERT command, but is more efficient.
6.2. Updating Data
The modification of data that is already in the database is referred to as updating. You can update individual
rows, all the rows in a table, or a subset of all rows. Each column can be updated separately; the
other columns are not affected.
Recall from Chapter 5 that SQL does not, in general, provide a unique identifier for rows. Therefore it is
not necessarily possible to directly specify which row to update. Instead, you specify which conditions
a row must meet in order to be updated. Only if you have a primary key in the table (no matter whether
you declared it or not) can you reliably address individual rows, by choosing a condition that matches the
primary key. Graphical database access tools rely on this fact to allow you to update rows individually.
For example, this command updates all products that have a price of 5 to have a price of 10:
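With the products table used in the INSERT examples, the command would look like this:

```sql
UPDATE products SET price = 10 WHERE price = 5;
```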
This may cause zero, one, or many rows to be updated. It is not an error to attempt an update that does not
match any rows.
Let’s look at that command in detail. First is the key word UPDATE followed by the table name. As usual,
the table name may be schema-qualified, otherwise it is looked up in the path. Next is the key word SET
followed by the column name, an equals sign and the new column value. The new column value can be
any scalar expression, not just a constant. For example, if you want to raise the price of all products by
10% you could use:
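Again with the products table:

```sql
UPDATE products SET price = price * 1.10;
```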
As you see, the expression for the new value can refer to the existing value(s) in the row. We also left
out the WHERE clause. If it is omitted, it means that all rows in the table are updated. If it is present, only
those rows that match the WHERE condition are updated. Note that the equals sign in the SET clause is an
assignment while the one in the WHERE clause is a comparison, but this does not create any ambiguity. Of
course, the WHERE condition does not have to be an equality test. Many other operators are available (see
Chapter 9). But the expression needs to evaluate to a Boolean result.
You can update more than one column in an UPDATE command by listing more than one assignment in
the SET clause. For example:
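A sketch with a hypothetical table mytable having columns a, b, and c:

```sql
UPDATE mytable SET a = 5, b = 3, c = 1 WHERE a > 0;
```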
Chapter 7. Queries
The previous chapters explained how to create tables, how to fill them with data, and how to manipulate
that data. Now we finally discuss how to retrieve the data out of the database.
7.1. Overview
The process of retrieving or the command to retrieve data from a database is called a query. In SQL the
SELECT command is used to specify queries. The general syntax of the SELECT command is
SELECT select_list FROM table_expression [sort_specification]
The following sections describe the details of the select list, the table expression, and the sort specification.
The simplest kind of query has the form
SELECT * FROM table1;
Assuming that there is a table called table1, this command would retrieve all rows and all columns from
table1. (The method of retrieval depends on the client application. For example, the psql program will
display an ASCII-art table on the screen, while client libraries will offer functions to extract individual
values from the query result.) The select list specification * means all columns that the table expression
happens to provide. A select list can also select a subset of the available columns or make calculations
using the columns. For example, if table1 has columns named a, b, and c (and perhaps others) you can
make the following query:
SELECT a, b + c FROM table1;
(assuming that b and c are of a numerical data type). See Section 7.3 for more details.
FROM table1 is a particularly simple kind of table expression: it reads just one table. In general, table
expressions can be complex constructs of base tables, joins, and subqueries. But you can also omit the
table expression entirely and use the SELECT command as a calculator:
SELECT 3 * 4;
This is more useful if the expressions in the select list return varying results. For example, you could call
a function this way:
SELECT random();
7.2. Table Expressions
A table expression computes a table. The table expression contains a FROM clause that is optionally
followed by WHERE, GROUP BY, and HAVING clauses.
The optional WHERE, GROUP BY, and HAVING clauses in the table expression specify a pipeline of succes-
sive transformations performed on the table derived in the FROM clause. All these transformations produce
a virtual table that provides the rows that are passed to the select list to compute the output rows of the
query.
7.2.1. The FROM Clause
The FROM clause derives a table from one or more other tables given in a comma-separated table
reference list:
FROM table_reference [, table_reference [, ...]]
A table reference may be a table name (possibly schema-qualified), or a derived table such as a subquery,
a table join, or complex combinations of these. If more than one table reference is listed in the FROM
clause they are cross-joined (see below) to form the intermediate virtual table that may then be subject to
transformations by the WHERE, GROUP BY, and HAVING clauses and is finally the result of the overall table
expression.
When a table reference names a table that is the supertable of a table inheritance hierarchy, the table
reference produces rows of not only that table but all of its subtable successors, unless the key word ONLY
precedes the table name. However, the reference produces only the columns that appear in the named table
— any columns added in subtables are ignored.
The general syntax of a joined table is
T1 join_type T2 [ join_condition ]
Join Types
Cross join
T1 CROSS JOIN T2
For each combination of rows from T1 and T2, the derived table will contain a row consisting of
all columns in T1 followed by all columns in T2. If the tables have N and M rows respectively, the
joined table will have N * M rows.
FROM T1 CROSS JOIN T2 is equivalent to FROM T1, T2. It is also equivalent to FROM T1 INNER
JOIN T2 ON TRUE (see below).
Qualified joins
T1 { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN T2 ON boolean_expression
T1 { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN T2 USING ( join column list )
T1 NATURAL { [INNER] | { LEFT | RIGHT | FULL } [OUTER] } JOIN T2
The words INNER and OUTER are optional in all forms. INNER is the default; LEFT, RIGHT, and FULL
imply an outer join.
The join condition is specified in the ON or USING clause, or implicitly by the word NATURAL. The
join condition determines which rows from the two source tables are considered to “match”, as
explained in detail below.
The ON clause is the most general kind of join condition: it takes a Boolean value expression of the
same kind as is used in a WHERE clause. A pair of rows from T1 and T2 match if the ON expression
evaluates to true for them.
USING is a shorthand notation: it takes a comma-separated list of column names, which the joined
tables must have in common, and forms a join condition specifying equality of each of these pairs
of columns. Furthermore, the output of a JOIN USING has one column for each of the equated pairs
of input columns, followed by all of the other columns from each table. Thus, USING (a, b, c)
is equivalent to ON (t1.a = t2.a AND t1.b = t2.b AND t1.c = t2.c) with the exception
that if ON is used there will be two columns a, b, and c in the result, whereas with USING there will
be only one of each.
Finally, NATURAL is a shorthand form of USING: it forms a USING list consisting of exactly those
column names that appear in both input tables. As with USING, these columns appear only once in
the output table.
The possible types of qualified join are:
INNER JOIN
For each row R1 of T1, the joined table has a row for each row in T2 that satisfies the join
condition with R1.
LEFT OUTER JOIN
First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition
with any row in T2, a joined row is added with null values in columns of T2. Thus, the joined
table unconditionally has at least one row for each row in T1.
RIGHT OUTER JOIN
First, an inner join is performed. Then, for each row in T2 that does not satisfy the join condition
with any row in T1, a joined row is added with null values in columns of T1. This is the converse
of a left join: the result table will unconditionally have a row for each row in T2.
FULL OUTER JOIN
First, an inner join is performed. Then, for each row in T1 that does not satisfy the join condition
with any row in T2, a joined row is added with null values in columns of T2. Also, for each row
of T2 that does not satisfy the join condition with any row in T1, a joined row with null values
in the columns of T1 is added.
Joins of all types can be chained together or nested: either or both of T1 and T2 may be joined tables.
Parentheses may be used around JOIN clauses to control the join order. In the absence of parentheses,
JOIN clauses nest left-to-right.
To put this together, assume we have tables t1

 num | name
-----+------
   1 | a
   2 | b
   3 | c

and t2

 num | value
-----+-------
   1 | xxx
   3 | yyy
   5 | zzz
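For example, an inner join and a left outer join of these tables behave as follows; the outputs are worked out from the rows shown above:

```sql
SELECT * FROM t1 INNER JOIN t2 ON t1.num = t2.num;
--  num | name | num | value
-- -----+------+-----+-------
--    1 | a    |   1 | xxx
--    3 | c    |   3 | yyy
-- (2 rows)

SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num;
--  num | name | num | value
-- -----+------+-----+-------
--    1 | a    |   1 | xxx
--    2 | b    |     |
--    3 | c    |     |
-- (3 rows)
```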
The join condition specified with ON can also contain conditions that do not relate directly to the join. This
can prove useful for some queries but needs to be thought out carefully. For example:
=> SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num AND t2.value = 'xxx';
 num | name | num | value
-----+------+-----+-------
   1 | a    |   1 | xxx
   2 | b    |     |
   3 | c    |     |
(3 rows)
or
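the same query with the extra restriction moved into the WHERE clause, which filters after the outer join has added its null-extended rows and therefore gives a different result:

```sql
SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num WHERE t2.value = 'xxx';
--  num | name | num | value
-- -----+------+-----+-------
--    1 | a    |   1 | xxx
-- (1 row)
```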
7.2.1.2. Table and Column Aliases
A temporary name can be given to tables and complex table references, to be used for references to the
derived table in further processing. This is called a table alias:
FROM table_reference AS alias
or
FROM table_reference alias
The alias becomes the new name of the table reference for the current query — it is no longer possible to
refer to the table by the original name. Thus
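a query along these lines (my_table and its column a are illustrative names):

```sql
SELECT * FROM my_table AS m WHERE my_table.a > 5;
```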
is not valid SQL syntax. What will actually happen (this is a PostgreSQL extension to the standard) is that
an implicit table reference is added to the FROM clause, so the query is processed as if it were written as
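with the same illustrative names, roughly:

```sql
SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5;
```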
which will result in a cross join, which is usually not what you want.
Table aliases are mainly for notational convenience, but it is necessary to use them when joining a table
to itself, e.g.,
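a sketch with a hypothetical people table:

```sql
SELECT * FROM people AS mother JOIN people AS child
    ON mother.id = child.mother_id;
```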
Additionally, an alias is required if the table reference is a subquery (see Section 7.2.1.3).
Parentheses are used to resolve ambiguities. The following statement will assign the alias b to the result
of the join, unlike the previous example:
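for instance (my_table is illustrative, and the trailing ... stands for the rest of the query):

```sql
SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
```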
Another form of table aliasing gives temporary names to the columns of the table, as well as the table
itself:
FROM table_reference [AS] alias ( column1 [, column2 [, ...]] )
If fewer column aliases are specified than the actual table has columns, the remaining columns are not
renamed. This syntax is especially useful for self-joins or subqueries.
When an alias is applied to the output of a JOIN clause, using any of these forms, the alias hides the
original names within the JOIN. For example,
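a query of this shape (the names are illustrative, and the ... stands for the join condition):

```sql
SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
```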
is not valid: the table alias a is not visible outside the alias c.
7.2.1.3. Subqueries
Subqueries specifying a derived table must be enclosed in parentheses and must be assigned a table alias
name. (See Section 7.2.1.2.) For example:
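a minimal sketch:

```sql
FROM (SELECT * FROM table1) AS alias_name
```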
This example is equivalent to FROM table1 AS alias_name. More interesting cases, which can’t be
reduced to a plain join, arise when the subquery involves grouping or aggregation.
In some cases it is useful to define table functions that can return different column sets depending on how
they are invoked. To support this, the table function can be declared as returning the pseudotype record.
When such a function is used in a query, the expected row structure must be specified in the query itself,
so that the system can know how to parse and plan the query. Consider this example:
SELECT *
FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
  AS t1(proname name, prosrc text)
The dblink function executes a remote query (see contrib/dblink). It is declared to return record
since it might be used for any kind of query. The actual column set must be specified in the calling query
so that the parser knows, for example, what * should expand to.
7.2.2. The WHERE Clause
The syntax of the WHERE clause is
WHERE search_condition
where search_condition is any value expression (see Section 4.2) that returns a value of type
boolean.
After the processing of the FROM clause is done, each row of the derived virtual table is checked against
the search condition. If the result of the condition is true, the row is kept in the output table, otherwise
(that is, if the result is false or null) it is discarded. The search condition typically references at least some
column of the table generated in the FROM clause; this is not required, but otherwise the WHERE clause will
be fairly useless.
Note: The join condition of an inner join can be written either in the WHERE clause or in the JOIN clause.
For example, these table expressions are equivalent:

FROM a, b WHERE a.id = b.id AND b.val > 5

and

FROM a JOIN b ON (a.id = b.id) WHERE b.val > 5

or perhaps even

FROM a NATURAL JOIN b WHERE b.val > 5
Which one of these you use is mainly a matter of style. The JOIN syntax in the FROM clause is probably
not as portable to other SQL database management systems. For outer joins there is no choice in any
case: they must be done in the FROM clause. An ON/USING clause of an outer join is not equivalent to
a WHERE condition, because it determines the addition of rows (for unmatched input rows) as well as
the removal of rows from the final result.
SELECT ... FROM fdt WHERE c1 IN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)
SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10) AND 100
SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 > fdt.c1)
fdt is the table derived in the FROM clause. Rows that do not meet the search condition of the WHERE
clause are eliminated from fdt. Notice the use of scalar subqueries as value expressions. Just like any
other query, the subqueries can employ complex table expressions. Notice also how fdt is referenced in
the subqueries. Qualifying c1 as fdt.c1 is only necessary if c1 is also the name of a column in the derived
input table of the subquery. But qualifying the column name adds clarity even when it is not needed. This
example shows how the column naming scope of an outer query extends into its inner queries.
7.2.3. The GROUP BY and HAVING Clauses
SELECT select_list
FROM ...
[WHERE ...]
GROUP BY grouping_column_reference [, grouping_column_reference]...
The GROUP BY Clause is used to group together those rows in a table that share the same values in all
the columns listed. The order in which the columns are listed does not matter. The effect is to combine
each set of rows sharing common values into one group row that is representative of all rows in the group.
This is done to eliminate redundancy in the output and/or compute aggregates that apply to these groups.
For instance:
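The example presumably used a small table test1 with columns x and y, which is what the next paragraph refers to; the y values below are chosen to be consistent with the HAVING output shown later in this section:

```sql
SELECT * FROM test1;
--  x | y
-- ---+---
--  a | 3
--  c | 2
--  b | 5
--  a | 1
-- (4 rows)

SELECT x FROM test1 GROUP BY x;
--  x
-- ---
--  a
--  b
--  c
-- (3 rows)
```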
In the second query, we could not have written SELECT * FROM test1 GROUP BY x, because there is
no single value for the column y that could be associated with each group. The grouped-by columns can
be referenced in the select list since they have a single value in each group.
In general, if a table is grouped, columns that are not used in the grouping cannot be referenced except in
aggregate expressions. An example with aggregate expressions is:
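With the hypothetical test1 data (per-group sums of y being 4, 5, and 2):

```sql
SELECT x, sum(y) FROM test1 GROUP BY x;
--  x | sum
-- ---+-----
--  a | 4
--  b | 5
--  c | 2
-- (3 rows)
```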
Here sum is an aggregate function that computes a single value over the entire group. More information
about the available aggregate functions can be found in Section 9.15.
Tip: Grouping without aggregate expressions effectively calculates the set of distinct values in a col-
umn. This can also be achieved using the DISTINCT clause (see Section 7.3.3).
Here is another example: it calculates the total sales for each product (rather than the total sales on all
products).
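A sketch using the products and sales tables that the next paragraph describes; the column names are the ones it mentions, while the structure of the sales table itself is an assumption:

```sql
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
    FROM products p LEFT JOIN sales s USING (product_id)
    GROUP BY product_id, p.name, p.price;
```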
In this example, the columns product_id, p.name, and p.price must be in the GROUP BY clause since
they are referenced in the query select list. (Depending on how exactly the products table is set up, name
and price may be fully dependent on the product ID, so the additional groupings could theoretically be
unnecessary, but this is not implemented yet.) The column s.units does not have to be in the GROUP BY
list since it is only used in an aggregate expression (sum(...)), which represents the sales of a product.
For each product, the query returns a summary row about all sales of the product.
In strict SQL, GROUP BY can only group by columns of the source table but PostgreSQL extends this to
also allow GROUP BY to group by columns in the select list. Grouping by value expressions instead of
simple column names is also allowed.
If a table has been grouped using a GROUP BY clause, but then only certain groups are of interest, the
HAVING clause can be used, much like a WHERE clause, to eliminate groups from a grouped table. The
syntax is:
SELECT select_list FROM ... [WHERE ...] GROUP BY ... HAVING boolean_expression
Expressions in the HAVING clause can refer both to grouped expressions and to ungrouped expressions
(which necessarily involve an aggregate function).
Example:

=> SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3;
 x | sum
---+-----
 a | 4
 b | 5
(2 rows)
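The paragraph below analyzes a second example combining WHERE and HAVING; it was presumably of roughly this shape (the date and cost columns and the exact expressions are assumptions consistent with the description):

```sql
SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
    FROM products p LEFT JOIN sales s USING (product_id)
    WHERE s.date > CURRENT_DATE - INTERVAL '4 weeks'
    GROUP BY product_id, p.name, p.price, p.cost
    HAVING sum(p.price * s.units) > 5000;
```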
In the example above, the WHERE clause is selecting rows by a column that is not grouped (the expression
is only true for sales during the last four weeks), while the HAVING clause restricts the output to groups
with total gross sales over 5000. Note that the aggregate expressions do not necessarily need to be the
same in all parts of the query.
7.3. Select Lists
The simplest kind of select list is *, which emits all columns that the table expression produces.
Otherwise, a select list is a comma-separated list of value expressions, as in
SELECT a, b, c FROM ...
The column names a, b, and c are either the actual names of the columns of tables referenced in the
FROM clause, or the aliases given to them as explained in Section 7.2.1.2. The name space available in the
select list is the same as in the WHERE clause, unless grouping is used, in which case it is the same as in
the HAVING clause.
If more than one table has a column of the same name, the table name must also be given, as in

SELECT tbl1.a, tbl2.a, tbl1.b FROM ...

When working with multiple tables, it can also be useful to ask for all the columns of a particular table:

SELECT tbl1.*, tbl2.a FROM ...
If an arbitrary value expression is used in the select list, it conceptually adds a new virtual column to the
returned table. The value expression is evaluated once for each result row, with the row’s values substituted
for any column references. But the expressions in the select list do not have to reference any columns in the
table expression of the FROM clause; they could be constant arithmetic expressions as well, for instance.
If no output column name is specified using AS, the system assigns a default name. For simple column
references, this is the name of the referenced column. For function calls, this is the name of the function.
For complex expressions, the system will generate a generic name.
Note: The naming of output columns here is different from that done in the FROM clause (see Section
7.2.1.2). This pipeline will in fact allow you to rename the same column twice, but the name chosen in
the select list is the one that will be passed on.
7.3.3. DISTINCT
After the select list has been processed, the result table may optionally be subject to the elimination of
duplicate rows. The DISTINCT key word is written directly after SELECT to specify this:
SELECT DISTINCT select_list ...
(Instead of DISTINCT the key word ALL can be used to specify the default behavior of retaining all rows.)
Obviously, two rows are considered distinct if they differ in at least one column value. Null values are
considered equal in this comparison.
Alternatively, an arbitrary expression can determine what rows are to be considered distinct:
SELECT DISTINCT ON (expression [, expression ...]) select_list ...
Here expression is an arbitrary value expression that is evaluated for all rows. A set of rows for which
all the expressions are equal are considered duplicates, and only the first row of the set is kept in the
output. Note that the “first row” of a set is unpredictable unless the query is sorted on enough columns
to guarantee a unique ordering of the rows arriving at the DISTINCT filter. (DISTINCT ON processing
occurs after ORDER BY sorting.)
The DISTINCT ON clause is not part of the SQL standard and is sometimes considered bad style because
of the potentially indeterminate nature of its results. With judicious use of GROUP BY and subqueries in
FROM the construct can be avoided, but it is often the most convenient alternative.
7.4. Combining Queries
The results of two queries can be combined using the set operations union, intersection, and difference.
The syntax is
query1 UNION [ALL] query2
query1 INTERSECT [ALL] query2
query1 EXCEPT [ALL] query2
query1 and query2 are queries that can use any of the features discussed up to this point. Set operations
can also be nested and chained, for example
query1 UNION query2 UNION query3

which is executed as:

(query1 UNION query2) UNION query3
UNION effectively appends the result of query2 to the result of query1 (although there is no guarantee
that this is the order in which the rows are actually returned). Furthermore, it eliminates duplicate rows
from its result, in the same way as DISTINCT, unless UNION ALL is used.
INTERSECT returns all rows that are both in the result of query1 and in the result of query2. Duplicate
rows are eliminated unless INTERSECT ALL is used.
EXCEPT returns all rows that are in the result of query1 but not in the result of query2. (This is
sometimes called the difference between two queries.) Again, duplicates are eliminated unless EXCEPT
ALL is used.
In order to calculate the union, intersection, or difference of two queries, the two queries must be “union
compatible”, which means that they return the same number of columns and the corresponding columns
have compatible data types, as described in Section 10.5.
7.5. Sorting Rows
After a query has produced an output table it can optionally be sorted. The ORDER BY clause specifies
the sort order:
SELECT select_list
FROM table_expression
ORDER BY column1 [ASC | DESC] [, column2 [ASC | DESC] ...]
column1, etc., refer to select list columns. These can be either the output name of a column (see Section
7.3.2) or the number of a column. Some examples:

SELECT a, b FROM table1 ORDER BY a;
SELECT a + b AS sum, c FROM table1 ORDER BY sum;
SELECT a, sum(b) FROM table1 GROUP BY a ORDER BY 1;

As an extension to the SQL standard, PostgreSQL also allows ordering by arbitrary expressions:

SELECT a, b FROM table1 ORDER BY a + b;

References to column names of the FROM clause that are not present in the select list are also allowed:

SELECT a FROM table1 ORDER BY b;
But these extensions do not work in queries involving UNION, INTERSECT, or EXCEPT, and are not
portable to other SQL databases.
Each column specification may be followed by an optional ASC or DESC to set the sort direction to ascend-
ing or descending. ASC order is the default. Ascending order puts smaller values first, where “smaller” is
defined in terms of the < operator. Similarly, descending order is determined with the > operator. 1
If more than one sort column is specified, the later entries are used to sort rows that are equal under the
order imposed by the earlier sort columns.
7.6. LIMIT and OFFSET
LIMIT and OFFSET allow you to retrieve just a portion of the rows that are generated by the rest of the
query:
SELECT select_list
FROM table_expression
[LIMIT { number | ALL }] [OFFSET number]
If a limit count is given, no more than that many rows will be returned (but possibly fewer, if the query
itself yields fewer rows). LIMIT ALL is the same as omitting the LIMIT clause.
OFFSET says to skip that many rows before beginning to return rows. OFFSET 0 is the same as omitting
the OFFSET clause. If both OFFSET and LIMIT appear, then OFFSET rows are skipped before starting to
count the LIMIT rows that are returned.
When using LIMIT, it is important to use an ORDER BY clause that constrains the result rows into a unique
order. Otherwise you will get an unpredictable subset of the query’s rows. You may be asking for the tenth
through twentieth rows, but tenth through twentieth in what ordering? The ordering is unknown, unless
you specified ORDER BY.
The query optimizer takes LIMIT into account when generating a query plan, so you are very likely to get
different plans (yielding different row orders) depending on what you give for LIMIT and OFFSET. Thus,
using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent
results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent
1. Actually, PostgreSQL uses the default B-tree operator class for the column’s data type to determine the sort ordering for ASC
and DESC. Conventionally, data types will be set up so that the < and > operators correspond to this sort ordering, but a user-defined
data type’s designer could choose to do something different.
consequence of the fact that SQL does not promise to deliver the results of a query in any particular order
unless ORDER BY is used to constrain the order.
The rows skipped by an OFFSET clause still have to be computed inside the server; therefore a large
OFFSET can be inefficient.
Chapter 8. Data Types
PostgreSQL has a rich set of native data types available to users. Users may add new types to PostgreSQL
using the CREATE TYPE command.
Table 8-1 shows all the built-in general-purpose data types. Most of the alternative names listed in the
“Aliases” column are the names used internally by PostgreSQL for historical reasons. In addition, some
internally used or deprecated types are available, but they are not listed here.
Compatibility: The following types (or spellings thereof) are specified by SQL: bit, bit varying,
boolean, char, character varying, character, varchar, date, double precision, integer,
interval, numeric, decimal, real, smallint, time (with or without time zone), timestamp (with or
without time zone).
Each data type has an external representation determined by its input and output functions. Many of the
built-in types have obvious external formats. However, several types are either unique to PostgreSQL,
such as geometric paths, or have several possibilities for formats, such as the date and time types. Some
of the input and output functions are not invertible. That is, the result of an output function may lose
accuracy when compared to the original input.
8.1. Numeric Types
The syntax of constants for the numeric types is described in Section 4.1.2. The numeric types have a full
set of corresponding arithmetic operators and functions. Refer to Chapter 9 for more information. The
following sections describe the types in detail.
8.1.2. Arbitrary Precision Numbers
The type numeric can store numbers with a very large number of digits and perform calculations
exactly. It is especially recommended for storing monetary amounts and other quantities where
exactness is required. To declare a column of type numeric, use the syntax:
NUMERIC(precision, scale)

Alternatively:

NUMERIC(precision)

selects a scale of 0. Specifying:

NUMERIC
without any precision or scale creates a column in which numeric values of any precision and scale can
be stored, up to the implementation limit on precision. A column of this kind will not coerce input values
to any particular scale, whereas numeric columns with a declared scale will coerce input values to that
scale. (The SQL standard requires a default scale of 0, i.e., coercion to integer precision. We find this a bit
useless. If you’re concerned about portability, always specify the precision and scale explicitly.)
If the scale of a value to be stored is greater than the declared scale of the column, the system will round
the value to the specified number of fractional digits. Then, if the number of digits to the left of the decimal
point exceeds the declared precision minus the declared scale, an error is raised.
Numeric values are physically stored without any extra leading or trailing zeroes. Thus, the declared
precision and scale of a column are maximums, not fixed allocations. (In this sense the numeric type is
more akin to varchar(n) than to char(n).)
In addition to ordinary numeric values, the numeric type allows the special value NaN, meaning “not-
a-number”. Any operation on NaN yields another NaN. When writing this value as a constant in a SQL
command, you must put quotes around it, for example UPDATE table SET x = 'NaN'. On input, the
string NaN is recognized in a case-insensitive manner.
The types decimal and numeric are equivalent. Both types are part of the SQL standard.
8.1.3. Floating-Point Types
The data types real and double precision are inexact, variable-precision numeric types; some values
cannot be converted exactly to the internal format and are stored as approximations. In particular:
• If you require exact storage and calculations (such as for monetary amounts), use the numeric type
instead.
• If you want to do complicated calculations with these types for anything important, especially if you
rely on certain behavior in boundary cases (infinity, underflow), you should evaluate the implementation
carefully.
• Comparing two floating-point values for equality may or may not work as expected.
On most platforms, the real type has a range of at least 1E-37 to 1E+37 with a precision of at least 6
decimal digits. The double precision type typically has a range of around 1E-307 to 1E+308 with a
precision of at least 15 digits. Values that are too large or too small will cause an error. Rounding may take
place if the precision of an input number is too high. Numbers too close to zero that are not representable
as distinct from zero will cause an underflow error.
In addition to ordinary numeric values, the floating-point types have several special values:
Infinity
-Infinity
NaN
These represent the IEEE 754 special values “infinity”, “negative infinity”, and “not-a-number”, respec-
tively. (On a machine whose floating-point arithmetic does not follow IEEE 754, these values will prob-
ably not work as expected.) When writing these values as constants in a SQL command, you must put
quotes around them, for example UPDATE table SET x = 'Infinity'. On input, these strings are
recognized in a case-insensitive manner.
PostgreSQL also supports the SQL-standard notations float and float(p) for specifying inexact nu-
meric types. Here, p specifies the minimum acceptable precision in binary digits. PostgreSQL accepts
float(1) to float(24) as selecting the real type, while float(25) to float(53) select double
precision. Values of p outside the allowed range draw an error. float with no precision specified is
taken to mean double precision.
Note: Prior to PostgreSQL 7.4, the precision in float(p) was taken to mean so many decimal digits.
This has been corrected to match the SQL standard, which specifies that the precision is measured
in binary digits. The assumption that real and double precision have exactly 24 and 53 bits in
the mantissa respectively is correct for IEEE-standard floating point implementations. On non-IEEE
platforms it may be off a little, but for simplicity the same ranges of p are used on all platforms.
8.1.4. Serial Types
The data types serial and bigserial are not true types, but merely a notational convenience for setting
up identifier columns. In the current implementation, specifying

CREATE TABLE tablename (
    colname SERIAL
);

is equivalent to specifying:

CREATE SEQUENCE tablename_colname_seq;
CREATE TABLE tablename (
    colname integer DEFAULT nextval('tablename_colname_seq') NOT NULL
);
Thus, we have created an integer column and arranged for its default values to be assigned from a sequence
generator. A NOT NULL constraint is applied to ensure that a null value cannot be explicitly inserted, either.
In most cases you would also want to attach a UNIQUE or PRIMARY KEY constraint to prevent duplicate
values from being inserted by accident, but this is not automatic.
Note: Prior to PostgreSQL 7.3, serial implied UNIQUE. This is no longer automatic. If you wish a
serial column to be in a unique constraint or a primary key, it must now be specified, same as with any
other data type.
To insert the next value of the sequence into the serial column, specify that the serial column should
be assigned its default value. This can be done either by excluding the column from the list of columns in
the INSERT statement, or through the use of the DEFAULT key word.
The type names serial and serial4 are equivalent: both create integer columns. The type names
bigserial and serial8 work just the same way, except that they create a bigint column. bigserial
should be used if you anticipate the use of more than 2^31 identifiers over the lifetime of the table.
The sequence created for a serial column is automatically dropped when the owning column is dropped,
and cannot be dropped otherwise. (This was not true in PostgreSQL releases before 7.3. Note that this
automatic drop linkage will not occur for a sequence created by reloading a dump from a pre-7.3 database;
the dump file does not contain the information needed to establish the dependency link.) Furthermore,
this dependency between sequence and column is made only for the serial column itself. If any other
columns reference the sequence (perhaps by manually calling the nextval function), they will be broken
if the sequence is removed. Using a serial column’s sequence in such a fashion is considered bad
form; if you wish to feed several columns from the same sequence generator, create the sequence as an
independent object.
8.2. Monetary Types
Note: The money type is deprecated. Use numeric or decimal instead, in combination with the
to_char function.
The money type stores a currency amount with a fixed fractional precision; see Table 8-3. Input is ac-
cepted in a variety of formats, including integer and floating-point literals, as well as “typical” currency
formatting, such as '$1,000.00'. Output is generally in the latter form but depends on the locale.
8.3. Character Types
Table 8-4. Character Types
Name                               | Description
-----------------------------------+----------------------------
character varying(n), varchar(n)   | variable-length with limit
character(n), char(n)              | fixed-length, blank padded
text                               | variable unlimited length
Note: Prior to PostgreSQL 7.2, strings that were too long were always truncated without raising an
error, in either explicit or implicit casting contexts.
The notations varchar(n) and char(n) are aliases for character varying(n) and
character(n), respectively. character without length specifier is equivalent to character(1). If
character varying is used without length specifier, the type accepts strings of any size. The latter is
a PostgreSQL extension.
In addition, PostgreSQL provides the text type, which stores strings of any length. Although the type
text is not in the SQL standard, several other SQL database management systems have it as well.
Values of type character are physically padded with spaces to the specified width n, and are stored
and displayed that way. However, the padding spaces are treated as semantically insignificant. Trailing
spaces are disregarded when comparing two values of type character, and they will be removed when
converting a character value to one of the other string types. Note that trailing spaces are semantically
significant in character varying and text values.
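The comparison behavior described above can be observed directly (a sketch):

```sql
-- Trailing spaces are insignificant when comparing character values ...
SELECT 'ok '::character(4) = 'ok'::character(4);   -- true

-- ... but significant for character varying and text values
SELECT 'ok '::text = 'ok'::text;                   -- false
```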
The storage requirement for data of these types is 4 bytes plus the actual string, and in case of character
plus the padding. Long strings are compressed by the system automatically, so the physical requirement
on disk may be less. Long values are also stored in background tables so they do not interfere with rapid
access to the shorter column values. In any case, the longest possible character string that can be stored
is about 1 GB. (The maximum value that will be allowed for n in the data type declaration is less than
that. It wouldn’t be very useful to change this because with multibyte character encodings the number of
characters and bytes can be quite different anyway. If you desire to store long strings with no specific upper
limit, use text or character varying without a length specifier, rather than making up an arbitrary
length limit.)
Tip: There are no performance differences between these three types, apart from the increased stor-
age size when using the blank-padded type. While character(n) has performance advantages in
some other database systems, it has no such advantages in PostgreSQL. In most situations text or
character varying should be used instead.
Refer to Section 4.1.2.1 for information about the syntax of string literals, and to Chapter 9 for information
about available operators and functions. The database character set determines the character set used to
store textual values; for more information on character set support, refer to Section 20.2.
There are two other fixed-length character types in PostgreSQL, shown in Table 8-5. The name type exists
only for storage of identifiers in the internal system catalogs and is not intended for use by the general user.
Its length is currently defined as 64 bytes (63 usable characters plus terminator) but should be referenced
using the constant NAMEDATALEN. The length is set at compile time (and is therefore adjustable for special
uses); the default maximum length may change in a future release. The type "char" (note the quotes) is
different from char(1) in that it only uses one byte of storage. It is internally used in the system catalogs
as a poor-man’s enumeration type.
A binary string is a sequence of octets (or bytes). Binary strings are distinguished from character strings
by two characteristics: First, binary strings specifically allow storing octets of value zero and other “non-
printable” octets (usually, octets outside the range 32 to 126). Character strings disallow zero octets,
and also disallow any other octet values and sequences of octet values that are invalid according to the
database’s selected character set encoding. Second, operations on binary strings process the actual bytes,
whereas the processing of character strings depends on locale settings. In short, binary strings are appropriate for storing data that the programmer thinks of as "raw bytes", whereas character strings are appropriate for storing text.
When entering bytea values, octets of certain values must be escaped (but all octet values may be escaped)
when used as part of a string literal in an SQL statement. In general, to escape an octet, it is converted into
the three-digit octal number equivalent of its decimal octet value, and preceded by two backslashes. Table
8-7 shows the characters that must be escaped, and gives the alternate escape sequences where applicable.
The requirement to escape “non-printable” octets actually varies depending on locale settings. In some
instances you can get away with leaving them unescaped. Note that the result in each of the examples
in Table 8-7 was exactly one octet in length, even though the output representation of the zero octet and
backslash are more than one character.
The reason that you have to write so many backslashes, as shown in Table 8-7, is that an input string written
as a string literal must pass through two parse phases in the PostgreSQL server. The first backslash of each
pair is interpreted as an escape character by the string-literal parser and is therefore consumed, leaving the
second backslash of the pair. The remaining backslash is then recognized by the bytea input function as
starting either a three digit octal value or escaping another backslash. For example, a string literal passed
to the server as ’\\001’ becomes \001 after passing through the string-literal parser. The \001 is then
sent to the bytea input function, where it is converted to a single octet with a decimal value of 1. Note
that the apostrophe character is not treated specially by bytea, so it follows the normal rules for string
literals. (See also Section 4.1.2.1.)
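Following these rules, a zero octet and a backslash can be stored like this (a sketch; the table name is illustrative):

```sql
CREATE TABLE blobs (data bytea);

-- '\\000' reaches the bytea input function as \000: a single zero octet
INSERT INTO blobs VALUES ('\\000');

-- '\\\\' reaches the bytea input function as \\,
-- which it reads as a single backslash octet
INSERT INTO blobs VALUES ('\\\\');
```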
Bytea octets are also escaped in the output. In general, each “non-printable” octet is converted into its
equivalent three-digit octal value and preceded by one backslash. Most “printable” octets are represented
by their standard representation in the client character set. The octet with decimal value 92 (backslash)
has a special alternative output representation. Details are in Table 8-8.
Depending on the front end to PostgreSQL you use, you may have additional work to do in terms of
escaping and unescaping bytea strings. For example, you may also have to escape line feeds and carriage
returns if your interface automatically translates these.
The SQL standard defines a different binary string type, called BLOB or BINARY LARGE OBJECT. The
input format is different from bytea, but the provided functions and operators are mostly the same.
Note: Prior to PostgreSQL 7.3, writing just timestamp was equivalent to timestamp with time
zone. This was changed for SQL compliance.
time, timestamp, and interval accept an optional precision value p which specifies the number of
fractional digits retained in the seconds field. By default, there is no explicit bound on precision. The
allowed range of p is from 0 to 6 for the timestamp and interval types.
Note: When timestamp values are stored as double precision floating-point numbers (currently the
default), the effective limit of precision may be less than 6. timestamp values are stored as seconds
before or after midnight 2000-01-01. Microsecond precision is achieved for dates within a few years of
2000-01-01, but the precision degrades for dates further away. When timestamp values are stored as
eight-byte integers (a compile-time option), microsecond precision is available over the full range of
values. However eight-byte integer timestamps have a more limited range of dates than shown above:
from 4713 BC up to 294276 AD. The same compile-time option also determines whether time and
interval values are stored as floating-point or eight-byte integers. In the floating-point case, large
interval values degrade in precision as the size of the interval increases.
For the time types, the allowed range of p is from 0 to 6 when eight-byte integer storage is used, or from
0 to 10 when floating-point storage is used.
The type time with time zone is defined by the SQL standard, but the definition exhibits properties
which lead to questionable usefulness. In most cases, a combination of date, time, timestamp
without time zone, and timestamp with time zone should provide a complete range of
date/time functionality required by any application.
The types abstime and reltime are lower precision types which are used internally. You are discour-
aged from using these types in new applications and are encouraged to move any old ones over when
appropriate. Any or all of these internal types might disappear in a future release.
PostgreSQL is more flexible in handling date/time input than the SQL standard requires. See Appendix B
for the exact parsing rules of date/time input and for the recognized text fields including months, days of
the week, and time zones.
Remember that any date or time literal input needs to be enclosed in single quotes, like text strings. Refer
to Section 4.1.2.5 for more information. SQL requires the following syntax

type [ (p) ] 'value'

where p in the optional precision specification is an integer corresponding to the number of fractional
digits in the seconds field. Precision can be specified for time, timestamp, and interval types. The
allowed values are mentioned above. If no precision is specified in a constant specification, it defaults to
the precision of the literal value.
8.5.1.1. Dates
Table 8-10 shows some possible inputs for the date type.
Example            Description
January 8, 1999    unambiguous in any datestyle input mode
1999-01-08         ISO 8601; January 8 in any mode (recommended format)
1/8/1999           January 8 in MDY mode; August 1 in DMY mode
1/18/1999          January 18 in MDY mode; rejected in other modes
01/02/03           January 2, 2003 in MDY mode; February 1, 2003 in DMY mode; February 3, 2001 in YMD mode
1999-Jan-08        January 8 in any mode
Jan-08-1999        January 8 in any mode
08-Jan-1999        January 8 in any mode
99-Jan-08          January 8 in YMD mode, else error
08-Jan-99          January 8, except error in YMD mode
Jan-08-99          January 8, except error in YMD mode
19990108           ISO 8601; January 8, 1999 in any mode
990108             ISO 8601; January 8, 1999 in any mode
1999.008           year and day of year
J2451187           Julian day
January 8, 99 BC   year 99 before the Common Era
8.5.1.2. Times
The time-of-day types are time [ (p) ] without time zone and time [ (p) ] with time
zone. Writing just time is equivalent to time without time zone.
Valid input for these types consists of a time of day followed by an optional time zone. (See Table 8-11
and Table 8-12.) If a time zone is specified in the input for time without time zone, it is silently
ignored.
Example          Description
04:05:06.789     ISO 8601
04:05:06         ISO 8601
04:05            ISO 8601
040506           ISO 8601
04:05 AM         same as 04:05; AM does not affect value
04:05 PM         same as 16:05; input hour must be <= 12
04:05:06.789-8   ISO 8601
04:05:06-08:00   ISO 8601
04:05-08:00      ISO 8601
040506-08        ISO 8601
04:05:06 PST     time zone specified by name
Example   Description
PST       Pacific Standard Time
-8:00     ISO 8601 offset for PST
-800      ISO 8601 offset for PST
-8        ISO 8601 offset for PST
zulu      Military abbreviation for UTC
z         Short form of zulu
Refer to Appendix B for a list of time zone names that are recognized for input.
1999-01-08 04:05:06

and

1999-01-08 04:05:06 -8:00

are valid values, which follow the ISO 8601 standard. In addition, the wide-spread format

January 8 04:05:06 1999 PST

is supported.
The SQL standard differentiates timestamp without time zone and timestamp with time
zone literals by the presence of a “+” or “-”. Hence, according to the standard,

TIMESTAMP '2004-10-19 10:23:54'

is a timestamp without time zone, while

TIMESTAMP '2004-10-19 10:23:54+02'

is a timestamp with time zone. PostgreSQL differs from the standard by requiring that timestamp
with time zone literals be explicitly typed:

TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'
If a literal is not explicitly indicated as being of timestamp with time zone, PostgreSQL will silently
ignore any time zone indication in the literal. That is, the resulting date/time value is derived from the
date/time fields in the input value, and is not adjusted for time zone.
For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated
Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone
specified is converted to UTC using the appropriate offset for that time zone. If no time zone is stated in
the input string, then it is assumed to be in the time zone indicated by the system’s timezone parameter,
and is converted to UTC using the offset for the timezone zone.
When a timestamp with time zone value is output, it is always converted from UTC to the current
timezone zone, and displayed as local time in that zone. To see the time in another time zone, either
change timezone or use the AT TIME ZONE construct (see Section 9.9.3).
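For example, a stored timestamp with time zone can be viewed in a different zone like this (a sketch):

```sql
-- Convert a timestamp with time zone for display in another zone;
-- the result is a timestamp without time zone, local to 'PST'
SELECT TIMESTAMP WITH TIME ZONE '2005-04-02 12:00:00+00' AT TIME ZONE 'PST';
```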
Conversions between timestamp without time zone and timestamp with time zone normally
assume that the timestamp without time zone value should be taken or given as timezone local
time. A different zone reference can be specified for the conversion using AT TIME ZONE.
8.5.1.4. Intervals
interval values can be written with the following syntax:

[@] quantity unit [quantity unit...] [direction]

Where: quantity is a number (possibly signed); unit is second, minute, hour, day, week, month,
year, decade, century, millennium, or abbreviations or plurals of these units; direction can be
ago or empty. The at sign (@) is optional noise. The amounts of different units are implicitly added up
with appropriate sign accounting.
Quantities of days, hours, minutes, and seconds can be specified without explicit unit markings. For example, '1 12:59:10' is read the same as '1 day 12 hours 59 min 10 sec'.
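A few equivalent ways of writing the same interval (a sketch):

```sql
SELECT INTERVAL '1 day 12 hours 59 min 10 sec';
SELECT INTERVAL '1 12:59:10';   -- same value: unmarked day and time fields
SELECT INTERVAL '@ 1 day 12:59:10';   -- the at sign is optional noise
```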
The optional precision p should be between 0 and 6, and defaults to the precision of the input literal.
The following SQL-compatible functions can also be used to obtain the current time value for the
corresponding data type: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME,
LOCALTIMESTAMP. The latter four accept an optional precision specification. (See Section 9.9.4.) Note
however that these are SQL functions and are not recognized as data input strings.
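For example (a sketch; the results depend on the current date and time):

```sql
SELECT CURRENT_DATE;        -- today's date
SELECT CURRENT_TIMESTAMP;   -- date and time, with time zone
SELECT LOCALTIMESTAMP(2);   -- date and time, seconds to 2 fractional digits
```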
In the SQL and POSTGRES styles, day appears before month if DMY field ordering has been specified,
otherwise month appears before day. (See Section 8.5.1 for how this setting also affects interpretation of
input values.) Table 8-15 shows an example.
interval output looks like the input format, except that units like century or week are converted to
years and days and ago is converted to an appropriate sign. In ISO mode the output looks like

[ quantity unit [ ... ] ] [ days ] [ hours:minutes:seconds ]
The date/time styles can be selected by the user using the SET datestyle command, the DateStyle
parameter in the postgresql.conf configuration file, or the PGDATESTYLE environment variable on
the server or client. The formatting function to_char (see Section 9.8) is also available as a more flexible
way to format the date/time output.
• Although the date type does not have an associated time zone, the time type can. Time zones in the
real world have little meaning unless associated with a date as well as a time, since the offset may vary
through the year with daylight-saving time boundaries.
• The default time zone is specified as a constant numeric offset from UTC. It is therefore not possible to
adapt to daylight-saving time when doing date/time arithmetic across DST boundaries.
To address these difficulties, we recommend using date/time types that contain both date and time when
using time zones. We recommend not using the type time with time zone (though it is supported
by PostgreSQL for legacy applications and for compliance with the SQL standard). PostgreSQL assumes
your local time zone for any type containing only date or time.
All timezone-aware dates and times are stored internally in UTC. They are converted to local time in the
zone specified by the timezone configuration parameter before being displayed to the client.
The timezone configuration parameter can be set in the file postgresql.conf, or in any of the other
standard ways described in Section 16.4. There are also several special ways to set it:
• The PGTZ environment variable, if set at the client, is used by libpq applications to send a SET TIME
ZONE command to the server upon connection.
8.5.4. Internals
PostgreSQL uses Julian dates for all date/time calculations. They have the nice property of correctly
predicting/calculating any date more recent than 4713 BC to far into the future, using the assumption that
the length of the year is 365.2425 days.
Date conventions before the 19th century make for interesting reading, but are not consistent enough to
warrant coding into a date/time handler.
Valid literal values for the "true" state are:

TRUE, 't', 'true', 'y', 'yes', '1'

For the "false" state, the following values can be used:

FALSE, 'f', 'false', 'n', 'no', '0'

Using the key words TRUE and FALSE is preferred (and SQL-compliant).
Example 8-2 shows that boolean values are output using the letters t and f.
Tip: Values of the boolean type cannot be cast directly to other types (e.g., CAST (boolval AS
integer) does not work). This can be accomplished using the CASE expression: CASE WHEN boolval
THEN ’value if true’ ELSE ’value if false’ END. See Section 9.13.
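The workaround from the Tip in a complete statement (a sketch):

```sql
-- Convert a boolean to an integer with a CASE expression
SELECT CASE WHEN true  THEN 1 ELSE 0 END;   -- 1
SELECT CASE WHEN false THEN 1 ELSE 0 END;   -- 0
```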
A rich set of functions and operators is available to perform various geometric operations such as scaling,
translation, rotation, and determining intersections. They are explained in Section 9.10.
8.7.1. Points
Points are the fundamental two-dimensional building block for geometric types. Values of type point are
specified using the following syntax:
( x , y )
x , y

where x and y are the respective coordinates, as floating-point numbers.

8.7.2. Line Segments

Line segments are represented by pairs of points. Values of type lseg are specified using the following
syntax:

( ( x1 , y1 ) , ( x2 , y2 ) )
( x1 , y1 ) , ( x2 , y2 )
x1 , y1 , x2 , y2

where (x1,y1) and (x2,y2) are the end points of the line segment.
8.7.3. Boxes
Boxes are represented by pairs of points that are opposite corners of the box. Values of type box are
specified using the following syntax:
( ( x1 , y1 ) , ( x2 , y2 ) )
( x1 , y1 ) , ( x2 , y2 )
x1 , y1 , x2 , y2
where (x1,y1) and (x2,y2) are any two opposite corners of the box.
Boxes are output using the first syntax. The corners are reordered on input to store the upper right corner,
then the lower left corner. Other corners of the box can be entered, but the lower left and upper right
corners are determined from the input and stored.
8.7.4. Paths
Paths are represented by lists of connected points. Paths can be open, where the first and last points in the
list are not considered connected, or closed, where the first and last points are considered connected.
Values of type path are specified using the following syntax:
( ( x1 , y1 ) , ... , ( xn , yn ) )
[ ( x1 , y1 ) , ... , ( xn , yn ) ]
( x1 , y1 ) , ... , ( xn , yn )
( x1 , y1 , ... , xn , yn )
x1 , y1 , ... , xn , yn
where the points are the end points of the line segments comprising the path. Square brackets ([]) indicate
an open path, while parentheses (()) indicate a closed path.
Paths are output using the first syntax.
8.7.5. Polygons
Polygons are represented by lists of points (the vertexes of the polygon). Polygons should probably be
considered equivalent to closed paths, but are stored differently and have their own set of support routines.
Values of type polygon are specified using the following syntax:
( ( x1 , y1 ) , ... , ( xn , yn ) )
( x1 , y1 ) , ... , ( xn , yn )
( x1 , y1 , ... , xn , yn )
x1 , y1 , ... , xn , yn
where the points are the end points of the line segments comprising the boundary of the polygon.
Polygons are output using the first syntax.
8.7.6. Circles
Circles are represented by a center point and a radius. Values of type circle are specified using the
following syntax:
< ( x , y ) , r >
( ( x , y ) , r )
( x , y ) , r
x , y , r
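For example, geometric values can be entered as string literals of the corresponding type (a sketch):

```sql
SELECT point '(1,2)';
SELECT box '((0,0),(1,1))';
SELECT circle '<(0,0),2>';   -- center (0,0), radius 2
```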
8.8. Network Address Types

When sorting inet or cidr data types, IPv4 addresses will always sort before IPv6 addresses, including
IPv4 addresses encapsulated or mapped into IPv6 addresses, such as ::10.2.3.4 or ::ffff:10.4.3.2.
8.8.1. inet
The inet type holds an IPv4 or IPv6 host address, and optionally the identity of the subnet it is in, all
in one field. The subnet identity is represented by stating how many bits of the host address represent the
network address (the “netmask”). If the netmask is 32 and the address is IPv4, then the value does not
indicate a subnet, only a single host. In IPv6, the address length is 128 bits, so 128 bits specify a unique
host address. Note that if you want to accept networks only, you should use the cidr type rather than
inet.
The input format for this type is address/y where address is an IPv4 or IPv6 address and y is the
number of bits in the netmask. If the /y part is left off, then the netmask is 32 for IPv4 and 128 for IPv6,
so the value represents just a single host. On display, the /y portion is suppressed if the netmask specifies
a single host.
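Some example inet values (a sketch):

```sql
SELECT '192.168.1.5/24'::inet;   -- host address with its subnet
SELECT '192.168.1.5'::inet;      -- single host; /32 is implied and suppressed
SELECT '::1'::inet;              -- IPv6 loopback; /128 is implied
```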
8.8.2. cidr
The cidr type holds an IPv4 or IPv6 network specification. Input and output formats follow Classless
Inter-Domain Routing (CIDR) conventions. The format for specifying networks is address/y where
address is the network represented as an IPv4 or IPv6 address, and y is the number of bits in the
netmask. If y is omitted, it is calculated using assumptions from the older classful network numbering
system, except that it will be at least large enough to include all of the octets written in the input. It is an
error to specify a network address that has bits set to the right of the specified netmask.
Table 8-18 shows some examples.
Tip: If you do not like the output format for inet or cidr values, try the functions host, text, and
abbrev.
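For instance (a sketch of the three functions named in the Tip):

```sql
SELECT host('192.168.1.5/24'::inet);    -- just the address: 192.168.1.5
SELECT text('192.168.1.5/24'::inet);    -- full text form: 192.168.1.5/24
SELECT abbrev('10.1.0.0/16'::cidr);     -- abbreviated form: 10.1/16
```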
8.8.4. macaddr
The macaddr type stores MAC addresses, i.e., Ethernet card hardware addresses (although MAC addresses are used for other purposes as well). Input is accepted in various customary formats, including
’08002b:010203’
’08002b-010203’
’0800.2b01.0203’
’08-00-2b-01-02-03’
’08:00:2b:01:02:03’
which would all specify the same address. Upper and lower case is accepted for the digits a through f.
Output is always in the last of the forms shown.
The directory contrib/mac in the PostgreSQL source distribution contains tools that can be used to map
MAC addresses to hardware manufacturer names.
Note: If one explicitly casts a bit-string value to bit(n), it will be truncated or zero-padded on the
right to be exactly n bits, without raising an error. Similarly, if one explicitly casts a bit-string value to
bit varying(n), it will be truncated on the right if it is more than n bits.
Note: Prior to PostgreSQL 7.2, bit data was always silently truncated or zero-padded on the right,
with or without an explicit cast. This was changed to comply with the SQL standard.
Refer to Section 4.1.2.3 for information about the syntax of bit string constants. Bit-logical operators and
string manipulation functions are available; see Section 9.6.
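The casting behavior described in the notes above (a sketch):

```sql
SELECT B'101'::bit(5);              -- zero-padded on the right: 10100
SELECT B'10110'::bit(3);            -- truncated on the right: 101
SELECT B'101101'::bit varying(4);   -- truncated to 4 bits: 1011
```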
8.10. Arrays
PostgreSQL allows columns of a table to be defined as variable-length multidimensional arrays. Arrays of
any built-in or user-defined base type can be created. (Arrays of composite types or domains are not yet
supported, however.)
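A table with array columns of this kind might be created as follows (a sketch reconstructed from the column names used below):

```sql
CREATE TABLE sal_emp (
    name            text,
    pay_by_quarter  integer[],      -- one-dimensional array of integer
    schedule        text[][]        -- two-dimensional array of text
);
```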
As shown, an array data type is named by appending square brackets ([]) to the data type name of the
array elements. The above command will create a table named sal_emp with a column of type text
(name), a one-dimensional array of type integer (pay_by_quarter), which represents the employee’s
salary by quarter, and a two-dimensional array of text (schedule), which represents the employee’s
weekly schedule.
The syntax for CREATE TABLE allows the exact size of arrays to be specified, for example:
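For instance (a sketch; the table name and sizes are illustrative):

```sql
CREATE TABLE tictactoe (
    squares integer[3][3]
);
```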
However, the current implementation does not enforce the array size limits — the behavior is the same as
for arrays of unspecified length.
Actually, the current implementation does not enforce the declared number of dimensions either. Arrays
of a particular element type are all considered to be of the same type, regardless of size or number of
dimensions. So, declaring the number of dimensions or sizes in CREATE TABLE is simply documentation;
it does not affect run-time behavior.
An alternative syntax, which conforms to the SQL:1999 standard, may be used for one-dimensional arrays.
pay_by_quarter could have been defined as:
This syntax requires an integer constant to denote the array size. As before, however, PostgreSQL does
not enforce the size restriction.
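That alternative declaration might look like (a sketch):

```sql
pay_by_quarter integer ARRAY[4]
```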
The general format of an array constant is the following:

'{ val1 delim val2 delim ... }'

where delim is the delimiter character for the type, as recorded in its pg_type entry. Among the standard
data types provided in the PostgreSQL distribution, type box uses a semicolon (;) but all the others use
comma (,). Each val is either a constant of the array element type, or a subarray. An example of an array
constant is
’{{1,2,3},{4,5,6},{7,8,9}}’
Note that multidimensional arrays must have matching extents for each dimension. A mismatch causes an
error report.
A limitation of the present array implementation is that individual elements of an array cannot be SQL
null values. The entire array can be set to null, but you can’t have an array with some elements null and
some not.
The result of the previous two inserts looks like this:
Notice that the array elements are ordinary SQL constants or expressions; for instance, string literals are
single quoted, instead of double quoted as they would be in an array literal. The ARRAY constructor syntax
is discussed in more detail in Section 4.2.10.
For example, this query retrieves the names of the employees whose pay changed in the second quarter:

SELECT name FROM sal_emp WHERE pay_by_quarter[1] <> pay_by_quarter[2];

name
-------
Carol
(1 row)
The array subscript numbers are written within square brackets. By default PostgreSQL uses the one-
based numbering convention for arrays, that is, an array of n elements starts with array[1] and ends
with array[n].
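For instance, a single element can be selected by subscript (a sketch; assumes the sal_emp table described earlier):

```sql
-- First-quarter pay of one employee (subscripts start at 1)
SELECT pay_by_quarter[1] FROM sal_emp WHERE name = 'Carol';
```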
This query retrieves the third quarter pay of all employees:

SELECT pay_by_quarter[3] FROM sal_emp;

pay_by_quarter
----------------
10000
25000
(2 rows)
We can also access arbitrary rectangular slices of an array, or subarrays. An array slice is denoted by
writing lower-bound:upper-bound for one or more array dimensions. For example, this query retrieves
the first item on Bill's schedule for the first two days of the week:

SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';

schedule
------------------------
{{meeting},{training}}
(1 row)
with the same result. An array subscripting operation is always taken to represent an array slice if any of
the subscripts are written in the form lower:upper . A lower bound of 1 is assumed for any subscript
where only one value is specified, as in this example:
schedule
-------------------------------------------
{{meeting,lunch},{training,presentation}}
(1 row)
The current dimensions of any array value can be retrieved with the array_dims function:
array_dims
------------
[1:2][1:1]
(1 row)
array_dims produces a text result, which is convenient for people to read but perhaps not so convenient
for programs. Dimensions can also be retrieved with array_upper and array_lower, which return the
upper and lower bound of a specified array dimension, respectively.
array_upper
-------------
2
(1 row)
or updated in a slice:
A stored array value can be enlarged by assigning to an element adjacent to those already present, or by
assigning to a slice that is adjacent to or overlaps the data already present. For example, if array myarray
currently has four elements, it will have five elements after an update that assigns to myarray[5]. Currently,
enlargement in this fashion is only allowed for one-dimensional arrays, not multidimensional arrays.
Array slice assignment allows creation of arrays that do not use one-based subscripts. For example one
might assign to myarray[-2:7] to create an array with subscript values running from -2 to 7.
New array values can also be constructed by using the concatenation operator, ||.
The concatenation operator allows a single element to be pushed on to the beginning or end of a
one-dimensional array. It also accepts two N-dimensional arrays, or an N-dimensional and an
N+1-dimensional array.
When a single element is pushed on to the beginning of a one-dimensional array, the result is an array
with a lower bound subscript equal to the right-hand operand’s lower bound subscript, minus one. When
a single element is pushed on to the end of a one-dimensional array, the result is an array retaining the
lower bound of the left-hand operand. For example:
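For instance (a sketch of the element-push cases just described):

```sql
-- Pushing on to the beginning lowers the lower bound by one
SELECT array_dims(1 || ARRAY[2,3]);   -- [0:2]

-- Pushing on to the end keeps the array's lower bound
SELECT ARRAY[1,2] || 3;               -- {1,2,3}
```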
When two arrays with an equal number of dimensions are concatenated, the result retains the lower bound
subscript of the left-hand operand’s outer dimension. The result is an array comprising every element of
the left-hand operand followed by every element of the right-hand operand. For example:
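For instance (a sketch):

```sql
SELECT ARRAY[1,2] || ARRAY[3,4];             -- {1,2,3,4}
SELECT ARRAY[[1,2],[3,4]] || ARRAY[[5,6]];   -- {{1,2},{3,4},{5,6}}
```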
When an N-dimensional array is pushed on to the beginning or end of an N+1-dimensional array, the result
is analogous to the element-array case above. Each N-dimensional sub-array is essentially an element of
the N+1-dimensional array's outer dimension. For example:
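For instance (a sketch):

```sql
-- A 1-D array concatenated with a 2-D array acts as one more outer element
SELECT ARRAY[1,2] || ARRAY[[3,4],[5,6]];   -- {{1,2},{3,4},{5,6}}
```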
operator. However, they may be directly useful in the creation of user-defined aggregates. Some
examples:
However, this quickly becomes tedious for large arrays, and is not helpful if the size of the array is
uncertain. An alternative method is described in Section 9.17. The above query could be replaced by:

SELECT * FROM sal_emp WHERE 10000 = ANY (pay_by_quarter);

In addition, you could find rows where the array had all values equal to 10000 with:

SELECT * FROM sal_emp WHERE 10000 = ALL (pay_by_quarter);
Tip: Arrays are not sets; searching for specific array elements may be a sign of database misdesign.
Consider using a separate table with a row for each item that would be an array element. This will be
easier to search, and is likely to scale up better to large numbers of elements.
array
---------------
[0:2]={1,2,3}
(1 row)
array
--------------------------
[0:1][1:2]={{1,2},{3,4}}
(1 row)
This syntax can also be used to specify non-default array subscripts in an array literal. For example:
e1 | e2
----+----
1 | 6
(1 row)
As shown previously, when writing an array value you may write double quotes around any individual
array element. You must do so if the element value would otherwise confuse the array-value parser. For
example, elements containing curly braces, commas (or whatever the delimiter character is), double quotes,
backslashes, or leading or trailing whitespace must be double-quoted. To put a double quote or backslash
in a quoted array element value, precede it with a backslash. Alternatively, you can use backslash-escaping
to protect all data characters that would otherwise be taken as array syntax.
You may write whitespace before a left brace or after a right brace. You may also write whitespace
before or after any individual item string. In all of these cases the whitespace will be ignored. However,
whitespace within double-quoted elements, or surrounded on both sides by non-whitespace characters of
an element, is not ignored.
Note: Remember that what you write in an SQL command will first be interpreted as a string literal,
and then as an array. This doubles the number of backslashes you need. For example, to insert a
text array value containing a backslash and a double quote, you'd need to write

INSERT ... VALUES ('{"\\\\","\\""}');

The string-literal processor removes one level of backslashes, so that what arrives at the array-value
parser looks like {"\\","\""}. In turn, the strings fed to the text data type's input routine become \
and " respectively. (If we were working with a data type whose input routine also treated backslashes
specially, bytea for example, we might need as many as eight backslashes in the command to get
one backslash into the stored array element.) Dollar quoting (see Section 4.1.2.2) may be used to
avoid the need to double backslashes.
Tip: The ARRAY constructor syntax (see Section 4.2.10) is often easier to work with than the array-
literal syntax when writing array values in SQL commands. In ARRAY, individual element values are
written the same way they would be written when not members of an array.
The syntax is comparable to CREATE TABLE, except that only field names and types can be specified; no
constraints (such as NOT NULL) can presently be included. Note that the AS keyword is essential; without
it, the system will think a quite different kind of CREATE TYPE command is meant, and you’ll get odd
syntax errors.
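A composite-type definition of the kind discussed here might look like this (a sketch; the inventory_item name and its fields are inferred from the values used later in this section):

```sql
CREATE TYPE inventory_item AS (
    name        text,
    supplier_id integer,
    price       numeric
);
```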
Having defined the types, we can use them to create tables:
or functions:
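For instance (a sketch; assumes a composite type inventory_item as discussed in this section, and illustrative names otherwise):

```sql
CREATE TABLE on_hand (
    item  inventory_item,
    count integer
);

CREATE FUNCTION price_extension(inventory_item, integer) RETURNS numeric
AS 'SELECT $1.price * $2' LANGUAGE SQL;
```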
Whenever you create a table, a composite type is also automatically created, with the same name as the
table, to represent the table’s row type. For example, had we said
then the same inventory_item composite type shown above would come into being as a byproduct, and
could be used just as above. Note however an important restriction of the current implementation: since
no constraints are associated with a composite type, the constraints shown in the table definition do not
apply to values of the composite type outside the table. (A partial workaround is to use domain types as
members of composite types.)
An example is
’("fuzzy dice",42,1.99)’
which would be a valid value of the inventory_item type defined above. To make a field be NULL,
write no characters at all in its position in the list. For example, this constant specifies a NULL third field:
’("fuzzy dice",42,)’
If you want an empty string rather than NULL, write double quotes:
’("",42,)’
Here the first field is a non-NULL empty string, the third is NULL.
(These constants are actually only a special case of the generic type constants discussed in Section 4.1.2.5.
The constant is initially treated as a string and passed to the composite-type input conversion routine. An
explicit type specification might be necessary.)
The ROW expression syntax may also be used to construct composite values. In most cases this is consid-
erably simpler to use than the string-literal syntax, since you don’t have to worry about multiple layers of
quoting. We already used this method above:
The ROW keyword is actually optional as long as you have more than one field in the expression, so these
can simplify to
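For instance, reusing the values from the composite constant shown earlier:

```sql
-- With the ROW keyword:
SELECT ROW('fuzzy dice', 42, 1.99);

-- Equivalent shorthand; ROW may be omitted when there is more than one field:
SELECT ('fuzzy dice', 42, 1.99);
```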
This will not work since the name item is taken to be a table name, not a field name, per SQL syntax
rules. You must write it like this:
or if you need to use the table name as well (for instance in a multi-table query), like this:
Now the parenthesized object is correctly interpreted as a reference to the item column, and then the
subfield can be selected from it.
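Assuming the composite column is named item and lives in a hypothetical table on_hand, the contrast looks like this:

```sql
-- Fails: "item" is parsed as a table name per SQL syntax rules.
SELECT item.name FROM on_hand;

-- Works: parentheses mark "item" as a column reference.
SELECT (item).name FROM on_hand;

-- Works in multi-table queries: qualify with the table name too.
SELECT (on_hand.item).name FROM on_hand;
```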
Similar syntactic issues apply whenever you select a field from a composite value. For instance, to select
just one field from the result of a function that returns a composite value, you’d need to write something
like
The first example omits ROW, the second uses it; we could have done it either way.
We can update an individual subfield of a composite column:
Notice here that we don’t need to (and indeed cannot) put parentheses around the column name appearing
just after SET, but we do need parentheses when referencing the same column in the expression to the
right of the equal sign.
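A sketch of such an UPDATE, reusing the hypothetical on_hand table with composite column item:

```sql
-- No parentheses around the target column just after SET,
-- but parentheses are required on the right-hand side:
UPDATE on_hand SET item.price = (item).price * 1.10;
```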
And we can specify subfields as targets for INSERT, too:
Had we not supplied values for all the subfields of the column, the remaining subfields would have been
filled with null values.
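For instance, with the same hypothetical table:

```sql
-- Subfields as INSERT targets; any subfields not listed here
-- are filled with NULL:
INSERT INTO on_hand (item.name, item.price) VALUES ('fuzzy dice', 1.99);
```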
part of the field value, and may or may not be significant depending on the input conversion rules for the
field data type. For example, in
’( 42)’
the whitespace will be ignored if the field type is integer, but not if it is text.
As shown previously, when writing a composite value you may write double quotes around any individual
field value. You must do so if the field value would otherwise confuse the composite-value parser. In
particular, fields containing parentheses, commas, double quotes, or backslashes must be double-quoted.
To put a double quote or backslash in a quoted composite field value, precede it with a backslash. (Also,
a pair of double quotes within a double-quoted field value is taken to represent a double quote character,
analogously to the rules for single quotes in SQL literal strings.) Alternatively, you can use backslash-
escaping to protect all data characters that would otherwise be taken as composite syntax.
A completely empty field value (no characters at all between the commas or parentheses) represents a
NULL. To write a value that is an empty string rather than NULL, write "".
The composite output routine will put double quotes around field values if they are empty strings or
contain parentheses, commas, double quotes, backslashes, or white space. (Doing so for white space is
not essential, but aids legibility.) Double quotes and backslashes embedded in field values will be doubled.
Note: Remember that what you write in an SQL command will first be interpreted as a string literal,
and then as a composite. This doubles the number of backslashes you need. For example, to insert a
text field containing a double quote and a backslash in a composite value, you’d need to write
’("\\"\\\\")’
The string-literal processor removes one level of backslashes, so that what arrives at the composite-
value parser looks like ("\"\\"). In turn, the string fed to the text data type’s input routine becomes
"\. (If we were working with a data type whose input routine also treated backslashes specially, bytea
for example, we might need as many as eight backslashes in the command to get one backslash into
the stored composite field.) Dollar quoting (see Section 4.1.2.2) may be used to avoid the need to
double backslashes.
Tip: The ROW constructor syntax is usually easier to work with than the composite-literal syntax when
writing composite values in SQL commands. In ROW, individual field values are written the same way
they would be written when not members of a composite.
The oid type is currently implemented as an unsigned four-byte integer. Therefore, it is not large enough
to provide database-wide uniqueness in large databases, or even in large individual tables. So, using a
user-created table’s OID column as a primary key is discouraged. OIDs are best used only for references
to system tables.
Note: OIDs are included by default in user-created tables in PostgreSQL 8.0.0. However, this be-
havior is likely to change in a future version of PostgreSQL. Eventually, user-created tables will not
include an OID system column unless WITH OIDS is specified when the table is created, or the
default_with_oids configuration variable is set to true. If your application requires the presence
of an OID system column in a table, it should specify WITH OIDS when that table is created to ensure
compatibility with future releases of PostgreSQL.
The oid type itself has few operations beyond comparison. It can be cast to integer, however, and then
manipulated using the standard integer operators. (Beware of possible signed-versus-unsigned confusion
if you do this.)
The OID alias types have no operations of their own except for specialized input and output routines.
These routines are able to accept and display symbolic names for system objects, rather than the raw
numeric value that type oid would use. The alias types allow simplified lookup of OID values for objects.
For example, to examine the pg_attribute rows related to a table mytable, one could write
SELECT * FROM pg_attribute WHERE attrelid = ’mytable’::regclass;
rather than
SELECT * FROM pg_attribute WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = ’mytable’);
While that doesn’t look all that bad by itself, it’s still oversimplified. A far more complicated sub-select
would be needed to select the right OID if there are multiple tables named mytable in different schemas.
The regclass input converter handles the table lookup according to the schema path setting, and so it
does the “right thing” automatically. Similarly, casting a table’s OID to regclass is handy for symbolic
display of a numeric OID.
All of the OID alias types accept schema-qualified names, and will display schema-qualified names on
output if the object would not be found in the current search path without being qualified. The regproc
and regoper alias types will only accept input names that are unique (not overloaded), so they are of
limited use; for most uses regprocedure or regoperator is more appropriate. For regoperator,
unary operators are identified by writing NONE for the unused operand.
Another identifier type used by the system is xid, or transaction (abbreviated xact) identifier. This is the
data type of the system columns xmin and xmax. Transaction identifiers are 32-bit quantities.
A third identifier type used by the system is cid, or command identifier. This is the data type of the system
columns cmin and cmax. Command identifiers are also 32-bit quantities.
A final identifier type used by the system is tid, or tuple identifier (row identifier). This is the data type
of the system column ctid. A tuple ID is a pair (block number, tuple index within block) that identifies
the physical location of the row within its table.
(The system columns are further explained in Section 5.4.)
8.13. Pseudo-Types
The PostgreSQL type system contains a number of special-purpose entries that are collectively called
pseudo-types. A pseudo-type cannot be used as a column data type, but it can be used to declare a func-
tion’s argument or result type. Each of the available pseudo-types is useful in situations where a function’s
behavior does not correspond to simply taking or returning a value of a specific SQL data type. Table 8-20
lists the existing pseudo-types.
Name Description
any Indicates that a function accepts any input data type
whatever.
anyarray Indicates that a function accepts any array data type
(see Section 31.2.5).
anyelement Indicates that a function accepts any data type (see
Section 31.2.5).
cstring Indicates that a function accepts or returns a
null-terminated C string.
internal Indicates that a function accepts or returns a
server-internal data type.
language_handler A procedural language call handler is declared to
return language_handler.
record Identifies a function returning an unspecified row
type.
trigger A trigger function is declared to return trigger.
void Indicates that a function returns no value.
opaque An obsolete type name that formerly served all the
above purposes.
Functions coded in C (whether built-in or dynamically loaded) may be declared to accept or return any of
these pseudo data types. It is up to the function author to ensure that the function will behave safely when
a pseudo-type is used as an argument type.
Functions coded in procedural languages may use pseudo-types only as allowed by their implementation
languages. At present the procedural languages all forbid use of a pseudo-type as argument type, and allow
only void and record as a result type (plus trigger when the function is used as a trigger). Some also
support polymorphic functions using the types anyarray and anyelement.
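As a sketch of such a polymorphic function (the function name is illustrative):

```sql
-- Accepts any array type; the result type matches the element type.
CREATE FUNCTION first_element(anyarray) RETURNS anyelement
    AS 'SELECT $1[1]'
    LANGUAGE SQL;

SELECT first_element(ARRAY[3, 2, 1]);    -- 3
SELECT first_element(ARRAY['a', 'b']);   -- a
```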
The internal pseudo-type is used to declare functions that are meant only to be called internally by the
database system, and not by direct invocation in a SQL query. If a function has at least one internal-type
argument then it cannot be called from SQL. To preserve the type safety of this restriction it is important
to follow this coding rule: do not create any function that is declared to return internal unless it has at
least one internal argument.
Chapter 9. Functions and Operators
PostgreSQL provides a large number of functions and operators for the built-in data types. Users can also
define their own functions and operators, as described in Part V. The psql commands \df and \do can be
used to show the list of all actually available functions and operators, respectively.
If you are concerned about portability then take note that most of the functions and operators described
in this chapter, with the exception of the most trivial arithmetic and comparison operators and some
explicitly marked functions, are not specified by the SQL standard. Some of the extended functionality is
present in other SQL database management systems, and in many cases this functionality is compatible
and consistent between the various implementations.
AND
OR
NOT
SQL uses a three-valued Boolean logic where the null value represents “unknown”. Observe the following
truth tables:
a b a AND b a OR b
TRUE TRUE TRUE TRUE
TRUE FALSE FALSE TRUE
TRUE NULL NULL TRUE
FALSE FALSE FALSE FALSE
FALSE NULL FALSE NULL
NULL NULL NULL NULL
a NOT a
TRUE FALSE
FALSE TRUE
NULL NULL
The operators AND and OR are commutative, that is, you can switch the left and right operand without
affecting the result. But see Section 4.2.12 for more information about the order of evaluation of subex-
pressions.
Operator Description
< less than
> greater than
<= less than or equal to
>= greater than or equal to
= equal
<> or != not equal
Note: The != operator is converted to <> in the parser stage. It is not possible to implement != and
<> operators that do different things.
Comparison operators are available for all data types where this makes sense. All comparison operators are
binary operators that return values of type boolean; expressions like 1 < 2 < 3 are not valid (because
there is no < operator to compare a Boolean value with 3).
In addition to the comparison operators, the special BETWEEN construct is available.
a BETWEEN x AND y
is equivalent to
a >= x AND a <= y
Similarly,
a NOT BETWEEN x AND y
is equivalent to
a < x OR a > y
There is no difference between the two respective forms apart from the CPU cycles required to rewrite the
first one into the second one internally.
To check whether a value is or is not null, use the constructs
expression IS NULL
expression IS NOT NULL
expression ISNULL
expression NOTNULL
Do not write expression = NULL because NULL is not “equal to” NULL. (The null value represents an
unknown value, and it is not known whether two unknown values are equal.) This behavior conforms to
the SQL standard.
Tip: Some applications may expect that expression = NULL returns true if expression evaluates
to the null value. It is highly recommended that these applications be modified to comply with the
SQL standard. However, if that cannot be done the transform_null_equals configuration variable is
available. If it is enabled, PostgreSQL will convert x = NULL clauses to x IS NULL. This was the
default behavior in PostgreSQL releases 6.5 through 7.1.
The ordinary comparison operators yield null (signifying “unknown”) when either input is null. Another
way to do comparisons is with the IS DISTINCT FROM construct:
expression IS DISTINCT FROM expression
For non-null inputs this is the same as the <> operator. However, when both inputs are null it will return
false, and when just one input is null it will return true. Thus it effectively acts as though null were a
normal data value, rather than “unknown”.
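A quick illustration of the difference from <>:

```sql
SELECT 1 IS DISTINCT FROM 2;        -- true
SELECT 1 IS DISTINCT FROM NULL;     -- true  (1 <> NULL would yield null)
SELECT NULL IS DISTINCT FROM NULL;  -- false (NULL <> NULL would yield null)
```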
Boolean values can also be tested using the constructs
expression IS TRUE
expression IS NOT TRUE
expression IS FALSE
expression IS NOT FALSE
expression IS UNKNOWN
expression IS NOT UNKNOWN
These will always return true or false, never a null value, even when the operand is null. A null input
is treated as the logical value “unknown”. Notice that IS UNKNOWN and IS NOT UNKNOWN are effec-
tively the same as IS NULL and IS NOT NULL, respectively, except that the input expression must be of
Boolean type.
The bitwise operators work only on integral data types, whereas the others are available for all numeric
data types. The bitwise operators are also available for the bit string types bit and bit varying, as
shown in Table 9-10.
Table 9-3 shows the available mathematical functions. In the table, dp indicates double precision.
Many of these functions are provided in multiple forms with different argument types. Except where
noted, any given form of a function returns the same data type as its argument. The functions working
with double precision data are mostly implemented on top of the host system’s C library; accuracy
and behavior in boundary cases may therefore vary depending on the host system.
Finally, Table 9-4 shows the available trigonometric functions. All trigonometric functions take arguments
and return values of type double precision.
Function Description
acos(x) inverse cosine
asin(x) inverse sine
atan(x) inverse tangent
atan2(x, y) inverse tangent of x/y
cos(x) cosine
cot(x) cotangent
sin(x) sine
tan(x) tangent
Additional string manipulation functions are available and are listed in Table 9-6. Some of them are used
internally to implement the SQL-standard string functions listed in Table 9-5.
Function: octet_length(string)
Return type: integer
Description: Number of bytes in binary string
Example: octet_length(’jo\\000se’::bytea)
Result: 5
Additional binary string manipulation functions are available and are listed in Table 9-9. Some of them
are used internally to implement the SQL-standard string functions listed in Table 9-8.
The following SQL-standard functions work on bit strings as well as character strings: length,
bit_length, octet_length, position, substring.
In addition, it is possible to cast integral values to and from type bit. Some examples:
44::bit(10) 0000101100
44::bit(3) 100
cast(-44 as bit(12)) 111111010100
’1110’::bit(4)::integer 14
Note that casting to just “bit” means casting to bit(1), and so it will deliver only the least significant bit
of the integer.
Note: Prior to PostgreSQL 8.0, casting an integer to bit(n) would copy the leftmost n bits of the
integer, whereas now it copies the rightmost n bits. Also, casting an integer to a bit string width wider
than the integer itself will sign-extend on the left.
Tip: If you have pattern matching needs that go beyond this, consider writing a user-defined function
in Perl or Tcl.
9.7.1. LIKE
string LIKE pattern [ESCAPE escape-character]
string NOT LIKE pattern [ESCAPE escape-character]
Every pattern defines a set of strings. The LIKE expression returns true if the string is contained in
the set of strings represented by pattern. (As expected, the NOT LIKE expression returns false if LIKE
returns true, and vice versa. An equivalent expression is NOT (string LIKE pattern).)
If pattern does not contain percent signs or underscores, then the pattern only represents the string itself;
in that case LIKE acts like the equals operator. An underscore (_) in pattern stands for (matches) any
single character; a percent sign (%) matches any string of zero or more characters.
Some examples:
’abc’ LIKE ’abc’    true
’abc’ LIKE ’a%’     true
’abc’ LIKE ’_b_’    true
’abc’ LIKE ’c’      false
LIKE pattern matches always cover the entire string. To match a sequence anywhere within a string, the
pattern must therefore start and end with a percent sign.
To match a literal underscore or percent sign without matching other characters, the respective character
in pattern must be preceded by the escape character. The default escape character is the backslash but
a different one may be selected by using the ESCAPE clause. To match the escape character itself, write
two escape characters.
Note that the backslash already has a special meaning in string literals, so to write a pattern constant that
contains a backslash you must write two backslashes in an SQL statement. Thus, writing a pattern that
actually matches a literal backslash means writing four backslashes in the statement. You can avoid this
by selecting a different escape character with ESCAPE; then a backslash is not special to LIKE anymore.
(But it is still special to the string literal parser, so you still need two of them.)
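For example, with the default backslash escape character (each backslash doubled once more for the string-literal parser):

```sql
SELECT 'a_c' LIKE 'a\\_c';   -- true: \_ matches only a literal underscore
SELECT 'abc' LIKE 'a\\_c';   -- false: the underscore no longer acts as a wildcard
```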
It’s also possible to select no escape character by writing ESCAPE ’’. This effectively disables the escape
mechanism, which makes it impossible to turn off the special meaning of underscore and percent signs in
the pattern.
The key word ILIKE can be used instead of LIKE to make the match case-insensitive according to the
active locale. This is not in the SQL standard but is a PostgreSQL extension.
The operator ~~ is equivalent to LIKE, and ~~* corresponds to ILIKE. There are also !~~ and !~~*
operators that represent NOT LIKE and NOT ILIKE, respectively. All of these operators are PostgreSQL-
specific.
The SIMILAR TO operator returns true or false depending on whether its pattern matches the given string.
It is much like LIKE, except that it interprets the pattern using the SQL standard’s definition of a regular
expression. SQL regular expressions are a curious cross between LIKE notation and common regular
expression notation.
Like LIKE, the SIMILAR TO operator succeeds only if its pattern matches the entire string; this is unlike
common regular expression practice, wherein the pattern may match any part of the string. Also like
LIKE, SIMILAR TO uses _ and % as wildcard characters denoting any single character and any string,
respectively (these are comparable to . and .* in POSIX regular expressions).
In addition to these facilities borrowed from LIKE, SIMILAR TO supports these pattern-matching
metacharacters borrowed from POSIX regular expressions:
The substring function with three parameters, substring(string from pattern for
escape-character), provides extraction of a substring that matches an SQL regular expression
pattern. As with SIMILAR TO, the specified pattern must match the entire data string, or else the function
fails and returns null. To indicate the part of the pattern that should be returned on success, the pattern
must contain two occurrences of the escape character followed by a double quote ("). The text matching
the portion of the pattern between these markers is returned.
Some examples:
substring(’foobar’ from ’%#"o_b#"%’ for ’#’)   oob
substring(’foobar’ from ’#"o_b#"%’ for ’#’)    NULL
POSIX regular expressions provide a more powerful means for pattern matching than the LIKE and
SIMILAR TO operators. Many Unix tools such as egrep, sed, or awk use a pattern matching language
that is similar to the one described here.
A regular expression is a character sequence that is an abbreviated definition of a set of strings (a regular
set). A string is said to match a regular expression if it is a member of the regular set described by the
regular expression. As with LIKE, pattern characters match string characters exactly unless they are special
characters in the regular expression language — but regular expressions use different special characters
than LIKE does. Unlike LIKE patterns, a regular expression is allowed to match anywhere within a string,
unless the regular expression is explicitly anchored to the beginning or end of the string.
Some examples:
’abc’ ~ ’abc’      true
’abc’ ~ ’^a’       true
’abc’ ~ ’(b|d)’    true
’abc’ ~ ’^(b|c)’   false
The substring function with two parameters, substring(string from pattern), provides extrac-
tion of a substring that matches a POSIX regular expression pattern. It returns null if there is no match,
otherwise the portion of the text that matched the pattern. But if the pattern contains any parentheses,
the portion of the text that matched the first parenthesized subexpression (the one whose left parenthe-
sis comes first) is returned. You can put parentheses around the whole expression if you want to use
parentheses within it without triggering this exception. If you need parentheses in the pattern before the
subexpression you want to extract, see the non-capturing parentheses described below.
Some examples:
substring(’foobar’ from ’o.b’)     oob
substring(’foobar’ from ’o(.)b’)   o
PostgreSQL’s regular expressions are implemented using a package written by Henry Spencer. Much of
the description of regular expressions below is copied verbatim from his manual entry.
Note: The form of regular expressions accepted by PostgreSQL can be chosen by setting the
regex_flavor run-time parameter. The usual setting is advanced, but one might choose extended for
maximum backwards compatibility with pre-7.4 releases of PostgreSQL.
A regular expression is defined as one or more branches, separated by |. It matches anything that matches
one of the branches.
A branch is zero or more quantified atoms or constraints, concatenated. It matches a match for the first,
followed by a match for the second, etc; an empty branch matches the empty string.
A quantified atom is an atom possibly followed by a single quantifier. Without a quantifier, it matches a
match for the atom. With a quantifier, it can match some number of matches of the atom. An atom can
be any of the possibilities shown in Table 9-12. The possible quantifiers and their meanings are shown in
Table 9-13.
A constraint matches an empty string, but matches only when specific conditions are met. A constraint
can be used where an atom could be used, except it may not be followed by a quantifier. The simple
constraints are shown in Table 9-14; some more constraints are described later.
Atom Description
(re) (where re is any regular expression) matches a
match for re, with the match noted for possible
reporting
(?:re) as above, but the match is not noted for reporting (a
“non-capturing” set of parentheses) (AREs only)
. matches any single character
[chars] a bracket expression, matching any one of the
chars (see Section 9.7.3.2 for more detail)
\k (where k is a non-alphanumeric character) matches
that character taken as an ordinary character, e.g. \\
matches a backslash character
\c where c is alphanumeric (possibly followed by
other characters) is an escape, see Section 9.7.3.3
(AREs only; in EREs and BREs, this matches c)
{ when followed by a character other than a digit,
matches the left-brace character {; when followed
by a digit, it is the beginning of a bound (see
below)
x where x is a single character with no other
significance, matches that character
Note: Remember that the backslash (\) already has a special meaning in PostgreSQL string literals.
To write a pattern constant that contains a backslash, you must write two backslashes in the statement.
Quantifier Matches
* a sequence of 0 or more matches of the atom
+ a sequence of 1 or more matches of the atom
? a sequence of 0 or 1 matches of the atom
{m} a sequence of exactly m matches of the atom
{m,} a sequence of m or more matches of the atom
{m,n} a sequence of m through n (inclusive) matches of
the atom; m may not exceed n
*? non-greedy version of *
+? non-greedy version of +
?? non-greedy version of ?
{m}? non-greedy version of {m}
{m,}? non-greedy version of {m,}
{m,n}? non-greedy version of {m,n}
The forms using {...} are known as bounds. The numbers m and n within a bound are unsigned decimal
integers with permissible values from 0 to 255 inclusive.
Non-greedy quantifiers (available in AREs only) match the same possibilities as their corresponding nor-
mal (greedy) counterparts, but prefer the smallest number rather than the largest number of matches. See
Section 9.7.3.5 for more detail.
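For example, using the two-parameter substring function described earlier:

```sql
SELECT substring('abcabc' from 'a.*c');    -- abcabc (greedy: longest match)
SELECT substring('abcabc' from 'a.*?c');   -- abc    (non-greedy: shortest match)
```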
Note: A quantifier cannot immediately follow another quantifier. A quantifier cannot begin an expres-
sion or subexpression or follow ^ or |.
Constraint Description
^ matches at the beginning of the string
$ matches at the end of the string
(?=re) positive lookahead matches at any point where a
substring matching re begins (AREs only)
(?!re) negative lookahead matches at any point where no
substring matching re begins (AREs only)
Lookahead constraints may not contain back references (see Section 9.7.3.3), and all parentheses within
them are considered non-capturing.
Within a bracket expression, a collating element (a character, a multiple-character sequence that collates
as if it were a single character, or a collating-sequence name for either) enclosed in [. and .] stands for the
sequence of characters of that collating element. The sequence is a single element of the bracket expres-
sion’s list. A bracket expression containing a multiple-character collating element can thus match more
than one character, e.g. if the collating sequence includes a ch collating element, then the RE [[.ch.]]*c
matches the first five characters of chchcc.
Note: PostgreSQL currently has no multi-character collating elements. This information describes
possible future behavior.
Within a bracket expression, a collating element enclosed in [= and =] is an equivalence class, standing
for the sequences of characters of all collating elements equivalent to that one, including itself. (If there
are no other equivalent collating elements, the treatment is as if the enclosing delimiters were [. and .].)
For example, if o and ô are the members of an equivalence class, then [[=o=]], [[=ô=]], and [oô] are
all synonymous. An equivalence class may not be an endpoint of a range.
Within a bracket expression, the name of a character class enclosed in [: and :] stands for the list of
all characters belonging to that class. Standard character class names are: alnum, alpha, blank, cntrl,
digit, graph, lower, print, punct, space, upper, xdigit. These stand for the character classes
defined in ctype. A locale may provide others. A character class may not be used as an endpoint of a
range.
There are two special cases of bracket expressions: the bracket expressions [[:<:]] and [[:>:]] are
constraints, matching empty strings at the beginning and end of a word respectively. A word is defined as
a sequence of word characters that is neither preceded nor followed by word characters. A word character
is an alnum character (as defined by ctype) or an underscore. This is an extension, compatible with but
not specified by POSIX 1003.2, and should be used with caution in software intended to be portable to
other systems. The constraint escapes described below are usually preferable (they are no more standard,
but are certainly easier to type).
Note: Keep in mind that an escape’s leading \ will need to be doubled when entering the pattern as
an SQL string constant. For example:
’123’ ~ ’^\\d{3}$’    true
Escape Description
\a alert (bell) character, as in C
\b backspace, as in C
\B synonym for \ to help reduce the need for
backslash doubling
\cX (where X is any character) the character whose
low-order 5 bits are the same as those of X, and
whose other bits are all zero
\e the character whose collating-sequence name is
ESC, or failing that, the character with octal value
033
\f form feed, as in C
\n newline, as in C
\r carriage return, as in C
\t horizontal tab, as in C
\uwxyz (where wxyz is exactly four hexadecimal digits)
the Unicode character U+wxyz in the local byte
ordering
\Ustuvwxyz (where stuvwxyz is exactly eight hexadecimal
digits) reserved for a somewhat-hypothetical
Unicode extension to 32 bits
\v vertical tab, as in C
\xhhh (where hhh is any sequence of hexadecimal digits)
the character whose hexadecimal value is 0xhhh (a
single character no matter how many hexadecimal
digits are used)
\0 the character whose value is 0
\xy (where xy is exactly two octal digits, and is not a
back reference) the character whose octal value is
0xy
\xyz (where xyz is exactly three octal digits, and is not
a back reference) the character whose octal value is
0xyz
Hexadecimal digits are 0-9, a-f, and A-F. Octal digits are 0-7.
The character-entry escapes are always taken as ordinary characters. For example, \135 is ] in ASCII,
but \135 does not terminate a bracket expression.
Escape Description
\d [[:digit:]]
\s [[:space:]]
\w [[:alnum:]_] (note underscore is included)
\D [^[:digit:]]
\S [^[:space:]]
\W [^[:alnum:]_] (note underscore is included)
Within bracket expressions, \d, \s, and \w lose their outer brackets, and \D, \S, and \W are illegal.
(So, for example, [a-c\d] is equivalent to [a-c[:digit:]]. Also, [a-c\D], which is equivalent to
[a-c^[:digit:]], is illegal.)
Escape Description
\A matches only at the beginning of the string (see
Section 9.7.3.5 for how this differs from ^)
\m matches only at the beginning of a word
\M matches only at the end of a word
\y matches only at the beginning or end of a word
\Y matches only at a point that is not the beginning or
end of a word
\Z matches only at the end of the string (see Section
9.7.3.5 for how this differs from $)
A word is defined as in the specification of [[:<:]] and [[:>:]] above. Constraint escapes are illegal
within bracket expressions.
Escape Description
\m (where m is a nonzero digit) a back reference to the
m’th subexpression
\mnn (where m is a nonzero digit, and nn is some more
digits, and the decimal value mnn is not greater than
the number of closing capturing parentheses seen so
far) a back reference to the mnn’th subexpression
Note: There is an inherent historical ambiguity between octal character-entry escapes and back ref-
erences, which is resolved by heuristics, as hinted at above. A leading zero always indicates an octal
escape. A single non-zero digit, not followed by another digit, is always taken as a back reference. A
multi-digit sequence not starting with a zero is taken as a back reference if it comes after a suitable
subexpression (i.e. the number is in the legal range for a back reference), and otherwise is taken as
octal.
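For example, a single-digit back reference after a capturing subexpression behaves as follows (backslash doubled in the literal):

```sql
-- (abc) is the first capturing subexpression; \1 must match the same text again
SELECT 'abcabc' ~ '(abc)\\1';
-- Result: t
```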
Option Description
b rest of RE is a BRE
c case-sensitive matching (overrides operator type)
e rest of RE is an ERE
i case-insensitive matching (see Section 9.7.3.5)
(overrides operator type)
m historical synonym for n
n newline-sensitive matching (see Section 9.7.3.5)
p partial newline-sensitive matching (see Section
9.7.3.5)
q rest of RE is a literal (“quoted”) string, all ordinary
characters
s non-newline-sensitive matching (default)
t tight syntax (default; see below)
w inverse partial newline-sensitive (“weird”)
matching (see Section 9.7.3.5)
x expanded syntax (see below)
Embedded options take effect at the ) terminating the sequence. They may appear only at the start of an
ARE (after the ***: director if any).
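For example, an embedded option sequence at the start of an ARE can override the operator's case sensitivity:

```sql
-- (?i) turns on case-insensitive matching for the rest of the RE
SELECT 'FOO' ~ '(?i)foo';
-- Result: t
```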
In addition to the usual (tight) RE syntax, in which all characters are significant, there is an expanded
syntax, available by specifying the embedded x option. In the expanded syntax, white-space characters in
the RE are ignored, as are all characters between a # and the following newline (or the end of the RE).
This permits paragraphing and commenting a complex RE. Whether an RE is greedy (prefers the longest match) or non-greedy (prefers the shortest match) is determined by the following rules:
• Most atoms, and all constraints, have no greediness attribute (because they cannot match variable
amounts of text anyway).
• Adding parentheses around an RE does not change its greediness.
• A quantified atom with a fixed-repetition quantifier ({m} or {m}?) has the same greediness (possibly
none) as the atom itself.
• A quantified atom with other normal quantifiers (including {m,n} with m equal to n) is greedy (prefers
longest match).
• A quantified atom with a non-greedy quantifier (including {m,n}? with m equal to n) is non-greedy
(prefers shortest match).
• A branch — that is, an RE that has no top-level | operator — has the same greediness as the first
quantified atom in it that has a greediness attribute.
• An RE consisting of two or more branches connected by the | operator is always greedy.
The above rules associate greediness attributes not only with individual quantified atoms, but with
branches and entire REs that contain quantified atoms. What that means is that the matching is done in
such a way that the branch, or whole RE, matches the longest or shortest possible substring as a whole.
Once the length of the entire match is determined, the part of it that matches any particular subexpression
is determined on the basis of the greediness attribute of that subexpression, with subexpressions starting
earlier in the RE taking priority over ones starting later.
SELECT SUBSTRING(’XY1234Z’, ’Y*([0-9]{1,3})’);
Result: 123
SELECT SUBSTRING(’XY1234Z’, ’Y*?([0-9]{1,3})’);
Result: 1
In the first case, the RE as a whole is greedy because Y* is greedy. It can match beginning at the Y, and it
matches the longest possible string starting there, i.e., Y123. The output is the parenthesized part of that,
or 123. In the second case, the RE as a whole is non-greedy because Y*? is non-greedy. It can match
beginning at the Y, and it matches the shortest possible string starting there, i.e., Y1. The subexpression
[0-9]{1,3} is greedy but it cannot change the decision as to the overall match length; so it is forced to
match just 1.
In short, when an RE contains both greedy and non-greedy subexpressions, the total match length is
either as long as possible or as short as possible, according to the attribute assigned to the whole RE. The
attributes assigned to the subexpressions only affect how much of that match they are allowed to “eat”
relative to each other.
The quantifiers {1,1} and {1,1}? can be used to force greediness or non-greediness, respectively, on a
subexpression or a whole RE.
Match lengths are measured in characters, not collating elements. An empty string is considered
longer than no match at all. For example: bb* matches the three middle characters of abbbc;
(week|wee)(night|knights) matches all ten characters of weeknights; when (.*).* is matched
against abc the parenthesized subexpression matches all three characters; and when (a*)* is matched
against bc both the whole RE and the parenthesized subexpression match an empty string.
If case-independent matching is specified, the effect is much as if all case distinctions had vanished from
the alphabet. When an alphabetic that exists in multiple cases appears as an ordinary character outside
a bracket expression, it is effectively transformed into a bracket expression containing both cases, e.g. x
becomes [xX]. When it appears inside a bracket expression, all case counterparts of it are added to the
bracket expression, e.g. [x] becomes [xX] and [^x] becomes [^xX].
If newline-sensitive matching is specified, . and bracket expressions using ^ will never match the newline
character (so that matches will never cross newlines unless the RE explicitly arranges it) and ^ and $ will
match the empty string after and before a newline respectively, in addition to matching at beginning and
end of string respectively. But the ARE escapes \A and \Z continue to match beginning or end of string
only.
If partial newline-sensitive matching is specified, this affects . and bracket expressions as with newline-
sensitive matching, but not ^ and $.
If inverse partial newline-sensitive matching is specified, this affects ^ and $ as with newline-sensitive
matching, but not . and bracket expressions. This isn’t very useful but is provided for symmetry.
The only feature of AREs that is actually incompatible with POSIX EREs is that \ does not lose its
special significance inside bracket expressions. All other ARE features use syntax which is illegal or
has undefined or unspecified effects in POSIX EREs; the *** syntax of directors likewise is outside the
POSIX syntax for both BREs and EREs.
Many of the ARE extensions are borrowed from Perl, but some have been changed to clean them up, and a
few Perl extensions are not present. Incompatibilities of note include \b, \B, the lack of special treatment
for a trailing newline, the addition of complemented bracket expressions to the things affected by newline-
sensitive matching, the restrictions on parentheses and back references in lookahead constraints, and the
longest/shortest-match (rather than first-match) matching semantics.
Two significant incompatibilities exist between AREs and the ERE syntax recognized by pre-7.4 releases
of PostgreSQL:
• In AREs, \ followed by an alphanumeric character is either an escape or an error, while in previous
releases, it was just another way of writing the alphanumeric. This should not be much of a problem
because there was no reason to write such a sequence in earlier releases.
• In AREs, \ remains a special character within [], so a literal \ within a bracket expression must be
written \\.
Warning: to_char(interval, text) is deprecated and should not be used in newly-written code. It will
be removed in the next version.
In an output template string (for to_char), there are certain patterns that are recognized and replaced
with appropriately-formatted data from the value to be formatted. Any text that is not a template pattern
is simply copied verbatim. Similarly, in an input template string (for anything but to_char), template
patterns identify the parts of the input data string to be looked at and the values to be found there.
Table 9-21 shows the template patterns available for formatting date and time values.
Pattern Description
HH hour of day (01-12)
HH12 hour of day (01-12)
HH24 hour of day (00-23)
MI minute (00-59)
SS second (00-59)
MS millisecond (000-999)
US microsecond (000000-999999)
SSSS seconds past midnight (0-86399)
AM or A.M. or PM or P.M. meridian indicator (uppercase)
am or a.m. or pm or p.m. meridian indicator (lowercase)
Y,YYY year (4 and more digits) with comma
YYYY year (4 and more digits)
YYY last 3 digits of year
YY last 2 digits of year
Y last digit of year
IYYY ISO year (4 and more digits)
IYY last 3 digits of ISO year
IY last 2 digits of ISO year
I last digit of ISO year
BC or B.C. or AD or A.D. era indicator (uppercase)
bc or b.c. or ad or a.d. era indicator (lowercase)
MONTH full uppercase month name (blank-padded to 9
chars)
Month full mixed-case month name (blank-padded to 9
chars)
month full lowercase month name (blank-padded to 9
chars)
MON abbreviated uppercase month name (3 chars)
Mon abbreviated mixed-case month name (3 chars)
mon abbreviated lowercase month name (3 chars)
MM month number (01-12)
DAY full uppercase day name (blank-padded to 9 chars)
Day full mixed-case day name (blank-padded to 9 chars)
day full lowercase day name (blank-padded to 9 chars)
DY abbreviated uppercase day name (3 chars)
Dy abbreviated mixed-case day name (3 chars)
dy abbreviated lowercase day name (3 chars)
DDD day of year (001-366)
DD day of month (01-31)
D day of week (1-7; Sunday is 1)
W week of month (1-5) (The first week starts on the
first day of the month.)
WW week number of year (1-53) (The first week starts
on the first day of the year.)
IW ISO week number of year (The first Thursday of the
new year is in week 1.)
CC century (2 digits)
J Julian Day (days since January 1, 4712 BC)
Q quarter
RM month in Roman numerals (I-XII; I=January)
(uppercase)
rm month in Roman numerals (i-xii; i=January)
(lowercase)
TZ time-zone name (uppercase)
tz time-zone name (lowercase)
Certain modifiers may be applied to any template pattern to alter its behavior. For example, FMMonth is
the Month pattern with the FM modifier. Table 9-22 shows the modifier patterns for date/time formatting.
• FM suppresses leading zeroes and trailing blanks that would otherwise be added to make the output of a
pattern be fixed-width.
• to_timestamp and to_date skip multiple blank spaces in the input string if the FX option is not used.
FX must be specified as the first item in the template. For example to_timestamp(’2000    JUN’,
’YYYY MON’) is correct, but to_timestamp(’2000    JUN’, ’FXYYYY MON’) returns an error,
because to_timestamp expects one space only.
• Ordinary text is allowed in to_char templates and will be output literally. You can put a substring
in double quotes to force it to be interpreted as literal text even if it contains pattern key words. For
example, in ’"Hello Year "YYYY’, the YYYY will be replaced by the year data, but the single Y in
Year will not be.
• If you want to have a double quote in the output you must precede it with a backslash, for example
’\\"YYYY Month\\"’. (Two backslashes are necessary because the backslash already has a special
meaning in a string constant.)
• The YYYY conversion from string to timestamp or date has a restriction if you use a year with
more than 4 digits. You must use some non-digit character or template after YYYY, otherwise the
year is always interpreted as 4 digits. For example (with the year 20000): to_date(’200001131’,
’YYYYMMDD’) will be interpreted as a 4-digit year; instead use a non-digit separator after the year, like
to_date(’20000-1131’, ’YYYY-MMDD’) or to_date(’20000Nov31’, ’YYYYMonDD’).
• Millisecond (MS) and microsecond (US) values in a conversion from string to timestamp are used as
part of the seconds after the decimal point. For example to_timestamp(’12:3’, ’SS:MS’) is not 3
milliseconds, but 300, because the conversion counts it as 12 + 0.3 seconds. This means for the format
SS:MS, the input values 12:3, 12:30, and 12:300 specify the same number of milliseconds. To get
three milliseconds, one must use 12:003, which the conversion counts as 12 + 0.003 = 12.003 seconds.
• to_char’s day of the week numbering (see the ’D’ formatting pattern) is different from that of the
extract function.
Table 9-23 shows the template patterns available for formatting numeric values.
Pattern Description
9 value with the specified number of digits
0 value with leading zeros
. (period) decimal point
, (comma) group (thousand) separator
PR negative value in angle brackets
S sign anchored to number (uses locale)
L currency symbol (uses locale)
D decimal point (uses locale)
G group separator (uses locale)
MI minus sign in specified position (if number < 0)
PL plus sign in specified position (if number > 0)
SG plus/minus sign in specified position
RN roman numeral (input between 1 and 3999)
TH or th ordinal number suffix
V shift specified number of digits (see notes)
EEEE scientific notation (not implemented yet)
• A sign formatted using SG, PL, or MI is not anchored to the number; for example, to_char(-12,
’S9999’) produces ’ -12’, but to_char(-12, ’MI9999’) produces ’- 12’. The Oracle im-
plementation does not allow the use of MI ahead of 9, but rather requires that 9 precede MI.
• 9 results in a value with the same number of digits as there are 9s. If a digit is not available it outputs a
space.
• TH does not convert values less than zero and does not convert fractional numbers.
• PL, SG, and TH are PostgreSQL extensions.
• V effectively multiplies the input values by 10^n, where n is the number of digits following V. to_char
does not support the use of V combined with a decimal point. (E.g., 99.9V99 is not allowed.)
Table 9-24 shows some examples of the use of the to_char function.
Expression Result
to_char(current_timestamp, ’Day, DD HH12:MI:SS’) ’Tuesday , 06 05:39:18’
to_char(current_timestamp, ’FMDay, FMDD HH12:MI:SS’) ’Tuesday, 6 05:39:18’
to_char(-0.1, ’99.99’) ’ -.10’
to_char(-0.1, ’FM9.99’) ’-.1’
to_char(0.1, ’0.9’) ’ 0.1’
to_char(12, ’9990999.9’) ’ 0012.0’
to_char(12, ’FM9990999.9’) ’0012.’
to_char(485, ’999’) ’ 485’
to_char(-485, ’999’) ’-485’
to_char(485, ’9 9 9’) ’ 4 8 5’
to_char(1485, ’9,999’) ’ 1,485’
to_char(1485, ’9G999’) ’ 1 485’
to_char(148.5, ’999.999’) ’ 148.500’
to_char(148.5, ’FM999.999’) ’148.5’
to_char(148.5, ’FM999.990’) ’148.500’
to_char(148.5, ’999D999’) ’ 148,500’
to_char(3148.5, ’9G999D999’) ’ 3 148,500’
to_char(-485, ’999S’) ’485-’
to_char(-485, ’999MI’) ’485-’
to_char(485, ’999MI’) ’485 ’
to_char(485, ’FM999MI’) ’485’
to_char(485, ’PL999’) ’+485’
to_char(485, ’SG999’) ’+485’
to_char(-485, ’SG999’) ’-485’
to_char(-485, ’9SG99’) ’4-85’
to_char(-485, ’999PR’) ’<485>’
to_char(485, ’L999’) ’DM 485’
to_char(485, ’RN’) ’ CDLXXXV’
to_char(485, ’FMRN’) ’CDLXXXV’
to_char(5.2, ’FMRN’) ’V’
to_char(482, ’999th’) ’ 482nd’
to_char(485, ’"Good number:"999’) ’Good number: 485’
to_char(485.8, ’"Pre:"999" Post:" .999’) ’Pre: 485 Post: .800’
to_char(12, ’99V999’) ’ 12000’
to_char(12.4, ’99V999’) ’ 12400’
to_char(12.45, ’99V9’) ’ 125’
( start1, end1 ) OVERLAPS ( start2, end2 )
( start1, length1 ) OVERLAPS ( start2, length2 )
This expression yields true when two time periods (defined by their endpoints) overlap, false when they
do not overlap. The endpoints can be specified as pairs of dates, times, or time stamps; or as a date, time,
or time stamp followed by an interval.
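For example (October 30, 2001 falls inside the first period, but the 100-day interval starting February 16, 2001 ends well before it):

```sql
SELECT (DATE '2001-02-16', DATE '2001-12-21') OVERLAPS
       (DATE '2001-10-30', DATE '2002-10-30');
-- Result: true
SELECT (DATE '2001-02-16', INTERVAL '100 days') OVERLAPS
       (DATE '2001-10-30', DATE '2002-10-30');
-- Result: false
```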
The extract function retrieves subfields such as year or hour from date/time values. source must be
a value expression of type timestamp, time, or interval. (Expressions of type date will be cast to
timestamp and can therefore be used as well.) field is an identifier or string that selects what field to
extract from the source value. The extract function returns values of type double precision. The
following are valid field names:
century
The century
SELECT EXTRACT(CENTURY FROM TIMESTAMP ’2000-12-16 12:21:13’);
Result: 20
SELECT EXTRACT(CENTURY FROM TIMESTAMP ’2001-02-16 20:38:40’);
Result: 21
The first century starts at 0001-01-01 00:00:00 AD, although they did not know it at the time. This
definition applies to all Gregorian calendar countries. There is no century number 0; you go from -1
to 1. If you disagree with this, please write your complaint to: Pope, Cathedral Saint-Peter of Roma,
Vatican.
PostgreSQL releases before 8.0 did not follow the conventional numbering of centuries, but just
returned the year field divided by 100.
day
The day (of the month) field (1 - 31)
decade
The year field divided by 10
dow
The day of the week (0 - 6; Sunday is 0) (for timestamp values only)
Note that extract’s day of the week numbering is different from that of the to_char function.
doy
The day of the year (1 - 365/366) (for timestamp values only)
epoch
For date and timestamp values, the number of seconds since 1970-01-01 00:00:00-00 (can be
negative); for interval values, the total number of seconds in the interval
SELECT EXTRACT(EPOCH FROM TIMESTAMP WITH TIME ZONE ’2001-02-16 20:38:40-08’);
Result: 982384720
Here is how you can convert an epoch value back to a time stamp:
SELECT TIMESTAMP WITH TIME ZONE ’epoch’ + 982384720 * INTERVAL ’1 second’;
hour
The hour field (0 - 23)
microseconds
The seconds field, including fractional parts, multiplied by 1 000 000. Note that this includes full
seconds.
SELECT EXTRACT(MICROSECONDS FROM TIME ’17:12:28.5’);
Result: 28500000
millennium
The millennium
SELECT EXTRACT(MILLENNIUM FROM TIMESTAMP ’2001-02-16 20:38:40’);
Result: 3
Years in the 1900s are in the second millennium. The third millennium starts January 1, 2001.
PostgreSQL releases before 8.0 did not follow the conventional numbering of millennia, but just
returned the year field divided by 1000.
milliseconds
The seconds field, including fractional parts, multiplied by 1000. Note that this includes full seconds.
SELECT EXTRACT(MILLISECONDS FROM TIME ’17:12:28.5’);
Result: 28500
minute
The minutes field (0 - 59)
month
For timestamp values, the number of the month within the year (1 - 12); for interval values the
number of months, modulo 12 (0 - 11)
SELECT EXTRACT(MONTH FROM TIMESTAMP ’2001-02-16 20:38:40’);
Result: 2
SELECT EXTRACT(MONTH FROM INTERVAL ’2 years 3 months’);
Result: 3
quarter
The quarter of the year (1 - 4) that the day is in (for timestamp values only)
SELECT EXTRACT(QUARTER FROM TIMESTAMP ’2001-02-16 20:38:40’);
Result: 1
second
The seconds field, including fractional parts (0 - 59)
timezone
The time zone offset from UTC, measured in seconds. Positive values correspond to time zones east
of UTC, negative values to zones west of UTC.
timezone_hour
The hour component of the time zone offset
timezone_minute
The minute component of the time zone offset
week
The number of the week of the year that the day is in. By definition (ISO 8601), the first week of a
year contains January 4 of that year. (The ISO-8601 week starts on Monday.) In other words, the first
Thursday of a year is in week 1 of that year. (for timestamp values only)
SELECT EXTRACT(WEEK FROM TIMESTAMP ’2001-02-16 20:38:40’);
Result: 7
year
The year field. Keep in mind there is no 0 AD, so subtracting BC years from AD years should be done
with care.
SELECT EXTRACT(YEAR FROM TIMESTAMP ’2001-02-16 20:38:40’);
Result: 2001
The extract function is primarily intended for computational processing. For formatting date/time val-
ues for display, see Section 9.8.
The date_part function is modeled on the traditional Ingres equivalent to the SQL-standard function
extract:
date_part(’field’, source)
Note that here the field parameter needs to be a string value, not a name. The valid field names for
date_part are the same as for extract.
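For example:

```sql
-- Same semantics as EXTRACT, but the field is passed as a string
SELECT date_part('day', TIMESTAMP '2001-02-16 20:38:40');
-- Result: 16
SELECT date_part('hour', INTERVAL '4 hours 3 minutes');
-- Result: 4
```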
9.9.2. date_trunc
The function date_trunc is conceptually similar to the trunc function for numbers.
date_trunc(’field’, source)
source is a value expression of type timestamp or interval. (Values of type date and time are cast
automatically, to timestamp or interval respectively.) field selects to which precision to truncate
the input value. The return value is of type timestamp or interval with all fields that are less significant
than the selected one set to zero (or one, for day and month).
Valid values for field are:
microseconds
milliseconds
second
minute
hour
day
week
month
year
decade
century
millennium
Examples:
SELECT date_trunc(’hour’, TIMESTAMP ’2001-02-16 20:38:40’);
Result: 2001-02-16 20:00:00
SELECT date_trunc(’year’, TIMESTAMP ’2001-02-16 20:38:40’);
Result: 2001-01-01 00:00:00
9.9.3. AT TIME ZONE
In these expressions, the desired time zone zone can be specified either as a text string (e.g., ’PST’) or
as an interval (e.g., INTERVAL ’-08:00’). In the text case, the available zone names are those shown in
Table B-4. (It would be useful to support the more general names shown in Table B-6, but this is not yet
implemented.)
Examples (supposing that the local time zone is PST8PDT):
SELECT TIMESTAMP ’2001-02-16 20:38:40’ AT TIME ZONE ’MST’;
Result: 2001-02-16 19:38:40-08
SELECT TIMESTAMP WITH TIME ZONE ’2001-02-16 20:38:40-05’ AT TIME ZONE ’MST’;
Result: 2001-02-16 18:38:40
The first example takes a zone-less time stamp and interprets it as MST time (UTC-7) to produce a UTC
time stamp, which is then rotated to PST (UTC-8) for display. The second example takes a time stamp
specified in EST (UTC-5) and converts it to local time in MST (UTC-7).
The function timezone(zone, timestamp) is equivalent to the SQL-conforming construct timestamp
AT TIME ZONE zone.
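A sketch of the function-call form, which produces the same result as the second AT TIME ZONE example above:

```sql
-- timezone(zone, timestamp) is the function-call spelling of AT TIME ZONE
SELECT timezone('MST', TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05');
-- Result: 2001-02-16 18:38:40
```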
CURRENT_DATE
CURRENT_TIME
CURRENT_TIMESTAMP
CURRENT_TIME ( precision )
CURRENT_TIMESTAMP ( precision )
LOCALTIME
LOCALTIMESTAMP
LOCALTIME ( precision )
LOCALTIMESTAMP ( precision )
CURRENT_TIME and CURRENT_TIMESTAMP deliver values with time zone; LOCALTIME and
LOCALTIMESTAMP deliver values without time zone.
Note: Prior to PostgreSQL 7.2, the precision parameters were unimplemented, and the result was
always given in integer seconds.
Some examples:
SELECT CURRENT_TIME;
Result: 14:39:53.662522-05
SELECT CURRENT_DATE;
Result: 2001-12-23
SELECT CURRENT_TIMESTAMP;
Result: 2001-12-23 14:39:53.662522-05
SELECT CURRENT_TIMESTAMP(2);
Result: 2001-12-23 14:39:53.66-05
SELECT LOCALTIMESTAMP;
Result: 2001-12-23 14:39:53.662522
SELECT timeofday();
Result: Sat Feb 17 19:07:32.000126 2001 EST
It is important to know that CURRENT_TIMESTAMP and related functions return the start time of the current
transaction; their values do not change during the transaction. This is considered a feature: the intent is to
allow a single transaction to have a consistent notion of the “current” time, so that multiple modifications
within the same transaction bear the same time stamp. timeofday() returns the wall-clock time and does
advance during transactions.
Note: Other database systems may advance these values more frequently.
All the date/time data types also accept the special literal value now to specify the current date and time.
Thus, the following three all return the same result:
SELECT CURRENT_TIMESTAMP;
SELECT now();
SELECT TIMESTAMP ’now’;
Tip: You do not want to use the third form when specifying a DEFAULT clause while creating a table.
The system will convert now to a timestamp as soon as the constant is parsed, so that when the
default value is needed, the time of table creation will be used! The first two forms will not
be evaluated until the default value is used, because they are function calls. Thus they will give the
desired behavior of defaulting to the time of row insertion.
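A minimal sketch of the recommended form (the table and column names here are illustrative, not from the text):

```sql
-- Good: CURRENT_TIMESTAMP is a function call, evaluated at row insertion time
CREATE TABLE event_log (
    id      integer,
    created timestamp DEFAULT CURRENT_TIMESTAMP
);
-- Bad: "DEFAULT TIMESTAMP 'now'" would freeze the constant at table creation time
```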
It is possible to access the two component numbers of a point as though it were an array with indices
0 and 1. For example, if t.p is a point column then SELECT p[0] FROM t retrieves the X coordinate
and UPDATE t SET p[1] = ... changes the Y coordinate. In the same way, a value of type box or
lseg may be treated as an array of two point values.
The area function works for the types box, circle, and path. The area function only
works on the path data type if the points in the path are non-intersecting. For example,
the path ’((0,0),(0,1),(2,1),(2,2),(1,2),(1,0),(0,0))’::PATH
won’t work; however, the following visually identical path
’((0,0),(0,1),(1,1),(1,2),(2,2),(2,1),(1,1),(1,0),(0,0))’::PATH
will work. If the concept of an intersecting versus non-intersecting path is confusing, draw both of the
above paths side by side on a piece of graph paper.
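For the simpler geometric types the function is straightforward; a quick sketch:

```sql
-- Area of a 2 x 2 box
SELECT area(box '((0,0),(2,2))');
-- Result: 4
```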
Table 9-31 shows the operators available for the cidr and inet types. The operators <<, <<=, >>, and >>= test for subnet inclusion. They consider only the network parts of the two addresses, ignoring any host part, and determine whether one network part is identical to or a subnet of the other.
Table 9-32 shows the functions available for use with the cidr and inet types. The host, text, and
abbrev functions are primarily intended to offer alternative display formats. You can cast a text value to
inet using normal casting syntax: inet(expression) or colname::inet.
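For example:

```sql
-- host() extracts the host address as text, dropping the netmask
SELECT host(inet '192.168.1.5/24');
-- Result: 192.168.1.5
-- A text value can be cast to inet with ordinary cast syntax
SELECT '192.168.1.5'::inet;
```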
Table 9-33 shows the functions available for use with the macaddr type. The function trunc(macaddr)
returns a MAC address with the last 3 bytes set to zero. This can be used to associate the remaining prefix
with a manufacturer. The directory contrib/mac in the source distribution contains some utilities to
create and maintain such an association table.
The macaddr type also supports the standard relational operators (>, <=, etc.) for lexicographical order-
ing.
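For example:

```sql
-- trunc() zeroes the last 3 bytes, leaving the manufacturer (OUI) prefix
SELECT trunc(macaddr '12:34:56:78:90:ab');
-- Result: 12:34:56:00:00:00
```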
For largely historical reasons, the sequence to be operated on by a sequence-function call is specified
by a text-string argument. To achieve some compatibility with the handling of ordinary SQL names, the
sequence functions convert their argument to lowercase unless the string is double-quoted. Thus:
nextval(’foo’)      operates on sequence foo
nextval(’FOO’)      operates on sequence foo
nextval(’"Foo"’)    operates on sequence Foo
Of course, the text argument can be the result of an expression, not only a simple literal, which is occa-
sionally useful.
The available sequence functions are:
nextval
Advance the sequence object to its next value and return that value. This is done atomically: even if
multiple sessions execute nextval concurrently, each will safely receive a distinct sequence value.
currval
Return the value most recently obtained by nextval for this sequence in the current session. (An
error is reported if nextval has never been called for this sequence in this session.) Notice that
because this is returning a session-local value, it gives a predictable answer whether or not other
sessions have executed nextval since the current session did.
setval
Reset the sequence object’s counter value. The two-parameter form sets the sequence’s last_value
field to the specified value and sets its is_called field to true, meaning that the next nextval
will advance the sequence before returning a value. In the three-parameter form, is_called may
be set either true or false. If it’s set to false, the next nextval will return exactly the specified
value, and sequence advancement commences with the following nextval. For example,
SELECT setval(’foo’, 42);           -- Next nextval will return 43
SELECT setval(’foo’, 42, true);     -- Same as above
SELECT setval(’foo’, 42, false);    -- Next nextval will return 42
The result returned by setval is just the value of its second argument.
Important: To avoid blocking of concurrent transactions that obtain numbers from the same sequence,
a nextval operation is never rolled back; that is, once a value has been fetched it is considered used,
even if the transaction that did the nextval later aborts. This means that aborted transactions may
leave unused “holes” in the sequence of assigned values. setval operations are never rolled back,
either.
If a sequence object has been created with default parameters, nextval calls on it will return successive
values beginning with 1. Other behaviors can be obtained by using special parameters in the CREATE
SEQUENCE command; see its command reference page for more information.
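A brief sketch of a non-default starting value (the sequence name is illustrative):

```sql
-- With START 101, the first nextval returns 101, not 1
CREATE SEQUENCE serial START 101;
SELECT nextval('serial');
-- Result: 101
SELECT nextval('serial');
-- Result: 102
```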
Tip: If your needs go beyond the capabilities of these conditional expressions you might want to
consider writing a stored procedure in a more expressive programming language.
9.13.1. CASE
The SQL CASE expression is a generic conditional expression, similar to if/else statements in other languages:
CASE WHEN condition THEN result
[WHEN ...]
[ELSE result]
END
CASE clauses can be used wherever an expression is valid. condition is an expression that returns a
boolean result. If the result is true then the value of the CASE expression is the result that follows the
condition. If the result is false any subsequent WHEN clauses are searched in the same manner. If no WHEN
condition is true then the value of the CASE expression is the result in the ELSE clause. If the ELSE
clause is omitted and no condition matches, the result is null.
An example:
SELECT * FROM test;
a
---
1
2
3
SELECT a,
CASE WHEN a=1 THEN ’one’
WHEN a=2 THEN ’two’
ELSE ’other’
END
FROM test;
a | case
---+-------
1 | one
2 | two
3 | other
The data types of all the result expressions must be convertible to a single output type. See Section
10.5 for more detail.
The following “simple” CASE expression is a specialized variant of the general form above:
CASE expression
WHEN value THEN result
[WHEN ...]
[ELSE result]
END
The expression is computed and compared to all the value specifications in the WHEN clauses until
one is found that is equal. If no match is found, the result in the ELSE clause (or a null value) is
returned. This is similar to the switch statement in C.
The example above can be written using the simple CASE syntax:
SELECT a,
CASE a WHEN 1 THEN ’one’
WHEN 2 THEN ’two’
ELSE ’other’
END
FROM test;
a | case
---+-------
1 | one
2 | two
3 | other
A CASE expression does not evaluate any subexpressions that are not needed to determine the result. For
example, this is a possible way of avoiding a division-by-zero failure:
SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false END;
9.13.2. COALESCE
COALESCE(value [, ...])
The COALESCE function returns the first of its arguments that is not null. Null is returned only if all
arguments are null. This is often useful to substitute a default value for null values when data is retrieved
for display, for example:
SELECT COALESCE(description, short_description, ’(none)’) ...
Like a CASE expression, COALESCE will not evaluate arguments that are not needed to determine the
result; that is, arguments to the right of the first non-null argument are not evaluated.
9.13.3. NULLIF
NULLIF(value1, value2)
The NULLIF function returns a null value if and only if value1 and value2 are equal. Otherwise it
returns value1. This can be used to perform the inverse operation of the COALESCE example given
above:
SELECT NULLIF(value, ’(none)’) ...
See Section 8.10 for more details about array operator behavior.
Table 9-36 shows the functions available for use with array types. See Section 8.10 for more discussion
and examples of the use of these functions.
Aggregate functions compute a single result value from a set of input values. The special syntax considerations for aggregate functions are explained in Section 4.2.7. Consult Section 2.7 for additional introductory information.
It should be noted that except for count, these functions return a null value when no rows are selected.
In particular, sum of no rows returns null, not zero as one might expect. The coalesce function may be
used to substitute zero for null when necessary.
Note: Boolean aggregates bool_and and bool_or correspond to standard SQL aggregates every
and any or some. As for any and some, it seems that there is an ambiguity built into the standard
syntax:
SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;
Here ANY can be considered either as introducing a subquery, or as an aggregate if the select expression
returns one row. Thus the standard name cannot be given to these aggregates.
Note: Users accustomed to working with other SQL database management systems may be surprised
by the performance characteristics of certain aggregate functions in PostgreSQL when the aggregate
is applied to the entire table (in other words, no WHERE clause is specified). In particular, a query like
SELECT min(col) FROM sometable;
will be executed by PostgreSQL using a sequential scan of the entire table. Other database systems
may optimize queries of this form to use an index on the column, if one is available. Similarly, the
aggregate functions max() and count() always require a sequential scan if applied to the entire table
in PostgreSQL.
PostgreSQL cannot easily implement this optimization because it also allows for user-defined ag-
gregate queries. Since min(), max(), and count() are defined using a generic API for aggregate
functions, there is no provision for special-casing the execution of these functions under certain cir-
cumstances.
Fortunately, there is a simple workaround for min() and max(). The query shown below is equivalent
to the query above, except that it can take advantage of a B-tree index if there is one present on the
column in question.
SELECT col FROM sometable ORDER BY col ASC LIMIT 1;
A similar query (obtained by substituting DESC for ASC in the query above) can be used in the place of
max().
Unfortunately, there is no similarly trivial query that can be used to improve the performance of
count() when applied to the entire table.
9.16.1. EXISTS
EXISTS ( subquery )
The argument of EXISTS is an arbitrary SELECT statement, or subquery. The subquery is evaluated to
determine whether it returns any rows. If it returns at least one row, the result of EXISTS is “true”; if the
subquery returns no rows, the result of EXISTS is “false”.
The subquery can refer to variables from the surrounding query, which will act as constants during any
one evaluation of the subquery.
The subquery will generally only be executed far enough to determine whether at least one row is returned,
not all the way to completion. It is unwise to write a subquery that has any side effects (such as calling
sequence functions); whether the side effects occur or not may be difficult to predict.
Since the result depends only on whether any rows are returned, and not on the contents of those rows, the
output list of the subquery is normally uninteresting. A common coding convention is to write all EXISTS
tests in the form EXISTS(SELECT 1 WHERE ...). There are exceptions to this rule however, such as
subqueries that use INTERSECT.
This simple example is like an inner join on col2, but it produces at most one output row for each tab1
row, even if there are multiple matching tab2 rows:
SELECT col1 FROM tab1 WHERE EXISTS(SELECT 1 FROM tab2 WHERE col2 = tab1.col2);
9.16.2. IN
expression IN (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand
expression is evaluated and compared to each row of the subquery result. The result of IN is “true” if any
equal subquery row is found. The result is “false” if no equal row is found (including the special case
where the subquery returns no rows).
Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one
right-hand row yields null, the result of the IN construct will be null, not false. This is in accordance with
SQL’s normal rules for Boolean combinations of null values.
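A brief sketch of this rule (the values are invented for illustration):

```sql
-- An equal row exists, so IN is true:
SELECT 1 IN (SELECT 1 UNION ALL SELECT NULL);   -- true
-- No equal row, but the null row makes the comparison unknown:
SELECT 2 IN (SELECT 1 UNION ALL SELECT NULL);   -- null, not false
```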
As with EXISTS, it’s unwise to assume that the subquery will be evaluated completely.
row_constructor IN (subquery)
The left-hand side of this form of IN is a row constructor, as described in Section 4.2.11. The right-hand
side is a parenthesized subquery, which must return exactly as many columns as there are expressions
in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the
subquery result. The result of IN is “true” if any equal subquery row is found. The result is “false” if no
equal row is found (including the special case where the subquery returns no rows).
As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two
rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal
if any corresponding members are non-null and unequal; otherwise the result of that row comparison is
unknown (null). If all the row results are either unequal or null, with at least one null, then the result of IN
is null.
9.16.3. NOT IN
expression NOT IN (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand
expression is evaluated and compared to each row of the subquery result. The result of NOT IN is “true”
if only unequal subquery rows are found (including the special case where the subquery returns no rows).
The result is “false” if any equal row is found.
Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one
right-hand row yields null, the result of the NOT IN construct will be null, not true. This is in accordance
with SQL’s normal rules for Boolean combinations of null values.
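The same rule, sketched with invented values:

```sql
-- An equal row exists, so NOT IN is false:
SELECT 1 NOT IN (SELECT 1 UNION ALL SELECT NULL);   -- false
-- No equal row, but the null row prevents a definite answer:
SELECT 2 NOT IN (SELECT 1 UNION ALL SELECT NULL);   -- null, not true
```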
As with EXISTS, it’s unwise to assume that the subquery will be evaluated completely.
row_constructor NOT IN (subquery)
The left-hand side of this form of NOT IN is a row constructor, as described in Section 4.2.11. The right-
hand side is a parenthesized subquery, which must return exactly as many columns as there are expressions
in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the
subquery result. The result of NOT IN is “true” if only unequal subquery rows are found (including the
special case where the subquery returns no rows). The result is “false” if any equal row is found.
As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two
rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal
if any corresponding members are non-null and unequal; otherwise the result of that row comparison is
unknown (null). If all the row results are either unequal or null, with at least one null, then the result of
NOT IN is null.
9.16.4. ANY/SOME
expression operator ANY (subquery)
expression operator SOME (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand
expression is evaluated and compared to each row of the subquery result using the given operator,
which must yield a Boolean result. The result of ANY is “true” if any true result is obtained. The result is
“false” if no true result is found (including the special case where the subquery returns no rows).
SOME is a synonym for ANY. IN is equivalent to = ANY.
Note that if there are no successes and at least one right-hand row yields null for the operator’s result,
the result of the ANY construct will be null, not false. This is in accordance with SQL’s normal rules for
Boolean combinations of null values.
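For illustration (values invented):

```sql
-- 4 > 1 succeeds, so ANY is true:
SELECT 4 > ANY (SELECT 1 UNION ALL SELECT 10);     -- true
-- No comparison succeeds, and one yields null:
SELECT 0 > ANY (SELECT 1 UNION ALL SELECT NULL);   -- null, not false
```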
As with EXISTS, it’s unwise to assume that the subquery will be evaluated completely.
row_constructor operator ANY (subquery)
row_constructor operator SOME (subquery)
The left-hand side of this form of ANY is a row constructor, as described in Section 4.2.11. The right-hand
side is a parenthesized subquery, which must return exactly as many columns as there are expressions
in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the
subquery result, using the given operator. Presently, only = and <> operators are allowed in row-wise
ANY constructs. The result of ANY is “true” if any equal or unequal row is found, respectively. The result
is “false” if no such row is found (including the special case where the subquery returns no rows).
As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two
rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal
if any corresponding members are non-null and unequal; otherwise the result of that row comparison is
unknown (null). If there is at least one null row result, then the result of ANY cannot be false; it will be
true or null.
9.16.5. ALL
expression operator ALL (subquery)
The right-hand side is a parenthesized subquery, which must return exactly one column. The left-hand
expression is evaluated and compared to each row of the subquery result using the given operator,
which must yield a Boolean result. The result of ALL is “true” if all rows yield true (including the special
case where the subquery returns no rows). The result is “false” if any false result is found.
NOT IN is equivalent to <> ALL.
Note that if there are no failures but at least one right-hand row yields null for the operator’s result, the
result of the ALL construct will be null, not true. This is in accordance with SQL’s normal rules for Boolean
combinations of null values.
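For illustration (values invented):

```sql
-- Every comparison yields true:
SELECT 4 < ALL (SELECT 10 UNION ALL SELECT 20);     -- true
-- No comparison fails, but one yields null:
SELECT 4 < ALL (SELECT 10 UNION ALL SELECT NULL);   -- null, not true
```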
As with EXISTS, it’s unwise to assume that the subquery will be evaluated completely.
row_constructor operator ALL (subquery)
The left-hand side of this form of ALL is a row constructor, as described in Section 4.2.11. The right-hand
side is a parenthesized subquery, which must return exactly as many columns as there are expressions
in the left-hand row. The left-hand expressions are evaluated and compared row-wise to each row of the
subquery result, using the given operator. Presently, only = and <> operators are allowed in row-wise
ALL queries. The result of ALL is “true” if all subquery rows are equal or unequal, respectively (including
the special case where the subquery returns no rows). The result is “false” if any row is found to be unequal
or equal, respectively.
As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two
rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal
if any corresponding members are non-null and unequal; otherwise the result of that row comparison is
unknown (null). If there is at least one null row result, then the result of ALL cannot be true; it will be false
or null.
9.16.6. Row-wise Comparison
row_constructor operator (subquery)
The left-hand side is a row constructor, as described in Section 4.2.11. The right-hand side is a paren-
thesized subquery, which must return exactly as many columns as there are expressions in the left-hand
row. Furthermore, the subquery cannot return more than one row. (If it returns zero rows, the result is
taken to be null.) The left-hand side is evaluated and compared row-wise to the single subquery result
row. Presently, only = and <> operators are allowed in row-wise comparisons. The result is “true” if the
two rows are equal or unequal, respectively.
As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two
rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal
if any corresponding members are non-null and unequal; otherwise the result of the row comparison is
unknown (null).
9.17. Row and Array Comparisons
This section describes several specialized constructs for making multiple comparisons between groups of
values.
9.17.1. IN
expression IN (value[, ...])
The right-hand side is a parenthesized list of scalar expressions. The result is “true” if the left-hand ex-
pression’s result is equal to any of the right-hand expressions. This is a shorthand notation for
expression = value1
OR
expression = value2
OR
...
Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one
right-hand expression yields null, the result of the IN construct will be null, not false. This is in accordance
with SQL’s normal rules for Boolean combinations of null values.
9.17.2. NOT IN
expression NOT IN (value[, ...])
The right-hand side is a parenthesized list of scalar expressions. The result is “true” if the left-hand ex-
pression’s result is unequal to all of the right-hand expressions. This is a shorthand notation for
expression <> value1
AND
expression <> value2
AND
...
Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one
right-hand expression yields null, the result of the NOT IN construct will be null, not true as one might
naively expect. This is in accordance with SQL’s normal rules for Boolean combinations of null values.
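A short sketch of the null behavior (values invented):

```sql
SELECT 1 NOT IN (2, 3);        -- true
SELECT 1 NOT IN (2, NULL);     -- null, not true
SELECT NOT (1 IN (2, NULL));   -- also null; the two forms agree
```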
Tip: x NOT IN y is equivalent to NOT (x IN y) in all cases. However, null values are much more
likely to trip up the novice when working with NOT IN than when working with IN. It’s best to express
your condition positively if possible.
9.17.3. ANY/SOME (array)
expression operator ANY (array expression)
expression operator SOME (array expression)
The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expres-
sion is evaluated and compared to each element of the array using the given operator, which must
yield a Boolean result. The result of ANY is “true” if any true result is obtained. The result is “false” if no
true result is found (including the special case where the array has zero elements).
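For illustration (array contents invented):

```sql
SELECT 3 = ANY (ARRAY[1, 2, 3]);   -- true
SELECT 4 = ANY (ARRAY[1, 2, 3]);   -- false
```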
SOME is a synonym for ANY.
9.17.4. ALL (array)
expression operator ALL (array expression)
The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expres-
sion is evaluated and compared to each element of the array using the given operator, which must
yield a Boolean result. The result of ALL is “true” if all comparisons yield true (including the special case
where the array has zero elements). The result is “false” if any false result is found.
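For illustration (array contents invented):

```sql
SELECT 5 > ALL (ARRAY[1, 2, 3]);   -- true
SELECT 2 > ALL (ARRAY[1, 2, 3]);   -- false: 2 > 3 fails
```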
9.17.5. Row-wise Comparison
row_constructor operator row_constructor
Each side is a row constructor, as described in Section 4.2.11. The two row values must have the same
number of fields. Each side is evaluated and they are compared row-wise. Presently, only = and <>
operators are allowed in row-wise comparisons. The result is “true” if the two rows are equal or unequal,
respectively.
As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two
rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal
if any corresponding members are non-null and unequal; otherwise the result of the row comparison is
unknown (null).
row_constructor IS DISTINCT FROM row_constructor
This construct is similar to a <> row comparison, but it does not yield null for null inputs. Instead, any
null value is considered unequal to (distinct from) any non-null value, and any two nulls are considered
equal (not distinct). Thus the result will always be either true or false, never null.
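A sketch of this behavior (values invented):

```sql
SELECT ROW(1, NULL) IS DISTINCT FROM ROW(1, NULL);   -- false, not null
SELECT ROW(1, NULL) IS DISTINCT FROM ROW(2, NULL);   -- true
```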
row_constructor IS NULL
row_constructor IS NOT NULL
These constructs test a row value for null or not null. A row value is considered not null if it has at least
one field that is not null.
9.18. Set Returning Functions
This section describes functions that possibly return more than one row. The most widely used functions in
this class are the series-generating functions generate_series(start, stop) and
generate_series(start, stop, step), which return a series of values from start to stop, with a
step size of step (defaulting to one).
When step is positive, zero rows are returned if start is greater than stop. Conversely, when step is
negative, zero rows are returned if start is less than stop. Zero rows are also returned for NULL inputs.
It is an error for step to be zero. Some examples follow:
SELECT * FROM generate_series(2,4);
 generate_series
-----------------
               2
               3
               4
(3 rows)
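As a further sketch, a negative step counts downward:

```sql
SELECT * FROM generate_series(5, 1, -2);
 generate_series
-----------------
               5
               3
               1
(3 rows)
```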
9.19. System Information Functions
The session_user is normally the user who initiated the current database connection; but superusers
can change this setting with SET SESSION AUTHORIZATION. The current_user is the user identifier
that is applicable for permission checking. Normally, it is equal to the session user, but it changes during
the execution of functions with the attribute SECURITY DEFINER. In Unix parlance, the session user is
the “real user” and the current user is the “effective user”.
Note: current_user, session_user, and user have special syntactic status in SQL: they must be
called without trailing parentheses.
current_schema returns the name of the schema that is at the front of the search path (or a null value
if the search path is empty). This is the schema that will be used for any tables or other named objects
that are created without specifying a target schema. current_schemas(boolean) returns an array of
the names of all schemas presently in the search path. The Boolean option determines whether or not
implicitly included system schemas such as pg_catalog are included in the search path returned.
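For example (the results shown depend entirely on the current search path setting):

```sql
SELECT current_schema();        -- typically: public
SELECT current_schemas(true);   -- typically: {pg_catalog,public}
```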
Note: The search path may be altered at run time. The command is:
SET search_path TO schema [, schema, ...]
inet_client_addr returns the IP address of the current client, and inet_client_port returns the
port number. inet_server_addr returns the IP address on which the server accepted the current con-
nection, and inet_server_port returns the port number. All these functions return NULL if the current
connection is via a Unix-domain socket.
version() returns a string describing the PostgreSQL server’s version.
Table 9-40 lists functions that allow the user to query object access privileges programmatically. See
Section 5.7 for more information about privileges.
has_table_privilege checks whether a user can access a table in a particular way. The user can
be specified by name or by ID (pg_user.usesysid), or if the argument is omitted current_user
is assumed. The table can be specified by name or by OID. (Thus, there are actually six variants of
has_table_privilege, which can be distinguished by the number and types of their arguments.) When
specifying by name, the name can be schema-qualified if necessary. The desired access privilege type is
specified by a text string, which must evaluate to one of the values SELECT, INSERT, UPDATE, DELETE,
RULE, REFERENCES, or TRIGGER. (Case of the string is not significant, however.) An example is:
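A sketch of one such call (the schema and table names here are invented):

```sql
SELECT has_table_privilege('myschema.mytable', 'select');
```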
has_database_privilege checks whether a user can access a database in a particular way. The pos-
sibilities for its arguments are analogous to has_table_privilege. The desired access privilege type
must evaluate to CREATE, TEMPORARY, or TEMP (which is equivalent to TEMPORARY).
has_function_privilege checks whether a user can access a function in a particular way. The possi-
bilities for its arguments are analogous to has_table_privilege. When specifying a function by a text
string rather than by OID, the allowed input is the same as for the regprocedure data type (see Section
8.12). The desired access privilege type must evaluate to EXECUTE. An example is:
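A sketch of one such call (the user and function names here are invented):

```sql
SELECT has_function_privilege('joeuser', 'myfunc(int, text)', 'execute');
```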
has_language_privilege checks whether a user can access a procedural language in a particular way.
The possibilities for its arguments are analogous to has_table_privilege. The desired access privilege
type must evaluate to USAGE.
has_schema_privilege checks whether a user can access a schema in a particular way. The possibili-
ties for its arguments are analogous to has_table_privilege. The desired access privilege type must
evaluate to CREATE or USAGE.
has_tablespace_privilege checks whether a user can access a tablespace in a particular way. The
possibilities for its arguments are analogous to has_table_privilege. The desired access privilege
type must evaluate to CREATE.
To test whether a user holds a grant option on the privilege, append WITH GRANT OPTION to the privi-
lege key word; for example ’UPDATE WITH GRANT OPTION’.
Table 9-41 shows functions that determine whether a certain object is visible in the current schema search
path. A table is said to be visible if its containing schema is in the search path and no table of the same
name appears earlier in the search path. This is equivalent to the statement that the table can be referenced
by name without explicit schema qualification. For example, to list the names of all visible tables:
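One way to write such a query is:

```sql
SELECT relname FROM pg_class WHERE pg_table_is_visible(oid);
```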
pg_operator_is_visible(operator_oid) returns boolean: is operator visible in search path
pg_opclass_is_visible(opclass_oid) returns boolean: is operator class visible in search path
pg_conversion_is_visible(conversion_oid) returns boolean: is conversion visible in search path
pg_table_is_visible performs the check for tables (or views, or any other kind of pg_class
entry). pg_type_is_visible, pg_function_is_visible, pg_operator_is_visible,
pg_opclass_is_visible, and pg_conversion_is_visible perform the same sort of visibility
check for types (and domains), functions, operators, operator classes and conversions, respectively. For
functions and operators, an object in the search path is visible if there is no object of the same name and
argument data type(s) earlier in the path. For operator classes, both name and associated index access
method are considered.
All these functions require object OIDs to identify the object to be checked. If you want to test an
object by name, it is convenient to use the OID alias types (regclass, regtype, regprocedure, or
regoperator), for example
SELECT pg_type_is_visible(’myschema.widget’::regtype);
Note that it would not make much sense to test an unqualified name in this way — if the name can be
recognized at all, it must be visible.
Table 9-42 lists functions that extract information from the system catalogs.
pg_get_constraintdef(constraint_oid, pretty_bool) returns text: get definition of a constraint
pg_get_expr(expr_text, relation_oid) returns text: decompile internal form of an expression, assuming
that any Vars in it refer to the relation indicated by the second parameter
pg_get_expr(expr_text, relation_oid, pretty_bool) returns text: same, with a pretty-printing option
pg_get_userbyid(userid) returns name: get user name with given ID
pg_get_serial_sequence(table_name, column_name) returns text: get name of the sequence that a
serial or bigserial column uses
pg_tablespace_databases(tablespace_oid) returns setof oid: get set of database OIDs that have
objects in the tablespace
pg_get_expr decompiles the internal form of an individual expression, such as the default value for a
column. It may be useful when examining the contents of system catalogs. Most of these functions come
in two variants, one of which can optionally “pretty-print” the result. The pretty-printed format is more
readable, but the default format is more likely to be interpreted the same way by future versions of
PostgreSQL; avoid using pretty-printed output for dump purposes. Passing false for the pretty-print
parameter yields the same result as the variant that does not have the parameter at all.
pg_get_userbyid extracts a user’s name given a user ID number. pg_get_serial_sequence fetches
the name of the sequence associated with a serial or bigserial column. The name is suitably formatted
for passing to the sequence functions (see Section 9.12). NULL is returned if the column does not have a
sequence attached.
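A sketch (the table and column names are invented; the result depends on the actual schema):

```sql
SELECT pg_get_serial_sequence('mytable', 'id');
-- might return: public.mytable_id_seq
```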
pg_tablespace_databases allows usage examination of a tablespace. It will return a set of OIDs of
databases that have objects stored in the tablespace. If this function returns any row, the tablespace is not
empty and cannot be dropped. To display the specific objects populating the tablespace, you will need to
connect to the databases identified by pg_tablespace_databases and query their pg_class catalogs.
The functions shown in Table 9-43 extract comments previously stored with the COMMENT command. A
null value is returned if no comment could be found matching the specified parameters.
The function current_setting yields the current value of the setting setting_name. It corresponds
to the SQL command SHOW. An example:
SELECT current_setting(’datestyle’);
current_setting
-----------------
ISO, MDY
(1 row)
set_config sets the parameter setting_name to new_value. If is_local is true, the new value
will only apply to the current transaction. If you want the new value to apply for the current session, use
false instead. The function corresponds to the SQL command SET. An example:
SELECT set_config('log_statement_stats', 'off', false);
set_config
------------
off
(1 row)
The function shown in Table 9-45 sends control signals to other server processes. Use of this function is
restricted to superusers.
This function, pg_cancel_backend(pid), returns 1 if successful, 0 if not successful. The process ID
(pid) of an active backend can be found from the procpid column in the pg_stat_activity view, or
by listing the postgres processes on the server with ps.
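A sketch of a call (the process ID here is invented; substitute a real procpid from pg_stat_activity):

```sql
SELECT pg_cancel_backend(8771);
```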
The functions shown in Table 9-46 assist in making on-line backups. Use of these functions is restricted
to superusers.
pg_start_backup accepts a single parameter which is an arbitrary user-defined label for the backup.
(Typically this would be the name under which the backup dump file will be stored.) The function writes
a backup label file into the database cluster’s data directory, and then returns the backup’s starting WAL
offset as text. (The user need not pay any attention to this result value, but it is provided in case it is of
use.)
pg_stop_backup removes the label file created by pg_start_backup, and instead creates a backup
history file in the WAL archive area. The history file includes the label given to pg_start_backup, the
starting and ending WAL offsets for the backup, and the starting and ending times of the backup. The
return value is the backup’s ending WAL offset (which again may be of little interest).
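A sketch of the calling sequence (the label is an arbitrary string chosen by the user):

```sql
SELECT pg_start_backup('nightly base backup');
-- ... perform the file-system-level copy of the data directory ...
SELECT pg_stop_backup();
```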
For details about proper usage of these functions, see Section 22.3.
Chapter 10. Type Conversion
SQL statements can, intentionally or not, require mixing of different data types in the same expression.
PostgreSQL has extensive facilities for evaluating mixed-type expressions.
In many cases a user will not need to understand the details of the type conversion mechanism. However,
the implicit conversions done by PostgreSQL can affect the results of a query. When necessary, these
results can be tailored by using explicit type conversion.
This chapter introduces the PostgreSQL type conversion mechanisms and conventions. Refer to the rele-
vant sections in Chapter 8 and Chapter 9 for more information on specific data types and allowed functions
and operators.
10.1. Overview
SQL is a strongly typed language. That is, every data item has an associated data type which determines
its behavior and allowed usage. PostgreSQL has an extensible type system that is much more general
and flexible than other SQL implementations. Hence, most type conversion behavior in PostgreSQL is
governed by general rules rather than by ad hoc heuristics. This allows mixed-type expressions to be
meaningful even with user-defined types.
The PostgreSQL scanner/parser divides lexical elements into only five fundamental categories: integers,
non-integer numbers, strings, identifiers, and key words. Constants of most non-numeric types are first
classified as strings. The SQL language definition allows specifying type names with strings, and this
mechanism can be used in PostgreSQL to start the parser down the correct path. For example, the query
SELECT text 'Origin' AS "label", point '(0,0)' AS "value";
 label  | value
--------+-------
 Origin | (0,0)
(1 row)
has two literal constants, of type text and point. If a type is not specified for a string literal, then the
placeholder type unknown is assigned initially, to be resolved in later stages as described below.
There are four fundamental SQL constructs requiring distinct type conversion rules in the PostgreSQL
parser:
Function calls
Much of the PostgreSQL type system is built around a rich set of functions. Functions can have one
or more arguments. Since PostgreSQL permits function overloading, the function name alone does
not uniquely identify the function to be called; the parser must select the right function based on the
data types of the supplied arguments.
Operators
PostgreSQL allows expressions with prefix and postfix unary (one-argument) operators, as well as
binary (two-argument) operators. Like functions, operators can be overloaded, and so the same prob-
lem of selecting the right operator exists.
Value Storage
SQL INSERT and UPDATE statements place the results of expressions into a table. The expressions
in the statement must be matched up with, and perhaps converted to, the types of the target columns.
UNION, CASE, and ARRAY constructs
Since all query results from a unionized SELECT statement must appear in a single set of columns,
the types of the results of each SELECT clause must be matched up and converted to a uniform set.
Similarly, the result expressions of a CASE construct must be converted to a common type so that the
CASE expression as a whole has a known output type. The same holds for ARRAY constructs.
The system catalogs store information about which conversions, called casts, between data types are valid,
and how to perform those conversions. Additional casts can be added by the user with the CREATE CAST
command. (This is usually done in conjunction with defining new data types. The set of casts between the
built-in types has been carefully crafted and is best not altered.)
An additional heuristic is provided in the parser to allow better guesses at proper behavior for SQL stan-
dard types. There are several basic type categories defined: boolean, numeric, string, bitstring,
datetime, timespan, geometric, network, and user-defined. Each category, with the exception of
user-defined, has one or more preferred types which are preferentially selected when there is ambiguity.
In the user-defined category, each type is its own preferred type. Ambiguous expressions (those with mul-
tiple candidate parsing solutions) can therefore often be resolved when there are multiple possible built-in
types, but they will raise an error when there are multiple choices for user-defined types.
All type conversion rules are designed with several principles in mind:
10.2. Operators
The specific operator to be used in an operator invocation is determined by following the procedure below.
Note that this procedure is indirectly affected by the precedence of the involved operators. See Section
4.1.6 for more information.
1. Select the operators to be considered from the pg_operator system catalog. If an unqualified opera-
tor name was used (the usual case), the operators considered are those of the right name and argument
count that are visible in the current search path (see Section 5.8.3). If a qualified operator name was
given, only operators in the specified schema are considered.
a. If the search path finds multiple operators of identical argument types, only the one ap-
pearing earliest in the path is considered. But operators of different argument types are
considered on an equal footing regardless of search path position.
2. Check for an operator accepting exactly the input argument types. If one exists (there can be only one
exact match in the set of operators considered), use it.
a. If one argument of a binary operator invocation is of the unknown type, then assume it is the
same type as the other argument for this check. Other cases involving unknown will never
find a match at this step.
3. Look for the best match.
a. Discard candidate operators for which the input types do not match and cannot be converted
(using an implicit conversion) to match. unknown literals are assumed to be convertible to
anything for this purpose. If only one candidate remains, use it; else continue to the next
step.
b. Run through all candidates and keep those with the most exact matches on input types.
(Domains are considered the same as their base type for this purpose.) Keep all candidates
if none have any exact matches. If only one candidate remains, use it; else continue to the
next step.
c. Run through all candidates and keep those that accept preferred types (of the input data
type’s type category) at the most positions where type conversion will be required. Keep
all candidates if none accept preferred types. If only one candidate remains, use it; else
continue to the next step.
d. If any input arguments are unknown, check the type categories accepted at those argument
positions by the remaining candidates. At each position, select the string category if any
candidate accepts that category. (This bias towards string is appropriate since an unknown-
type literal does look like a string.) Otherwise, if all the remaining candidates accept the
same type category, select that category; otherwise fail because the correct choice cannot
be deduced without more clues. Now discard candidates that do not accept the selected
type category. Furthermore, if any candidate accepts a preferred type at a given argument
position, discard candidates that accept non-preferred types for that argument.
e. If only one candidate remains, use it. If no candidate or more than one candidate remains,
then fail.
There is only one exponentiation operator defined in the catalog, and it takes arguments of type double
precision. The scanner assigns an initial type of integer to both arguments of this query expression:
SELECT 2 ^ 3 AS "exp";
exp
-----
8
(1 row)
So the parser does a type conversion on both operands and the query is equivalent to
SELECT CAST(2 AS double precision) ^ CAST(3 AS double precision) AS "exp";
A string-like syntax is used for working with string types as well as for working with complex extension
types. Strings with unspecified type are matched with likely operator candidates.
An example with one unspecified argument:
SELECT text ’abc’ || ’def’ AS "text and unknown";
text and unknown
------------------
abcdef
(1 row)
In this case the parser looks to see if there is an operator taking text for both arguments. Since there is,
it assumes that the second argument should be interpreted as of type text.
Here is a concatenation on unspecified types:
SELECT ’abc’ || ’def’ AS "unspecified";
unspecified
-------------
abcdef
(1 row)
In this case there is no initial hint for which type to use, since no types are specified in the query. So, the
parser looks for all candidate operators and finds that there are candidates accepting both string-category
and bit-string-category inputs. Since string category is preferred when available, that category is selected,
and then the preferred type for strings, text, is used as the specific type to resolve the unknown literals
to.
The PostgreSQL operator catalog has several entries for the prefix operator @, all of which implement
absolute-value operations for various numeric data types. One of these entries is for type float8, which
is the preferred type in the numeric category. Therefore, PostgreSQL will use that entry when faced with
a non-numeric input:
SELECT @ '-4.5' AS "abs";
abs
-----
4.5
(1 row)
Here the system has performed an implicit conversion from text to float8 before applying the chosen
operator. We can verify that float8 and not some other type was used:
SELECT @ '-4.5e500' AS "abs";

ERROR:  "-4.5e500" is out of range for type double precision
On the other hand, the prefix operator ~ (bitwise negation) is defined only for integer data types, not for
float8. So, if we try a similar case with ~, we get:
SELECT ~ '20' AS "negation";

ERROR:  operator is not unique: ~ "unknown"
HINT:  Could not choose a best candidate operator. You may need to add explicit
type casts.

This happens because the system cannot decide which of the several possible ~ operators should be
preferred. We can help it along with an explicit cast:

SELECT ~ CAST('20' AS int8) AS "negation";

 negation
----------
      -21
(1 row)
10.3. Functions
The specific function to be used in a function invocation is determined according to the following steps.
1. Select the functions to be considered from the pg_proc system catalog. If an unqualified function
name was used, the functions considered are those of the right name and argument count that are
visible in the current search path (see Section 5.8.3). If a qualified function name was given, only
functions in the specified schema are considered.
a. If the search path finds multiple functions of identical argument types, only the one ap-
pearing earliest in the path is considered. But functions of different argument types are
considered on an equal footing regardless of search path position.
2. Check for a function accepting exactly the input argument types. If one exists (there can be only one
exact match in the set of functions considered), use it. (Cases involving unknown will never find a
match at this step.)
3. If no exact match is found, see whether the function call appears to be a trivial type conversion request.
This happens if the function call has just one argument and the function name is the same as the
(internal) name of some data type. Furthermore, the function argument must be either an unknown-
type literal or a type that is binary-compatible with the named data type. When these conditions are
met, the function argument is converted to the named data type without any actual function call.
4. Look for the best match.
a. Discard candidate functions for which the input types do not match and cannot be converted
(using an implicit conversion) to match. unknown literals are assumed to be convertible to
anything for this purpose. If only one candidate remains, use it; else continue to the next
step.
b. Run through all candidates and keep those with the most exact matches on input types.
(Domains are considered the same as their base type for this purpose.) Keep all candidates
if none have any exact matches. If only one candidate remains, use it; else continue to the
next step.
c. Run through all candidates and keep those that accept preferred types (of the input data
type’s type category) at the most positions where type conversion will be required. Keep
all candidates if none accept preferred types. If only one candidate remains, use it; else
continue to the next step.
d. If any input arguments are unknown, check the type categories accepted at those argument
positions by the remaining candidates. At each position, select the string category if any
candidate accepts that category. (This bias towards string is appropriate since an unknown-
type literal does look like a string.) Otherwise, if all the remaining candidates accept the
same type category, select that category; otherwise fail because the correct choice cannot
be deduced without more clues. Now discard candidates that do not accept the selected
type category. Furthermore, if any candidate accepts a preferred type at a given argument
position, discard candidates that accept non-preferred types for that argument.
e. If only one candidate remains, use it. If no candidate or more than one candidate remains,
then fail.
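Step 3 above (the trivial-conversion case) can be seen directly. As a sketch, since the function name
int4 matches a type name and the argument is an unknown-type literal, the first call below is treated
as equivalent to the second, and no actual function call is made:

SELECT int4('42');
SELECT CAST('42' AS int4);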
Note that the “best match” rules are identical for operator and function type resolution. Some examples
follow.
There is only one round function with two arguments. (The first is numeric, the second is integer.)
So the following query automatically converts the first argument of type integer to numeric:
SELECT round(4, 4);
round
--------
4.0000
(1 row)
Since numeric constants with decimal points are initially assigned the type numeric, the following query
will require no type conversion and may therefore be slightly more efficient:
SELECT round(4.0, 4);
There are several substr functions, one of which takes types text and integer. If called with a string
constant of unspecified type, the system chooses the candidate function that accepts an argument of the
preferred category string (namely of type text).
SELECT substr('1234', 3);
substr
--------
34
(1 row)
If the string is declared to be of type varchar, as might be the case if it comes from a table, then the
parser will try to convert it to become text:
SELECT substr(varchar '1234', 3);
substr
--------
34
(1 row)
This is transformed by the parser to effectively become
SELECT substr(CAST (varchar '1234' AS text), 3);
Note: The parser learns from the pg_cast catalog that text and varchar are binary-compatible,
meaning that one can be passed to a function that accepts the other without doing any physical
conversion. Therefore, no explicit type conversion call is really inserted in this case.
And, if the function is called with an argument of type integer, the parser will try to convert that to
text:
SELECT substr(1234, 3);
substr
--------
34
(1 row)
This actually executes as
SELECT substr(CAST (1234 AS text), 3);
This automatic transformation can succeed because there is an implicitly invocable cast from integer to
text.
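The set of implicitly invocable casts known to the parser can be inspected in the pg_cast catalog. A
sketch (the castcontext value 'i' marks implicitly invocable casts):

SELECT castsource::regtype, casttarget::regtype
FROM pg_cast
WHERE castcontext = 'i';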
10.4. Value Storage

Values to be inserted into a table are converted to the destination column's data type. For a target
column declared as character(20) the following statement shows that the stored value is sized
correctly:
CREATE TABLE vv (v character(20));
INSERT INTO vv SELECT 'abc' || 'def';
SELECT v, length(v) FROM vv;
v | length
----------------------+--------
abcdef | 20
(1 row)
What has really happened here is that the two unknown literals are resolved to text by default, allowing
the || operator to be resolved as text concatenation. Then the text result of the operator is converted to
bpchar (“blank-padded char”, the internal name of the character data type) to match the target column
type. (Since the types text and bpchar are binary-compatible, this conversion does not insert any real
function call.) Finally, the sizing function bpchar(bpchar, integer) is found in the system catalog
and applied to the operator’s result and the stored column length. This type-specific function performs the
required length check and addition of padding spaces.
10.5. UNION, CASE, and Related Constructs

SQL UNION constructs must match up possibly dissimilar types to become a single result set. CASE,
COALESCE, and related constructs use the same algorithm to match up their component expressions
and select a result data type:

1. If all inputs are of type unknown, resolve as type text (the preferred type of the string category).
Otherwise, ignore the unknown inputs while choosing the result type.
2. If the non-unknown inputs are not all of the same type category, fail.
3. Choose the first non-unknown input type which is a preferred type in that category or allows all the
non-unknown inputs to be implicitly converted to it.
4. Convert all inputs to the selected type.
SELECT text 'a' AS "text" UNION SELECT 'b';

text
------
a
b
(2 rows)
Here, the unknown-type literal ’b’ will be resolved as type text.
SELECT 1.2 AS "numeric" UNION SELECT 1;

numeric
---------
1
1.2
(2 rows)
The literal 1.2 is of type numeric, and the integer value 1 can be cast implicitly to numeric, so that
type is used.
SELECT 1 AS "real" UNION SELECT CAST('2.2' AS REAL);

real
------
1
2.2
(2 rows)
Here, since type real cannot be implicitly cast to integer, but integer can be implicitly cast to real,
the union result type is resolved as real.
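By contrast, rule 2 above makes a union across type categories fail outright. For example, the following
sketch mixes a numeric-category input with a string-category input and is rejected rather than resolved:

SELECT 1 UNION SELECT CAST('a' AS text);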
Chapter 11. Indexes
Indexes are a common way to enhance database performance. An index allows the database server to find
and retrieve specific rows much faster than it could do without an index. But indexes also add overhead to
the database system as a whole, so they should be used sensibly.
11.1. Introduction
Suppose we have a table similar to this:

CREATE TABLE test1 (
    id integer,
    content varchar
);

and the application requires a lot of queries of the form

SELECT content FROM test1 WHERE id = constant;
With no advance preparation, the system would have to scan the entire test1 table, row by row, to find all
matching entries. If there are a lot of rows in test1 and only a few rows (perhaps only zero or one) that
would be returned by such a query, then this is clearly an inefficient method. But if the system has been
instructed to maintain an index on the id column, then it can use a more efficient method for locating
matching rows. For instance, it might only have to walk a few levels deep into a search tree.
A similar approach is used in most books of non-fiction: terms and concepts that are frequently looked
up by readers are collected in an alphabetic index at the end of the book. The interested reader can scan
the index relatively quickly and flip to the appropriate page(s), rather than having to read the entire book
to find the material of interest. Just as it is the task of the author to anticipate the items that the readers
are most likely to look up, it is the task of the database programmer to foresee which indexes would be of
advantage.
The following command would be used to create the index on the id column, as discussed:

CREATE INDEX test1_id_index ON test1 (id);
The name test1_id_index can be chosen freely, but you should pick something that enables you to
remember later what the index was for.
To remove an index, use the DROP INDEX command. Indexes can be added to and removed from tables at
any time.
Once an index is created, no further intervention is required: the system will update the index when the
table is modified, and it will use the index in queries when it thinks this would be more efficient than a
sequential table scan. But you may have to run the ANALYZE command regularly to update statistics to
allow the query planner to make educated decisions. See Chapter 13 for information about how to find out
whether an index is used and when and why the planner may choose not to use an index.
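As a sketch of that workflow, using the test1 table from above: refresh the statistics, then inspect
the chosen plan (the exact EXPLAIN output varies):

ANALYZE test1;
EXPLAIN SELECT content FROM test1 WHERE id = 42;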
Indexes can also benefit UPDATE and DELETE commands with search conditions. Indexes can moreover
be used in join queries. Thus, an index defined on a column that is part of a join condition can significantly
speed up queries with joins.
When an index is created, the system has to keep it synchronized with the table. This adds overhead to
data manipulation operations. Therefore indexes that are non-essential or do not get used at all should be
removed. Note that a query or data manipulation command can use at most one index per table.
11.2. Index Types

PostgreSQL provides several index types: B-tree, R-tree, Hash, and GiST. Each index type uses a
different algorithm that is best suited to different types of queries. By default, the CREATE INDEX
command creates a B-tree index, which fits the most common situations.

B-trees can handle equality and range queries on data that can be sorted into some ordering. In
particular, the PostgreSQL query planner will consider using a B-tree index whenever an indexed
column is involved in a comparison using one of these operators:

<
<=
=
>=
>
Constructs equivalent to combinations of these operators, such as BETWEEN and IN, can also be
implemented with a B-tree index search. (But note that IS NULL is not equivalent to = and is not indexable.)
The optimizer can also use a B-tree index for queries involving the pattern matching operators LIKE,
ILIKE, ~, and ~*, if the pattern is anchored to the beginning of the string, e.g., col LIKE 'foo%' or
col ~ '^foo', but not col LIKE '%bar'. However, if your server does not use the C locale you will
need to create the index with a special operator class to support indexing of pattern-matching queries. See
Section 11.6 below.
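As a sketch of the anchoring rule, with a hypothetical table tbl and text column col:

SELECT * FROM tbl WHERE col LIKE 'foo%';   -- left-anchored: an index can be used
SELECT * FROM tbl WHERE col LIKE '%bar';   -- not anchored: a B-tree index cannot help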
R-tree indexes are suited for queries on spatial data. To create an R-tree index, use a command of the form

CREATE INDEX name ON table USING RTREE (column);
The PostgreSQL query planner will consider using an R-tree index whenever an indexed column is in-
volved in a comparison using one of these operators:
<<
&<
&>
>>
@
~=
&&
Note: Testing has shown PostgreSQL’s hash indexes to perform no better than B-tree indexes, and
the index size and build time for hash indexes is much worse. For these reasons, hash index use is
presently discouraged.
GiST indexes are not a single kind of index, but rather an infrastructure within which many different in-
dexing strategies can be implemented. Accordingly, the particular operators with which a GiST index can
be used vary depending on the indexing strategy (the operator class). For more information see Chapter
48.
The B-tree index method is an implementation of Lehman-Yao high-concurrency B-trees. The R-tree
index method implements standard R-trees using Guttman’s quadratic split algorithm. The hash index
method is an implementation of Litwin’s linear hashing. We mention the algorithms used solely to indicate
that all of these index methods are fully dynamic and do not have to be optimized periodically (as is the
case with, for example, static hash methods).
11.3. Multicolumn Indexes

Indexes can be defined on more than one column. For example, if you have a table of this form:

CREATE TABLE test2 (
    major int,
    minor int,
    name varchar
);

(say, you keep your /dev directory in a database...) and you frequently make queries like
SELECT name FROM test2 WHERE major = constant AND minor = constant;
then it may be appropriate to define an index on the columns major and minor together, e.g.,

CREATE INDEX test2_mm_idx ON test2 (major, minor);
Currently, only the B-tree and GiST implementations support multicolumn indexes. Up to 32 columns may
be specified. (This limit can be altered when building PostgreSQL; see the file pg_config_manual.h.)
The query planner can use a multicolumn index for queries that involve the leftmost column in the index
definition plus any number of columns listed to the right of it, without a gap. For example, an index on
(a, b, c) can be used in queries involving all of a, b, and c, or in queries involving both a and b,
or in queries involving only a, but not in other combinations. (In a query involving a and c the planner
could choose to use the index for a, while treating c like an ordinary unindexed column.) Of course, each
column must be used with operators appropriate to the index type; clauses that involve other operators
will not be considered.
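As a sketch of this rule, assume a hypothetical table t with an index on (a, b, c):

CREATE INDEX t_abc_idx ON t (a, b, c);

SELECT * FROM t WHERE a = 5 AND b = 42;   -- can use the index (leftmost columns a, b)
SELECT * FROM t WHERE b = 42;             -- cannot use the index (leading column a missing)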
Multicolumn indexes can only be used if the clauses involving the indexed columns are joined with AND.
For instance, a query like

SELECT name FROM test2 WHERE major = constant OR minor = constant;
cannot make use of the index test2_mm_idx defined above to look up both columns. (It can be used to
look up only the major column, however.)
Multicolumn indexes should be used sparingly. Most of the time, an index on a single column is sufficient
and saves space and time. Indexes with more than three columns are unlikely to be helpful unless the
usage of the table is extremely stylized.
11.4. Unique Indexes

Indexes can also be used to enforce uniqueness of a column's value, or the uniqueness of the combined
values of more than one column:

CREATE UNIQUE INDEX name ON table (column [, ...]);

When an index is declared unique, multiple table rows with equal indexed values will not be allowed.
PostgreSQL automatically creates a unique index when a unique constraint or a primary key is defined
for a table.

Note: The preferred way to add a unique constraint to a table is ALTER TABLE ... ADD CONSTRAINT.
The use of indexes to enforce unique constraints could be considered an implementation detail that
should not be accessed directly. One should, however, be aware that there’s no need to manually
create indexes on unique columns; doing so would just duplicate the automatically-created index.
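As a sketch, for a hypothetical table t with column k, the preferred constraint-based form is:

ALTER TABLE t ADD CONSTRAINT t_k_unique UNIQUE (k);

This creates the supporting unique index automatically.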
11.5. Indexes on Expressions

An index column need not be just a column of the underlying table, but can be a function or scalar
expression computed from one or more columns of the table. For example, a common way to do
case-insensitive comparisons is to use the lower function:

SELECT * FROM test1 WHERE lower(col1) = 'value';

This query can use an index, if one has been defined on the result of the lower(col1) operation:

CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
If we were to declare this index UNIQUE, it would prevent creation of rows whose col1 values differ only
in case, as well as rows whose col1 values are actually identical. Thus, indexes on expressions can be
used to enforce constraints that are not definable as simple unique constraints.
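For example, a case-insensitive uniqueness constraint of this kind could be sketched (table and index
names illustrative) as:

CREATE UNIQUE INDEX test1_lower_col1_uniq ON test1 (lower(col1));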
As another example, if one often does queries like this:

SELECT * FROM people WHERE (first_name || ' ' || last_name) = 'John Smith';

then it might be worth creating an index like this:

CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
The syntax of the CREATE INDEX command normally requires writing parentheses around index expres-
sions, as shown in the second example. The parentheses may be omitted when the expression is just a
function call, as in the first example.
Index expressions are relatively expensive to maintain, since the derived expression(s) must be computed
for each row upon insertion or whenever it is updated. Therefore they should be used only when queries
that can use the index are very frequent.
11.6. Operator Classes

An index definition can specify an operator class for each column of an index:

CREATE INDEX name ON table (column opclass [, ...]);

The operator class identifies the operators to be used by the index for that column. For example, a B-
tree index on the type int4 would use the int4_ops class; this operator class includes comparison
functions for values of type int4. In practice the default operator class for the column’s data type is
usually sufficient. The main point of having operator classes is that for some data types, there could be
more than one meaningful index behavior. For example, we might want to sort a complex-number data
type either by absolute value or by real part. We could do this by defining two operator classes for the data
type and then selecting the proper class when making an index.
There are also some built-in operator classes besides the default ones:

• The operator classes text_pattern_ops, varchar_pattern_ops, bpchar_pattern_ops, and
name_pattern_ops support B-tree indexes on the types text, varchar, char, and name,
respectively. The difference from the ordinary operator classes is that the values are compared
strictly character by character rather than according to the locale-specific collation rules. This
makes these operator classes suitable for queries involving pattern matching expressions (LIKE
or POSIX regular expressions) when the server does not use the standard C locale.
If you do use the C locale, you may instead create an index with the default operator class, and it
will still be useful for pattern-matching queries. Also note that you should create an index with the
default operator class if you want queries involving ordinary comparisons to use an index. Such queries
cannot use the xxx_pattern_ops operator classes. It is allowed to create multiple indexes on the
same column with different operator classes.
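For example, to support both ordinary comparisons and anchored pattern matching on the same column,
one could create two indexes (names and column type are illustrative):

CREATE INDEX test_col_idx ON test_table (col);
CREATE INDEX test_col_pattern_idx ON test_table (col varchar_pattern_ops);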
11.7. Partial Indexes

A partial index is an index built over a subset of a table; the subset is defined by a conditional
expression (called the predicate of the partial index). The index contains entries only for those table
rows that satisfy the predicate.

A major motivation for partial indexes is to avoid indexing common values. Since a query searching
for a common value will not use the index anyway, there is no point in keeping those rows in the index
at all. This reduces the size of the index, which speeds up both queries that do use the index and many
table update operations.

Example 11-1. Setting up a Partial Index to Exclude Common Values

Suppose you are storing web server access logs in a database. Most accesses originate from the IP address
range of your organization but some are from elsewhere (say, employees on dial-up connections). If your
searches by IP are primarily for outside accesses, you probably do not need to index the IP range that
corresponds to your organization’s subnet.
Assume a table like this:
CREATE TABLE access_log (
url varchar,
client_ip inet,
...
);
To create a partial index that suits our example, use a command such as this:
CREATE INDEX access_log_client_ip_ix ON access_log (client_ip)
WHERE NOT (client_ip > inet '192.168.100.0' AND client_ip < inet '192.168.100.255');
Observe that this kind of partial index requires that the common values be predetermined. If the distribu-
tion of values is inherent (due to the nature of the application) and static (not changing over time), this is
not difficult, but if the common values are merely due to the coincidental data load this can require a lot
of maintenance work.
Another possibility is to exclude values from the index that the typical query workload is not interested
in; this is shown in Example 11-2. This results in the same advantages as listed above, but it prevents the
“uninteresting” values from being accessed via that index at all, even if an index scan might be profitable
in that case. Obviously, setting up partial indexes for this kind of scenario will require a lot of care and
experimentation.
Example 11-2. Setting up a Partial Index to Exclude Uninteresting Values

If you have a table that contains both billed and unbilled orders, where the unbilled orders take up a small
fraction of the total table and yet those are the most-accessed rows, you can improve performance by
creating an index on just the unbilled rows. The command to create the index would look like this:
CREATE INDEX orders_unbilled_index ON orders (order_nr)
WHERE billed is not true;
Example 11-2 also illustrates that the indexed column and the column used in the predicate do not need
to match. PostgreSQL supports partial indexes with arbitrary predicates, so long as only columns of the
table being indexed are involved. However, keep in mind that the predicate must match the conditions
used in the queries that are supposed to benefit from the index. To be precise, a partial index can be used
in a query only if the system can recognize that the WHERE condition of the query mathematically implies
the predicate of the index. PostgreSQL does not have a sophisticated theorem prover that can recognize
mathematically equivalent expressions that are written in different forms. (Not only is such a general
theorem prover extremely difficult to create, it would probably be too slow to be of any real use.) The
system can recognize simple inequality implications, for example “x < 1” implies “x < 2”; otherwise
the predicate condition must exactly match part of the query’s WHERE condition or the index will not be
recognized to be usable.
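As a sketch of this rule, for a hypothetical table t with a partial index whose predicate is x < 2:

CREATE INDEX t_part_idx ON t (x) WHERE x < 2;

SELECT * FROM t WHERE x < 1;   -- usable: "x < 1" implies the predicate "x < 2"
SELECT * FROM t WHERE x < 3;   -- not usable: "x < 3" does not imply "x < 2"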
A third possible use for partial indexes does not require the index to be used in queries at all. The idea
here is to create a unique index over a subset of a table, as in Example 11-3. This enforces uniqueness
among the rows that satisfy the index predicate, without constraining those that do not.
Example 11-3. Setting up a Unique Partial Index

Suppose that we have a table describing test outcomes. We wish to ensure that there is only one “success-
ful” entry for a given subject and target combination, but there might be any number of “unsuccessful”
entries. Here is one way to do it:

CREATE TABLE tests (
    subject text,
    target text,
    success boolean,
    ...
);

CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
    WHERE success;
Finally, a partial index can also be used to override the system’s query plan choices. It may occur that data
sets with peculiar distributions will cause the system to use an index when it really should not. In that case
the index can be set up so that it is not available for the offending query. Normally, PostgreSQL makes
reasonable choices about index usage (e.g., it avoids them when retrieving common values, so the earlier
example really only saves index size; it is not required to avoid index usage), and grossly incorrect plan
choices are cause for a bug report.
Keep in mind that setting up a partial index indicates that you know at least as much as the query planner
knows, in particular you know when an index might be profitable. Forming this knowledge requires ex-
perience and understanding of how indexes in PostgreSQL work. In most cases, the advantage of a partial
index over a regular index will not be much.
More information about partial indexes can be found in The case for partial indexes, Partial indexing in
POSTGRES: research project, and Generalized Partial Indexes.
11.8. Examining Index Usage

It is difficult to formulate a general procedure for determining which indexes to set up. There are a number
of typical cases that have been shown in the examples throughout the previous sections. A good deal of
experimentation will be necessary in most cases. The rest of this section gives some tips for that.
• Always run ANALYZE first. This command collects statistics about the distribution of the values in the
table. This information is required to guess the number of rows returned by a query, which is needed by
the planner to assign realistic costs to each possible query plan. In absence of any real statistics, some
default values are assumed, which are almost certain to be inaccurate. Examining an application’s index
usage without having run ANALYZE is therefore a lost cause.
• Use real data for experimentation. Using test data for setting up indexes will tell you what indexes you
need for the test data, but that is all.
It is especially fatal to use very small test data sets. While selecting 1000 out of 100000 rows could be
a candidate for an index, selecting 1 out of 100 rows will hardly be, because the 100 rows will probably
fit within a single disk page, and there is no plan that can beat sequentially fetching 1 disk page.
Also be careful when making up test data, which is often unavoidable when the application is not in
production use yet. Values that are very similar, completely random, or inserted in sorted order will
skew the statistics away from the distribution that real data would have.
• When indexes are not used, it can be useful for testing to force their use. There are run-time parameters
that can turn off various plan types (described in Section 16.4). For instance, turning off sequential scans
(enable_seqscan) and nested-loop joins (enable_nestloop), which are the most basic plans, will
force the system to use a different plan. If the system still chooses a sequential scan or nested-loop join
then there is probably a more fundamental reason why the index is not used; for example, the
query condition does not match the index. (What kind of query can use what kind of index is explained
in the previous sections.)
• If forcing index usage does use the index, then there are two possibilities: Either the system is right
and using the index is indeed not appropriate, or the cost estimates of the query plans are not reflecting
reality. So you should time your query with and without indexes. The EXPLAIN ANALYZE command
can be useful here.
• If it turns out that the cost estimates are wrong, there are, again, two possibilities. The total cost is
computed from the per-row costs of each plan node times the selectivity estimate of the plan node.
The costs of the plan nodes can be tuned with run-time parameters (described in Section 16.4). An
inaccurate selectivity estimate is due to insufficient statistics. It may be possible to help this by tuning
the statistics-gathering parameters (see ALTER TABLE).
If you do not succeed in adjusting the costs to be more appropriate, then you may have to resort to
forcing index usage explicitly. You may also want to contact the PostgreSQL developers to examine the
issue.
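The run-time toggles mentioned above can be sketched as a testing session (reset them afterwards,
since they are meant only for experimentation; table and column names reuse the earlier test1 example):

SET enable_seqscan = off;
SET enable_nestloop = off;
EXPLAIN ANALYZE SELECT content FROM test1 WHERE id = 42;
SET enable_seqscan = on;
SET enable_nestloop = on;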
Chapter 12. Concurrency Control
This chapter describes the behavior of the PostgreSQL database system when two or more sessions try
to access the same data at the same time. The goals in that situation are to allow efficient access for
all sessions while maintaining strict data integrity. Every developer of database applications should be
familiar with the topics covered in this chapter.
12.1. Introduction
Unlike traditional database systems which use locks for concurrency control, PostgreSQL maintains data
consistency by using a multiversion model (Multiversion Concurrency Control, MVCC). This means that
while querying a database each transaction sees a snapshot of data (a database version) as it was some
time ago, regardless of the current state of the underlying data. This protects the transaction from viewing
inconsistent data that could be caused by (other) concurrent transaction updates on the same data rows,
providing transaction isolation for each database session.
The main advantage to using the MVCC model of concurrency control rather than locking is that in
MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data,
and so reading never blocks writing and writing never blocks reading.
Table- and row-level locking facilities are also available in PostgreSQL for applications that cannot adapt
easily to MVCC behavior. However, proper use of MVCC will generally provide better performance than
locks.
12.2. Transaction Isolation

The SQL standard defines four levels of transaction isolation in terms of three phenomena that must be
prevented between concurrent transactions. These undesirable phenomena are:

dirty read
A transaction reads data written by a concurrent uncommitted transaction.
nonrepeatable read
A transaction re-reads data it has previously read and finds that data has been modified by another
transaction (that committed since the initial read).
phantom read
A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that
the set of rows satisfying the condition has changed due to another recently-committed transaction.
The four transaction isolation levels and the corresponding behaviors are described in Table 12-1.
In PostgreSQL, you can request any of the four standard transaction isolation levels. But internally, there
are only two distinct isolation levels, which correspond to the levels Read Committed and Serializable.
When you select the level Read Uncommitted you really get Read Committed, and when you select
Repeatable Read you really get Serializable, so the actual isolation level may be stricter than what you
select. This is permitted by the SQL standard: the four isolation levels only define which phenomena
must not happen, they do not define which phenomena must happen. The reason that PostgreSQL only
provides two isolation levels is that this is the only sensible way to map the standard isolation levels to the
multiversion concurrency control architecture. The behavior of the available isolation levels is detailed in
the following subsections.
To set the transaction isolation level of a transaction, use the command SET TRANSACTION.
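For example, a sketch of running one transaction at the Serializable level:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- commands in this block run at the Serializable level
COMMIT;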
12.2.1. Read Committed Isolation Level

Read Committed is the default isolation level in PostgreSQL. When a transaction runs on this isolation
level, a SELECT query sees only data committed before the query began; it never sees either uncommitted
data or changes committed during query execution by concurrent transactions. UPDATE, DELETE, and
SELECT FOR UPDATE commands behave the same way when searching for target rows; however, a
target row may have been updated (or deleted or locked) by a concurrent transaction by the time it is
found. In that case, the would-be updater waits for the first transaction to commit or roll back and, if it
committed, re-evaluates its search condition against the updated version of the row. This behavior is just
right for simple cases, such as transferring money between bank accounts:

BEGIN;
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;
COMMIT;
If two such transactions concurrently try to change the balance of account 12345, we clearly want the
second transaction to start from the updated version of the account’s row. Because each command is
affecting only a predetermined row, letting it see the updated version of the row does not create any
troublesome inconsistency.
Since in Read Committed mode each new command starts with a new snapshot that includes all transac-
tions committed up to that instant, subsequent commands in the same transaction will see the effects of
the committed concurrent transaction in any case. The point at issue here is whether or not within a single
command we see an absolutely consistent view of the database.
The partial transaction isolation provided by Read Committed mode is adequate for many applications,
and this mode is fast and simple to use. However, for applications that do complex queries and updates, it
may be necessary to guarantee a more rigorously consistent view of the database than the Read Committed
mode provides.
12.2.2. Serializable Isolation Level

The Serializable isolation level provides the strictest transaction isolation. When a transaction is on the
serializable level, a SELECT query sees only data committed before the transaction began; it never sees
either uncommitted data or changes committed during transaction execution by concurrent transactions.
If a serializable transaction tries to update a row that has been modified by another transaction since the
serializable transaction began, it will be rolled back with the message

ERROR:  could not serialize access due to concurrent update

because a serializable transaction cannot modify rows changed by other transactions after the serializable
transaction began.
When the application receives this error message, it should abort the current transaction and then retry
the whole transaction from the beginning. The second time through, the transaction sees the previously-
committed change as part of its initial view of the database, so there is no logical conflict in using the new
version of the row as the starting point for the new transaction’s update.
Note that only updating transactions may need to be retried; read-only transactions will never have serial-
ization conflicts.
The Serializable mode provides a rigorous guarantee that each transaction sees a wholly consistent view
of the database. However, the application has to be prepared to retry transactions when concurrent up-
dates make it impossible to sustain the illusion of serial execution. Since the cost of redoing complex
transactions may be significant, this mode is recommended only when updating transactions contain logic
sufficiently complex that they may give wrong answers in Read Committed mode. Most commonly, Se-
rializable mode is necessary when a transaction executes several successive commands that must see
identical views of the database.
As an example, consider a table mytable, initially containing:

class | value
-------+-------
1 | 10
1 | 20
2 | 100
2 | 200
Suppose that serializable transaction A computes

SELECT SUM(value) FROM mytable WHERE class = 1;

and then inserts the result (30) as the value in a new row with class = 2. Concurrently, serializable
transaction B computes

SELECT SUM(value) FROM mytable WHERE class = 2;
and obtains the result 300, which it inserts in a new row with class = 1. Then both transactions commit.
None of the listed undesirable behaviors have occurred, yet we have a result that could not have occurred
in either order serially. If A had executed before B, B would have computed the sum 330, not 300, and
similarly the other order would have resulted in a different sum computed by A.
To guarantee true mathematical serializability, it is necessary for a database system to enforce predicate
locking, which means that a transaction cannot insert or modify a row that would have matched the WHERE
condition of a query in another concurrent transaction. For example, once transaction A has executed the
query SELECT ... WHERE class = 1, a predicate-locking system would forbid transaction B from in-
serting any new row with class 1 until A has committed. 1 Such a locking system is complex to implement
and extremely expensive in execution, since every session must be aware of the details of every query
executed by every concurrent transaction. And this large expense is mostly wasted, since in practice most
applications do not do the sorts of things that could result in problems. (Certainly the example above is
rather contrived and unlikely to represent real software.) Accordingly, PostgreSQL does not implement
predicate locking, and so far as we are aware no other production DBMS does either.
1. Essentially, a predicate-locking system prevents phantom reads by restricting what is written, whereas MVCC prevents them
by restricting what is read.
Chapter 12. Concurrency Control
In those cases where the possibility of nonserializable execution is a real hazard, problems can be prevented
by appropriate use of explicit locking. Further discussion appears in the following sections.
ROW EXCLUSIVE
Conflicts with the SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock
modes.
The commands UPDATE, DELETE, and INSERT acquire this lock mode on the target table (in addition
to ACCESS SHARE locks on any other referenced tables). In general, this lock mode will be acquired
by any command that modifies the data in a table.
SHARE UPDATE EXCLUSIVE
Conflicts with the SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE,
and ACCESS EXCLUSIVE lock modes. This mode protects a table against concurrent schema
changes and VACUUM runs.
Acquired by VACUUM (without FULL).
SHARE
Conflicts with the ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE ROW EXCLUSIVE,
EXCLUSIVE, and ACCESS EXCLUSIVE lock modes. This mode protects a table against concurrent
data changes.
Acquired by CREATE INDEX.
SHARE ROW EXCLUSIVE
Conflicts with the ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE ROW
EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes.
This lock mode is not automatically acquired by any PostgreSQL command.
EXCLUSIVE
Conflicts with the ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE EXCLUSIVE, SHARE, SHARE
ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE lock modes. This mode allows only
concurrent ACCESS SHARE locks, i.e., only reads from the table can proceed in parallel with a
transaction holding this lock mode.
This lock mode is not automatically acquired by any PostgreSQL command.
ACCESS EXCLUSIVE
Conflicts with locks of all modes (ACCESS SHARE, ROW SHARE, ROW EXCLUSIVE, SHARE UPDATE
EXCLUSIVE, SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, and ACCESS EXCLUSIVE). This mode
guarantees that the holder is the only transaction accessing the table in any way.
Acquired by the ALTER TABLE, DROP TABLE, REINDEX, CLUSTER, and VACUUM FULL commands.
This is also the default lock mode for LOCK TABLE statements that do not specify a mode explicitly.
Tip: Only an ACCESS EXCLUSIVE lock blocks a SELECT (without FOR UPDATE) statement.
To acquire a row-level lock on a row without actually modifying the row, select the row
with SELECT FOR UPDATE. Note that once a particular row-level lock is acquired, the transaction may
update the row multiple times without fear of conflicts.
PostgreSQL doesn’t remember any information about modified rows in memory, so it has no limit to the
number of rows locked at one time. However, locking a row may cause a disk write; thus, for example,
SELECT FOR UPDATE will modify selected rows to mark them and so will result in disk writes.
In addition to table and row locks, page-level share/exclusive locks are used to control read/write access
to table pages in the shared buffer pool. These locks are released immediately after a row is fetched or
updated. Application developers normally need not be concerned with page-level locks, but we mention
them for completeness.
12.3.3. Deadlocks
The use of explicit locking can increase the likelihood of deadlocks, wherein two (or more) transactions
each hold locks that the other wants. For example, if transaction 1 acquires an exclusive lock on table A
and then tries to acquire an exclusive lock on table B, while transaction 2 has already exclusive-locked
table B and now wants an exclusive lock on table A, then neither one can proceed. PostgreSQL automatically
detects deadlock situations and resolves them by aborting one of the transactions involved, allowing
the other(s) to complete. (Exactly which transaction will be aborted is difficult to predict and should not
be relied on.)
Note that deadlocks can also occur as the result of row-level locks (and thus, they can occur even if explicit
locking is not used). Consider the case in which there are two concurrent transactions modifying a table.
The first transaction executes:
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
This acquires a row-level lock on the row with the specified account number. Then, the second transaction
executes:
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 22222;
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
The first UPDATE statement successfully acquires a row-level lock on the specified row, so it succeeds in
updating that row. However, the second UPDATE statement finds that the row it is attempting to update has
already been locked, so it waits for the transaction that acquired the lock to complete. Transaction two is
now waiting on transaction one to complete before it continues execution. Now, transaction one executes:
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
Transaction one attempts to acquire a row-level lock on the specified row, but it cannot: transaction two
already holds such a lock. So it waits for transaction two to complete. Thus, transaction one is blocked
on transaction two, and transaction two is blocked on transaction one: a deadlock condition. PostgreSQL
will detect this situation and abort one of the transactions.
The best defense against deadlocks is generally to avoid them by being certain that all applications using a
database acquire locks on multiple objects in a consistent order. In the example above, if both transactions
had updated the rows in the same order, no deadlock would have occurred. One should also ensure that the
first lock acquired on an object in a transaction is the highest mode that will be needed for that object. If it
is not feasible to verify this in advance, then deadlocks may be handled on-the-fly by retrying transactions
that are aborted due to deadlock.
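The consistent-ordering rule can be sketched in application code. This is a hypothetical Python illustration of the principle, not anything PostgreSQL provides: both workers sort the locks they need by a stable key before acquiring them, so the two-transaction cycle described above cannot form.

```python
import threading

lock_a = threading.Lock()  # stands in for the row lock on one account
lock_b = threading.Lock()  # stands in for the row lock on the other

def transfer(first, second, log, name):
    # Acquire both locks in a globally consistent order (here: by id()),
    # regardless of the order in which the caller named them.
    ordered = sorted((first, second), key=id)
    for lk in ordered:
        lk.acquire()
    try:
        log.append(name)  # critical section: both "rows" are locked
    finally:
        for lk in reversed(ordered):
            lk.release()

log = []
t1 = threading.Thread(target=transfer, args=(lock_a, lock_b, log, "txn1"))
t2 = threading.Thread(target=transfer, args=(lock_b, lock_a, log, "txn2"))
t1.start(); t2.start()
t1.join(); t2.join()
print(sorted(log))  # both transactions complete; no deadlock
```

Without the sorting step, the two threads could each grab one lock and wait forever on the other, mirroring the UPDATE example above.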
So long as no deadlock situation is detected, a transaction seeking either a table-level or row-level lock
will wait indefinitely for conflicting locks to be released. This means it is a bad idea for applications to
hold transactions open for long periods of time (e.g., while waiting for user input).
B-tree indexes
Short-term share/exclusive page-level locks are used for read/write access. Locks are released immediately
after each index row is fetched or inserted. B-tree indexes provide the highest concurrency
without deadlock conditions.
GiST and R-tree indexes
Share/exclusive index-level locks are used for read/write access. Locks are released after the command
is done.
Hash indexes
Share/exclusive hash-bucket-level locks are used for read/write access. Locks are released after the
whole bucket is processed. Bucket-level locks provide better concurrency than index-level ones, but
deadlock is possible since the locks are held longer than one index operation.
In short, B-tree indexes offer the best performance for concurrent applications; since they also have more
features than hash indexes, they are the recommended index type for concurrent applications that need to
index scalar data. When dealing with non-scalar data, B-trees obviously cannot be used; in that situation,
application developers should be aware of the relatively poor concurrent performance of GiST and R-tree
indexes.
Chapter 13. Performance Tips
Query performance can be affected by many things. Some of these can be manipulated by the user, while
others are fundamental to the underlying design of the system. This chapter provides some hints about
understanding and tuning PostgreSQL performance.
The numbers that are currently quoted by EXPLAIN are:
• Estimated start-up cost (Time expended before output scan can start, e.g., time to do the sorting in a
sort node.)
• Estimated total cost (If all rows were to be retrieved, which they may not be: a query with a LIMIT
clause will stop short of paying the total cost, for example.)
• Estimated number of rows output by this plan node (Again, only if executed to completion)
• Estimated average width (in bytes) of rows output by this plan node
The costs are measured in units of disk page fetches. (CPU effort estimates are converted into disk-page
units using some fairly arbitrary fudge factors. If you want to experiment with these factors, see the list of
run-time configuration parameters in Section 16.4.5.2.)
It’s important to note that the cost of an upper-level node includes the cost of all its child nodes. It’s also
important to realize that the cost only reflects things that the planner/optimizer cares about. In particular,
the cost does not consider the time spent transmitting result rows to the frontend, which could be a pretty
dominant factor in the true elapsed time; but the planner ignores it because it cannot change it by altering
the plan. (Every correct plan will output the same row set, we trust.)
Rows output is a little tricky because it is not the number of rows processed/scanned by the query, it is
usually less, reflecting the estimated selectivity of any WHERE-clause conditions that are being applied
at this node. Ideally the top-level rows estimate will approximate the number of rows actually returned,
updated, or deleted by the query.
Here are some examples (using the regression test database after a VACUUM ANALYZE, and 7.3 development
sources):
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..333.00 rows=10000 width=148)
If you do
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
you will find out that tenk1 has 233 disk pages and 10000 rows. So the cost is estimated at 233 page
reads, defined as costing 1.0 apiece, plus 10000 * cpu_tuple_cost which is currently 0.01 (try SHOW
cpu_tuple_cost).
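That arithmetic can be checked directly. A quick Python sketch of the cost formula just described; the variable names mirror the run-time parameters:

```python
seq_page_cost = 1.0    # one unit per sequential page read (the cost unit)
cpu_tuple_cost = 0.01  # per-row CPU charge; try SHOW cpu_tuple_cost

relpages, reltuples = 233, 10000  # pg_class figures for tenk1

total_cost = relpages * seq_page_cost + reltuples * cpu_tuple_cost
print(total_cost)  # -> 333.0, matching the plan's total cost
```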
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..358.00 rows=1033 width=148)
Filter: (unique1 < 1000)
The estimate of output rows has gone down because of the WHERE clause. However, the scan will still have
to visit all 10000 rows, so the cost hasn’t decreased; in fact it has gone up a bit to reflect the extra CPU
time spent checking the WHERE condition.
The actual number of rows this query would select is 1000, but the estimate is only approximate. If you try
to duplicate this experiment, you will probably get a slightly different estimate; moreover, it will change
after each ANALYZE command, because the statistics produced by ANALYZE are taken from a randomized
sample of the table.
Modify the query to restrict the condition even more:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 50;
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using tenk1_unique1 on tenk1 (cost=0.00..179.33 rows=49 width=148)
Index Cond: (unique1 < 50)
and you will see that if we make the WHERE condition selective enough, the planner will eventually decide
that an index scan is cheaper than a sequential scan. This plan will only have to visit 50 rows because of
the index, so it wins despite the fact that each individual fetch is more expensive than reading a whole
disk page sequentially.
Add another condition to the WHERE clause:
EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 50 AND stringu1 = 'xxx';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using tenk1_unique1 on tenk1 (cost=0.00..179.45 rows=1 width=148)
Index Cond: (unique1 < 50)
Filter: (stringu1 = 'xxx'::name)
The added condition stringu1 = 'xxx' reduces the output-rows estimate, but not the cost because we
still have to visit the same set of rows. Notice that the stringu1 clause cannot be applied as an index
condition (since this index is only on the unique1 column). Instead it is applied as a filter on the rows
retrieved by the index. Thus the cost has actually gone up a little bit to reflect this extra checking.
Let’s try joining two tables, using the columns we have been discussing:
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2;
QUERY PLAN
----------------------------------------------------------------------------
Nested Loop (cost=0.00..327.02 rows=49 width=296)
-> Index Scan using tenk1_unique1 on tenk1 t1
(cost=0.00..179.33 rows=49 width=148)
Index Cond: (unique1 < 50)
-> Index Scan using tenk2_unique2 on tenk2 t2
(cost=0.00..3.01 rows=1 width=148)
Index Cond: ("outer".unique2 = t2.unique2)
In this nested-loop join, the outer scan is the same index scan we had in the example before last, and so
its cost and row count are the same because we are applying the WHERE clause unique1 < 50 at that
node. The t1.unique2 = t2.unique2 clause is not relevant yet, so it doesn’t affect row count of the
outer scan. For the inner scan, the unique2 value of the current outer-scan row is plugged into the inner
index scan to produce an index condition like t2.unique2 = constant. So we get the same inner-scan
plan and costs that we’d get from, say, EXPLAIN SELECT * FROM tenk2 WHERE unique2 = 42. The
costs of the loop node are then set on the basis of the cost of the outer scan, plus one repetition of the inner
scan for each outer row (49 * 3.01, here), plus a little CPU time for join processing.
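In rough numbers, a Python sketch of the charging just described:

```python
outer_cost = 179.33  # index scan on tenk1 (the outer scan)
inner_cost = 3.01    # one index probe into tenk2 (the inner scan)
outer_rows = 49      # estimated rows from the outer scan

base = outer_cost + outer_rows * inner_cost
print(round(base, 2))  # -> 326.82; the plan's 327.02 adds the small
                       # CPU charge for join processing on top of this
```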
In this example the join’s output row count is the same as the product of the two scans’ row counts, but
that’s not true in general, because in general you can have WHERE clauses that mention both tables and
so can only be applied at the join point, not to either input scan. For example, if we added WHERE ...
AND t1.hundred < t2.hundred, that would decrease the output row count of the join node, but not
change either input scan.
One way to look at variant plans is to force the planner to disregard whatever strategy it thought was the
winner, using the enable/disable flags for each plan type. (This is a crude tool, but useful. See also Section
13.3.)
SET enable_nestloop = off;
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2;
QUERY PLAN
--------------------------------------------------------------------------
Hash Join (cost=179.45..563.06 rows=49 width=296)
Hash Cond: ("outer".unique2 = "inner".unique2)
-> Seq Scan on tenk2 t2 (cost=0.00..333.00 rows=10000 width=148)
-> Hash (cost=179.33..179.33 rows=49 width=148)
-> Index Scan using tenk1_unique1 on tenk1 t1
(cost=0.00..179.33 rows=49 width=148)
Index Cond: (unique1 < 50)
This plan proposes to extract the 50 interesting rows of tenk1 using ye same olde index scan, stash them
into an in-memory hash table, and then do a sequential scan of tenk2, probing into the hash table for
possible matches of t1.unique2 = t2.unique2 at each tenk2 row. The cost to read tenk1 and set
up the hash table is entirely start-up cost for the hash join, since we won’t get any rows out until we can
start reading tenk2. The total time estimate for the join also includes a hefty charge for the CPU time to
probe the hash table 10000 times. Note, however, that we are not charging 10000 times 179.33; the hash
table setup is only done once in this plan type.
It is possible to check on the accuracy of the planner’s estimated costs by using EXPLAIN ANALYZE. This
command actually executes the query, and then displays the true run time accumulated within each plan
node along with the same estimated costs that a plain EXPLAIN shows. For example, we might get a result
like this:
EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2;
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop (cost=0.00..327.02 rows=49 width=296)
(actual time=1.181..29.822 rows=50 loops=1)
-> Index Scan using tenk1_unique1 on tenk1 t1
(cost=0.00..179.33 rows=49 width=148)
(actual time=0.630..8.917 rows=50 loops=1)
Index Cond: (unique1 < 50)
-> Index Scan using tenk2_unique2 on tenk2 t2
(cost=0.00..3.01 rows=1 width=148)
(actual time=0.295..0.324 rows=1 loops=50)
Index Cond: ("outer".unique2 = t2.unique2)
Total runtime: 31.604 ms
Note that the “actual time” values are in milliseconds of real time, whereas the “cost” estimates are
expressed in arbitrary units of disk fetches; so they are unlikely to match up. The thing to pay attention to
is the ratios.
In some query plans, it is possible for a subplan node to be executed more than once. For example, the
inner index scan is executed once per outer row in the above nested-loop plan. In such cases, the “loops”
value reports the total number of executions of the node, and the actual time and rows values shown are
averages per-execution. This is done to make the numbers comparable with the way that the cost estimates
are shown. Multiply by the “loops” value to get the total time actually spent in the node.
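For the inner index scan above, that works out as (Python):

```python
avg_ms, loops = 0.324, 50  # per-execution average and loop count shown above
total_ms = avg_ms * loops  # total time actually spent in the node
print(round(total_ms, 1))  # -> 16.2 ms, roughly half the 31.6 ms total runtime
```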
The Total runtime shown by EXPLAIN ANALYZE includes executor start-up and shut-down time, as
well as time spent processing the result rows. It does not include parsing, rewriting, or planning time. For
a SELECT query, the total run time will normally be just a little larger than the total time reported for the
top-level plan node. For INSERT, UPDATE, and DELETE commands, the total run time may be considerably
larger, because it includes the time spent processing the result rows. In these commands, the time for the
top plan node essentially is the time spent computing the new rows and/or locating the old ones, but it
doesn’t include the time spent making the changes.
It is worth noting that EXPLAIN results should not be extrapolated to situations other than the one you are
actually testing; for example, results on a toy-sized table can’t be assumed to apply to large tables. The
planner’s cost estimates are not linear and so it may well choose a different plan for a larger or smaller
table. An extreme example is that on a table that only occupies one disk page, you’ll nearly always get a
sequential scan plan whether indexes are available or not. The planner realizes that it’s going to take one
disk page read to process the table in any case, so there’s no value in expending additional page reads to
look at an index.
SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 'tenk1%';
Here we can see that tenk1 contains 10000 rows, as do its indexes, but the indexes are (unsurprisingly)
much smaller than the table.
For efficiency reasons, reltuples and relpages are not updated on-the-fly, and so they usually contain
somewhat out-of-date values. They are updated by VACUUM, ANALYZE, and a few DDL commands such
as CREATE INDEX. A stand-alone ANALYZE, that is, one not part of VACUUM, generates an approximate
reltuples value since it does not read every row of the table. The planner will scale the values it finds
in pg_class to match the current physical table size, thus obtaining a closer approximation.
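A sketch of that scaling (hypothetical Python; the function name is illustrative, not a PostgreSQL API):

```python
# The planner trusts the rows-per-page density from pg_class more than the
# absolute reltuples value, and rescales it to the table's current size.
def estimated_rows(reltuples, relpages, current_pages):
    if relpages == 0:
        return reltuples
    density = reltuples / relpages  # rows per page at last VACUUM/ANALYZE
    return density * current_pages  # scaled to the size seen now

# Stats said 10000 rows in 233 pages; the table has since grown to 466 pages.
print(round(estimated_rows(10000, 233, 466)))  # -> 20000
```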
Most queries retrieve only a fraction of the rows in a table, due to having WHERE clauses that restrict the
rows to be examined. The planner thus needs to make an estimate of the selectivity of WHERE clauses, that
is, the fraction of rows that match each condition in the WHERE clause. The information used for this task
is stored in the pg_statistic system catalog. Entries in pg_statistic are updated by ANALYZE and
VACUUM ANALYZE commands and are always approximate even when freshly updated.
Rather than look at pg_statistic directly, it’s better to look at its view pg_stats when examining the
statistics manually. pg_stats is designed to be more easily readable. Furthermore, pg_stats is readable
by all, whereas pg_statistic is only readable by a superuser. (This prevents unprivileged users from
learning something about the contents of other people’s tables from the statistics. The pg_stats view is
restricted to show only rows about tables that the current user can read.) For example, we might do:
SELECT attname, n_distinct, most_common_vals FROM pg_stats WHERE tablename = 'road';

 attname | n_distinct |                     most_common_vals
---------+------------+-----------------------------------------------------------------
name | -0.467008 | {"I- 580 Ramp","I- 880
thepath | 20 | {"[(-122.089,37.71),(-122.0886,37.711)]"}
(2 rows)
The amount of information stored in pg_statistic, in particular the maximum number of entries in
the most_common_vals and histogram_bounds arrays for each column, can be set on a column-
by-column basis using the ALTER TABLE SET STATISTICS command, or globally by setting the
default_statistics_target configuration variable. The default limit is presently 10 entries. Raising the limit
may allow more accurate planner estimates to be made, particularly for columns with irregular data distri-
butions, at the price of consuming more space in pg_statistic and slightly more time to compute the
estimates. Conversely, a lower limit may be appropriate for columns with simple data distributions.
In a simple join query, such as
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
the planner is free to join the given tables in any order. For example, it could generate a query plan that
joins A to B, using the WHERE condition a.id = b.id, and then joins C to this joined table, using the
other WHERE condition. Or it could join B to C and then join A to that result. Or it could join A to C and
then join them with B, but that would be inefficient, since the full Cartesian product of A and C would
have to be formed, there being no applicable condition in the WHERE clause to allow optimization of the
join. (All joins in the PostgreSQL executor happen between two input tables, so it’s necessary to build up
the result in one or another of these fashions.) The important point is that these different join possibilities
give semantically equivalent results but may have hugely different execution costs. Therefore, the planner
will explore all of them to try to find the most efficient query plan.
When a query only involves two or three tables, there aren’t many join orders to worry about. But the
number of possible join orders grows exponentially as the number of tables expands. Beyond ten or so
input tables it’s no longer practical to do an exhaustive search of all the possibilities, and even for six
or seven tables planning may take an annoyingly long time. When there are too many input tables, the
PostgreSQL planner will switch from exhaustive search to a genetic probabilistic search through a limited
number of possibilities. (The switch-over threshold is set by the geqo_threshold run-time parameter.) The
genetic search takes less time, but it won’t necessarily find the best possible plan.
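The growth is easy to see by counting just the left-deep orderings, which is n! for n input tables (bushy plan shapes only add to this):

```python
from math import factorial

# Number of left-deep join orderings for n input tables is n!.
for n in (3, 6, 10):
    print(n, factorial(n))  # 3 -> 6, 6 -> 720, 10 -> 3628800
```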
When the query involves outer joins, the planner has much less freedom than it does for plain (inner)
joins. For example, consider
SELECT * FROM a LEFT JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
Although this query’s restrictions are superficially similar to the previous example, the semantics are
different because a row must be emitted for each row of A that has no matching row in the join of B and
C. Therefore the planner has no choice of join order here: it must join B to C and then join A to that result.
Accordingly, this query takes less time to plan than the previous query.
Explicit inner join syntax (INNER JOIN, CROSS JOIN, or unadorned JOIN) is semantically the same as
listing the input relations in FROM, so it does not need to constrain the join order. But it is possible to
instruct the PostgreSQL query planner to treat explicit inner JOINs as constraining the join order anyway.
For example, these three queries are logically equivalent:
SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a CROSS JOIN b CROSS JOIN c WHERE a.id = b.id AND b.ref = c.id;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
But if we tell the planner to honor the JOIN order, the second and third take less time to plan than the first.
This effect is not worth worrying about for only three tables, but it can be a lifesaver with many tables.
To force the planner to follow the JOIN order for inner joins, set the join_collapse_limit run-time parameter
to 1. (Other possible values are discussed below.)
You do not need to constrain the join order completely in order to cut search time, because it’s OK to use
JOIN operators within items of a plain FROM list. For example, consider
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
With join_collapse_limit = 1, this forces the planner to join A to B before joining them to other
tables, but doesn’t constrain its choices otherwise. In this example, the number of possible join orders is
reduced by a factor of 5.
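The factor of 5 falls straight out of the counting (Python, again using left-deep orderings):

```python
from math import factorial

free = factorial(5)         # a, b, c, d, e joinable in any order: 120
constrained = factorial(4)  # (a JOIN b) fixed as a single unit: 24
print(free // constrained)  # -> 5, the reduction quoted in the text
```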
Constraining the planner’s search in this way is a useful technique both for reducing planning time and
for directing the planner to a good query plan. If the planner chooses a bad join order by default, you can
force it to choose a better order via JOIN syntax — assuming that you know of a better order, that is.
Experimentation is recommended.
A closely related issue that affects planning time is collapsing of subqueries into their parent query. For
example, consider
SELECT *
FROM x, y,
(SELECT * FROM a, b, c WHERE something) AS ss
WHERE somethingelse;
This situation might arise from use of a view that contains a join; the view’s SELECT rule will be inserted
in place of the view reference, yielding a query much like the above. Normally, the planner will try to
collapse the subquery into the parent, yielding
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
This usually results in a better plan than planning the subquery separately. (For example, the outer WHERE
conditions might be such that joining X to A first eliminates many rows of A, thus avoiding the need
to form the full logical output of the subquery.) But at the same time, we have increased the planning
time; here, we have a five-way join problem replacing two separate three-way join problems. Because
of the exponential growth of the number of possibilities, this makes a big difference. The planner
tries to avoid getting stuck in huge join search problems by not collapsing a subquery if more than
from_collapse_limit FROM items would result in the parent query. You can trade off planning time
against quality of plan by adjusting this run-time parameter up or down.
from_collapse_limit and join_collapse_limit are similarly named because they do almost the same
thing: one controls when the planner will “flatten out” subselects, and the other controls when
it will flatten out explicit inner joins. Typically you would either set join_collapse_limit
equal to from_collapse_limit (so that explicit joins and subselects act similarly) or set
join_collapse_limit to 1 (if you want to control join order with explicit joins). But you might set
them differently if you are trying to fine-tune the trade off between planning time and run time.
Note that loading a large number of rows using COPY is almost always faster than using INSERT, even if
PREPARE is used and multiple insertions are batched into a single transaction.
III. Server Administration
This part covers topics that are of interest to a PostgreSQL database administrator. This includes installation
of the software, setup and configuration of the server, management of users and databases, and
maintenance tasks. Anyone who runs a PostgreSQL server, even for personal use, but especially in production,
should be familiar with the topics covered in this part.
The information in this part is arranged approximately in the order in which a new user should read it.
But the chapters are self-contained and can be read individually as desired. The information in this part is
presented in a narrative fashion in topical units. Readers looking for a complete description of a particular
command should look into Part VI.
The first few chapters are written so that they can be understood without prerequisite knowledge, so that
new users who need to set up their own server can begin their exploration with this part. The rest of this
part is about tuning and management; that material assumes that the reader is familiar with the general
use of the PostgreSQL database system. Readers are encouraged to look at Part I and Part II for additional
information.
Chapter 14. Installation Instructions
This chapter describes the installation of PostgreSQL from the source code distribution. (If you are installing
a pre-packaged distribution, such as an RPM or Debian package, ignore this chapter and read the
packager’s instructions instead.)
./configure
gmake
su
gmake install
adduser postgres
mkdir /usr/local/pgsql/data
chown postgres /usr/local/pgsql/data
su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data >logfile 2>&1 &
/usr/local/pgsql/bin/createdb test
/usr/local/pgsql/bin/psql test
14.2. Requirements
In general, a modern Unix-compatible platform should be able to run PostgreSQL. The platforms that had
received specific testing at the time of release are listed in Section 14.7 below. In the doc subdirectory of
the distribution there are several platform-specific FAQ documents you might wish to consult if you are
having trouble.
The following software packages are required for building PostgreSQL:
• GNU make is required; other make programs will not work. GNU make is often installed under the
name gmake; this document will always refer to it by that name. (On some systems GNU make is the
default tool with the name make.) To test for GNU make enter
gmake --version
• Additional software is needed to build PostgreSQL on Windows. You can build PostgreSQL for NT-
based versions of Windows (like Windows XP and 2003) using MinGW; see doc/FAQ_MINGW for
details. You can also build PostgreSQL using Cygwin; see doc/FAQ_CYGWIN. A Cygwin-based build
will work on older versions of Windows, but if you have a choice, we recommend the MinGW approach.
While these are the only tool sets recommended for a complete build, it is possible to build just the C
client library (libpq) and the interactive terminal (psql) using other Windows tool sets. For details of
that see Chapter 15.
The following packages are optional. They are not required in the default configuration, but they are
needed when certain build options are enabled, as explained below.
• To build the server programming language PL/Perl you need a full Perl installation, including the
libperl library and the header files. Since PL/Perl will be a shared library, the libperl library
must be a shared library also on most platforms. This appears to be the default in recent Perl versions,
but it was not in earlier versions, and in any case it is the choice of whoever installed Perl at your site.
If you don’t have the shared library but you need one, a message like this will appear during the build
to point out this fact:
*** Cannot build PL/Perl because libperl is not a shared library.
*** You might have to rebuild your Perl installation. Refer to
*** the documentation for details.
(If you don’t follow the on-screen output you will merely notice that the PL/Perl library object,
plperl.so or similar, will not be installed.) If you see this, you will have to rebuild and install Perl
manually to be able to build PL/Perl. During the configuration process for Perl, request a shared
library.
• To build the PL/Python server programming language, you need a Python installation with the header
files and the distutils module. The distutils module is included by default with Python 1.6 and later;
users of earlier versions of Python will need to install it.
Since PL/Python will be a shared library, the libpython library must be a shared library also on most
platforms. This is not the case in a default Python installation. If after building and installing you have
a file called plpython.so (possibly a different extension), then everything went well. Otherwise you
should have seen a notice like this flying by:
*** Cannot build PL/Python because libpython is not a shared library.
*** You might have to rebuild your Python installation. Refer to
*** the documentation for details.
That means you have to rebuild (part of) your Python installation to supply this shared library.
If you have problems, run Python 2.3 or later’s configure using the --enable-shared flag. On some
operating systems you don’t have to build a shared library, but you will have to convince the PostgreSQL
build system of this. Consult the Makefile in the src/pl/plpython directory for details.
• If you want to build the PL/Tcl procedural language, you of course need a Tcl installation.
• To enable Native Language Support (NLS), that is, the ability to display a program’s messages in a
language other than English, you need an implementation of the Gettext API. Some operating systems
Chapter 14. Installation Instructions
have this built-in (e.g., Linux, NetBSD, Solaris); for other systems you can download an add-on
package from here: http://developer.postgresql.org/~petere/bsd-gettext/. If you are using the Gettext
implementation in the GNU C library then you will additionally need the GNU Gettext package for
some utility programs. For any of the other implementations you will not need it.
• Kerberos, OpenSSL, and/or PAM, if you want to support authentication or encryption using these ser-
vices.
If you are building from a CVS tree instead of using a released source package, or if you want to do
development, you also need the following packages:
• GNU Flex and Bison are needed to build a CVS checkout or if you changed the actual scanner and
parser definition files. If you need them, be sure to get Flex 2.5.4 or later and Bison 1.875 or later. Other
yacc programs can sometimes be used, but doing so requires extra effort and is not recommended. Other
lex programs will definitely not work.
If you need to get a GNU package, you can find it at your local GNU mirror site (see
http://www.gnu.org/order/ftp.html for a list) or at ftp://ftp.gnu.org/gnu/.
Also check that you have sufficient disk space. You will need about 65 MB for the source tree during
compilation and about 15 MB for the installation directory. An empty database cluster takes about 25
MB; databases take about five times the amount of space that a flat text file with the same data would take.
If you are going to run the regression tests you will temporarily need up to an extra 90 MB. Use the df
command to check free disk space.
After you have obtained the source archive, unpack it:
gunzip postgresql-8.0.0.tar.gz
tar xf postgresql-8.0.0.tar
This will create a directory postgresql-8.0.0 under the current directory with the PostgreSQL sources.
Change into that directory for the rest of the installation procedure.
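If your tar is GNU tar, the two unpacking steps can also be combined into one (a convenience, not a requirement):

```shell
# z tells GNU tar to perform the gunzip decompression itself:
tar xzf postgresql-8.0.0.tar.gz
```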
1. Make sure that your database is not updated during or after the backup. This does not affect the
integrity of the backup, but the changed data would of course not be included. If necessary, edit the
permissions in the file /usr/local/pgsql/data/pg_hba.conf (or equivalent) to disallow access
from everyone except you.
2. To back up your database installation, type:
pg_dumpall > outputfile
If you need to preserve OIDs (such as when using them as foreign keys), then use the -o option when
running pg_dumpall.
pg_dumpall does not save large objects. Check Section 22.1.4 if you need to do this.
To make the backup, you can use the pg_dumpall command from the version you are currently
running. For best results, however, try to use the pg_dumpall command from PostgreSQL 8.0.0, since
this version contains bug fixes and improvements over older versions. While this advice might seem
idiosyncratic since you haven’t installed the new version yet, it is advisable to follow it if you plan to
install the new version in parallel with the old version. In that case you can complete the installation
normally and transfer the data later. This will also decrease the downtime.
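As a sketch of that parallel-installation case (all paths below are illustrative: they assume the new binaries were installed under /usr/local/pgsql-8.0 and the old server is still listening on port 5432):

```shell
# Take the backup with the new version's pg_dumpall while the old
# server is still running; -o preserves OIDs as discussed above.
/usr/local/pgsql-8.0/bin/pg_dumpall -p 5432 -o > outputfile
```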
3. If you are installing the new version at the same location as the old one then shut down the old server,
at the latest before you install the new files:
pg_ctl stop
On systems that have PostgreSQL started at boot time, there is probably a start-up file that will
accomplish the same thing. For example, on a Red Hat Linux system one might find that
/etc/rc.d/init.d/postgresql stop
works.
Very old versions might not have pg_ctl. If you can’t find it or it doesn’t work, find out the process
ID of the old server, for example by typing
ps ax | grep postmaster
and signal it to stop this way:
kill -INT processid
4. If you are installing in the same place as the old version then it is also a good idea to move the old
installation out of the way, in case you have trouble and need to revert to it. Use a command like this:
mv /usr/local/pgsql /usr/local/pgsql.old
After you have installed PostgreSQL 8.0.0, create a new database directory and start the new server.
Remember that you must execute these commands while logged in to the special database user account
(which you already have if you are upgrading).
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data
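Once the new server is running, the data saved earlier can be restored by feeding the pg_dumpall output to psql, for example:

```shell
# outputfile is the dump written by pg_dumpall in step 2; connecting to
# template1 works because the dump script recreates the other databases itself.
/usr/local/pgsql/bin/psql -f outputfile template1
```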
1. Configuration
The first step of the installation procedure is to configure the source tree for your system and choose
the options you would like. This is done by running the configure script. For a default installation
simply enter
./configure
This script will run a number of tests to guess values for various system dependent variables and
detect some quirks of your operating system, and finally will create several files in the build tree to
record what it found. (You can also run configure in a directory outside the source tree if you want
to keep the build directory separate.)
The default configuration will build the server and utilities, as well as all client applications and
interfaces that require only a C compiler. All files will be installed under /usr/local/pgsql by
default.
You can customize the build and installation process by supplying one or more of the following
command line options to configure:
--prefix=PREFIX
Install all files under the directory PREFIX instead of /usr/local/pgsql. The actual files will
be installed into various subdirectories; no files will ever be installed directly into the PREFIX
directory.
If you have special needs, you can also customize the individual subdirectories with the following
options. However, if you leave these with their defaults, the installation will be relocatable,
meaning you can move the directory after installation. (The man and doc locations are not affected
by this.)
For relocatable installs, you might want to use configure’s --disable-rpath option. Also,
you will need to tell the operating system how to find the shared libraries.
--exec-prefix=EXEC-PREFIX
You can install architecture-dependent files under a different prefix, EXEC-PREFIX, than what
PREFIX was set to. This can be useful to share architecture-independent files between hosts.
If you omit this, then EXEC-PREFIX is set equal to PREFIX and both architecture-dependent
and independent files will be installed under the same tree, which is probably what you want.
--bindir=DIRECTORY
Specifies the directory for executable programs. The default is EXEC-PREFIX/bin, which
normally means /usr/local/pgsql/bin.
--datadir=DIRECTORY
Sets the directory for read-only data files used by the installed programs. The default is
PREFIX/share. Note that this has nothing to do with where your database files will be placed.
--libdir=DIRECTORY
The location to install libraries and dynamically loadable modules. The default is
EXEC-PREFIX/lib.
--includedir=DIRECTORY
The directory for installing C and C++ header files. The default is PREFIX/include.
--mandir=DIRECTORY
The man pages that come with PostgreSQL will be installed under this directory, in their
respective manx subdirectories. The default is PREFIX/man.
--with-docdir=DIRECTORY
--without-docdir
Documentation files, except “man” pages, will be installed into this directory. The default is
PREFIX/doc. If the option --without-docdir is specified, the documentation will not be
installed by make install. This is intended for packaging scripts that have special methods
for installing documentation.
Note: Care has been taken to make it possible to install PostgreSQL into shared installation
locations (such as /usr/local/include) without interfering with the namespace of the rest of
the system. First, the string “/postgresql” is automatically appended to datadir, sysconfdir,
and docdir, unless the fully expanded directory name already contains the string “postgres”
or “pgsql”. For example, if you choose /usr/local as prefix, the documentation will be
installed in /usr/local/doc/postgresql, but if the prefix is /opt/postgres, then it will be
in /opt/postgres/doc. The public C header files of the client interfaces are installed into
includedir and are namespace-clean. The internal header files and the server header files are
installed into private directories under includedir. See the documentation of each interface for
information about how to get at its header files. Finally, a private subdirectory will also be
created, if appropriate, under libdir for dynamically loadable modules.
--with-includes=DIRECTORIES
DIRECTORIES is a colon-separated list of directories that will be added to the list the
compiler searches for header files. If you have optional packages (such as GNU Readline) installed
in a non-standard location, you have to use this option and probably also the corresponding
--with-libraries option.
Example: --with-includes=/opt/gnu/include:/usr/sup/include.
--with-libraries=DIRECTORIES
DIRECTORIES is a colon-separated list of directories to search for libraries. You will probably
have to use this option (and the corresponding --with-includes option) if you have packages
installed in non-standard locations.
Example: --with-libraries=/opt/gnu/lib:/usr/sup/lib.
--enable-nls[=LANGUAGES]
Enables Native Language Support (NLS), that is, the ability to display a program’s messages in
a language other than English. LANGUAGES is a space-separated list of codes of the languages
that you want supported, for example --enable-nls='de fr'. (The intersection between
your list and the set of actually provided translations will be computed automatically.) If you do
not specify a list, then all available translations are installed.
To use this option, you will need an implementation of the Gettext API; see above.
--with-pgport=NUMBER
Set NUMBER as the default port number for server and clients. The default is 5432. The port can
always be changed later on, but if you specify it here then both server and clients will have the
same default compiled in, which can be very convenient. Usually the only good reason to select
a non-default value is if you intend to run multiple PostgreSQL servers on the same machine.
--with-perl
Build the PL/Perl server-side language.
--with-tclconfig=DIRECTORY
Tcl installs the file tclConfig.sh, which contains configuration information needed to build
modules interfacing to Tcl. This file is normally found automatically at a well-known location,
but if you want to use a different version of Tcl you can specify the directory in which to look
for it.
--with-krb4
--with-krb5
Build with support for Kerberos authentication. You can use either Kerberos version 4 or 5, but
not both. On many systems, the Kerberos system is not installed in a location that is searched by
default (e.g., /usr/include, /usr/lib), so you must use the options --with-includes and
--with-libraries in addition to this option. configure will check for the required header
files and libraries to make sure that your Kerberos installation is sufficient before proceeding.
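For example, if Kerberos 5 lives under a hypothetical /usr/kerberos prefix, the three options might be combined like this:

```shell
./configure --with-krb5 \
            --with-includes=/usr/kerberos/include \
            --with-libraries=/usr/kerberos/lib
```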
--with-krb-srvnam=NAME
The name of the Kerberos service principal. postgres is the default. There’s probably no reason
to change this.
--with-openssl
Build with support for SSL (encrypted) connections. This requires the OpenSSL package to be
installed. configure will check for the required header files and libraries to make sure that
your OpenSSL installation is sufficient before proceeding.
--with-pam
Build with PAM (Pluggable Authentication Modules) support.
--without-readline
Prevents use of the Readline library. This disables command-line editing and history in psql, so
it is not recommended.
--with-rendezvous
Build with Rendezvous support. This requires Rendezvous support in your operating system.
Recommended on Mac OS X.
--disable-spinlocks
Allow the build to succeed even if PostgreSQL has no CPU spinlock support for the platform.
The lack of spinlock support will result in poor performance; therefore, this option should only
be used if the build aborts and informs you that the platform lacks spinlock support. If this option
is required to build PostgreSQL on your platform, please report the problem to the PostgreSQL
developers.
--enable-thread-safety
Make the client libraries thread-safe. This allows concurrent threads in libpq and ECPG
programs to safely control their private connection handles. This option requires adequate threading
support in your operating system.
--without-zlib
Prevents use of the Zlib library. This disables support for compressed archives in pg_dump and
pg_restore. This option is only intended for those rare systems where this library is not available.
--enable-debug
Compiles all programs and libraries with debugging symbols. This means that you can run the
programs through a debugger to analyze problems. This enlarges the size of the installed
executables considerably, and on non-GCC compilers it usually also disables compiler optimization,
causing slowdowns. However, having the symbols available is extremely helpful for dealing with
any problems that may arise. Currently, this option is recommended for production installations
only if you use GCC. But you should always have it on if you are doing development work or
running a beta version.
--enable-cassert
Enables assertion checks in the server, which test for many “can’t happen” conditions. This is
invaluable for code development purposes, but the tests slow things down a little. Also, having
the tests turned on won’t necessarily enhance the stability of your server! The assertion checks
are not categorized for severity, and so what might be a relatively harmless bug will still lead
to server restarts if it triggers an assertion failure. Currently, this option is not recommended for
production use, but you should have it on for development work or when running a beta version.
--enable-depend
Enables automatic dependency tracking. With this option, the makefiles are set up so that all
affected object files will be rebuilt when any header file is changed. This is useful if you are
doing development work, but is just wasted overhead if you intend only to compile once and
install. At present, this option will work only if you use GCC.
If you prefer a C compiler different from the one configure picks, you can set the environment
variable CC to the program of your choice. By default, configure will pick gcc if available, else the
platform’s default (usually cc). Similarly, you can override the default compiler flags if needed with
the CFLAGS variable.
You can specify environment variables on the configure command line, for example:
./configure CC=/opt/bin/gcc CFLAGS='-O2 -pipe'
2. Build
To start the build, type
gmake
(Remember to use GNU make.) The build may take anywhere from 5 minutes to half an hour
depending on your hardware. The last line displayed should be
All of PostgreSQL is successfully made. Ready to install.
3. Regression Tests
If you want to test the newly built server before you install it, you can run the regression tests at this
point. The regression tests are a test suite to verify that PostgreSQL runs on your machine in the way
the developers expected it to. Type
gmake check
(This won’t work as root; do it as an unprivileged user.) Chapter 26 contains detailed information
about interpreting the test results. You can repeat this test at any later time by issuing the same
command.
4. Installing The Files
Note: If you are upgrading an existing system and are going to install the new files over the old
ones, be sure to back up your data and shut down the old server before proceeding, as explained
in Section 14.4 above.
gmake install
This will install files into the directories that were specified in step 1. Make sure that you have
appropriate permissions to write into that area. Normally you need to do this step as root. Alternatively, you
could create the target directories in advance and arrange for appropriate permissions to be granted.
You can use gmake install-strip instead of gmake install to strip the executable files and
libraries as they are installed. This will save some space. If you built with debugging support, stripping
will effectively remove the debugging support, so it should only be done if debugging is no longer
needed. install-strip tries to do a reasonable job saving space, but it does not have perfect
knowledge of how to strip every unneeded byte from an executable file, so if you want to save all the
disk space you possibly can, you will have to do manual work.
The standard installation provides all the header files needed for client application development as
well as for server-side program development, such as custom functions or data types written in C.
(Prior to PostgreSQL 8.0, a separate gmake install-all-headers command was needed for the
latter, but this step has been folded into the standard install.)
Client-only installation: If you want to install only the client applications and interface libraries,
then you can use these commands:
gmake -C src/bin install
gmake -C src/include install
gmake -C src/interfaces install
gmake -C doc install
Registering eventlog on Windows: To register a Windows eventlog library with the operating system,
issue this command after installation:
regsvr32 pgsql_library_directory/pgevent.dll
If you installed into /usr/local/pgsql or some other location that is not searched for shared libraries
by default, set the LD_LIBRARY_PATH environment variable, like so in Bourne shells:
LD_LIBRARY_PATH=/usr/local/pgsql/lib
export LD_LIBRARY_PATH
or in csh or tcsh:
setenv LD_LIBRARY_PATH /usr/local/pgsql/lib
Replace /usr/local/pgsql/lib with whatever you set --libdir to in step 1. You should put these
commands into a shell start-up file such as /etc/profile or ~/.bash_profile. Some good
information about the caveats associated with this method can be found at http://www.visi.com/~barr/ldpath.html.
On some systems it might be preferable to set the environment variable LD_RUN_PATH before building.
On Cygwin, put the library directory in the PATH or move the .dll files into the bin directory.
If in doubt, refer to the manual pages of your system (perhaps ld.so or rld). If you later on get a message
like
psql: error in loading shared libraries
libpq.so.2.1: cannot open shared object file: No such file or directory
then this step was necessary. Simply take care of it then.
If you are on BSD/OS, Linux, or SunOS 4 and you have root access you can run
/sbin/ldconfig /usr/local/pgsql/lib
(or equivalent directory) after installation to enable the run-time linker to find the shared libraries faster.
Refer to the manual page of ldconfig for more information. On FreeBSD, NetBSD, and OpenBSD the
command is
/sbin/ldconfig -m /usr/local/pgsql/lib
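To check whether the run-time linker now resolves the PostgreSQL shared libraries, you can inspect an installed client binary; ldd is the Linux tool, other systems have equivalents (e.g. otool -L on Mac OS X):

```shell
# Shows which libpq the dynamic linker will load for psql:
ldd /usr/local/pgsql/bin/psql | grep libpq
```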
If you installed into /usr/local/pgsql or some other location that is not searched for programs by
default, add /usr/local/pgsql/bin (or whatever you set --bindir to in step 1) to your PATH:
PATH=/usr/local/pgsql/bin:$PATH
export PATH
To enable your system to find the man documentation, you need to add lines like the following to a shell
start-up file unless you installed into a location that is searched by default.
MANPATH=/usr/local/pgsql/man:$MANPATH
export MANPATH
The environment variables PGHOST and PGPORT specify to client applications the host and port of the
database server, overriding the compiled-in defaults. If you are going to run client applications remotely
then it is convenient if every user that plans to use the database sets PGHOST. This is not required, however:
the settings can be communicated via command line options to most client programs.
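For example, to make every client started from a shell session default to a remote server (host name and port here are purely illustrative):

```shell
PGHOST=db.example.com
PGPORT=5432
export PGHOST PGPORT
# Any libpq client started from this shell, e.g. "psql -l", now connects
# to db.example.com port 5432 unless told otherwise on its command line.
```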
Note: If you are having problems with the installation on a supported platform, please write to
<[email protected]> or <[email protected]>, not to the people listed here.
Unsupported Platforms: The following platforms are either known not to work, or they used to work
in a fairly distant previous release. We include these here to let you know that these platforms could be
supported if given some attention.
Chapter 15. Client-Only Installation on
Windows
Although a complete PostgreSQL installation for Windows can only be built using MinGW or Cygwin,
the C client library (libpq) and the interactive terminal (psql) can be compiled using other Windows tool
sets. Makefiles are included in the source distribution for Microsoft Visual C++ and Borland C++. It
should be possible to compile the libraries manually for other configurations.
Tip: Using MinGW or Cygwin is preferred. If using one of those tool sets, see Chapter 14.
To build everything that you can on Windows using Microsoft Visual C++, change into the src directory
and type the command
nmake /f win32.mak
Among the files built is the dynamically linkable frontend library:
interfaces\libpq\Release\libpq.dll
The only file that really needs to be installed is the libpq.dll library. This file should in most cases
be placed in the WINNT\SYSTEM32 directory (or in WINDOWS\SYSTEM on a Windows 95/98/ME
system). If this file is installed using a setup program, it should be installed with version checking using the
VERSIONINFO resource included in the file, to ensure that a newer version of the library is not overwritten.
If you plan to do development using libpq on this machine, you will have to add the src\include and
src\interfaces\libpq subdirectories of the source tree to the include path in your compiler’s settings.
To use the library, you must add the libpqdll.lib file to your project. (In Visual C++, just right-click
on the project and choose to add it.)
Chapter 16. Server Run-time Environment
This chapter discusses how to set up and run the database server and its interactions with the operating
system.
$ initdb -D /usr/local/pgsql/data
Note that you must execute this command while logged into the PostgreSQL user account, which is
described in the previous section.