How does DuckDB handle large and complex queries involving aggregate functions and window functions, and what are the implications for query performance?

DuckDB expresses aggregate and window computations in the SELECT clause. Aggregate functions, usually paired with GROUP BY, collapse each group of rows into a single result, while window functions, declared with an OVER clause, compute a value for every input row. Window functions are blocking operators that need their whole input buffered, so large or complex queries of this kind can use a lot of memory and computation time. EXPLAIN shows the query plan, and EXPLAIN ANALYZE runs the query and reports how long each operator took, which helps locate the expensive steps.
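
The difference between the two kinds of function is easiest to see on a tiny table. The sketch below uses Python's built-in sqlite3 module purely as a stand-in engine so that it runs without installing anything (that substitution is an assumption of this example, as are the table name and data); the two queries use standard GROUP BY and OVER syntax, which DuckDB accepts as well.

```python
import sqlite3

# sqlite3 is only a stand-in engine here; the "sales" table is made up.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (id INTEGER, region TEXT, amount INTEGER)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "north", 10), (2, "north", 20), (3, "south", 5)],
)

# Aggregate function: one output row per group.
per_region = con.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Window function: one output row per input row, each carrying a
# running total within its own region.
running = con.execute(
    "SELECT id, region, SUM(amount) OVER (PARTITION BY region ORDER BY id) "
    "FROM sales ORDER BY id"
).fetchall()

print(per_region)
print(running)
```

The aggregate query returns two rows for three input rows, while the window query returns three rows, which is why window queries tend to be the more memory-hungry of the two.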

How can a user protect against SQL injection when using prepared statements in DuckDB, and what are the advantages of this protection method?

Prepared statements in DuckDB mitigate SQL injection by separating the SQL text from user input. Inputs are supplied through placeholders and bound as values, so they are never parsed as SQL. This protects data integrity and also simplifies application code, because queries no longer have to be assembled by string concatenation.
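
A minimal sketch of the difference between concatenation and placeholder binding follows. It uses Python's built-in sqlite3 module as a stand-in so it runs anywhere (an assumption of this example, along with the made-up "users" table); to my understanding the DuckDB Python client accepts the same '?' placeholder style with a list of parameters.

```python
import sqlite3

# sqlite3 stands in for the database engine; the table is hypothetical.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (name TEXT)")
con.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

malicious = "nobody' OR '1'='1"

# Unsafe: the input becomes part of the SQL text, so the injected
# OR clause is parsed as SQL and matches every row.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = con.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and compared literally,
# so no user has that odd name and nothing is returned.
safe_rows = con.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```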

What are the recommended methods for handling concurrency when writing to DuckDB from multiple processes?

Writing to a DuckDB database from multiple processes is not supported automatically, so the coordination has to be implemented in application logic. Possible design patterns include having each process acquire a cross-process mutex lock before opening the database, or having each process retry the connection when another process is already connected (and close the connection once its query completes). Alternatives are to run the multi-process transactions against a MySQL, PostgreSQL, or SQLite database and analyze that data with DuckDB, to route all reads and writes through a web server, or to write data to Parquet files and rely on DuckDB's ability to read multiple files.
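
The retry pattern can be sketched as a small helper. Everything here is hypothetical: `connect_with_retry`, the `open_db` callable, and the choice of `OSError` as the error to retry on are assumptions for illustration, and a real application would pass in whatever function opens its database and whatever exception its client raises when the file is locked.

```python
import time

def connect_with_retry(open_db, attempts=5, delay_seconds=0.1, retry_on=(OSError,)):
    """Call open_db() until it succeeds or the attempts run out.

    open_db is a hypothetical zero-argument callable that opens the
    database; retry_on lists the exception types treated as "busy".
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return open_db()
        except retry_on as exc:
            last_error = exc
            # Wait a little longer after each failed attempt.
            time.sleep(delay_seconds * (attempt + 1))
    raise last_error
```

Whichever process obtains the connection should close it as soon as its query completes, so that waiting processes can get in.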

Discuss how the 'DISTINCT ON' clause differs from a standard DISTINCT clause in SQL queries, with an example of when to use it in DuckDB.

The 'DISTINCT ON' clause returns one row per unique value of the listed expressions, and the ORDER BY clause determines which row is kept for each value. A plain DISTINCT, by contrast, removes rows that are duplicates across all selected columns. DISTINCT ON is useful for tasks such as selecting the most populous city per country: 'SELECT DISTINCT ON(country) city, population FROM cities ORDER BY population DESC;' keeps, for each country, the first row in descending population order.
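
To make the "first row per key in sort order" behavior concrete, here is a pure-Python sketch of what the query above computes. It illustrates the semantics only and is not how DuckDB implements the clause; the `distinct_on` helper and the city data are made up for this example.

```python
def distinct_on(rows, key, order_key, descending=False):
    """Keep the first row for each key value after sorting by order_key."""
    seen = set()
    kept = []
    for row in sorted(rows, key=order_key, reverse=descending):
        k = key(row)
        if k not in seen:
            seen.add(k)
            kept.append(row)
    return kept

# Hypothetical data, populations in thousands.
cities = [
    {"country": "NL", "city": "Amsterdam", "population": 1005},
    {"country": "NL", "city": "Rotterdam", "population": 593},
    {"country": "US", "city": "Seattle", "population": 660},
    {"country": "US", "city": "New York City", "population": 8015},
]

largest = distinct_on(
    cities,
    key=lambda r: r["country"],
    order_key=lambda r: r["population"],
    descending=True,
)
print([(r["country"], r["city"]) for r in largest])
```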

Explain the concept of ASOF JOIN in DuckDB and provide an example of its application.

ASOF JOIN in DuckDB matches each row of one table with the nearest row of another table according to an ordering column, typically a timestamp, in addition to any equality conditions. A common use is attaching prices to stock trades whose timestamps do not line up exactly. For example: 'SELECT * FROM trades t ASOF LEFT JOIN prices p ON t.symbol = p.symbol AND t.when >= p.when;' With this condition, each trade is matched with the most recent price for the same symbol recorded at or before the trade time; because it is a left join, trades with no earlier price are kept with NULL price columns.
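
The matching rule can be illustrated with a short pure-Python sketch that uses binary search to find, for each trade, the latest price at or before it. This shows the semantics of the example query, not DuckDB's implementation; the `asof_left_join` helper, the tuple layouts, and the use of plain integers as timestamps are assumptions made for brevity.

```python
from bisect import bisect_right

def asof_left_join(trades, prices):
    """trades: (symbol, when, shares); prices: (symbol, when, price)."""
    # Group price history by symbol, sorted by time.
    history = {}
    for symbol, when, price in sorted(prices, key=lambda p: (p[0], p[1])):
        times, values = history.setdefault(symbol, ([], []))
        times.append(when)
        values.append(price)

    joined = []
    for symbol, when, shares in trades:
        times, values = history.get(symbol, ([], []))
        # Number of prices with time <= trade time.
        i = bisect_right(times, when)
        price = values[i - 1] if i > 0 else None  # None plays the role of NULL
        joined.append((symbol, when, shares, price))
    return joined

prices = [("A", 1, 100.0), ("A", 3, 101.0)]
trades = [("A", 0, 10), ("A", 2, 5), ("A", 3, 7), ("B", 5, 1)]
result = asof_left_join(trades, prices)
print(result)
```

The trade at time 0 precedes every price and the trade for symbol "B" has no price history at all, so both keep an empty price, mirroring the LEFT variant of the join.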

Explain the concept and use of prepared statements in DuckDB and their advantages over direct query execution.

Prepared statements in DuckDB are parameterized queries that are prepared once and can then be executed repeatedly with different parameter values. They protect against SQL injection and avoid repeating the preparation work for queries that run many times. In the C API, values are bound with functions such as 'duckdb_bind_int32', and the statement is released with 'duckdb_destroy_prepare' when it is no longer needed. For bulk data insertion, however, they are less efficient than the Appender.

How does DuckDB facilitate the loading of multiple file types simultaneously, and what are the benefits of this capability?

DuckDB can read multiple files of a given format (CSV, Parquet, or JSON) in a single call, either through a glob pattern or through an explicit list of files. The files are treated as one input, so several data sources can be combined and analyzed in a single query.

Describe the benefits of using the time_bucket function in DuckDB for managing timestamped data.

The time_bucket function in DuckDB groups timestamps into buckets of a specified interval width, which aligns data to a common temporal granularity for analysis. It supports varying bucket widths and time zone adjustments, which makes it well suited to aggregating time series data over regular intervals.
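
The bucketing idea can be sketched in a few lines of Python: each timestamp is floored to the start of the interval it falls in, and rows are then grouped by that start. This is an illustration of the concept rather than DuckDB's implementation; the Python `time_bucket` helper below and its default origin are assumptions of this example.

```python
from collections import Counter
from datetime import datetime, timedelta

def time_bucket(width, ts, origin=datetime(2000, 1, 3)):
    """Return the start of the width-sized bucket that contains ts.

    Buckets are counted from origin (an assumed default for this sketch).
    """
    n = (ts - origin) // width  # whole buckets elapsed since origin
    return origin + n * width

readings = [
    datetime(2024, 3, 4, 9, 16),
    datetime(2024, 3, 4, 9, 18),
    datetime(2024, 3, 4, 9, 21),
]

# Count readings per 5-minute bucket.
per_bucket = Counter(time_bucket(timedelta(minutes=5), ts) for ts in readings)
print(per_bucket)
```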

What are the considerations for reading JSON data in DuckDB, including schema detection and file format handling?

When reading JSON data, DuckDB can automatically detect whether a file is newline-delimited JSON or regular JSON, and it can infer the schema from the data. Users can instead specify explicit column types to bypass automatic detection, which gives more predictable results when JSON structures are inconsistent across multiple files.

What role does the Appender class play in DuckDB's data loading process, and why is it preferred over traditional SQL INSERT statements for bulk data loading?

The Appender in DuckDB loads bulk data by adding rows directly to a table rather than through SQL INSERT statements, which minimizes per-row overhead. It is available in client APIs such as C, C++, Go, Java, and Rust, and for large datasets it is more efficient than executing many individual INSERT statements.

Duckdb Docs

This document provides documentation for DuckDB version 0.10.0-dev. It contains sections covering connecting to DuckDB, importing different data formats like CSV and JSON, working with multiple files, partitioning data, client APIs for C, and more. The document serves as a technical reference for using DuckDB's various features and interfaces.

Uploaded by

Kurnivan Noer Yusvianto

DuckDB Documentation

DuckDB version 0.10.0-dev

Generated on 2024-03-04 at 09:16 UTC

Contents

Summary

Documentation

Connect
  Connect
  Concurrency

Data Import
  Importing Data
  CSV Files
    CSV Import
    CSV Auto Detection
    Reading Faulty CSV Files
    CSV Import Tips
  JSON Files
    JSON Loading
  Multiple Files
    Reading Multiple Files
    Combining Schemas
  Parquet Files
    Reading and Writing Parquet Files
    Querying Parquet Metadata
    Parquet Encryption
    Parquet Tips
  Partitioning
    Hive Partitioning
    Partitioned Writes
  Appender
  INSERT Statements

Client APIs
  Client APIs Overview
  C
    Overview
    Startup & Shutdown
    Configuration
    Query
    Data Chunks
    Values
    Types
    Prepared Statements
    Appender
    Table Functions
    Replacement Scans
    Complete API
  C++ API
  CLI
    CLI API
    Command Line Arguments
    Dot Commands
    Output Formats
    Editing
    Autocomplete
    Syntax Highlighting
  Go
  Java JDBC API
  Julia Package
  Node.js
    Node.js API
    Node.js API
  Python
    Python API
    Data Ingestion
    Result Conversion
    Python DB API
    Relational API
    Python Function API
    Types API
    Expression API
    Spark API
    Python Client API
    Known Python Issues
  R API
  Rust API
  Swift API
  Wasm
    DuckDB Wasm
    Instantiation
    Data Ingestion
    Query
    Extensions
  ADBC API
  ODBC
    ODBC API - Overview
    ODBC API - Linux
    ODBC API - Windows
    ODBC API - macOS

Configuration
  Configuration
  Pragmas
  Secrets Manager

SQL
  SQL Introduction
  Statements
    Statements Overview
    ALTER TABLE Statement
    ALTER VIEW Statement
    ATTACH/DETACH Statement
    CALL Statement
    CHECKPOINT Statement
    COMMENT ON Statement
    COPY Statement
    CREATE MACRO Statement
    CREATE SCHEMA Statement
    CREATE SECRET Statement
    CREATE SEQUENCE Statement
    CREATE TABLE Statement
    CREATE VIEW Statement
    CREATE TYPE Statement
    DELETE Statement
    DROP Statement
    EXPORT/IMPORT DATABASE Statements
    INSERT Statement
    PIVOT Statement
    Profiling Queries
    SELECT Statement
    SET/RESET Statements
    Transaction Management
    UNPIVOT Statement
    UPDATE Statement
    USE Statement
    VACUUM Statement
  Query Syntax
    SELECT Clause
    FROM & JOIN Clauses
    WHERE Clause
    GROUP BY Clause
    GROUPING SETS
    HAVING Clause
    ORDER BY Clause
    LIMIT Clause
    SAMPLE Clause
    Unnesting
    WITH Clause
    WINDOW Clause
    QUALIFY Clause
    VALUES Clause
    FILTER Clause
    Set Operations
    Prepared Statements
  Data Types
    Data Types
    Array Type
    Bitstring Type
    Blob Type
    Boolean Type
    Date Types
    Enum Data Type
Interval Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403

List Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Literal Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Map Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
NULL Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 408
Numeric Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
Struct Data Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
Text Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
Time Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Timestamp Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415
Time Zone Reference List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
Union Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
Typecasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
CASE Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
Casting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
Collations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 440
Comparisons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
IN Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
Logical Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
Star Expression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
Subqueries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 448
Bitstring Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
Blob Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Date Format Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
Date Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
Date Part Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
Enum Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
Interval Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Lambda Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
Nested Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
Numeric Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
Pattern Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475
Regular Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478
Text Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
Time Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 491
Timestamp Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 492
Timestamp with Time Zone Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 496
Utility Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
Aggregate Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
Information Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 511
DuckDB_% Metadata Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
Keywords and Identifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 523
Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
Window Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527

Extensions 533
Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
Official Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535

Working with Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536

Versioning of Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
Arrow Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
AutoComplete Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 539
AWS Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
Azure Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 542
Excel Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
Full‑Text Search Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
httpfs (HTTP and S3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
httpfs Extension for HTTP and S3 Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
HTTP(S) Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
S3 API Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
Legacy Authentication Scheme for S3 API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555
Iceberg Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
ICU Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
inet Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
jemalloc Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
JSON Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 559
MySQL Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
PostgreSQL Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
Spatial Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578
SQLite Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588
Substrait Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 592
TPC‑DS Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 594
TPC‑H Extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 595

Guides 599

Data Import & Export 601

Data Import Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
CSV Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 601
CSV Export . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
Parquet Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
Parquet Export . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
Querying Parquet Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
HTTP Parquet Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
S3 Parquet Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 603
S3 Parquet Export . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 604
S3 Iceberg Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
S3 Express One . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
GCS Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
Cloudflare R2 Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
JSON Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 607
JSON Export . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
Excel Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
Excel Export . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 609
MySQL Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
PostgreSQL Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 610
SQLite Import . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 611
Directly Reading Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 612

Performance 615
Performance Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615

Schema . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615
Indexing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 616
Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
File Formats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
Tuning Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620
My Workload Is Slow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623

Meta Queries 625

Describe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
EXPLAIN: Inspect Query Plans . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 626
EXPLAIN ANALYZE: Profile Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 628
List Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 630
Summarize . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 631
DuckDB Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 632

ODBC 635
ODBC 101: A Duck Themed Guide to ODBC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 635

Python 643
Installing the Python Client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
Executing SQL in Python . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
Jupyter Notebooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 644
SQL on Pandas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
Import from Pandas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
Export to Pandas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
SQL on Apache Arrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 648
Import from Apache Arrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
Export to Apache Arrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 650
Relational API on Pandas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 651
Multiple Python Threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
Integration with Ibis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
Integration with Polars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663
Using fsspec Filesystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 664

SQL Features 667

Friendly SQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 667
AsOf Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 668
Full‑Text Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671

SQL Editors 673

DBeaver SQL IDE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673

Data Viewers 675

Tableau ‑ A Data Visualization Tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
CLI Charting with YouPlot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 678

Under the Hood 681

Internals 683
Overview of DuckDB Internals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683
Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 684
Execution Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687

Developer Guides 691

Building DuckDB from Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 691
Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695
Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
sqllogictest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 699
sqllogictest ‑ Debugging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 701
sqllogictest ‑ Result Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 702
sqllogictest ‑ Persistent Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 704
sqllogictest ‑ Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 705
sqllogictest ‑ Multiple Connections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 706
Catch C/C++ Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707

Acknowledgments 709

Summary

This document contains DuckDB's official documentation and guides in a single‑file easy‑to‑search form. If you find any issues, please
report them as a GitHub issue. Contributions are very welcome in the form of pull requests. If you are considering submitting a contribution
to the documentation, please consult our contributor guide.

Code repositories:

• DuckDB source code: github.com/duckdb/duckdb

• DuckDB documentation source code: github.com/duckdb/duckdb‑web

Documentation

Connect

Connect

Connect or Create a Database

To use DuckDB, you must first create a connection to a database. The exact syntax varies between the client APIs but it typically involves
passing an argument to configure persistence.

Persistence

DuckDB can operate in both persistent mode, where the data is saved to disk, and in in‑memory mode, where the entire data set is stored
in the main memory.

Persistent Database

To create or open a persistent database, set the path of the database file, e.g., my_database.duckdb, when
creating the connection. This path can point to an existing database or to a file that does not yet exist; DuckDB will open or create a
database at that location as needed. The file may have an arbitrary extension, but .db and .duckdb are two common choices.

Tip. Running on a persistent database allows spilling to disk, thus facilitating larger-than-memory workloads (i.e., out-of-core
processing).

Starting with v0.10, DuckDB's storage format is backwards-compatible, i.e., DuckDB is able to read database files produced by older
versions of DuckDB. For example, DuckDB v0.10 can read and operate on files created by the previous DuckDB version, v0.9. For more
details on DuckDB's storage format, see the storage page.

In-Memory Database

DuckDB can operate in in-memory mode. In most clients, this can be activated by passing the special value :memory:
as the database file or by omitting the database file argument. In in-memory mode, no data is persisted to disk; therefore, all data is
lost when the process finishes.

Concurrency

Handling Concurrency

DuckDB has two configurable options for concurrency:

1. One process can both read and write to the database.

2. Multiple processes can read from the database, but no processes can write (access_mode = 'READ_ONLY').

When using option 1, DuckDB supports multiple writer threads using a combination of MVCC (Multi-Version Concurrency Control) and
optimistic concurrency control (see Concurrency within a Single Process), but all within that single writer process. The reason for this
concurrency model is to allow for the caching of data in RAM for faster analytical queries, rather than going back and forth to disk during
each query. It also allows the caching of function pointers, the database catalog, and other items so that subsequent queries on the same
connection are faster.

Note. DuckDB is optimized for bulk operations, so executing many small transactions is not a primary design goal.

Concurrency within a Single Process

DuckDB supports concurrency within a single process according to the following rules. As long as there are no write conflicts, multiple
concurrent writes will succeed. Appends will never conflict, even on the same table. Multiple threads can also simultaneously update
separate tables or separate subsets of the same table. Optimistic concurrency control comes into play when two threads attempt to edit
(update or delete) the same row at the same time. In that situation, the second thread to attempt the edit will fail with a conflict error.
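
The rules above can be pictured with a toy model. The following Python sketch is only an illustration under simplifying assumptions: the names ToyTable, ToyTransaction, and ConflictError are hypothetical, rows live in a plain list, and nothing here reflects how DuckDB implements MVCC internally. It shows that appends and edits to different rows proceed, while the second transaction to edit the same row fails with a conflict error.

```python
class ConflictError(Exception):
    """Raised when a transaction edits a row another open transaction already edited."""


class ToyTable:
    """Toy model: committed values in a list, plus a map of rows with uncommitted edits."""

    def __init__(self):
        self.rows = []
        self.pending = {}  # row index -> transaction that edited the row first

    def append(self, value):
        # An append creates a new row, so it cannot collide with another edit.
        self.rows.append(value)
        return len(self.rows) - 1


class ToyTransaction:
    def __init__(self, table):
        self.table = table
        self.staged = {}  # row index -> new value, applied on commit

    def update(self, row, value):
        # The first transaction to touch a row becomes its owner until commit.
        owner = self.table.pending.setdefault(row, self)
        if owner is not self:
            # The second transaction to attempt the edit fails immediately.
            raise ConflictError(f"row {row} is being edited by another transaction")
        self.staged[row] = value

    def commit(self):
        for row, value in self.staged.items():
            self.table.rows[row] = value
            del self.table.pending[row]
        self.staged.clear()


table = ToyTable()
first_row = table.append("a")
second_row = table.append("b")

t1, t2 = ToyTransaction(table), ToyTransaction(table)
t1.update(first_row, "a1")   # t1 edits the first row
t2.update(second_row, "b2")  # a different row: no conflict

try:
    t2.update(first_row, "a2")  # same row as t1: conflict
    conflict_detected = False
except ConflictError:
    conflict_detected = True

t1.commit()
t2.commit()
```

In this toy model the conflict is raised at the moment of the second edit rather than at commit time, mirroring the rule that the second thread to attempt the edit is the one that fails.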

Writing to DuckDB from Multiple Processes

Writing to DuckDB from multiple processes is not supported automatically and is not a primary design goal (see Handling Concurrency).

If multiple processes must write to the same file, several design patterns are possible, but they need to be implemented in application
logic. For example, each process could acquire a cross-process mutex lock, then open the database in read/write mode and close it when the
query is complete. Alternatively, each process could retry the connection if another process is already connected to
the database (being sure to close the connection upon query completion). Another alternative would be to do multi-process transactions
on a MySQL, PostgreSQL, or SQLite database, and use DuckDB's MySQL, PostgreSQL, or SQLite extensions to execute analytical queries on
that data periodically.
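
As an illustration of the mutex pattern, the following Python sketch uses a lock file as the cross-process mutex. It is a minimal sketch under stated assumptions, not a recommended implementation: cross_process_lock, write_exclusively, and FakeConnection are hypothetical names, open_connection stands for whatever call your client API uses to open the database in read/write mode, and the lock-file approach is only one of several ways to build such a mutex.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def cross_process_lock(lock_path, timeout=10.0, poll_interval=0.05):
    """Hold an exclusive lock file; O_EXCL makes the creation atomic."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll_interval)  # another process holds the lock: retry
    try:
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def write_exclusively(lock_path, open_connection, work):
    """Open the database only while holding the lock, and always close it."""
    with cross_process_lock(lock_path):
        connection = open_connection()  # hypothetical: returns an object with .close()
        try:
            return work(connection)
        finally:
            connection.close()


class FakeConnection:
    """Stand-in for a real read/write database connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


connection_log = []


def record_write(connection):
    connection_log.append(connection)
    return os.path.exists(lock_file)  # True while the lock is held


with tempfile.TemporaryDirectory() as work_dir:
    lock_file = os.path.join(work_dir, "my_database.lock")
    held_during_write = write_exclusively(lock_file, FakeConnection, record_write)
    released_after_write = not os.path.exists(lock_file)
```

One caveat of this design: if a process crashes while holding the lock, the lock file stays behind and must be cleaned up by the application.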

Additional options include writing data to Parquet files and using DuckDB's ability to read multiple Parquet files, taking a similar approach
with CSV files, or creating a web server to receive requests and manage reads and writes to DuckDB.

Data Import

Importing Data

The first step to using a database system is to insert data into that system. DuckDB provides several data ingestion methods that allow you
to easily and efficiently fill up the database. In this section, we provide an overview of these methods so you can select the one that
suits your use case.

Insert Statements

Insert statements are the standard way of loading data into a database system. They are suitable for quick prototyping, but should be
avoided for bulk loading as they have significant per‑row overhead.

INSERT INTO people VALUES (1, 'Mark');

For a more detailed description, see the page on the INSERT statement.

CSV Loading

Data can be efficiently loaded from CSV files using the read_csv function or the COPY statement.

SELECT * FROM read_csv('test.csv');

You can also load data from compressed (e.g., compressed with gzip) CSV files, for example:

SELECT * FROM read_csv('test.csv.gz');

For more details, see the page on CSV loading.

Parquet Loading

Parquet files can be efficiently loaded and queried using the read_parquet function.

SELECT * FROM read_parquet('test.parquet');

For more details, see the page on Parquet loading.

JSON Loading

JSON files can be efficiently loaded and queried using the read_json_auto function.

SELECT * FROM read_json_auto('test.json');

For more details, see the page on JSON loading.

Appender

In several APIs (C, C++, Go, Java, and Rust), the Appender can be used as an alternative for bulk data loading. This class can be used to
efficiently add rows to the database system without using SQL statements.

CSV Files

CSV Import

Examples

The following examples use the flights.csv file.

-- read a CSV file from disk, auto-infer options
SELECT * FROM 'flights.csv';

-- read_csv with custom options
SELECT * FROM read_csv('flights.csv',
    delim = '|',
    header = true,
    columns = {
        'FlightDate': 'DATE',
        'UniqueCarrier': 'VARCHAR',
        'OriginCityName': 'VARCHAR',
        'DestCityName': 'VARCHAR'
    });

# read a CSV from stdin, auto-infer options
cat flights.csv | duckdb -c "SELECT * FROM read_csv('/dev/stdin')"

-- read a CSV file into a table
CREATE TABLE ontime (
    FlightDate DATE,
    UniqueCarrier VARCHAR,
    OriginCityName VARCHAR,
    DestCityName VARCHAR
);
COPY ontime FROM 'flights.csv';

-- alternatively, create a table without specifying the schema manually
CREATE TABLE ontime AS SELECT * FROM 'flights.csv';
-- we can use the FROM-first syntax to omit 'SELECT *'
CREATE TABLE ontime AS FROM 'flights.csv';

-- write the result of a query to a CSV file
COPY (SELECT * FROM ontime) TO 'flights.csv' WITH (HEADER true, DELIMITER '|');
-- if we serialize the entire table, we can simply refer to it with its name
COPY ontime TO 'flights.csv' WITH (HEADER true, DELIMITER '|');

CSV Loading

CSV loading, i.e., importing CSV files to the database, is a very common, and yet surprisingly tricky, task. While CSVs seem simple on the
surface, there are a lot of inconsistencies found within CSV files that can make loading them a challenge. CSV files come in many different
varieties, are often corrupt, and do not have a schema. The CSV reader needs to cope with all of these different situations.

The DuckDB CSV reader can automatically infer which configuration flags to use by analyzing the CSV file using the CSV sniffer. This will
work correctly in most situations, and should be the first option attempted. In rare situations where the CSV reader cannot figure out the
correct configuration, it is possible to manually configure the CSV reader to correctly parse the CSV file. See the auto detection page for
more information.

Parameters

Below are parameters that can be passed to the CSV reader. These parameters are accepted by both the COPY statement and the
read_csv function.

| Name | Description | Type | Default |
|---|---|---|---|
| all_varchar | Option to skip type detection for CSV parsing and assume all columns to be of type VARCHAR. | BOOL | false |
| allow_quoted_nulls | Option to allow the conversion of quoted values to NULL values | BOOL | true |
| auto_detect | Enables auto detection of CSV parameters. | BOOL | true |
| auto_type_candidates | This option allows you to specify the types that the sniffer will use when detecting CSV column types, e.g., SELECT * FROM read_csv('csv_file.csv', auto_type_candidates=['BIGINT', 'DATE']). The VARCHAR type is always included in the detected types (as a fallback option). | TYPE[] | ['SQLNULL', 'BOOLEAN', 'BIGINT', 'DOUBLE', 'TIME', 'DATE', 'TIMESTAMP', 'VARCHAR'] |
| columns | A struct that specifies the column names and column types contained within the CSV file (e.g., {'col1': 'INTEGER', 'col2': 'VARCHAR'}). Using this option implies that auto detection is not used. | STRUCT | (empty) |
| compression | The compression type for the file. By default this will be detected automatically from the file extension (e.g., t.csv.gz will use gzip, t.csv will use none). Options are none, gzip, zstd. | VARCHAR | auto |
| dateformat | Specifies the date format to use when parsing dates. See Date Format. | VARCHAR | (empty) |
| decimal_separator | The decimal separator of numbers. | VARCHAR | . |
| delim or sep | Specifies the string that separates columns within each row (line) of the file. | VARCHAR | , |
| escape | Specifies the string that should appear before a data character sequence that matches the quote value. | VARCHAR | " |
| filename | Whether or not an extra filename column should be included in the result. | BOOL | false |
| force_not_null | Do not match the specified columns' values against the NULL string. In the default case where the NULL string is empty, this means that empty values will be read as zero-length strings rather than NULLs. | VARCHAR[] | [] |
| header | Specifies that the file contains a header line with the names of each column in the file. | BOOL | false |
| hive_partitioning | Whether or not to interpret the path as a Hive partitioned path. | BOOL | false |
| ignore_errors | Option to ignore any parsing errors encountered - and instead ignore rows with errors. | BOOL | false |
| max_line_size | The maximum line size in bytes. | BIGINT | 2097152 |
| names | The column names as a list, see example. | VARCHAR[] | (empty) |
| new_line | Set the new line character(s) in the file. Options are '\r','\n', or '\r\n'. | VARCHAR | (empty) |
| normalize_names | Boolean value that specifies whether or not column names should be normalized, removing any non-alphanumeric characters from them. | BOOL | false |
| null_padding | If this option is enabled, when a row lacks columns, it will pad the remaining columns on the right with null values. | BOOL | false |
| nullstr | Specifies the string that represents a NULL value. | VARCHAR | (empty) |
| parallel | Whether or not the parallel CSV reader is used. | BOOL | true |
| quote | Specifies the quoting string to be used when a data value is quoted. | VARCHAR | " |
| sample_size | The number of sample rows for auto detection of parameters. | BIGINT | 20480 |
| skip | The number of lines at the top of the file to skip. | BIGINT | 0 |
| timestampformat | Specifies the date format to use when parsing timestamps. See Date Format | VARCHAR | (empty) |
| types or dtypes | The column types as either a list (by position) or a struct (by name). Example here. | VARCHAR[] or STRUCT | (empty) |
| union_by_name | Whether the columns of multiple schemas should be unified by name, rather than by position. | BOOL | false |
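
The interplay of the null_padding and ignore_errors options can be pictured with a toy line splitter. The following Python sketch is only an illustration under simplifying assumptions: read_rows is a hypothetical function, quoting and escaping are ignored, and a row counts as erroneous simply when its column count is wrong. It is not how DuckDB's CSV reader is implemented.

```python
def read_rows(lines, delim, column_count, null_padding=False, ignore_errors=False):
    """Split each line on `delim`, mimicking two of the reader options in spirit."""
    rows = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split(delim)
        if null_padding and len(fields) < column_count:
            # Pad the missing columns on the right with nulls (None).
            fields += [None] * (column_count - len(fields))
        if len(fields) != column_count:
            if ignore_errors:
                continue  # skip the faulty row instead of failing
            raise ValueError(
                f"line {line_number}: expected {column_count} columns, got {len(fields)}"
            )
        rows.append(fields)
    return rows


lines = ["1|Mark|NL", "2|Anna", "3|Li|CN|extra"]

# Short rows are padded; the row with too many columns is skipped.
padded = read_rows(lines, "|", 3, null_padding=True, ignore_errors=True)

# Without padding, both faulty rows are skipped.
skipped = read_rows(lines, "|", 3, ignore_errors=True)
```

With both options left at false, the same input raises an error on the first row whose column count does not match.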

CSV Functions

Deprecated. DuckDB v0.10.0 introduced breaking changes to the read_csv function. Namely, the read_csv function now
attempts to auto-detect the CSV parameters, making its behavior identical to the old read_csv_auto function. If you would like to
use read_csv with its old behavior, turn off the auto-detection manually by using read_csv(..., auto_detect = false).

The read_csv function automatically attempts to figure out the correct configuration of the CSV reader using the CSV sniffer. It also
automatically deduces the types of columns. If the CSV file has a header, it will use the names found in that header to name the columns.
Otherwise, the columns will be named column0, column1, column2, .... An example with the flights.csv file:

SELECT * FROM read_csv('flights.csv');

| FlightDate | UniqueCarrier | OriginCityName | DestCityName |
|---|---|---|---|
| 1988-01-01 | AA | New York, NY | Los Angeles, CA |
| 1988-01-02 | AA | New York, NY | Los Angeles, CA |
| 1988-01-03 | AA | New York, NY | Los Angeles, CA |

The path can either be a relative path (relative to the current working directory) or an absolute path.

We can use read_csv to create a persistent table as well:

CREATE TABLE ontime AS SELECT * FROM read_csv('flights.csv');

DESCRIBE ontime;

| Field | Type | Null | Key | Default | Extra |
|---|---|---|---|---|---|
| FlightDate | DATE | YES | NULL | NULL | NULL |
| UniqueCarrier | VARCHAR | YES | NULL | NULL | NULL |
| OriginCityName | VARCHAR | YES | NULL | NULL | NULL |
| DestCityName | VARCHAR | YES | NULL | NULL | NULL |

SELECT * FROM read_csv('flights.csv', sample_size = 20000);

If we set delim/sep, quote, escape, or header explicitly, we can bypass the automatic detection of this particular parameter:

SELECT * FROM read_csv('flights.csv', header = true);

Multiple files can be read at once by providing a glob or a list of files. Refer to the multiple files section for more information.

Writing Using the COPY Statement

The COPY statement can be used to load data from a CSV file into a table. This statement has the same syntax as the one used in PostgreSQL.
To load the data using the COPY statement, we must first create a table with the correct schema (which matches the order of the columns
in the CSV file and uses types that fit the values in the CSV file). COPY detects the CSV's configuration options automatically.

CREATE TABLE ontime (
    flightdate DATE,
    uniquecarrier VARCHAR,
    origincityname VARCHAR,
    destcityname VARCHAR
);
COPY ontime FROM 'flights.csv';
SELECT * FROM ontime;

| flightdate | uniquecarrier | origincityname | destcityname |
|---|---|---|---|
| 1988-01-01 | AA | New York, NY | Los Angeles, CA |
| 1988-01-02 | AA | New York, NY | Los Angeles, CA |
| 1988-01-03 | AA | New York, NY | Los Angeles, CA |

If we want to manually specify the CSV format, we can do so using the configuration options of COPY.

CREATE TABLE ontime (flightdate DATE, uniquecarrier VARCHAR, origincityname VARCHAR, destcityname VARCHAR);
COPY ontime FROM 'flights.csv' (DELIMITER '|', HEADER);
SELECT * FROM ontime;

Reading Faulty CSV Files

DuckDB supports reading erroneous CSV files. For details, see the Reading Faulty CSV Files page.

Limitations

The CSV reader only supports input files using UTF-8 character encoding. For CSV files using different encodings, use a tool such as the
iconv command-line tool to convert them to UTF-8.
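
Where iconv is not available, the same conversion can be done with a few lines of Python. The following is a minimal sketch, assuming the source encoding is known in advance (latin-1 here); convert_to_utf8 and the file names are hypothetical.

```python
import os
import tempfile


def convert_to_utf8(source_path, target_path, source_encoding, chunk_size=1 << 16):
    """Re-encode a text file to UTF-8, streaming it in chunks."""
    # newline="" keeps the original line endings untouched on both sides.
    with open(source_path, "r", encoding=source_encoding, newline="") as source, \
            open(target_path, "w", encoding="utf-8", newline="") as target:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            target.write(chunk)


original_text = "city|country\r\nMálaga|ES\r\n"

with tempfile.TemporaryDirectory() as work_dir:
    latin1_path = os.path.join(work_dir, "cities_latin1.csv")
    utf8_path = os.path.join(work_dir, "cities_utf8.csv")

    # Create a small latin-1 encoded CSV file to convert.
    with open(latin1_path, "wb") as handle:
        handle.write(original_text.encode("latin-1"))

    convert_to_utf8(latin1_path, utf8_path, "latin-1")

    with open(utf8_path, "rb") as handle:
        converted_bytes = handle.read()
```

The converted file can then be passed to read_csv like any other UTF-8 CSV file.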

CSV Auto Detection

When using read_csv, the system tries to automatically infer how to read the CSV file using the CSV sniffer. This step is necessary because
CSV files are not self‑describing and come in many different dialects. The auto‑detection works roughly as follows:

• Detect the dialect of the CSV file (delimiter, quoting rule, escape)
• Detect the types of each of the columns
• Detect whether or not the file has a header row

By default, the system will try to auto-detect all options. However, options can be individually overridden by the user. This can be useful
in case the system makes a mistake. For example, if the delimiter is chosen incorrectly, we can override it by calling read_csv with an
explicit delimiter (e.g., read_csv('file.csv', delim = '|')).

The detection works by operating on a sample of the file. The size of the sample can be modified by setting the sample_size parameter.
The default sample size is 20480 rows. Setting the sample_size parameter to -1 means the entire file is read for sampling. The way
sampling is performed depends on the type of file. If we are reading from a regular file on disk, we will jump into the file and try to sample
from different locations in the file. If we are reading from a file in which we cannot jump ‑ such as a .gz compressed CSV file or stdin ‑
samples are taken only from the beginning of the file.
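
To see why the sample size matters, consider a toy detector that only looks at the sampled rows. The following Python sketch is an illustration under simplifying assumptions: take_sample and looks_like_bigint are hypothetical helpers, the sample is always taken from the beginning of the input (as for a non-seekable file), and the type check is far cruder than the real sniffer.

```python
def take_sample(rows, sample_size=20480):
    """Return the rows a detector would inspect; -1 means the whole input."""
    if sample_size == -1:
        return list(rows)
    return list(rows[:sample_size])


def looks_like_bigint(values):
    """Toy type check: every sampled value parses as an integer."""
    for value in values:
        try:
            int(value)
        except ValueError:
            return False
    return True


# The last value is not numeric, but it lies outside a 3-row sample.
postal_codes = ["1012", "3011", "9711", "EC1A 1BB"]

guess_from_small_sample = looks_like_bigint(take_sample(postal_codes, sample_size=3))
guess_from_full_input = looks_like_bigint(take_sample(postal_codes, sample_size=-1))
```

Here the small sample suggests an integer column, while reading the entire input (sample_size = -1) reveals that it is not.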

sniff_csv Function

It is possible to run the CSV sniffer as a separate step using the sniff_csv(filename) function, which returns the detected CSV
properties as a table with a single row. The sniff_csv function accepts an optional sample_size parameter to configure the number of
rows sampled.

FROM sniff_csv('my_file.csv');
FROM sniff_csv('my_file.csv', sample_size = 1000);

| Column name | Description | Example |
|---|---|---|
| Delimiter | delimiter | , |
| Quote | quote character | " |
| Escape | escape | \ |
| NewLineDelimiter | new-line delimiter | \r\n |
| SkipRow | number of rows skipped | 1 |
| HasHeader | whether the CSV has a header | true |
| Columns | column types encoded as a LIST of STRUCTs | ({'name': 'VARCHAR', 'age': 'BIGINT'}) |
| DateFormat | date Format | %d/%m/%Y |
| TimestampFormat | timestamp Format | %Y-%m-%dT%H:%M:%S.%f |
| UserArguments | arguments used to invoke sniff_csv | sample_size = 1000 |
| Prompt | prompt ready to be used to read the CSV | FROM read_csv('my_file.csv', auto_detect=false, delim=',', ...) |

Prompt

The Prompt column contains a SQL command with the configurations detected by the sniffer.

-- use line mode in CLI to get the full command
.mode line
SELECT Prompt FROM sniff_csv('my_file.csv');

Prompt = FROM read_csv('my_file.csv', auto_detect=false, delim=',', quote='"', escape='"', new_line='\n', skip=0, header=true, columns={...});

Detection Steps

Dialect Detection

Dialect detection works by attempting to parse the samples using the set of considered values. The detected dialect
is the dialect that has (1) a consistent number of columns for each row, and (2) the highest number of columns for each row.

The following dialects are considered for automatic dialect detection.

Parameters  Considered values
delim       , | ; \t
quote       " ' (empty)
escape      " ' \ (empty)

Consider the example file flights.csv:

In this file, the dialect detection works as follows:

• If we split by |, every row is split into 4 columns
• If we split by ,, rows 2-4 are split into 3 columns, while the first row is split into 1 column
• If we split by ;, every row is split into 1 column
• If we split by \t, every row is split into 1 column

In this example, the system selects | as the delimiter. All rows are split into the same number of columns, and there is more than one
column per row, meaning the delimiter was actually found in the CSV file.
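The two selection rules above (a consistent column count, then the highest column count) can be sketched in a few lines of Python. This is only an illustration of the idea, not DuckDB's sniffer: it ignores quoting and escaping, and the SAMPLE rows are a hypothetical stand-in shaped like the flights.csv example.

```python
# Illustrative sketch only; not DuckDB's actual CSV sniffer.
CANDIDATE_DELIMS = [",", "|", ";", "\t"]

# Hypothetical sample rows shaped like the flights.csv example.
SAMPLE = [
    "FlightDate|UniqueCarrier|OriginCityName|DestCityName",
    "1988-01-01|AA|New York, NY|Los Angeles, CA",
    "1988-01-02|AA|New York, NY|Los Angeles, CA",
    "1988-01-03|AA|New York, NY|Los Angeles, CA",
]


def detect_delimiter(lines, candidates=CANDIDATE_DELIMS):
    """Return (delimiter, column_count) for the best candidate."""
    best_delim, best_cols = None, 0
    for delim in candidates:
        counts = {len(line.split(delim)) for line in lines}
        if len(counts) != 1:
            continue  # rule 1: every row must have the same column count
        cols = counts.pop()
        if cols > best_cols:  # rule 2: prefer the highest column count
            best_delim, best_cols = delim, cols
    return best_delim, best_cols


print(detect_delimiter(SAMPLE))  # → ('|', 4)
```

Splitting by a comma is rejected because the header row yields 1 column while the data rows yield 3, which mirrors the walk-through above.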

Type Detection

After detecting the dialect, the system will attempt to figure out the types of each of the columns. Note that this step
is only performed if we are calling read_csv. In case of the COPY statement the types of the table that we are copying into will be used
instead.

The type detection works by attempting to convert the values in each column to the candidate types. If the conversion is unsuccessful, the
candidate type is removed from the set of candidate types for that column. After all samples have been handled, the remaining candidate
type with the highest priority is chosen. The set of considered candidate types in order of priority is given below:

Types

BOOLEAN
BIGINT
DOUBLE
TIME
DATE
TIMESTAMP
VARCHAR

Note that everything can be cast to VARCHAR. This type has the lowest priority, i.e., columns are converted to VARCHAR only if they cannot
be cast to anything else. In flights.csv the FlightDate column will be cast to a DATE, while the other columns will be cast to VARCHAR.
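The elimination procedure described above can be sketched as follows. The conversion checks are simplified assumptions built on Python's own parsing rules (for instance, only one fixed pattern per temporal type), so they will not agree with DuckDB's casts in every case.

```python
from datetime import datetime


def _parses(value, fmt):
    """True if value matches the strptime pattern fmt."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def _is_bigint(value):
    try:
        return -2**63 <= int(value) < 2**63
    except ValueError:
        return False


def _is_double(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


# Candidate types in the priority order listed above. Each check is a
# simplified stand-in for a real cast (assumption, not DuckDB's rules).
CANDIDATES = [
    ("BOOLEAN", lambda v: v.lower() in ("true", "false")),
    ("BIGINT", _is_bigint),
    ("DOUBLE", _is_double),
    ("TIME", lambda v: _parses(v, "%H:%M:%S")),
    ("DATE", lambda v: _parses(v, "%Y-%m-%d")),
    ("TIMESTAMP", lambda v: _parses(v, "%Y-%m-%d %H:%M:%S")),
    ("VARCHAR", lambda v: True),  # everything can be kept as text
]


def detect_type(values):
    """Drop candidates that fail on any sample; return the best survivor."""
    remaining = list(CANDIDATES)
    for value in values:
        remaining = [(name, ok) for name, ok in remaining if ok(value)]
    return remaining[0][0]


print(detect_type(["1988-01-01", "1988-01-02"]))  # → DATE
print(detect_type(["AA", "AA"]))                  # → VARCHAR
```

Because VARCHAR accepts every value, the candidate list can never become empty, which is why it acts as the fallback.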

The detected types can be individually overridden using the types option. This option takes either a list of types (e.g.,
types=[INT, VARCHAR, DATE]), which overrides the types of the columns in order of occurrence in the CSV file, or a
name -> type map, which overrides the types of individual columns (e.g., types={'quarter': INT}).

Type detection can be disabled entirely using the all_varchar option. If this is set, all columns remain VARCHAR (as they
originally occur in the CSV file).

Header Detection

Header detection works by checking if the candidate header row deviates from the other rows in the file in terms of types. For example,
in flights.csv, we can see that the header row consists of only VARCHAR columns, whereas the values contain a DATE value for the
FlightDate column. As such, the system defines the first row as the header row and extracts the column names from the header row.

In files that do not have a header row, the column names are generated as column0, column1, etc.

Note that headers cannot be detected correctly if all columns are of type VARCHAR, as in this case the system cannot distinguish the header
row from the other rows in the file. In this case the system assumes the file has no header. This can be overridden using the header
option.
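The header check can be illustrated with a small Python sketch. It assumes a deliberately tiny type guesser (BIGINT, DATE or VARCHAR only) and treats the first row as a header when some column holds text in the first row but only non-text values below it. The function names are hypothetical and this is not DuckDB's implementation.

```python
from datetime import datetime


def guess_type(value):
    """Tiny type guesser used only for this sketch."""
    try:
        int(value)
        return "BIGINT"
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return "DATE"
    except ValueError:
        return "VARCHAR"


def has_header(rows):
    """True if the first row deviates from the rest in terms of types."""
    first, body = rows[0], rows[1:]
    for col, cell in enumerate(first):
        if guess_type(cell) != "VARCHAR":
            continue
        body_types = {guess_type(row[col]) for row in body}
        if body_types and "VARCHAR" not in body_types:
            return True
    return False


def column_names(rows):
    """Use the header row if detected, else generate column0, column1, ..."""
    if has_header(rows):
        return list(rows[0])
    return [f"column{i}" for i in range(len(rows[0]))]


flights = [
    ["FlightDate", "UniqueCarrier"],
    ["1988-01-01", "AA"],
    ["1988-01-02", "AA"],
]
print(column_names(flights))                   # → ['FlightDate', 'UniqueCarrier']
print(column_names([["a", "b"], ["c", "d"]]))  # → ['column0', 'column1']
```

The second call shows the all-VARCHAR case from the note above: nothing distinguishes the first row, so the sketch assumes there is no header.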

Dates and Timestamps

DuckDB supports the ISO 8601 format by default for timestamps, dates and times. Unfortunately, not all
dates and times are formatted using this standard. For that reason, the CSV reader also supports the dateformat and timestampformat
options. Using these options, the user can specify a format string that describes how the date or timestamp should be read.

As part of the auto-detection, the system tries to figure out if dates and times are stored in a different representation. This is not always
possible, as there are ambiguities in the representation. For example, the date 01-02-2000 can be parsed as either January 2nd or
February 1st. Often these ambiguities can be resolved. For example, if we later encounter the date 21-02-2000, then we know that the
format must have been DD-MM-YYYY. MM-DD-YYYY is no longer possible as there is no 21st month.

If the ambiguities cannot be resolved by looking at the data, the system has a list of preferences for which date format to use. If the system
chooses incorrectly, the user can specify the dateformat and timestampformat options manually.

The system considers the following formats for dates (dateformat). Higher entries are chosen over lower entries in case of ambiguities
(i.e., ISO 8601 is preferred over MM-DD-YYYY).

dateformat

ISO 8601
%y-%m-%d
%Y-%m-%d
%d-%m-%y
%d-%m-%Y
%m-%d-%y
%m-%d-%Y
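The disambiguation described above amounts to keeping only the formats that parse every sampled value and then taking the highest-priority survivor. A minimal Python sketch follows, assuming that ISO 8601 dates can be approximated by %Y-%m-%d and that Python's strptime is an acceptable stand-in for DuckDB's format parser.

```python
from datetime import datetime

# Priority order from the list above; ISO 8601 is approximated by
# "%Y-%m-%d" (assumption), so that pattern appears only once here.
DATE_FORMATS = [
    "%Y-%m-%d",
    "%y-%m-%d",
    "%d-%m-%y",
    "%d-%m-%Y",
    "%m-%d-%y",
    "%m-%d-%Y",
]


def _parses(value, fmt):
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def surviving_formats(values):
    """Formats that can parse every sampled value, in priority order."""
    return [f for f in DATE_FORMATS if all(_parses(v, f) for v in values)]


def detect_date_format(values):
    """Highest-priority surviving format, or None if nothing fits."""
    survivors = surviving_formats(values)
    return survivors[0] if survivors else None


print(surviving_formats(["01-02-2000"]))                 # → ['%d-%m-%Y', '%m-%d-%Y']
print(detect_date_format(["01-02-2000", "21-02-2000"]))  # → %d-%m-%Y
print(detect_date_format(["01-02-2000", "02-21-2000"]))  # → %m-%d-%Y
```

With only 01-02-2000 in the sample, both day-first and month-first formats survive and the preference order decides. Adding 21-02-2000 or 02-21-2000 removes the ambiguity.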

The system considers the following formats for timestamps (timestampformat). Higher entries are chosen over lower entries in case of
ambiguities.

timestampformat

ISO 8601
%y-%m-%d %H:%M:%S
%Y-%m-%d %H:%M:%S
%d-%m-%y %H:%M:%S
%d-%m-%Y %H:%M:%S
%m-%d-%y %I:%M:%S %p
%m-%d-%Y %I:%M:%S %p
%Y-%m-%d %H:%M:%S.%f

Reading Faulty CSV Files

Reading erroneous CSV files is possible by utilizing the ignore_errors option. With that option set, rows containing data that would
otherwise cause the CSV Parser to generate an error will be ignored.

Using the ignore_errors Option

For example, consider the following CSV file, faulty.csv:

Pedro,31
Oogie Boogie, three

If you read the CSV file, specifying that the first column is a VARCHAR and the second column is an INTEGER, loading the file would fail, as
the string three cannot be converted to an INTEGER.

For example, the following query will throw a casting error.

FROM read_csv('faulty.csv', columns = {'name': 'VARCHAR', 'age': 'INTEGER'});

However, with ignore_errors set, the second row of the file is skipped, outputting only the complete first row. For example:

FROM read_csv(
    'faulty.csv',
    columns = {'name': 'VARCHAR', 'age': 'INTEGER'},
    ignore_errors = true
);

Outputs:

name age

Pedro 31

One should note that the CSV Parser is affected by the projection pushdown optimization. Hence, if we were to select only the name column,
both rows would be considered valid, as the casting error on the age would never occur. For example:

SELECT name
FROM read_csv('faulty.csv', columns = {'name': 'VARCHAR', 'age': 'INTEGER'});

Outputs:

name

Pedro
Oogie Boogie

Retrieving Faulty CSV Lines

Being able to read faulty CSV files is important, but for many data cleaning operations, it is also necessary to know exactly which lines are
corrupted and what errors the parser discovered on them. For scenarios like these, it is possible to use DuckDB's CSV Rejects Table feature.
Note that the Rejects Table can only be used when ignore_errors is set. Currently, it stores only casting errors and
does not record errors caused by a differing number of columns.

The CSV Rejects Table returns the following information:

Column name       Type                  Description
file              VARCHAR               File path.
line              INTEGER               Line number, from the CSV File, where the error occurred.
column            INTEGER               Column number, from the CSV File, where the error occurred.
column_name       VARCHAR               Column name, from the CSV File, where the error occurred.
parsed_value      VARCHAR               The value, where the casting error happened, in a string format.
recovery_columns  STRUCT {NAME: VALUE}  An optional primary key of the CSV File.
error             VARCHAR               Exact error encountered by the parser.

Parameters

The parameters listed below are used in the read_csv function to configure the CSV Rejects Table.

Name                      Type       Default  Description
rejects_table             VARCHAR    (empty)  Name of a temporary table where the information of the faulty lines of a CSV file is stored.
rejects_limit             BIGINT     0        Upper limit on the number of faulty records from a CSV file that will be recorded in the rejects table. 0 is used when no limit should be applied.
rejects_recovery_columns  VARCHAR[]  (empty)  Column values that serve as a primary key to the CSV file. They are stored in the CSV Rejects Table to help identify the faulty tuples.

To store the information of the faulty CSV lines in a rejects table, the user must simply provide the rejects table name in the
rejects_table option. For example:

FROM read_csv(
    'faulty.csv',
    columns = {'name': 'VARCHAR', 'age': 'INTEGER'},
    rejects_table = 'rejects_table',
    ignore_errors = true
);

You can then query the rejects_table table, to retrieve information about the rejected tuples. For example:

FROM rejects_table;

Outputs:

file        line  column  column_name  parsed_value  error
faulty.csv  2     1       age          three         Could not convert string ' three' to 'INTEGER'

Additionally, the name column could also be provided as a primary key via the rejects_recovery_columns option to provide more
information about the faulty lines. For example:

FROM read_csv(
    'faulty.csv',
    columns = {'name': 'VARCHAR', 'age': 'INTEGER'},
    rejects_table = 'rejects_table',
    rejects_recovery_columns = '[name]',
    ignore_errors = true
);

Reading from the rejects_table will return:

file        line  column  column_name  parsed_value  recovery_columns          error
faulty.csv  2     1       age          three         {'name': 'Oogie Boogie'}  Could not convert string ' three' to 'INTEGER'
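To make the interaction between ignore_errors, the rejects table and the recovery columns concrete, here is a small Python sketch of the same bookkeeping. It is not how DuckDB implements the feature: read_rows and its parameters are hypothetical, rows are split on a plain comma, and Python's str and int stand in for the VARCHAR and INTEGER casts.

```python
# Hypothetical sketch of "skip faulty rows and remember why".
LINES = ["Pedro,31", "Oogie Boogie, three"]


def read_rows(lines, columns, ignore_errors=False, recovery_columns=()):
    """columns maps each column name to a caster such as str or int."""
    names = list(columns)
    rows, rejects = [], []
    for line_no, line in enumerate(lines, start=1):
        raw = dict(zip(names, line.split(",")))
        parsed, failed = {}, False
        for idx, name in enumerate(names):
            try:
                parsed[name] = columns[name](raw[name])
            except ValueError as exc:
                if not ignore_errors:
                    raise  # without ignore_errors the whole read fails
                rejects.append({
                    "line": line_no,
                    "column": idx,  # 0-based, as in the output above
                    "column_name": name,
                    "parsed_value": raw[name],
                    "recovery_columns": {c: raw[c] for c in recovery_columns},
                    "error": str(exc),
                })
                failed = True
                break
        if not failed:
            rows.append(parsed)
    return rows, rejects


rows, rejects = read_rows(
    LINES,
    {"name": str, "age": int},
    ignore_errors=True,
    recovery_columns=["name"],
)
print(rows)  # → [{'name': 'Pedro', 'age': 31}]
print(rejects[0]["line"], rejects[0]["column_name"], rejects[0]["recovery_columns"])
```

The recovery columns are copied from the raw row, so they identify the rejected tuple even though the row itself never reaches the result.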

CSV Import Tips

Below is a collection of tips to help when attempting to import complex CSV files. In the examples, we use the flights.csv file.

Override the Header Flag if the Header Is Not Correctly Detected

If a file contains only string columns the header auto-detection
might fail. Provide the header option to override this behavior.

SELECT * FROM read_csv('flights.csv', header = true);

Provide Names if the File Does Not Contain a Header

If the file does not contain a header, names will be auto-generated by default.
You can provide your own names with the names option.

SELECT * FROM read_csv('flights.csv', names = ['DateOfFlight', 'CarrierName']);

Override the Types of Specific Columns

The types flag can be used to override types of only certain columns by providing a struct of
name -> type mappings.

SELECT * FROM read_csv('flights.csv', types = {'FlightDate': 'DATE'});

Use COPY When Loading Data into a Table

The COPY statement copies data directly into a table. The CSV reader uses the schema of
the table instead of auto-detecting types from the file. This speeds up the auto-detection, and prevents mistakes from being made during
auto-detection.

COPY tbl FROM 'test.csv';

Use union_by_name When Loading Files with Different Schemas

The union_by_name option can be used to unify the schema of
files that have different or missing columns. For files that do not have certain columns, NULL values are filled in.

SELECT * FROM read_csv('flights*.csv', union_by_name = true);

JSON Files

JSON Loading

Examples

-- read a JSON file from disk, auto-infer options

SELECT * FROM 'todos.json';
-- read_json with custom options
SELECT *
FROM read_json('todos.json',
format = 'array',
columns = {userId: 'UBIGINT',
id: 'UBIGINT',
title: 'VARCHAR',
completed: 'BOOLEAN'});

-- read a JSON file from stdin, auto-infer options

cat data/json/todos.json | duckdb -c "SELECT * FROM read_json_auto('/dev/stdin')"

-- read a JSON file into a table

CREATE TABLE todos (userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN);
COPY todos FROM 'todos.json';
-- alternatively, create a table without specifying the schema manually
CREATE TABLE todos AS SELECT * FROM 'todos.json';

-- write the result of a query to a JSON file

COPY (SELECT * FROM todos) TO 'todos.json';

JSON Loading

JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects
consisting of attribute–value pairs and arrays (or other serializable values). While it is not a very efficient format for tabular data, it is very
commonly used, especially as a data interchange format.

The DuckDB JSON reader can automatically infer which configuration flags to use by analyzing the JSON file. This will work correctly in most
situations, and should be the first option attempted. In rare situations where the JSON reader cannot figure out the correct configuration,
it is possible to manually configure the JSON reader to correctly parse the JSON file.

Below are parameters that can be passed in to the JSON reader.

Parameters

Name                         Type      Default    Description
auto_detect                  BOOL      false      Whether to auto-detect the names of the keys and data types of the values automatically
columns                      STRUCT    (empty)    A struct that specifies the key names and value types contained within the JSON file (e.g., {key1: 'INTEGER', key2: 'VARCHAR'}). If auto_detect is enabled these will be inferred
compression                  VARCHAR   'auto'     The compression type for the file. By default this will be detected automatically from the file extension (e.g., t.json.gz will use gzip, t.json will use none). Options are 'none', 'gzip', 'zstd', and 'auto'.
convert_strings_to_integers  BOOL      false      Whether strings representing integer values should be converted to a numerical type.
dateformat                   VARCHAR   'iso'      Specifies the date format to use when parsing dates. See Date Format
filename                     BOOL      false      Whether or not an extra filename column should be included in the result.
format                       VARCHAR   'array'    Can be one of ['auto', 'unstructured', 'newline_delimited', 'array']
hive_partitioning            BOOL      false      Whether or not to interpret the path as a Hive partitioned path.
ignore_errors                BOOL      false      Whether to ignore parse errors (only possible when format is 'newline_delimited')
maximum_depth                BIGINT    -1         Maximum nesting depth to which the automatic schema detection detects types. Set to -1 to fully detect nested JSON types
maximum_object_size          UINTEGER  16777216   The maximum size of a JSON object (in bytes)
records                      VARCHAR   'records'  Can be one of ['auto', 'true', 'false']
sample_size                  UBIGINT   20480      Option to define number of sample objects for automatic JSON type detection. Set to -1 to scan the entire input file
timestampformat              VARCHAR   'iso'      Specifies the date format to use when parsing timestamps. See Date Format
union_by_name                BOOL      false      Whether the schemas of multiple JSON files should be unified.

When using read_json_auto, every parameter that supports auto‑detection is enabled.

Examples of Format Settings

The JSON extension can attempt to determine the format of a JSON file when setting format to auto. Here are some example JSON files
and the corresponding format settings that should be used.

In each of the below cases, the format setting was not needed, as DuckDB was able to infer it correctly, but it is included for illustrative
purposes. A query of this shape would work in each case:

SELECT *
FROM 'filename.json';

Format: newline_delimited

With format = 'newline_delimited', newline-delimited JSON can be parsed. Each line is a JSON object.

{"key1":"value1", "key2": "value1"}
{"key1":"value2", "key2": "value2"}
{"key1":"value3", "key2": "value3"}

SELECT *
FROM read_json_auto('records.json', format = 'newline_delimited');

key1 key2

value1 value1
value2 value2
value3 value3
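The difference between the 'newline_delimited' and 'array' format settings listed in the parameter table can be illustrated with Python's json module. The sketch below is an analogy, not DuckDB's reader, and the two function names are hypothetical. It also hints at why ignoring parse errors is only feasible for newline-delimited input: each line can be parsed, or skipped, independently.

```python
import json

NDJSON = (
    '{"key1":"value1", "key2": "value1"}\n'
    '{"key1":"value2", "key2": "value2"}\n'
    '{"key1":"value3", "key2": "value3"}\n'
)

ARRAY = (
    '[{"key1":"value1", "key2": "value1"},'
    ' {"key1":"value2", "key2": "value2"},'
    ' {"key1":"value3", "key2": "value3"}]'
)


def parse_newline_delimited(text, ignore_errors=False):
    """One JSON value per line; bad lines can be skipped individually."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            if not ignore_errors:
                raise
    return records


def parse_array(text):
    """The whole input is a single JSON array of objects."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a top-level JSON array")
    return data


print(parse_newline_delimited(NDJSON) == parse_array(ARRAY))  # → True
```

A malformed line in the newline-delimited input costs one record, whereas a malformed element in the array input makes the whole document unparseable in this sketch.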

Format: array

If the JSON file contains a JSON array of objects (pretty-printed or not), format = 'array' may be used.

[
{"key1":"value1", "key2": "value1"},
{"key1":"value2", "key2": "value2"},
{"key1":"value3", "key2": "value3"}
]

SELECT *
FROM read_json_auto('array.json', format = 'array');

1 1 delectus aut autem false

1 2 quis ut nam facilis et officia qui false
1 3 fugiat veniam minus false
1 4 et porro tempora true
1 5 laboriosam mollitia et enim quasi adipisci quia provident illum false

The path can either be a relative path (relative to the current working directory) or an absolute path.

We can use read_json_auto to create a persistent table as well:

CREATE TABLE todos AS
    SELECT *
    FROM read_json_auto('todos.json');
DESCRIBE todos;

column_name  column_type  null  key  default  extra
userId       UBIGINT      YES
id           UBIGINT      YES
title        VARCHAR      YES
completed    BOOLEAN      YES

If we specify the columns, we can bypass the automatic detection. Note that not all columns need to be specified:

SELECT *
FROM read_json_auto('todos.json',
columns = {userId: 'UBIGINT',
completed: 'BOOLEAN'});

Multiple files can be read at once by providing a glob or a list of files. Refer to the multiple files section for more information.

COPY Statement

The COPY statement can be used to load data from a JSON file into a table. For the COPY statement, we must first create a table with the
correct schema to load the data into. We then specify the JSON file to load from plus any configuration options separately.

CREATE TABLE todos (userId UBIGINT, id UBIGINT, title VARCHAR, completed BOOLEAN);
COPY todos FROM 'todos.json';
SELECT * FROM todos LIMIT 5;

userId  id  title                                                            completed
1       1   delectus aut autem                                               false
1       2   quis ut nam facilis et officia qui                               false
1       3   fugiat veniam minus                                              false
1       4   et porro tempora                                                 true
1       5   laboriosam mollitia et enim quasi adipisci quia provident illum  false

For more details, see the page on the COPY statement.

Multiple Files

Reading Multiple Files

DuckDB can read multiple files of different types (CSV, Parquet, JSON files) at the same time using either the glob syntax, or by providing a
list of files to read. See the combining schemas page for tips on reading files with different schemas.

CSV

-- read all files with a name ending in ".csv" in the folder "dir"
SELECT * FROM 'dir/*.csv';
-- read all files with a name ending in ".csv", two directories deep
SELECT * FROM '*/*/*.csv';
-- read all files with a name ending in ".csv", at any depth in the folder "dir"
SELECT * FROM 'dir/**/*.csv';
-- read the CSV files 'flights1.csv' and 'flights2.csv'
SELECT * FROM read_csv(['flights1.csv', 'flights2.csv']);
-- read the CSV files 'flights1.csv' and 'flights2.csv', unifying schemas by name and outputting a `filename` column
SELECT * FROM read_csv(['flights1.csv', 'flights2.csv'], union_by_name = true, filename = true);

Parquet

-- read all files that match the glob pattern

SELECT * FROM 'test/*.parquet';
-- read 3 Parquet files and treat them as a single table
SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']);
-- Read all Parquet files from 2 specific folders
SELECT * FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']);
-- read all Parquet files that match the glob pattern at any depth
SELECT * FROM read_parquet('dir/**/*.parquet');

Multi‑File Reads and Globs

DuckDB can also read a series of Parquet files and treat them as if they were a single table. Note that this only works if the Parquet files
have the same schema. You can specify which Parquet files you want to read using a list parameter, glob pattern matching syntax, or a
combination of both.

List Parameter

The read_parquet function can accept a list of filenames as the input parameter.

-- read 3 Parquet files and treat them as a single table

SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']);

Glob Syntax

Any file name input to the read_parquet function can either be an exact filename, or use a glob syntax to read multiple files
that match a pattern.

Wildcard  Description
*         matches any number of any characters (including none)
**        matches any number of subdirectories (including none)
?         matches any single character
[abc]     matches one character given in the bracket
[a-z]     matches one character from the range given in the bracket

Note that the ? wildcard in globs is not supported for reads over S3 due to HTTP encoding issues.
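The single-directory wildcards in the table behave much like shell-style patterns, so Python's fnmatch module can serve as a rough illustration of what each one matches. This is an analogy under that assumption: fnmatch is not DuckDB's glob implementation, and it has no equivalent of **. The file names are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical file names in a single directory.
NAMES = ["a.parquet", "b.parquet", "ab.parquet", "test1.parquet", "z.csv"]


def matching(pattern, names=NAMES):
    """Names that match a shell-style pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(matching("*.parquet"))      # → ['a.parquet', 'b.parquet', 'ab.parquet', 'test1.parquet']
print(matching("?.parquet"))      # → ['a.parquet', 'b.parquet']
print(matching("[ab].parquet"))   # → ['a.parquet', 'b.parquet']
print(matching("[c-z].*"))        # → ['z.csv']
```

Note that ? and the bracket forms match exactly one character, which is why ab.parquet only matches the * pattern.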

Here is an example that reads all the files that end with .parquet located in the test folder:

-- read all files that match the glob pattern

SELECT * FROM read_parquet('test/*.parquet');

List of Globs

The glob syntax and the list input parameter can be combined to scan files that meet one of multiple patterns.

-- Read all Parquet files from 2 specific folders

SELECT * FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']);

DuckDB can read multiple CSV files at the same time using either the glob syntax, or by providing a list of files to read.

Filename

The filename argument can be used to add an extra filename column to the result that indicates which row came from which file. For
example:

SELECT * FROM read_csv(['flights1.csv', 'flights2.csv'], union_by_name = true, filename = true);

FlightDate  OriginCityName  DestCityName     UniqueCarrier  filename
1988-01-01  New York, NY    Los Angeles, CA  NULL           flights1.csv
1988-01-02  New York, NY    Los Angeles, CA  NULL           flights1.csv
1988-01-03  New York, NY    Los Angeles, CA  AA             flights2.csv

Glob Function to Find Filenames

The glob pattern matching syntax can also be used to search for filenames using the glob table function. It accepts one parameter: the
path to search (which may include glob patterns).

-- Search the current directory for all files

SELECT * FROM glob('*');

file

duckdb.exe
test.csv
test.json
test.parquet
test2.csv
test2.parquet
todos.json

Combining Schemas

Examples

-- read a set of CSV files combining columns by position

SELECT * FROM read_csv('flights*.csv');
-- read a set of CSV files combining columns by name
SELECT * FROM read_csv('flights*.csv', union_by_name = true);

Combining Schemas

When reading from multiple files, we have to combine schemas from those files. That is because each file has its own schema that can
differ from the other files. DuckDB offers two ways of unifying schemas of multiple files: by column position and by column name.

By default, DuckDB reads the schema of the first file provided, and then unifies columns in subsequent files by column position. This works
correctly as long as all files have the same schema. If the schema of the files differs, you might want to use the union_by_name option
to allow DuckDB to construct the schema by reading all of the names instead.

Below is an example of how both methods work.

Union by Position

By default, DuckDB unifies the columns of these different files by position. This means that the first column in each file is combined
together, as well as the second column in each file, etc. For example, consider the following two files.

flights1.csv:

flights2.csv:

Reading the two files at the same time will produce the following result set:

FlightDate  UniqueCarrier  OriginCityName  DestCityName
1988-01-01  AA             New York, NY    Los Angeles, CA
1988-01-02  AA             New York, NY    Los Angeles, CA
1988-01-03  AA             New York, NY    Los Angeles, CA

This is equivalent to the SQL construct UNION ALL.

Union by Name

If you are processing multiple files that have different schemas, perhaps because columns have been added or renamed, it might be
desirable to unify the columns of different files by name instead. This can be done by providing the union_by_name option. For example,
consider the following two files, where flights4.csv has an extra column (UniqueCarrier).

flights3.csv:

flights4.csv:

Reading these files while unifying columns by position results in an error, as the two files have a different number of columns. When
the union_by_name option is specified, the columns are correctly unified, and any missing values are set to NULL.

SELECT * FROM read_csv(['flights3.csv', 'flights4.csv'], union_by_name = true);

FlightDate  OriginCityName  DestCityName     UniqueCarrier
1988-01-01  New York, NY    Los Angeles, CA  NULL
1988-01-02  New York, NY    Los Angeles, CA  NULL
1988-01-03  New York, NY    Los Angeles, CA  AA

This is equivalent to the SQL construct UNION ALL BY NAME.
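The two unification strategies can be contrasted with a short Python sketch. Each "file" is modelled as a pair of column names and row tuples, the data is a hypothetical stand-in shaped like flights3.csv and flights4.csv, and None plays the role of NULL. This mirrors the behaviour described above rather than DuckDB's internals.

```python
# Hypothetical in-memory stand-ins for flights3.csv and flights4.csv.
FLIGHTS3 = (
    ["FlightDate", "OriginCityName", "DestCityName"],
    [("1988-01-01", "New York, NY", "Los Angeles, CA"),
     ("1988-01-02", "New York, NY", "Los Angeles, CA")],
)
FLIGHTS4 = (
    ["FlightDate", "OriginCityName", "DestCityName", "UniqueCarrier"],
    [("1988-01-03", "New York, NY", "Los Angeles, CA", "AA")],
)


def union_by_position(*tables):
    """Take the schema of the first table; all tables must have equal width."""
    names = tables[0][0]
    rows = []
    for other_names, other_rows in tables:
        if len(other_names) != len(names):
            raise ValueError("files have a different number of columns")
        rows.extend(other_rows)
    return names, rows


def union_by_name(*tables):
    """Collect all column names, then fill missing values with None."""
    all_names = []
    for names, _ in tables:
        for name in names:
            if name not in all_names:
                all_names.append(name)
    rows = []
    for names, table_rows in tables:
        for row in table_rows:
            record = dict(zip(names, row))
            rows.append(tuple(record.get(name) for name in all_names))
    return all_names, rows


names, rows = union_by_name(FLIGHTS3, FLIGHTS4)
print(names)    # → ['FlightDate', 'OriginCityName', 'DestCityName', 'UniqueCarrier']
print(rows[0])  # → ('1988-01-01', 'New York, NY', 'Los Angeles, CA', None)
```

Calling union_by_position(FLIGHTS3, FLIGHTS4) raises an error in this sketch, matching the column-count mismatch described above.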

Parquet Files

Reading and Writing Parquet Files

Examples

-- read a single Parquet file

SELECT * FROM 'test.parquet';
-- figure out which columns/types are in a Parquet file
DESCRIBE SELECT * FROM 'test.parquet';
-- create a table from a Parquet file
CREATE TABLE test AS SELECT * FROM 'test.parquet';
-- if the file does not end in ".parquet", use the read_parquet function
SELECT * FROM read_parquet('test.parq');
-- use list parameter to read 3 Parquet files and treat them as a single table
SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet', 'file3.parquet']);
-- read all files that match the glob pattern

SELECT * FROM 'test/*.parquet';

-- read all files that match the glob pattern, and include a "filename" column
-- that specifies which file each row came from
SELECT * FROM read_parquet('test/*.parquet', filename = true);
-- use a list of globs to read all Parquet files from 2 specific folders
SELECT * FROM read_parquet(['folder1/*.parquet', 'folder2/*.parquet']);
-- read over https
SELECT * FROM read_parquet('https://some.url/some_file.parquet');
-- query the metadata of a Parquet file
SELECT * FROM parquet_metadata('test.parquet');
-- query the schema of a Parquet file
SELECT * FROM parquet_schema('test.parquet');

-- write the results of a query to a Parquet file using the default compression (Snappy)
COPY
(SELECT * FROM tbl)
TO 'result-snappy.parquet'
(FORMAT 'parquet');

-- write the results from a query to a Parquet file with specific compression and row group size
COPY
(FROM generate_series(100_000))
TO 'test.parquet'
(FORMAT 'parquet', COMPRESSION 'zstd', ROW_GROUP_SIZE 100_000);

-- export the table contents of the entire database as parquet

EXPORT DATABASE 'target_directory' (FORMAT PARQUET);

Parquet Files

Parquet files are compressed columnar files that are efficient to load and process. DuckDB provides support for both reading and writing
Parquet files in an efficient manner, as well as support for pushing filters and projections into the Parquet file scans.

Note. Parquet data sets differ based on the number of files, the size of individual files, the compression algorithm used, row group
size, etc. These have a significant effect on performance. Please consult the Performance Guide for details.

read_parquet Function

Function                  Description             Example
read_parquet(path(s), *)  Read Parquet file(s)    SELECT * FROM read_parquet('test.parquet');
parquet_scan(path(s), *)  Alias for read_parquet  SELECT * FROM parquet_scan('test.parquet');

If your file ends in .parquet, the function syntax is optional. The system will automatically infer that you are reading a Parquet file.

SELECT * FROM 'test.parquet';

Multiple files can be read at once by providing a glob or a list of files. Refer to the multiple files section for more information.

Parameters

There are a number of options exposed that can be passed to the read_parquet function or the COPY statement.

Name               Type    Default  Description
binary_as_string   BOOL    false    Parquet files generated by legacy writers do not correctly set the UTF8 flag for strings, causing string columns to be loaded as BLOB instead. Set this to true to load binary columns as strings.
encryption_config  STRUCT  -        Configuration for Parquet encryption.
filename           BOOL    false    Whether or not an extra filename column should be included in the result.
file_row_number    BOOL    false    Whether or not to include the file_row_number column.
hive_partitioning  BOOL    false    Whether or not to interpret the path as a Hive partitioned path.
union_by_name      BOOL    false    Whether the columns of multiple schemas should be unified by name, rather than by position.

Partial Reading

DuckDB supports projection pushdown into the Parquet file itself. That is to say, when querying a Parquet file, only the columns required
for the query are read. This allows you to read only the part of the Parquet file that you are interested in. This will be done automatically
by DuckDB.

DuckDB also supports filter pushdown into the Parquet reader. When you apply a filter to a column that is scanned from a Parquet file,
the filter will be pushed down into the scan, and can even be used to skip parts of the file using the built‑in zonemaps. Note that this will
depend on whether or not your Parquet file contains zonemaps.

Filter and projection pushdown provide significant performance benefits. See our blog post on this for more information.
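The idea of skipping parts of a file with zonemaps can be sketched in Python: if each row group stores the minimum and maximum of a column, a filter such as value > threshold can rule out whole row groups without reading them. The structures and function names below are hypothetical and only model the concept, not the Parquet format or DuckDB's reader.

```python
def make_row_groups(values, size):
    """Split values into row groups and record min/max statistics."""
    groups = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        groups.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return groups


def scan_greater_than(row_groups, threshold):
    """Return matching values and the number of row groups actually read."""
    matches, groups_read = [], 0
    for group in row_groups:
        if group["max"] <= threshold:
            continue  # statistics prove no value here can match
        groups_read += 1
        matches.extend(v for v in group["values"] if v > threshold)
    return matches, groups_read


groups = make_row_groups(list(range(1, 301)), size=100)
matches, groups_read = scan_greater_than(groups, 250)
print(len(matches), groups_read)  # → 50 1
```

Only one of the three row groups is read here because the data is sorted. With unsorted data the min/max ranges overlap more, so fewer row groups can be skipped.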

Inserts and Views

You can also insert the data into a table or create a table from the Parquet file directly. This will load the data from the Parquet file and
insert it into the database.

-- insert the data from the Parquet file in the table

INSERT INTO people SELECT * FROM read_parquet('test.parquet');
-- create a table directly from a Parquet file
CREATE TABLE people AS SELECT * FROM read_parquet('test.parquet');

If you wish to keep the data stored inside the Parquet file, but want to query the Parquet file directly, you can create a view over the
read_parquet function. You can then query the Parquet file as if it were a built-in table.

-- create a view over the Parquet file

CREATE VIEW people AS SELECT * FROM read_parquet('test.parquet');
-- query the Parquet file
SELECT * FROM people;

Writing to Parquet Files

DuckDB also has support for writing to Parquet files using the COPY statement syntax. See the COPY Statement page for details, including
all possible parameters for the COPY statement.

-- write a query to a snappy compressed Parquet file
COPY
    (SELECT * FROM tbl)
    TO 'result-snappy.parquet'
    (FORMAT 'parquet');

-- write "tbl" to a zstd compressed Parquet file
COPY tbl
    TO 'result-zstd.parquet'
    (FORMAT 'parquet', CODEC 'zstd');

-- write a CSV file to an uncompressed Parquet file
COPY
    'test.csv'
    TO 'result-uncompressed.parquet'
    (FORMAT 'parquet', CODEC 'uncompressed');

-- write a query to a Parquet file with ZSTD compression (same as CODEC) and row_group_size
COPY
(FROM generate_series(100_000))
TO 'row-groups-zstd.parquet'
(FORMAT PARQUET, COMPRESSION ZSTD, ROW_GROUP_SIZE 100_000);

DuckDB's EXPORT command can be used to export an entire database to a series of Parquet files. See the Export statement documentation
for more details.

-- export the table contents of the entire database as parquet

EXPORT DATABASE 'target_directory' (FORMAT PARQUET);

Encryption

DuckDB supports reading and writing encrypted Parquet files.

Installing and Loading the Parquet Extension

The support for Parquet files is enabled via extension. The parquet extension is bundled with almost all clients. However, if your client
does not bundle the parquet extension, the extension must be installed and loaded separately.

INSTALL parquet;
LOAD parquet;

Querying Parquet Metadata

Parquet Metadata

The parquet_metadata function can be used to query the metadata contained within a Parquet file, which reveals various internal
details of the Parquet file such as the statistics of the different columns. This can be useful for figuring out what kind of skipping is possible
in Parquet files, or even to obtain a quick overview of what the different columns contain.

SELECT *
FROM parquet_metadata('test.parquet');

Below is a table of the columns returned by parquet_metadata.

Field                    Type
file_name                VARCHAR
row_group_id             BIGINT
row_group_num_rows       BIGINT
row_group_num_columns    BIGINT
row_group_bytes          BIGINT
column_id                BIGINT
file_offset              BIGINT
num_values               BIGINT
path_in_schema           VARCHAR
type                     VARCHAR
stats_min                VARCHAR
stats_max                VARCHAR
stats_null_count         BIGINT
stats_distinct_count     BIGINT
stats_min_value          VARCHAR
stats_max_value          VARCHAR
compression              VARCHAR
encodings                VARCHAR
index_page_offset        BIGINT
dictionary_page_offset   BIGINT
data_page_offset         BIGINT
total_compressed_size    BIGINT
total_uncompressed_size  BIGINT
key_value_metadata       MAP(BLOB, BLOB)

Parquet Schema

The parquet_schema function can be used to query the internal schema contained within a Parquet file. Note that this is the schema
as it is contained within the metadata of the Parquet file. If you want to figure out the column names and types contained within a Parquet
file it is easier to use DESCRIBE.

-- fetch the column names and column types
DESCRIBE SELECT * FROM 'test.parquet';

-- fetch the internal schema of a Parquet file
SELECT *
FROM parquet_schema('test.parquet');

Below is a table of the columns returned by parquet_schema.

Field Type

file_name VARCHAR
name VARCHAR
type VARCHAR
type_length VARCHAR
repetition_type VARCHAR

Using the PRAGMA add_parquet_key function, named encryption keys of 128, 192, or 256 bits can be added to a session. These keys
are stored in memory.

PRAGMA add_parquet_key('key128', '0123456789112345');

PRAGMA add_parquet_key('key192', '012345678911234501234567');
PRAGMA add_parquet_key('key256', '01234567891123450123456789112345');
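
As a rough sanity check before registering a key, the following C sketch maps a key string to its size in bits. It assumes, as in the examples above, that the key is supplied as a plain string with one byte per character, so that 16, 24 and 32 characters correspond to 128, 192 and 256 bits. The helper parquet_key_bits is hypothetical and is not part of DuckDB.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a DuckDB function): returns the key size in bits
 * (128, 192 or 256) for a plain-text key, or 0 if the length is unsupported.
 * Assumes one byte per character, as in the PRAGMA examples. */
int parquet_key_bits(const char *key) {
    size_t len = strlen(key);
    if (len == 16 || len == 24 || len == 32) {
        return (int)(len * 8);
    }
    return 0;
}
```

A result of 0 signals that the string would not match any of the three key sizes listed above.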

Writing Encrypted Parquet Files After specifying the key (e.g., key256), files can be encrypted as follows:

COPY tbl TO 'tbl.parquet' (ENCRYPTION_CONFIG {footer_key: 'key256'});

Reading Encrypted Parquet Files An encrypted Parquet file using a specific key (e.g., key256) can then be read as follows:

COPY tbl FROM 'tbl.parquet' (ENCRYPTION_CONFIG {footer_key: 'key256'});

Or:

SELECT *
FROM read_parquet('tbl.parquet', encryption_config = {footer_key: 'key256'});

Limitations

DuckDB's Parquet encryption currently has the following limitations.

1. It is not compatible with the encryption of, e.g., PyArrow, until the missing details are implemented.

2. DuckDB encrypts the footer and all columns using the footer_key. The Parquet specification allows encryption of individual
columns with different keys, e.g.:

COPY tbl TO 'tbl.parquet'

(ENCRYPTION_CONFIG {
footer_key: 'key256',
column_keys: {key256: ['col0', 'col1']}
});

However, this is currently unsupported and causes the following error to be thrown:

Not implemented Error: Parquet encryption_config column_keys not yet implemented

Performance Implications

Note that encryption has some performance implications. Without encryption, reading/writing the lineitem table from TPC-H at SF1,
which is 6M rows and 15 columns, from/to a Parquet file takes 0.26 and 0.99 seconds, respectively. With encryption, this takes 0.64 and 2.21
seconds, which is approximately 2.5× and 2.2× slower than the unencrypted version, respectively.
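
The slowdown factors follow directly from the timings quoted above. The C sketch below simply divides the encrypted time by the unencrypted one; slowdown_factor is a hypothetical helper used for illustration only.

```c
#include <assert.h>

/* Hypothetical helper: ratio of the encrypted to the unencrypted run time.
 * Returns 0.0 when the baseline is not positive. */
double slowdown_factor(double encrypted_seconds, double plain_seconds) {
    if (plain_seconds <= 0.0) {
        return 0.0;
    }
    return encrypted_seconds / plain_seconds;
}
```

With the figures above, reading works out to roughly 2.46× (0.64 / 0.26) and writing to roughly 2.23× (2.21 / 0.99).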

Parquet Tips

Below is a collection of tips to help when dealing with Parquet files.

Tips for Reading Parquet Files

SELECT *
FROM read_parquet('flights*.parquet', union_by_name = true);

Tips for Writing Parquet Files

Enabling PER_THREAD_OUTPUT If the final number of Parquet files is not important, writing one file per thread can significantly improve
performance. Using a glob pattern upon read or a Hive partitioning structure are good ways to transparently handle multiple files.

COPY
(FROM generate_series(10_000_000))
TO 'test.parquet'
(FORMAT PARQUET, PER_THREAD_OUTPUT true);

Selecting a ROW_GROUP_SIZE The ROW_GROUP_SIZE parameter specifies the minimum number of rows in a Parquet row group, with
a minimum value equal to DuckDB's vector size (currently 2048, but adjustable when compiling DuckDB), and a default of 122,880. A Parquet
row group is a partition of rows, consisting of a column chunk for each column in the dataset.

Compression algorithms are only applied per row group, so the larger the row group size, the more opportunities to compress the data.
DuckDB can read Parquet row groups in parallel even within the same file and uses predicate pushdown to only scan the row groups whose
metadata ranges match the WHERE clause of the query. However, there is some overhead associated with reading the metadata in each
group. A good approach is to ensure that, within each file, the total number of row groups is at least as large as the number of CPU
threads used to query that file. More row groups beyond the thread count would improve the speed of highly selective queries, but slow
down queries that must scan the whole file, such as aggregations.

-- write a query to a Parquet file with a different row_group_size

COPY
(FROM generate_series(100_000))
TO 'row-groups.parquet'
(FORMAT PARQUET, ROW_GROUP_SIZE 100_000);
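
To judge whether a given ROW_GROUP_SIZE leaves enough row groups for parallel reads, the arithmetic can be sketched in C as below. This is an estimate only: it assumes every row group holds about ROW_GROUP_SIZE rows, which DuckDB's writer does not guarantee, and the helper names row_group_count and enough_row_groups are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: estimated number of row groups when each group holds
 * at most `row_group_size` rows (ceiling division). */
uint64_t row_group_count(uint64_t rows, uint64_t row_group_size) {
    if (row_group_size == 0) {
        return 0;
    }
    return (rows + row_group_size - 1) / row_group_size;
}

/* Rule of thumb from the text: at least one row group per query thread.
 * Returns 1 if the estimate satisfies it, 0 otherwise. */
int enough_row_groups(uint64_t rows, uint64_t row_group_size, uint64_t threads) {
    return row_group_count(rows, row_group_size) >= threads;
}
```

For example, 10,000,000 rows at the default of 122,880 rows per group give an estimated 82 row groups, whereas 100,000 rows written with ROW_GROUP_SIZE 100_000 give a single group, which is fewer than the thread count on any multi-threaded setup.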

See the Performance Guide on file formats for more tips.

Partitioning

Hive Partitioning

Examples

-- read data from a Hive partitioned data set

SELECT * FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true);
-- write a table to a Hive partitioned data set
COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month));

Hive Partitioning

Hive partitioning is a partitioning strategy that is used to split a table into multiple files based on partition keys. The files are organized
into folders. Within each folder, the partition key has a value that is determined by the name of the folder.

Below is an example of a Hive partitioned file hierarchy. The files are partitioned on two keys (year and month).

orders
├── year=2021
│   ├── month=1
│   │   ├── file1.parquet
│   │   └── file2.parquet
│   └── month=2
│       └── file3.parquet
└── year=2022
    ├── month=11
    │   ├── file4.parquet
    │   └── file5.parquet
    └── month=12
        └── file6.parquet

Files stored in this hierarchy can be read using the hive_partitioning flag.

SELECT *
FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true);

When we specify the hive_partitioning flag, the values of the columns will be read from the directories.

Filter Pushdown Filters on the partition keys are automatically pushed down into the files. This way the system skips reading files that
are not necessary to answer a query. For example, consider the following query on the above dataset:

SELECT *
FROM read_parquet('orders/*/*/*.parquet', hive_partitioning = true)
WHERE year = 2022 AND month = 11;

When executing this query, only the following files will be read:

orders
└── year=2022
    └── month=11
        ├── file4.parquet
        └── file5.parquet
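
The pruning decision above can be sketched in C under simplifying assumptions. The code below is not DuckDB's implementation; it only illustrates how key=value directory names in a path are enough to decide whether a file can satisfy WHERE year = 2022 AND month = 11 without opening it. The helpers hive_partition_value and file_matches are hypothetical, and partition values are assumed to be integers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: looks for a path component of the form "<key>=<value>"
 * and copies <value> into `out`. Returns 1 if found, 0 otherwise. */
int hive_partition_value(const char *path, const char *key, char *out, size_t out_size) {
    size_t key_len = strlen(key);
    const char *p = path;
    while (*p != '\0') {
        /* [p, end) is one path component */
        const char *end = strchr(p, '/');
        if (end == NULL) {
            end = p + strlen(p);
        }
        size_t comp_len = (size_t)(end - p);
        if (comp_len > key_len && strncmp(p, key, key_len) == 0 && p[key_len] == '=') {
            size_t val_len = comp_len - key_len - 1;
            if (val_len >= out_size) {
                return 0; /* value does not fit in the caller's buffer */
            }
            memcpy(out, p + key_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        p = (*end == '/') ? end + 1 : end;
    }
    return 0;
}

/* Returns 1 if the file may contain rows with the given year and month,
 * judging by its directory names alone; 0 means the file can be skipped. */
int file_matches(const char *path, long year, long month) {
    char buf[32];
    if (!hive_partition_value(path, "year", buf, sizeof buf) || strtol(buf, NULL, 10) != year) {
        return 0;
    }
    if (!hive_partition_value(path, "month", buf, sizeof buf) || strtol(buf, NULL, 10) != month) {
        return 0;
    }
    return 1;
}
```

Applied to the six files in the hierarchy shown earlier, file_matches(path, 2022, 11) keeps only file4.parquet and file5.parquet.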

Autodetection By default, the system tries to infer whether the provided files are in a Hive partitioned hierarchy. If so, the
hive_partitioning flag is enabled automatically. The autodetection looks at the names of the folders and searches for a
'key' = 'value' pattern. This behaviour can be overridden by setting the hive_partitioning flag manually.

Hive Types hive_types is a way to specify the logical types of the hive partitions in a struct:

SELECT *
FROM read_parquet(
'dir/**/*.parquet',
hive_partitioning = true,
hive_types = {'release': DATE, 'orders': BIGINT}
);

hive_types will be autodetected for the following types: DATE, TIMESTAMP and BIGINT. To switch off the autodetection, the flag
hive_types_autocast = 0 can be set.

Writing Partitioned Files See the Partitioned Writes section.

Partitioned Writes

Examples

-- write a table to a Hive partitioned data set of Parquet files

COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month));
-- write a table to a Hive partitioned data set of CSV files, allowing overwrites
COPY orders TO 'orders' (FORMAT CSV, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE 1);

Partitioned Writes

When the partition_by clause is specified for the COPY statement, the files are written in a Hive partitioned folder hierarchy. The target
is the name of the root directory (in the example above: orders). The files are written in‑order in the file hierarchy. Currently, one file is
written per thread to each directory.

orders
├── year=2021
│   ├── month=1
│   │   ├── data_1.parquet
│   │   └── data_2.parquet
│   └── month=2
│       └── data_1.parquet
└── year=2022
    ├── month=11
    │   ├── data_1.parquet
    │   └── data_2.parquet
    └── month=12
        └── data_1.parquet

The values of the partitions are automatically extracted from the data. Note that it can be very expensive to write many partitions as many
files will be created. The ideal partition count depends on how large your data set is.

Note. Best practice: writing data into many small partitions is expensive. It is generally recommended to have at least 100 MB of
data per partition.

Overwriting By default the partitioned write will not allow overwriting existing directories. Use the OVERWRITE_OR_IGNORE option
to allow overwriting an existing directory.

Filename Pattern By default, files will be named data_0.parquet or data_0.csv. With the flag FILENAME_PATTERN a pattern
with {i} or {uuid} can be defined to create specific filenames:

• {i} will be replaced by an index

• {uuid} will be replaced by a 128-bit UUID

-- write a table to a Hive partitioned data set of .parquet files, with an index in the filename
COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE, FILENAME_PATTERN "orders_{i}");
-- write a table to a Hive partitioned data set of .parquet files, with unique filenames
COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE, FILENAME_PATTERN "file_{uuid}");
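
As an illustration of the {i} placeholder, the C sketch below substitutes an index into a pattern string. It is a simplified model, not DuckDB's implementation: expand_index_pattern is a hypothetical helper, it handles only the first {i}, and it ignores {uuid} and the file extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `pattern` into `out` with the first "{i}"
 * replaced by `index`. A pattern without "{i}" is copied unchanged.
 * Returns 0 on success, -1 if `out` is too small for the result. */
int expand_index_pattern(const char *pattern, unsigned index, char *out, size_t out_size) {
    const char *marker = strstr(pattern, "{i}");
    int written;
    if (marker == NULL) {
        written = snprintf(out, out_size, "%s", pattern);
    } else {
        /* prefix before "{i}", then the index, then the rest after "{i}" */
        written = snprintf(out, out_size, "%.*s%u%s",
                           (int)(marker - pattern), pattern, index, marker + 3);
    }
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

Under these assumptions, the pattern "orders_{i}" with index 3 expands to "orders_3".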

Appender

The Appender can be used to load bulk data into a DuckDB database. It is currently available in the C, C++, Go, Java, and Rust APIs. The
Appender is tied to a connection, and will use the transaction context of that connection when appending. An Appender always appends
to a single table in the database file.

In the C++ API, the Appender works as follows:

DuckDB db;
Connection con(db);
// create the table
con.Query("CREATE TABLE people (id INTEGER, name VARCHAR)");
// initialize the appender
Appender appender(con, "people");

The AppendRow function is the easiest way of appending data. It uses recursive templates to allow you to put all the values of a single row
within one function call, as follows:

appender.AppendRow(1, "Mark");

Rows can also be individually constructed using the BeginRow, EndRow and Append methods. This is done internally by AppendRow,
and hence has the same performance characteristics.

appender.BeginRow();
appender.Append<int32_t>(2);
appender.Append<string>("Hannes");
appender.EndRow();

Any values added to the appender are cached prior to being inserted into the database system for performance reasons. That means that,
while appending, the rows might not be immediately visible in the system. The cache is automatically flushed when the appender goes
out of scope or when appender.Close() is called. The cache can also be manually flushed using the appender.Flush() method.
After either Flush or Close is called, all the data has been written to the database system.

Date, Time and Timestamps

While numbers and strings are rather self-explanatory, dates, times and timestamps require some explanation. They can be directly
appended using the methods provided by duckdb::Date, duckdb::Time or duckdb::Timestamp. They can also be appended using
the internal duckdb::Value type; however, this adds some overhead and should be avoided if possible.

Below is a short example:

con.Query("CREATE TABLE dates (d DATE, t TIME, ts TIMESTAMP)");

Appender appender(con, "dates");

// construct the values using the Date/Time/Timestamp types
// (this is the most efficient approach)
appender.AppendRow(
Date::FromDate(1992, 1, 1),
Time::FromTime(1, 1, 1, 0),
Timestamp::FromDatetime(Date::FromDate(1992, 1, 1), Time::FromTime(1, 1, 1, 0))
);
// construct duckdb::Value objects
appender.AppendRow(
Value::DATE(1992, 1, 1),
Value::TIME(1, 1, 1, 0),
Value::TIMESTAMP(1992, 1, 1, 1, 1, 1, 0)
);

Handling Constraint Violations

If the appender encounters a PRIMARY KEY conflict or a UNIQUE constraint violation, it fails and returns the following error:

Constraint Error: PRIMARY KEY or UNIQUE constraint violated: duplicate key "..."

In this case, the entire append operation fails and no rows are inserted.

Appender Support in Other Clients

The appender is also available in the following client APIs:

• C
• Go
• JDBC (Java)
• Rust

INSERT Statements

INSERT statements are the standard way of loading data into a relational database. When using INSERT statements, the values are
supplied row‑by‑row. While simple, there is significant overhead involved in parsing and processing individual INSERT statements. This
makes lots of individual row‑by‑row insertions very inefficient for bulk insertion.

Note. Best practice: as a rule of thumb, avoid using lots of individual row-by-row INSERT statements when inserting more than a
few rows (i.e., avoid using INSERT statements as part of a loop). When bulk inserting data, try to maximize the amount of data that
is inserted per statement.

If you must use INSERT statements to load data in a loop, avoid executing the statements in auto‑commit mode. After every commit,
the database is required to sync the changes made to disk to ensure no data is lost. In auto‑commit mode every single statement will be
wrapped in a separate transaction, meaning fsync will be called for every statement. This is typically unnecessary when bulk loading and
will significantly slow down your program.

Note. If you absolutely must use INSERT statements in a loop to load data, wrap them in calls to BEGIN TRANSACTION and
COMMIT.

Syntax

An example of using INSERT INTO to load data in a table is as follows:

CREATE TABLE people (id INTEGER, name VARCHAR);

INSERT INTO people VALUES (1, 'Mark'), (2, 'Hannes');

A more detailed description, together with a syntax diagram, can be found on the page on the INSERT statement.

Client APIs

Client APIs Overview

There are various client APIs for DuckDB:

• C
• C++
• Go by marcboeker
• Java
• Julia
• Node.js
• Python
• R
• Rust
• WebAssembly/Wasm
• ADBC API
• ODBC API

Additionally, there is a standalone Command Line Interface (CLI) client.

There are also contributed third‑party DuckDB wrappers, which currently do not have an official documentation page:

• C# by Giorgi
• Common Lisp by ak‑coram
• Crystal by amauryt
• Ruby by suketa
• Zig by karlseguin

Overview

DuckDB implements a custom C API loosely modelled on the SQLite C API. The API is contained in the duckdb.h header.
Continue to Startup & Shutdown to get started, or check out the Full API overview.

We also provide a SQLite API wrapper, which means that if your application is programmed against the SQLite C API, you can re-link to
DuckDB and it should continue working. See the sqlite_api_wrapper folder in our source repository for more information.

Installation

The DuckDB C API can be installed as part of the libduckdb packages. Please see the installation page for details.

Startup & Shutdown

To use DuckDB, you must first initialize a duckdb_database handle using duckdb_open(). duckdb_open() takes as a parameter the
path of the database file to read from and write to. The special value NULL (nullptr) can be used to create an in-memory database.
Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the process).

With the duckdb_database handle, you can create one or more duckdb_connection handles using duckdb_connect(). While individual
connections are thread-safe, they will be locked during querying. It is therefore recommended that each thread uses its own connection
to allow for the best parallel performance.

All duckdb_connections have to be explicitly disconnected with duckdb_disconnect(), and the duckdb_database has to be
explicitly closed with duckdb_close(), to avoid leaking memory and file handles.

Example

duckdb_database db;
duckdb_connection con;

if (duckdb_open(NULL, &db) == DuckDBError) {
    // handle error
}
if (duckdb_connect(db, &con) == DuckDBError) {
    // handle error
}

// run queries...

// cleanup
duckdb_disconnect(&con);
duckdb_close(&db);

API Reference

duckdb_state duckdb_open(const char *path, duckdb_database *out_database);
duckdb_state duckdb_open_ext(const char *path, duckdb_database *out_database, duckdb_config config, char **out_error);
void duckdb_close(duckdb_database *database);
duckdb_state duckdb_connect(duckdb_database database, duckdb_connection *out_connection);
void duckdb_interrupt(duckdb_connection connection);
duckdb_query_progress_type duckdb_query_progress(duckdb_connection connection);
void duckdb_disconnect(duckdb_connection *connection);
const char *duckdb_library_version();

duckdb_open Creates a new database or opens an existing database file stored at the given path. If no path is given a new in‑memory
database is created instead. The instantiated database should be closed with 'duckdb_close'.

Syntax

duckdb_state duckdb_open(
const char *path,
duckdb_database *out_database
);

Parameters

• path

Path to the database file on disk, or nullptr or :memory: to open an in‑memory database.

• out_database

The result database object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_open_ext Extended version of duckdb_open. Creates a new database or opens an existing database file stored at the given
path. The instantiated database should be closed with 'duckdb_close'.

Syntax
duckdb_state duckdb_open_ext(
const char *path,
duckdb_database *out_database,
duckdb_config config,
char **out_error
);

Parameters

• path

Path to the database file on disk, or nullptr or :memory: to open an in‑memory database.

• out_database

The result database object.

• config

(Optional) configuration used to start up the database system.

• out_error

If set and the function returns DuckDBError, this will contain the reason why the start‑up failed. Note that the error must be freed using
duckdb_free.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_close Closes the specified database and de‑allocates all memory allocated for that database. This should be called after you
are done with any database allocated through duckdb_open or duckdb_open_ext. Note that failing to call duckdb_close (in case
of e.g., a program crash) will not cause data corruption. Still, it is recommended to always correctly close a database object after you are
done with it.

Syntax
void duckdb_close(
duckdb_database *database
);

Parameters

• database

The database object to shut down.

duckdb_connect Opens a connection to a database. Connections are required to query the database, and store transactional state
associated with the connection. The instantiated connection should be closed using 'duckdb_disconnect'.

Syntax

duckdb_state duckdb_connect(
duckdb_database database,
duckdb_connection *out_connection
);

Parameters

• database

The database file to connect to.

• out_connection

The result connection object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_interrupt Interrupts the running query.

Syntax

void duckdb_interrupt(
duckdb_connection connection
);

Parameters

• connection

The connection to interrupt

duckdb_query_progress Gets the progress of the running query.

Syntax

duckdb_query_progress_type duckdb_query_progress(
duckdb_connection connection
);

Parameters

• connection

The working connection

• returns

-1 if there is no progress, or the progress as a percentage.

duckdb_disconnect Closes the specified connection and de‑allocates all memory allocated for that connection.

Syntax

void duckdb_disconnect(
duckdb_connection *connection
);

Parameters

• connection

The connection to close.

duckdb_library_version Returns the version of the linked DuckDB, with a version postfix for dev versions

Usually used for developing C extensions that must return this for a compatibility check.

Syntax

const char *duckdb_library_version(

);

Configuration

Configuration options can be provided to change different settings of the database system. Note that many of these settings can be changed
later on using PRAGMA statements as well. The configuration object should be created, filled with values and passed to duckdb_open_ext.

Example

duckdb_database db;
duckdb_config config;

// create the configuration object
if (duckdb_create_config(&config) == DuckDBError) {
    // handle error
}
// set some configuration options
duckdb_set_config(config, "access_mode", "READ_WRITE"); // or READ_ONLY
duckdb_set_config(config, "threads", "8");
duckdb_set_config(config, "max_memory", "8GB");
duckdb_set_config(config, "default_order", "DESC");

// open the database using the configuration
if (duckdb_open_ext(NULL, &db, config, NULL) == DuckDBError) {
    // handle error
}
// cleanup the configuration object
duckdb_destroy_config(&config);

// run queries...

// cleanup
duckdb_close(&db);

API Reference

duckdb_state duckdb_create_config(duckdb_config *out_config);

size_t duckdb_config_count();
duckdb_state duckdb_get_config_flag(size_t index, const char **out_name, const char **out_description);
duckdb_state duckdb_set_config(duckdb_config config, const char *name, const char *option);
void duckdb_destroy_config(duckdb_config *config);

duckdb_create_config Initializes an empty configuration object that can be used to provide start-up options for the DuckDB
instance through duckdb_open_ext. The duckdb_config must be destroyed using 'duckdb_destroy_config'.

This will always succeed unless there is a malloc failure.

Syntax
duckdb_state duckdb_create_config(
duckdb_config *out_config
);

Parameters

• out_config

The result configuration object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_config_count This returns the total number of configuration options available for usage with duckdb_get_config_flag.

This should not be called in a loop as it internally loops over all the options.

Syntax
size_t duckdb_config_count(

);

Parameters

• returns

The number of config options available.

duckdb_get_config_flag Obtains a human‑readable name and description of a specific configuration option. This can be used to
e.g. display configuration options. This will succeed unless index is out of range (i.e., >= duckdb_config_count).

The result name or description MUST NOT be freed.

Syntax
duckdb_state duckdb_get_config_flag(
size_t index,
const char **out_name,
const char **out_description
);

Parameters

• index

The index of the configuration option (between 0 and duckdb_config_count)

• out_name

A name of the configuration flag.

• out_description

A description of the configuration flag.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_set_config Sets the specified option for the specified configuration. The configuration option is indicated by name. To
obtain a list of config options, see duckdb_get_config_flag.

In the source code, configuration options are defined in config.cpp.

This can fail if either the name is invalid, or if the value provided for the option is invalid.

Syntax

duckdb_state duckdb_set_config(
duckdb_config config,
const char *name,
const char *option
);

Parameters

• config

The configuration object to set the option on.

• name

The name of the configuration flag to set.

• option

The value to set the configuration flag to.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_config Destroys the specified configuration object and de‑allocates all memory allocated for the object.

Syntax

void duckdb_destroy_config(
duckdb_config *config
);

Parameters

• config

The configuration object to destroy.

Query

The duckdb_query method allows SQL queries to be run in DuckDB from C. This method takes two parameters, a (null‑terminated) SQL
query string and a duckdb_result result pointer. The result pointer may be NULL if the application is not interested in the result set
or if the query produces no result. After the result is consumed, the duckdb_destroy_result method should be used to clean up the
result.

Elements can be extracted from the duckdb_result object using a variety of methods. The duckdb_column_count and
duckdb_row_count methods can be used to extract the number of columns and the number of rows, respectively. duckdb_column_name and
duckdb_column_type can be used to extract the names and types of individual columns.

Example

duckdb_state state;
duckdb_result result;

// create a table
state = duckdb_query(con, "CREATE TABLE integers (i INTEGER, j INTEGER);", NULL);
if (state == DuckDBError) {
    // handle error
}
// insert three rows into the table
state = duckdb_query(con, "INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL);", NULL);
if (state == DuckDBError) {
    // handle error
}
// query rows again
state = duckdb_query(con, "SELECT * FROM integers", &result);
if (state == DuckDBError) {
    // handle error
}
// handle the result
// ...

// destroy the result after we are done with it
duckdb_destroy_result(&result);

Value Extraction

Values can be extracted using either the duckdb_column_data/duckdb_nullmask_data functions, or using the duckdb_value
convenience functions. The duckdb_column_data/duckdb_nullmask_data functions directly hand you a pointer to the result
arrays in columnar format, and can therefore be very fast. The duckdb_value functions perform bounds‑ and type‑checking, and will
automatically cast values to the desired type. This makes them more convenient and easier to use, at the expense of being slower.

See the Types page for more information.

Note. For optimal performance, use duckdb_column_data and duckdb_nullmask_data to extract data from the query
result. The duckdb_value functions perform internal type‑checking, bounds‑checking and casting which makes them slower.

duckdb_value Below is an example that prints the above result to CSV format using the duckdb_value_varchar function. Note
that the function is generic: we do not need to know about the types of the individual result columns.

// print the above result to CSV format using `duckdb_value_varchar`
idx_t row_count = duckdb_row_count(&result);
idx_t column_count = duckdb_column_count(&result);
for (idx_t row = 0; row < row_count; row++) {
    for (idx_t col = 0; col < column_count; col++) {
        if (col > 0) printf(",");
        // the returned string is released with duckdb_free below
        char *str_val = duckdb_value_varchar(&result, col, row);
        printf("%s", str_val);
        duckdb_free(str_val);
    }
    printf("\n");
}

duckdb_column_data Below is an example that prints the above result to CSV format using the duckdb_column_data function.
Note that the function is NOT generic: we do need to know exactly what the types of the result columns are.

int32_t *i_data = (int32_t *) duckdb_column_data(&result, 0);
int32_t *j_data = (int32_t *) duckdb_column_data(&result, 1);
bool *i_mask = duckdb_nullmask_data(&result, 0);
bool *j_mask = duckdb_nullmask_data(&result, 1);
idx_t row_count = duckdb_row_count(&result);
for (idx_t row = 0; row < row_count; row++) {
    if (i_mask[row]) {
        printf("NULL");
    } else {
        printf("%d", i_data[row]);
    }
    printf(",");
    if (j_mask[row]) {
        printf("NULL");
    } else {
        printf("%d", j_data[row]);
    }
    printf("\n");
}

Note. Warning: when using duckdb_column_data, be careful that the type matches exactly what you expect it to be. As the code
directly accesses an internal array, there is no type-checking. Accessing a DUCKDB_TYPE_INTEGER column as if it were a
DUCKDB_TYPE_BIGINT column will provide unpredictable results!
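
The effect of a type mismatch can be reproduced without DuckDB. The C sketch below reads the same raw buffer with two different element widths; the buffer merely stands in for the array returned by duckdb_column_data, and the helpers read_as_int32 and read_as_int64 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads element `row` as a 32-bit integer, which is the
 * correct width for a DUCKDB_TYPE_INTEGER column. */
int32_t read_as_int32(const void *column, size_t row) {
    int32_t value;
    memcpy(&value, (const char *)column + row * sizeof(int32_t), sizeof value);
    return value;
}

/* Hypothetical helper: reads element `row` as a 64-bit integer, as code would
 * if it wrongly assumed a DUCKDB_TYPE_BIGINT column. The caller must make sure
 * the buffer is large enough for the wider read. */
int64_t read_as_int64(const void *column, size_t row) {
    int64_t value;
    memcpy(&value, (const char *)column + row * sizeof(int64_t), sizeof value);
    return value;
}
```

For a buffer holding the 32-bit values 3, 5, 7, 0, read_as_int32 returns 3 for row 0, while read_as_int64 combines the first two entries into a single number that no longer equals 3; the exact value depends on the platform's byte order.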

API Reference

duckdb_state duckdb_query(duckdb_connection connection, const char *query, duckdb_result *out_result);

void duckdb_destroy_result(duckdb_result *result);
const char *duckdb_column_name(duckdb_result *result, idx_t col);
duckdb_type duckdb_column_type(duckdb_result *result, idx_t col);
duckdb_statement_type duckdb_result_statement_type(duckdb_result result);
duckdb_logical_type duckdb_column_logical_type(duckdb_result *result, idx_t col);
idx_t duckdb_column_count(duckdb_result *result);
idx_t duckdb_row_count(duckdb_result *result);
idx_t duckdb_rows_changed(duckdb_result *result);
void *duckdb_column_data(duckdb_result *result, idx_t col);
bool *duckdb_nullmask_data(duckdb_result *result, idx_t col);
const char *duckdb_result_error(duckdb_result *result);

duckdb_query Executes a SQL query within a connection and stores the full (materialized) result in the out_result pointer. If the query
fails to execute, DuckDBError is returned and the error message can be retrieved by calling duckdb_result_error.

Note that after running duckdb_query, duckdb_destroy_result must be called on the result object even if the query fails,
otherwise the error stored within the result will not be freed correctly.

Syntax

duckdb_state duckdb_query(
duckdb_connection connection,
const char *query,
duckdb_result *out_result
);

Parameters

• connection

The connection to perform the query in.

• query

The SQL query to run.

• out_result

The query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_result Closes the result and de-allocates all memory allocated for that result.

Syntax

void duckdb_destroy_result(
duckdb_result *result
);

Parameters

• result

The result to destroy.

duckdb_column_name Returns the column name of the specified column. The result should not need to be freed; the column names
will automatically be destroyed when the result is destroyed.

Returns NULL if the column is out of range.

Syntax

const char *duckdb_column_name(

duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column name from.

• col

The column index.

• returns

The column name of the specified column.

duckdb_column_type Returns the column type of the specified column.

Returns DUCKDB_TYPE_INVALID if the column is out of range.

Syntax
duckdb_type duckdb_column_type(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column type from.

• col

The column index.

• returns

The column type of the specified column.

duckdb_result_statement_type Returns the statement type of the statement that was executed

Syntax
duckdb_statement_type duckdb_result_statement_type(
duckdb_result result
);

Parameters

• result

The result object to fetch the statement type from.

• returns

duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID

duckdb_column_logical_type Returns the logical column type of the specified column.

The return type of this call should be destroyed with duckdb_destroy_logical_type.

Returns NULL if the column is out of range.

Syntax
duckdb_logical_type duckdb_column_logical_type(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column type from.

Parameters

• result

The result object.

• returns

The number of rows changed.

duckdb_column_data DEPRECATED: Prefer using duckdb_result_get_chunk instead.

Returns the data of a specific column of a result in columnar format.

The function returns a dense array which contains the result data. The exact type stored in the array depends on the corresponding
duckdb_type (as provided by duckdb_column_type). For the exact type by which the data should be accessed, see the comments in the types
section or the DUCKDB_TYPE enum.

For example, for a column of type DUCKDB_TYPE_INTEGER, rows can be accessed in the following manner:

int32_t *data = (int32_t *) duckdb_column_data(&result, 0);
printf("Data for row %d: %d\n", row, data[row]);

Syntax
void *duckdb_column_data(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column data from.

• col

The column index.

• returns

The column data of the specified column.

duckdb_nullmask_data DEPRECATED: Prefer using duckdb_result_get_chunk instead.

Returns the nullmask of a specific column of a result in columnar format. The nullmask indicates for every row whether or not the
corresponding row is NULL. If a row is NULL, the values present in the array provided by duckdb_column_data are undefined.

int32_t *data = (int32_t *) duckdb_column_data(&result, 0);

bool *nullmask = duckdb_nullmask_data(&result, 0);
if (nullmask[row]) {
printf("Data for row %d: NULL\n", row);
} else {
printf("Data for row %d: %d\n", row, data[row]);
}

Syntax

bool *duckdb_nullmask_data(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the nullmask from.

• col

The column index.

• returns

The nullmask of the specified column.

duckdb_result_error Returns the error message contained within the result. The error is only set if duckdb_query returns
DuckDBError.

The result of this function must not be freed. It will be cleaned up when duckdb_destroy_result is called.

Syntax

const char *duckdb_result_error(
duckdb_result *result
);

Parameters

• result

The result object to fetch the error from.

• returns

The error of the result.

Data Chunks

Data chunks represent a horizontal slice of a table. Each chunk holds a number of vectors, each of which can hold up to VECTOR_SIZE rows.
The vector size can be obtained through the duckdb_vector_size function and is configurable, but is usually set to 2048.

Data chunks and vectors are what DuckDB uses natively to store and represent data. For this reason, the data chunk interface is the most
efficient way of interfacing with DuckDB. Be aware, however, that correctly interfacing with DuckDB using the data chunk API does require
knowledge of DuckDB's internal vector format.

The primary manner of interfacing with data chunks is by obtaining the internal vectors of the data chunk using the
duckdb_data_chunk_get_vector method, and subsequently using the duckdb_vector_get_data and duckdb_vector_get_validity
methods to read the internal data and the validity mask of the vector. For composite types (list and struct vectors),
duckdb_list_vector_get_child and duckdb_struct_vector_get_child should be used to read child vectors.

API Reference

duckdb_data_chunk duckdb_create_data_chunk(duckdb_logical_type *types, idx_t column_count);

void duckdb_destroy_data_chunk(duckdb_data_chunk *chunk);
void duckdb_data_chunk_reset(duckdb_data_chunk chunk);
idx_t duckdb_data_chunk_get_column_count(duckdb_data_chunk chunk);
duckdb_vector duckdb_data_chunk_get_vector(duckdb_data_chunk chunk, idx_t col_idx);
idx_t duckdb_data_chunk_get_size(duckdb_data_chunk chunk);
void duckdb_data_chunk_set_size(duckdb_data_chunk chunk, idx_t size);

Vector Interface

duckdb_logical_type duckdb_vector_get_column_type(duckdb_vector vector);

void *duckdb_vector_get_data(duckdb_vector vector);
uint64_t *duckdb_vector_get_validity(duckdb_vector vector);
void duckdb_vector_ensure_validity_writable(duckdb_vector vector);
void duckdb_vector_assign_string_element(duckdb_vector vector, idx_t index, const char *str);
void duckdb_vector_assign_string_element_len(duckdb_vector vector, idx_t index, const char *str, idx_t str_len);
duckdb_vector duckdb_list_vector_get_child(duckdb_vector vector);
idx_t duckdb_list_vector_get_size(duckdb_vector vector);
duckdb_state duckdb_list_vector_set_size(duckdb_vector vector, idx_t size);
duckdb_state duckdb_list_vector_reserve(duckdb_vector vector, idx_t required_capacity);
duckdb_vector duckdb_struct_vector_get_child(duckdb_vector vector, idx_t index);
duckdb_vector duckdb_array_vector_get_child(duckdb_vector vector);

Validity Mask Functions

bool duckdb_validity_row_is_valid(uint64_t *validity, idx_t row);

void duckdb_validity_set_row_validity(uint64_t *validity, idx_t row, bool valid);
void duckdb_validity_set_row_invalid(uint64_t *validity, idx_t row);
void duckdb_validity_set_row_valid(uint64_t *validity, idx_t row);

duckdb_create_data_chunk Creates an empty DataChunk with the specified set of types.

Note that the result must be destroyed with duckdb_destroy_data_chunk.

Syntax

duckdb_data_chunk duckdb_create_data_chunk(
duckdb_logical_type *types,
idx_t column_count
);

Parameters

• types

An array of types of the data chunk.

• column_count

The number of columns.

• returns

The data chunk.

duckdb_destroy_data_chunk Destroys the data chunk and de‑allocates all memory allocated for that chunk.

Syntax
void duckdb_destroy_data_chunk(
duckdb_data_chunk *chunk
);

Parameters

• chunk

The data chunk to destroy.

duckdb_data_chunk_reset Resets a data chunk, clearing the validity masks and setting the cardinality of the data chunk to 0.

Syntax
void duckdb_data_chunk_reset(
duckdb_data_chunk chunk
);

Parameters

• chunk

The data chunk to reset.

duckdb_data_chunk_get_column_count Retrieves the number of columns in a data chunk.

Syntax
idx_t duckdb_data_chunk_get_column_count(
duckdb_data_chunk chunk
);

Parameters

• chunk

The data chunk to get the data from

• returns

The number of columns in the data chunk

duckdb_data_chunk_get_vector Retrieves the vector at the specified column index in the data chunk.

The pointer to the vector is valid for as long as the chunk is alive. It does NOT need to be destroyed.

The result must be destroyed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_vector_get_column_type(
duckdb_vector vector
);

Parameters

• vector

The vector to get the data from

• returns

The type of the vector

duckdb_vector_get_data Retrieves the data pointer of the vector.

The data pointer can be used to read or write values from the vector. How to read or write values depends on the type of the vector.

Syntax

void *duckdb_vector_get_data(
duckdb_vector vector
);

Parameters

• vector

The vector to get the data from

• returns

The data pointer

duckdb_vector_get_validity Retrieves the validity mask pointer of the specified vector.

If all values are valid, this function MIGHT return NULL!

The validity mask is a bitset that signifies null‑ness within the data chunk. It is a series of uint64_t values, where each uint64_t value contains
validity for 64 tuples. The bit is set to 1 if the value is valid (i.e., not NULL) or 0 if the value is invalid (i.e., NULL).

Validity of a specific value can be obtained like this:

idx_t entry_idx = row_idx / 64;
idx_t idx_in_entry = row_idx % 64;
bool is_valid = validity_mask[entry_idx] & ((uint64_t) 1 << idx_in_entry);

Alternatively, the (slower) duckdb_validity_row_is_valid function can be used.
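
The bit arithmetic above can be wrapped in a small helper. The following is a minimal sketch in plain C that does not call into DuckDB. The function name row_is_valid is hypothetical, and treating a NULL mask pointer as "all rows valid" is an assumption based on the note that duckdb_vector_get_validity might return NULL when all values are valid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the DuckDB API): returns true when the
 * bit for row_idx is set, i.e. the value is not NULL. A NULL mask pointer
 * is treated as "all rows valid". */
static bool row_is_valid(const uint64_t *validity_mask, uint64_t row_idx) {
    if (validity_mask == NULL) {
        return true;
    }
    uint64_t entry_idx = row_idx / 64;    /* which uint64_t holds the bit */
    uint64_t idx_in_entry = row_idx % 64; /* bit position inside that entry */
    return (validity_mask[entry_idx] >> idx_in_entry) & 1u;
}
```

Shifting the mask entry down instead of shifting a constant up avoids having to widen the literal 1 to 64 bits before the shift.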

Syntax

uint64_t *duckdb_vector_get_validity(
duckdb_vector vector
);

Parameters

• vector

The vector to get the data from

• returns

The pointer to the validity mask, or NULL if no validity mask is present

duckdb_vector_ensure_validity_writable Ensures the validity mask is writable by allocating it.

After this function is called, duckdb_vector_get_validity will ALWAYS return non‑NULL. This allows null values to be written to the
vector, regardless of whether a validity mask was present before.

Syntax

void duckdb_vector_ensure_validity_writable(
duckdb_vector vector
);

Parameters

• vector

The vector to alter

duckdb_vector_assign_string_element Assigns a string element in the vector at the specified location.

Syntax

void duckdb_vector_assign_string_element(
duckdb_vector vector,
idx_t index,
const char *str
);

Parameters

• vector

The vector to alter

• index

The row position in the vector to assign the string to

• str

The null‑terminated string

duckdb_vector_assign_string_element_len Assigns a string element in the vector at the specified location. You may also
use this function to assign BLOBs.

Syntax

void duckdb_vector_assign_string_element_len(
duckdb_vector vector,
idx_t index,
const char *str,
idx_t str_len
);

Parameters

• vector

The vector to alter

• index

The row position in the vector to assign the string to

• str

The string

• str_len

The length of the string (in bytes)

duckdb_list_vector_get_child Retrieves the child vector of a list vector.

The resulting vector is valid as long as the parent vector is valid.

Syntax

• size

The size of the child list.

• returns

The duckdb state. Returns DuckDBError if the vector is nullptr.

duckdb_list_vector_reserve Sets the total capacity of the underlying child‑vector of a list.

Syntax

duckdb_state duckdb_list_vector_reserve(
duckdb_vector vector,
idx_t required_capacity
);

Parameters

• vector

The list vector.

• required_capacity

The total capacity to reserve.

• returns

The duckdb state. Returns DuckDBError if the vector is nullptr.

duckdb_struct_vector_get_child Retrieves the child vector of a struct vector.

The resulting vector is valid as long as the parent vector is valid.

Syntax

duckdb_vector duckdb_struct_vector_get_child(
duckdb_vector vector,
idx_t index
);

Parameters

• vector

The vector

• index

The child index

• returns

The child vector

duckdb_array_vector_get_child Retrieves the child vector of an array vector.

The resulting vector is valid as long as the parent vector is valid. The resulting vector has the size of the parent vector multiplied by the
array size.

Syntax

duckdb_vector duckdb_array_vector_get_child(
duckdb_vector vector
);

Parameters

• vector

The vector

• returns

The child vector

duckdb_validity_row_is_valid Returns whether or not a row is valid (i.e., not NULL) in the given validity mask.

Syntax

bool duckdb_validity_row_is_valid(
uint64_t *validity,
idx_t row
);

Parameters

• validity

The validity mask, as obtained through duckdb_vector_get_validity

• row

The row index

• returns

true if the row is valid, false otherwise

duckdb_validity_set_row_validity In a validity mask, sets a specific row to either valid or invalid.

Note that duckdb_vector_ensure_validity_writable should be called before calling duckdb_vector_get_validity, to
ensure that there is a validity mask to write to.

Syntax
void duckdb_validity_set_row_validity(
uint64_t *validity,
idx_t row,
bool valid
);

Parameters

• validity

The validity mask, as obtained through duckdb_vector_get_validity.

• row

The row index

• valid

Whether or not to set the row to valid, or invalid
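
To make the effect on the bitset concrete, the sketch below shows in plain C what setting or clearing a row's validity amounts to, given the layout described for duckdb_vector_get_validity (one bit per row, 1 meaning valid). The function set_row_validity is a hypothetical stand-in written for illustration and is not DuckDB's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for duckdb_validity_set_row_validity, written
 * against the bitset layout described in this chapter (bit = 1 is valid).
 * The caller must pass a non-NULL, writable mask. */
static void set_row_validity(uint64_t *validity, uint64_t row, bool valid) {
    uint64_t bit = (uint64_t) 1 << (row % 64);
    if (valid) {
        validity[row / 64] |= bit;  /* mark the row as not NULL */
    } else {
        validity[row / 64] &= ~bit; /* mark the row as NULL */
    }
}
```

Because the helper dereferences the mask unconditionally, it also illustrates why a writable mask has to exist before any row is marked invalid.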

duckdb_validity_set_row_invalid In a validity mask, sets a specific row to invalid.

Equivalent to duckdb_validity_set_row_validity with valid set to false.

Syntax
void duckdb_validity_set_row_invalid(
uint64_t *validity,
idx_t row
);

Parameters

• validity

The validity mask

• row

The row index

duckdb_validity_set_row_valid In a validity mask, sets a specific row to valid.

Equivalent to duckdb_validity_set_row_validity with valid set to true.

Syntax
void duckdb_validity_set_row_valid(
uint64_t *validity,
idx_t row
);

Parameters

• validity

The validity mask

• row

The row index

Values

The value class represents a single value of any type.

API Reference

void duckdb_destroy_value(duckdb_value *value);

Parameters

• value

The bigint value

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_create_struct_value Creates a struct value from a type and an array of values

Syntax
duckdb_value duckdb_create_struct_value(
duckdb_logical_type type,
duckdb_value *values
);

Parameters

• type

The type of the struct

• values

The values for the struct fields

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_create_list_value Creates a list value from a type and an array of values of length value_count

Syntax

duckdb_value duckdb_create_list_value(
duckdb_logical_type type,
duckdb_value *values,
idx_t value_count
);

Parameters

• type

The type of the list

• values

The values for the list

• value_count

The number of values in the list

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_create_array_value Creates an array value from a type and an array of values of length value_count

Syntax

duckdb_value duckdb_create_array_value(
duckdb_logical_type type,
duckdb_value *values,
idx_t value_count
);

Parameters

• type

The type of the array

• values

The values for the array

• value_count

The number of values in the array

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_get_varchar Obtains a string representation of the given value. The result must be destroyed with duckdb_free.

Syntax
char *duckdb_get_varchar(
duckdb_value value
);

Parameters

• value

The value

• returns

The string value. This must be destroyed with duckdb_free.

duckdb_get_int64 Obtains an int64 of the given value.

Syntax
int64_t duckdb_get_int64(
duckdb_value value
);

Parameters

• value

The value

• returns

The int64 value, or 0 if no conversion is possible

Types

DuckDB is a strongly typed database system. As such, every column has a single type specified. This type is constant over the entire column.
That is to say, a column that is labeled as an INTEGER column will only contain INTEGER values.

DuckDB also supports columns of composite types. For example, it is possible to define an array of integers (INT[]). It is also possible to
define types as arbitrary structs (ROW(i INTEGER, j VARCHAR)). For that reason, native DuckDB type objects are not mere enums,
but a class that can potentially be nested.

Types in the C API are modeled using an enum (duckdb_type) and a complex class (duckdb_logical_type). For most primitive
types, e.g., integers or varchars, the enum is sufficient. For more complex types, such as lists, structs or decimals, the logical type must be
used.

typedef enum DUCKDB_TYPE {

DUCKDB_TYPE_INVALID,
DUCKDB_TYPE_BOOLEAN,
DUCKDB_TYPE_TINYINT,
DUCKDB_TYPE_SMALLINT,
DUCKDB_TYPE_INTEGER,
DUCKDB_TYPE_BIGINT,
DUCKDB_TYPE_UTINYINT,
DUCKDB_TYPE_USMALLINT,
DUCKDB_TYPE_UINTEGER,
DUCKDB_TYPE_UBIGINT,
DUCKDB_TYPE_FLOAT,
DUCKDB_TYPE_DOUBLE,
DUCKDB_TYPE_TIMESTAMP,
DUCKDB_TYPE_DATE,
DUCKDB_TYPE_TIME,
DUCKDB_TYPE_INTERVAL,
DUCKDB_TYPE_HUGEINT,
DUCKDB_TYPE_VARCHAR,
DUCKDB_TYPE_BLOB,
DUCKDB_TYPE_DECIMAL,
DUCKDB_TYPE_TIMESTAMP_S,
DUCKDB_TYPE_TIMESTAMP_MS,
DUCKDB_TYPE_TIMESTAMP_NS,
DUCKDB_TYPE_ENUM,
DUCKDB_TYPE_LIST,
DUCKDB_TYPE_STRUCT,
DUCKDB_TYPE_MAP,
DUCKDB_TYPE_UUID,
DUCKDB_TYPE_UNION,
DUCKDB_TYPE_BIT,
} duckdb_type;

Functions

The enum type of a column in the result can be obtained using the duckdb_column_type function. The logical type of a column can be
obtained using the duckdb_column_logical_type function.

duckdb_value The duckdb_value functions will auto-cast values as required. For example, it is no problem to use
duckdb_value_double on a column of type INTEGER (which would otherwise be read with duckdb_value_int32). The value will be
auto-cast and returned as a double. Note that in certain cases the cast may fail. For example, this can happen if we request a
duckdb_value_int8 and the value does not fit within an int8 value. In this case, a default value will be returned (usually 0 or
nullptr). The same default value will also be returned if the corresponding value is NULL.

The duckdb_value_is_null function can be used to check if a specific value is NULL or not.

The exception to the auto‑cast rule is the duckdb_value_varchar_internal function. This function does not auto‑cast and only
works for VARCHAR columns. The reason this function exists is that the result does not need to be freed.
Note. duckdb_value_varchar and duckdb_value_blob require the result to be de‑allocated using duckdb_free.
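
The consequence of the default-value rule is easiest to see in code. The sketch below emulates, in plain C and without calling DuckDB, the behavior described above for a narrowing accessor. The function value_as_int8 and its parameters are hypothetical and only illustrate the rule; this is not how DuckDB implements the cast.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical emulation (not DuckDB code) of a narrowing accessor:
 * returns the value as int8_t, or the default 0 when the value is NULL
 * or does not fit into an int8_t. */
static int8_t value_as_int8(int64_t value, bool is_null) {
    if (is_null || value < INT8_MIN || value > INT8_MAX) {
        return 0;
    }
    return (int8_t) value;
}
```

A return value of 0 is therefore ambiguous: it can mean a stored 0, a failed cast, or a NULL. Checking duckdb_value_is_null first resolves the NULL case.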

duckdb_result_get_chunk The duckdb_result_get_chunk function can be used to read data chunks from a DuckDB result
set, and is the most efficient way of reading data from a DuckDB result using the C API. It is also the only way of reading data of certain types
from a DuckDB result. For example, the duckdb_value functions do not support structural reading of composite types (lists or structs)
or more complex types like enums and decimals.

For more information about data chunks, see the documentation on data chunks.

API Reference

duckdb_data_chunk duckdb_result_get_chunk(duckdb_result result, idx_t chunk_index);

bool duckdb_result_is_streaming(duckdb_result result);
idx_t duckdb_result_chunk_count(duckdb_result result);
duckdb_result_type duckdb_result_return_type(duckdb_result result);

Date/Time/Timestamp Helpers

duckdb_date_struct duckdb_from_date(duckdb_date date);

duckdb_date duckdb_to_date(duckdb_date_struct date);
bool duckdb_is_finite_date(duckdb_date date);
duckdb_time_struct duckdb_from_time(duckdb_time time);
duckdb_time_tz duckdb_create_time_tz(int64_t micros, int32_t offset);
duckdb_time_tz_struct duckdb_from_time_tz(duckdb_time_tz micros);
duckdb_time duckdb_to_time(duckdb_time_struct time);
duckdb_timestamp_struct duckdb_from_timestamp(duckdb_timestamp ts);
duckdb_timestamp duckdb_to_timestamp(duckdb_timestamp_struct ts);
bool duckdb_is_finite_timestamp(duckdb_timestamp ts);

Hugeint Helpers

double duckdb_hugeint_to_double(duckdb_hugeint val);

duckdb_hugeint duckdb_double_to_hugeint(double val);

Decimal Helpers

duckdb_decimal duckdb_double_to_decimal(double val, uint8_t width, uint8_t scale);

double duckdb_decimal_to_double(duckdb_decimal val);

Logical Type Interface

duckdb_logical_type duckdb_create_logical_type(duckdb_type type);

char *duckdb_logical_type_get_alias(duckdb_logical_type type);
duckdb_logical_type duckdb_create_list_type(duckdb_logical_type type);
duckdb_logical_type duckdb_create_array_type(duckdb_logical_type type, idx_t array_size);
duckdb_logical_type duckdb_create_map_type(duckdb_logical_type key_type, duckdb_logical_type value_type);
duckdb_logical_type duckdb_create_union_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
duckdb_logical_type duckdb_create_struct_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
duckdb_logical_type duckdb_create_enum_type(const char **member_names, idx_t member_count);
duckdb_logical_type duckdb_create_decimal_type(uint8_t width, uint8_t scale);
duckdb_type duckdb_get_type_id(duckdb_logical_type type);
uint8_t duckdb_decimal_width(duckdb_logical_type type);
uint8_t duckdb_decimal_scale(duckdb_logical_type type);
duckdb_type duckdb_decimal_internal_type(duckdb_logical_type type);
duckdb_type duckdb_enum_internal_type(duckdb_logical_type type);
uint32_t duckdb_enum_dictionary_size(duckdb_logical_type type);
char *duckdb_enum_dictionary_value(duckdb_logical_type type, idx_t index);
duckdb_logical_type duckdb_list_type_child_type(duckdb_logical_type type);
duckdb_logical_type duckdb_array_type_child_type(duckdb_logical_type type);
idx_t duckdb_array_type_array_size(duckdb_logical_type type);
duckdb_logical_type duckdb_map_type_key_type(duckdb_logical_type type);
duckdb_logical_type duckdb_map_type_value_type(duckdb_logical_type type);
idx_t duckdb_struct_type_child_count(duckdb_logical_type type);

char *duckdb_struct_type_child_name(duckdb_logical_type type, idx_t index);

duckdb_logical_type duckdb_struct_type_child_type(duckdb_logical_type type, idx_t index);
idx_t duckdb_union_type_member_count(duckdb_logical_type type);
char *duckdb_union_type_member_name(duckdb_logical_type type, idx_t index);
duckdb_logical_type duckdb_union_type_member_type(duckdb_logical_type type, idx_t index);
void duckdb_destroy_logical_type(duckdb_logical_type *type);

duckdb_result_get_chunk Fetches a data chunk from the duckdb_result. This function should be called repeatedly until the result
is exhausted.

The result must be destroyed with duckdb_destroy_data_chunk.

This function supersedes all duckdb_value functions, as well as the duckdb_column_data and duckdb_nullmask_data
functions. It results in significantly better performance, and should be preferred in newer code-bases.

If this function is used, none of the other result functions can be used and vice versa (i.e., this function cannot be mixed with the legacy
result functions).

Use duckdb_result_chunk_count to figure out how many chunks there are in the result.

Syntax

duckdb_data_chunk duckdb_result_get_chunk(
duckdb_result result,
idx_t chunk_index
);

Parameters

• result

The result object to fetch the data chunk from.

• chunk_index

The chunk index to fetch from.

• returns

The resulting data chunk. Returns NULL if the chunk index is out of bounds.

duckdb_result_is_streaming Checks if the type of the internal result is StreamQueryResult.

Syntax

bool duckdb_result_is_streaming(
duckdb_result result
);

Parameters

• result

The result object to check.

• returns

Whether or not the result object is of the type StreamQueryResult

The duckdb_date_struct with the decomposed elements.

The timezone offset component of the time.

• returns

The duckdb_time_tz element.

duckdb_from_time_tz Decomposes a TIME_TZ object into micros and a timezone offset.

Use duckdb_from_time to further decompose the micros into hour, minute, second and microsecond.

Syntax
duckdb_time_tz_struct duckdb_from_time_tz(
duckdb_time_tz micros
);

Parameters

• micros

The time object, as obtained from a DUCKDB_TYPE_TIME_TZ column.

• out_micros

The microsecond component of the time.

• out_offset

The timezone offset component of the time.

duckdb_to_time Re‑compose a duckdb_time from hour, minute, second and microsecond (duckdb_time_struct).

Syntax
duckdb_time duckdb_to_time(
duckdb_time_struct time
);

Parameters

• time

The hour, minute, second and microsecond in a duckdb_time_struct.

• returns

The duckdb_time element.
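
The decomposition and re-composition of a time value is plain integer arithmetic. The sketch below illustrates it without calling DuckDB, assuming the time is stored as microseconds since midnight. The struct time_parts and the functions decompose_time and compose_time are hypothetical names chosen for this example and are not taken from duckdb.h.

```c
#include <stdint.h>

/* Hypothetical mirror of the hour/minute/second/microsecond breakdown;
 * the type and field names are assumptions for this example. */
typedef struct {
    int8_t hour;
    int8_t min;
    int8_t sec;
    int32_t micros;
} time_parts;

#define MICROS_PER_SEC INT64_C(1000000)
#define MICROS_PER_MIN (60 * MICROS_PER_SEC)
#define MICROS_PER_HOUR (60 * MICROS_PER_MIN)

/* Split microseconds since midnight into their components. */
static time_parts decompose_time(int64_t micros) {
    time_parts t;
    t.hour = (int8_t) (micros / MICROS_PER_HOUR);
    micros %= MICROS_PER_HOUR;
    t.min = (int8_t) (micros / MICROS_PER_MIN);
    micros %= MICROS_PER_MIN;
    t.sec = (int8_t) (micros / MICROS_PER_SEC);
    t.micros = (int32_t) (micros % MICROS_PER_SEC);
    return t;
}

/* Inverse operation: rebuild microseconds since midnight. */
static int64_t compose_time(time_parts t) {
    return t.hour * MICROS_PER_HOUR + t.min * MICROS_PER_MIN +
           t.sec * MICROS_PER_SEC + t.micros;
}
```

Under these assumptions, composing the parts produced by decompose_time returns the original microsecond count for any value within a single day.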

The logical type.

duckdb_create_array_type Creates an array type from its child type. The resulting type should be destroyed with
duckdb_destroy_logical_type.

• returns

The logical type.

duckdb_create_enum_type Creates an ENUM type from the passed member name array. The resulting type should be destroyed
with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_create_enum_type(
const char **member_names,
idx_t member_count
);

Parameters

• enum_name

The name of the enum.

• member_names

The array of names that the enum should consist of.

• member_count

The number of elements that were specified in the array.

• returns

The logical type.

duckdb_create_decimal_type Creates a duckdb_logical_type of type decimal with the specified width and scale. The
resulting type should be destroyed with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_create_decimal_type(
uint8_t width,
uint8_t scale
);

Parameters

• width

The width of the decimal type

Parameters

• type

The logical type object

• returns

The internal type of the decimal type

duckdb_enum_internal_type Retrieves the internal storage type of an enum type.

Syntax

duckdb_type duckdb_enum_internal_type(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The internal type of the enum type

duckdb_enum_dictionary_size Retrieves the dictionary size of the enum type.

Syntax

uint32_t duckdb_enum_dictionary_size(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The dictionary size of the enum type

duckdb_enum_dictionary_value Retrieves the dictionary value at the specified position from the enum.

The result must be freed with duckdb_free.

Syntax

char *duckdb_enum_dictionary_value(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• type

The logical type object

• index

The child index

• returns

The name of the struct type. Must be freed with duckdb_free.

duckdb_struct_type_child_type Retrieves the child type of the given struct type at the specified index.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_struct_type_child_type(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The child type of the struct type. Must be destroyed with duckdb_destroy_logical_type.

duckdb_union_type_member_count Returns the number of members that the union type has.

Syntax
idx_t duckdb_union_type_member_count(
duckdb_logical_type type
);

Parameters

• type

The logical type (union) object

• returns

The number of members of a union type.

duckdb_union_type_member_name Retrieves the name of the union member.

The result must be freed with duckdb_free.

Syntax
char *duckdb_union_type_member_name(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The name of the union member. Must be freed with duckdb_free.

duckdb_union_type_member_type Retrieves the child type of the given union member at the specified index.

The result must be freed with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_union_type_member_type(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The child type of the union member. Must be destroyed with duckdb_destroy_logical_type.

duckdb_destroy_logical_type Destroys the logical type and de‑allocates all memory allocated for that type.

Syntax
void duckdb_destroy_logical_type(
duckdb_logical_type *type
);

Parameters

• type

The logical type to destroy.

Prepared Statements

A prepared statement is a parameterized query. The query is prepared with question marks (?) or dollar symbols ($1) indicating the
parameters of the query. Values can then be bound to these parameters, after which the prepared statement can be executed using those
parameters. A single query can be prepared once and executed many times.

Prepared statements are useful to:

• Easily supply parameters to functions while avoiding string concatenation/SQL injection attacks.
• Speed up queries that will be executed many times with different parameters.

DuckDB supports prepared statements in the C API with the duckdb_prepare method. The duckdb_bind family of functions is used
to supply values for subsequent execution of the prepared statement using duckdb_execute_prepared. After we are done with the
prepared statement it can be cleaned up using the duckdb_destroy_prepare method.

Example

duckdb_prepared_statement stmt;
duckdb_result result;
if (duckdb_prepare(con, "INSERT INTO integers VALUES ($1, $2)", &stmt) == DuckDBError) {
// handle error
}

duckdb_bind_int32(stmt, 1, 42); // the parameter index starts counting at 1!

duckdb_bind_int32(stmt, 2, 43);
// NULL as second parameter means no result set is requested
duckdb_execute_prepared(stmt, NULL);
duckdb_destroy_prepare(&stmt);

// we can also query result sets using prepared statements

if (duckdb_prepare(con, "SELECT * FROM integers WHERE i = ?", &stmt) == DuckDBError) {
// handle error
}
duckdb_bind_int32(stmt, 1, 42);
duckdb_execute_prepared(stmt, &result);

// do something with result

// clean up
duckdb_destroy_result(&result);
duckdb_destroy_prepare(&stmt);

After calling duckdb_prepare, the prepared statement parameters can be inspected using duckdb_nparams and duckdb_param_
type. In case the prepare fails, the error can be obtained through duckdb_prepare_error.

It is not required that the duckdb_bind family of functions matches the prepared statement parameter type exactly. The values will be
auto-cast to the required type as needed. For example, calling duckdb_bind_int8 on a parameter type of DUCKDB_TYPE_INTEGER
will work as expected.

Warning. Do not use prepared statements to insert large amounts of data into DuckDB. Instead, it is recommended to use the
Appender.

API Reference

duckdb_state duckdb_prepare(duckdb_connection connection, const char *query, duckdb_prepared_statement *out_prepared_statement);
void duckdb_destroy_prepare(duckdb_prepared_statement *prepared_statement);
const char *duckdb_prepare_error(duckdb_prepared_statement prepared_statement);
idx_t duckdb_nparams(duckdb_prepared_statement prepared_statement);
const char *duckdb_parameter_name(duckdb_prepared_statement prepared_statement, idx_t index);
duckdb_type duckdb_param_type(duckdb_prepared_statement prepared_statement, idx_t param_idx);
duckdb_state duckdb_clear_bindings(duckdb_prepared_statement prepared_statement);
duckdb_statement_type duckdb_prepared_statement_type(duckdb_prepared_statement statement);

duckdb_prepare Create a prepared statement object from a query.

Note that after calling duckdb_prepare, the prepared statement should always be destroyed using duckdb_destroy_prepare,
even if the prepare fails.

If the prepare fails, duckdb_prepare_error can be called to obtain the reason why the prepare failed.

Syntax

duckdb_state duckdb_prepare(
duckdb_connection connection,
const char *query,
duckdb_prepared_statement *out_prepared_statement
);

Parameters

• connection

The connection object

• query

The SQL query to prepare

• out_prepared_statement

The resulting prepared statement object

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_prepare Closes the prepared statement and de‑allocates all memory allocated for the statement.

Syntax

void duckdb_destroy_prepare(
duckdb_prepared_statement *prepared_statement
);

Parameters

• prepared_statement

The prepared statement to destroy.

duckdb_prepare_error Returns the error message associated with the given prepared statement. If the prepared statement has no
error message, this returns nullptr instead.

The error message should not be freed. It will be de‑allocated when duckdb_destroy_prepare is called.

Syntax

const char *duckdb_prepare_error(
duckdb_prepared_statement prepared_statement
);

Parameters

• prepared_statement

The prepared statement to obtain the error from.

• returns

The error message, or nullptr if there is none.

duckdb_nparams Returns the number of parameters that can be provided to the given prepared statement.

Returns 0 if the query was not successfully prepared.

Syntax

idx_t duckdb_nparams(
duckdb_prepared_statement prepared_statement
);

Parameters

• prepared_statement

The prepared statement to obtain the number of parameters for.

duckdb_parameter_name Returns the name used to identify the parameter. The returned string should be freed using
duckdb_free.

Returns NULL if the index is out of range for the provided prepared statement.

Syntax

const char *duckdb_parameter_name(
duckdb_prepared_statement prepared_statement,
idx_t index
);

Parameters

• prepared_statement

The prepared statement to get the parameter name from.

duckdb_param_type Returns the parameter type for the parameter at the given index.

Returns DUCKDB_TYPE_INVALID if the parameter index is out of range or the statement was not successfully prepared.

Syntax

duckdb_type duckdb_param_type(
duckdb_prepared_statement prepared_statement,
idx_t param_idx
);

Parameters

• prepared_statement

The prepared statement.

• param_idx

The parameter index.

• returns

The parameter type

duckdb_clear_bindings Clears the parameters bound to the prepared statement.

Syntax

duckdb_state duckdb_clear_bindings(
duckdb_prepared_statement prepared_statement
);

duckdb_prepared_statement_type Returns the statement type of the statement to be executed.

Syntax

duckdb_statement_type duckdb_prepared_statement_type(
duckdb_prepared_statement statement
);

Parameters

• statement

The prepared statement.

• returns

duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID

Appender

Appenders are the most efficient way of loading data into DuckDB from within the C interface, and are recommended for fast data loading.
The appender is much faster than using prepared statements or individual INSERT INTO statements.

Appends are made in row‑wise format. For every column, a duckdb_append_[type] call should be made, after which the row should
be finished by calling duckdb_appender_end_row. After all rows have been appended, duckdb_appender_destroy should be
used to finalize the appender and clean up the resulting memory.

Note that duckdb_appender_destroy should always be called on the resulting appender, even if the function returns DuckDBError.

Example

duckdb_query(con, "CREATE TABLE people (id INTEGER, name VARCHAR)", NULL);

duckdb_appender appender;
if (duckdb_appender_create(con, NULL, "people", &appender) == DuckDBError) {
    // handle error
}
// append the first row (1, Mark)
duckdb_append_int32(appender, 1);
duckdb_append_varchar(appender, "Mark");
duckdb_appender_end_row(appender);

// append the second row (2, Hannes)
duckdb_append_int32(appender, 2);
duckdb_append_varchar(appender, "Hannes");
duckdb_appender_end_row(appender);

// finish appending and flush all the rows to the table
duckdb_appender_destroy(&appender);

API Reference

duckdb_state duckdb_appender_create(duckdb_connection connection, const char *schema, const char *table, duckdb_appender *out_appender);
idx_t duckdb_appender_column_count(duckdb_appender appender);
duckdb_logical_type duckdb_appender_column_type(duckdb_appender appender, idx_t col_idx);
const char *duckdb_appender_error(duckdb_appender appender);
duckdb_state duckdb_appender_flush(duckdb_appender appender);
duckdb_state duckdb_appender_close(duckdb_appender appender);
duckdb_state duckdb_appender_destroy(duckdb_appender *appender);
duckdb_state duckdb_appender_begin_row(duckdb_appender appender);
duckdb_state duckdb_appender_end_row(duckdb_appender appender);
duckdb_state duckdb_append_bool(duckdb_appender appender, bool value);
duckdb_state duckdb_append_int8(duckdb_appender appender, int8_t value);
duckdb_state duckdb_append_int16(duckdb_appender appender, int16_t value);
duckdb_state duckdb_append_int32(duckdb_appender appender, int32_t value);
duckdb_state duckdb_append_int64(duckdb_appender appender, int64_t value);
duckdb_state duckdb_append_hugeint(duckdb_appender appender, duckdb_hugeint value);
duckdb_state duckdb_append_uint8(duckdb_appender appender, uint8_t value);
duckdb_state duckdb_append_uint16(duckdb_appender appender, uint16_t value);
duckdb_state duckdb_append_uint32(duckdb_appender appender, uint32_t value);
duckdb_state duckdb_append_uint64(duckdb_appender appender, uint64_t value);
duckdb_state duckdb_append_uhugeint(duckdb_appender appender, duckdb_uhugeint value);
duckdb_state duckdb_append_float(duckdb_appender appender, float value);
duckdb_state duckdb_append_double(duckdb_appender appender, double value);
duckdb_state duckdb_append_date(duckdb_appender appender, duckdb_date value);
duckdb_state duckdb_append_time(duckdb_appender appender, duckdb_time value);
duckdb_state duckdb_append_timestamp(duckdb_appender appender, duckdb_timestamp value);
duckdb_state duckdb_append_interval(duckdb_appender appender, duckdb_interval value);
duckdb_state duckdb_append_varchar(duckdb_appender appender, const char *val);
duckdb_state duckdb_append_varchar_length(duckdb_appender appender, const char *val, idx_t length);
duckdb_state duckdb_append_blob(duckdb_appender appender, const void *data, idx_t length);
duckdb_state duckdb_append_null(duckdb_appender appender);
duckdb_state duckdb_append_data_chunk(duckdb_appender appender, duckdb_data_chunk chunk);

duckdb_appender_create Creates an appender object.

Note that the object must be destroyed with duckdb_appender_destroy.

Syntax
duckdb_state duckdb_appender_create(
duckdb_connection connection,
const char *schema,
const char *table,
duckdb_appender *out_appender
);

Parameters

• connection

The connection context to create the appender in.

• schema

The schema of the table to append to, or nullptr for the default schema.

• table

The table name to append to.

• out_appender

The resulting appender object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_appender_column_count Returns the number of columns in the table that belongs to the appender.

Syntax

idx_t duckdb_appender_column_count(
duckdb_appender appender
);

Parameters

• appender

The appender to get the column count from.

• returns

The number of columns in the table.

duckdb_appender_column_type Returns the type of the column at the specified index.

Note: The resulting type should be destroyed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_appender_column_type(
duckdb_appender appender,
idx_t col_idx
);

Parameters

• appender

The appender to get the column type from.

• col_idx

The index of the column to get the type of.

• returns

The duckdb_logical_type of the column.

duckdb_appender_error Returns the error message associated with the given appender. If the appender has no error message, this
returns nullptr instead.

The error message should not be freed. It will be de‑allocated when duckdb_appender_destroy is called.

duckdb_appender_begin_row A no‑op function, provided for backwards compatibility reasons. Does nothing. Only duckdb_appender_end_row is required.

Syntax

duckdb_state duckdb_appender_begin_row(
duckdb_appender appender
);

duckdb_appender_end_row Finish the current row of appends. After end_row is called, the next row can be appended.

Syntax

duckdb_state duckdb_appender_end_row(
duckdb_appender appender
);

Parameters

• appender

The appender.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_append_bool Append a bool value to the appender.

Syntax

duckdb_state duckdb_append_bool(
duckdb_appender appender,
bool value
);

duckdb_append_int8 Append an int8_t value to the appender.

Syntax
duckdb_state duckdb_append_int8(
duckdb_appender appender,
int8_t value
);

duckdb_append_int16 Append an int16_t value to the appender.

Syntax
duckdb_state duckdb_append_int16(
duckdb_appender appender,
int16_t value
);

duckdb_append_int32 Append an int32_t value to the appender.

Syntax
duckdb_state duckdb_append_int32(
duckdb_appender appender,
int32_t value
);

duckdb_append_int64 Append an int64_t value to the appender.

Syntax
duckdb_state duckdb_append_int64(
duckdb_appender appender,
int64_t value
);

duckdb_append_hugeint Append a duckdb_hugeint value to the appender.

Syntax
duckdb_state duckdb_append_hugeint(
duckdb_appender appender,
duckdb_hugeint value
);

duckdb_append_uint8 Append a uint8_t value to the appender.

Syntax
duckdb_state duckdb_append_uint8(
duckdb_appender appender,
uint8_t value
);

duckdb_append_uint16 Append a uint16_t value to the appender.

Syntax
duckdb_state duckdb_append_uint16(
duckdb_appender appender,
uint16_t value
);

duckdb_append_uint32 Append a uint32_t value to the appender.

Syntax
duckdb_state duckdb_append_uint32(
duckdb_appender appender,
uint32_t value
);

duckdb_append_uint64 Append a uint64_t value to the appender.

Syntax
duckdb_state duckdb_append_uint64(
duckdb_appender appender,
uint64_t value
);

duckdb_append_uhugeint Append a duckdb_uhugeint value to the appender.

Syntax
duckdb_state duckdb_append_uhugeint(
duckdb_appender appender,
duckdb_uhugeint value
);

duckdb_append_float Append a float value to the appender.

Syntax
duckdb_state duckdb_append_float(
duckdb_appender appender,
float value
);

duckdb_append_double Append a double value to the appender.

Syntax
duckdb_state duckdb_append_double(
duckdb_appender appender,
double value
);

duckdb_append_date Append a duckdb_date value to the appender.

duckdb_append_blob Append a blob value to the appender.

Syntax
duckdb_state duckdb_append_blob(
duckdb_appender appender,
const void *data,
idx_t length
);

duckdb_append_null Append a NULL value to the appender (of any type).

Syntax
duckdb_state duckdb_append_null(
duckdb_appender appender
);

duckdb_append_data_chunk Appends a pre‑filled data chunk to the specified appender.

The types of the data chunk must exactly match the types of the table; no casting is performed. If the types do not match or the appender
is in an invalid state, DuckDBError is returned. If the append is successful, DuckDBSuccess is returned.

Syntax
duckdb_state duckdb_append_data_chunk(
duckdb_appender appender,
duckdb_data_chunk chunk
);

Parameters

• appender

The appender to append to.

• chunk

The data chunk to append.

• returns

The return state.

Table Functions

The table function API can be used to define a table function that can then be called from within DuckDB in the FROM clause of a query.

API Reference

duckdb_table_function duckdb_create_table_function();
void duckdb_destroy_table_function(duckdb_table_function *table_function);
void duckdb_table_function_set_name(duckdb_table_function table_function, const char *name);
void duckdb_table_function_add_parameter(duckdb_table_function table_function, duckdb_logical_type type);
void duckdb_table_function_add_named_parameter(duckdb_table_function table_function, const char *name, duckdb_logical_type type);
void duckdb_table_function_set_extra_info(duckdb_table_function table_function, void *extra_info, duckdb_delete_callback_t destroy);
void duckdb_table_function_set_bind(duckdb_table_function table_function, duckdb_table_function_bind_t bind);
void duckdb_table_function_set_init(duckdb_table_function table_function, duckdb_table_function_init_t init);
void duckdb_table_function_set_local_init(duckdb_table_function table_function, duckdb_table_function_init_t init);
void duckdb_table_function_set_function(duckdb_table_function table_function, duckdb_table_function_t function);
void duckdb_table_function_supports_projection_pushdown(duckdb_table_function table_function, bool pushdown);
duckdb_state duckdb_register_table_function(duckdb_connection con, duckdb_table_function function);

Table Function Bind

void *duckdb_bind_get_extra_info(duckdb_bind_info info);
void duckdb_bind_add_result_column(duckdb_bind_info info, const char *name, duckdb_logical_type type);
idx_t duckdb_bind_get_parameter_count(duckdb_bind_info info);
duckdb_value duckdb_bind_get_parameter(duckdb_bind_info info, idx_t index);
duckdb_value duckdb_bind_get_named_parameter(duckdb_bind_info info, const char *name);
void duckdb_bind_set_bind_data(duckdb_bind_info info, void *bind_data, duckdb_delete_callback_t destroy);
void duckdb_bind_set_cardinality(duckdb_bind_info info, idx_t cardinality, bool is_exact);
void duckdb_bind_set_error(duckdb_bind_info info, const char *error);

Table Function Init

void *duckdb_init_get_extra_info(duckdb_init_info info);
void *duckdb_init_get_bind_data(duckdb_init_info info);
void duckdb_init_set_init_data(duckdb_init_info info, void *init_data, duckdb_delete_callback_t destroy);
idx_t duckdb_init_get_column_count(duckdb_init_info info);
idx_t duckdb_init_get_column_index(duckdb_init_info info, idx_t column_index);
void duckdb_init_set_max_threads(duckdb_init_info info, idx_t max_threads);
void duckdb_init_set_error(duckdb_init_info info, const char *error);

Table Function

void *duckdb_function_get_extra_info(duckdb_function_info info);
void *duckdb_function_get_bind_data(duckdb_function_info info);
void *duckdb_function_get_init_data(duckdb_function_info info);
void *duckdb_function_get_local_init_data(duckdb_function_info info);
void duckdb_function_set_error(duckdb_function_info info, const char *error);

duckdb_create_table_function Creates a new empty table function.

The return value should be destroyed with duckdb_destroy_table_function.

Syntax

duckdb_table_function duckdb_create_table_function(
);

The name of the table function

duckdb_table_function_add_parameter Adds a parameter to the table function.

Syntax
void duckdb_table_function_add_parameter(
duckdb_table_function table_function,
duckdb_logical_type type
);

Parameters

• table_function

The table function

• type

The type of the parameter to add.

duckdb_table_function_add_named_parameter Adds a named parameter to the table function.

Syntax

void duckdb_table_function_add_named_parameter(
duckdb_table_function table_function,
const char *name,
duckdb_logical_type type
);

Parameters

• table_function

The table function

• name

The name of the parameter

• type

The type of the parameter to add.

duckdb_table_function_set_extra_info Assigns extra information to the table function that can be fetched during binding,
etc.

Syntax

void duckdb_table_function_set_extra_info(
duckdb_table_function table_function,
void *extra_info,
duckdb_delete_callback_t destroy
);

Parameters

• table_function

The table function

• extra_info

The extra information

• destroy

• init

The init function

duckdb_table_function_set_function Sets the main function of the table function.

Syntax

void duckdb_table_function_set_function(
duckdb_table_function table_function,
duckdb_table_function_t function
);

Parameters

• table_function

The table function

• function

The function

duckdb_table_function_supports_projection_pushdown Sets whether or not the given table function supports projection pushdown.

If this is set to true, the system will provide a list of all required columns in the init stage through the duckdb_init_get_column_count
and duckdb_init_get_column_index functions. If this is set to false (the default), the system will expect all columns to be
projected.

Syntax
void duckdb_table_function_supports_projection_pushdown(
duckdb_table_function table_function,
bool pushdown
);

Parameters

• table_function

The table function

• pushdown

True if the table function supports projection pushdown, false otherwise.

duckdb_register_table_function Register the table function object within the given connection.

The function requires at least a name, a bind function, an init function and a main function.

If the function is incomplete or a function with this name already exists, DuckDBError is returned.

Syntax
duckdb_state duckdb_register_table_function(
duckdb_connection con,
duckdb_table_function function
);

Parameters

• con

The connection to register it in.

• function

The function pointer

• returns

Whether or not the registration was successful.

duckdb_bind_get_extra_info Retrieves the extra info of the function as set in duckdb_table_function_set_extra_info.

Parameters

• info

The info object

• index

The index of the parameter to get

• returns

The value of the parameter. Must be destroyed with duckdb_destroy_value.

duckdb_bind_get_named_parameter Retrieves a named parameter with the given name.

The result must be destroyed with duckdb_destroy_value.

Syntax
duckdb_value duckdb_bind_get_named_parameter(
duckdb_bind_info info,
const char *name
);

Parameters

• info

The info object

• name

The name of the parameter

• returns

The value of the parameter. Must be destroyed with duckdb_destroy_value.

duckdb_bind_set_bind_data Sets the user‑provided bind data in the bind object. This object can be retrieved again during execution.

Syntax
void duckdb_bind_set_bind_data(
duckdb_bind_info info,
void *bind_data,
duckdb_delete_callback_t destroy
);

Parameters

• info

The info object

• bind_data

The bind data object.

• destroy

The callback that will be called to destroy the bind data (if any)

duckdb_bind_set_cardinality Sets the cardinality estimate for the table function, used for optimization.

Syntax

void duckdb_bind_set_cardinality(
duckdb_bind_info info,
idx_t cardinality,
bool is_exact
);

Parameters

• info

The bind data object.

• is_exact

Whether or not the cardinality estimate is exact, or an approximation

duckdb_bind_set_error Report that an error has occurred while calling bind.

Syntax

void duckdb_bind_set_error(
duckdb_bind_info info,
const char *error
);

Parameters

• info

The info object

• error

The error message

duckdb_init_get_extra_info Retrieves the extra info of the function as set in duckdb_table_function_set_extra_info.

Syntax
void *duckdb_init_get_extra_info(
duckdb_init_info info
);

Parameters

• info

The info object

• returns

The extra info

duckdb_init_get_bind_data Gets the bind data set by duckdb_bind_set_bind_data during the bind.

Note that the bind data should be considered as read‑only. For tracking state, use the init data instead.

Syntax
void *duckdb_init_get_bind_data(
duckdb_init_info info
);

Parameters

• info

The info object

• returns

The bind data object

duckdb_init_set_init_data Sets the user‑provided init data in the init object. This object can be retrieved again during execution.

Syntax
void duckdb_init_set_init_data(
duckdb_init_info info,
void *init_data,
duckdb_delete_callback_t destroy
);

Parameters

• info

The info object

• init_data

The init data object.

• destroy

The callback that will be called to destroy the init data (if any)

duckdb_init_get_column_count Returns the number of projected columns.

This function must be used if projection pushdown is enabled to figure out which columns to emit.

Syntax

idx_t duckdb_init_get_column_count(
duckdb_init_info info
);

Parameters

• info

The info object

• returns

The number of projected columns.

duckdb_init_get_column_index Returns the column index of the projected column at the specified position.

This function must be used if projection pushdown is enabled to figure out which columns to emit.

Syntax

idx_t duckdb_init_get_column_index(
duckdb_init_info info,
idx_t column_index
);

The info object

• error

The error message

duckdb_function_get_extra_info Retrieves the extra info of the function as set in duckdb_table_function_set_extra_info.

Syntax

void *duckdb_function_get_extra_info(
duckdb_function_info info
);

Parameters

• info

The info object

• returns

The extra info

duckdb_function_get_bind_data Gets the bind data set by duckdb_bind_set_bind_data during the bind.

Note that the bind data should be considered as read‑only. For tracking state, use the init data instead.

duckdb_function_set_error Report that an error has occurred while executing the function.

Syntax

void duckdb_function_set_error(
duckdb_function_info info,
const char *error
);

Parameters

• info

The info object

• error

The error message

Replacement Scans

The replacement scan API can be used to register a callback that is called when a table is read that does not exist in the catalog. For example,
when a query such as SELECT * FROM my_table is executed and my_table does not exist, the replacement scan callback will be
called with my_table as parameter. The replacement scan can then insert a table function with a specific parameter to replace the read
of the table.

API Reference

void duckdb_add_replacement_scan(duckdb_database db, duckdb_replacement_callback_t replacement, void *extra_data, duckdb_delete_callback_t delete_callback);
void duckdb_replacement_scan_set_function_name(duckdb_replacement_scan_info info, const char *function_name);
void duckdb_replacement_scan_add_parameter(duckdb_replacement_scan_info info, duckdb_value parameter);
void duckdb_replacement_scan_set_error(duckdb_replacement_scan_info info, const char *error);

duckdb_add_replacement_scan Add a replacement scan definition to the specified database.

Syntax
void duckdb_add_replacement_scan(
duckdb_database db,
duckdb_replacement_callback_t replacement,
void *extra_data,
duckdb_delete_callback_t delete_callback
);

Parameters

• db

The database object to add the replacement scan to

• replacement

The replacement scan callback

• extra_data

Extra data that is passed back into the specified callback

• delete_callback

The delete callback to call on the extra data, if any

duckdb_replacement_scan_set_function_name Sets the replacement function name. If this function is called in the replacement
callback, the replacement scan is performed. If it is not called, the replacement scan is not performed.

Syntax

void duckdb_replacement_scan_set_function_name(
duckdb_replacement_scan_info info,
const char *function_name
);

Parameters

• info

The info object

• function_name

The function name to substitute.

duckdb_replacement_scan_add_parameter Adds a parameter to the replacement scan function.

Syntax

void duckdb_replacement_scan_add_parameter(
duckdb_replacement_scan_info info,
duckdb_value parameter
);

Parameters

• info

The info object

• parameter

The parameter to add.

duckdb_replacement_scan_set_error Report that an error has occurred while executing the replacement scan.

Syntax

void duckdb_replacement_scan_set_error(
duckdb_replacement_scan_info info,
const char *error
);

Parameters

• info

The info object

• error

The error message

Complete API

API Reference

Open/Connect
duckdb_state duckdb_open(const char *path, duckdb_database *out_database);
duckdb_state duckdb_open_ext(const char *path, duckdb_database *out_database, duckdb_config config, char **out_error);
void duckdb_close(duckdb_database *database);
duckdb_state duckdb_connect(duckdb_database database, duckdb_connection *out_connection);
void duckdb_interrupt(duckdb_connection connection);
duckdb_query_progress_type duckdb_query_progress(duckdb_connection connection);
void duckdb_disconnect(duckdb_connection *connection);
const char *duckdb_library_version();

Configuration
duckdb_state duckdb_create_config(duckdb_config *out_config);
size_t duckdb_config_count();
duckdb_state duckdb_get_config_flag(size_t index, const char **out_name, const char **out_description);
duckdb_state duckdb_set_config(duckdb_config config, const char *name, const char *option);
void duckdb_destroy_config(duckdb_config *config);

Query Execution
duckdb_state duckdb_query(duckdb_connection connection, const char *query, duckdb_result *out_result);
void duckdb_destroy_result(duckdb_result *result);
const char *duckdb_column_name(duckdb_result *result, idx_t col);
duckdb_type duckdb_column_type(duckdb_result *result, idx_t col);
duckdb_statement_type duckdb_result_statement_type(duckdb_result result);
duckdb_logical_type duckdb_column_logical_type(duckdb_result *result, idx_t col);
idx_t duckdb_column_count(duckdb_result *result);
idx_t duckdb_row_count(duckdb_result *result);
idx_t duckdb_rows_changed(duckdb_result *result);
void *duckdb_column_data(duckdb_result *result, idx_t col);
bool *duckdb_nullmask_data(duckdb_result *result, idx_t col);
const char *duckdb_result_error(duckdb_result *result);

Result Functions
duckdb_data_chunk duckdb_result_get_chunk(duckdb_result result, idx_t chunk_index);
bool duckdb_result_is_streaming(duckdb_result result);
idx_t duckdb_result_chunk_count(duckdb_result result);
duckdb_result_type duckdb_result_return_type(duckdb_result result);

Safe fetch functions

bool duckdb_value_boolean(duckdb_result *result, idx_t col, idx_t row);
int8_t duckdb_value_int8(duckdb_result *result, idx_t col, idx_t row);
int16_t duckdb_value_int16(duckdb_result *result, idx_t col, idx_t row);
int32_t duckdb_value_int32(duckdb_result *result, idx_t col, idx_t row);
int64_t duckdb_value_int64(duckdb_result *result, idx_t col, idx_t row);
duckdb_hugeint duckdb_value_hugeint(duckdb_result *result, idx_t col, idx_t row);
duckdb_uhugeint duckdb_value_uhugeint(duckdb_result *result, idx_t col, idx_t row);
duckdb_decimal duckdb_value_decimal(duckdb_result *result, idx_t col, idx_t row);
uint8_t duckdb_value_uint8(duckdb_result *result, idx_t col, idx_t row);
uint16_t duckdb_value_uint16(duckdb_result *result, idx_t col, idx_t row);
uint32_t duckdb_value_uint32(duckdb_result *result, idx_t col, idx_t row);
uint64_t duckdb_value_uint64(duckdb_result *result, idx_t col, idx_t row);
float duckdb_value_float(duckdb_result *result, idx_t col, idx_t row);
double duckdb_value_double(duckdb_result *result, idx_t col, idx_t row);
duckdb_date duckdb_value_date(duckdb_result *result, idx_t col, idx_t row);
duckdb_time duckdb_value_time(duckdb_result *result, idx_t col, idx_t row);
duckdb_timestamp duckdb_value_timestamp(duckdb_result *result, idx_t col, idx_t row);
duckdb_interval duckdb_value_interval(duckdb_result *result, idx_t col, idx_t row);
char *duckdb_value_varchar(duckdb_result *result, idx_t col, idx_t row);
duckdb_string duckdb_value_string(duckdb_result *result, idx_t col, idx_t row);
char *duckdb_value_varchar_internal(duckdb_result *result, idx_t col, idx_t row);
duckdb_string duckdb_value_string_internal(duckdb_result *result, idx_t col, idx_t row);
duckdb_blob duckdb_value_blob(duckdb_result *result, idx_t col, idx_t row);
bool duckdb_value_is_null(duckdb_result *result, idx_t col, idx_t row);

Helpers
void *duckdb_malloc(size_t size);
void duckdb_free(void *ptr);
idx_t duckdb_vector_size();
bool duckdb_string_is_inlined(duckdb_string_t string);

Date/Time/Timestamp Helpers
duckdb_date_struct duckdb_from_date(duckdb_date date);
duckdb_date duckdb_to_date(duckdb_date_struct date);
bool duckdb_is_finite_date(duckdb_date date);
duckdb_time_struct duckdb_from_time(duckdb_time time);
duckdb_time_tz duckdb_create_time_tz(int64_t micros, int32_t offset);
duckdb_time_tz_struct duckdb_from_time_tz(duckdb_time_tz micros);
duckdb_time duckdb_to_time(duckdb_time_struct time);
duckdb_timestamp_struct duckdb_from_timestamp(duckdb_timestamp ts);
duckdb_timestamp duckdb_to_timestamp(duckdb_timestamp_struct ts);
bool duckdb_is_finite_timestamp(duckdb_timestamp ts);

Hugeint Helpers
double duckdb_hugeint_to_double(duckdb_hugeint val);
duckdb_hugeint duckdb_double_to_hugeint(double val);

Unsigned Hugeint Helpers

double duckdb_uhugeint_to_double(duckdb_uhugeint val);
duckdb_uhugeint duckdb_double_to_uhugeint(double val);

Decimal Helpers
duckdb_decimal duckdb_double_to_decimal(double val, uint8_t width, uint8_t scale);
double duckdb_decimal_to_double(duckdb_decimal val);

Prepared Statements
duckdb_state duckdb_prepare(duckdb_connection connection, const char *query, duckdb_prepared_statement *out_prepared_statement);
void duckdb_destroy_prepare(duckdb_prepared_statement *prepared_statement);
const char *duckdb_prepare_error(duckdb_prepared_statement prepared_statement);
idx_t duckdb_nparams(duckdb_prepared_statement prepared_statement);
const char *duckdb_parameter_name(duckdb_prepared_statement prepared_statement, idx_t index);
duckdb_type duckdb_param_type(duckdb_prepared_statement prepared_statement, idx_t param_idx);
duckdb_state duckdb_clear_bindings(duckdb_prepared_statement prepared_statement);
duckdb_statement_type duckdb_prepared_statement_type(duckdb_prepared_statement statement);

Bind Values to Prepared Statements

duckdb_state duckdb_bind_value(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_value val);
duckdb_state duckdb_bind_parameter_index(duckdb_prepared_statement prepared_statement, idx_t *param_idx_out, const char *name);
duckdb_state duckdb_bind_boolean(duckdb_prepared_statement prepared_statement, idx_t param_idx, bool val);
duckdb_state duckdb_bind_int8(duckdb_prepared_statement prepared_statement, idx_t param_idx, int8_t val);
duckdb_state duckdb_bind_int16(duckdb_prepared_statement prepared_statement, idx_t param_idx, int16_t val);
duckdb_state duckdb_bind_int32(duckdb_prepared_statement prepared_statement, idx_t param_idx, int32_t val);
duckdb_state duckdb_bind_int64(duckdb_prepared_statement prepared_statement, idx_t param_idx, int64_t val);
duckdb_state duckdb_bind_hugeint(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_hugeint val);
duckdb_state duckdb_bind_uhugeint(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_uhugeint val);
duckdb_state duckdb_bind_decimal(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_decimal val);
duckdb_state duckdb_bind_uint8(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint8_t val);
duckdb_state duckdb_bind_uint16(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint16_t val);
duckdb_state duckdb_bind_uint32(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint32_t val);
duckdb_state duckdb_bind_uint64(duckdb_prepared_statement prepared_statement, idx_t param_idx, uint64_t val);
duckdb_state duckdb_bind_float(duckdb_prepared_statement prepared_statement, idx_t param_idx, float val);
duckdb_state duckdb_bind_double(duckdb_prepared_statement prepared_statement, idx_t param_idx, double val);
duckdb_state duckdb_bind_date(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_date val);
duckdb_state duckdb_bind_time(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_time val);
duckdb_state duckdb_bind_timestamp(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_timestamp val);
duckdb_state duckdb_bind_interval(duckdb_prepared_statement prepared_statement, idx_t param_idx, duckdb_interval val);
duckdb_state duckdb_bind_varchar(duckdb_prepared_statement prepared_statement, idx_t param_idx, const char *val);
duckdb_state duckdb_bind_varchar_length(duckdb_prepared_statement prepared_statement, idx_t param_idx, const char *val, idx_t length);
duckdb_state duckdb_bind_blob(duckdb_prepared_statement prepared_statement, idx_t param_idx, const void *data, idx_t length);
duckdb_state duckdb_bind_null(duckdb_prepared_statement prepared_statement, idx_t param_idx);

Execute Prepared Statements

duckdb_state duckdb_execute_prepared(duckdb_prepared_statement prepared_statement, duckdb_result *out_result);
duckdb_state duckdb_execute_prepared_streaming(duckdb_prepared_statement prepared_statement, duckdb_result *out_result);

Extract Statements
idx_t duckdb_extract_statements(duckdb_connection connection, const char *query, duckdb_extracted_statements *out_extracted_statements);
duckdb_state duckdb_prepare_extracted_statement(duckdb_connection connection, duckdb_extracted_statements extracted_statements, idx_t index, duckdb_prepared_statement *out_prepared_statement);
const char *duckdb_extract_statements_error(duckdb_extracted_statements extracted_statements);
void duckdb_destroy_extracted(duckdb_extracted_statements *extracted_statements);

Pending Result Interface

duckdb_state duckdb_pending_prepared(duckdb_prepared_statement prepared_statement, duckdb_pending_result *out_result);
duckdb_state duckdb_pending_prepared_streaming(duckdb_prepared_statement prepared_statement, duckdb_pending_result *out_result);
void duckdb_destroy_pending(duckdb_pending_result *pending_result);
const char *duckdb_pending_error(duckdb_pending_result pending_result);
duckdb_pending_state duckdb_pending_execute_task(duckdb_pending_result pending_result);
duckdb_pending_state duckdb_pending_execute_check_state(duckdb_pending_result pending_result);
duckdb_state duckdb_execute_pending(duckdb_pending_result pending_result, duckdb_result *out_result);
bool duckdb_pending_execution_is_finished(duckdb_pending_state pending_state);

Value Interface
void duckdb_destroy_value(duckdb_value *value);
duckdb_value duckdb_create_varchar(const char *text);
duckdb_value duckdb_create_varchar_length(const char *text, idx_t length);
duckdb_value duckdb_create_int64(int64_t val);
duckdb_value duckdb_create_struct_value(duckdb_logical_type type, duckdb_value *values);
duckdb_value duckdb_create_list_value(duckdb_logical_type type, duckdb_value *values, idx_t value_count);
duckdb_value duckdb_create_array_value(duckdb_logical_type type, duckdb_value *values, idx_t value_count);
char *duckdb_get_varchar(duckdb_value value);
int64_t duckdb_get_int64(duckdb_value value);

Logical Type Interface

duckdb_logical_type duckdb_create_logical_type(duckdb_type type);
char *duckdb_logical_type_get_alias(duckdb_logical_type type);
duckdb_logical_type duckdb_create_list_type(duckdb_logical_type type);
duckdb_logical_type duckdb_create_array_type(duckdb_logical_type type, idx_t array_size);
duckdb_logical_type duckdb_create_map_type(duckdb_logical_type key_type, duckdb_logical_type value_type);
duckdb_logical_type duckdb_create_union_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
duckdb_logical_type duckdb_create_struct_type(duckdb_logical_type *member_types, const char **member_names, idx_t member_count);
duckdb_logical_type duckdb_create_enum_type(const char **member_names, idx_t member_count);
duckdb_logical_type duckdb_create_decimal_type(uint8_t width, uint8_t scale);
duckdb_type duckdb_get_type_id(duckdb_logical_type type);
uint8_t duckdb_decimal_width(duckdb_logical_type type);
uint8_t duckdb_decimal_scale(duckdb_logical_type type);

duckdb_type duckdb_decimal_internal_type(duckdb_logical_type type);

duckdb_type duckdb_enum_internal_type(duckdb_logical_type type);
uint32_t duckdb_enum_dictionary_size(duckdb_logical_type type);
char *duckdb_enum_dictionary_value(duckdb_logical_type type, idx_t index);
duckdb_logical_type duckdb_list_type_child_type(duckdb_logical_type type);
duckdb_logical_type duckdb_array_type_child_type(duckdb_logical_type type);
idx_t duckdb_array_type_array_size(duckdb_logical_type type);
duckdb_logical_type duckdb_map_type_key_type(duckdb_logical_type type);
duckdb_logical_type duckdb_map_type_value_type(duckdb_logical_type type);
idx_t duckdb_struct_type_child_count(duckdb_logical_type type);
char *duckdb_struct_type_child_name(duckdb_logical_type type, idx_t index);
duckdb_logical_type duckdb_struct_type_child_type(duckdb_logical_type type, idx_t index);
idx_t duckdb_union_type_member_count(duckdb_logical_type type);
char *duckdb_union_type_member_name(duckdb_logical_type type, idx_t index);
duckdb_logical_type duckdb_union_type_member_type(duckdb_logical_type type, idx_t index);
void duckdb_destroy_logical_type(duckdb_logical_type *type);

Data Chunk Interface

duckdb_data_chunk duckdb_create_data_chunk(duckdb_logical_type *types, idx_t column_count);

Vector Interface

duckdb_logical_type duckdb_vector_get_column_type(duckdb_vector vector);

Validity Mask Functions

bool duckdb_validity_row_is_valid(uint64_t *validity, idx_t row);

Table Functions

void duckdb_table_function_add_named_parameter(duckdb_table_function table_function, const char *name,

Table Function Bind

void *duckdb_bind_get_extra_info(duckdb_bind_info info);

Table Function Init

void *duckdb_init_get_extra_info(duckdb_init_info info);

Table Function

void *duckdb_function_get_extra_info(duckdb_function_info info);

Replacement Scans

void duckdb_add_replacement_scan(duckdb_database db, duckdb_replacement_callback_t replacement, void

Appender

duckdb_state duckdb_appender_create(duckdb_connection connection, const char *schema, const char *table,

Arrow Interface

duckdb_state duckdb_query_arrow(duckdb_connection connection, const char *query, duckdb_arrow *out_result);
duckdb_state duckdb_query_arrow_schema(duckdb_arrow result, duckdb_arrow_schema *out_schema);
duckdb_state duckdb_prepared_arrow_schema(duckdb_prepared_statement prepared, duckdb_arrow_schema *out_schema);
void duckdb_result_arrow_array(duckdb_result result, duckdb_data_chunk chunk, duckdb_arrow_array *out_array);
duckdb_state duckdb_query_arrow_array(duckdb_arrow result, duckdb_arrow_array *out_array);
idx_t duckdb_arrow_column_count(duckdb_arrow result);
idx_t duckdb_arrow_row_count(duckdb_arrow result);
idx_t duckdb_arrow_rows_changed(duckdb_arrow result);
const char *duckdb_query_arrow_error(duckdb_arrow result);
void duckdb_destroy_arrow(duckdb_arrow *result);
void duckdb_destroy_arrow_stream(duckdb_arrow_stream *stream_p);
duckdb_state duckdb_execute_prepared_arrow(duckdb_prepared_statement prepared_statement, duckdb_arrow *out_result);
duckdb_state duckdb_arrow_scan(duckdb_connection connection, const char *table_name, duckdb_arrow_stream arrow);
duckdb_state duckdb_arrow_array_scan(duckdb_connection connection, const char *table_name, duckdb_arrow_schema arrow_schema, duckdb_arrow_array arrow_array, duckdb_arrow_stream *out_stream);

Threading Information
void duckdb_execute_tasks(duckdb_database database, idx_t max_tasks);
duckdb_task_state duckdb_create_task_state(duckdb_database database);
void duckdb_execute_tasks_state(duckdb_task_state state);
idx_t duckdb_execute_n_tasks_state(duckdb_task_state state, idx_t max_tasks);
void duckdb_finish_execution(duckdb_task_state state);
bool duckdb_task_state_is_finished(duckdb_task_state state);
void duckdb_destroy_task_state(duckdb_task_state state);
bool duckdb_execution_is_finished(duckdb_connection con);

Streaming Result Interface

duckdb_data_chunk duckdb_stream_fetch_chunk(duckdb_result result);

Syntax
duckdb_state duckdb_open(
const char *path,
duckdb_database *out_database
);

Parameters

• path

Path to the database file on disk, or nullptr or :memory: to open an in‑memory database.

• out_database

The result database object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_open_ext Extended version of duckdb_open. Creates a new database or opens an existing database file stored at the given
path. The instantiated database should be closed with 'duckdb_close'.

Syntax
duckdb_state duckdb_open_ext(
const char *path,
duckdb_database *out_database,
duckdb_config config,
char **out_error
);

Parameters

• path

Path to the database file on disk, or nullptr or :memory: to open an in‑memory database.

• out_database

The result database object.

• config

(Optional) configuration used to start up the database system.

• out_error

If set and the function returns DuckDBError, this will contain the reason why the start‑up failed. Note that the error must be freed using
duckdb_free.

• returns

DuckDBSuccess on success or DuckDBError on failure.

Syntax

void duckdb_close(
duckdb_database *database
);

Parameters

• database

The database object to shut down.

Syntax

duckdb_state duckdb_connect(
duckdb_database database,
duckdb_connection *out_connection
);

Parameters

• database

The database file to connect to.

• out_connection

The result connection object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_library_version Returns the version of the linked DuckDB, with a version postfix for dev versions.

Usually used for developing C extensions that must return this for a compatibility check.

Syntax

const char *duckdb_library_version(
);

This will always succeed unless there is a malloc failure.

Syntax

duckdb_state duckdb_create_config(
duckdb_config *out_config
);

Parameters

• out_config

The result configuration object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_config_count This returns the total number of configuration options available for usage with duckdb_get_config_flag.

This should not be called in a loop as it internally loops over all the options.

Syntax

size_t duckdb_config_count(
);

Parameters

• returns

The amount of config options available.

The result name or description MUST NOT be freed.

Syntax

duckdb_state duckdb_get_config_flag(
size_t index,
const char **out_name,
const char **out_description
);

Parameters

• index

The index of the configuration option (between 0 and duckdb_config_count)

• out_name

A name of the configuration flag.

• out_description

A description of the configuration flag.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_set_config Sets the specified option for the specified configuration. The configuration option is indicated by name. To
obtain a list of config options, see duckdb_get_config_flag.

In the source code, configuration options are defined in config.cpp.

This can fail if either the name is invalid, or if the value provided for the option is invalid.

Syntax

duckdb_state duckdb_set_config(
duckdb_config config,
const char *name,
const char *option
);

Parameters

• config

The configuration object to set the option on.

• name

The name of the configuration flag to set.

• option

The value to set the configuration flag to.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_config Destroys the specified configuration object and de‑allocates all memory allocated for the object.

Syntax

void duckdb_destroy_config(
duckdb_config *config
);

Parameters

• config

The configuration object to destroy.

Note that after running duckdb_query, duckdb_destroy_result must be called on the result object even if the query fails, otherwise
the error stored within the result will not be freed correctly.

Syntax

duckdb_state duckdb_query(
duckdb_connection connection,
const char *query,
duckdb_result *out_result
);

Parameters

• connection

The connection to perform the query in.

• query

The SQL query to run.

• out_result

The query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_result Closes the result and de-allocates all memory allocated for that result.

Syntax

void duckdb_destroy_result(
duckdb_result *result
);

Parameters

• result

The result to destroy.

duckdb_column_name Returns the column name of the specified column. The result should not need to be freed; the column names
will automatically be destroyed when the result is destroyed.

Returns NULL if the column is out of range.

Syntax
const char *duckdb_column_name(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column name from.

• col

The column index.

• returns

The column name of the specified column.

duckdb_column_type Returns the column type of the specified column.

Returns DUCKDB_TYPE_INVALID if the column is out of range.

Syntax
duckdb_type duckdb_column_type(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column type from.

• col

The column index.

• returns

The column type of the specified column.

duckdb_result_statement_type Returns the statement type of the statement that was executed

Syntax
duckdb_statement_type duckdb_result_statement_type(
duckdb_result result
);

Parameters

• result

The result object to fetch the statement type from.

• returns

duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID

duckdb_column_logical_type Returns the logical column type of the specified column.

The return type of this call should be destroyed with duckdb_destroy_logical_type.

Returns NULL if the column is out of range.

Syntax
duckdb_logical_type duckdb_column_logical_type(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column type from.

idx_t duckdb_rows_changed(
duckdb_result *result
);

Parameters

• result

The result object.

• returns

The number of rows changed.

duckdb_column_data DEPRECATED: Prefer using duckdb_result_get_chunk instead.

Returns the data of a specific column of a result in columnar format.

For example, for a column of type DUCKDB_TYPE_INTEGER, rows can be accessed in the following manner:

int32_t *data = (int32_t *) duckdb_column_data(&result, 0);

printf("Data for row %d: %d\n", row, data[row]);

Syntax

void *duckdb_column_data(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the column data from.

• col

The column index.

• returns

The column data of the specified column.

duckdb_nullmask_data DEPRECATED: Prefer using duckdb_result_get_chunk instead.

int32_t *data = (int32_t *) duckdb_column_data(&result, 0);

bool *nullmask = duckdb_nullmask_data(&result, 0);
if (nullmask[row]) {
    printf("Data for row %d: NULL\n", row);
} else {
    printf("Data for row %d: %d\n", row, data[row]);
}

Syntax

bool *duckdb_nullmask_data(
duckdb_result *result,
idx_t col
);

Parameters

• result

The result object to fetch the nullmask from.

• col

The column index.

• returns

The nullmask of the specified column.

duckdb_result_error Returns the error message contained within the result. The error is only set if duckdb_query returns
DuckDBError.

The result of this function must not be freed. It will be cleaned up when duckdb_destroy_result is called.

Syntax

const char *duckdb_result_error(
duckdb_result *result
);

Parameters

• result

The result object to fetch the error from.

• returns

The error of the result.

duckdb_result_get_chunk Fetches a data chunk from the duckdb_result. This function should be called repeatedly until the result
is exhausted.

The result must be destroyed with duckdb_destroy_data_chunk.

If this function is used, none of the other result functions can be used and vice versa (i.e., this function cannot be mixed with the legacy
result functions).

Use duckdb_result_chunk_count to figure out how many chunks there are in the result.

Syntax
duckdb_data_chunk duckdb_result_get_chunk(
duckdb_result result,
idx_t chunk_index
);

Parameters

• result

The result object to fetch the data chunk from.

• chunk_index

The chunk index to fetch from.

• returns

The resulting data chunk. Returns NULL if the chunk index is out of bounds.

duckdb_result_is_streaming Checks if the type of the internal result is StreamQueryResult.

Syntax
bool duckdb_result_is_streaming(
duckdb_result result
);

Parameters

• result

The result object to check.

• returns

Whether or not the result object is of the type StreamQueryResult

duckdb_result_chunk_count Returns the number of data chunks present in the result.

Syntax
idx_t duckdb_result_chunk_count(
duckdb_result result
);

Parameters

• result

The result object

• returns

Number of data chunks present in the result.

duckdb_result_return_type Returns the return_type of the given result, or DUCKDB_RETURN_TYPE_INVALID on error

Syntax

duckdb_result_type duckdb_result_return_type(
duckdb_result result
);

Parameters

• result

The result object

• returns

The return_type

duckdb_value_boolean

Syntax

bool duckdb_value_boolean(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The boolean value at the specified location, or false if the value cannot be converted.

duckdb_value_int8

Syntax

int8_t duckdb_value_int8(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The int8_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_int16

Syntax
int16_t duckdb_value_int16(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The int16_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_int32

Syntax
int32_t duckdb_value_int32(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The int32_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_int64

Syntax
int64_t duckdb_value_int64(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The int64_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_hugeint

Syntax

duckdb_hugeint duckdb_value_hugeint(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_hugeint value at the specified location, or 0 if the value cannot be converted.

duckdb_value_uhugeint

Syntax

duckdb_uhugeint duckdb_value_uhugeint(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_uhugeint value at the specified location, or 0 if the value cannot be converted.

duckdb_value_decimal

Syntax

duckdb_decimal duckdb_value_decimal(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_decimal value at the specified location, or 0 if the value cannot be converted.

duckdb_value_uint8

Syntax

uint8_t duckdb_value_uint8(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The uint8_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_uint16

Syntax
uint16_t duckdb_value_uint16(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The uint16_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_uint32

Syntax
uint32_t duckdb_value_uint32(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The uint32_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_uint64

Syntax
uint64_t duckdb_value_uint64(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The uint64_t value at the specified location, or 0 if the value cannot be converted.

duckdb_value_float

Syntax

float duckdb_value_float(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The float value at the specified location, or 0 if the value cannot be converted.

duckdb_value_double

Syntax

double duckdb_value_double(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The double value at the specified location, or 0 if the value cannot be converted.

duckdb_value_date

Syntax

duckdb_date duckdb_value_date(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_date value at the specified location, or 0 if the value cannot be converted.

duckdb_value_time

Syntax

duckdb_time duckdb_value_time(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_time value at the specified location, or 0 if the value cannot be converted.

duckdb_value_timestamp

Syntax
duckdb_timestamp duckdb_value_timestamp(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_timestamp value at the specified location, or 0 if the value cannot be converted.

duckdb_value_interval

Syntax
duckdb_interval duckdb_value_interval(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_interval value at the specified location, or 0 if the value cannot be converted.

duckdb_value_varchar

Syntax
char *duckdb_value_varchar(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• DEPRECATED

Use duckdb_value_string instead. This function does not work correctly if the string contains null bytes.

Syntax

duckdb_blob duckdb_value_blob(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

The duckdb_blob value at the specified location. Returns a blob with blob.data set to nullptr if the value cannot be converted. The resulting
field "blob.data" must be freed with duckdb_free.

duckdb_value_is_null

Syntax

bool duckdb_value_is_null(
duckdb_result *result,
idx_t col,
idx_t row
);

Parameters

• returns

Returns true if the value at the specified index is NULL, and false otherwise.

duckdb_malloc Allocate size bytes of memory using the duckdb internal malloc function. Any memory allocated in this manner
should be freed using duckdb_free.

Syntax

void *duckdb_malloc(
size_t size
);

Parameters

• size

The number of bytes to allocate.

• returns

A pointer to the allocated memory region.

duckdb_free Free a value returned from duckdb_malloc, duckdb_value_varchar, duckdb_value_blob, or duckdb_value_string.

Syntax
void duckdb_free(
void *ptr
);

Parameters

The time object, as obtained from a DUCKDB_TYPE_TIME column.

• returns

The duckdb_time_struct with the decomposed elements.

duckdb_create_time_tz Create a duckdb_time_tz object from micros and a timezone offset.

Syntax

duckdb_time_tz duckdb_create_time_tz(
int64_t micros,
int32_t offset
);

Parameters

• micros

The microsecond component of the time.

• offset

The timezone offset component of the time.

• returns

The duckdb_time_tz element.

duckdb_from_time_tz Decompose a TIME_TZ object into micros and a timezone offset.

Use duckdb_from_time to further decompose the micros into hour, minute, second and microsecond.

Syntax

duckdb_time_tz_struct duckdb_from_time_tz(
duckdb_time_tz micros
);

Parameters

• micros

The time object, as obtained from a DUCKDB_TYPE_TIME_TZ column.

• out_micros

The microsecond component of the time.

• out_offset

The timezone offset component of the time.

duckdb_to_time Re‑compose a duckdb_time from hour, minute, second and microsecond (duckdb_time_struct).

duckdb_is_finite_timestamp Test a duckdb_timestamp to see if it is a finite value.

Syntax
bool duckdb_is_finite_timestamp(
duckdb_timestamp ts
);

Parameters

• ts

The timestamp object, as obtained from a DUCKDB_TYPE_TIMESTAMP column.

• returns

duckdb_prepare Create a prepared statement object from a query.

Note that after calling duckdb_prepare, the prepared statement should always be destroyed using duckdb_destroy_prepare,
even if the prepare fails.

If the prepare fails, duckdb_prepare_error can be called to obtain the reason why the prepare failed.

Syntax
duckdb_state duckdb_prepare(
duckdb_connection connection,
const char *query,
duckdb_prepared_statement *out_prepared_statement
);

Parameters

• connection

The connection object

• query

The SQL query to prepare

• out_prepared_statement

The resulting prepared statement object

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_prepare Closes the prepared statement and de‑allocates all memory allocated for the statement.

Syntax
void duckdb_destroy_prepare(
duckdb_prepared_statement *prepared_statement
);

Parameters

• prepared_statement

The prepared statement to destroy.

duckdb_prepare_error Returns the error message associated with the given prepared statement. If the prepared statement has no
error message, this returns nullptr instead.

The error message should not be freed. It will be de‑allocated when duckdb_destroy_prepare is called.

Syntax

const char *duckdb_prepare_error(
duckdb_prepared_statement prepared_statement
);

Parameters

• prepared_statement

The prepared statement to obtain the error from.

• returns

The error message, or nullptr if there is none.

duckdb_nparams Returns the number of parameters that can be provided to the given prepared statement.

Returns 0 if the query was not successfully prepared.

Syntax

idx_t duckdb_nparams(
duckdb_prepared_statement prepared_statement
);

Parameters

• prepared_statement

The prepared statement to obtain the number of parameters for.

duckdb_parameter_name Returns the name used to identify the parameter. The returned string should be freed using duckdb_free.

Returns NULL if the index is out of range for the provided prepared statement.

Syntax

const char *duckdb_parameter_name(
duckdb_prepared_statement prepared_statement,
idx_t index
);

Parameters

• prepared_statement

The prepared statement to get the parameter name from.

duckdb_param_type Returns the parameter type for the parameter at the given index.

Returns DUCKDB_TYPE_INVALID if the parameter index is out of range or the statement was not successfully prepared.

Syntax

duckdb_type duckdb_param_type(
duckdb_prepared_statement prepared_statement,
idx_t param_idx
);

Parameters

• prepared_statement

The prepared statement.

• param_idx

The parameter index.

• returns

The parameter type

duckdb_clear_bindings Clears the parameters bound to the prepared statement.

Syntax

duckdb_state duckdb_clear_bindings(
duckdb_prepared_statement prepared_statement
);

duckdb_prepared_statement_type Returns the statement type of the statement to be executed

Syntax

duckdb_statement_type duckdb_prepared_statement_type(
duckdb_prepared_statement statement
);

Parameters

• statement

The prepared statement.

• returns

duckdb_statement_type value or DUCKDB_STATEMENT_TYPE_INVALID

duckdb_bind_value Binds a value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_value(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_value val
);

duckdb_bind_parameter_index Retrieve the index of the parameter for the prepared statement, identified by name

Syntax

duckdb_state duckdb_bind_parameter_index(
duckdb_prepared_statement prepared_statement,
idx_t *param_idx_out,
const char *name
);

duckdb_bind_boolean Binds a bool value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_boolean(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
bool val
);

duckdb_bind_int8 Binds an int8_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_int8(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
int8_t val
);

duckdb_bind_int16 Binds an int16_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_int16(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
int16_t val
);

duckdb_bind_int32 Binds an int32_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_int32(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
int32_t val
);

duckdb_bind_int64 Binds an int64_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_int64(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
int64_t val
);

duckdb_bind_hugeint Binds a duckdb_hugeint value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_hugeint(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_hugeint val
);

duckdb_bind_uhugeint Binds a duckdb_uhugeint value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_uhugeint(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_uhugeint val
);

duckdb_bind_decimal Binds a duckdb_decimal value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_decimal(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_decimal val
);

duckdb_bind_uint8 Binds a uint8_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_uint8(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
uint8_t val
);

duckdb_bind_uint16 Binds a uint16_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_uint16(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
uint16_t val
);

duckdb_bind_uint32 Binds a uint32_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_uint32(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
uint32_t val
);

duckdb_bind_uint64 Binds a uint64_t value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_uint64(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
uint64_t val
);

duckdb_bind_float Binds a float value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_float(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
float val
);

duckdb_bind_double Binds a double value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_double(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
double val
);

duckdb_bind_date Binds a duckdb_date value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_date(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_date val
);

duckdb_bind_time Binds a duckdb_time value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_time(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_time val
);

duckdb_bind_timestamp Binds a duckdb_timestamp value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_timestamp(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_timestamp val
);

duckdb_bind_interval Binds a duckdb_interval value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_interval(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
duckdb_interval val
);

duckdb_bind_varchar Binds a null‑terminated varchar value to the prepared statement at the specified index.

Syntax

duckdb_state duckdb_bind_varchar(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
const char *val
);

duckdb_bind_varchar_length Binds a varchar value to the prepared statement at the specified index.

Syntax
duckdb_state duckdb_bind_varchar_length(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
const char *val,
idx_t length
);

duckdb_bind_blob Binds a blob value to the prepared statement at the specified index.

Syntax
duckdb_state duckdb_bind_blob(
duckdb_prepared_statement prepared_statement,
idx_t param_idx,
const void *data,
idx_t length
);

duckdb_bind_null Binds a NULL value to the prepared statement at the specified index.

Syntax
duckdb_state duckdb_bind_null(
duckdb_prepared_statement prepared_statement,
idx_t param_idx
);

duckdb_execute_prepared Executes the prepared statement with the given bound parameters, and returns a materialized query
result.

This method can be called multiple times for each prepared statement, and the parameters can be modified between calls to this function.

Note that the result must be freed with duckdb_destroy_result.

Syntax
duckdb_state duckdb_execute_prepared(
duckdb_prepared_statement prepared_statement,
duckdb_result *out_result
);

Parameters

• prepared_statement

The prepared statement to execute.

• out_result

The query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_execute_prepared_streaming Executes the prepared statement with the given bound parameters, and returns an
optionally-streaming query result. To determine if the resulting query was in fact streamed, use duckdb_result_is_streaming.

This method can be called multiple times for each prepared statement, and the parameters can be modified between calls to this function.

Note that the result must be freed with duckdb_destroy_result.

Syntax

duckdb_state duckdb_execute_prepared_streaming(
duckdb_prepared_statement prepared_statement,
duckdb_result *out_result
);

Parameters

• prepared_statement

The prepared statement to execute.

• out_result

The query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_extract_statements Extract all statements from a query. Note that after calling duckdb_extract_statements, the
extracted statements should always be destroyed using duckdb_destroy_extracted, even if no statements were extracted.

If the extract fails, duckdb_extract_statements_error can be called to obtain the reason why the extract failed.

Syntax

idx_t duckdb_extract_statements(
duckdb_connection connection,
const char *query,
duckdb_extracted_statements *out_extracted_statements
);

Parameters

• connection

The connection object

• query

The SQL query to extract

• out_extracted_statements

The resulting extracted statements object

• returns

The number of extracted statements or 0 on failure.

duckdb_prepare_extracted_statement Prepare an extracted statement. Note that after calling duckdb_prepare_extracted_statement,
the prepared statement should always be destroyed using duckdb_destroy_prepare, even if the
prepare fails.

If the prepare fails, duckdb_prepare_error can be called to obtain the reason why the prepare failed.

Syntax

duckdb_state duckdb_prepare_extracted_statement(
duckdb_connection connection,
duckdb_extracted_statements extracted_statements,
idx_t index,
duckdb_prepared_statement *out_prepared_statement
);

Parameters

• connection

The connection object

• extracted_statements

The extracted statements object

• index

The index of the extracted statement to prepare

• out_prepared_statement

The resulting prepared statement object

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_extract_statements_error Returns the error message contained within the extracted statements. The result of this
function must not be freed. It will be cleaned up when duckdb_destroy_extracted is called.

Syntax

const char *duckdb_extract_statements_error(
duckdb_extracted_statements extracted_statements
);

Parameters

• extracted_statements

The extracted statements to fetch the error from.

• returns

The error of the extracted statements.

duckdb_destroy_extracted De‑allocates all memory allocated for the extracted statements.

Syntax

void duckdb_destroy_extracted(
duckdb_extracted_statements *extracted_statements
);

Parameters

• extracted_statements

The extracted statements to destroy.

duckdb_pending_prepared Executes the prepared statement with the given bound parameters, and returns a pending result. The
pending result represents an intermediate structure for a query that is not yet fully executed. The pending result can be used to
incrementally execute a query, returning control to the client between tasks.

Note that after calling duckdb_pending_prepared, the pending result should always be destroyed using
duckdb_destroy_pending, even if this function returns DuckDBError.

Syntax

duckdb_state duckdb_pending_prepared(
duckdb_prepared_statement prepared_statement,
duckdb_pending_result *out_result
);

Parameters

• prepared_statement

The prepared statement to execute.

• out_result

The pending query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_pending_prepared_streaming Executes the prepared statement with the given bound parameters, and returns a
pending result. This pending result will create a streaming duckdb_result when executed. The pending result represents an
intermediate structure for a query that is not yet fully executed.

Note that after calling duckdb_pending_prepared_streaming, the pending result should always be destroyed using
duckdb_destroy_pending, even if this function returns DuckDBError.

Syntax

duckdb_state duckdb_pending_prepared_streaming(
duckdb_prepared_statement prepared_statement,
duckdb_pending_result *out_result
);

Parameters

• prepared_statement

The prepared statement to execute.

• out_result

The pending query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_destroy_pending Closes the pending result and de‑allocates all memory allocated for the result.

Syntax

void duckdb_destroy_pending(
duckdb_pending_result *pending_result
);

Parameters

• pending_result

The pending result to destroy.

duckdb_pending_error Returns the error message contained within the pending result.

The result of this function must not be freed. It will be cleaned up when duckdb_destroy_pending is called.

Syntax

const char *duckdb_pending_error(
duckdb_pending_result pending_result
);

Parameters

• pending_result

The pending result to fetch the error from.

• returns

The error of the pending result.

duckdb_pending_execute_task Executes a single task within the query, returning whether or not the query is ready.

If this returns DUCKDB_PENDING_RESULT_READY, the duckdb_execute_pending function can be called to obtain the result. If this returns
DUCKDB_PENDING_RESULT_NOT_READY, the duckdb_pending_execute_task function should be called again. If this returns
DUCKDB_PENDING_ERROR, an error occurred during execution.

The error message can be obtained by calling duckdb_pending_error on the pending_result.

Syntax

duckdb_pending_state duckdb_pending_execute_task(
duckdb_pending_result pending_result
);

Parameters

• pending_result

The pending result to execute a task within.

• returns

The state of the pending result after the execution.

duckdb_pending_execute_check_state If this returns DUCKDB_PENDING_RESULT_READY, the duckdb_execute_pending function
can be called to obtain the result. If this returns DUCKDB_PENDING_RESULT_NOT_READY, the duckdb_pending_execute_check_state
function should be called again. If this returns DUCKDB_PENDING_ERROR, an error occurred during execution.

The error message can be obtained by calling duckdb_pending_error on the pending_result.

• value

The bigint value

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_create_struct_value Creates a struct value from a type and an array of values

Syntax
duckdb_value duckdb_create_struct_value(
duckdb_logical_type type,
duckdb_value *values
);

Parameters

• type

The type of the struct

• values

The values for the struct fields

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_create_list_value Creates a list value from a type and an array of values of length value_count

Syntax

duckdb_value duckdb_create_list_value(
duckdb_logical_type type,
duckdb_value *values,
idx_t value_count
);

Parameters

• type

The type of the list

• values

The values for the list

• value_count

The number of values in the list

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_create_array_value Creates an array value from a type and an array of values of length value_count.

Syntax

duckdb_value duckdb_create_array_value(
duckdb_logical_type type,
duckdb_value *values,
idx_t value_count
);

Parameters

• type

The type of the array

• values

The values for the array

• value_count

The number of values in the array

• returns

The value. This must be destroyed with duckdb_destroy_value.

duckdb_get_varchar Obtains a string representation of the given value. The result must be destroyed with duckdb_free.

Syntax

char *duckdb_get_varchar(
duckdb_value value
);

Parameters

• value

The value

• returns

The string value. This must be destroyed with duckdb_free.

duckdb_get_int64 Obtains an int64 of the given value.

Syntax

int64_t duckdb_get_int64(
duckdb_value value
);

Parameters

• value

The value

• returns

The int64 value, or 0 if no conversion is possible

duckdb_create_logical_type Creates a duckdb_logical_type from a standard primitive type. The resulting type should be
destroyed with duckdb_destroy_logical_type.

This should not be used with DUCKDB_TYPE_DECIMAL.

Syntax

duckdb_create_array_type Creates an array type from its child type. The resulting type should be destroyed with
duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_create_array_type(
duckdb_logical_type type,
idx_t array_size
);

• returns

The logical type.

duckdb_create_struct_type Creates a STRUCT type from the passed member name and type arrays. The resulting type should
be destroyed with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_create_struct_type(
duckdb_logical_type *member_types,
const char **member_names,
idx_t member_count
);

Parameters

• member_types

The array of types that the struct should consist of.

• member_names

The array of names that the struct should consist of.

• member_count

The number of members that were specified for both arrays.

• returns

The logical type.

duckdb_create_enum_type Creates an ENUM type from the passed member name array. The resulting type should be destroyed
with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_create_enum_type(
const char **member_names,
idx_t member_count
);

Parameters

• member_names

The array of names that the enum should consist of.

• member_count

The number of elements that were specified in the array.

• returns

The logical type.

duckdb_create_decimal_type Creates a duckdb_logical_type of type decimal with the specified width and scale. The
resulting type should be destroyed with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_create_decimal_type(
uint8_t width,
uint8_t scale
);

Parameters

• width

The width of the decimal type

Parameters

• type

The logical type object

• returns

The dictionary size of the enum type

duckdb_enum_dictionary_value Retrieves the dictionary value at the specified position from the enum.

The result must be freed with duckdb_free.

Syntax

char *duckdb_enum_dictionary_value(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The index in the dictionary

• returns

The string value of the enum type. Must be freed with duckdb_free.

duckdb_list_type_child_type Retrieves the child type of the given list type.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_list_type_child_type(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The child type of the list type. Must be destroyed with duckdb_destroy_logical_type.

duckdb_array_type_child_type Retrieves the child type of the given array type.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_array_type_child_type(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The child type of the array type. Must be destroyed with duckdb_destroy_logical_type.

duckdb_array_type_array_size Retrieves the array size of the given array type.

Syntax

idx_t duckdb_array_type_array_size(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The fixed number of elements the values of this array type can store.

duckdb_map_type_key_type Retrieves the key type of the given map type.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_map_type_key_type(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The key type of the map type. Must be destroyed with duckdb_destroy_logical_type.

duckdb_map_type_value_type Retrieves the value type of the given map type.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_map_type_value_type(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The value type of the map type. Must be destroyed with duckdb_destroy_logical_type.

duckdb_struct_type_child_count Returns the number of children of a struct type.

Syntax

idx_t duckdb_struct_type_child_count(
duckdb_logical_type type
);

Parameters

• type

The logical type object

• returns

The number of children of a struct type.

duckdb_struct_type_child_name Retrieves the name of the struct child.

The result must be freed with duckdb_free.

Syntax

char *duckdb_struct_type_child_name(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The name of the struct child. Must be freed with duckdb_free.

duckdb_struct_type_child_type Retrieves the child type of the given struct type at the specified index.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_struct_type_child_type(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The child type of the struct type. Must be destroyed with duckdb_destroy_logical_type.

duckdb_union_type_member_count Returns the number of members that the union type has.

Syntax

idx_t duckdb_union_type_member_count(
duckdb_logical_type type
);

Parameters

• type

The logical type (union) object

• returns

The number of members of a union type.

duckdb_union_type_member_name Retrieves the name of the union member.

The result must be freed with duckdb_free.

Syntax

char *duckdb_union_type_member_name(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The name of the union member. Must be freed with duckdb_free.

duckdb_union_type_member_type Retrieves the child type of the given union member at the specified index.

The result must be freed with duckdb_destroy_logical_type.

Syntax

duckdb_logical_type duckdb_union_type_member_type(
duckdb_logical_type type,
idx_t index
);

Parameters

• type

The logical type object

• index

The child index

• returns

The child type of the union member. Must be destroyed with duckdb_destroy_logical_type.

duckdb_destroy_logical_type Destroys the logical type and de‑allocates all memory allocated for that type.

Syntax

void duckdb_destroy_logical_type(
duckdb_logical_type *type
);

Parameters

• type

The logical type to destroy.

duckdb_create_data_chunk Creates an empty DataChunk with the specified set of types.

Note that the result must be destroyed with duckdb_destroy_data_chunk.

Syntax
duckdb_data_chunk duckdb_create_data_chunk(
duckdb_logical_type *types,
idx_t column_count
);

Parameters

• types

An array of types of the data chunk.

• column_count

The number of columns.

• returns

The data chunk.

duckdb_destroy_data_chunk Destroys the data chunk and de‑allocates all memory allocated for that chunk.

Syntax
void duckdb_destroy_data_chunk(
duckdb_data_chunk *chunk
);

Parameters

• chunk

The data chunk to destroy.

duckdb_data_chunk_reset Resets a data chunk, clearing the validity masks and setting the cardinality of the data chunk to 0.

Syntax
void duckdb_data_chunk_reset(
duckdb_data_chunk chunk
);

Parameters

• chunk

The data chunk to reset.

duckdb_data_chunk_get_column_count Retrieves the number of columns in a data chunk.

Syntax
idx_t duckdb_data_chunk_get_column_count(
duckdb_data_chunk chunk
);

Parameters

• chunk

The data chunk to get the data from

• returns

The number of columns in the data chunk

duckdb_data_chunk_get_vector Retrieves the vector at the specified column index in the data chunk.

The pointer to the vector is valid for as long as the chunk is alive. It does NOT need to be destroyed.

Syntax

duckdb_vector duckdb_data_chunk_get_vector(
duckdb_data_chunk chunk,
idx_t col_idx
);

Parameters

• chunk

The data chunk to get the data from

• returns

The vector

duckdb_data_chunk_get_size Retrieves the current number of tuples in a data chunk.

Syntax

idx_t duckdb_data_chunk_get_size(
duckdb_data_chunk chunk
);

Parameters

• chunk

The data chunk to get the data from

• returns

The number of tuples in the data chunk

duckdb_data_chunk_set_size Sets the current number of tuples in a data chunk.

Syntax

If all values are valid, this function MIGHT return NULL!

Validity of a specific value can be obtained like this:

idx_t entry_idx = row_idx / 64;
idx_t idx_in_entry = row_idx % 64;
bool is_valid = validity_mask[entry_idx] & ((uint64_t)1 << idx_in_entry);

Alternatively, the (slower) duckdb_validity_row_is_valid function can be used.

Syntax
uint64_t *duckdb_vector_get_validity(
duckdb_vector vector
);

Parameters

• vector

The vector to get the data from

• returns

The pointer to the validity mask, or NULL if no validity mask is present
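
The validity mask stores one bit per row, 64 rows per uint64_t entry, with a set bit marking a valid row. The following self-contained sketch shows the bit arithmetic without calling into DuckDB. The helper names row_is_valid and set_row_invalid are hypothetical, uint64_t stands in for idx_t, and a NULL mask is treated as "all rows valid", matching the NULL return described above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true when `row_idx` is valid (not NULL) according to the mask.
 * A NULL mask means no validity mask is present, so every row is valid. */
static bool row_is_valid(const uint64_t *validity_mask, uint64_t row_idx) {
    if (validity_mask == NULL) {
        return true;
    }
    uint64_t entry_idx = row_idx / 64;    /* which 64-bit entry holds the row */
    uint64_t idx_in_entry = row_idx % 64; /* which bit inside that entry */
    /* The cast keeps the shift in 64-bit arithmetic for bit positions >= 31. */
    return (validity_mask[entry_idx] & ((uint64_t)1 << idx_in_entry)) != 0;
}

/* Clears the bit for `row_idx`, marking the row as NULL.
 * The mask must be non-NULL and large enough to cover `row_idx`. */
static void set_row_invalid(uint64_t *validity_mask, uint64_t row_idx) {
    validity_mask[row_idx / 64] &= ~((uint64_t)1 << (row_idx % 64));
}
```

Writing the constant as a 64-bit value matters: shifting a plain int by 32 or more positions is undefined behavior in C.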

duckdb_vector_ensure_validity_writable Ensures the validity mask is writable by allocating it.

After this function is called, duckdb_vector_get_validity will ALWAYS return non‑NULL. This allows null values to be written to the
vector, regardless of whether a validity mask was present before.

Syntax
void duckdb_vector_ensure_validity_writable(
duckdb_vector vector
);

Parameters

• vector

The vector to alter

duckdb_vector_assign_string_element Assigns a string element in the vector at the specified location.

Syntax
void duckdb_vector_assign_string_element(
duckdb_vector vector,
idx_t index,
const char *str
);

Parameters

• vector

The vector to alter

• index

The row position in the vector to assign the string to

• str

The null‑terminated string

duckdb_vector_assign_string_element_len Assigns a string element in the vector at the specified location. You may also
use this function to assign BLOBs.

Syntax

void duckdb_vector_assign_string_element_len(
duckdb_vector vector,
idx_t index,
const char *str,
idx_t str_len
);

Parameters

• vector

The vector to alter

• index

The row position in the vector to assign the string to

• str

The string

• str_len

The length of the string (in bytes)

duckdb_list_vector_get_child Retrieves the child vector of a list vector.

• returns

The duckdb state. Returns DuckDBError if the vector is nullptr.

duckdb_list_vector_reserve Sets the total capacity of the underlying child‑vector of a list.

Syntax
duckdb_state duckdb_list_vector_reserve(
duckdb_vector vector,
idx_t required_capacity
);

Parameters

• vector

The list vector.

• required_capacity

The total capacity to reserve.

• returns

The duckdb state. Returns DuckDBError if the vector is nullptr.

duckdb_struct_vector_get_child Retrieves the child vector of a struct vector.

The resulting vector is valid as long as the parent vector is valid.

Syntax

duckdb_vector duckdb_struct_vector_get_child(
duckdb_vector vector,
idx_t index
);

Parameters

• vector

The vector

• index

The child index

• returns

The child vector

duckdb_array_vector_get_child Retrieves the child vector of an array vector.

The resulting vector is valid as long as the parent vector is valid. The resulting vector has the size of the parent vector multiplied by the
array size.

Syntax

duckdb_vector duckdb_array_vector_get_child(
duckdb_vector vector
);

Parameters

• vector

The vector

• returns

The child vector

duckdb_validity_row_is_valid Returns whether or not a row is valid (i.e., not NULL) in the given validity mask.

Syntax

bool duckdb_validity_row_is_valid(
uint64_t *validity,
idx_t row
);

Parameters

• validity

The validity mask, as obtained through duckdb_vector_get_validity

• row

The row index

• returns

true if the row is valid, false otherwise

duckdb_validity_set_row_validity In a validity mask, sets a specific row to either valid or invalid.

Note that duckdb_vector_ensure_validity_writable should be called before calling duckdb_vector_get_validity, to
ensure that there is a validity mask to write to.

Syntax

void duckdb_validity_set_row_validity(
uint64_t *validity,
idx_t row,
bool valid
);

Parameters

• validity

The validity mask, as obtained through duckdb_vector_get_validity.

• row

The row index

• valid

Whether or not to set the row to valid, or invalid

duckdb_validity_set_row_invalid In a validity mask, sets a specific row to invalid.

Equivalent to duckdb_validity_set_row_validity with valid set to false.

Syntax

void duckdb_validity_set_row_invalid(
uint64_t *validity,
idx_t row
);

Parameters

• validity

The validity mask

• row

The row index

duckdb_validity_set_row_valid In a validity mask, sets a specific row to valid.

Equivalent to duckdb_validity_set_row_validity with valid set to true.

Syntax

void duckdb_validity_set_row_valid(
uint64_t *validity,
idx_t row
);

Parameters

• validity

The validity mask

• row

The row index

duckdb_create_table_function Creates a new empty table function.

The return value should be destroyed with duckdb_destroy_table_function.

Syntax

• name

The name of the table function

duckdb_table_function_add_parameter Adds a parameter to the table function.

Syntax

void duckdb_table_function_add_parameter(
duckdb_table_function table_function,
duckdb_logical_type type
);

Parameters

• table_function

The table function

• type

The type of the parameter to add.

duckdb_table_function_add_named_parameter Adds a named parameter to the table function.

Syntax

void duckdb_table_function_add_named_parameter(
duckdb_table_function table_function,
const char *name,
duckdb_logical_type type
);

Parameters

• table_function

The table function

• name

The name of the parameter

• type

The type of the parameter to add.

duckdb_table_function_set_extra_info Assigns extra information to the table function that can be fetched during binding,
etc.

Syntax
void duckdb_table_function_set_extra_info(
duckdb_table_function table_function,
void *extra_info,
duckdb_delete_callback_t destroy
);

Parameters

• table_function

The table function

• extra_info

The extra information

• destroy

The callback that will be called to destroy the extra information (if any)

duckdb_table_function_set_bind Sets the bind function of the table function.

Syntax
void duckdb_table_function_set_bind(
duckdb_table_function table_function,
duckdb_table_function_bind_t bind
);

Parameters

• table_function

The table function

• bind

The bind function

duckdb_table_function_set_init Sets the init function of the table function.

Syntax
void duckdb_table_function_set_init(
duckdb_table_function table_function,
duckdb_table_function_init_t init
);

Parameters

• table_function

The table function

• init

The init function

duckdb_table_function_set_local_init Sets the thread‑local init function of the table function.

Syntax

void duckdb_table_function_set_local_init(
duckdb_table_function table_function,
duckdb_table_function_init_t init
);

Parameters

• table_function

The table function

• init

The init function

duckdb_table_function_set_function Sets the main function of the table function.

Syntax

void duckdb_table_function_set_function(
duckdb_table_function table_function,
duckdb_table_function_t function
);

Parameters

• table_function

The table function

• function

The function

duckdb_table_function_supports_projection_pushdown Sets whether or not the given table function supports projection
pushdown.

Syntax

void duckdb_table_function_supports_projection_pushdown(
duckdb_table_function table_function,
bool pushdown
);

Parameters

• table_function

The table function

• pushdown

True if the table function supports projection pushdown, false otherwise.

duckdb_register_table_function Register the table function object within the given connection.

The function requires at least a name, a bind function, an init function and a main function.

If the function is incomplete or a function with this name already exists, DuckDBError is returned.

Syntax

duckdb_state duckdb_register_table_function(
duckdb_connection con,
duckdb_table_function function
);

Parameters

• con

The connection to register it in.

• function

The function pointer

• returns

Whether or not the registration was successful.

duckdb_bind_get_extra_info Retrieves the extra info of the function as set in duckdb_table_function_set_extra_info.

Syntax

void *duckdb_bind_get_extra_info(
duckdb_bind_info info
);

Parameters

• info

The info object

• returns

The extra info

duckdb_bind_add_result_column Adds a result column to the output of the table function.

Syntax

void duckdb_bind_add_result_column(
duckdb_bind_info info,
const char *name,
duckdb_logical_type type
);

Parameters

• info

The info object

• name

The name of the column

• type

The logical type of the column

duckdb_bind_get_parameter_count Retrieves the number of regular (non‑named) parameters to the function.

Syntax

idx_t duckdb_bind_get_parameter_count(
duckdb_bind_info info
);

Parameters

• info

The info object

• returns

The number of parameters

duckdb_bind_get_parameter Retrieves the parameter at the given index.

The result must be destroyed with duckdb_destroy_value.

Syntax

duckdb_value duckdb_bind_get_parameter(
duckdb_bind_info info,
idx_t index
);

Parameters

• info

The info object

• index

The index of the parameter to get

• returns

The value of the parameter. Must be destroyed with duckdb_destroy_value.

duckdb_bind_get_named_parameter Retrieves a named parameter with the given name.

The result must be destroyed with duckdb_destroy_value.

Syntax

duckdb_value duckdb_bind_get_named_parameter(
duckdb_bind_info info,
const char *name
);

Parameters

• info

The info object

• name

The name of the parameter

• returns

The value of the parameter. Must be destroyed with duckdb_destroy_value.

duckdb_bind_set_bind_data Sets the user‑provided bind data in the bind object. This object can be retrieved again during execution.

Syntax

void duckdb_bind_set_bind_data(
duckdb_bind_info info,
void *bind_data,
duckdb_delete_callback_t destroy
);

Parameters

• info

The info object

• bind_data

The bind data object.

• destroy

The callback that will be called to destroy the bind data (if any)

duckdb_bind_set_cardinality Sets the cardinality estimate for the table function, used for optimization.

Syntax
void duckdb_bind_set_cardinality(
duckdb_bind_info info,
idx_t cardinality,
bool is_exact
);

Parameters

• info

The info object

• cardinality

The cardinality estimate

• is_exact

Whether or not the cardinality estimate is exact, or an approximation

duckdb_bind_set_error Report that an error has occurred while calling bind.

Syntax
void duckdb_bind_set_error(
duckdb_bind_info info,
const char *error
);

Parameters

• info

The info object

• error

The error message

duckdb_init_get_extra_info Retrieves the extra info of the function as set in duckdb_table_function_set_extra_info.

Syntax
void *duckdb_init_get_extra_info(
duckdb_init_info info
);

Parameters

• info

The info object

• returns

The extra info

duckdb_init_get_bind_data Gets the bind data set by duckdb_bind_set_bind_data during the bind.

Note that the bind data should be considered as read‑only. For tracking state, use the init data instead.

Syntax
void *duckdb_init_get_bind_data(
duckdb_init_info info
);

Parameters

• info

The info object

• returns

The bind data object

duckdb_init_set_init_data Sets the user‑provided init data in the init object. This object can be retrieved again during execution.

Syntax
void duckdb_init_set_init_data(
duckdb_init_info info,
void *init_data,
duckdb_delete_callback_t destroy
);

Parameters

• info

The info object

• init_data

The init data object.

• destroy

The callback that will be called to destroy the init data (if any)

duckdb_init_get_column_count Returns the number of projected columns.

This function must be used if projection pushdown is enabled to figure out which columns to emit.

Syntax
idx_t duckdb_init_get_column_count(
duckdb_init_info info
);

Parameters

• info

The info object

• returns

The number of projected columns.

duckdb_init_get_column_index Returns the column index of the projected column at the specified position.

This function must be used if projection pushdown is enabled to figure out which columns to emit.

Syntax

idx_t duckdb_init_get_column_index(
duckdb_init_info info,
idx_t column_index
);

Parameters

• info

The info object

• column_index

The position of the projected column

• returns

The column index of the projected column.

duckdb_function_get_bind_data Gets the bind data set by duckdb_bind_set_bind_data during the bind.

Note that the bind data should be considered as read‑only. For tracking state, use the init data instead.

Syntax

void *duckdb_function_get_bind_data(
duckdb_function_info info
);

Parameters

• info

The info object

• returns

The bind data object

duckdb_function_get_init_data Gets the init data set by duckdb_init_set_init_data during the init.

Syntax

duckdb_add_replacement_scan Add a replacement scan definition to the specified database.

Syntax
void duckdb_add_replacement_scan(
duckdb_database db,
duckdb_replacement_callback_t replacement,
void *extra_data,
duckdb_delete_callback_t delete_callback
);

Parameters

• db

The database object to add the replacement scan to

• replacement

The replacement scan callback

• extra_data

Extra data that is passed back into the specified callback

• delete_callback

duckdb_appender_create Creates an appender object.

Note that the object must be destroyed with duckdb_appender_destroy.

Syntax
duckdb_state duckdb_appender_create(
duckdb_connection connection,
const char *schema,
const char *table,
duckdb_appender *out_appender
);

Parameters

• connection

The connection context to create the appender in.

• schema

The schema of the table to append to, or nullptr for the default schema.

• table

The table name to append to.

• out_appender

The resulting appender object.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_appender_column_count Returns the number of columns in the table that belongs to the appender.

Syntax
idx_t duckdb_appender_column_count(
duckdb_appender appender
);

Parameters

• appender

The appender to get the column count from.

• returns

The number of columns in the table.

duckdb_appender_column_type Returns the type of the column at the specified index.

Note: The resulting type should be destroyed with duckdb_destroy_logical_type.

Syntax
duckdb_logical_type duckdb_appender_column_type(
duckdb_appender appender,
idx_t col_idx
);

Parameters

• appender

The appender to get the column type from.

• col_idx

The index of the column to get the type of.

• returns

The duckdb_logical_type of the column.

duckdb_appender_error Returns the error message associated with the given appender. If the appender has no error message, this
returns nullptr instead.

The error message should not be freed. It will be de‑allocated when duckdb_appender_destroy is called.

Syntax
const char *duckdb_appender_error(
duckdb_appender appender
);

Parameters

• appender

The appender to get the error from.

• returns

The error message, or nullptr if there is none.

duckdb_appender_flush Flush the appender to the table, forcing the cache of the appender to be cleared and the data to be
appended to the base table.

This should generally not be used unless you know what you are doing. Instead, call duckdb_appender_destroy when you are done
with the appender.

Syntax
duckdb_state duckdb_appender_flush(
duckdb_appender appender
);

Parameters

• appender

The appender to flush.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_appender_close Close the appender, flushing all intermediate state in the appender to the table and closing it for further
appends.

This is generally not necessary. Call duckdb_appender_destroy instead.

Syntax
duckdb_state duckdb_appender_close(
duckdb_appender appender
);

Parameters

• appender

The appender to flush and close.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_appender_destroy Close the appender and destroy it, flushing all intermediate state in the appender to the table and
de‑allocating all memory associated with the appender.

Syntax
duckdb_state duckdb_appender_destroy(
duckdb_appender *appender
);

Parameters

• appender

The appender to flush, close and destroy.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_appender_begin_row A nop function, provided for backwards compatibility reasons. Does nothing. Only
duckdb_appender_end_row is required.

Syntax
duckdb_state duckdb_appender_begin_row(
duckdb_appender appender
);

duckdb_appender_end_row Finish the current row of appends. After end_row is called, the next row can be appended.

Syntax

duckdb_state duckdb_appender_end_row(
duckdb_appender appender
);

Parameters

• appender

The appender.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_append_bool Append a bool value to the appender.

Syntax

duckdb_state duckdb_append_bool(
duckdb_appender appender,
bool value
);

duckdb_append_int8 Append an int8_t value to the appender.

Syntax

duckdb_state duckdb_append_int8(
duckdb_appender appender,
int8_t value
);

duckdb_append_int16 Append an int16_t value to the appender.

Syntax

duckdb_state duckdb_append_int16(
duckdb_appender appender,
int16_t value
);

duckdb_append_int32 Append an int32_t value to the appender.

Syntax

duckdb_state duckdb_append_int32(
duckdb_appender appender,
int32_t value
);


duckdb_append_int64 Append an int64_t value to the appender.

Syntax
duckdb_state duckdb_append_int64(
duckdb_appender appender,
int64_t value
);

duckdb_append_hugeint Append a duckdb_hugeint value to the appender.

Syntax
duckdb_state duckdb_append_hugeint(
duckdb_appender appender,
duckdb_hugeint value
);

duckdb_append_uint8 Append a uint8_t value to the appender.

Syntax
duckdb_state duckdb_append_uint8(
duckdb_appender appender,
uint8_t value
);

duckdb_append_uint16 Append a uint16_t value to the appender.

Syntax
duckdb_state duckdb_append_uint16(
duckdb_appender appender,
uint16_t value
);

duckdb_append_uint32 Append a uint32_t value to the appender.

Syntax
duckdb_state duckdb_append_uint32(
duckdb_appender appender,
uint32_t value
);

duckdb_append_uint64 Append a uint64_t value to the appender.

Syntax
duckdb_state duckdb_append_uint64(
duckdb_appender appender,
uint64_t value
);


duckdb_append_uhugeint Append a duckdb_uhugeint value to the appender.

Syntax
duckdb_state duckdb_append_uhugeint(
duckdb_appender appender,
duckdb_uhugeint value
);

duckdb_append_float Append a float value to the appender.

Syntax
duckdb_state duckdb_append_float(
duckdb_appender appender,
float value
);

duckdb_append_double Append a double value to the appender.

Syntax
duckdb_state duckdb_append_double(
duckdb_appender appender,
double value
);

duckdb_append_date Append a duckdb_date value to the appender.

Syntax
duckdb_state duckdb_append_date(
duckdb_appender appender,
duckdb_date value
);

duckdb_append_null Append a NULL value to the appender.

Syntax

duckdb_state duckdb_append_null(
duckdb_appender appender
);

duckdb_append_data_chunk Appends a pre‑filled data chunk to the specified appender.


Syntax

duckdb_state duckdb_append_data_chunk(
duckdb_appender appender,
duckdb_data_chunk chunk
);

Parameters

• appender

The appender to append to.

• chunk

The data chunk to append.

• returns

The return state.

duckdb_query_arrow Executes a SQL query within a connection and stores the full (materialized) result in an arrow structure. If the
query fails to execute, DuckDBError is returned and the error message can be retrieved by calling duckdb_query_arrow_error.

Note that after running duckdb_query_arrow, duckdb_destroy_arrow must be called on the result object even if the query fails,
otherwise the error stored within the result will not be freed correctly.

Syntax

duckdb_state duckdb_query_arrow(
duckdb_connection connection,
const char *query,
duckdb_arrow *out_result
);

Parameters

• connection

The connection to perform the query in.

• query

The SQL query to run.

• out_result

The query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.
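
Because duckdb_destroy_arrow must run even when the query fails, C++ callers may find it convenient to tie the cleanup to scope exit. A minimal sketch of that pattern follows. FakeArrowResult, FakeQueryArrow and FakeDestroyArrow are hypothetical stand-ins for duckdb_arrow, duckdb_query_arrow and duckdb_destroy_arrow so that the example runs without DuckDB; ScopeExit is likewise an illustrative helper and not part of the API.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Illustrative helper: runs the given cleanup when the scope is left.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> cleanup) : cleanup_(std::move(cleanup)) {}
    ~ScopeExit() {
        if (cleanup_) {
            cleanup_();
        }
    }
    ScopeExit(const ScopeExit &) = delete;
    ScopeExit &operator=(const ScopeExit &) = delete;

private:
    std::function<void()> cleanup_;
};

// Hypothetical stand-ins for the arrow result and the query/destroy calls.
struct FakeArrowResult {
    bool allocated = false;
};

bool FakeQueryArrow(bool should_succeed, FakeArrowResult &out) {
    out.allocated = true; // the result holds state (e.g. an error) even on failure
    return should_succeed;
}

void FakeDestroyArrow(FakeArrowResult &result) { result.allocated = false; }

// Runs a "query" and guarantees the destroy call on every return path.
bool RunQuery(bool should_succeed, FakeArrowResult &result) {
    bool ok = FakeQueryArrow(should_succeed, result);
    assert(result.allocated);
    ScopeExit guard([&result] { FakeDestroyArrow(result); });
    if (!ok) {
        return false; // error path: the guard still releases the result
    }
    // ... consume the result here ...
    return true;
}
```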

duckdb_query_arrow_schema Fetch the internal arrow schema from the arrow result. Remember to call release on the respective
ArrowSchema object.


Syntax

duckdb_state duckdb_query_arrow_schema(
duckdb_arrow result,
duckdb_arrow_schema *out_schema
);

Parameters

• result

The result to fetch the schema from.

• out_schema

The output schema.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_prepared_arrow_schema Fetch the internal arrow schema from the prepared statement. Remember to call release on the
respective ArrowSchema object.

Syntax

duckdb_state duckdb_prepared_arrow_schema(
duckdb_prepared_statement prepared,
duckdb_arrow_schema *out_schema
);

Parameters

• prepared

The prepared statement to fetch the schema from.

• out_schema

The output schema.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_result_arrow_array Convert a data chunk into an arrow struct array. Remember to call release on the respective
ArrowArray object.

Syntax

void duckdb_result_arrow_array(
duckdb_result result,
duckdb_data_chunk chunk,
duckdb_arrow_array *out_array
);


Parameters

• result

The result object the data chunk has been fetched from.

• chunk

The data chunk to convert.

• out_array

The output array.

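duckdb_arrow_rows_changed Returns the number of rows changed by the query stored in the arrow result.

Syntax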

idx_t duckdb_arrow_rows_changed(
duckdb_arrow result
);

Parameters

• result

The result object.

• returns

The number of rows changed.

duckdb_query_arrow_error Returns the error message contained within the result. The error is only set if
duckdb_query_arrow returns DuckDBError.

The error message should not be freed. It will be de‑allocated when duckdb_destroy_arrow is called.

Syntax

const char *duckdb_query_arrow_error(
duckdb_arrow result
);

Parameters

• result

The result object to fetch the error from.

• returns

The error of the result.

duckdb_destroy_arrow Closes the result and de‑allocates all memory allocated for the arrow result.


Syntax

void duckdb_destroy_arrow(
duckdb_arrow *result
);

Parameters

• result

The result to destroy.

duckdb_destroy_arrow_stream Releases the arrow array stream and de‑allocates its memory.

Syntax

void duckdb_destroy_arrow_stream(
duckdb_arrow_stream *stream_p
);

Parameters

• stream_p

The arrow array stream to destroy.

duckdb_execute_prepared_arrow Executes the prepared statement with the given bound parameters, and returns an arrow
query result. Note that after running duckdb_execute_prepared_arrow, duckdb_destroy_arrow must be called on the result
object.

Syntax

duckdb_state duckdb_execute_prepared_arrow(
duckdb_prepared_statement prepared_statement,
duckdb_arrow *out_result
);

Parameters

• prepared_statement

The prepared statement to execute.

• out_result

The query result.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_arrow_scan Scans the Arrow stream and creates a view with the given name.


Syntax
duckdb_state duckdb_arrow_scan(
duckdb_connection connection,
const char *table_name,
duckdb_arrow_stream arrow
);

Parameters

• connection

The connection on which to execute the scan.

• table_name

Name of the temporary view to create.

• arrow

Arrow stream wrapper.

• returns

DuckDBSuccess on success or DuckDBError on failure.

duckdb_arrow_array_scan Scans the Arrow array and creates a view with the given name. Note that after running
duckdb_arrow_array_scan, duckdb_destroy_arrow_stream must be called on the out stream.

Syntax
duckdb_state duckdb_arrow_array_scan(
duckdb_connection connection,
const char *table_name,
duckdb_arrow_schema arrow_schema,
duckdb_arrow_array arrow_array,
duckdb_arrow_stream *out_stream
);

Parameters

• connection

The connection on which to execute the scan.

• table_name

Name of the temporary view to create.

• arrow_schema

Arrow schema wrapper.

• arrow_array

Arrow array wrapper.

• out_stream

Output array stream that wraps around the passed schema, for releasing/deleting once done.

• returns

DuckDBSuccess on success or DuckDBError on failure.


duckdb_execute_tasks Execute DuckDB tasks on this thread.

Will return after max_tasks have been executed, or if there are no more tasks present.

Syntax
void duckdb_execute_tasks(
duckdb_database database,
idx_t max_tasks
);

Parameters

• database

The database object to execute tasks for

• max_tasks

The maximum number of tasks to execute

duckdb_create_task_state Creates a task state that can be used with duckdb_execute_tasks_state to execute tasks until
duckdb_finish_execution is called on the state.

duckdb_destroy_task_state must be called on the result.

Syntax
duckdb_task_state duckdb_create_task_state(
duckdb_database database
);

Parameters

• database

The database object to create the task state for

• returns

The task state that can be used with duckdb_execute_tasks_state.

duckdb_execute_tasks_state Execute DuckDB tasks on this thread.

The thread will keep on executing tasks forever, until duckdb_finish_execution is called on the state. Multiple threads can share the same
duckdb_task_state.

Syntax
void duckdb_execute_tasks_state(
duckdb_task_state state
);

Parameters

• state

The task state of the executor


duckdb_execute_n_tasks_state Execute DuckDB tasks on this thread.

The thread will keep on executing tasks until either duckdb_finish_execution is called on the state, max_tasks tasks have been executed or
there are no more tasks to be executed.

Multiple threads can share the same duckdb_task_state.

Syntax
idx_t duckdb_execute_n_tasks_state(
duckdb_task_state state,
idx_t max_tasks
);

Parameters

• state

The task state of the executor

• max_tasks

The maximum number of tasks to execute

• returns

The number of tasks that have actually been executed
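
The three stop conditions described for duckdb_execute_n_tasks_state (the state is finished, max_tasks have run, or no tasks remain) can be illustrated with a small stand-alone model. TaskStateModel, Schedule, ExecuteNTasks, FinishExecution and IsFinished below are hypothetical names invented for this sketch; they only imitate the documented behavior and say nothing about how DuckDB's scheduler is implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical model of a task state that several threads could share.
struct TaskStateModel {
    std::mutex lock;
    std::deque<std::function<void()>> tasks;
    std::atomic<bool> finished{false};
};

// Adds a task to the shared queue.
void Schedule(TaskStateModel &state, std::function<void()> task) {
    assert(task != nullptr);
    std::lock_guard<std::mutex> guard(state.lock);
    state.tasks.push_back(std::move(task));
}

// Imitates duckdb_execute_n_tasks_state: runs tasks on the calling thread
// until the state is finished, max_tasks have run, or the queue is empty.
std::size_t ExecuteNTasks(TaskStateModel &state, std::size_t max_tasks) {
    std::size_t executed = 0;
    while (executed < max_tasks && !state.finished.load()) {
        std::function<void()> task;
        {
            std::lock_guard<std::mutex> guard(state.lock);
            if (state.tasks.empty()) {
                break;
            }
            task = std::move(state.tasks.front());
            state.tasks.pop_front();
        }
        task(); // run outside the lock so other threads can keep dequeuing
        ++executed;
    }
    return executed;
}

// Imitates duckdb_finish_execution and duckdb_task_state_is_finished.
void FinishExecution(TaskStateModel &state) { state.finished.store(true); }
bool IsFinished(const TaskStateModel &state) { return state.finished.load(); }
```

The return value plays the same role as in the signature above: it lets the caller see how many tasks were actually executed, which may be fewer than max_tasks.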

duckdb_finish_execution Finish execution on a specific task.

Syntax
void duckdb_finish_execution(
duckdb_task_state state
);

Parameters

• state

The task state to finish execution

duckdb_task_state_is_finished Check if the provided duckdb_task_state has finished execution

Syntax
bool duckdb_task_state_is_finished(
duckdb_task_state state
);

Parameters

• state

The task state to inspect

• returns

Whether or not duckdb_finish_execution has been called on the task state


duckdb_destroy_task_state Destroys the task state returned from duckdb_create_task_state.


C++ API

Installation

The DuckDB C++ API can be installed as part of the libduckdb packages. Please see the installation page for details.

Basic API Usage

DuckDB implements a custom C++ API. This is built around the abstractions of a database instance (DuckDB class), multiple Connections
to the database instance and QueryResult instances as the result of queries. The header file for the C++ API is duckdb.hpp.

Note. The standard source distribution of libduckdb contains an "amalgamation" of the DuckDB sources, which combines all
sources into two files, duckdb.hpp and duckdb.cpp. The duckdb.hpp header is much larger in this case. Regardless of whether
you are using the amalgamation or not, just include duckdb.hpp.

Startup & Shutdown To use DuckDB, you must first initialize a DuckDB instance using its constructor. DuckDB() takes as parameter
the database file to read and write from. The special value nullptr can be used to create an in‑memory database. Note that for an
in‑memory database no data is persisted to disk (i.e., all data is lost when you exit the process). The second parameter to the DuckDB
constructor is an optional DBConfig object. In DBConfig, you can set various database parameters, for example the read/write mode
or memory limits. The DuckDB constructor may throw exceptions, for example if the database file is not usable.

With the DuckDB instance, you can create one or many Connection instances using the Connection() constructor. While connections
should be thread‑safe, they will be locked during querying. It is therefore recommended that each thread uses its own connection if you
are in a multithreaded environment.

DuckDB db(nullptr);
Connection con(db);

Querying Connections expose the Query() method to send a SQL query string to DuckDB from C++. Query() fully materializes the
query result as a MaterializedQueryResult in memory before returning at which point the query result can be consumed. There is
also a streaming API for queries, see further below.

// create a table
con.Query("CREATE TABLE integers (i INTEGER, j INTEGER)");

// insert three rows into the table
con.Query("INSERT INTO integers VALUES (3, 4), (5, 6), (7, NULL)");

// Query() returns a std::unique_ptr<MaterializedQueryResult>, so keep the pointer
auto result = con.Query("SELECT * FROM integers");
if (!result->success) {
    std::cerr << result->error;
}

The MaterializedQueryResult instance first of all contains two fields that indicate whether the query was successful. Query will not
throw exceptions under normal circumstances. Instead, invalid queries or other issues will cause the success boolean field in the query
result instance to be set to false. In this case, an error message may be available in error as a string. If the query succeeds, other fields are set:
the type of statement that was just executed (e.g., StatementType::INSERT_STATEMENT) is contained in statement_type. The
high-level ("logical type"/"SQL type") types of the result set columns are in types. The names of the result columns are in the names
string vector. In case multiple result sets are returned, for example because the query string contained multiple statements, the result sets can
be chained using the next field.

DuckDB also supports prepared statements in the C++ API with the Prepare() method. This returns an instance of
PreparedStatement. This instance can be used to execute the prepared statement with parameters. Below is an example:

std::unique_ptr<PreparedStatement> prepare = con.Prepare("SELECT count(*) FROM a WHERE i = $1");

std::unique_ptr<QueryResult> result = prepare->Execute(12);


Warning. Do not use prepared statements to insert large amounts of data into DuckDB. See the data import documentation
for better options.

UDF API The UDF API allows the definition of user-defined functions. It is exposed in duckdb::Connection through the methods
CreateScalarFunction(), CreateVectorizedFunction(), and variants. These methods create UDFs in the temporary
schema (TEMP_SCHEMA) of the owning connection, which is the only connection allowed to use and change them.

CreateScalarFunction The user can write an ordinary scalar function and call CreateScalarFunction() to register it, and
afterward use the UDF in a SELECT statement, for instance:

bool bigger_than_four(int value) {
    return value > 4;
}

connection.CreateScalarFunction<bool, int>("bigger_than_four", &bigger_than_four);

connection.Query("SELECT bigger_than_four(i) FROM (VALUES(3), (5)) tbl(i)")->Print();

The CreateScalarFunction() methods automatically create vectorized scalar UDFs, so they are as efficient as built-in functions. There are
two variants of this method interface, as follows:

template<typename TR, typename... Args>
void CreateScalarFunction(string name, TR (*udf_func)(Args...))

• template parameters:

– TR is the return type of the UDF function;

– Args are the argument types of the UDF function, up to 3 (this method supports at most ternary functions);

• name: the name under which to register the UDF function;

• udf_func: a pointer to the UDF function.
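
To make the idea of turning a row-at-a-time function into a vectorized one more concrete, the following stand-alone sketch applies a unary function pointer to a whole column of values. ApplyUnary is a hypothetical helper written for this illustration and is not how CreateScalarFunction() is implemented; bigger_than_four is the function from the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The scalar UDF from the example above.
bool bigger_than_four(int value) {
    return value > 4;
}

// Hypothetical helper: lifts a unary scalar function to operate on a column.
// TR and TA play the roles of the return and argument template parameters.
template <typename TR, typename TA>
std::vector<TR> ApplyUnary(TR (*udf_func)(TA), const std::vector<TA> &input) {
    assert(udf_func != nullptr);
    std::vector<TR> result;
    result.reserve(input.size());
    for (const TA &value : input) {
        result.push_back(udf_func(value)); // one call per row
    }
    return result;
}
```

Calling ApplyUnary(&bigger_than_four, std::vector<int>{3, 5}) yields one boolean per input row, which corresponds to the two rows produced by the SELECT statement in the example above.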

This method automatically discovers from the template typenames the corresponding LogicalTypes:

• bool → LogicalType::BOOLEAN
• int8_t → LogicalType::TINYINT
• int16_t → LogicalType::SMALLINT
• int32_t → LogicalType::INTEGER
• int64_t → LogicalType::BIGINT
• float → LogicalType::FLOAT
• double → LogicalType::DOUBLE
• string_t → LogicalType::VARCHAR
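
One way to picture how types can be discovered from template typenames is a trait that is specialized once per supported C++ type. The sketch below is a stand-alone illustration under that assumption: LogicalTypeName, InferSignature and string_t_model are hypothetical names, the type names are returned as plain strings instead of LogicalType objects, and DuckDB's actual implementation may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for DuckDB's string_t.
struct string_t_model {};

// Trait mapping a C++ type to the name of its LogicalType (see the list above).
// Unsupported types have no specialization and fail to compile.
template <typename T> struct LogicalTypeName;
template <> struct LogicalTypeName<bool> { static const char *Get() { return "BOOLEAN"; } };
template <> struct LogicalTypeName<int8_t> { static const char *Get() { return "TINYINT"; } };
template <> struct LogicalTypeName<int16_t> { static const char *Get() { return "SMALLINT"; } };
template <> struct LogicalTypeName<int32_t> { static const char *Get() { return "INTEGER"; } };
template <> struct LogicalTypeName<int64_t> { static const char *Get() { return "BIGINT"; } };
template <> struct LogicalTypeName<float> { static const char *Get() { return "FLOAT"; } };
template <> struct LogicalTypeName<double> { static const char *Get() { return "DOUBLE"; } };
template <> struct LogicalTypeName<string_t_model> { static const char *Get() { return "VARCHAR"; } };

// Returns the inferred argument types followed by the return type.
template <typename TR, typename... Args>
std::vector<std::string> InferSignature() {
    static_assert(sizeof...(Args) <= 3, "at most ternary functions are supported");
    std::vector<std::string> types = {LogicalTypeName<Args>::Get()..., LogicalTypeName<TR>::Get()};
    assert(types.size() == sizeof...(Args) + 1);
    return types;
}
```

For the template arguments of the earlier example, InferSignature<bool, int32_t>() produces the argument type INTEGER followed by the return type BOOLEAN.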

In DuckDB, some primitive types are shared by several LogicalTypes: for example, int32_t backs INTEGER, TIME and DATE. To
disambiguate, users can use the following overloaded method.

template<typename TR, typename... Args>
void CreateScalarFunction(string name, vector<LogicalType> args, LogicalType ret_type, TR (*udf_func)(Args...))

An example of use would be:

int32_t udf_date(int32_t a) {
    return a;
}

con.Query("CREATE TABLE dates (d DATE)");
con.Query("INSERT INTO dates VALUES ('1992-01-01')");
con.CreateScalarFunction<int32_t, int32_t>("udf_date", {LogicalType::DATE}, LogicalType::DATE, &udf_date);

con.Query("SELECT udf_date(d) FROM dates")->Print();

• template parameters:

– TR is the return type of the UDF function;

– Args are the argument types of the UDF function, up to 3 (this method supports at most ternary functions);

• name: the name under which to register the UDF function;

• args: the LogicalType arguments that the function uses, which should match the template Args types;
• ret_type: the LogicalType returned by the function, which should match the template TR type;
• udf_func: a pointer to the UDF function.

This function checks the template types against the LogicalTypes passed as arguments, and they must match as follows:

• LogicalTypeId::BOOLEAN → bool
• LogicalTypeId::TINYINT → int8_t
• LogicalTypeId::SMALLINT → int16_t
• LogicalTypeId::DATE, LogicalTypeId::TIME, LogicalTypeId::INTEGER → int32_t
• LogicalTypeId::BIGINT, LogicalTypeId::TIMESTAMP → int64_t
• LogicalTypeId::FLOAT, LogicalTypeId::DOUBLE, LogicalTypeId::DECIMAL → double
• LogicalTypeId::VARCHAR, LogicalTypeId::CHAR, LogicalTypeId::BLOB → string_t
• LogicalTypeId::VARBINARY → blob_t

CreateVectorizedFunction The CreateVectorizedFunction() methods register a vectorized UDF such as:

/*
 * This vectorized function copies the input values to the result vector
 */
template<typename TYPE>
static void udf_vectorized(DataChunk &args, ExpressionState &state, Vector &result) {
    // set the result vector type
    result.vector_type = VectorType::FLAT_VECTOR;
    // get a raw array from the result
    auto result_data = FlatVector::GetData<TYPE>(result);

    // get the sole input vector
    auto &input = args.data[0];
    // now get an orrified vector
    VectorData vdata;
    input.Orrify(args.size(), vdata);

    // get a raw array from the orrified input
    auto input_data = (TYPE *)vdata.data;

    // handle the data, skipping NULL entries
    for (idx_t i = 0; i < args.size(); i++) {
        auto idx = vdata.sel->get_index(i);
        if ((*vdata.nullmask)[idx]) {
            continue;
        }
        result_data[i] = input_data[idx];
    }
}


con.Query("CREATE TABLE integers (i INTEGER)");

con.Query("INSERT INTO integers VALUES (1), (2), (3), (999)");

con.CreateVectorizedFunction<int, int>("udf_vectorized_int", &udf_vectorized<int>);

con.Query("SELECT udf_vectorized_int(i) FROM integers")->Print();

The Vectorized UDF is a pointer of the type scalar_function_t:

typedef std::function<void(DataChunk &args, ExpressionState &expr, Vector &result)> scalar_function_t;

• args is a DataChunk that holds a set of input vectors for the UDF that all have the same length;
• expr is an ExpressionState that provides information about the query's expression state;
• result is a Vector to store the result values.

There are different vector types to handle in a Vectorized UDF:

• ConstantVector;
• DictionaryVector;
• FlatVector;
• ListVector;
• StringVector;
• StructVector;
• SequenceVector.
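
The example above does not branch on the vector type. Instead, it reads the input through a selection vector and a null mask, so a single loop can serve differently encoded inputs. The stand-alone sketch below models that idea for constant and dictionary-encoded inputs. UnifiedViewModel, ResultModel, FromConstant, FromDictionary and CopyValues are hypothetical names for this illustration and are not DuckDB types. Unlike the example above, the model also records NULLs in its output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model of a unified view over an input vector.
struct UnifiedViewModel {
    std::vector<int32_t> data;    // physical values
    std::vector<std::size_t> sel; // logical row -> index into data
    std::vector<bool> is_null;    // NULL flag per physical value
};

struct ResultModel {
    std::vector<int32_t> values;
    std::vector<bool> is_null;
};

// A constant vector stores one value that every row points to.
UnifiedViewModel FromConstant(int32_t value, std::size_t count) {
    return UnifiedViewModel{{value}, std::vector<std::size_t>(count, 0), {false}};
}

// A dictionary vector stores distinct values plus one code per row.
UnifiedViewModel FromDictionary(std::vector<int32_t> dictionary, std::vector<bool> nulls,
                                std::vector<std::size_t> codes) {
    return UnifiedViewModel{std::move(dictionary), std::move(codes), std::move(nulls)};
}

// Same loop shape as udf_vectorized: translate the row index, skip NULLs, copy.
ResultModel CopyValues(const UnifiedViewModel &input, std::size_t count) {
    assert(input.sel.size() >= count);
    ResultModel result;
    result.values.assign(count, 0);
    result.is_null.assign(count, false);
    for (std::size_t i = 0; i < count; i++) {
        std::size_t idx = input.sel[i];
        if (input.is_null[idx]) {
            result.is_null[i] = true;
            continue;
        }
        result.values[i] = input.data[idx];
    }
    return result;
}
```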

The general API of the CreateVectorizedFunction() method is as follows:

template<typename TR, typename... Args>
void CreateVectorizedFunction(string name, scalar_function_t udf_func, LogicalType varargs = LogicalType::INVALID)

• template parameters:

– TR is the return type of the UDF function;

– Args are the arguments up to 3 for the UDF function.

• name is the name to register the UDF function;

• udf_func is a vectorized UDF function;
• varargs The type of varargs to support, or LogicalTypeId::INVALID (default value) if the function does not accept variable length
arguments.

This method automatically discovers from the template typenames the corresponding LogicalTypes:

• bool → LogicalType::BOOLEAN
• int8_t → LogicalType::TINYINT
• int16_t → LogicalType::SMALLINT
• int32_t → LogicalType::INTEGER
• int64_t → LogicalType::BIGINT
• float → LogicalType::FLOAT
• double → LogicalType::DOUBLE
• string_t → LogicalType::VARCHAR

template<typename TR, typename... Args>
void CreateVectorizedFunction(string name, vector<LogicalType> args, LogicalType ret_type, scalar_function_t udf_func, LogicalType varargs = LogicalType::INVALID)


CLI

CLI API

Installation

The DuckDB CLI (Command Line Interface) is a single, dependency‑free executable. It is precompiled for Windows, Mac, and Linux for both
the stable version and for nightly builds produced by GitHub Actions. Please see the installation page under the CLI tab for download
links.

The DuckDB CLI is based on the SQLite command line shell, so CLI‑client‑specific functionality is similar to what is described in the SQLite
documentation (although DuckDB's SQL syntax follows PostgreSQL conventions).

Note. DuckDB has a tldr page that summarizes the most common uses of the CLI client. If you have tldr installed, you can display
it by running tldr duckdb.

Getting Started

Once the CLI executable has been downloaded, unzip it and save it to any directory. Navigate to that directory in a terminal and enter the
command duckdb to run the executable. If in a PowerShell or POSIX shell environment, use the command ./duckdb instead.

Usage

The typical usage of the duckdb command is the following:

$ duckdb [OPTIONS] [FILENAME]

Options The [OPTIONS] part encodes arguments for the CLI client. Common options include:

• -csv: sets the output mode to CSV

• -json: sets the output mode to JSON
• -readonly: open the database in read‑only mode (see concurrency in DuckDB)

For a full list of options, see the command line arguments page.

In‑Memory vs. Persistent Database When no [FILENAME] argument is provided, the DuckDB CLI will open a temporary in‑memory
database. You will see DuckDB's version number, the information on the connection and a prompt starting with a D.

$ duckdb

v0.10.0 20b1486d11
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D

To open or create a persistent database, simply include a path as a command line argument, like duckdb
path/to/my_database.duckdb or duckdb my_database.db.

Running SQL Statements in the CLI Once the CLI has been opened, enter a SQL statement followed by a semicolon, then hit enter and
it will be executed. Results will be displayed in a table in the terminal. If a semicolon is omitted, hitting enter will allow for multi‑line SQL
statements to be entered.

SELECT 'quack' AS my_column;

┌───────────┐
│ my_column │
│  varchar  │
├───────────┤
│ quack     │
└───────────┘

The CLI supports all of DuckDB's rich SQL syntax including SELECT, CREATE, and ALTER statements.

Editor Features The CLI supports autocompletion, and has sophisticated editor features and syntax highlighting on certain platforms.

Exiting the CLI To exit the CLI, press Ctrl-D if your platform supports it. Otherwise, press Ctrl-C or use the .exit command. If you used
a persistent database, DuckDB will automatically checkpoint (save the latest edits to disk) and close. This will remove the .wal file (the
Write-Ahead-Log) and consolidate all of your data into the single-file database.

Dot Commands In addition to SQL syntax, special dot commands may be entered into the CLI client. To use one of these commands,
begin the line with a period (.) immediately followed by the name of the command you wish to execute. Additional arguments to the
command are entered, space separated, after the command. If an argument must contain a space, either single or double quotes may
be used to wrap that parameter. Dot commands must be entered on a single line, and no whitespace may occur before the period. No
semicolon is required at the end of the line.
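
The syntax rules above (leading period, no whitespace before it, space-separated arguments, optional single or double quotes) can be sketched as a small tokenizer. ParseDotCommand is a hypothetical function written for this illustration; it follows only the rules stated in the paragraph above and is not the CLI's actual parser, which may treat edge cases differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tokenizer for a dot command line. On success, returns true and
// fills `tokens` with the command name (without the period) and its arguments.
bool ParseDotCommand(const std::string &line, std::vector<std::string> &tokens) {
    tokens.clear();
    // The period must be the first character, immediately followed by the name.
    if (line.size() < 2 || line[0] != '.' || line[1] == ' ') {
        return false;
    }
    std::string current;
    bool in_token = false;
    char quote = '\0'; // active quote character, or '\0' outside quotes
    for (std::size_t i = 1; i < line.size(); i++) {
        char c = line[i];
        if (quote != '\0') {
            if (c == quote) {
                quote = '\0'; // closing quote
            } else {
                current.push_back(c); // spaces are kept inside quotes
            }
        } else if (c == '\'' || c == '"') {
            quote = c;
            in_token = true;
        } else if (c == ' ') {
            if (in_token) {
                tokens.push_back(current);
                current.clear();
                in_token = false;
            }
        } else {
            current.push_back(c);
            in_token = true;
        }
    }
    if (quote != '\0') {
        tokens.clear();
        return false; // unterminated quote (treated as an error in this sketch)
    }
    if (in_token) {
        tokens.push_back(current);
    }
    assert(!tokens.empty());
    return true;
}
```

For instance, the line .open --readonly 'my db.duckdb' splits into the three tokens open, --readonly and my db.duckdb, while a line with a space before the period is rejected.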

Frequently-used configurations can be stored in the file ~/.duckdbrc, which will be loaded when starting the CLI client. See the
Configuring the CLI section below for further information on these options.

Below, we summarize a few important dot commands. To see all available commands, see the dot commands page or use the .help
command.

Opening Database Files In addition to connecting to a database when opening the CLI, a new database connection can be made by using
the .open command. If no additional parameters are supplied, a new in‑memory database connection is created. This database will not
be persisted when the CLI connection is closed.

.open

The .open command optionally accepts several options, but the final parameter can be used to indicate a path to a persistent database
(or where one should be created). The special string :memory: can also be used to open a temporary in‑memory database.

.open persistent.duckdb

One important option accepted by .open is the --readonly flag. This disallows any editing of the database. To open in read only mode,
the database must already exist. This also means that a new in‑memory database can't be opened in read only mode since in‑memory
databases are created upon connection.

.open --readonly preexisting.duckdb

Output Formats The .mode dot command may be used to change the appearance of the tables returned in the terminal output. These
include the default duckbox mode, csv and json mode for ingestion by other tools, markdown and latex for documents, and insert
mode for generating SQL statements.

Writing Results to a File By default, the DuckDB CLI sends results to the terminal's standard output. However, this can be modified using
either the .output or .once commands. For details, see the documentation for the output dot command.


Reading SQL from a File The DuckDB CLI can read both SQL commands and dot commands from an external file instead of the terminal
using the .read command. This allows for a number of commands to be run in sequence and allows command sequences to be saved
and reused.

The .read command requires only one argument: the path to the file containing the SQL and/or commands to execute. After running the
commands in the file, control will revert back to the terminal. Output from the execution of that file is governed by the same .output and
.once commands that have been discussed previously. This allows the output to be displayed back to the terminal, as in the first example
below, or out to another file, as in the second example.

In this example, the file select_example.sql is located in the same directory as duckdb.exe and contains the following SQL
statement:

SELECT *
FROM generate_series(5);

To execute it from the CLI, the .read command is used.

.read select_example.sql

The output below is returned to the terminal by default. The formatting of the table can be adjusted using the .output or .once
commands.

| generate_series |
|-----------------|
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |

Multiple commands, including both SQL and dot commands, can also be run in a single .read command. In this example, the file
write_markdown_to_file.sql is located in the same directory as duckdb.exe and contains the following commands:

.mode markdown
.output series.md
SELECT *
FROM generate_series(5);

To execute it from the CLI, the .read command is used as before.

.read write_markdown_to_file.sql

In this case, no output is returned to the terminal. Instead, the file series.md is created (or replaced if it already existed) with the
markdown‑formatted results shown here:

| generate_series |
|-----------------|
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |

Configuring the CLI

Several dot commands can be used to configure the CLI. On startup, the CLI reads and executes all commands in the file ~/.duckdbrc,
including dot commands and SQL statements. This allows you to store the configuration state of the CLI. You may also point to a different
initialization file using the -init option.


Setting a Custom Prompt As an example, a file in the same directory as the DuckDB CLI named prompt.sql will change the DuckDB
prompt to be a duck head and run a SQL statement. Note that the duck head is built with Unicode characters and does not work in all
terminal environments (e.g., in Windows, unless running with WSL and using the Windows Terminal).

.prompt ' '

To invoke that file on initialization, use this command:

$ duckdb -init prompt.sql

This outputs:

-- Loading resources from prompt.sql

v version git hash
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

Non‑Interactive Usage

To read/process a file and exit immediately, pipe the file contents in to duckdb:

$ duckdb < select_example.sql

To execute a command with SQL text passed in directly from the command line, call duckdb with two arguments: the database location
(or :memory:), and a string with the SQL statement to execute.

$ duckdb :memory: "SELECT 42 AS the_answer"

Loading Extensions

To load extensions, use DuckDB's SQL INSTALL and LOAD commands as you would other SQL statements.

INSTALL fts;
LOAD fts;

For details, see the Extension docs.

Reading from stdin and Writing to stdout

When in a Unix environment, it can be useful to pipe data between multiple commands. DuckDB is able to read data from stdin as well
as write to stdout using the file location of stdin (/dev/stdin) and stdout (/dev/stdout) within SQL commands, as pipes act very
similarly to file handles.

This command will create an example CSV:

COPY (SELECT 42 AS woot UNION ALL SELECT 43 AS woot) TO 'test.csv' (HEADER);

First, read a file and pipe it to the duckdb CLI executable. As arguments to the DuckDB CLI, pass in the location of the database to open,
in this case, an in‑memory database, and a SQL command that utilizes /dev/stdin as a file location.

$ cat test.csv | duckdb :memory: "SELECT * FROM read_csv('/dev/stdin')"

┌───────┐
│ woot  │
│ int32 │
├───────┤
│    42 │
│    43 │
└───────┘


To write back to stdout, the copy command can be used with the /dev/stdout file location.

$ cat test.csv | duckdb :memory: "COPY (SELECT * FROM read_csv('/dev/stdin')) TO '/dev/stdout' WITH (FORMAT 'csv', HEADER)"

woot
42
43

Reading Environment Variables

The getenv function can read environment variables.

Examples To retrieve the home directory's path from the HOME environment variable, use:

SELECT getenv('HOME') AS home;

┌──────────────────┐
│       home       │
│     varchar      │
├──────────────────┤
│ /Users/user_name │
└──────────────────┘

The output of the getenv function can be used to set configuration options. For example, to set the NULL order based on the environment
variable DEFAULT_NULL_ORDER, use:

SET default_null_order = getenv('DEFAULT_NULL_ORDER');

Restrictions for Reading Environment Variables The getenv function can only be run when enable_external_access is set
to true (the default setting). It is only available in the CLI client and is not supported in other DuckDB clients.

Prepared Statements

The DuckDB CLI supports executing prepared statements in addition to regular SELECT statements. To create and execute a prepared
statement in the CLI client, use the PREPARE clause and the EXECUTE statement.

Command Line Arguments

The table below summarizes DuckDB's command line options. To list all command line options, use the command duckdb -help. For
a list of dot commands available in the CLI shell, see the Dot Commands page.

Argument         Description

-append          Append the database to the end of the file
-ascii           Set output mode to ascii
-bail            Stop after hitting an error
-batch           Force batch I/O
-box             Set output mode to box
-column          Set output mode to column
-cmd COMMAND     Run COMMAND before reading stdin
-c COMMAND       Run COMMAND and exit
-csv             Set output mode to csv
-echo            Print commands before execution
-init FILENAME   Run the script in FILENAME upon startup (instead of ~/.duckdbrc)
-header          Turn headers on
-help            Show this message
-html            Set output mode to HTML
-interactive     Force interactive I/O
-json            Set output mode to json
-line            Set output mode to line
-list            Set output mode to list
-markdown        Set output mode to markdown
-newline SEP     Set output row separator. Default: \n
-nofollow        Refuse to open symbolic links to database files
-noheader        Turn headers off
-no-stdin        Exit after processing options instead of reading stdin
-nullvalue TEXT  Set text string for NULL values. Default: empty string
-quote           Set output mode to quote
-readonly        Open the database read-only
-s COMMAND       Run COMMAND and exit
-separator SEP   Set output column separator to SEP. Default: |
-stats           Print memory stats before each finalize
-table           Set output mode to table
-unsigned        Allow loading of unsigned extensions
-version         Show DuckDB version

Dot Commands

Dot commands are available in the DuckDB CLI client. To use one of these commands, begin the line with a period (.) immediately followed
by the name of the command you wish to execute. Additional arguments to the command are entered, space separated, after the command.
If an argument must contain a space, either single or double quotes may be used to wrap that parameter. Dot commands must be entered
on a single line, and no whitespace may occur before the period. No semicolon is required at the end of the line. To see available commands,
use the .help command.
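
The rules above can be illustrated with a small Python sketch that decides whether an input line is a dot command and splits it into a command name and arguments. This is only an approximation for illustration, not the CLI's actual parser: it relies on Python's shlex module for the quoting rules, and the function name parse_dot_command is hypothetical.

```python
import shlex


def parse_dot_command(line):
    """Return (command, args) for a dot command line, or None otherwise."""
    # No whitespace may occur before the period, and the command name
    # must follow the period immediately.
    if len(line) < 2 or not line.startswith(".") or line[1].isspace():
        return None
    # shlex honours single and double quotes, so a quoted argument may
    # contain spaces. No trailing semicolon is expected.
    parts = shlex.split(line[1:])
    return parts[0], parts[1:]


print(parse_dot_command(".mode csv"))
print(parse_dot_command(".print 'taking flight' \"quack quack\""))
print(parse_dot_command("SELECT 42;"))
```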

Dot Commands

Command Description

.bail on|off Stop after hitting an error. Default: off

.binary on|off Turn binary output on or off. Default: off
.cd DIRECTORY Change the working directory to DIRECTORY
.changes on|off Show number of rows changed by SQL
.check GLOB Fail if output since .testcase does not match
.columns Column‑wise rendering of query results
.constant ?COLOR? Sets the syntax highlighting color used for constant values
.constantcode ?CODE? Sets the syntax highlighting terminal code used for constant values
.databases List names and files of attached databases
.echo on|off Turn command echo on or off
.excel Display the output of next command in spreadsheet
.exit ?CODE? Exit this program with return‑code CODE
.explain ?on|off|auto? Change the EXPLAIN formatting mode. Default: auto
.fullschema ?--indent? Show schema and the content of sqlite_stat tables
.headers on|off Turn display of headers on or off
.help ?-all? ?PATTERN? Show help text for PATTERN
.highlight [on|off] Toggle syntax highlighting in the shell on/off
.import FILE TABLE Import data from FILE into TABLE
.indexes ?TABLE? Show names of indexes
.keyword ?COLOR? Sets the syntax highlighting color used for keywords
.keywordcode ?CODE? Sets the syntax highlighting terminal code used for keywords
.lint OPTIONS Report potential schema issues.
.log FILE|off Turn logging on or off. FILE can be stderr/stdout
.maxrows COUNT Sets the maximum number of rows for display. Only for duckbox mode
.maxwidth COUNT Sets the maximum width in characters. 0 defaults to terminal width. Only for duckbox mode
.mode MODE ?TABLE? Set output mode
.nullvalue STRING Use STRING in place of NULL values
.once ?OPTIONS? ?FILE? Output for the next SQL command only to FILE
.open ?OPTIONS? ?FILE? Close existing database and reopen FILE
.output ?FILE? Send output to FILE or stdout if FILE is omitted
.parameter CMD ... Manage SQL parameter bindings
.print STRING... Print literal STRING
.prompt MAIN CONTINUE Replace the standard prompts
.quit Exit this program
.read FILE Read input from FILE
.rows Row‑wise rendering of query results (default)
.schema ?PATTERN? Show the CREATE statements matching PATTERN
.separator COL ?ROW? Change the column and row separators
.sha3sum ... Compute a SHA3 hash of database content
.shell CMD ARGS... Run CMD ARGS... in a system shell
.show Show the current values for various settings
.system CMD ARGS... Run CMD ARGS... in a system shell
.tables ?TABLE? List names of tables matching LIKE pattern TABLE
.testcase NAME Begin redirecting output to NAME
.timer on|off Turn SQL timer on or off
.width NUM1 NUM2 ... Set minimum column widths for columnar output

Using the .help Command

The .help text may be filtered by passing in a text string as the second argument.

.help m

.maxrows COUNT Sets the maximum number of rows for display (default: 40). Only for duckbox mode.
.maxwidth COUNT Sets the maximum width in characters. 0 defaults to terminal width. Only for duckbox
mode.
.mode MODE ?TABLE? Set output mode

.output: Writing Results to a File

By default, the DuckDB CLI sends results to the terminal's standard output. However, this can be modified using either the .output or .once commands. Pass in the desired output file location as a parameter. The .once command will only output the next set of results and then revert to standard output, but .output will redirect all subsequent output to that file location. Note that each result will overwrite the entire file at that destination. To revert to standard output, enter .output with no file parameter.

In this example, the output format is changed to markdown, the destination is identified as a Markdown file, and then DuckDB will write
the output of the SQL statement to that file. Output is then reverted to standard output using .output with no parameter.

.mode markdown
.output my_results.md
SELECT 'taking flight' AS output_column;
.output
SELECT 'back to the terminal' AS displayed_column;

The file my_results.md will then contain:

| output_column |
|---------------|
| taking flight |

The terminal will then display:

|   displayed_column   |
|----------------------|
| back to the terminal |

A common output format is CSV, or comma separated values. DuckDB supports SQL syntax to export data as CSV or Parquet, but the CLI-specific
commands may be used to write a CSV instead if desired.

.mode csv
.once my_output_file.csv
SELECT 1 AS col_1, 2 AS col_2
UNION ALL
SELECT 10 AS col_1, 20 AS col_2;

The file my_output_file.csv will then contain:

col_1,col_2
1,2
10,20
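
A file written this way can be consumed by any CSV reader. As a minimal sketch, the following Python snippet parses the contents shown above with the standard csv module. To stay self-contained it reads from an in-memory string; a real script would open my_output_file.csv instead.

```python
import csv
import io

# Contents of my_output_file.csv as shown above.
csv_text = "col_1,col_2\n1,2\n10,20\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))
# The csv module returns every field as a string, so convert explicitly.
totals = {name: sum(int(row[name]) for row in rows) for name in rows[0]}

print(rows)
print(totals)
```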

By passing special options (flags) to the .once command, query results can also be sent to a temporary file and automatically opened in
the user's default program. Use either the -e flag for a text file (opened in the default text editor), or the -x flag for a CSV file (opened in
the default spreadsheet editor). This is useful for more detailed inspection of query results, especially if there is a relatively large result set.
The .excel command is equivalent to .once -x.

.once -e
SELECT 'quack' AS hello;

The results then open in the system's default text editor.

Querying the Database Schema

All DuckDB clients support querying the database schema with SQL, but the CLI has additional dot commands that can make it easier to
understand the contents of a database. The .tables command will return a list of tables in the database. It has an optional argument
that will filter the results according to a LIKE pattern.

CREATE TABLE swimmers AS SELECT 'duck' AS animal;

CREATE TABLE fliers AS SELECT 'duck' AS animal;
CREATE TABLE walkers AS SELECT 'duck' AS animal;
.tables

fliers swimmers walkers

For example, to filter to only tables that contain an "l", use the LIKE pattern %l%.

.tables %l%

fliers walkers
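
To make the pattern semantics concrete, here is a hedged Python sketch of how a LIKE pattern selects names: % matches any sequence of characters and _ matches exactly one character. It is an illustration of standard LIKE matching only (no ESCAPE clause handling), not the CLI's implementation, and like_to_regex is a hypothetical helper name.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any sequence of characters, possibly empty
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


tables = ["fliers", "swimmers", "walkers"]
matcher = like_to_regex("%l%")
# fullmatch requires the whole name to match, as LIKE does.
matching = [name for name in tables if matcher.fullmatch(name)]
print(matching)
```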

The .schema command will show all of the SQL statements used to define the schema of the database.

.schema

CREATE TABLE fliers (animal VARCHAR);

CREATE TABLE swimmers (animal VARCHAR);
CREATE TABLE walkers (animal VARCHAR);

Configuring the Syntax Highlighter

By default the shell includes support for syntax highlighting. The CLI's syntax highlighter can be configured using the following commands.

To turn off the highlighter:

.highlight off

To turn on the highlighter:

.highlight on

To configure the color used to highlight constants:

.constantcode [terminal_code]

To configure the color used to highlight keywords:

.keywordcode [terminal_code]

Importing Data from CSV

Note (deprecated). This feature is only included for compatibility reasons and may be removed in the future. Use the read_csv
function or the COPY statement to load CSV files.

DuckDB supports SQL syntax to directly query or import CSV files, but the CLI‑specific commands may be used to import a CSV instead if
desired. The .import command takes two arguments and also supports several options. The first argument is the path to the CSV file,
and the second is the name of the DuckDB table to create. Since DuckDB requires stricter typing than SQLite (upon which the DuckDB CLI
is based), the destination table must be created before using the .import command. To automatically detect the schema and create a
table from a CSV, see the read_csv examples in the import docs.

In this example, a CSV file is generated by changing to CSV mode and setting an output file location:

.mode csv
.output import_example.csv
SELECT 1 AS col_1, 2 AS col_2 UNION ALL SELECT 10 AS col_1, 20 AS col_2;

Now that the CSV has been written, a table can be created with the desired schema and the CSV can be imported. The output is reset to the
terminal to avoid continuing to edit the output file specified above. The --skip N option is used to ignore the first row of data since it is
a header row and the table has already been created with the correct column names.

.mode csv
.output
CREATE TABLE test_table (col_1 INT, col_2 INT);
.import import_example.csv test_table --skip 1

Note that the .import command utilizes the current .mode and .separator settings when identifying the structure of the data to
import. The --csv option can be used to override that behavior.

.import import_example.csv test_table --skip 1 --csv

Output Formats

The .mode dot command may be used to change the appearance of the tables returned in the terminal output. In addition to customizing
the appearance, these modes have additional benefits. This can be useful for presenting DuckDB output elsewhere by redirecting the
terminal output to a file. Using the insert mode will build a series of SQL statements that can be used to insert the data at a later point.
The markdown mode is particularly useful for building documentation and the latex mode is useful for writing academic papers.

Mode Description

ascii Columns/rows delimited by 0x1F and 0x1E

box Tables using unicode box‑drawing characters
csv Comma‑separated values
column Output in columns. (See .width)
duckbox Tables with extensive features
html HTML <table> code
insert SQL insert statements for TABLE
json Results in a JSON array
jsonlines Results in NDJSON (newline-delimited JSON)
latex LaTeX tabular environment code
line One value per line
list Values delimited by "|"
markdown Markdown table format
quote Escape answers as for SQL
table ASCII‑art table
tabs Tab‑separated values
tcl TCL list elements
trash No output

.mode markdown
SELECT 'quacking intensifies' AS incoming_ducks;

|    incoming_ducks    |
|----------------------|
| quacking intensifies |

The output appearance can also be adjusted with the .separator command. If using an export mode that relies on a separator (csv or
tabs for example), the separator will be reset when the mode is changed. For example, .mode csv will set the separator to a comma (,).
Using .separator "|" will then convert the output to be pipe‑separated.

.mode csv
SELECT 1 AS col_1, 2 AS col_2
UNION ALL
SELECT 10 AS col_1, 20 AS col_2;

col_1,col_2
1,2
10,20

.separator "|"
SELECT 1 AS col_1, 2 AS col_2
UNION ALL
SELECT 10 AS col_1, 20 AS col_2;

col_1|col_2
1|2
10|20

Editing
Note. The linenoise‑based CLI editor is currently only available for macOS and Linux.

Shift+Tab When autocompleting, cycle to previous entry
ESC+ESC When autocompleting, revert autocompletion

Miscellaneous

Key Action

Enter Execute query. If query is not complete, insert a newline at the end of the buffer
Ctrl+J Execute query. If query is not complete, insert a newline at the end of the buffer
Ctrl+C Cancel editing of current query
Ctrl+G Cancel editing of current query
Ctrl+L Clear screen
Ctrl+O Cancel editing of current query
Ctrl+X Insert a newline after the cursor
Ctrl+Z Suspend CLI and return to shell, use fg to re‑open

Using Read‑Line

If you prefer, you can use rlwrap to use read‑line directly with the shell. Then, use Shift+Enter to insert a newline and Enter to execute
the query:

rlwrap --substitute-prompt="D " duckdb -batch

Autocomplete

The shell offers context-aware autocomplete of SQL queries through the autocomplete extension. Autocomplete is triggered by pressing Tab.

Multiple autocomplete suggestions can be present. You can cycle forwards through the suggestions by repeatedly pressing Tab, or
Shift+Tab to cycle backwards. Autocompletion can be reverted by pressing ESC twice.

The shell autocompletes four different groups:

• Keywords
• Table names and table functions
• Column names and scalar functions
• File names

The shell looks at the position in the SQL statement to determine which of these autocompletions to trigger. For example:

SELECT s -> student_id

SELECT student_id F -> FROM

SELECT student_id FROM g -> grades

SELECT student_id FROM 'd -> data/

SELECT student_id FROM 'data/ -> data/grades.csv

Syntax Highlighting
Note. Syntax highlighting in the CLI is currently only available for macOS and Linux.

SQL queries that are written in the shell are automatically highlighted using syntax highlighting.

There are several components of a query that are highlighted in different colors. The colors can be configured using dot commands. Syntax
highlighting can also be disabled entirely using the .highlight off command.

Below is a list of components that can be configured.

Type Command Default Color

Keywords .keyword green
Constants and literals .constant yellow
Comments .comment brightblack
Errors .error red
Continuation .cont brightblack
Continuation (Selected) .cont_sel green

The components can be configured using either a supported color name (e.g., .keyword red), or by directly providing a terminal code to
use for rendering (e.g., .keywordcode \033[31m). Below is a list of supported color names and their corresponding terminal codes.

Color Terminal Code

red \033[31m
green \033[32m
yellow \033[33m
blue \033[34m
magenta \033[35m
cyan \033[36m
white \033[37m
brightblack \033[90m
brightred \033[91m
brightgreen \033[92m
brightyellow \033[93m
brightblue \033[94m
brightmagenta \033[95m
brightcyan \033[96m
brightwhite \033[97m
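
To see what these terminal codes do, the following Python sketch wraps text in a code from the table and then resets the terminal attributes. It only demonstrates the escape sequences themselves and does not reproduce the CLI's highlighter. The reset sequence \033[0m is the standard ANSI reset and is not part of the table above, and colorize is a hypothetical helper name.

```python
# Subset of the colour table above: colour name -> terminal escape code.
COLOR_CODES = {
    "red": "\033[31m",
    "green": "\033[32m",
    "yellow": "\033[33m",
    "brightblack": "\033[90m",
}
RESET = "\033[0m"  # standard ANSI reset (assumption: not listed in the table)


def colorize(text, color):
    """Wrap text in the escape code for color and reset afterwards."""
    return f"{COLOR_CODES[color]}{text}{RESET}"


# Keywords default to green and constants to yellow, per the component table.
print(colorize("SELECT", "green"), colorize("'quack'", "yellow"))
```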

For example, here is an alternative set of syntax highlighting colors:

.keyword brightred
.constant brightwhite
.comment cyan
.error yellow
.cont blue
.cont_sel brightblue

If you wish to start up the CLI with a different set of colors every time, you can place these commands in the ~/.duckdbrc file that is
loaded on start‑up of the CLI.

Error Highlighting

The shell has support for highlighting certain errors. In particular, mismatched brackets and unclosed quotes are highlighted in red (or
another color if specified). This highlighting is automatically disabled for large queries. In addition, it can be disabled manually using the
.render_errors off command.

Go

The DuckDB Go driver, go-duckdb, allows using DuckDB via the database/sql interface. For examples on how to use this interface,
see the official documentation and tutorial.

Note. The Go client is provided as a third-party library.

Installation

To install the go-duckdb client, run:

go get github.com/marcboeker/go-duckdb

Importing

To import the DuckDB Go package, add the following entries to your imports:

import (
"database/sql"
_ "github.com/marcboeker/go-duckdb"
)

Appender

The DuckDB Go client supports the DuckDB Appender API for bulk inserts. You can obtain a new Appender by supplying a DuckDB connection
to NewAppenderFromConn(). For example:

// Requires a named import of the driver package:
// import "github.com/marcboeker/go-duckdb"
connector, err := duckdb.NewConnector("test.db", nil)
if err != nil {
    ...
}
conn, err := connector.Connect(context.Background())
if err != nil {
    ...
}
defer conn.Close()

// Retrieve appender from connection (note that you have to create the table 'test' beforehand).
appender, err := duckdb.NewAppenderFromConn(conn, "", "test")
if err != nil {
    ...
}
defer appender.Close()

err = appender.AppendRow(...)
if err != nil {
    ...
}

// Optional, if you want to access the appended rows immediately.
err = appender.Flush()
if err != nil {
    ...
}

Examples

Simple Example

An example for using the Go API is as follows:

package main

import (
    "database/sql"
    "errors"
    "fmt"
    "log"

    _ "github.com/marcboeker/go-duckdb"
)

func main() {
    // An empty data source name opens an in-memory database.
    db, err := sql.Open("duckdb", "")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    _, err = db.Exec(`CREATE TABLE people (id INTEGER, name VARCHAR)`)
    if err != nil {
        log.Fatal(err)
    }
    _, err = db.Exec(`INSERT INTO people VALUES (42, 'John')`)
    if err != nil {
        log.Fatal(err)
    }

    var (
        id   int
        name string
    )
    row := db.QueryRow(`SELECT id, name FROM people`)
    err = row.Scan(&id, &name)
    if errors.Is(err, sql.ErrNoRows) {
        log.Println("no rows")
    } else if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("id: %d, name: %s\n", id, name)
}

More Examples

For more examples, see the examples in the go-duckdb repository.

Java JDBC API

Installation

The DuckDB Java JDBC API can be installed from Maven Central. Please see the installation page for details.

Basic API Usage

DuckDB's JDBC API implements the main parts of the standard Java Database Connectivity (JDBC) API, version 4.1. Describing JDBC is
beyond the scope of this page; see the official documentation for details. Below we focus on the DuckDB-specific parts.

Refer to the externally hosted API Reference for more information about our extensions to the JDBC specification, or to the Arrow
Methods section below.

Startup & Shutdown

In JDBC, database connections are created through the standard java.sql.DriverManager class. The driver
should auto-register in the DriverManager. If that does not work for some reason, you can enforce registration like so:

Class.forName("org.duckdb.DuckDBDriver");

To create a DuckDB connection, call DriverManager with the jdbc:duckdb: JDBC URL prefix, like so:

import java.sql.Connection;
import java.sql.DriverManager;

Connection conn = DriverManager.getConnection("jdbc:duckdb:");

To use DuckDB‑specific features such as the Appender, cast the object to a DuckDBConnection:

import org.duckdb.DuckDBConnection;

DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:");

When using the jdbc:duckdb: URL alone, an in-memory database is created. Note that for an in-memory database no data is persisted
to disk (i.e., all data is lost when you exit the Java program). To access or create a persistent database, append its file path
after the jdbc:duckdb: prefix. For example, if your database is stored in /tmp/my_database, use the JDBC URL jdbc:duckdb:/tmp/my_database
to create a connection to it.

It is possible to open a DuckDB database file in read‑only mode. This is for example useful if multiple Java processes want to read the same
database file at the same time. To open an existing database file in read‑only mode, set the connection property duckdb.read_only
like so:

import java.util.Properties;

Properties ro_prop = new Properties();
ro_prop.setProperty("duckdb.read_only", "true");
Connection conn_ro = DriverManager.getConnection("jdbc:duckdb:/tmp/my_database", ro_prop);

Additional connections can be created using the DriverManager. A more efficient mechanism is to call the
DuckDBConnection#duplicate() method like so:

Connection conn2 = ((DuckDBConnection) conn).duplicate();

Multiple connections are allowed, but mixing read‑write and read‑only connections is unsupported.

Configuring Connections

Configuration options can be provided to change different settings of the database system. Note that many
of these settings can be changed later on using PRAGMA statements as well.

Properties connectionProperties = new Properties();

connectionProperties.setProperty("temp_directory", "/path/to/temp/dir/");
Connection conn = DriverManager.getConnection("jdbc:duckdb:/tmp/my_database", connectionProperties);

Querying

DuckDB supports the standard JDBC methods to send queries and retrieve result sets. First, a Statement object has to be
created from the Connection. This object can then be used to send queries using execute() and executeQuery(). execute() is meant
for statements where no results are expected, such as CREATE TABLE or UPDATE, and executeQuery() is meant for queries
that produce results (e.g., SELECT). Two examples follow. See also the JDBC Statement and ResultSet documentation.

// create a table
Statement stmt = conn.createStatement();
stmt.execute("CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)");
// insert two items into the table
stmt.execute("INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)");

// query the table while the statement is still open
try (ResultSet rs = stmt.executeQuery("SELECT * FROM items")) {
    while (rs.next()) {
        System.out.println(rs.getString(1));
        System.out.println(rs.getInt(3));
    }
}
// close the statement only after the last query has been run
stmt.close();
// jeans
// 1
// hammer
// 2

DuckDB also supports prepared statements as per the JDBC API:

try (PreparedStatement p_stmt = conn.prepareStatement("INSERT INTO items VALUES (?, ?, ?);")) {

p_stmt.setString(1, "chainsaw");
p_stmt.setDouble(2, 500.0);
p_stmt.setInt(3, 42);
p_stmt.execute();
// more calls to execute() possible
}

Warning. Do not use prepared statements to insert large amounts of data into DuckDB. See the data import documentation
for better options.

Arrow Methods

Refer to the API Reference for type signatures.

Arrow Export

The following demonstrates exporting an Arrow stream and consuming it using the Java Arrow bindings:

import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.ipc.ArrowReader;
import org.duckdb.DuckDBResultSet;

try (var conn = DriverManager.getConnection("jdbc:duckdb:");

Arrow Import

The following demonstrates consuming an Arrow stream from the Java Arrow bindings:

import java.sql.DriverManager;
import org.apache.arrow.c.ArrowArrayStream;
import org.apache.arrow.c.Data;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.ipc.ArrowReader;
import org.apache.arrow.vector.ipc.ArrowStreamReader;
import org.duckdb.DuckDBConnection;
import org.duckdb.DuckDBResultSet;

// Arrow stuff
try (var allocator = new RootAllocator();
     ArrowStreamReader reader = null; // should not be null of course
     var arrow_array_stream = ArrowArrayStream.allocateNew(allocator)) {
    Data.exportArrayStream(allocator, reader, arrow_array_stream);

    // DuckDB stuff
    try (var conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:")) {
        // make the Arrow stream queryable under the name "asdf"
        conn.registerArrowStream("asdf", arrow_array_stream);

        // run a query
        try (var stmt = conn.createStatement();
             var rs = (DuckDBResultSet) stmt.executeQuery("SELECT count(*) FROM asdf")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}

Streaming Results

Result streaming is opt-in in the JDBC driver: set the jdbc_stream_results config to true before running a query. The easiest
way to do that is to pass it in the Properties object.

Properties props = new Properties();

props.setProperty(DuckDBDriver.JDBC_STREAM_RESULTS, String.valueOf(true));

Connection conn = DriverManager.getConnection("jdbc:duckdb:", props);

Appender

The Appender is available in the DuckDB JDBC driver via the org.duckdb.DuckDBAppender class. The constructor of the
class requires the schema name and the table name it is applied to. The Appender is flushed when the close() method is called.

Example:

import org.duckdb.DuckDBConnection;

DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:");

Statement stmt = conn.createStatement();
stmt.execute("CREATE TABLE tbl (x BIGINT, y FLOAT, s VARCHAR)");

// using try-with-resources to automatically close the appender at the end of the scope
try (var appender = conn.createAppender(DuckDBConnection.DEFAULT_SCHEMA, "tbl")) {
appender.beginRow();
appender.append(10);
appender.append(3.2);
appender.append("hello");
appender.endRow();
appender.beginRow();
appender.append(20);
appender.append(-8.1);
appender.append("world");
appender.endRow();
}
stmt.close();

Batch Writer

The DuckDB JDBC driver offers batch write functionality. The batch writer supports prepared statements to mitigate the
overhead of query parsing.

Note. The preferred method for bulk inserts is to use the Appender due to its higher performance. However, when using the
Appender is not possible, the batch writer is available as an alternative.

Batch Writer with Prepared Statements

import org.duckdb.DuckDBConnection;

DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:");
// The table test (x, y, z) must already exist before the inserts below are run.

PreparedStatement stmt = conn.prepareStatement("INSERT INTO test (x, y, z) VALUES (?, ?, ?);");

stmt.setObject(1, 1);
stmt.setObject(2, 2);
stmt.setObject(3, 3);
stmt.addBatch();

stmt.setObject(1, 4);
stmt.setObject(2, 5);
stmt.setObject(3, 6);
stmt.addBatch();

stmt.executeBatch();
stmt.close();

Batch Writer with Vanilla Statements

The batch writer also supports vanilla SQL statements:

import org.duckdb.DuckDBConnection;

DuckDBConnection conn = (DuckDBConnection) DriverManager.getConnection("jdbc:duckdb:");

Statement stmt = conn.createStatement();

stmt.execute("CREATE TABLE test (x INT, y INT, z INT)");

stmt.addBatch("INSERT INTO test (x, y, z) VALUES (1, 2, 3);");

stmt.addBatch("INSERT INTO test (x, y, z) VALUES (4, 5, 6);");

stmt.executeBatch();
stmt.close();

Julia Package

The DuckDB Julia package provides a high‑performance front‑end for DuckDB. Much like SQLite, DuckDB runs in‑process within the Julia
client, and provides a DBInterface front‑end.

The package also supports multi‑threaded execution. It uses Julia threads/tasks for this purpose. If you wish to run queries in parallel, you
must launch Julia with multi‑threading support (by e.g., setting the JULIA_NUM_THREADS environment variable).

Installation

Install DuckDB as follows:

using Pkg
Pkg.add("DuckDB")

Alternatively, enter the package manager using the ] key, and issue the following command:

pkg> add DuckDB

Basics

using DuckDB

# create a new in-memory database

con = DBInterface.connect(DuckDB.DB, ":memory:")

# create a table
DBInterface.execute(con, "CREATE TABLE integers (i INTEGER)")

# insert data using a prepared statement

stmt = DBInterface.prepare(con, "INSERT INTO integers VALUES(?)")
DBInterface.execute(stmt, [42])

# query the database

results = DBInterface.execute(con, "SELECT 42 a")
print(results)

Scanning DataFrames

The DuckDB Julia package also provides support for querying Julia DataFrames. Note that the DataFrames are directly read by DuckDB ‑
they are not inserted or copied into the database itself.

If you wish to load data from a DataFrame into a DuckDB table you can run a CREATE TABLE ... AS or INSERT INTO query.

using DuckDB
using DataFrames

# create a new in-memory database

con = DBInterface.connect(DuckDB.DB)

# create a DataFrame
df = DataFrame(a = [1, 2, 3], b = [42, 84, 42])

# register it as a view in the database

DuckDB.register_data_frame(con, df, "my_df")

# run a SQL query over the DataFrame

results = DBInterface.execute(con, "SELECT * FROM my_df")
print(results)

Appender API

The DuckDB Julia package also supports the Appender API, which is much faster than using prepared statements or individual INSERT
INTO statements. Appends are made in row-wise format. For every column, an append() call should be made, after which the row should
be finished by calling end_row(). After all rows have been appended, flush() writes them to the table, and close() should be used to
finalize the appender and clean up the resulting memory.

using DuckDB, DataFrames, Dates

db = DuckDB.DB()
# create a table
DBInterface.execute(db, "CREATE OR REPLACE
TABLE data(id INT PRIMARY KEY, value FLOAT,
timestamp TIMESTAMP, date DATE)")
# create data to insert
len = 100
df = DataFrames.DataFrame(id=collect(1:len),
value=rand(len),
timestamp=Dates.now() + Dates.Second.(1:len),
date=Dates.today() + Dates.Day.(1:len))
# append data by row
appender = DuckDB.Appender(db, "data")
for i in eachrow(df)
for j in i
DuckDB.append(appender, j)
end
DuckDB.end_row(appender)
end
# flush the appender after all rows
DuckDB.flush(appender)
DuckDB.close(appender)

Concurrency

Within a Julia process, tasks are able to concurrently read and write to the database, as long as each task maintains its own connection to
the database. In the example below, a single task is spawned to periodically read the database and many tasks are spawned to write to the
database using both INSERT statements and the Appender API.

using Dates, DataFrames, DuckDB

db = DuckDB.DB()
DBInterface.connect(db)
DBInterface.execute(db, "CREATE OR REPLACE TABLE data (date TIMESTAMP, id INT)")

function run_reader(db)
# create a DuckDB connection specifically for this task
conn = DBInterface.connect(db)
while true
println(DBInterface.execute(conn,
"SELECT id, count(date) as count, max(date) as max_date
FROM data group by id order by id") |> DataFrames.DataFrame)
Threads.sleep(1)
end
DBInterface.close(conn)
end
# spawn one reader task
Threads.@spawn run_reader(db)

function run_inserter(db, id)

# create a DuckDB connection specifically for this task
conn = DBInterface.connect(db)
for i in 1:1000
Threads.sleep(0.01)
DuckDB.execute(conn, "INSERT INTO data VALUES (current_timestamp, ?)"; id);
end
DBInterface.close(conn)
end
# spawn many insert tasks
for i in 1:100
Threads.@spawn run_inserter(db, 1)
end

function run_appender(db, id)

# create a DuckDB appender specifically for this task
appender = DuckDB.Appender(db, "data")
for i in 1:1000
Threads.sleep(0.01)
row = (Dates.now(Dates.UTC), id)
for j in row
DuckDB.append(appender, j);
end
DuckDB.end_row(appender);
DuckDB.flush(appender);
end
DuckDB.close(appender);
end
# spawn many appender tasks
for i in 1:100
Threads.@spawn run_appender(db, 2)
end

Original Julia Connector

Credits to kimmolinna for the original DuckDB Julia connector.

Node.js

Node.js API

This package provides a Node.js API for DuckDB. The API for this client is somewhat compliant with the SQLite Node.js client for easier
transition.

For TypeScript wrappers, see the duckdb‑async project.

Initializing

Load the package and create a database object:

const duckdb = require('duckdb');

const db = new duckdb.Database(':memory:'); // or a file name for a persistent DB

All options described on the Database configuration page can optionally be supplied to the Database constructor as the second argument. A
callback can optionally be supplied as the third argument to get feedback on the given options.

const db = new duckdb.Database(':memory:', {

"access_mode": "READ_WRITE",
"max_memory": "512MB",
"threads": "4"
}, (err) => {
if (err) {
console.error(err);
}
});

Running a Query

The following code snippet runs a simple query using the Database.all() method.

db.all('SELECT 42 AS fortytwo', function(err, res) {

if (err) {
console.warn(err);
return;
}
console.log(res[0].fortytwo)
});

Other available methods are each, where the callback is invoked for each row; run, which executes a single statement without results; and
exec, which can execute several SQL commands at once but also does not return results. All of these methods can work with prepared
statements, taking the values for the parameters as additional arguments. For example:

db.all('SELECT ?::INTEGER AS fortytwo, ?::STRING AS hello', 42, 'Hello, World', function(err, res) {
if (err) {
console.warn(err);
return;
}
console.log(res[0].fortytwo)
console.log(res[0].hello)
});

Connections

A database can have multiple connections, which are created using db.connect().

const con = db.connect();

You can create multiple connections, each with their own transaction context.

Connection objects also contain shorthands to directly call run(), all() and each() with parameters and callbacks, respectively,
for example:

con.all('SELECT 42 AS fortytwo', function(err, res) {

if (err) {
console.warn(err);
return;
}
console.log(res[0].fortytwo)
});

Prepared Statements

From connections, you can create prepared statements (and only that) using con.prepare():

const stmt = con.prepare('SELECT ?::INTEGER AS fortytwo');

To execute this statement, you can call for example all() on the stmt object:

stmt.all(42, function(err, res) {

if (err) {
console.warn(err);
} else {
console.log(res[0].fortytwo)
}
});

You can also execute the prepared statement multiple times. This is for example useful to fill a table with data:

con.run('CREATE TABLE a (i INTEGER)');

const stmt = con.prepare('INSERT INTO a VALUES (?)');
for (let i = 0; i < 10; i++) {
stmt.run(i);
}
stmt.finalize();
con.all('SELECT * FROM a', function(err, res) {
if (err) {
console.warn(err);
} else {
console.log(res)
}
});

prepare() can also take a callback which gets the prepared statement as an argument:

const stmt = con.prepare('SELECT ?::INTEGER AS fortytwo', function(err, stmt) {

stmt.all(42, function(err, res) {
if (err) {
console.warn(err);
} else {
console.log(res[0].fortytwo)
}
});
});

Inserting Data via Arrow

Apache Arrow can be used to insert data into DuckDB without making a copy:

const arrow = require('apache-arrow');

const db = new duckdb.Database(':memory:');

const jsonData = [
{"userId":1,"id":1,"title":"delectus aut autem","completed":false},
{"userId":1,"id":2,"title":"quis ut nam facilis et officia qui","completed":false}
];

// note; doesn't work on Windows yet

db.exec(` INSTALL arrow; LOAD arrow;`, (err) => {
if (err) {
console.warn(err);
return;
}

const arrowTable = arrow.tableFromJSON(jsonData);

db.register_buffer("jsonDataTable", [arrow.tableToIPC(arrowTable)], true, (err, res) => {
if (err) {
console.warn(err);
return;
}

// `SELECT * FROM jsonDataTable` would return the entries in `jsonData`

});
});

Loading Unsigned Extensions

To load unsigned extensions, instantiate the database as follows:

const db = new duckdb.Database(':memory:', {"allow_unsigned_extensions": "true"});


Node.js API

Modules

Typedefs

duckdb

Summary: DuckDB is an embeddable SQL OLAP Database Management System

• duckdb

– ~Connection

* .run(sql, ...params, callback) ⇒ void

* .all(sql, ...params, callback) ⇒ void
* .arrowIPCAll(sql, ...params, callback) ⇒ void
* .arrowIPCStream(sql, ...params, callback) ⇒
* .each(sql, ...params, callback) ⇒ void
* .stream(sql, ...params)
* .register_udf(name, return_type, fun) ⇒ void
* .prepare(sql, ...params, callback) ⇒ Statement
* .exec(sql, ...params, callback) ⇒ void
* .register_udf_bulk(name, return_type, callback) ⇒ void
* .unregister_udf(name, return_type, callback) ⇒ void
* .register_buffer(name, array, force, callback) ⇒ void
* .unregister_buffer(name, callback) ⇒ void
* .close(callback) ⇒ void
– ~Statement

* .sql ⇒
* .get()
* .run(sql, ...params, callback) ⇒ void
* .all(sql, ...params, callback) ⇒ void
* .arrowIPCAll(sql, ...params, callback) ⇒ void
* .each(sql, ...params, callback) ⇒ void
* .finalize(sql, ...params, callback) ⇒ void
* .stream(sql, ...params)
* .columns() ⇒ Array.<ColumnInfo>
– ~QueryResult

* .nextChunk() ⇒
* .nextIpcBuffer() ⇒
* .asyncIterator()
– ~Database

* .close(callback) ⇒ void
* .close_internal(callback) ⇒ void
* .wait(callback) ⇒ void
* .serialize(callback) ⇒ void
* .parallelize(callback) ⇒ void
* .connect(path) ⇒ Connection
* .interrupt(callback) ⇒ void
* .prepare(sql) ⇒ Statement
* .run(sql, ...params, callback) ⇒ void
* .scanArrowIpc(sql, ...params, callback) ⇒ void
* .each(sql, ...params, callback) ⇒ void
* .all(sql, ...params, callback) ⇒ void


* .arrowIPCAll(sql, ...params, callback) ⇒ void

connection.stream(sql, ...params) Kind: instance method of Connection

Param Type

sql
...params *


connection.register_udf(name, return_type, fun) ⇒ void Register a User Defined Function

Kind: instance method of Connection

Note: this follows the wasm udfs somewhat but is simpler because we can pass data much more cleanly

Param

name
return_type
fun

connection.prepare(sql, ...params, callback) ⇒ Statement Prepare a SQL query for execution

Kind: instance method of Connection

Param Type

sql
...params *
callback

connection.exec(sql, ...params, callback) ⇒ void Execute a SQL query

Kind: instance method of Connection

Param Type

sql
...params *
callback

connection.register_udf_bulk(name, return_type, callback) ⇒ void Register a User Defined Function

Kind: instance method of Connection

Param

name
return_type
callback

connection.unregister_udf(name, return_type, callback) ⇒ void Unregister a User Defined Function

Kind: instance method of Connection

Param

name
return_type
callback

connection.register_buffer(name, array, force, callback) ⇒ void Register a Buffer to be scanned using the Apache Arrow IPC scanner
(requires arrow extension to be loaded)

• ~Statement
– .sql ⇒
– .get()
– .run(sql, ...params, callback) ⇒ void
– .all(sql, ...params, callback) ⇒ void
– .arrowIPCAll(sql, ...params, callback) ⇒ void
– .each(sql, ...params, callback) ⇒ void
– .finalize(sql, ...params, callback) ⇒ void
– .stream(sql, ...params)
– .columns() ⇒ Array.<ColumnInfo>

statement.sql ⇒ Kind: instance property of Statement

Returns: sql contained in statement
Field:


statement.get() Not implemented

Kind: instance method of Statement

statement.run(sql, ...params, callback) ⇒ void Kind: instance method of Statement

Param Type

sql
...params *
callback

statement.all(sql, ...params, callback) ⇒ void Kind: instance method of Statement

Param Type

sql
...params *
callback

statement.arrowIPCAll(sql, ...params, callback) ⇒ void Kind: instance method of Statement

Param Type

sql
...params *
callback

statement.each(sql, ...params, callback) ⇒ void Kind: instance method of Statement

Param Type

sql
...params *
callback

statement.finalize(sql, ...params, callback) ⇒ void Kind: instance method of Statement

Param Type

sql
...params *
callback


statement.stream(sql, ...params) Kind: instance method of Statement

Param Type

sql
...params *

statement.columns() ⇒ Array.<ColumnInfo> Kind: instance method of Statement

Returns: Array.<ColumnInfo> - Array of column names and types

duckdb~QueryResult Kind: inner class of duckdb

• ~QueryResult

– .nextChunk() ⇒
– .nextIpcBuffer() ⇒
– .asyncIterator()

queryResult.nextChunk() ⇒ Kind: instance method of QueryResult

Returns: data chunk

queryResult.nextIpcBuffer() ⇒ Function to fetch the next result blob of an Arrow IPC Stream in a zero‑copy way (requires the arrow extension to be loaded).

Kind: instance method of QueryResult

Returns: data chunk

queryResult.asyncIterator() Kind: instance method of QueryResult

duckdb~Database Main database interface

Kind: inner property of duckdb

Param Description

path path to database file or :memory: for in‑memory database

access_mode access mode
config the configuration object
callback callback function

• ~Database

– .close(callback) ⇒ void
– .close_internal(callback) ⇒ void
– .wait(callback) ⇒ void
– .serialize(callback) ⇒ void
– .parallelize(callback) ⇒ void


– .connect(path) ⇒ Connection
– .interrupt(callback) ⇒ void
– .prepare(sql) ⇒ Statement
– .run(sql, ...params, callback) ⇒ void
– .scanArrowIpc(sql, ...params, callback) ⇒ void
– .each(sql, ...params, callback) ⇒ void
– .all(sql, ...params, callback) ⇒ void
– .arrowIPCAll(sql, ...params, callback) ⇒ void
– .arrowIPCStream(sql, ...params, callback) ⇒ void
– .exec(sql, ...params, callback) ⇒ void
– .register_udf(name, return_type, fun) ⇒ this
– .register_buffer(name) ⇒ this
– .unregister_buffer(name) ⇒ this
– .unregister_udf(name) ⇒ this
– .registerReplacementScan(fun) ⇒ this
– .tokenize(text) ⇒ ScriptTokens
– .get()

database.close(callback) ⇒ void Closes database instance

Kind: instance method of Database

Param

callback

database.close_internal(callback) ⇒ void Internal method. Do not use, call Connection#close instead

Kind: instance method of Database

Param

callback

database.wait(callback) ⇒ void Triggers callback when all scheduled database tasks have completed.

Kind: instance method of Database

Param

callback

database.serialize(callback) ⇒ void Currently a no‑op. Provided for SQLite compatibility

Kind: instance method of Database

Param

callback


database.parallelize(callback) ⇒ void Currently a no‑op. Provided for SQLite compatibility

Kind: instance method of Database

Param

callback

database.connect(path) ⇒ Connection Create a new database connection

Kind: instance method of Database

Param Description

path the database to connect to, either a file path, or :memory:

database.interrupt(callback) ⇒ void Supposedly interrupt queries, but currently does not do anything.

Kind: instance method of Database

Param

callback

database.prepare(sql) ⇒ Statement Prepare a SQL query for execution

Kind: instance method of Database

Param

sql

database.run(sql, ...params, callback) ⇒ void Convenience method for Connection#run using a built‑in default connection

Kind: instance method of Database

Param Type

sql
...params *
callback

database.scanArrowIpc(sql, ...params, callback) ⇒ void Convenience method for Connection#scanArrowIpc using a built‑in default
connection

Kind: instance method of Database

Param Type

sql
...params *
callback

database.each(sql, ...params, callback) ⇒ void Kind: instance method of Database

Param Type

sql
...params *
callback

database.all(sql, ...params, callback) ⇒ void Convenience method for Connection#all using a built‑in default connection

Kind: instance method of Database

Param Type

sql
...params *
callback

database.arrowIPCAll(sql, ...params, callback) ⇒ void Convenience method for Connection#arrowIPCAll using a built‑in default connection

Kind: instance method of Database

Param Type

sql
...params *
callback

database.arrowIPCStream(sql, ...params, callback) ⇒ void Convenience method for Connection#arrowIPCStream using a built‑in default connection

Kind: instance method of Database

Param Type

sql
...params *
callback

database.exec(sql, ...params, callback) ⇒ void Kind: instance method of Database


Param Type

sql
...params *
callback

database.register_udf(name, return_type, fun) ⇒ this Register a User Defined Function

Convenience method for Connection#register_udf

Kind: instance method of Database

Param

name
return_type
fun

database.register_buffer(name) ⇒ this Register a buffer containing serialized data to be scanned from DuckDB.

Convenience method for Connection#register_buffer

Kind: instance method of Database

Param

name

database.unregister_buffer(name) ⇒ this Unregister a Buffer

Convenience method for Connection#unregister_buffer

Kind: instance method of Database

Param

name

database.unregister_udf(name) ⇒ this Unregister a UDF

Convenience method for Connection#unregister_udf

Kind: instance method of Database

Param

name

database.registerReplacementScan(fun) ⇒ this Register a table replace scan function

Kind: instance method of Database


Param Description

fun Replacement scan function

database.tokenize(text) ⇒ ScriptTokens Return positions and types of tokens in given text

Kind: instance method of Database

Param

text

database.get() Not implemented

Kind: instance method of Database

duckdb~TokenType Types of tokens returned by tokenize.

Kind: inner property of duckdb

duckdb~ERROR : number Check that errno attribute equals this to check for a duckdb error

Kind: inner constant of duckdb

duckdb~OPEN_READONLY : number Open database in readonly mode

Kind: inner constant of duckdb

duckdb~OPEN_READWRITE : number Currently ignored

Kind: inner constant of duckdb

duckdb~OPEN_CREATE : number Currently ignored

Kind: inner constant of duckdb

duckdb~OPEN_FULLMUTEX : number Currently ignored

Kind: inner constant of duckdb

duckdb~OPEN_SHAREDCACHE : number Currently ignored

Kind: inner constant of duckdb


duckdb~OPEN_PRIVATECACHE : number Currently ignored

Kind: inner constant of duckdb

ColumnInfo : object

Kind: global typedef

Properties

Name Type Description

name string Column name

type TypeInfo Column type

TypeInfo : object

Kind: global typedef

Properties

Name Type Description

id string Type ID
[alias] string SQL type alias
sql_type string SQL type name

DuckDbError : object

Kind: global typedef

Properties

Name Type Description

errno number -1 for DuckDB errors
message string Error message
code string 'DUCKDB_NODEJS_ERROR' for DuckDB errors
errorType string DuckDB error type code (e.g., HTTP, IO, Catalog)

HTTPError : object

Kind: global typedef

Extends: DuckDbError
Properties

Name Type Description

statusCode number HTTP response status code
reason string HTTP response reason
response string HTTP response body
headers object HTTP headers

Python

Python API

Installation

The DuckDB Python API can be installed using pip: pip install duckdb. Please see the installation page for details. It is also possible
to install DuckDB using conda: conda install python-duckdb -c conda-forge.

Python version: DuckDB requires Python 3.7 or newer.

Basic API Usage

The most straightforward way to run SQL queries using DuckDB is the duckdb.sql command.

import duckdb
duckdb.sql("SELECT 42").show()

This will run queries using an in‑memory database that is stored globally inside the Python module. The result of the query is returned
as a Relation. A relation is a symbolic representation of the query. The query is not executed until the result is fetched or requested to be
printed to the screen.

Relations can be referenced in subsequent queries by storing them inside variables, and using them as tables. This way queries can be
constructed incrementally.

import duckdb
r1 = duckdb.sql("SELECT 42 AS i")
duckdb.sql("SELECT i * 2 AS k FROM r1").show()

Data Input

DuckDB can ingest data from a wide variety of formats – both on‑disk and in‑memory. See the data ingestion page for more information.

import duckdb
duckdb.read_csv("example.csv") # read a CSV file into a Relation
duckdb.read_parquet("example.parquet") # read a Parquet file into a Relation
duckdb.read_json("example.json") # read a JSON file into a Relation

duckdb.sql("SELECT * FROM 'example.csv'") # directly query a CSV file

duckdb.sql("SELECT * FROM 'example.parquet'") # directly query a Parquet file
duckdb.sql("SELECT * FROM 'example.json'") # directly query a JSON file

DataFrames DuckDB can also directly query Pandas DataFrames, Polars DataFrames and Arrow tables.


import duckdb

# directly query a Pandas DataFrame

import pandas as pd
pandas_df = pd.DataFrame({"a": [42]})
duckdb.sql("SELECT * FROM pandas_df")

# directly query a Polars DataFrame

import polars as pl
polars_df = pl.DataFrame({"a": [42]})
duckdb.sql("SELECT * FROM polars_df")

# directly query a pyarrow table

import pyarrow as pa
arrow_table = pa.Table.from_pydict({"a": [42]})
duckdb.sql("SELECT * FROM arrow_table")

Result Conversion

DuckDB supports converting query results efficiently to a variety of formats. See the result conversion page for more information.

import duckdb
duckdb.sql("SELECT 42").fetchall() # Python objects
duckdb.sql("SELECT 42").df() # Pandas DataFrame
duckdb.sql("SELECT 42").pl() # Polars DataFrame
duckdb.sql("SELECT 42").arrow() # Arrow Table
duckdb.sql("SELECT 42").fetchnumpy() # NumPy Arrays

Writing Data to Disk

DuckDB supports writing Relation objects directly to disk in a variety of formats. The COPY statement can be used to write data to disk
using SQL as an alternative.

import duckdb
duckdb.sql("SELECT 42").write_parquet("out.parquet") # Write to a Parquet file
duckdb.sql("SELECT 42").write_csv("out.csv") # Write to a CSV file
duckdb.sql("COPY (SELECT 42) TO 'out.parquet'") # Copy to a Parquet file

Using an In‑Memory Database

When using DuckDB through duckdb.sql(), it operates on an in‑memory database, i.e., no tables are persisted on disk. Invoking the
duckdb.connect() method without arguments returns a connection, which also uses an in‑memory database:

import duckdb

con = duckdb.connect()
con.sql("SELECT 42 AS x").show()

Persistent Storage

Calling duckdb.connect(dbname) creates a connection to a persistent database. Any data written to that connection will be persisted,
and can be reloaded by re‑connecting to the same file, both from Python and from other DuckDB clients.

import duckdb

# create a connection to a file called 'file.db'


con = duckdb.connect("file.db")
# create a table and load data into it
con.sql("CREATE TABLE test (i INTEGER)")
con.sql("INSERT INTO test VALUES (42)")
# query the table
con.table("test").show()
# explicitly close the connection
con.close()
# Note: connections are also closed implicitly when they go out of scope

You can also use a context manager to ensure that the connection is closed:

import duckdb

with duckdb.connect("file.db") as con:
    con.sql("CREATE TABLE test (i INTEGER)")
    con.sql("INSERT INTO test VALUES (42)")
    con.table("test").show()
    # the context manager closes the connection automatically

Connection Object and Module

The connection object and the duckdb module can be used interchangeably – they support the same methods. The only difference is that
when using the duckdb module a global in‑memory database is used.

Note that if you are developing a package designed for others to use, and use DuckDB in the package, it is recommended that you create connection objects instead of using the methods on the duckdb module. That is because the duckdb module uses a shared global database, which can cause hard-to-debug issues if used from within multiple different packages.

Using Connections in Parallel Python Programs

The DuckDBPyConnection object is not thread‑safe. If you would like to write to the same database from multiple threads, create a
cursor for each thread with the DuckDBPyConnection.cursor() method.

Loading and Installing Extensions

DuckDB's Python API provides functions for installing and loading extensions, which perform the equivalent operations to running the
INSTALL and LOAD SQL commands, respectively. An example that installs and loads the spatial extension looks as follows:

import duckdb

con = duckdb.connect()
con.install_extension("spatial")
con.load_extension("spatial")

Note. To load unsigned extensions, add the config = {"allow_unsigned_extensions": "true"} argument to the
duckdb.connect() method.

Data Ingestion

CSV Files

CSV files can be read using the read_csv function, called either from within Python or directly from within SQL. By default, the read_csv function attempts to auto‑detect the CSV settings by sampling from the provided file.


import duckdb
# read from a file using fully auto-detected settings
duckdb.read_csv("example.csv")
# read multiple CSV files from a folder
duckdb.read_csv("folder/*.csv")
# specify options on how the CSV is formatted internally
duckdb.read_csv("example.csv", header = False, sep = ",")
# override types of the first two columns
duckdb.read_csv("example.csv", dtype = ["int", "varchar"])
# use the (experimental) parallel CSV reader
duckdb.read_csv("example.csv", parallel = True)
# directly read a CSV file from within SQL
duckdb.sql("SELECT * FROM 'example.csv'")
# call read_csv from within SQL
duckdb.sql("SELECT * FROM read_csv('example.csv')")

See the CSV Import page for more information.

Parquet Files

Parquet files can be read using the read_parquet function, called either from within Python or directly from within SQL.

import duckdb
# read from a single Parquet file
duckdb.read_parquet("example.parquet")
# read multiple Parquet files from a folder
duckdb.read_parquet("folder/*.parquet")
# read a Parquet over https
duckdb.read_parquet("https://some.url/some_file.parquet")
# read a list of Parquet files
duckdb.read_parquet(["file1.parquet", "file2.parquet", "file3.parquet"])
# directly read a Parquet file from within SQL
duckdb.sql("SELECT * FROM 'example.parquet'")
# call read_parquet from within SQL
duckdb.sql("SELECT * FROM read_parquet('example.parquet')")

See the Parquet Loading page for more information.

JSON Files

JSON files can be read using the read_json function, called either from within Python or directly from within SQL. By default, the read_json function will automatically detect if a file contains newline‑delimited JSON or regular JSON, and will detect the schema of the objects stored within the JSON file.

import duckdb
# read from a single JSON file
duckdb.read_json("example.json")
# read multiple JSON files from a folder
duckdb.read_json("folder/*.json")
# directly read a JSON file from within SQL
duckdb.sql("SELECT * FROM 'example.json'")
# call read_json_auto from within SQL
duckdb.sql("SELECT * FROM read_json_auto('example.json')")

DataFrames & Arrow Tables

DuckDB is automatically able to query a Pandas DataFrame, Polars DataFrame, or Arrow object that is stored in a Python variable by name.
Accessing these is made possible by replacement scans.


DuckDB supports querying multiple types of Apache Arrow objects including tables, datasets, RecordBatchReaders, and scanners. See the
Python guides for more examples.

import duckdb
import pandas as pd
test_df = pd.DataFrame.from_dict({"i": [1, 2, 3, 4], "j": ["one", "two", "three", "four"]})
duckdb.sql("SELECT * FROM test_df").fetchall()
# [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

DuckDB also supports "registering" a DataFrame or Arrow object as a virtual table, comparable to a SQL VIEW. This is useful when querying a DataFrame or Arrow object that is stored in another way (for example, as a class variable or a value in a dictionary). Below is a Pandas example that registers such a DataFrame manually:

import duckdb
import pandas as pd
my_dictionary = {}
my_dictionary["test_df"] = pd.DataFrame.from_dict({"i": [1, 2, 3, 4], "j": ["one", "two", "three",
"four"]})
duckdb.register("test_df_view", my_dictionary["test_df"])
duckdb.sql("SELECT * FROM test_df_view").fetchall()
# [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]

You can also create a persistent table in DuckDB from the contents of the DataFrame (or the view):

# create a new table from the contents of a DataFrame

con.execute("CREATE TABLE test_df_table AS SELECT * FROM test_df")
# insert into an existing table from the contents of a DataFrame
con.execute("INSERT INTO test_df_table SELECT * FROM test_df")

Pandas DataFrames – object Columns pandas.DataFrame columns of an object dtype require some special care, since they can store values of arbitrary type. To convert these columns to DuckDB, we first go through an analyze phase before converting the values. In this analyze phase, a sample of the rows of the column is analyzed to determine the target type. The sample size is 1000 by default. If the type picked during the analyze step is incorrect, this will result in a "Failed to cast value:" error, in which case you will need to increase the sample size. The sample size can be changed by setting the pandas_analyze_sample config option.

# example setting the sample size to 100k

duckdb.default_connection.execute("SET GLOBAL pandas_analyze_sample = 100_000")

Object Conversion

This is a mapping of Python object types to DuckDB Logical Types:

• None ‑> NULL

• bool ‑> BOOLEAN
• datetime.timedelta ‑> INTERVAL
• str ‑> VARCHAR
• bytearray ‑> BLOB
• memoryview ‑> BLOB
• decimal.Decimal ‑> DECIMAL / DOUBLE
• uuid.UUID ‑> UUID

The rest of the conversion rules are as follows.

int Since integers can be of arbitrary size in Python, a one‑to‑one conversion is not possible for ints. Instead, we perform these casts in order until one succeeds:

• BIGINT


• INTEGER
• UBIGINT
• UINTEGER
• DOUBLE

When using the DuckDB Value class, it's possible to set a target type, which will influence the conversion.

float These casts are tried in order until one succeeds:

• DOUBLE
• FLOAT

datetime.datetime For datetime we will check pandas.isnull if it's available and return NULL if it returns true. We check
against datetime.datetime.min and datetime.datetime.max to convert to -inf and +inf respectively.

If the datetime has tzinfo, we will use TIMESTAMPTZ, otherwise it becomes TIMESTAMP.
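
The datetime.datetime rules above can be sketched in plain Python. This is only an illustration of the stated rules, not DuckDB's implementation: the describe_datetime helper and the strings it returns are hypothetical, and the pandas.isnull check is left out.

```python
from datetime import datetime, timezone


def describe_datetime(dt):
    """Return (target type, rendered value) following the rules in the text.

    Hypothetical helper for illustration only; DuckDB performs this
    conversion internally.
    """
    # A datetime carrying tzinfo maps to TIMESTAMPTZ, otherwise TIMESTAMP.
    sql_type = "TIMESTAMPTZ" if dt.tzinfo is not None else "TIMESTAMP"
    # The extreme values of datetime are treated as -inf and +inf.
    if dt == datetime.min:
        value = "-infinity"
    elif dt == datetime.max:
        value = "infinity"
    else:
        value = dt.isoformat(sep=" ")
    return sql_type, value


naive = describe_datetime(datetime(2024, 1, 2, 3, 4, 5))
aware = describe_datetime(datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
lowest = describe_datetime(datetime.min)
print(naive, aware, lowest)
```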

datetime.time If the time has tzinfo, we will use TIMETZ, otherwise it becomes TIME.

datetime.date date converts to the DATE type. We check against datetime.date.min and datetime.date.max to convert
to -inf and +inf respectively.

bytes bytes converts to BLOB by default; when it is used to construct a Value object of type BITSTRING, it maps to BITSTRING instead.

list list becomes a LIST type of the "most permissive" type of its children, for example:

my_list_value = [
12345,
"test"
]

This becomes VARCHAR[] because 12345 can be converted to VARCHAR, but "test" cannot be converted to INTEGER:

[12345, test]

dict The dict object can convert to either STRUCT(...) or MAP(..., ...) depending on its structure. If the dict has a structure
similar to:

my_map_dict = {
"key": [
1, 2, 3
],
"value": [
"one", "two", "three"
]
}

Then we'll convert it to a MAP of key‑value pairs of the two lists zipped together. The example above becomes a MAP(INTEGER, VARCHAR):

{1=one, 2=two, 3=three}


Note. The names of the fields matter and the two lists need to have the same size.

Otherwise we'll try to convert it to a STRUCT.

my_struct_dict = {
1: "one",
"2": 2,
"three": [1, 2, 3],
False: True
}

Becomes:

{'1': one, '2': 2, 'three': [1, 2, 3], 'False': true}

Note. Every key of the dictionary is converted to string.
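
The choice between MAP and STRUCT described above can be sketched in plain Python. The helper names (looks_like_map, map_pairs, struct_field_names) are hypothetical, and the check is a simplified reading of the rule stated in the text rather than DuckDB's actual code.

```python
def looks_like_map(d):
    """True if d has exactly the fields "key" and "value", both lists of equal size."""
    if set(d) != {"key", "value"}:
        return False
    keys, values = d["key"], d["value"]
    return (
        isinstance(keys, list)
        and isinstance(values, list)
        and len(keys) == len(values)
    )


def map_pairs(d):
    """Zip the two lists together into key-value pairs, as a MAP would hold them."""
    return list(zip(d["key"], d["value"]))


def struct_field_names(d):
    """Every dictionary key is converted to a string to form the STRUCT field names."""
    return [str(k) for k in d]


my_map_dict = {"key": [1, 2, 3], "value": ["one", "two", "three"]}
my_struct_dict = {1: "one", "2": 2, "three": [1, 2, 3], False: True}

print(looks_like_map(my_map_dict), map_pairs(my_map_dict))
print(looks_like_map(my_struct_dict), struct_field_names(my_struct_dict))
```

A dictionary whose "key" and "value" lists differ in length fails the check in this sketch and is treated as a STRUCT candidate.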

tuple tuple converts to LIST by default; when it is used to construct a Value object of type STRUCT, it converts to STRUCT instead.

numpy.ndarray and numpy.datetime64 ndarray and datetime64 are converted by calling tolist() and converting the
result of that.

Result Conversion

DuckDB's Python client provides multiple additional methods that can be used to efficiently retrieve data.

NumPy

• fetchnumpy() fetches the data as a dictionary of NumPy arrays

Pandas

• df() fetches the data as a Pandas DataFrame

• fetchdf() is an alias of df()
• fetch_df() is an alias of df()
• fetch_df_chunk(vector_multiple) fetches a portion of the results into a DataFrame. The number of rows returned in each
chunk is the vector size (2048 by default) * vector_multiple (1 by default).
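
The chunk-size arithmetic for fetch_df_chunk can be checked with a few lines of Python. The helpers below are hypothetical and only restate the formula from the text, assuming the default vector size of 2048; the final chunk of a real result may hold fewer rows.

```python
import math

VECTOR_SIZE = 2048  # default vector size stated in the text


def rows_per_chunk(vector_multiple=1, vector_size=VECTOR_SIZE):
    """Rows returned per fetch_df_chunk call: vector size * vector_multiple."""
    return vector_size * vector_multiple


def chunks_needed(total_rows, vector_multiple=1):
    """Number of calls needed to drain a result of total_rows rows."""
    return math.ceil(total_rows / rows_per_chunk(vector_multiple))


print(rows_per_chunk())         # default: one vector per chunk
print(rows_per_chunk(4))        # four vectors per chunk
print(chunks_needed(10_000, 2))
```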

Apache Arrow

• arrow() fetches the data as an Arrow table

• fetch_arrow_table() is an alias of arrow()
• fetch_record_batch(chunk_size) returns an Arrow record batch reader with chunk_size rows per batch

Polars

• pl() fetches the data as a Polars DataFrame

Below are some examples using this functionality. See the Python guides for more examples.


# fetch as Pandas DataFrame

df = con.execute("SELECT * FROM items").fetchdf()
print(df)
# item value count
# 0 jeans 20.0 1
# 1 hammer 42.2 2
# 2 laptop 2000.0 1
# 3 chainsaw 500.0 10
# 4 iphone 300.0 2

# fetch as dictionary of numpy arrays

arr = con.execute("SELECT * FROM items").fetchnumpy()
print(arr)
# {'item': masked_array(data=['jeans', 'hammer', 'laptop', 'chainsaw', 'iphone'],
# mask=[False, False, False, False, False],
# fill_value='?',
# dtype=object), 'value': masked_array(data=[20.0, 42.2, 2000.0, 500.0, 300.0],
# mask=[False, False, False, False, False],
# fill_value=1e+20), 'count': masked_array(data=[1, 2, 1, 10, 2],
# mask=[False, False, False, False, False],
# fill_value=999999,
# dtype=int32)}

# fetch as an Arrow table. Converting to Pandas afterwards just for pretty printing
tbl = con.execute("SELECT * FROM items").fetch_arrow_table()
print(tbl.to_pandas())
# item value count
# 0 jeans 20.00 1
# 1 hammer 42.20 2
# 2 laptop 2000.00 1
# 3 chainsaw 500.00 10
# 4 iphone 300.00 2

Python DB API

The standard DuckDB Python API provides a SQL interface compliant with the DB‑API 2.0 specification described by PEP 249, similar to the SQLite Python API.

Connection

To use the module, you must first create a DuckDBPyConnection object that represents the database. The connection object takes as
a parameter the database file to read and write from. If the database file does not exist, it will be created (the file extension may be .db,
.duckdb, or anything else). The special value :memory: (the default) can be used to create an in‑memory database. Note that for an
in‑memory database no data is persisted to disk (i.e., all data is lost when you exit the Python process). If you would like to connect to an
existing database in read‑only mode, you can set the read_only flag to True. Read‑only mode is required if multiple Python processes
want to access the same database file at the same time.

By default, DuckDB creates an in‑memory database that lives inside the duckdb module. Every method of DuckDBPyConnection is also available on the duckdb module, and those module-level methods use this default connection. You can also get a reference to this connection by providing the special value :default: to connect.

import duckdb

duckdb.execute("CREATE TABLE tbl AS SELECT 42 a")

con = duckdb.connect(":default:")
con.sql("SELECT * FROM tbl")


┌───────┐
│ a │
│ int32 │
├───────┤
│ 42 │
└───────┘

import duckdb
# to start an in-memory database
con = duckdb.connect(database = ":memory:")
# to use a database file (not shared between processes)
con = duckdb.connect(database = "my-db.duckdb", read_only = False)
# to use a database file (shared between processes)
con = duckdb.connect(database = "my-db.duckdb", read_only = True)
# to explicitly get the default connection
con = duckdb.connect(database = ":default:")

If you want to create a second connection to an existing database, you can use the cursor() method. This might be useful for example
to allow parallel threads running queries independently. A single connection is thread‑safe but is locked for the duration of the queries,
effectively serializing database access in this case.

Connections are closed implicitly when they go out of scope or if they are explicitly closed using close(). Once the last connection to a
database instance is closed, the database instance is closed as well.

Querying

SQL queries can be sent to DuckDB using the execute() method of connections. Once a query has been executed, results can be retrieved using the fetchone and fetchall methods on the connection. fetchall will retrieve all results and complete the transaction.
fetchone will retrieve a single row of results each time that it is invoked until no more results are available. The transaction will only close
once fetchone is called and there are no more results remaining (the return value will be None). As an example, in the case of a query
only returning a single row, fetchone should be called once to retrieve the results and a second time to close the transaction. Below are
some short examples:

# create a table
con.execute("CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)")
# insert two items into the table
con.execute("INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)")

# retrieve the items again

con.execute("SELECT * FROM items")
print(con.fetchall())
# [('jeans', Decimal('20.00'), 1), ('hammer', Decimal('42.20'), 2)]

# retrieve the items one at a time

con.execute("SELECT * FROM items")
print(con.fetchone())
# ('jeans', Decimal('20.00'), 1)
print(con.fetchone())
# ('hammer', Decimal('42.20'), 2)
print(con.fetchone()) # This closes the transaction. Any subsequent calls to .fetchone will return None
# None
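
The fetchone pattern above, where rows are read until None is returned, is commonly wrapped in a loop. The sketch below uses the standard-library sqlite3 module as a stand-in, since it follows the same PEP 249 conventions; it is not DuckDB code. With sqlite3 the fetch methods live on the cursor returned by execute(), whereas the DuckDB examples above call them on the connection.

```python
import sqlite3

# sqlite3 is used here only as a PEP 249 stand-in for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE items (item TEXT, value REAL, count INTEGER)")
con.execute("INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)")

cur = con.execute("SELECT * FROM items ORDER BY rowid")
rows = []
while True:
    row = cur.fetchone()  # one row per call
    if row is None:       # None signals that no more results remain
        break
    rows.append(row)

print(rows)
con.close()
```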

The description property of the connection object contains the column names as per the standard.

Prepared Statements DuckDB also supports prepared statements in the API with the execute and executemany methods. The values may be passed as an additional parameter after a query that contains ? or $1 (dollar symbol and a number) placeholders. With the ? notation, values are bound in the same order as they appear in the Python parameter. With the $ notation, each placeholder refers to a value by its position in the Python parameter, so a value can be reused within the SQL statement.


Here are some examples:

# insert a row using prepared statements

con.execute("INSERT INTO items VALUES (?, ?, ?)", ["laptop", 2000, 1])

# insert several rows using prepared statements

con.executemany("INSERT INTO items VALUES (?, ?, ?)", [["chainsaw", 500, 10], ["iphone", 300, 2]] )

# query the database using a prepared statement

con.execute("SELECT item FROM items WHERE value > ?", [400])
print(con.fetchall())
# [('laptop',), ('chainsaw',)]

# query using $ notation for prepared statement and reused values

con.execute("SELECT $1, $1, $2", ["duck", "goose"])
print(con.fetchall())
# [('duck', 'duck', 'goose')]

Warning. Do not use executemany to insert large amounts of data into DuckDB. See the data ingestion page for better
options.

Named Parameters

Besides the standard unnamed parameters, such as $1 and $2, it's also possible to supply named parameters, such as $my_parameter. When
using named parameters, you have to provide a dictionary mapping of str to value in the parameters argument.
An example use:

import duckdb

res = duckdb.execute("""
SELECT
$my_param,
$other_param,
$also_param
""",
{
"my_param": 5,
"other_param": "DuckDB",
"also_param": [42]
}
).fetchall()
print(res)
# [(5, 'DuckDB', [42])]

Relational API

The Relational API is an alternative API that can be used to incrementally construct queries. The API is centered around DuckDBPyRelation
nodes. The relations can be seen as symbolic representations of SQL queries. They do not hold any data, and nothing is executed,
until a method that triggers execution is called.

Constructing Relations

Relations can be created from SQL queries using the duckdb.sql method. Alternatively, they can be created from the various data ingestion
methods (read_parquet, read_csv, read_json).

For example, here we create a relation from a SQL query:


import duckdb
rel = duckdb.sql("SELECT * FROM range(10_000_000_000) tbl(id)")
rel.show()

┌────────────────────────┐
│ id │
│ int64 │
├────────────────────────┤
│ 0 │
│ 1 │
│ 2 │
│ 3 │
│ 4 │
│ 5 │
│ 6 │
│ 7 │
│ 8 │
│ 9 │
│ · │
│ · │
│ · │
│ 9990 │
│ 9991 │
│ 9992 │
│ 9993 │
│ 9994 │
│ 9995 │
│ 9996 │
│ 9997 │
│ 9998 │
│ 9999 │
├────────────────────────┤
│ ? rows │
│ (>9999 rows, 20 shown) │
└────────────────────────┘

Note how we are constructing a relation that computes an immense amount of data (10B rows, or 74GB of data). The relation is constructed
instantly, and we can even print the relation instantly.

When printing a relation using show or displaying it in the terminal, the first 10K rows are fetched. If there are more than 10K rows, the
output window will show >9999 rows (as the number of rows in the relation is unknown).

Data Ingestion

Outside of SQL queries, the following methods are provided to construct relation objects from external data.

• from_arrow
• from_df
• read_csv
• read_json
• read_parquet

SQL Queries

Relation objects can be queried with SQL through so-called replacement scans. If you have a relation object stored in a variable, you
can refer to that variable as if it were a SQL table (in the FROM clause). This allows you to incrementally build queries using relation objects.


import duckdb
rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)")
duckdb.sql("SELECT sum(id) FROM rel").show()

┌──────────────┐
│ sum(id) │
│ int128 │
├──────────────┤
│ 499999500000 │
└──────────────┘

Operations

There are a number of operations that can be performed on relations. These are all shorthand for running the equivalent SQL queries, and
each of them returns a relation itself.

aggregate(expr, groups = {}) Apply an (optionally grouped) aggregate over the relation. The system will automatically group
by any columns that are not aggregates.

import duckdb
rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)")
rel.aggregate("id % 2 AS g, sum(id), min(id), max(id)")

┌───────┬──────────────┬─────────┬─────────┐
│ g │ sum(id) │ min(id) │ max(id) │
│ int64 │ int128 │ int64 │ int64 │
├───────┼──────────────┼─────────┼─────────┤
│ 0 │ 249999500000 │ 0 │ 999998 │
│ 1 │ 250000000000 │ 1 │ 999999 │
└───────┴──────────────┴─────────┴─────────┘

except_(rel) Select all rows in the first relation, that do not occur in the second relation. The relations must have the same number
of columns.

import duckdb
r1 = duckdb.sql("SELECT * FROM range(10) tbl(id)")
r2 = duckdb.sql("SELECT * FROM range(5) tbl(id)")
r1.except_(r2).show()

┌───────┐
│ id │
│ int64 │
├───────┤
│ 5 │
│ 6 │
│ 7 │
│ 8 │
│ 9 │
└───────┘

filter(condition) Apply the given condition to the relation, filtering any rows that do not satisfy the condition.

import duckdb
rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)")
rel.filter("id > 5").limit(3).show()


┌───────┐
│ id │
│ int64 │
├───────┤
│ 6 │
│ 7 │
│ 8 │
└───────┘

intersect(rel) Select the intersection of two relations, returning all rows that occur in both relations. The relations must have the
same number of columns.

import duckdb
r1 = duckdb.sql("SELECT * FROM range(10) tbl(id)")
r2 = duckdb.sql("SELECT * FROM range(5) tbl(id)")
r1.intersect(r2).show()

┌───────┐
│ id │
│ int64 │
├───────┤
│ 0 │
│ 1 │
│ 2 │
│ 3 │
│ 4 │
└───────┘

join(rel, condition, type = "inner") Combine two relations, joining them based on the provided condition.

import duckdb
r1 = duckdb.sql("SELECT * FROM range(5) tbl(id)").set_alias("r1")
r2 = duckdb.sql("SELECT * FROM range(10, 15) tbl(id)").set_alias("r2")
r1.join(r2, "r1.id + 10 = r2.id").show()

┌───────┬───────┐
│ id │ id │
│ int64 │ int64 │
├───────┼───────┤
│ 0 │ 10 │
│ 1 │ 11 │
│ 2 │ 12 │
│ 3 │ 13 │
│ 4 │ 14 │
└───────┴───────┘

limit(n, offset = 0) Select the first n rows, optionally offset by offset.

import duckdb
rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)")
rel.limit(3).show()

┌───────┐
│ id │
│ int64 │
├───────┤
│ 0 │
│ 1 │
│ 2 │
└───────┘

order(expr) Sort the relation by the given set of expressions.

import duckdb
rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)")
rel.order("id DESC").limit(3).show()

┌────────┐
│ id │
│ int64 │
├────────┤
│ 999999 │
│ 999998 │
│ 999997 │
└────────┘

project(expr) Apply the given expression to each row in the relation.

import duckdb
rel = duckdb.sql("SELECT * FROM range(1_000_000) tbl(id)")
rel.project("id + 10 AS id_plus_ten").limit(3).show()

┌─────────────┐
│ id_plus_ten │
│ int64 │
├─────────────┤
│ 10 │
│ 11 │
│ 12 │
└─────────────┘

union(rel) Combine two relations, returning all rows in r1 followed by all rows in r2. The relations must have the same number of
columns.

import duckdb
r1 = duckdb.sql("SELECT * FROM range(5) tbl(id)")
r2 = duckdb.sql("SELECT * FROM range(10, 15) tbl(id)")
r1.union(r2).show()

┌───────┐
│ id │
│ int64 │
├───────┤
│ 0 │
│ 1 │
│ 2 │
│ 3 │
│ 4 │
│ 10 │
│ 11 │
│ 12 │
│ 13 │
│ 14 │
└───────┘


Result Output

The result of relations can be converted to various types of Python structures; see the result conversion page for more information.

The result of relations can also be directly written to files using the below methods.

• write_csv
• write_parquet

Python Function API

You can create a DuckDB user‑defined function (UDF) out of a Python function so it can be used in SQL queries. Similarly to regular functions,
they need to have a name, a return type and parameter types.

Here is an example using a Python function that calls a third‑party library.

import duckdb
from duckdb.typing import *
from faker import Faker

def random_name():
    fake = Faker()
    return fake.name()

duckdb.create_function("random_name", random_name, [], VARCHAR)

res = duckdb.sql("SELECT random_name()").fetchall()
print(res)
# [('Gerald Ashley',)]

Creating Functions

To register a Python UDF, simply use the create_function method from a DuckDB connection. Here is the syntax:

import duckdb
con = duckdb.connect()
con.create_function(name, function, argument_type_list, return_type, type, null_handling)

The create_function method accepts the following parameters:

1. name: A string representing the unique name of the UDF within the connection catalog.
2. function: The Python function you wish to register as a UDF.
3. parameters (argument_type_list in the syntax above): Scalar functions can operate on one or more columns. This parameter takes a
list of column types used as input.
4. return_type: Scalar functions return one element per row. This parameter specifies the return type of the function.
5. type (Optional): DuckDB supports both built-in Python types and PyArrow Tables. By default, built-in types are assumed, but you
can specify type = 'arrow' to use PyArrow Tables.
6. null_handling (Optional): By default, null values are automatically handled as Null-In Null-Out. Users can specify a desired behavior
for null values by setting null_handling = 'special'.
7. exception_handling (Optional): By default, when an exception is thrown from the Python function, it will be re-thrown in Python.
Users can disable this behavior, and instead return null, by setting this parameter to 'return_null'.
8. side_effects (Optional): By default, functions are expected to produce the same result for the same input. If the result of a function
is impacted by any type of randomness, side_effects must be set to True.

To unregister a UDF, you can call the remove_function method with the UDF name:

con.remove_function(name)


Type Annotation

When the function has type annotations, it's often possible to leave out all of the optional parameters. Using DuckDBPyType, we can
implicitly convert many known types to DuckDB's type system. For example:

import duckdb

def my_function(x: int) -> str:
    # return a str so the value matches the annotated return type
    return str(x)

duckdb.create_function("my_func", my_function)
duckdb.sql("SELECT my_func(42)")

┌─────────────┐
│ my_func(42) │
│ varchar │
├─────────────┤
│ 42 │
└─────────────┘

If only the parameter types can be inferred, you'll need to pass in None as argument_type_list and provide the return type explicitly.

Null Handling

By default, when a function receives a NULL value, it immediately returns NULL as part of the default NULL handling. When this is not desired,
you need to explicitly set the null_handling parameter to "special".

import duckdb
from duckdb.typing import *

def dont_intercept_null(x):
    return 5

duckdb.create_function("dont_intercept", dont_intercept_null, [BIGINT], BIGINT)

res = duckdb.sql("SELECT dont_intercept(NULL)").fetchall()
print(res)
# [(None,)]

duckdb.remove_function("dont_intercept")
duckdb.create_function("dont_intercept", dont_intercept_null, [BIGINT], BIGINT, null_handling="special")
res = duckdb.sql("SELECT dont_intercept(NULL)").fetchall()
print(res)
# [(5,)]

Exception Handling

By default, when an exception is thrown from the Python function, we forward (re-throw) the exception. If you want to disable this
behavior and return NULL instead, set the exception_handling parameter to "return_null".

import duckdb
from duckdb.typing import *

def will_throw():
    raise ValueError("ERROR")

duckdb.create_function("throws", will_throw, [], BIGINT)

try:
    res = duckdb.sql("SELECT throws()").fetchall()
except duckdb.InvalidInputException as e:
    print(e)

duckdb.create_function("doesnt_throw", will_throw, [], BIGINT, exception_handling="return_null")

res = duckdb.sql("SELECT doesnt_throw()").fetchall()
print(res)
# [(None,)]

Side Effects

By default DuckDB will assume the created function is a pure function, meaning it will produce the same output when given the same
input. If your function does not follow that rule, for example when your function makes use of randomness, then you will need to mark this
function as having side_effects.

For example, this function will produce a new count for every invocation:

def count() -> int:
    old = count.counter
    count.counter += 1
    return old

count.counter = 0

If we create this function without marking it as having side effects, the result will be the following:

con = duckdb.connect()
con.create_function("my_counter", count, side_effects = False)
res = con.sql("SELECT my_counter() FROM range(10)").fetchall()
print(res)
# [(0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,), (0,)]

This is obviously not the desired result. When we add side_effects = True, the result is as we would expect:

con.remove_function("my_counter")
count.counter = 0
con.create_function("my_counter", count, side_effects = True)
res = con.sql("SELECT my_counter() FROM range(10)").fetchall()
print(res)
# [(0,), (1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,), (9,)]

Python Function Types

Currently, two function types are supported, native (default) and arrow.

Arrow

If the function is expected to receive Arrow arrays, set the type parameter to 'arrow'.

This tells the system to provide Arrow arrays of up to STANDARD_VECTOR_SIZE tuples to the function, and to expect an array
containing the same number of tuples in return.

Native

When the function type is set to native, the function is provided with a single tuple at a time and is expected to return a single
value. This can be useful for interacting with Python libraries that don't operate on Arrow, such as faker:

import duckdb
from duckdb.typing import *
from faker import Faker

def random_date():
    fake = Faker()
    return fake.date_between()

duckdb.create_function("random_date", random_date, [], DATE, type="native")

res = duckdb.sql("SELECT random_date()").fetchall()
print(res)
# [(datetime.date(2019, 5, 15),)]

Types API

The DuckDBPyType class represents a type instance of our data types.

Converting from Other Types

To make the API as easy to use as possible, we have added implicit conversions from existing type objects to a DuckDBPyType instance.
This means that wherever a DuckDBPyType object is expected, it is also possible to provide any of the options listed below.

Python Built-ins

The table below shows the mapping of Python built-in types to DuckDB types.

Built-in type    DuckDB type
bool             BOOLEAN
bytearray        BLOB
bytes            BLOB
float            DOUBLE
int              BIGINT
str              VARCHAR

Numpy DTypes

The table below shows the mapping of Numpy DTypes to DuckDB types.

Numpy DType      DuckDB type
bool             BOOLEAN
float32          FLOAT
float64          DOUBLE
int16            SMALLINT
int32            INTEGER
int64            BIGINT
int8             TINYINT
uint16           USMALLINT
uint32           UINTEGER
uint64           UBIGINT
uint8            UTINYINT

Nested Types


list[child_type] list type objects map to a LIST type of the child type, and they can be arbitrarily nested.

import duckdb
from typing import Union

duckdb.typing.DuckDBPyType(list[dict[Union[str, int], str]])

# MAP(UNION(u1 VARCHAR, u2 BIGINT), VARCHAR)[]

dict[key_type, value_type] dict type objects map to a MAP type of the key type and the value type.

import duckdb

duckdb.typing.DuckDBPyType(dict[str, int])
# MAP(VARCHAR, BIGINT)

{'a': field_one, 'b': field_two, .., 'n': field_n} dict objects map to a STRUCT composed of the keys and
values of the dict.

import duckdb

duckdb.typing.DuckDBPyType({'a': str, 'b': int})

# STRUCT(a VARCHAR, b BIGINT)

Union[ type_1 , ... type_n ] typing.Union objects map to a UNION type of the provided types.

import duckdb
from typing import Union

duckdb.typing.DuckDBPyType(Union[int, str, bool, bytearray])

# UNION(u1 BIGINT, u2 VARCHAR, u3 BOOLEAN, u4 BLOB)

Creation Functions

For the built-in types, you can use the constants defined in duckdb.typing:

DuckDB type

BIGINT
BIT
BLOB
BOOLEAN
DATE
DOUBLE
FLOAT
HUGEINT
INTEGER
INTERVAL
SMALLINT
SQLNULL
TIME_TZ
TIME
TIMESTAMP_MS

TIMESTAMP_NS
TIMESTAMP_S
TIMESTAMP_TZ
TIMESTAMP
TINYINT
UBIGINT
UHUGEINT
UINTEGER
USMALLINT
UTINYINT
UUID
VARCHAR

For the complex types there are methods available on the DuckDBPyConnection object or the duckdb module. Anywhere a DuckDBPyType
is accepted, we will also accept one of the type objects that can implicitly convert to a DuckDBPyType.

list_type | array_type Parameters:

• child_type: DuckDBPyType

struct_type | row_type Parameters:

• fields: Union[list[DuckDBPyType], dict[str, DuckDBPyType]]

map_type Parameters:

• key_type: DuckDBPyType
• value_type: DuckDBPyType

decimal_type Parameters:

• width: int
• scale: int

union_type Parameters:

• members: Union[list[DuckDBPyType], dict[str, DuckDBPyType]]

string_type Parameters:

• collation: Optional[str]

Expression API

The Expression class represents an instance of an expression.


Why Would I Use the Expression API?

Using this API makes it possible to dynamically build up expressions, which are typically created by the parser from the query string. This
allows you to skip that and have more fine‑grained control over the used expressions.

Below is a list of currently supported expressions that can be created through the API.

Column Expression

This expression references a column by name.

import duckdb
import pandas as pd

df = pd.DataFrame({
'a': [1, 2, 3, 4],
'b': [True, None, False, True],
'c': [42, 21, 13, 14]
})

# selecting a single column

col = duckdb.ColumnExpression('a')
res = duckdb.df(df).select(col).fetchall()
print(res)
# [(1,), (2,), (3,), (4,)]

# selecting multiple columns

col_list = [duckdb.ColumnExpression('a'), duckdb.ColumnExpression('c')]
res = duckdb.df(df).select(*col_list).fetchall()
print(res)
# [(1, 42), (2, 21), (3, 13), (4, 14)]

Star Expression

This expression selects all columns of the input source.

Optionally it's possible to provide an exclude list to filter out columns of the table. This exclude list can contain either strings or
Expressions.

import duckdb
import pandas as pd

df = pd.DataFrame({
'a': [1, 2, 3, 4],
'b': [True, None, False, True],
'c': [42, 21, 13, 14]
})

star = duckdb.StarExpression(exclude = ['b'])

res = duckdb.df(df).select(star).fetchall()
print(res)
# [(1, 42), (2, 21), (3, 13), (4, 14)]

Constant Expression

This expression contains a single value.


import duckdb
import pandas as pd

df = pd.DataFrame({
'a': [1, 2, 3, 4],
'b': [True, None, False, True],
'c': [42, 21, 13, 14]
})

const = duckdb.ConstantExpression('hello')
res = duckdb.df(df).select(const).fetchall()
print(res)
# [('hello',), ('hello',), ('hello',), ('hello',)]

Case Expression

This expression contains a CASE WHEN (...) THEN (...) ELSE (...) END expression. By default ELSE is NULL, and it can be
set using .otherwise(value = ...). Additional WHEN (...) THEN (...) blocks can be added with .when(condition = ...,
value = ...).

import duckdb
import pandas as pd
from duckdb import (
ConstantExpression,
ColumnExpression,
CaseExpression
)

df = pd.DataFrame({
'a': [1, 2, 3, 4],
'b': [True, None, False, True],
'c': [42, 21, 13, 14]
})

hello = ConstantExpression('hello')
world = ConstantExpression('world')

case = \
CaseExpression(condition = ColumnExpression('b') == False, value = world) \
.otherwise(hello)
res = duckdb.df(df).select(case).fetchall()
print(res)
# [('hello',), ('hello',), ('world',), ('hello',)]

Function Expression

This expression contains a function call. It can be constructed by providing the function name and an arbitrary amount of Expressions as
arguments.

import duckdb
import pandas as pd
from duckdb import (
ConstantExpression,
ColumnExpression,
FunctionExpression
)


df = pd.DataFrame({
'a': [
'test',
'pest',
'text',
'rest',
]
})

ends_with = FunctionExpression('ends_with', ColumnExpression('a'), ConstantExpression('est'))

res = duckdb.df(df).select(ends_with).fetchall()
print(res)
# [(True,), (True,), (False,), (True,)]

Common Operations

The Expression class also contains many operations that can be applied to any Expression type.

.cast(type: DuckDBPyType)
Applies a cast to the provided type on the expression.

.alias(name: str)
Apply an alias to the expression.

.isin(*exprs: Expression)
Create an IN expression against the provided expressions as the list.

.isnotin(*exprs: Expression)
Create a NOT IN expression against the provided expressions as the list.

Order Operations

When expressions are provided to DuckDBPyRelation.order(), the following operations take effect:

.asc()
Indicates that this expression should be sorted in ascending order.

.desc()
Indicates that this expression should be sorted in descending order.

.nulls_first()
Indicates that the nulls in this expression should precede the non-null values.

.nulls_last()
Indicates that the nulls in this expression should come after the non‑null values.

Spark API

The DuckDB Spark API implements the PySpark API, allowing you to use the familiar Spark API to interact with DuckDB. All statements are
translated to DuckDB's internal plans using our relational API and executed using DuckDB's query engine.

Warning. The DuckDB Spark API is currently experimental and features are still missing. We are very interested in feedback.
Please report any functionality that you are missing, either through Discord or on GitHub.

Example

from duckdb.experimental.spark.sql import SparkSession as session

from duckdb.experimental.spark.sql.functions import lit, col
import pandas as pd


spark = session.builder.getOrCreate()

pandas_df = pd.DataFrame({
'age': [34, 45, 23, 56],
'name': ['Joan', 'Peter', 'John', 'Bob']
})

df = spark.createDataFrame(pandas_df)
df = df.withColumn(
'location', lit('Seattle')
)
res = df.select(
col('age'),
col('location')
).collect()

print(res)

[
Row(age=34, location='Seattle'),
Row(age=45, location='Seattle'),
Row(age=23, location='Seattle'),
Row(age=56, location='Seattle')
]

Contribution Guidelines

Contributions to the experimental Spark API are welcome. When making a contribution, please follow these guidelines:

• Instead of using temporary files, use our pytest testing framework.

• When adding new functions, ensure that method signatures comply with those in the PySpark API.

Python Client API

Known Python Issues

Unfortunately there are some issues that are either beyond our control or are very elusive / hard to track down. Below is a list of these
issues that you might have to be aware of, depending on your workflow.

Numpy Import Multithreading

When making use of multithreading and fetching results either directly as Numpy arrays or indirectly through a Pandas DataFrame, it might
be necessary to ensure that numpy.core.multiarray is imported. If this module has not been imported from the main thread, and a
different thread attempts to import it during execution, this causes either a deadlock or a crash.

To avoid this, it's recommended to import numpy.core.multiarray before starting up threads.

Running EXPLAIN Renders Newlines in Jupyter and IPython

When DuckDB is run in Jupyter notebooks or in the IPython shell, the output of the EXPLAIN statement contains hard line breaks (\n):

In [1]: import duckdb

...: duckdb.sql("EXPLAIN SELECT 42 AS x")


Out[1]:
┌───────────────┬────────────────────────────────────────────────────────────────────────────────────────────────
│ explain_key │ explain_value
│
│ varchar │ varchar
│
├───────────────┼────────────────────────────────────────────────────────────────────────────────────────────────
│ physical_plan │ ┌───────────────────────────┐\n│ PROJECTION │\n│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
│\n│ x … │
└───────────────┴────────────────────────────────────────────────────────────────────────────────────────────────

To work around this, print the output of the explain() function:

In [2]: print(duckdb.sql("SELECT 42 AS x").explain())

Out[2]:
┌───────────────────────────┐
│ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ x │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ DUMMY_SCAN │
└───────────────────────────┘

Please also check out the Jupyter guide for tips on using Jupyter with JupySQL.

Error When Importing the DuckDB Python Package on Windows

When importing DuckDB on Windows, the Python runtime may return the following error:

import duckdb

ImportError: DLL load failed while importing duckdb: The specified module could not be found.

The solution is to install the Microsoft Visual C++ Redistributable package.

R API

Installation

The DuckDB R API can be installed using install.packages("duckdb"). Please see the installation page for details.

Reference Manual

The reference manual for the DuckDB R API is available at R.duckdb.org.

Basic API Usage

The standard DuckDB R API implements the DBI interface for R. If you are not familiar with DBI yet, see here for an introduction.


Startup & Shutdown

To use DuckDB, you must first create a connection object that represents the database. The connection object
takes as parameter the database file to read and write from. If the database file does not exist, it will be created (the file extension may be
.db, .duckdb, or anything else). The special value :memory: (the default) can be used to create an in-memory database. Note that
for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the R process). If you would like to connect to an
existing database in read-only mode, set the read_only flag to TRUE. Read-only mode is required if multiple R processes want to access
the same database file at the same time.

library("duckdb")
# to start an in-memory database
con <- dbConnect(duckdb())
# or
con <- dbConnect(duckdb(), dbdir = ":memory:")
# to use a database file (not shared between processes)
con <- dbConnect(duckdb(), dbdir = "my-db.duckdb", read_only = FALSE)
# to use a database file (shared between processes)
con <- dbConnect(duckdb(), dbdir = "my-db.duckdb", read_only = TRUE)

Connections are closed implicitly when they go out of scope or if they are explicitly closed using dbDisconnect(). To shut down the
database instance associated with the connection, use dbDisconnect(con, shutdown = TRUE).

Querying

DuckDB supports the standard DBI methods to send queries and retrieve result sets. dbExecute() is meant for queries
where no results are expected, such as CREATE TABLE or UPDATE, and dbGetQuery() is meant for queries that produce
results (e.g., SELECT). Below is an example.

# create a table
dbExecute(con, "CREATE TABLE items (item VARCHAR, value DECIMAL(10, 2), count INTEGER)")
# insert two items into the table
dbExecute(con, "INSERT INTO items VALUES ('jeans', 20.0, 1), ('hammer', 42.2, 2)")

# retrieve the items again

res <- dbGetQuery(con, "SELECT * FROM items")
print(res)
# item value count
# 1 jeans 20.0 1
# 2 hammer 42.2 2

DuckDB also supports prepared statements in the R API with the dbExecute and dbGetQuery methods. Here is an example:

# prepared statement parameters are given as a list

dbExecute(con, "INSERT INTO items VALUES (?, ?, ?)", list('laptop', 2000, 1))

# if you want to reuse a prepared statement multiple times, use dbSendStatement() and dbBind()
stmt <- dbSendStatement(con, "INSERT INTO items VALUES (?, ?, ?)")
dbBind(stmt, list('iphone', 300, 2))
dbBind(stmt, list('android', 3.5, 1))
dbClearResult(stmt)

# query the database using a prepared statement

res <- dbGetQuery(con, "SELECT item FROM items WHERE value > ?", list(400))
print(res)
# item
# 1 laptop

Warning. Do not use prepared statements to insert large amounts of data into DuckDB. See below for better options.


Efficient Transfer

To write an R data frame into DuckDB, use the standard DBI function dbWriteTable(). This creates a table in DuckDB and populates it
with the data frame contents. For example:

dbWriteTable(con, "iris_table", iris)

res <- dbGetQuery(con, "SELECT * FROM iris_table LIMIT 1")
print(res)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1 5.1 3.5 1.4 0.2 setosa

It is also possible to "register" an R data frame as a virtual table, comparable to a SQL VIEW. This does not actually transfer data into DuckDB
yet. Below is an example:

duckdb_register(con, "iris_view", iris)

res <- dbGetQuery(con, "SELECT * FROM iris_view LIMIT 1")
print(res)
# Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1 5.1 3.5 1.4 0.2 setosa

Note. DuckDB keeps a reference to the R data frame after registration. This prevents the data frame from being garbage‑collected.
The reference is cleared when the connection is closed, but can also be cleared manually using the duckdb_unregister()
method.

Also refer to the data import documentation for more options of efficiently importing data.

dbplyr

DuckDB also plays well with the dbplyr / dplyr packages for programmatic query construction from R. Here is an example:

library("duckdb")
library("dplyr")
con <- dbConnect(duckdb())
duckdb_register(con, "flights", nycflights13::flights)

tbl(con, "flights") |>

group_by(dest) |>
summarise(delay = mean(dep_time, na.rm = TRUE)) |>
collect()

When using dbplyr, CSV and Parquet files can be read using the dplyr::tbl function.

# Establish a CSV for the sake of this example

write.csv(mtcars, "mtcars.csv")

# Summarize the dataset in DuckDB to avoid reading the entire CSV into R's memory
tbl(con, "mtcars.csv") |>
group_by(cyl) |>
summarise(across(disp:wt, .fns = mean)) |>
collect()

# Establish a set of Parquet files

dbExecute(con, "COPY flights TO 'dataset' (FORMAT PARQUET, PARTITION_BY (year, month))")

# Summarize the dataset in DuckDB to avoid reading 12 Parquet files into R's memory
tbl(con, "read_parquet('dataset/**/*.parquet', hive_partitioning = true)") |>
filter(month == "3") |>
summarise(delay = mean(dep_time, na.rm = TRUE)) |>
collect()


Rust API

Installation

The DuckDB Rust API can be installed from crates.io. Please see docs.rs for details.

Basic API Usage

duckdb-rs is an ergonomic wrapper based on the DuckDB C API. Please refer to the README for details.

Startup & Shutdown

To use duckdb, you must first initialize a Connection handle using Connection::open(). Connection::open()
takes as parameter the database file to read and write from. If the database file does not exist, it will be created (the file
extension may be .db, .duckdb, or anything else). You can also use Connection::open_in_memory() to create an in-memory
database. Note that for an in-memory database no data is persisted to disk (i.e., all data is lost when you exit the process).

use duckdb::{params, Connection, Result};

let conn = Connection::open_in_memory()?;

You can call conn.close() to close the Connection manually, or simply let it go out of scope: the Drop trait is implemented and will
automatically close the underlying database connection for you.

Querying

SQL queries can be sent to DuckDB using the execute() method of connections, or we can prepare the statement first and
then query on that.

#[derive(Debug)]
struct Person {
id: i32,
name: String,
data: Option<Vec<u8>>,
}

conn.execute(
"INSERT INTO person (name, data) VALUES (?, ?)",
params![me.name, me.data],
)?;

let mut stmt = conn.prepare("SELECT id, name, data FROM person")?;

let person_iter = stmt.query_map([], |row| {
Ok(Person {
id: row.get(0)?,
name: row.get(1)?,
data: row.get(2)?,
})
})?;

for person in person_iter {

println!("Found person {:?}", person.unwrap());
}

Appender

The Rust client supports the DuckDB Appender API for bulk inserts. For example:


fn insert_rows(conn: &Connection) -> Result<()> {

let mut app = conn.appender("foo")?;
app.append_rows([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])?;
Ok(())
}

Swift API

DuckDB offers a Swift API. See the announcement post for details.

Instantiating DuckDB

DuckDB supports both in-memory and persistent databases. To work with an in-memory database, run:

let database = try Database(store: .inMemory)

To work with a persistent database, run:

let database = try Database(store: .file(at: "test.db"))

Queries can be issued through a database connection.

let connection = try database.connect()

DuckDB supports multiple connections per database.

Application Example

The rest of the page is based on the example of our announcement post, which uses raw data from NASA's Exoplanet Archive loaded directly
into DuckDB.

Creating an Application-Specific Type

We first create an application-specific type that we'll use to house our database and connection
and through which we'll eventually define our app-specific queries.

import DuckDB

final class ExoplanetStore {

    let database: Database
    let connection: Connection

    init(database: Database, connection: Connection) {
        self.database = database
        self.connection = connection
    }
}

Loading a CSV File

We load the data from NASA's Exoplanet Archive:

wget "https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+pl_name+,+disc_year+from+pscomppars&format=csv" -O downloaded_exoplanets.csv

Once we have our CSV downloaded locally, we can use the following SQL command to load it as a new table to DuckDB:

CREATE TABLE exoplanets AS
    SELECT * FROM read_csv('downloaded_exoplanets.csv');

Let's package this up as a new asynchronous factory method on our ExoplanetStore type:

import DuckDB
import Foundation

final class ExoplanetStore {

    // Factory method to create and prepare a new ExoplanetStore
    static func create() async throws -> ExoplanetStore {

        // Create our database and connection as described above
        let database = try Database(store: .inMemory)
        let connection = try database.connect()

        // Download the CSV from the exoplanet archive
        let (csvFileURL, _) = try await URLSession.shared.download(
            from: URL(string: "https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+pl_name+,+disc_year+from+pscomppars&format=csv")!)

        // Issue our first query to DuckDB
        try connection.execute("""
            CREATE TABLE exoplanets AS
                SELECT * FROM read_csv('\(csvFileURL.path)');
            """)

        // Create our pre-populated ExoplanetStore instance
        return ExoplanetStore(
            database: database,
            connection: connection
        )
    }

    // Let's make the initializer we defined previously
    // private. This prevents anyone accidentally instantiating
    // the store without having pre-loaded our Exoplanet CSV
    // into the database
    private init(database: Database, connection: Connection) {
        ...
    }
}

Querying the Database

The following example queries DuckDB from within Swift via an async function. This means the caller won't be blocked while the query is executing. We then cast the result columns to native Swift types using the cast(to:) family of methods on DuckDB's ResultSet, before finally wrapping them up in a DataFrame from the TabularData framework.

...

import TabularData

extension ExoplanetStore {

    // Retrieves the number of exoplanets discovered by year
    func groupedByDiscoveryYear() async throws -> DataFrame {

        // Issue the query we described above
        let result = try connection.query("""
            SELECT disc_year, count(disc_year) AS Count
            FROM exoplanets
            GROUP BY disc_year
            ORDER BY disc_year
            """)

        // Cast our DuckDB columns to their native Swift
        // equivalent types
        let discoveryYearColumn = result[0].cast(to: Int.self)
        let countColumn = result[1].cast(to: Int.self)

        // Use our DuckDB columns to instantiate TabularData
        // columns and populate a TabularData DataFrame
        return DataFrame(columns: [
            TabularData.Column(discoveryYearColumn).eraseToAnyColumn(),
            TabularData.Column(countColumn).eraseToAnyColumn(),
        ])
    }
}

Complete Project

For the complete example project, clone the DuckDB Swift repo and open up the runnable app project located in Examples/SwiftUI/ExoplanetExplorer.xcodeproj.

Wasm

DuckDB Wasm

DuckDB has been compiled to WebAssembly, so it can run inside any browser on any device.

DuckDB-Wasm offers a layered API: it can be embedded as a JavaScript + WebAssembly library, used as a Web shell, or built from source, according to your needs.

Getting Started with DuckDB‑Wasm

A great starting point is to read the DuckDB‑Wasm launch blog post!

Another great resource is the GitHub repository.

For details, see the full DuckDB‑Wasm API Documentation.

Instantiation

DuckDB‑Wasm has multiple ways to be instantiated depending on the use case.

CDN (jsDelivr)

import * as duckdb from '@duckdb/duckdb-wasm';

const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();

// Select a bundle based on browser checks

const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);

const worker_url = URL.createObjectURL(
    new Blob([`importScripts("${bundle.mainWorker!}");`], {type: 'text/javascript'})
);

// Instantiate the asynchronous version of DuckDB-Wasm

const worker = new Worker(worker_url);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
URL.revokeObjectURL(worker_url);

webpack

import * as duckdb from '@duckdb/duckdb-wasm';

import duckdb_wasm from '@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm';
import duckdb_wasm_next from '@duckdb/duckdb-wasm/dist/duckdb-eh.wasm';
const MANUAL_BUNDLES: duckdb.DuckDBBundles = {
mvp: {
mainModule: duckdb_wasm,
mainWorker: new URL('@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js',
import.meta.url).toString(),
},
eh: {
mainModule: duckdb_wasm_next,
mainWorker: new URL('@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js',
import.meta.url).toString(),
},
};
// Select a bundle based on browser checks
const bundle = await duckdb.selectBundle(MANUAL_BUNDLES);
// Instantiate the asynchronous version of DuckDB-Wasm
const worker = new Worker(bundle.mainWorker!);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);

vite

import * as duckdb from '@duckdb/duckdb-wasm';

import duckdb_wasm from '@duckdb/duckdb-wasm/dist/duckdb-mvp.wasm?url';
import mvp_worker from '@duckdb/duckdb-wasm/dist/duckdb-browser-mvp.worker.js?url';
import duckdb_wasm_eh from '@duckdb/duckdb-wasm/dist/duckdb-eh.wasm?url';
import eh_worker from '@duckdb/duckdb-wasm/dist/duckdb-browser-eh.worker.js?url';

const MANUAL_BUNDLES: duckdb.DuckDBBundles = {

mvp: {
mainModule: duckdb_wasm,
mainWorker: mvp_worker,
},
eh: {
mainModule: duckdb_wasm_eh,
mainWorker: eh_worker,
},
};
// Select a bundle based on browser checks
const bundle = await duckdb.selectBundle(MANUAL_BUNDLES);
// Instantiate the asynchronous version of DuckDB-Wasm

const worker = new Worker(bundle.mainWorker!);

const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);

Statically Served

It is possible to manually download the files from https://cdn.jsdelivr.net/npm/@duckdb/duckdb-wasm/dist/.

import * as duckdb from '@duckdb/duckdb-wasm';

const MANUAL_BUNDLES: duckdb.DuckDBBundles = {

mvp: {
mainModule: 'change/me/../duckdb-mvp.wasm',
mainWorker: 'change/me/../duckdb-browser-mvp.worker.js',
},
eh: {
mainModule: 'change/me/../duckdb-eh.wasm',
mainWorker: 'change/me/../duckdb-browser-eh.worker.js',
},
};
// Select a bundle based on browser checks
const bundle = await duckdb.selectBundle(MANUAL_BUNDLES);
// Instantiate the asynchronous version of DuckDB-Wasm
const worker = new Worker(bundle.mainWorker!);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);

Data Ingestion

DuckDB‑Wasm has multiple ways to import data, depending on the format of the data.

There are two steps to import data into DuckDB.

First, the data file is imported into a local file system using register functions (registerEmptyFileBuffer, registerFileBuffer, registerFileHandle,
registerFileText, registerFileURL).

Then, the data file is imported into DuckDB using insert functions (insertArrowFromIPCStream, insertArrowTable, insertCSVFromPath, insertJSONFromPath) or directly using a FROM clause in a SQL query (using extensions like Parquet or the Wasm-flavored httpfs).

Insert statements can also be used to import data.

Data Import

Open & Close Connection

// Create a new connection

const c = await db.connect();

// ... import data

// Close the connection to release memory

await c.close();

Apache Arrow

// Data can be inserted from an existing arrow.Table
// More examples: https://arrow.apache.org/docs/js/
import { tableFromArrays } from 'apache-arrow';

// EOS signal according to the Arrow IPC streaming format
// See https://arrow.apache.org/docs/format/Columnar.html#ipc-streaming-format
const EOS = new Uint8Array([255, 255, 255, 255, 0, 0, 0, 0]);

const arrowTable = tableFromArrays({
    id: [1, 2, 3],
    name: ['John', 'Jane', 'Jack'],
    age: [20, 21, 22],
});
await c.insertArrowTable(arrowTable, { name: 'arrow_table' });
// Write EOS
await c.insertArrowTable(EOS, { name: 'arrow_table' });

// ..., from a raw Arrow IPC stream
// Collects one insert promise per chunk of the stream
const streamInserts = [];
// ... push c.insertArrowFromIPCStream(chunk, { name: 'streamed' }) for each chunk read
// Write EOS
streamInserts.push(c.insertArrowFromIPCStream(EOS, { name: 'streamed' }));
await Promise.all(streamInserts);
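
The eight bytes of the EOS constant above follow the Arrow IPC streaming format: a 4-byte continuation indicator (0xFFFFFFFF) followed by a 4-byte length of zero. As a minimal sketch, using only the Python standard library and illustrative names that are not part of any DuckDB API, the same byte sequence can be reproduced like this:

```python
import struct

# Arrow IPC end-of-stream marker: a 32-bit continuation indicator
# (0xFFFFFFFF) followed by a 32-bit metadata length of zero,
# both encoded little-endian.
CONTINUATION = 0xFFFFFFFF
eos = struct.pack("<Ii", CONTINUATION, 0)

print(list(eos))  # [255, 255, 255, 255, 0, 0, 0, 0]
```

This is only meant to explain where the constant comes from; in DuckDB-Wasm code the Uint8Array shown above is what gets passed to the insert functions.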

CSV

// ..., from CSV files
// (interchangeable: registerFile{Text,Buffer,URL,Handle})
import * as arrow from 'apache-arrow';

const csvContent = '1|foo\n2|bar\n';
await db.registerFileText('data.csv', csvContent);
// ... with typed insert options
await c.insertCSVFromPath('data.csv', {
    schema: 'main',
    name: 'foo',
    detect: false,
    header: false,
    delimiter: '|',
    columns: {
        col1: new arrow.Int32(),
        col2: new arrow.Utf8(),
    },
});
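
To make the insert options above concrete, here is a rough Python sketch of what they describe: no header row, '|' as the delimiter, and an explicit column list instead of type detection. It uses only the standard library; parse_rows is a hypothetical helper for illustration and says nothing about how DuckDB-Wasm parses CSV internally.

```python
import csv
import io

csv_content = "1|foo\n2|bar\n"

# Explicit column names and types, mirroring the `columns` option above
# (Int32 becomes int, Utf8 becomes str in this sketch).
columns = {"col1": int, "col2": str}


def parse_rows(text, columns, delimiter="|"):
    """Parse headerless delimited text into typed row dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    names = list(columns)
    rows = []
    for record in reader:
        # Apply each column's type to the matching field of the record
        rows.append({name: columns[name](value) for name, value in zip(names, record)})
    return rows


rows = parse_rows(csv_content, columns)
print(rows)
```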

JSON

// ..., from JSON documents in row-major format
const jsonRowContent = [
    { "col1": 1, "col2": "foo" },
    { "col1": 2, "col2": "bar" },
];
await db.registerFileText(
    'rows.json',
    JSON.stringify(jsonRowContent),
);
await c.insertJSONFromPath('rows.json', { name: 'rows' });

// ... or column-major format
const jsonColContent = {
    "col1": [1, 2],
    "col2": ["foo", "bar"]
};
await db.registerFileText(
    'columns.json',
    JSON.stringify(jsonColContent),
);
await c.insertJSONFromPath('columns.json', { name: 'columns' });

// From API
const streamResponse = await fetch(`someapi/content.json`);
await db.registerFileBuffer('file.json', new Uint8Array(await streamResponse.arrayBuffer()));
await c.insertJSONFromPath('file.json', { name: 'JSONContent' });
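
The row-major and column-major documents above carry the same data in two layouts. The following Python sketch, with hypothetical helper names and the assumption that every row has the same keys, converts between the two to show how they correspond:

```python
import json


def rows_to_columns(rows):
    """Convert a list of row objects into a dict of column arrays."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def columns_to_rows(columns):
    """Convert a dict of equally long column arrays into row objects."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


row_major = [{"col1": 1, "col2": "foo"}, {"col1": 2, "col2": "bar"}]
column_major = rows_to_columns(row_major)
print(json.dumps(column_major))
```

Either layout can be registered as a file and passed to insertJSONFromPath as shown above; the conversion is only needed if an upstream source produces a layout you would rather not store.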

Parquet

// from Parquet files

// ...Local
const pickedFile: File = letUserPickFile();
await db.registerFileHandle('local.parquet', pickedFile, DuckDBDataProtocol.BROWSER_FILEREADER, true);
// ...Remote
await db.registerFileURL('remote.parquet', 'https://origin/remote.parquet', DuckDBDataProtocol.HTTP, false);
// ... Using Fetch
const res = await fetch('https://origin/remote.parquet');
await db.registerFileBuffer('buffer.parquet', new Uint8Array(await res.arrayBuffer()));

// ..., by specifying URLs in the SQL text

await c.query(`
CREATE TABLE direct AS
SELECT * FROM 'https://origin/remote.parquet'
`);
// ..., or by executing raw insert statements
await c.query(`
INSERT INTO existing_table
VALUES (1, 'foo'), (2, 'bar')`);

httpfs (Wasm‑flavored)

// ..., by specifying URLs in the SQL text

await c.query(`
CREATE TABLE direct AS
SELECT * FROM 'https://origin/remote.parquet'
`);

Insert Statement

// ..., or by executing raw insert statements

await c.query(`
INSERT INTO existing_table
VALUES (1, 'foo'), (2, 'bar')`);

Query

DuckDB‑Wasm provides functions for querying data. Queries are run sequentially.

First, a connection needs to be created by calling connect. Then, queries can be run by calling query or send.

Query Execution

// Create a new connection

const conn = await db.connect();

await conn.close();

Extensions

DuckDB‑Wasm's (dynamic) extension loading is modeled after the regular DuckDB's extension loading, with a few relevant differences due
to the difference in platform.

Format

Extensions in DuckDB are binaries to be dynamically loaded via dlopen. A cryptographic signature is appended to the binary. An extension in DuckDB-Wasm is a regular Wasm file to be dynamically loaded via Emscripten's dlopen. A cryptographic signature is appended to the Wasm file as a WebAssembly custom section called duckdb_signature. This ensures the file remains a valid WebAssembly file.

Note. Currently, we require this custom section to be the last one, but this may be relaxed in the future.
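
The layout described above can be checked with a few lines of code. The sketch below walks the sections of a WebAssembly binary (custom sections have id 0 and start with a length-prefixed name) and reports whether the last section is a custom section named duckdb_signature. It uses only the Python standard library; the function names are hypothetical, the payload of the signature section is not interpreted, and the module built at the end is a synthetic stand-in with dummy bytes, not a real extension.

```python
def read_leb128_u32(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def list_sections(module):
    """Return (section_id, custom_name_or_None) for each section, in order."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    pos = 8  # skip the 4-byte magic and the 4-byte version
    sections = []
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_leb128_u32(module, pos + 1)
        name = None
        if section_id == 0:  # custom section: payload starts with its name
            name_len, name_pos = read_leb128_u32(module, pos)
            name = module[name_pos:name_pos + name_len].decode("utf-8")
        sections.append((section_id, name))
        pos += size  # jump over the section payload
    return sections


def signature_is_last(module):
    """True if the final section is the custom section 'duckdb_signature'."""
    sections = list_sections(module)
    return bool(sections) and sections[-1] == (0, "duckdb_signature")


def custom_section(name, payload):
    """Build a small custom section (name and body shorter than 128 bytes)."""
    body = bytes([len(name)]) + name.encode("utf-8") + payload
    return bytes([0, len(body)]) + body


# Synthetic module: header, an unrelated custom section, then the signature
header = b"\x00asm" + b"\x01\x00\x00\x00"
signed = header + custom_section("name", b"") + custom_section("duckdb_signature", b"\x01\x02\x03")
print(signature_is_last(signed))
```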

INSTALL and LOAD

In native embeddings of DuckDB, INSTALL fetches the extension, decompresses it from gzip and stores it on local disk. LOAD (optionally) performs signature checks and dynamically loads the binary into the main DuckDB binary.

In DuckDB‑Wasm, INSTALL is a no‑op given there is no durable cross‑session storage. The LOAD operation will fetch (and decompress on
the fly), perform signature checks and dynamically load via the Emscripten implementation of dlopen.

Autoloading

Autoloading, i.e., the possibility for DuckDB to add extension functionality on‑the‑fly, is enabled by default in DuckDB‑Wasm.

List of Officially Available Extensions

| Extension name | Description | Aliases |
|---|---|---|
| autocomplete | Adds support for autocomplete in the shell | |
| excel | Adds support for Excel-like format strings | |
| fts | Adds support for Full-Text Search Indexes | |
| icu | Adds support for time zones and collations using the ICU library | |
| inet | Adds support for IP-related data types and functions | |
| json | Adds support for JSON operations | |
| parquet | Adds support for reading and writing Parquet files | |
| sqlite (GitHub) | Adds support for reading SQLite database files | sqlite, sqlite3 |
| sqlsmith | | |
| substrait (GitHub) | Adds support for the Substrait integration | |
| tpcds | Adds TPC-DS data generation and query support | |
| tpch | Adds TPC-H data generation and query support | |

WebAssembly is effectively an additional platform, and platform-specific limitations may prevent some extensions from matching their native capabilities or may force them to work differently. Relevant differences for DuckDB-hosted extensions are documented here.

HTTPFS

The HTTPFS extension is, at the moment, not available in DuckDB-Wasm. HTTPS protocol capabilities need to go through an additional layer, the browser, which adds both differences and restrictions compared to what is possible natively.

Instead, DuckDB-Wasm has a separate implementation that is interchangeable for most purposes but does not support all use cases, as it must follow security rules imposed by the browser, such as CORS. Due to this CORS restriction, any request for data made using the HTTPFS extension must go to websites that allow (using CORS headers) the website hosting the DuckDB-Wasm instance to access that data. The MDN website is a great resource for more information regarding CORS.

Extension Signing

As with regular DuckDB extensions, DuckDB-Wasm extensions are by default checked on LOAD to verify the signature and confirm that the extension has not been tampered with. Extension signature verification can be disabled via a configuration option. Signing is a property of the binary itself, so copying a DuckDB extension (say, to serve it from a different location, e.g., for local development) will still keep a valid signature.

Fetching DuckDB‑Wasm Extensions

Official DuckDB extensions are served at extensions.duckdb.org, which is also the default value of the default_extension_repository option. When installing extensions, a URL is built that looks like extensions.duckdb.org/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.gz.

DuckDB-Wasm extensions are fetched only on load, and the URL looks like extensions.duckdb.org/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm.

Note that an additional duckdb-wasm is added to the folder structure, and the file is served as a .wasm file.

DuckDB-Wasm extensions are served pre-compressed using Brotli compression. When fetched from a browser, extensions are transparently decompressed. If you want to fetch a DuckDB-Wasm extension manually, you can use curl --compressed extensions.duckdb.org/<...>/icu.duckdb_extension.wasm.
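
The two URL patterns described in this section can be summarized in a short sketch. The helper names below are hypothetical, and the version hash and platform strings are placeholder values chosen for illustration only; the path structure is taken from the patterns quoted above.

```python
DEFAULT_REPOSITORY = "extensions.duckdb.org"


def native_extension_url(version_hash, platform, name, repository=DEFAULT_REPOSITORY):
    """URL pattern for native DuckDB extensions (gzip-compressed)."""
    return f"{repository}/{version_hash}/{platform}/{name}.duckdb_extension.gz"


def wasm_extension_url(version_hash, platform, name, repository=DEFAULT_REPOSITORY):
    """URL pattern for DuckDB-Wasm extensions: extra 'duckdb-wasm' folder, .wasm suffix."""
    return f"{repository}/duckdb-wasm/{version_hash}/{platform}/{name}.duckdb_extension.wasm"


# Placeholder values, not real version hashes or platform names
print(native_extension_url("0123abcd", "some_platform", "icu"))
print(wasm_extension_url("0123abcd", "some_wasm_platform", "icu"))
```

Passing a different repository argument gives the same structure under another host, which is the layout a self-hosted repository would need to reproduce.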

Serving Extensions from a Third‑Party Repository

As with regular DuckDB, if you use SET custom_extension_repository = some.url.com, subsequent loads will be attempted
at some.url.com/duckdb-wasm/$duckdb_version_hash/$duckdb_platform/$name.duckdb_extension.wasm.

Note that GET requests for the extensions need to be CORS-enabled for a browser to allow the connection.

Tooling

Both DuckDB-Wasm and its extensions have been compiled using the latest packaged Emscripten toolchain.

ADBC API

Arrow Database Connectivity (ADBC), similarly to ODBC and JDBC, is a C‑style API that enables code portability between different database
systems. This allows developers to effortlessly build applications that communicate with database systems without using code specific to
that system. The main difference between ADBC and ODBC/JDBC is that ADBC uses Arrow to transfer data between the database system and
the application. DuckDB has an ADBC driver, which takes advantage of the zero‑copy integration between DuckDB and Arrow to efficiently
transfer data.

DuckDB's ADBC driver currently supports version 0.7 of ADBC.

Please refer to the ADBC documentation page for a more extensive discussion on ADBC and a detailed API explanation.

Implemented Functionality

The DuckDB-ADBC driver implements the full ADBC specification, with the exception of the ConnectionReadPartition and StatementExecutePartitions functions. Both of these functions exist to support systems that internally partition the query results, which does not apply to DuckDB. In this section, we describe the main functions that exist in ADBC, along with the arguments they take, and provide examples for each function.

Database

Set of functions that operate on a database.

| Function name | Description | Arguments | Example |
|---|---|---|---|
| DatabaseNew | Allocate a new (but uninitialized) database. | (AdbcDatabase *database, AdbcError *error) | AdbcDatabaseNew(&adbc_database, &adbc_error) |
| DatabaseSetOption | Set a char* option. | (AdbcDatabase *database, const char *key, const char *value, AdbcError *error) | AdbcDatabaseSetOption(&adbc_database, "path", "test.db", &adbc_error) |
| DatabaseInit | Finish setting options and initialize the database. | (AdbcDatabase *database, AdbcError *error) | AdbcDatabaseInit(&adbc_database, &adbc_error) |
| DatabaseRelease | Destroy the database. | (AdbcDatabase *database, AdbcError *error) | AdbcDatabaseRelease(&adbc_database, &adbc_error) |

Connection

A set of functions that create and destroy a connection to interact with a database.

| Function name | Description | Arguments | Example |
|---|---|---|---|
| ConnectionNew | Allocate a new (but uninitialized) connection. | (AdbcConnection*, AdbcError*) | AdbcConnectionNew(&adbc_connection, &adbc_error) |
| ConnectionSetOption | Options may be set before ConnectionInit. | (AdbcConnection*, const char*, const char*, AdbcError*) | AdbcConnectionSetOption(&adbc_connection, ADBC_CONNECTION_OPTION_AUTOCOMMIT, ADBC_OPTION_VALUE_DISABLED, &adbc_error) |
| ConnectionInit | Finish setting options and initialize the connection. | (AdbcConnection*, AdbcDatabase*, AdbcError*) | AdbcConnectionInit(&adbc_connection, &adbc_database, &adbc_error) |
| ConnectionRelease | Destroy this connection. | (AdbcConnection*, AdbcError*) | AdbcConnectionRelease(&adbc_connection, &adbc_error) |

A set of functions that retrieve metadata about the database. In general, these functions will return Arrow objects, specifically an ArrowArrayStream.

| Function name | Description | Arguments | Example |
|---|---|---|---|
| ConnectionGetObjects | Get a hierarchical view of all catalogs, database schemas, tables, and columns. | (AdbcConnection*, int, const char*, const char*, const char*, const char**, const char*, ArrowArrayStream*, AdbcError*) | AdbcConnectionGetObjects(&adbc_connection, ADBC_OBJECT_DEPTH_ALL, nullptr, nullptr, nullptr, nullptr, nullptr, &arrow_stream, &adbc_error) |
| ConnectionGetTableSchema | Get the Arrow schema of a table. | (AdbcConnection*, const char*, const char*, const char*, ArrowSchema*, AdbcError*) | AdbcConnectionGetTableSchema(&adbc_connection, nullptr, nullptr, "table_name", &arrow_schema, &adbc_error) |
| ConnectionGetTableTypes | Get a list of table types in the database. | (AdbcConnection*, ArrowArrayStream*, AdbcError*) | AdbcConnectionGetTableTypes(&adbc_connection, &arrow_stream, &adbc_error) |

A set of functions with transaction semantics for the connection. By default, all connections start with auto-commit mode on, but this can be turned off via the ConnectionSetOption function.

| Function name | Description | Arguments | Example |
|---|---|---|---|
| ConnectionCommit | Commit any pending transactions. | (AdbcConnection*, AdbcError*) | AdbcConnectionCommit(&adbc_connection, &adbc_error) |
| ConnectionRollback | Rollback any pending transactions. | (AdbcConnection*, AdbcError*) | AdbcConnectionRollback(&adbc_connection, &adbc_error) |

Statement

Statements hold state related to query execution. They represent both one-off queries and prepared statements. They can be reused; however, doing so will invalidate prior result sets from that statement.

The functions used to create, destroy, and set options for a statement:

| Function name | Description | Arguments | Example |
|---|---|---|---|
| StatementNew | Create a new statement for a given connection. | (AdbcConnection*, AdbcStatement*, AdbcError*) | AdbcStatementNew(&adbc_connection, &adbc_statement, &adbc_error) |
| StatementRelease | Destroy a statement. | (AdbcStatement*, AdbcError*) | AdbcStatementRelease(&adbc_statement, &adbc_error) |
| StatementSetOption | Set a string option on a statement. | (AdbcStatement*, const char*, const char*, AdbcError*) | AdbcStatementSetOption(&adbc_statement, ADBC_INGEST_OPTION_TARGET_TABLE, "TABLE_NAME", &adbc_error) |

Functions related to query execution:

| Function name | Description | Arguments | Example |
|---|---|---|---|
| StatementSetSqlQuery | Set the SQL query to execute. The query can then be executed with StatementExecuteQuery. | (AdbcStatement*, const char*, AdbcError*) | AdbcStatementSetSqlQuery(&adbc_statement, "SELECT * FROM TABLE", &adbc_error) |
| StatementSetSubstraitPlan | Set a substrait plan to execute. The query can then be executed with StatementExecuteQuery. | (AdbcStatement*, const uint8_t*, size_t, AdbcError*) | AdbcStatementSetSubstraitPlan(&adbc_statement, substrait_plan, length, &adbc_error) |
| StatementExecuteQuery | Execute a statement and get the results. | (AdbcStatement*, ArrowArrayStream*, int64_t*, AdbcError*) | AdbcStatementExecuteQuery(&adbc_statement, &arrow_stream, &rows_affected, &adbc_error) |
| StatementPrepare | Turn this statement into a prepared statement to be executed multiple times. | (AdbcStatement*, AdbcError*) | AdbcStatementPrepare(&adbc_statement, &adbc_error) |

Functions related to binding, used for bulk insertion or in prepared statements.

| Function name | Description | Arguments | Example |
|---|---|---|---|
| StatementBindStream | Bind Arrow Stream. This can be used for bulk inserts or prepared statements. | (AdbcStatement*, ArrowArrayStream*, AdbcError*) | AdbcStatementBindStream(&adbc_statement, &input_data, &adbc_error) |

Examples

Regardless of the programming language being used, there are two database options which will be required to utilize ADBC with DuckDB.
The first one is the driver, which takes a path to the DuckDB library. The second option is the entrypoint, which is an exported function
from the DuckDB‑ADBC driver that initializes all the ADBC functions. Once we have configured these two options, we can optionally set the
path option, providing a path on disk to store our DuckDB database. If not set, an in‑memory database is created. After configuring all the
necessary options, we can proceed to initialize our database. Below is how you can do so with various different language environments.

C++

We begin our C++ example by declaring the essential variables for querying data through ADBC. These variables include Error, Database, Connection, Statement handling, and an Arrow Stream to transfer data between DuckDB and the application.

AdbcError adbc_error;
AdbcDatabase adbc_database;
AdbcConnection adbc_connection;
AdbcStatement adbc_statement;
ArrowArrayStream arrow_stream;

We can then initialize our database variable. Before initializing the database, we need to set the driver and entrypoint
options as mentioned above. Then we set the path option and initialize the database. With the example below, the string
"path/to/libduckdb.dylib" should be the path to the dynamic library for DuckDB. This will be .dylib on macOS, and
.so on Linux.

AdbcDatabaseNew(&adbc_database, &adbc_error);
AdbcDatabaseSetOption(&adbc_database, "driver", "path/to/libduckdb.dylib", &adbc_error);
AdbcDatabaseSetOption(&adbc_database, "entrypoint", "duckdb_adbc_init", &adbc_error);
// By default, we start an in-memory database, but you can optionally define a path to store it on disk.
AdbcDatabaseSetOption(&adbc_database, "path", "test.db", &adbc_error);
AdbcDatabaseInit(&adbc_database, &adbc_error);

After initializing the database, we must create and initialize a connection to it.

AdbcConnectionNew(&adbc_connection, &adbc_error);
AdbcConnectionInit(&adbc_connection, &adbc_database, &adbc_error);

We can now initialize our statement and run queries through our connection. After AdbcStatementExecuteQuery returns, arrow_stream is populated with the result.

AdbcStatementNew(&adbc_connection, &adbc_statement, &adbc_error);
AdbcStatementSetSqlQuery(&adbc_statement, "SELECT 42", &adbc_error);
int64_t rows_affected;
AdbcStatementExecuteQuery(&adbc_statement, &arrow_stream, &rows_affected, &adbc_error);
// Release the stream once the result has been consumed
arrow_stream.release(&arrow_stream);

Besides running queries, we can also ingest data via arrow_streams. For this we need to set an option with the table name we want to
insert to, bind the stream and then execute the query.

AdbcStatementSetOption(&adbc_statement, ADBC_INGEST_OPTION_TARGET_TABLE, "AnswerToEverything", &adbc_error);
AdbcStatementBindStream(&adbc_statement, &arrow_stream, &adbc_error);
AdbcStatementExecuteQuery(&adbc_statement, nullptr, nullptr, &adbc_error);

Python

The first step is to use pip to install the ADBC driver manager. You will also need to install pyarrow to directly access Apache Arrow formatted result sets (such as using fetch_arrow_table).

pip install adbc_driver_manager pyarrow

Note. For details on the adbc_driver_manager package, see the adbc_driver_manager package documentation.

As with C++, we need to provide initialization options consisting of the location of the libduckdb shared object and entrypoint function.
Notice that the path argument for DuckDB is passed in through the db_kwargs dictionary.

import adbc_driver_duckdb.dbapi

with adbc_driver_duckdb.dbapi.connect("test.db") as conn, conn.cursor() as cur:
    cur.execute("SELECT 42")
    # fetch a pyarrow table
    tbl = cur.fetch_arrow_table()
    print(tbl)

Alongside fetch_arrow_table, other methods from DBApi are also implemented on the cursor, such as fetchone and fetchall.
Data can also be ingested via arrow_streams. We just need to set options on the statement to bind the stream of data and execute the
query.

import adbc_driver_duckdb.dbapi
import pyarrow

data = pyarrow.record_batch(
    [[1, 2, 3, 4], ["a", "b", "c", "d"]],
    names=["ints", "strs"],
)

with adbc_driver_duckdb.dbapi.connect("test.db") as conn, conn.cursor() as cur:
    cur.adbc_ingest("AnswerToEverything", data)

ODBC

ODBC API ‑ Overview

ODBC (Open Database Connectivity) is a C-style API that provides access to different flavors of Database Management Systems (DBMSs). The ODBC API consists of the Driver Manager (DM) and the ODBC drivers.

The DM is part of the system library, e.g., unixODBC, which manages the communications between the user applications and the ODBC
drivers. Typically, applications are linked against the DM, which uses Data Source Name (DSN) to look up the correct ODBC driver.

The ODBC driver is a DBMS implementation of the ODBC API, which handles all the internals of that DBMS.

The DM maps user application calls of ODBC functions to the correct ODBC driver that performs the specified function and returns the
proper values.

DuckDB ODBC Driver

DuckDB supports ODBC version 3.0 according to the Core Interface Conformance.

We release the ODBC driver as assets for Linux and Windows. Users can download them from the Latest Release of DuckDB.

Operating Systems

| Operating system | Supported versions |
|---|---|
| Linux | Ubuntu 20.04 or later |
| Microsoft Windows | Microsoft Windows 10 or later |

ODBC API ‑ Linux

A driver manager is required to manage communication between applications and the ODBC driver. We have tested and support unixODBC, a complete ODBC driver manager for Linux. Users can install it from the command line:

Debian Flavors

sudo apt-get install unixodbc odbcinst

Fedora Flavors

sudo yum install unixODBC

Step 1: Download ODBC Driver

DuckDB releases the ODBC driver as an asset. For Linux, download it from the ODBC Linux Asset, which contains the following artifacts:

libduckdb_odbc.so: the DuckDB driver compiled for Ubuntu 16.04.

unixodbc_setup.sh: a setup script to aid the configuration on Linux.

Step 2: Extracting ODBC Artifacts

Run unzip to extract the files to a permanent directory:

mkdir duckdb_odbc
unzip duckdb_odbc-linux-amd64.zip -d duckdb_odbc

Step 3: Configuring with unixODBC

The unixodbc_setup.sh script aids the configuration of the DuckDB ODBC Driver. It relies on the unixODBC package, which provides commands such as odbcinst and isql to set up and test ODBC.

In a terminal window, change to the duckdb_odbc permanent directory and run the script with one of the level options, -u or -s, to configure DuckDB ODBC.

User-Level ODBC Setup (-u)

The -u option sets up the ODBC init files in the user's home directory.

./unixodbc_setup.sh -u

The default configuration consists of a database :memory:.

System-Level ODBC Setup (-s)

The -s option changes the system-level files, which are visible to all users, and therefore requires root privileges.

sudo ./unixodbc_setup.sh -s

The default configuration consists of a database :memory:.

Show Usage (--help)

The --help option shows the usage of unixodbc_setup.sh, which provides additional options for a custom configuration, such as -db and -D.

./unixodbc_setup.sh --help

Usage: ./unixodbc_setup.sh <level> [options]

Example: ./unixodbc_setup.sh -u -db ~/database_path -D ~/driver_path/libduckdb_odbc.so

Level:
-s: System-level, using 'sudo' to configure DuckDB ODBC at the system-level, changing the files:
/etc/odbc[inst].ini
-u: User-level, configuring the DuckDB ODBC at the user-level, changing the files: ~/.odbc[inst].ini.

Options:
-db database_path: the DuckDB database file path, the default is ':memory:' if not provided.
-D driver_path: the driver file path (i.e., the path for libduckdb_odbc.so), the default is using the base script directory

Step 4 (Optional): Configure the ODBC Driver

The ODBC setup on Linux is based on two well-known files, .odbc.ini and .odbcinst.ini. These files can be placed in the system /etc directory or in the user's home directory /home/<user> (abbreviated as ~/). The DM gives priority to the user configuration files over the system files.

The .odbc.ini File

The .odbc.ini contains the DSNs for the drivers, which can have specific knobs.

An example of .odbc.ini with DuckDB would be:

[DuckDB]
Driver = DuckDB Driver
Database = :memory:

[DuckDB]: between the brackets is a DSN for the DuckDB.

Driver: it describes the driver's name, and other configurations will be placed at the .odbcinst.ini.

Database: it describes the database name used by DuckDB, and it can also be a file path to a .db in the system.

The .odbcinst.ini File

The .odbcinst.ini contains general configurations for the ODBC drivers installed in the system. A driver section starts with the driver name between brackets, followed by the configuration knobs that belong to that driver.

An example of .odbcinst.ini with the DuckDB driver would be:

[ODBC]
Trace = yes
TraceFile = /tmp/odbctrace

[DuckDB Driver]
Driver = /home/<user>/duckdb_odbc/libduckdb_odbc.so

[ODBC]: it is the DM configuration section.

Trace: it enables the ODBC trace file using the option yes.

TraceFile: the absolute system file path for the ODBC trace file.

[DuckDB Driver]: the section of the DuckDB installed driver.

Driver: the absolute system file path of the DuckDB driver.
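
To illustrate how the two files relate, the sketch below follows a DSN from .odbc.ini to the driver section in .odbcinst.ini and reads the driver path from there. It is a simplified model using Python's configparser, not how unixODBC itself is implemented; resolve_dsn is a hypothetical helper, and /home/user is a placeholder home directory.

```python
import configparser

# Contents modeled on the example files above; /home/user is a placeholder
ODBC_INI = """
[DuckDB]
Driver = DuckDB Driver
Database = :memory:
"""

ODBCINST_INI = """
[ODBC]
Trace = yes
TraceFile = /tmp/odbctrace

[DuckDB Driver]
Driver = /home/user/duckdb_odbc/libduckdb_odbc.so
"""


def resolve_dsn(dsn, odbc_ini_text, odbcinst_ini_text):
    """Look up a DSN and return its driver name, driver path and database."""
    dsns = configparser.ConfigParser()
    dsns.read_string(odbc_ini_text)
    drivers = configparser.ConfigParser()
    drivers.read_string(odbcinst_ini_text)

    # The DSN section names the driver; the driver section holds its path
    driver_name = dsns[dsn]["Driver"]
    return {
        "driver_name": driver_name,
        "driver_path": drivers[driver_name]["Driver"],
        "database": dsns[dsn]["Database"],
    }


resolved = resolve_dsn("DuckDB", ODBC_INI, ODBCINST_INI)
print(resolved)
```

The point of the indirection is that several DSNs (for example, one per database file) can share a single driver entry.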

ODBC API ‑ Windows

Microsoft Windows requires an ODBC Driver Manager to manage communication between applications and the ODBC drivers. The DM on Windows is provided in a DLL file, odbccp32.dll, along with other files and tools. For detailed information, check out the Common ODBC Component Files.

Step 1: Download ODBC Driver

DuckDB releases the ODBC driver as an asset. For Windows, download it from the Windows asset, which contains the following artifacts:

duckdb_odbc.dll: the DuckDB driver compiled for Windows.

duckdb_odbc_setup.dll: a setup DLL used by the Windows ODBC Data Source Administrator tool.

odbc_install.exe: an installation script to aid the configuration on Windows.

Step 2: Extracting ODBC Artifacts

Unzip the file to a permanent directory (e.g., duckdb_odbc).

An example with PowerShell and the unzip command would be:

mkdir duckdb_odbc
unzip duckdb_odbc-windows-amd64.zip -d duckdb_odbc

Step 3: ODBC Windows Installer

The odbc_install.exe aids the configuration of the DuckDB ODBC Driver on Windows. It depends on Odbccp32.dll, which provides
functions to configure the ODBC registry entries.

Inside the permanent directory (e.g., duckdb_odbc), double‑click on the odbc_install.exe.

Windows administrator privileges are required. For a non-administrator user, a User Account Control prompt is displayed.

Step 4: Configure the ODBC Driver

The odbc_install.exe adds a default DSN configuration into the ODBC registries with a default database :memory:.

DSN Windows Setup After the installation, it is possible to change the default DSN configuration or add a new one using the Windows
ODBC Data Source Administrator tool odbcad32.exe.

It can also be launched through the Windows Start menu.

Default DuckDB DSN The newly installed DSN is visible on the System DSN in the Windows ODBC Data Source Administrator tool:

Changing DuckDB DSN When selecting the default DSN (i.e., DuckDB) or adding a new configuration, the following setup window will
display:

This window allows you to set the DSN and the database file path associated with that DSN.

More Detailed Windows Setup

There are two ways to configure the ODBC driver, either by altering the registry keys as detailed below, or by connecting with
SQLDriverConnect. A combination of the two is also possible.

Furthermore, the ODBC driver supports all the configuration options included in DuckDB.

Note. If a configuration is set in both the connection string passed to SQLDriverConnect and in the odbc.ini file, the one
passed to SQLDriverConnect will take precedence.

Registry Keys The ODBC setup on Windows is based on registry keys (see Registry Entries for ODBC Components). The ODBC entries can
be placed at the current user registry key (HKCU) or the system registry key (HKLM).

We have tested and used the system entries based on HKLM->SOFTWARE->ODBC. The odbc_install.exe changes this entry that has
two subkeys: ODBC.INI and ODBCINST.INI.

The ODBC.INI is where users usually insert DSN registry entries for the drivers.

For example, the DSN registry for DuckDB would look like this:

The ODBCINST.INI contains one entry for each ODBC driver and other keys predefined for Windows ODBC configuration.

ODBC API ‑ macOS

A driver manager is required to manage communication between applications and the ODBC driver. We tested and support unixODBC,
a complete ODBC driver manager for macOS (and Linux). Users can install it from the command line:

Brew

brew install unixodbc

Step 1: Download ODBC Driver

DuckDB releases the ODBC driver as an asset. For macOS, download it from the ODBC macOS asset, which contains the following artifacts:

libduckdb_odbc.dylib: the DuckDB ODBC driver compiled for macOS (with Intel and Apple Silicon support).

Step 2: Extracting ODBC Artifacts

Run unzip to extract the files to a permanent directory:

mkdir duckdb_odbc
unzip duckdb_odbc-osx-universal.zip -d duckdb_odbc

Step 3: Configure the ODBC Driver

There are two ways to configure the ODBC driver, either by initializing the configuration files listed below, or by connecting with
SQLDriverConnect. A combination of the two is also possible.

Furthermore, the ODBC driver supports all the configuration options included in DuckDB.

Note. If a configuration is set in both the connection string passed to SQLDriverConnect and in the odbc.ini file, the one
passed to SQLDriverConnect will take precedence.

The odbc.ini or .odbc.ini File The .odbc.ini contains the DSNs for the drivers, which can have specific knobs.

Example of .odbc.ini with DuckDB:

[DuckDB]
Driver = DuckDB Driver
Database=:memory:
access_mode=read_only
allow_unsigned_extensions=true

• [DuckDB]: between the brackets is a DSN for DuckDB.
• Driver: describes the driver's name, as well as where to find the configurations in the .odbcinst.ini.
• Database: describes the database name used by DuckDB; it can also be a file path to a .db file in the system.
• access_mode: the mode in which to connect to the database.
• allow_unsigned_extensions: allow the use of unsigned extensions.
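
The precedence rule from the note above (a setting in the connection string wins over the same setting in odbc.ini) can be modeled in a few lines of Python. This is a sketch of the rule only, not the driver's implementation: parse_connection_string and effective_options are hypothetical helpers, the parser assumes plain key=value;key=value strings without quoting, and the read_write override is an illustrative value.

```python
def parse_connection_string(conn_str):
    """Split an ODBC-style 'key=value;key=value' string into a dict."""
    options = {}
    for part in conn_str.split(";"):
        if not part.strip():
            continue  # tolerate a trailing semicolon
        key, _, value = part.partition("=")
        options[key.strip().lower()] = value.strip()
    return options


def effective_options(ini_options, conn_str):
    """Merge DSN options with connection-string options; the latter win."""
    merged = {key.lower(): value for key, value in ini_options.items()}
    merged.update(parse_connection_string(conn_str))
    return merged


# Options as they appear in the example DSN above.
ini_options = {"Database": ":memory:", "access_mode": "read_only"}

merged = effective_options(ini_options, "DSN=DuckDB;access_mode=read_write")
print(merged["access_mode"])
```

Settings that appear in only one of the two places are kept unchanged, which corresponds to the "combination of the two" mentioned in the text.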

Example of .odbcinst.ini with the DuckDB:

[ODBC]
Trace = yes
TraceFile = /tmp/odbctrace

[DuckDB Driver]
Driver = /Users/<user>/duckdb_odbc/libduckdb_odbc.dylib

• [ODBC]: it is the DM configuration section.

• Trace: it enables the ODBC trace file using the option yes.
• TraceFile: the absolute system file path for the ODBC trace file.
• [DuckDB Driver]: the section of the DuckDB installed driver.
• Driver: the absolute system file path of the DuckDB driver.

Step 4 (Optional): Test the ODBC Driver

After the configuration, you can validate the installation with an ODBC client. unixODBC provides a command-line tool called isql.

Use the DSN defined in odbc.ini as a parameter of isql.

isql DuckDB

SQL> SELECT 42;

+------------+
| 42 |
+------------+
| 42 |
+------------+

SQLRowCount returns -1
1 rows fetched

Configuration

Configuration

DuckDB has a number of configuration options that can be used to change the behavior of the system.

The configuration options can be set using either the SET statement or the PRAGMA statement. They can be reset to their original values
using the RESET statement. The values of configuration options can be queried via the current_setting() scalar function or using
the duckdb_settings() table function.

Examples

-- set the memory limit of the system to 10GB

SET memory_limit = '10GB';
-- configure the system to use 1 thread
SET threads TO 1;
-- enable printing of a progress bar during long-running queries
SET enable_progress_bar = true;
-- set the default null order to NULLS LAST
SET default_null_order = 'nulls_last';

-- return the current value of a specific setting

SELECT current_setting('threads') AS threads;

┌─────────┐
│ threads │
│ int64 │
├─────────┤
│ 10 │
└─────────┘

-- query a specific setting

SELECT *
FROM duckdb_settings()
WHERE name = 'threads';

┌─────────┬─────────┬─────────────────────────────────────────────────┬────────────┐
│ name │ value │ description │ input_type │
│ varchar │ varchar │ varchar │ varchar │
├─────────┼─────────┼─────────────────────────────────────────────────┼────────────┤
│ threads │ 10 │ The number of total threads used by the system. │ BIGINT │
└─────────┴─────────┴─────────────────────────────────────────────────┴────────────┘

-- show a list of all available settings

SELECT *
FROM duckdb_settings();

-- reset the memory limit of the system back to the default

RESET memory_limit;

Secrets Manager

DuckDB has a Secrets manager, which provides a unified user interface for secrets across all backends (e.g., AWS S3) that use them.

Configuration Reference

Below is a list of all available settings.

Name | Description | Input type | Default value
Calendar | The current calendar | VARCHAR | System (locale) calendar
TimeZone | The current time zone | VARCHAR | System (locale) timezone
access_mode | Access mode of the database (AUTOMATIC, READ_ONLY or READ_WRITE) | VARCHAR | automatic
allocator_flush_threshold | Peak allocation threshold at which to flush the allocator after completing a task. | VARCHAR | 128.0 MiB
allow_persistent_secrets | Allow the creation of persistent secrets, that are stored and loaded on restarts | BOOLEAN | 1
allow_unsigned_extensions | Allow to load extensions with invalid or missing signatures | BOOLEAN | false
arrow_large_buffer_size | If arrow buffers for strings, blobs, uuids and bits should be exported using large buffers | BOOLEAN | false
autoinstall_extension_repository | Overrides the custom endpoint for extension installation on autoloading | VARCHAR |
autoinstall_known_extensions | Whether known extensions are allowed to be automatically installed when a query depends on them | BOOLEAN | true
autoload_known_extensions | Whether known extensions are allowed to be automatically loaded when a query depends on them | BOOLEAN | true
binary_as_string | In Parquet files, interpret binary data as a string. | BOOLEAN |
ca_cert_file | Path to a custom certificate file for self-signed certificates. By default not set. | VARCHAR |
checkpoint_threshold, wal_autocheckpoint | The WAL size threshold at which to automatically trigger a checkpoint (e.g., 1GB) | VARCHAR | 16.0 MiB
custom_extension_repository | Overrides the custom endpoint for remote extension installation | VARCHAR |
custom_user_agent | Metadata from DuckDB callers | VARCHAR |
default_collation | The collation setting used when none is specified | VARCHAR |
default_null_order, null_order | Null ordering used when none is specified (NULLS_FIRST or NULLS_LAST) | VARCHAR | NULLS_LAST
default_order | The order type used when none is specified (ASC or DESC) | VARCHAR | ASC
default_secret_storage | Allows switching the default storage for secrets | VARCHAR | local_file
disabled_filesystems | Disable specific file systems preventing access (e.g., LocalFileSystem) | VARCHAR |
duckdb_api | DuckDB API surface | VARCHAR | cli
enable_external_access | Allow the database to access external state (through e.g., loading/installing modules, COPY TO/FROM, CSV readers, pandas replacement scans, etc) | BOOLEAN | true
enable_fsst_vectors | Allow scans on FSST compressed segments to emit compressed vectors to utilize late decompression | BOOLEAN | false
enable_http_metadata_cache | Whether or not the global http metadata is used to cache HTTP metadata | BOOLEAN | false
enable_object_cache | Whether or not object cache is used to cache e.g., Parquet metadata | BOOLEAN | false
enable_profiling | Enables profiling, and sets the output format (JSON, QUERY_TREE, QUERY_TREE_OPTIMIZER) | VARCHAR | NULL
enable_progress_bar_print | Controls the printing of the progress bar, when 'enable_progress_bar' is true | BOOLEAN | true
enable_progress_bar | Enables the progress bar, printing progress to the terminal for long queries | BOOLEAN | false
enable_server_cert_verification | Enable server side certificate verification, defaults to False. | BOOLEAN | 0
errors_as_json | Output error messages as structured JSON instead of as a raw string | BOOLEAN | false
explain_output | Output of EXPLAIN statements (ALL, OPTIMIZED_ONLY, PHYSICAL_ONLY) | VARCHAR | physical_only
extension_directory | Set the directory to store extensions in | VARCHAR |
external_threads | The number of external threads that work on DuckDB tasks. | BIGINT | 1
file_search_path | A comma separated list of directories to search for input files | VARCHAR |
force_download | Forces upfront download of file | BOOLEAN | 0
home_directory | Sets the home directory used by the system | VARCHAR |
http_keep_alive | Keep alive connections. Setting this to false can help when running into connection failures | BOOLEAN | 1
http_retries | HTTP retries on I/O error (default 3) | UBIGINT | 3
http_retry_backoff | Backoff factor for exponentially increasing retry wait time (default 4) | FLOAT | 4
http_retry_wait_ms | Time between retries (default 100ms) | UBIGINT | 100
http_timeout | HTTP timeout read/write/connection/retry (default 30000ms) | UBIGINT | 30000
immediate_transaction_mode | Whether transactions should be started lazily when needed, or immediately when BEGIN TRANSACTION is called | BOOLEAN | false
integer_division | Whether or not the / operator defaults to integer division, or to floating point division | BOOLEAN | 0
lock_configuration | Whether or not the configuration can be altered | BOOLEAN | false
log_query_path | Specifies the path to which queries should be logged (default: empty string, queries are not logged) | VARCHAR | NULL
max_expression_depth | The maximum expression depth limit in the parser. WARNING: increasing this setting and using very deep expressions might lead to stack overflow errors. | UBIGINT | 1000
max_memory, memory_limit | The maximum memory of the system (e.g., 1GB) | VARCHAR | 80% of RAM
old_implicit_casting | Allow implicit casting to/from VARCHAR | BOOLEAN | false
ordered_aggregate_threshold | The number of rows to accumulate before sorting, used for tuning | UBIGINT | 262144
password | The password to use. Ignored for legacy compatibility. | VARCHAR | NULL
perfect_ht_threshold | Threshold in bytes for when to use a perfect hash table (default: 12) | BIGINT | 12
pivot_filter_threshold | The threshold to switch from using filtered aggregates to LIST with a dedicated pivot operator | BIGINT | 10
pivot_limit | The maximum number of pivot columns in a pivot statement (default: 100000) | BIGINT | 100000
prefer_range_joins | Force use of range joins with mixed predicates | BOOLEAN | false
preserve_identifier_case | Whether or not to preserve the identifier case, instead of always lowercasing all non-quoted identifiers | BOOLEAN | true
preserve_insertion_order | Whether or not to preserve insertion order. If set to false the system is allowed to re-order any results that do not contain ORDER BY clauses. | BOOLEAN | true
profile_output, profiling_output | The file to which profile output should be saved, or empty to print to the terminal | VARCHAR |
profiling_mode | The profiling mode (STANDARD or DETAILED) | VARCHAR | NULL
progress_bar_time | Sets the time (in milliseconds) how long a query needs to take before we start printing a progress bar | BIGINT | 2000
s3_access_key_id | S3 Access Key ID | VARCHAR |
s3_endpoint | S3 Endpoint (empty for default endpoint) | VARCHAR |
s3_region | S3 Region (default us-east-1) | VARCHAR | us-east-1
s3_secret_access_key | S3 Access Key | VARCHAR |
s3_session_token | S3 Session Token | VARCHAR |
s3_uploader_max_filesize | S3 Uploader max filesize (between 50GB and 5TB, default 800GB) | VARCHAR | 800GB
s3_uploader_max_parts_per_file | S3 Uploader max parts per file (between 1 and 10000, default 10000) | UBIGINT | 10000
s3_uploader_thread_limit | S3 Uploader global thread limit (default 50) | UBIGINT | 50
s3_url_compatibility_mode | Disable Globs and Query Parameters on S3 URLs | BOOLEAN | 0
s3_url_style | S3 URL style ('vhost' (default) or 'path') | VARCHAR | vhost
s3_use_ssl | S3 use SSL (default true) | BOOLEAN | 1
schema | Sets the default search schema. Equivalent to setting search_path to a single value. | VARCHAR | main
search_path | Sets the default catalog search path as a comma-separated list of values | VARCHAR |
secret_directory | Set the directory to which persistent secrets are stored | VARCHAR | ~/.duckdb/stored_secrets
temp_directory | Set the directory to which to write temp files | VARCHAR |
threads, worker_threads | The number of total threads used by the system. | BIGINT | # Cores
username, user | The username to use. Ignored for legacy compatibility. | VARCHAR | NULL
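
The three HTTP retry settings in the table (http_retries, http_retry_wait_ms, and http_retry_backoff) describe an exponential backoff. The Python sketch below computes the resulting wait times under one plausible reading: the wait before retry n is http_retry_wait_ms multiplied by http_retry_backoff raised to the power n, starting at n = 0. That formula is an assumption made for illustration; the table does not spell it out, and the actual schedule may differ. retry_waits_ms is a hypothetical helper.

```python
def retry_waits_ms(retries=3, wait_ms=100, backoff=4.0):
    """Wait time before each retry, assuming wait_ms * backoff ** attempt.

    Defaults mirror the table: http_retries = 3, http_retry_wait_ms = 100,
    http_retry_backoff = 4.
    """
    return [wait_ms * backoff ** attempt for attempt in range(retries)]


waits = retry_waits_ms()
print(waits)
print(sum(waits), "ms of waiting in total if every retry is needed")
```

Under this reading, raising http_retry_backoff lengthens later waits much faster than raising http_retry_wait_ms does, which is worth keeping in mind relative to http_timeout.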

Pragmas

The PRAGMA statement is an SQL extension adopted by DuckDB from SQLite. PRAGMA statements can be issued in a similar manner to regular
SQL statements. PRAGMA commands may alter the internal state of the database engine, and can influence the subsequent execution
or behavior of the engine.

PRAGMA statements that assign a value to an option can also be issued using the SET statement and the value of an option can be retrieved
using SELECT current_setting(option_name).

For DuckDB's built in configuration options, see the Configuration Reference.

List of Supported PRAGMA Statements

Below is a list of supported PRAGMA statements.

Schema Information List all databases:

PRAGMA database_list;

List all tables:

PRAGMA show_tables;

List all tables, with extra information, similarly to DESCRIBE:

PRAGMA show_tables_expanded;

To list all functions:

PRAGMA functions;

Table Information Get info for a specific table:

PRAGMA table_info('table_name');
CALL pragma_table_info('table_name');

table_info returns information about the columns of the table with name table_name. The exact format of the table returned is given
below:

cid INTEGER, -- cid of the column

name VARCHAR, -- name of the column
type VARCHAR, -- type of the column
notnull BOOLEAN, -- if the column is marked as NOT NULL
dflt_value VARCHAR, -- default value of the column, or NULL if not specified
pk BOOLEAN -- part of the primary key or not

To also show table structure, but in a slightly different format (included for compatibility):

PRAGMA show('table_name');

Memory Limit Set the memory limit for the buffer manager:

SET memory_limit = '1GB';

SET max_memory = '1GB';

Warning. The specified memory limit is only applied to the buffer manager. For most queries, the buffer manager handles
the majority of the data processed. However, certain in‑memory data structures such as vectors and query results are allocated
outside of the buffer manager. Additionally, aggregate functions with complex state (e.g., list, mode, quantile, string_agg,
and approx functions) use memory outside of the buffer manager. Therefore, the actual memory consumption can be higher than
the specified memory limit.

Threads Set the number of threads for parallel query execution:

SET threads = 4;

Database Size Get the file and memory size of each database:

PRAGMA database_size;
CALL pragma_database_size();

database_size returns information about the file and memory size of each database. The column types of the returned results are
given below:

database_name VARCHAR, -- database name

database_size VARCHAR, -- total block count times the block size
block_size BIGINT, -- database block size
total_blocks BIGINT, -- total blocks in the database
used_blocks BIGINT, -- used blocks in the database
free_blocks BIGINT, -- free blocks in the database
wal_size VARCHAR, -- write ahead log size
memory_usage VARCHAR, -- memory used by the database buffer manager
memory_limit VARCHAR -- maximum memory allowed for the database

Collations List all available collations:

PRAGMA collations;

Set the default collation to one of the available ones:

SET default_collation = 'nocase';

Implicit Casting to VARCHAR Prior to version 0.10.0, DuckDB would automatically allow any type to be implicitly cast to VARCHAR
during function binding. As a result, it was possible to, e.g., compute the substring of an integer without using an explicit cast. For version
0.10.0 and later, an explicit cast is needed instead. To revert to the old behavior that performs implicit casting, set the
old_implicit_casting variable to true.

SET old_implicit_casting = true;

Default Ordering for NULLs Set the default ordering for NULLs to be either NULLS FIRST or NULLS LAST:

SET default_null_order = 'NULLS FIRST';

SET default_null_order = 'NULLS LAST';

Set the default result set ordering direction to ASCENDING or DESCENDING:

SET default_order = 'ASCENDING';

SET default_order = 'DESCENDING';

Version Show DuckDB version:

PRAGMA version;
CALL pragma_version();

Platform platform returns an identifier for the platform the current DuckDB executable has been compiled for, e.g., osx_arm64. The
format of this identifier matches the platform name as described on the extension loading explainer.

PRAGMA platform;
CALL pragma_platform();

Progress Bar Show progress bar when running queries:

PRAGMA enable_progress_bar;

Don't show a progress bar for running queries:

PRAGMA disable_progress_bar;

Profiling

Enable Profiling To enable profiling:

PRAGMA enable_profiling;
PRAGMA enable_profile;

Profiling Format The format of the resulting profiling information can be specified as either json, query_tree, or query_tree_
optimizer. The default format is query_tree, which prints the physical operator tree together with the timings and cardinalities of
each operator in the tree to the screen.

To output the profiling information as JSON:

SET enable_profiling = 'json';

To print the query tree (the default format):

SET enable_profiling = 'query_tree';

To print the query tree together with optimizer timings:

SET enable_profiling = 'query_tree_optimizer';

Disable Profiling To disable profiling:

PRAGMA disable_profiling;
PRAGMA disable_profile;

Profiling Output By default, profiling information is printed to the console. To write the profiling information to a file
instead, set the profiling_output option to the target file path. Note that the file contents are overwritten for every
new query that is issued, so the file only contains the profiling information of the last query that was run.

SET profiling_output = '/path/to/file.json';

SET profile_output = '/path/to/file.json';

Profiling Mode By default, a limited amount of profiling information is provided (standard). For more details, use the detailed profiling
mode by setting profiling_mode to detailed. The output of this mode shows how long it takes to apply certain optimizers on the
query tree and how long physical planning takes.

SET profiling_mode = 'detailed';

Optimizer To disable the query optimizer:

PRAGMA disable_optimizer;

To enable the query optimizer:

PRAGMA enable_optimizer;

Logging Set a path for query logging:

SET log_query_path = '/tmp/duckdb_log/';

Disable query logging:

SET log_query_path = '';

Explain Plan Output The output of EXPLAIN can be configured to show only the physical plan. This is the default configuration.

SET explain_output = 'physical_only';

To only show the optimized query plan:

SET explain_output = 'optimized_only';

To show all query plans:

SET explain_output = 'all';

Full‑Text Search Indexes The create_fts_index and drop_fts_index options are only available when the fts extension is
loaded. Their usage is documented on the Full‑Text Search extension page.

Verification of External Operators Enable verification of external operators:

PRAGMA verify_external;

Disable verification of external operators:

PRAGMA disable_verify_external;

Verification of Round‑Trip Capabilities Enable verification of round‑trip capabilities for supported logical plans:

PRAGMA verify_serializer;

Disable verification of round‑trip capabilities:

PRAGMA disable_verify_serializer;

Object Cache Enable caching of objects for e.g., Parquet metadata:

PRAGMA enable_object_cache;

Disable caching of objects:

PRAGMA disable_object_cache;

Checkpoint

Force Checkpoint When CHECKPOINT is called when no changes are made, force a checkpoint regardless.

PRAGMA force_checkpoint;

Checkpoint on Shutdown Run a CHECKPOINT on successful shutdown and delete the WAL, to leave only a single database file behind:

PRAGMA enable_checkpoint_on_shutdown;

Don't run a CHECKPOINT on shutdown:

PRAGMA disable_checkpoint_on_shutdown;

Progress Bar Enable printing of the progress bar (if it's possible):

PRAGMA enable_print_progress_bar;

Disable printing of the progress bar:

PRAGMA disable_print_progress_bar;

Temp Directory for Spilling Data to Disk By default, DuckDB uses a temporary directory named <database_file_name>.tmp to
spill to disk, located in the same directory as the database file. To change this, use:

SET temp_directory = '/path/to/temp_dir.tmp/';

Storage Information To get storage information:

PRAGMA storage_info('table_name');
CALL pragma_storage_info('table_name');

This call returns the following information for the given table:

Name | Type | Description
row_group_id | BIGINT |
column_name | VARCHAR |
column_id | BIGINT |
column_path | VARCHAR |
segment_id | BIGINT |
segment_type | VARCHAR |
start | BIGINT | The start row id of this chunk
count | BIGINT | The amount of entries in this storage chunk
compression | VARCHAR | Compression type used for this column - see blog post
stats | VARCHAR |
has_updates | BOOLEAN |
persistent | BOOLEAN | false if temporary table
block_id | BIGINT | empty unless persistent
block_offset | BIGINT | empty unless persistent

See Storage for more information.

Show Databases The following statement is equivalent to the SHOW DATABASES statement:

PRAGMA show_databases;

User Agent The following statement returns the user agent information, e.g., duckdb/v0.10.0(osx_arm64).

PRAGMA user_agent;

Metadata Information The following statement returns information on the metadata store (block_id, total_blocks, free_blocks,
and free_list).

PRAGMA metadata_info;

Selectively Disabling Optimizers The disabled_optimizers option allows selectively disabling optimization steps. For example,
to disable filter_pushdown and statistics_propagation, run:

SET disabled_optimizers = 'filter_pushdown,statistics_propagation';

The available optimizations can be queried using the duckdb_optimizers() table function.

Warning. The disabled_optimizers option should only be used for debugging performance issues and should be
avoided in production.

Returning Errors as JSON The errors_as_json option can be set to obtain error information in raw JSON format. For certain errors,
extra information or decomposed information is provided for easier machine processing. For example:

SET errors_as_json = true;

Then, running a query that results in an error produces a JSON output:

SELECT * FROM nonexistent_tbl;

{
"exception_type":"Catalog",
"exception_message":"Table with name nonexistent_tbl does not exist!\nDid you mean \"temp.information_schema.tables\"?",
"name":"nonexistent_tbl",
"candidates":"temp.information_schema.tables",
"position":"14",
"type":"Table",
"error_subtype":"MISSING_ENTRY"
}
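
Because the error arrives as JSON, a client can pick out individual fields instead of parsing a message string. The Python sketch below decodes the example payload above with the standard json module. It treats position as a zero-based character offset into the query text, which is an assumption that happens to match this example and is not stated in the text; the variable names are illustrative.

```python
import json

# The error payload from the example above, as one JSON document.
raw = r'''{
"exception_type":"Catalog",
"exception_message":"Table with name nonexistent_tbl does not exist!\nDid you mean \"temp.information_schema.tables\"?",
"name":"nonexistent_tbl",
"candidates":"temp.information_schema.tables",
"position":"14",
"type":"Table",
"error_subtype":"MISSING_ENTRY"
}'''

query = "SELECT * FROM nonexistent_tbl;"

error = json.loads(raw)
position = int(error["position"])  # the position arrives as a string

# Assumption: position is a zero-based character offset into the query.
offending = query[position:position + len(error["name"])]

print(f'{error["exception_type"]} error ({error["error_subtype"]}): {offending!r}')
```

Note that every value in the example payload is a JSON string, including position, so numeric fields need an explicit conversion.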

Query Verification (for Development) The following PRAGMAs are mostly used for development and internal testing.

Enable query verification:

PRAGMA enable_verification;

Disable query verification:

PRAGMA disable_verification;

Enable force parallel query processing:

PRAGMA verify_parallelism;

Disable force parallel query processing:

PRAGMA disable_verify_parallelism;

Secrets Manager

The Secrets manager provides a unified user interface for secrets across all backends that use them. Secrets can be scoped, so different
storage prefixes can have different secrets, allowing for example to join data across organizations in a single query. Secrets can also be
persisted, so that they do not need to be specified every time DuckDB is launched.

Note. Secrets were introduced with DuckDB version 0.10.

Warning. Persistent secrets are stored in unencrypted binary format on the disk.

Secrets

Types of Secrets Secrets are typed; their type identifies which service they are for. Currently, the following cloud services are available:

• AWS S3 (S3), through the httpfs extension

• Google Cloud Storage (GCS), through the httpfs extension
• Cloudflare R2 (R2), through the httpfs extension
• Azure Blob Storage (AZURE), through the azure extension

For each type, there are one or more "secret providers" that specify how the secret is created. Secrets can also have an optional scope,
which is a file path prefix that the secret applies to. When fetching a secret for a path, the secret scopes are compared to the path, returning
the matching secret for the path. In the case of multiple matching secrets, the longest prefix is chosen.
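
The scope rule (compare each secret's scope against the path and prefer the longest matching prefix) can be sketched in Python as follows. This models the rule as described and is not DuckDB's implementation; pick_secret, the secret names, and the scope values are hypothetical.

```python
def pick_secret(path, scopes_by_name):
    """Return the name of the secret whose scope is the longest prefix of path.

    Returns None when no scope matches.
    """
    best_name, best_len = None, -1
    for name, scope in scopes_by_name.items():
        if path.startswith(scope) and len(scope) > best_len:
            best_name, best_len = name, len(scope)
    return best_name


# Hypothetical secrets, keyed by name, with their scopes.
scopes = {
    "org_a": "s3://bucket-a",
    "org_b": "s3://bucket-b",
    "org_b_private": "s3://bucket-b/private",
}

print(pick_secret("s3://bucket-b/file.parquet", scopes))
print(pick_secret("s3://bucket-b/private/file.parquet", scopes))
```

The second lookup matches two scopes, and the longer one is selected, which is what lets a narrowly scoped secret override a broader one for the same service.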

Creating a Secret Secrets can be created using the CREATE SECRET SQL statement. Secrets can be temporary or persistent. Temporary
secrets are used by default and are stored in memory for the lifespan of the DuckDB instance, similar to how settings worked
previously. Persistent secrets are stored in unencrypted binary format in the ~/.duckdb/stored_secrets directory. On startup of
DuckDB, persistent secrets are read from this directory and automatically loaded.

Secret Providers To create a secret, a Secret Provider needs to be used. A Secret Provider is a mechanism through which a secret
is generated. To illustrate this, for the S3, GCS, R2, and AZURE secret types, DuckDB currently supports two providers: CONFIG and
CREDENTIAL_CHAIN. The CONFIG provider requires the user to pass all configuration information into the CREATE SECRET, whereas
the CREDENTIAL_CHAIN provider will automatically try to fetch credentials. When no Secret Provider is specified, the CONFIG provider
is used. For more details on how to create secrets using different providers, check out the respective pages on httpfs and azure.

Temporary Secrets To create a temporary unscoped secret to access S3, we can now use the following:

CREATE SECRET (
TYPE S3,
KEY_ID 'mykey',
SECRET 'mysecret',
REGION 'myregion'
);

Note that we implicitly use the default CONFIG secret provider here.

Persistent Secrets In order to persist secrets between DuckDB database instances, we can now use the CREATE PERSISTENT SECRET
command, e.g.:

CREATE PERSISTENT SECRET my_persistent_secret (

TYPE S3,
KEY_ID 'key',
SECRET 'secret'
);

This will write the secret (unencrypted) to the ~/.duckdb/stored_secrets directory.

Deleting Secrets Secrets can be deleted using the DROP SECRET statement, e.g.:

DROP PERSISTENT SECRET my_persistent_secret;

Creating Multiple Secrets for the Same Service Type If two secrets exist for a service type, the scope can be used to decide which one
should be used. For example:

CREATE SECRET secret1 (

TYPE S3,
KEY_ID 'my_key1',
SECRET 'my_secret1',
SCOPE 's3://my-bucket'
);

CREATE SECRET secret2 (

TYPE S3,
KEY_ID 'my_key2',
SECRET 'my_secret2',
SCOPE 's3://my-other-bucket'
);

Now, if the user queries something from s3://my-other-bucket/something, secret secret2 will be chosen automatically for
that request. To see which secret is being used, the which_secret scalar function can be used, which takes a path and a secret type as
parameters:

SELECT which_secret('s3://my-other-bucket/file.parquet', 'S3');

Listing Secrets Secrets can be listed using the built-in duckdb_secrets() table function:

FROM duckdb_secrets();

Sensitive information will be redacted.

SQL

SQL Introduction

Here we provide an overview of how to perform simple operations in SQL. This tutorial is only intended to give you an introduction and is
in no way a complete tutorial on SQL. This tutorial is adapted from the PostgreSQL tutorial.

In the examples that follow, we assume that you have installed the DuckDB Command Line Interface (CLI) shell. See the installation page
for information on how to install the CLI.

Concepts

DuckDB is a relational database management system (RDBMS). That means it is a system for managing data stored in relations. A relation
is essentially a mathematical term for a table.

Each table is a named collection of rows. Each row of a given table has the same set of named columns, and each column is of a specific
data type. Tables themselves are stored inside schemas, and a collection of schemas constitutes the entire database that you can access.

Creating a New Table

You can create a new table by specifying the table name, along with all column names and their types:

CREATE TABLE weather (

city VARCHAR,
temp_lo INTEGER, -- minimum temperature on a day
temp_hi INTEGER, -- maximum temperature on a day
prcp REAL,
date DATE
);

You can enter this into the shell with the line breaks. The command is not terminated until the semicolon.

White space (i.e., spaces, tabs, and newlines) can be used freely in SQL commands. That means you can type the command aligned differently
than above, or even all on one line. Two dash characters (--) introduce comments. Whatever follows them is ignored up to the end
of the line. SQL is case insensitive about key words and identifiers.

In the SQL command, we first specify the type of command that we want to perform: CREATE TABLE. After that follow the parameters
for the command. First, the table name, weather, is given. Then the column names and column types follow.

city VARCHAR specifies that the table has a column called city that is of type VARCHAR. VARCHAR specifies a data type that can store
text of arbitrary length. The temperature fields are stored in an INTEGER type, a type that stores integer numbers (i.e., whole numbers
without a decimal point). REAL columns store single precision floating‑point numbers (i.e., numbers with a decimal point). DATE stores a
date (i.e., year, month, day combination). DATE only stores the specific day, not a time associated with that day.

DuckDB supports the standard SQL types INTEGER, SMALLINT, REAL, DOUBLE, DECIMAL, CHAR(n), VARCHAR(n), DATE, TIME and
TIMESTAMP.

The second example will store cities and their associated geographical location:

CREATE TABLE cities (
    name VARCHAR,
    lat  DECIMAL,
    lon  DECIMAL
);

Finally, it should be mentioned that if you don't need a table any longer or want to recreate it differently you can remove it using the
following command:

DROP TABLE [tablename];

Populating a Table with Rows

The insert statement is used to populate a table with rows:

INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');

Constants that are not numeric values (e.g., text and dates) must be surrounded by single quotes (''), as in the example. Input dates for
the date type must be formatted as 'YYYY-MM-DD'.

We can insert into the cities table in the same manner.

INSERT INTO cities
VALUES ('San Francisco', -194.0, 53.0);

The syntax used so far requires you to remember the order of the columns. An alternative syntax allows you to list the columns explicitly:

INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');

You can list the columns in a different order if you wish or even omit some columns, e.g., if the prcp is unknown:

INSERT INTO weather (date, city, temp_hi, temp_lo)
VALUES ('1994-11-29', 'Hayward', 54, 37);

Many developers consider explicitly listing the columns better style than relying on the order implicitly.

Please enter all the commands shown above so you have some data to work with in the following sections.

You could also have used COPY to load large amounts of data from CSV files. This is usually faster because the COPY command is optimized
for this application while allowing less flexibility than INSERT. An example with weather.csv would be:

COPY weather
FROM 'weather.csv';

The source file must be accessible on the machine running the DuckDB process. There are many other ways of loading data
into DuckDB; see the corresponding documentation section for more information.

Querying a Table

To retrieve data from a table, the table is queried. A SQL SELECT statement is used to do this. The statement is divided into a select list
(the part that lists the columns to be returned), a table list (the part that lists the tables from which to retrieve the data), and an optional
qualification (the part that specifies any restrictions). For example, to retrieve all the rows of table weather, type:

SELECT *
FROM weather;

Here * is a shorthand for "all columns". So the same result would be had with:

SELECT city, temp_lo, temp_hi, prcp, date
FROM weather;

The output should be:

┌───────────────┬─────────┬─────────┬───────┬────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │
│ varchar │ int32 │ int32 │ float │ date │
├───────────────┼─────────┼─────────┼───────┼────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │
│ San Francisco │ 43 │ 57 │ 0.0 │ 1994-11-29 │
│ Hayward │ 37 │ 54 │ │ 1994-11-29 │
└───────────────┴─────────┴─────────┴───────┴────────────┘

You can write expressions, not just simple column references, in the select list. For example, you can do:

SELECT city, (temp_hi + temp_lo) / 2 AS temp_avg, date
FROM weather;

This should give:

┌───────────────┬──────────┬────────────┐
│ city │ temp_avg │ date │
│ varchar │ double │ date │
├───────────────┼──────────┼────────────┤
│ San Francisco │ 48.0 │ 1994-11-27 │
│ San Francisco │ 50.0 │ 1994-11-29 │
│ Hayward │ 45.5 │ 1994-11-29 │
└───────────────┴──────────┴────────────┘

Notice how the AS clause is used to relabel the output column. (The AS clause is optional.)

A query can be "qualified" by adding a WHERE clause that specifies which rows are wanted. The WHERE clause contains a Boolean (truth
value) expression, and only rows for which the Boolean expression is true are returned. The usual Boolean operators (AND, OR, and NOT)
are allowed in the qualification. For example, the following retrieves the weather of San Francisco on rainy days:

SELECT *
FROM weather
WHERE city = 'San Francisco' AND prcp > 0.0;

Result:

┌───────────────┬─────────┬─────────┬───────┬────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │
│ varchar │ int32 │ int32 │ float │ date │
├───────────────┼─────────┼─────────┼───────┼────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │
└───────────────┴─────────┴─────────┴───────┴────────────┘

You can request that the results of a query be returned in sorted order:

SELECT *
FROM weather
ORDER BY city;

┌───────────────┬─────────┬─────────┬───────┬────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │
│ varchar │ int32 │ int32 │ float │ date │
├───────────────┼─────────┼─────────┼───────┼────────────┤
│ Hayward │ 37 │ 54 │ │ 1994-11-29 │
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │
│ San Francisco │ 43 │ 57 │ 0.0 │ 1994-11-29 │
└───────────────┴─────────┴─────────┴───────┴────────────┘

In this example, the sort order isn't fully specified, and so you might get the San Francisco rows in either order. But you'd always get the
results shown above if you do:

SELECT *
FROM weather
ORDER BY city, temp_lo;

You can request that duplicate rows be removed from the result of a query:

SELECT DISTINCT city
FROM weather;

┌───────────────┐
│ city │
│ varchar │
├───────────────┤
│ Hayward │
│ San Francisco │
└───────────────┘

Here again, the result row ordering might vary. You can ensure consistent results by using DISTINCT and ORDER BY together:

SELECT DISTINCT city
FROM weather
ORDER BY city;

Joins between Tables

Thus far, our queries have only accessed one table at a time. Queries can access multiple tables at once, or access the same table in such a
way that multiple rows of the table are being processed at the same time. A query that accesses multiple rows of the same or different tables
at one time is called a join query. As an example, say you wish to list all the weather records together with the location of the associated
city. To do that, we need to compare the city column of each row of the weather table with the name column of all rows in the cities
table, and select the pairs of rows where these values match.

This would be accomplished by the following query:

SELECT *
FROM weather, cities
WHERE city = name;

┌───────────────┬─────────┬─────────┬───────┬────────────┬───────────────┬───────────────┬───────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │ name │ lat │ lon │
│ varchar │ int32 │ int32 │ float │ date │ varchar │ decimal(18,3) │ decimal(18,3) │
├───────────────┼─────────┼─────────┼───────┼────────────┼───────────────┼───────────────┼───────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │ San Francisco │ -194.000 │ 53.000 │
│ San Francisco │ 43 │ 57 │ 0.0 │ 1994-11-29 │ San Francisco │ -194.000 │ 53.000 │
└───────────────┴─────────┴─────────┴───────┴────────────┴───────────────┴───────────────┴───────────────┘

Observe two things about the result set:

• There is no result row for the city of Hayward. This is because there is no matching entry in the cities table for Hayward, so the
join ignores the unmatched rows in the weather table. We will see shortly how this can be fixed.
• There are two columns containing the city name. This is correct because the lists of columns from the weather and cities tables
are concatenated. In practice this is undesirable, though, so you will probably want to list the output columns explicitly rather than
using *:

SELECT city, temp_lo, temp_hi, prcp, date, lon, lat
FROM weather, cities
WHERE city = name;

┌───────────────┬─────────┬─────────┬───────┬────────────┬───────────────┬───────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │ lon │ lat │
│ varchar │ int32 │ int32 │ float │ date │ decimal(18,3) │ decimal(18,3) │
├───────────────┼─────────┼─────────┼───────┼────────────┼───────────────┼───────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │ 53.000 │ -194.000 │
│ San Francisco │ 43 │ 57 │ 0.0 │ 1994-11-29 │ 53.000 │ -194.000 │
└───────────────┴─────────┴─────────┴───────┴────────────┴───────────────┴───────────────┘

Since the columns all had different names, the parser automatically found which table they belong to. If there were duplicate column
names in the two tables you'd need to qualify the column names to show which one you meant, as in:

SELECT weather.city, weather.temp_lo, weather.temp_hi,
       weather.prcp, weather.date, cities.lon, cities.lat
FROM weather, cities
WHERE cities.name = weather.city;

It is widely considered good style to qualify all column names in a join query, so that the query won't fail if a duplicate column name is later
added to one of the tables.

Join queries of the kind seen thus far can also be written in this alternative form:

SELECT *
FROM weather
INNER JOIN cities ON weather.city = cities.name;

This syntax is not as commonly used as the one above, but we show it here to help you understand the following topics.

Now we will figure out how we can get the Hayward records back in. What we want the query to do is to scan the weather table and for
each row to find the matching cities row(s). If no matching row is found we want some "empty values" to be substituted for the cities
table's columns. This kind of query is called an outer join. (The joins we have seen so far are inner joins.) The command looks like this:

SELECT *
FROM weather
LEFT OUTER JOIN cities ON weather.city = cities.name;

┌───────────────┬─────────┬─────────┬───────┬────────────┬───────────────┬───────────────┬───────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │ name │ lat │ lon │
│ varchar │ int32 │ int32 │ float │ date │ varchar │ decimal(18,3) │ decimal(18,3) │
├───────────────┼─────────┼─────────┼───────┼────────────┼───────────────┼───────────────┼───────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │ San Francisco │ -194.000 │ 53.000 │
│ San Francisco │ 43 │ 57 │ 0.0 │ 1994-11-29 │ San Francisco │ -194.000 │ 53.000 │
│ Hayward │ 37 │ 54 │ │ 1994-11-29 │ │ │ │
└───────────────┴─────────┴─────────┴───────┴────────────┴───────────────┴───────────────┴───────────────┘

This query is called a left outer join because the table mentioned on the left of the join operator will have each of its rows in the output
at least once, whereas the table on the right will only have those rows output that match some row of the left table. When outputting a
left‑table row for which there is no right‑table match, empty (null) values are substituted for the right‑table columns.
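
To list only the weather rows that have no matching city, the substituted null values can be tested directly. The following is a minimal sketch that assumes the weather and cities tables as populated earlier in this tutorial:

```sql
-- Keep only left-table rows for which the join found no match in cities
SELECT weather.city, weather.date
FROM weather
LEFT OUTER JOIN cities ON weather.city = cities.name
WHERE cities.name IS NULL;
```

With the tutorial data, only the Hayward record should remain, since Hayward is the only city missing from the cities table.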

Aggregate Functions

Like most other relational database products, DuckDB supports aggregate functions. An aggregate function computes a single result from
multiple input rows. For example, there are aggregates to compute the count, sum, avg (average), max (maximum) and min (minimum)
over a set of rows.

As an example, we can find the highest low‑temperature reading anywhere with:

SELECT max(temp_lo)
FROM weather;

┌──────────────┐
│ max(temp_lo) │
│ int32 │
├──────────────┤
│ 46 │
└──────────────┘

If we wanted to know what city (or cities) that reading occurred in, we might try:

SELECT city
FROM weather
WHERE temp_lo = max(temp_lo); -- WRONG

but this will not work since the aggregate max cannot be used in the WHERE clause. (This restriction exists because the WHERE clause
determines which rows will be included in the aggregate calculation; so obviously it has to be evaluated before aggregate functions are
computed.) However, as is often the case the query can be restated to accomplish the desired result, here by using a subquery:

SELECT city
FROM weather
WHERE temp_lo = (SELECT max(temp_lo) FROM weather);

┌───────────────┐
│ city │
│ varchar │
├───────────────┤
│ San Francisco │
└───────────────┘

This is OK because the subquery is an independent computation that computes its own aggregate separately from what is happening in
the outer query.

Aggregates are also very useful in combination with GROUP BY clauses. For example, we can get the maximum low temperature observed
in each city with:

SELECT city, max(temp_lo)
FROM weather
GROUP BY city;

┌───────────────┬──────────────┐
│ city │ max(temp_lo) │
│ varchar │ int32 │
├───────────────┼──────────────┤
│ San Francisco │ 46 │
│ Hayward │ 37 │
└───────────────┴──────────────┘

Which gives us one output row per city. Each aggregate result is computed over the table rows matching that city. We can filter these
grouped rows using HAVING:

SELECT city, max(temp_lo)
FROM weather
GROUP BY city
HAVING max(temp_lo) < 40;

┌─────────┬──────────────┐
│ city │ max(temp_lo) │
│ varchar │ int32 │
├─────────┼──────────────┤
│ Hayward │ 37 │
└─────────┴──────────────┘

which gives us the same results for only the cities that have all temp_lo values below 40. Finally, if we only care about cities whose names
begin with "S", we can use the LIKE operator:

SELECT city, max(temp_lo)
FROM weather
WHERE city LIKE 'S%'
GROUP BY city
HAVING max(temp_lo) < 40;

More information about the LIKE operator can be found in the pattern matching page.

It is important to understand the interaction between aggregates and SQL's WHERE and HAVING clauses. The fundamental difference
between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows
go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed. Thus, the WHERE
clause must not contain aggregate functions; it makes no sense to try to use an aggregate to determine which rows will be inputs to the
aggregates. On the other hand, the HAVING clause almost always contains aggregate functions.

In the previous example, we can apply the city name restriction in WHERE, since it needs no aggregate. This is more efficient than adding
the restriction to HAVING, because we avoid doing the grouping and aggregate calculations for all rows that fail the WHERE check.
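
For comparison, the same restriction could be written in HAVING instead. This sketch is only meant to illustrate the difference in evaluation order described above: it returns the same rows as the WHERE version, but conceptually the city filter is applied only after every city has been grouped and aggregated.

```sql
-- Less efficient form: the city filter is applied after grouping and aggregation
SELECT city, max(temp_lo)
FROM weather
GROUP BY city
HAVING max(temp_lo) < 40 AND city LIKE 'S%';
```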

Updates

You can update existing rows using the UPDATE command. Suppose you discover the temperature readings are all off by 2 degrees after
November 28. You can correct the data as follows:

UPDATE weather
SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
WHERE date > '1994-11-28';

Look at the new state of the data:

SELECT *
FROM weather;

┌───────────────┬─────────┬─────────┬───────┬────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │
│ varchar │ int32 │ int32 │ float │ date │
├───────────────┼─────────┼─────────┼───────┼────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │
│ San Francisco │ 41 │ 55 │ 0.0 │ 1994-11-29 │
│ Hayward │ 35 │ 52 │ │ 1994-11-29 │
└───────────────┴─────────┴─────────┴───────┴────────────┘

Deletions

Rows can be removed from a table using the DELETE command. Suppose you are no longer interested in the weather of Hayward. Then
you can do the following to delete those rows from the table:

DELETE FROM weather
WHERE city = 'Hayward';

All weather records belonging to Hayward are removed.

SELECT *
FROM weather;

┌───────────────┬─────────┬─────────┬───────┬────────────┐
│ city │ temp_lo │ temp_hi │ prcp │ date │
│ varchar │ int32 │ int32 │ float │ date │
├───────────────┼─────────┼─────────┼───────┼────────────┤
│ San Francisco │ 46 │ 50 │ 0.25 │ 1994-11-27 │
│ San Francisco │ 41 │ 55 │ 0.0 │ 1994-11-29 │
└───────────────┴─────────┴─────────┴───────┴────────────┘

One should be wary of statements of the form

DELETE FROM tablename;

Without a qualification, DELETE will remove all rows from the given table, leaving it empty. The system will not request confirmation
before doing this!
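
One way to guard against an accidental full delete is to run the statement inside a transaction and inspect the result before committing. A minimal sketch, assuming the weather table from this tutorial:

```sql
BEGIN TRANSACTION;
DELETE FROM weather;
-- inside the transaction the table now appears empty
SELECT count(*) FROM weather;
-- undo the delete instead of committing it
ROLLBACK;
-- the rows are visible again
SELECT count(*) FROM weather;
```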

Statements

Statements Overview

ALTER TABLE Statement

The ALTER TABLE statement changes the schema of an existing table in the catalog.

Examples

-- add a new column with name "k" to the table "integers", it will be filled with the default value NULL
ALTER TABLE integers ADD COLUMN k INTEGER;
-- add a new column with name "l" to the table "integers", it will be filled with the default value 10
ALTER TABLE integers ADD COLUMN l INTEGER DEFAULT 10;

-- drop the column "k" from the table "integers"
ALTER TABLE integers DROP k;

-- change the type of the column "i" to the type "VARCHAR" using a standard cast
ALTER TABLE integers ALTER i TYPE VARCHAR;
-- change the type of the column "i" to the type "VARCHAR", using the specified expression to convert
-- the data for each row
ALTER TABLE integers ALTER i SET DATA TYPE VARCHAR USING concat(i, '_', j);

-- set the default value of a column
ALTER TABLE integers ALTER COLUMN i SET DEFAULT 10;
-- drop the default value of a column
ALTER TABLE integers ALTER COLUMN i DROP DEFAULT;

-- make a column not nullable
ALTER TABLE t ALTER COLUMN x SET NOT NULL;
-- drop the not null constraint
ALTER TABLE t ALTER COLUMN x DROP NOT NULL;

-- rename a table
ALTER TABLE integers RENAME TO integers_old;

-- rename a column of a table
ALTER TABLE integers RENAME i TO j;

Syntax

ALTER TABLE changes the schema of an existing table. All the changes made by ALTER TABLE fully respect the transactional semantics,
i.e., they will not be visible to other transactions until committed, and can be fully reverted through a rollback.
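
The transactional behavior can be observed with a short sketch. The two-column integers table is a hypothetical setup chosen to match the examples above:

```sql
CREATE TABLE integers (i INTEGER, j INTEGER);

BEGIN TRANSACTION;
ALTER TABLE integers ADD COLUMN k INTEGER;
-- within the transaction, the new column "k" is listed
DESCRIBE integers;
ROLLBACK;
-- after the rollback, only "i" and "j" remain
DESCRIBE integers;
```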

RENAME TABLE

-- rename a table
ALTER TABLE integers RENAME TO integers_old;

The RENAME TO clause renames an entire table, changing its name in the schema. Note that any views that rely on the table are not
automatically updated.

RENAME COLUMN

-- rename a column of a table
ALTER TABLE integers RENAME i TO j;
ALTER TABLE integers RENAME COLUMN j TO k;

The RENAME COLUMN clause renames a single column within a table. Any constraints that rely on this name (e.g., CHECK constraints) are
automatically updated. However, note that any views that rely on this column name are not automatically updated.

ADD COLUMN

The ADD COLUMN clause can be used to add a new column of a specified type to a table. The new column will be filled with the specified
default value, or NULL if none is specified.

DROP COLUMN

-- drop the column "k" from the table "integers"
ALTER TABLE integers DROP k;

The DROP COLUMN clause can be used to remove a column from a table. Note that columns can only be removed if they do not have any
indexes that rely on them. This includes any indexes created as part of a PRIMARY KEY or UNIQUE constraint. Columns that are part of
multi‑column check constraints cannot be dropped either.

ALTER TYPE

The SET DATA TYPE clause changes the type of a column in a table. Any data present in the column is converted according to the
provided expression in the USING clause, or, if the USING clause is absent, cast to the new data type. Note that columns can only have
their type changed if they do not have any indexes that rely on them and are not part of any CHECK constraints.

SET / DROP DEFAULT

-- set the default value of a column
ALTER TABLE integers ALTER COLUMN i SET DEFAULT 10;
-- drop the default value of a column
ALTER TABLE integers ALTER COLUMN i DROP DEFAULT;

The SET/DROP DEFAULT clause modifies the DEFAULT value of an existing column. Note that this does not modify any existing data in
the column. Dropping the default is equivalent to setting the default value to NULL.

Warning. At the moment DuckDB will not allow you to alter a table if there are any dependencies. That means that if you
have an index on a column you will first need to drop the index, alter the table, and then recreate the index. Otherwise you will get a
"Dependency Error".
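
The workaround described in the warning can be sketched as follows. The table definition and the index name i_idx are hypothetical and serve only as illustration:

```sql
-- Hypothetical setup: a table with an index on column "i"
CREATE TABLE integers (i INTEGER, j INTEGER);
CREATE INDEX i_idx ON integers (i);

-- Drop the dependent index, alter the table, then recreate the index
DROP INDEX i_idx;
ALTER TABLE integers ALTER i TYPE BIGINT;
CREATE INDEX i_idx ON integers (i);
```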

ADD / DROP CONSTRAINT

Note. The ADD CONSTRAINT and DROP CONSTRAINT clauses are not yet supported in DuckDB.

ALTER VIEW Statement

The ALTER VIEW statement changes the schema of an existing view in the catalog.

Examples

-- rename a view
ALTER VIEW v1 RENAME TO v2;

ALTER VIEW changes the schema of an existing view. All the changes made by ALTER VIEW fully respect the transactional semantics,
i.e., they will not be visible to other transactions until committed, and can be fully reverted through a rollback. Note that other views that
rely on the view are not automatically updated.

ATTACH/DETACH Statement

The ATTACH statement adds a new database file to the catalog that can be read from and written to.

Examples

-- attach the database "file.db" with the alias inferred from the name ("file")
ATTACH 'file.db';
-- attach the database "file.db" with an explicit alias ("file_db")
ATTACH 'file.db' AS file_db;
-- attach the database "file.db" in read only mode
ATTACH 'file.db' (READ_ONLY);
-- attach a SQLite database for reading and writing (see the sqlite extension for more information)
ATTACH 'sqlite_file.db' AS sqlite_db (TYPE SQLITE);
-- attach the database "file.db" if inferred database alias "file" does not yet exist
ATTACH IF NOT EXISTS 'file.db';
-- attach the database "file.db" if explicit database alias "file_db" does not yet exist
ATTACH IF NOT EXISTS 'file.db' AS file_db;
-- create a table in the attached database with alias "file"
CREATE TABLE file.new_table (i INTEGER);
-- detach the database with alias "file"
DETACH file;
-- show a list of all attached databases
SHOW DATABASES;
-- change the default database that is used to the database "file"
USE file;

Attach

Attach Syntax

ATTACH allows DuckDB to operate on multiple database files, and allows for transfer of data between different database
files.
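
As an illustration of transferring data between database files, the following sketch copies a table from the default database into an attached one. It assumes a table named weather exists in the default database; the name weather_copy is hypothetical:

```sql
-- attach a second database file with an explicit alias
ATTACH 'file.db' AS file_db;
-- copy the contents of "weather" into a new table in the attached database
CREATE TABLE file_db.weather_copy AS SELECT * FROM weather;
```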

Detach

The DETACH statement allows previously attached database files to be closed and detached, releasing any locks held on the database file.
It is not possible to detach from the default database: if you would like to do so, issue the USE statement to change the default database
to another one.
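
A minimal sketch of switching the default database before detaching, assuming two hypothetical database files file1.db and file2.db:

```sql
ATTACH 'file1.db' AS db1;
ATTACH 'file2.db' AS db2;
-- make "db1" the default database
USE db1;
-- "db1" cannot be detached while it is the default, so switch to "db2" first
USE db2;
DETACH db1;
```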

Warning. Closing the connection, e.g., invoking the close() function in Python, does not release the locks held on the
database files as the file handles are held by the main DuckDB instance (in Python's case, the duckdb module).

Detach Syntax

Name Qualification

The fully qualified name of catalog objects contains the catalog, the schema and the name of the object. For example:

-- attach the database "new_db"
ATTACH 'new_db.db';
-- create the schema "my_schema" in the database "new_db"
CREATE SCHEMA new_db.my_schema;
-- create the table "my_table" in the schema "my_schema"
CREATE TABLE new_db.my_schema.my_table (col INTEGER);
-- refer to the column "col" inside the table "my_table"
SELECT new_db.my_schema.my_table.col FROM new_db.my_schema.my_table;

Note that often the fully qualified name is not required. When a name is not fully qualified, the system looks for which entries to reference
using the catalog search path. The default catalog search path includes the system catalog, the temporary catalog and the initially attached
database together with the main schema.

Also note the rules on identifiers and database names in particular.

Default Database and Schema

When a table is created without any qualifications, the table is created in the default schema of the default
database. The default database is the database that is launched when the system is created, and the default schema is main.

-- create the table "my_table" in the default database
CREATE TABLE my_table (col INTEGER);

Changing the Default Database and Schema

The default database and schema can be changed using the USE command.

-- set the default database schema to `new_db.main`
USE new_db;
-- set the default database schema to `new_db.my_schema`
USE new_db.my_schema;

Resolving Conflicts

When providing only a single qualification, the system can interpret this as either a catalog or a schema, as long as
there are no conflicts. For example:

ATTACH 'new_db.db';
CREATE SCHEMA my_schema;
-- creates the table "new_db.main.tbl"
CREATE TABLE new_db.tbl (i INTEGER);
-- creates the table "default_db.my_schema.tbl"
CREATE TABLE my_schema.tbl (i INTEGER);

If we create a conflict (i.e., we have both a schema and a catalog with the same name) the system requests that a fully qualified path is used
instead:

CREATE SCHEMA new_db;
CREATE TABLE new_db.tbl (i INTEGER);
-- Error: Binder Error: Ambiguous reference to catalog or schema "new_db" -
-- use a fully qualified path like "memory.new_db"

Changing the Catalog Search Path

The catalog search path can be adjusted by setting the search_path configuration option, which
uses a comma-separated list of values that will be on the search path. The following example demonstrates searching in two databases:

ATTACH ':memory:' AS db1;
ATTACH ':memory:' AS db2;
CREATE TABLE db1.tbl1 (i INTEGER);
CREATE TABLE db2.tbl2 (j INTEGER);
-- reference the tables using their fully qualified name
SELECT * FROM db1.tbl1;
SELECT * FROM db2.tbl2;
-- or set the search path and reference the tables using their name
SET search_path = 'db1,db2';
SELECT * FROM tbl1;
SELECT * FROM tbl2;

Transactional Semantics

When running queries on multiple databases, the system opens separate transactions per database. The transactions are started lazily by
default: when a given database is referenced for the first time in a query, a transaction for that database will be started.
SET immediate_transaction_mode = true can be toggled to change this behavior to eagerly start transactions in all attached databases instead.

While multiple transactions can be active at a time, the system only supports writing to a single attached database in a single transaction.
If you try to write to multiple attached databases in a single transaction the following error will be thrown:

Attempting to write to database "db2" in a transaction that has already modified database "db1" -
a single transaction can only write to a single attached database.

The reason for this restriction is that the system does not maintain atomicity for transactions across attached databases. Transactions are
only atomic within each database file. By restricting the global transaction to write to only a single database file the atomicity guarantees
are maintained.
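
In practice, writes to several attached databases can be split into one transaction per database. A minimal sketch, assuming two in-memory databases with hypothetical tables tbl1 and tbl2:

```sql
ATTACH ':memory:' AS db1;
ATTACH ':memory:' AS db2;
CREATE TABLE db1.tbl1 (i INTEGER);
CREATE TABLE db2.tbl2 (j INTEGER);

-- first transaction: writes only to "db1"
BEGIN TRANSACTION;
INSERT INTO db1.tbl1 VALUES (1);
COMMIT;

-- second transaction: writes only to "db2"
BEGIN TRANSACTION;
INSERT INTO db2.tbl2 VALUES (2);
COMMIT;
```

Note that the two commits are independent: if the second transaction fails, the first one stays committed.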

CALL Statement

The CALL statement invokes the given table function and returns the results.

Examples

-- Invoke the 'duckdb_functions' table function.
CALL duckdb_functions();
-- Invoke the 'pragma_table_info' table function.
CALL pragma_table_info('pg_am');

Syntax

CHECKPOINT Statement

The CHECKPOINT statement synchronizes data in the write‑ahead log (WAL) to the database data file. For in‑memory databases this
statement will succeed with no effect.

Examples

-- Synchronize data in the default database
CHECKPOINT;
-- Synchronize data in the specified database
CHECKPOINT file_db;
-- Abort any in-progress transactions to synchronize the data
FORCE CHECKPOINT;

Syntax

Checkpoint operations happen automatically based on the WAL size (see Configuration). This statement is for manual checkpoint actions.

Behavior

The default CHECKPOINT command will fail if there are any running transactions. Including FORCE will abort any transactions and execute
the checkpoint operation.

Also see the related PRAGMA option for further behavior modification.

Reclaiming Space

When performing a checkpoint (automatic or otherwise), the space occupied by deleted rows is partially reclaimed.
Note that this does not remove all deleted rows, but rather merges row groups that have a significant amount of deletes together. In the
current implementation this requires ~25% of rows to be deleted in adjacent row groups.

When running in in‑memory mode, checkpointing has no effect, hence it does not reclaim space after deletes in in‑memory databases.

Warning. The VACUUM statement does not trigger vacuuming deletes and hence does not reclaim space.

COMMENT ON Statement

The COMMENT ON statement allows adding metadata to catalog entries (tables, columns, etc.). It follows the PostgreSQL syntax.

Examples

COMMENT ON TABLE test_table IS 'very nice table';
COMMENT ON COLUMN test_table.test_table_column IS 'very nice column';
COMMENT ON VIEW test_view IS 'very nice view';
COMMENT ON INDEX test_index IS 'very nice index';
COMMENT ON SEQUENCE test_sequence IS 'very nice sequence';
COMMENT ON TYPE test_type IS 'very nice type';
COMMENT ON MACRO test_macro IS 'very nice macro';
COMMENT ON MACRO TABLE test_table_macro IS 'very nice table macro';
-- to unset a comment, set it to NULL, e.g.:
COMMENT ON TABLE test_table IS NULL;

Reading Comments

Comments can be read by querying the comment column of the respective metadata functions:

SELECT comment FROM duckdb_tables(); -- TABLE
SELECT comment FROM duckdb_columns(); -- COLUMN
SELECT comment FROM duckdb_views(); -- VIEW
SELECT comment FROM duckdb_indexes(); -- INDEX
SELECT comment FROM duckdb_sequences(); -- SEQUENCE
SELECT comment FROM duckdb_types(); -- TYPE
SELECT comment FROM duckdb_functions(); -- MACRO
SELECT comment FROM duckdb_functions(); -- MACRO TABLE

Limitations

The COMMENT ON statement currently has the following limitations:

• It is not possible to comment on schemas or databases.

• It is not possible to comment on things that have a dependency (e.g., a table with an index).

Syntax

COPY Statement

Examples

-- read a CSV file into the lineitem table, using auto-detected CSV options
COPY lineitem FROM 'lineitem.csv';
-- read a CSV file into the lineitem table, using manually specified CSV options
COPY lineitem FROM 'lineitem.csv' (DELIMITER '|');
-- read a Parquet file into the lineitem table
COPY lineitem FROM 'lineitem.pq' (FORMAT PARQUET);
-- read a JSON file into the lineitem table, using auto-detected options
COPY lineitem FROM 'lineitem.json' (FORMAT JSON, AUTO_DETECT true);
-- read a CSV file into the lineitem table, using double quotes
COPY lineitem FROM "lineitem.csv";
-- read a CSV file into the lineitem table, omitting quotes
COPY lineitem FROM lineitem.csv;

-- write a table to a CSV file
COPY lineitem TO 'lineitem.csv' (FORMAT CSV, DELIMITER '|', HEADER);
-- write a table to a CSV file, using double quotes
COPY lineitem TO "lineitem.csv";
-- write a table to a CSV file, omitting quotes
COPY lineitem TO lineitem.csv;
-- write the result of a query to a Parquet file
COPY (SELECT l_orderkey, l_partkey FROM lineitem) TO 'lineitem.parquet' (COMPRESSION ZSTD);

-- copy the entire content of database 'db1' to database 'db2'
COPY FROM DATABASE db1 TO db2;
-- copy only the schema (catalog elements) but not any data
COPY FROM DATABASE db1 TO db2 (SCHEMA);

Overview

COPY moves data between DuckDB and external files. COPY ... FROM imports data into DuckDB from an external file. COPY ... TO
writes data from DuckDB to an external file. The COPY command can be used for CSV, PARQUET and JSON files.

COPY ... FROM

COPY ... FROM imports data from an external file into an existing table. The data is appended to whatever data is in the table already.
The amount of columns inside the file must match the amount of columns in the table table_name, and the contents of the columns
must be convertible to the column types of the table. In case this is not possible, an error will be thrown.

If a list of columns is specified, COPY only copies the data in the specified columns from the file. Any columns in the table that
are not in the column list are filled with their default values by COPY ... FROM.
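
As a minimal sketch of the column-list behavior, assume a hypothetical definition of the category table in which only name is present in the file. The table layout and the default are illustrative assumptions, not part of the original documentation.

```sql
-- Hypothetical table: 'id' has no default (filled with NULL),
-- 'active' declares a default of true.
CREATE TABLE category (
    id     INTEGER,
    name   VARCHAR,
    active BOOLEAN DEFAULT true
);

-- Only 'name' is read from the file; 'id' and 'active' receive their defaults.
COPY category (name) FROM 'names.csv';
```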


-- Copy the contents of a comma-separated file 'test.csv' without a header into the table 'test'
COPY test FROM 'test.csv';
-- Copy the contents of a comma-separated file with a header into the 'category' table
COPY category FROM 'categories.csv' (HEADER);
-- Copy the contents of 'lineitem.tbl' into the 'lineitem' table, where the contents are delimited by a
-- pipe character ('|')
COPY lineitem FROM 'lineitem.tbl' (DELIMITER '|');
-- Copy the contents of 'lineitem.tbl' into the 'lineitem' table, where the delimiter, quote character,
-- and presence of a header are automatically detected
COPY lineitem FROM 'lineitem.tbl' (AUTO_DETECT true);
-- Read the contents of a comma-separated file 'names.csv' into the 'name' column of the 'category'
-- table. Any other columns of this table are filled with their default value.
COPY category(name) FROM 'names.csv';
-- Read the contents of a Parquet file 'lineitem.parquet' into the lineitem table
COPY lineitem FROM 'lineitem.parquet' (FORMAT PARQUET);
-- Read the contents of a newline-delimited JSON file 'lineitem.ndjson' into the lineitem table
COPY lineitem FROM 'lineitem.ndjson' (FORMAT JSON);
-- Read the contents of a JSON file 'lineitem.json' into the lineitem table
COPY lineitem FROM 'lineitem.json' (FORMAT JSON, ARRAY true);

Syntax

COPY ... TO

COPY ... TO exports data from DuckDB to an external CSV, Parquet, or JSON file. It has mostly the same set of options as COPY ... FROM;
in the case of COPY ... TO, however, the options specify how the file should be written to disk. Any file created by COPY ... TO can
be copied back into the database by using COPY ... FROM with a similar set of options.
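
The round trip described above might look like the following sketch. The table lineitem_copy is a hypothetical, empty target with the same columns as lineitem, created here only for illustration.

```sql
-- Export with an explicit delimiter and a header line
COPY lineitem TO 'lineitem.tbl' (DELIMITER '|', HEADER);

-- Hypothetical empty table with the same schema as 'lineitem'
CREATE TABLE lineitem_copy AS SELECT * FROM lineitem LIMIT 0;

-- Import the file again, passing the same options that were used for writing
COPY lineitem_copy FROM 'lineitem.tbl' (DELIMITER '|', HEADER);
```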

The COPY ... TO statement accepts either a table name or a query. When a table name is specified, the contents of the
entire table are written to the resulting file. When a query is specified, the query is executed and its result is written to
the resulting file.

-- Copy the contents of the 'lineitem' table to a CSV file with a header
COPY lineitem TO 'lineitem.csv';
-- Copy the contents of the 'lineitem' table to the file 'lineitem.tbl',
-- where the columns are delimited by a pipe character ('|'), including a header line.
COPY lineitem TO 'lineitem.tbl' (DELIMITER '|');
-- Use tab separators to create a TSV file without a header
COPY lineitem TO 'lineitem.tsv' (DELIMITER '\t', HEADER false);
-- Copy the l_orderkey column of the 'lineitem' table to the file 'orderkey.tbl'
COPY lineitem(l_orderkey) TO 'orderkey.tbl' (DELIMITER '|');
-- Copy the result of a query to the file 'query.csv', including a header with column names
COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.csv' (DELIMITER ',');
-- Copy the result of a query to the Parquet file 'query.parquet'
COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.parquet' (FORMAT PARQUET);
-- Copy the result of a query to the newline-delimited JSON file 'query.ndjson'
COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.ndjson' (FORMAT JSON);
-- Copy the result of a query to the JSON file 'query.json'
COPY (SELECT 42 AS a, 'hello' AS b) TO 'query.json' (FORMAT JSON, ARRAY true);

COPY ... TO Options

Zero or more copy options may be provided as a part of the copy operation. The WITH specifier is optional, but
if any options are specified, the parentheses are required. Parameter values can be passed in with or without wrapping in single quotes.

Any option that is a Boolean can be enabled or disabled in multiple ways. You can write true, ON, or 1 to enable the option, and false,
OFF, or 0 to disable it. The BOOLEAN value can also be omitted, e.g., by only passing (HEADER), in which case true is assumed.
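
The equivalent spellings described above can be sketched as follows, reusing the lineitem table from the earlier examples. The statements within each group are intended to behave identically.

```sql
-- Four ways to enable the header
COPY lineitem TO 'lineitem.csv' (HEADER);
COPY lineitem TO 'lineitem.csv' (HEADER true);
COPY lineitem TO 'lineitem.csv' (HEADER ON);
COPY lineitem TO 'lineitem.csv' (HEADER 1);

-- Three ways to disable it
COPY lineitem TO 'lineitem.csv' (HEADER false);
COPY lineitem TO 'lineitem.csv' (HEADER OFF);
COPY lineitem TO 'lineitem.csv' (HEADER 0);

-- WITH is optional; the parentheses are not
COPY lineitem TO 'lineitem.csv' WITH (HEADER, DELIMITER '|');

-- Parameter values with and without single quotes
COPY lineitem TO 'lineitem.parquet' (FORMAT PARQUET);
COPY lineitem TO 'lineitem.parquet' (FORMAT 'parquet');
```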

The below options are applicable to all formats written with COPY.


| Name | Description | Type | Default |
|------|-------------|------|---------|
| overwrite_or_ignore | Whether or not to allow overwriting a directory if one already exists. Only has an effect when used with partition_by. | BOOL | false |
| file_size_bytes | If this parameter is set, the COPY process creates a directory which will contain the exported files. If a file exceeds the set limit (specified as bytes such as 1000 or in human-readable format such as 1k), the process creates a new file in the directory. This parameter works in combination with per_thread_output. Note that the size is used as an approximation, and files can be occasionally slightly over the limit. | VARCHAR or BIGINT | (empty) |
| format | Specifies the copy function to use. The default is selected from the file extension (e.g., .parquet results in a Parquet file being written/read). If the file extension is unknown, CSV is selected. Available options are CSV, PARQUET and JSON. | VARCHAR | auto |
| partition_by | The columns to partition by using a Hive partitioning scheme, see the partitioned writes section. | VARCHAR[] | (empty) |
| per_thread_output | Generate one file per thread, rather than one file in total. This allows for faster parallel writing. | BOOL | false |
| use_tmp_file | Whether or not to write to a temporary file first if the original file exists (target.csv.tmp). This prevents overwriting an existing file with a broken file in case the writing is cancelled. | BOOL | auto |
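
A few of these options in use, as a rough sketch. The orders table with year and month columns, the output directory names, and the size limit are hypothetical values chosen for illustration.

```sql
-- Hive-partitioned Parquet output, one directory level per partition column
COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month));

-- Same write, but allow the target directory to exist already
COPY orders TO 'orders' (FORMAT PARQUET, PARTITION_BY (year, month), OVERWRITE_OR_IGNORE);

-- One output file per thread inside the 'lineitem_dir' directory
COPY lineitem TO 'lineitem_dir' (FORMAT PARQUET, PER_THREAD_OUTPUT);

-- Start a new file once the current one reaches roughly 100000000 bytes
COPY lineitem TO 'lineitem_chunks' (FORMAT PARQUET, FILE_SIZE_BYTES 100000000);

-- Write directly to the target file instead of going through a temporary file
COPY lineitem TO 'lineitem.csv' (USE_TMP_FILE false);
```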

Syntax

COPY FROM DATABASE ... TO

The COPY FROM DATABASE ... TO statement copies the entire content from one attached database to another attached database.
This includes the schema, including constraints, indexes, sequences, macros, and the data itself.

ATTACH 'db1.db' AS db1;

CREATE TABLE db1.tbl AS SELECT 42 AS x, 3 AS y;
CREATE MACRO db1.two_x_plus_y(x, y) AS 2 * x + y;

ATTACH 'db2.db' AS db2;

COPY FROM DATABASE db1 TO db2;
SELECT db2.two_x_plus_y(x, y) AS z FROM db2.tbl;

┌───────┐
│   z   │
│ int32 │
├───────┤
│    87 │
└───────┘

To only copy the schema of db1 to db2 but omit copying the data, add SCHEMA to the statement:

COPY FROM DATABASE db1 TO db2 (SCHEMA);


Syntax

Format‑Specific Options

CSV Options

The options below are applicable when writing CSV files.

| Name | Description | Type | Default |
|------|-------------|------|---------|
| compression | The compression type for the file. By default this will be detected automatically from the file extension (e.g., file.csv.gz will use gzip, file.csv will use none). Options are none, gzip, zstd. | VARCHAR | auto |
| force_quote | The list of columns to always add quotes to, even if not required. | VARCHAR[] | [] |
| dateformat | Specifies the date format to use when writing dates. See Date Format. | VARCHAR | (empty) |
| delim or sep | The character that is written to separate columns within each row. | VARCHAR | , |
| escape | The character that should appear before a character that matches the quote value. | VARCHAR | " |
| header | Whether or not to write a header for the CSV file. | BOOL | true |
| nullstr | The string that is written to represent a NULL value. | VARCHAR | (empty) |
| quote | The quoting character to be used when a data value is quoted. | VARCHAR | " |
| timestampformat | Specifies the date format to use when writing timestamps. See Date Format. | VARCHAR | (empty) |
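
Several CSV options can be combined in one statement. The sketch below assumes the lineitem table has an l_comment column (as in the TPC-H schema); the format strings and the NULL marker are illustrative choices.

```sql
-- The .gz extension selects gzip compression automatically
COPY lineitem TO 'lineitem.csv.gz' (
    HEADER,
    DELIMITER '|',
    NULLSTR 'NULL',                        -- write NULL values as the string NULL
    DATEFORMAT '%Y-%m-%d',                 -- format for DATE columns
    TIMESTAMPFORMAT '%Y-%m-%d %H:%M:%S',   -- format for TIMESTAMP columns
    FORCE_QUOTE (l_comment)                -- always quote this column
);
```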

Parquet Options

The options below are applicable when writing Parquet files.

| Name | Description | Type | Default |
|------|-------------|------|---------|
| compression | The compression format to use (uncompressed, snappy, gzip or zstd). | VARCHAR | snappy |
| row_group_size | The target size, i.e., number of rows, of each row group. | BIGINT | 122880 |
| row_group_size_bytes | The target size of each row group. You can pass either a human-readable string, e.g., '2MB', or an integer, i.e., the number of bytes. This option is only used when you have issued SET preserve_insertion_order = false;, otherwise it is ignored. | BIGINT | row_group_size * 1024 |
| field_ids | The field_id for each column. Pass auto to attempt to infer automatically. | STRUCT | (empty) |
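
As a short sketch of the compression and row group options, using the lineitem table from the earlier examples. The output file names and the specific sizes are arbitrary illustrative values.

```sql
-- zstd compression with a target of 100000 rows per row group
COPY lineitem TO 'lineitem_zstd.parquet' (FORMAT PARQUET, COMPRESSION ZSTD, ROW_GROUP_SIZE 100000);

-- row_group_size_bytes only takes effect when insertion order is not preserved
SET preserve_insertion_order = false;
COPY lineitem TO 'lineitem_2mb.parquet' (FORMAT PARQUET, ROW_GROUP_SIZE_BYTES '2MB');
```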

Some examples of FIELD_IDS are:

-- Assign field_ids automatically
COPY
(SELECT 128 AS i)
TO 'my.parquet'
(FIELD_IDS 'auto');

Du