0% found this document useful (0 votes)
8 views22 pages

Duckdb Cli Tutorial

DuckDB is a lightweight, in-process SQL OLAP database designed for analytical workloads, allowing users to run SQL queries without a server. The tutorial covers installation, basic CLI commands, data import/export, and advanced features like transactions and remote data querying. It is ideal for ad-hoc SQL queries and data science exploration, providing efficient analytics on laptops.

Uploaded by

baimochenya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
8 views22 pages

Duckdb Cli Tutorial

DuckDB is a lightweight, in-process SQL OLAP database designed for analytical workloads, allowing users to run SQL queries without a server. The tutorial covers installation, basic CLI commands, data import/export, and advanced features like transactions and remote data querying. It is ideal for ad-hoc SQL queries and data science exploration, providing efficient analytics on laptops.

Uploaded by

baimochenya
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 22

数据库系统

Database System
DuckDB Tutorial
DuckDB 教程

9/22/2025 Fall 2025 1


Getting Started with DuckDB CLI
• Lightweight, fast, and easy-to-use analytical SQL engine
• Focus: Practical CLI usage
What is DuckDB?
• In-process SQL OLAP database
• Designed for analytical workloads
• Runs without a server, embedded in apps or CLI
DuckDB
• Download:
https://duckdb.org/docs/installation/?version=stable&environ
ment=cli&platform=win&download_method=direct&architect
ure=x86_64
• x86_64

9/22/2025 Fall 2025 4


Installing DuckDB
• Download binaries from https://duckdb.org

• Linux/macOS: brew install duckdb or package manager

• Windows: use precompiled .exe

• Python/R: pip install duckdb, install.packages('duckdb')


Starting the CLI
• Open terminal (in Linxu, Ubuntu)
• Run:
duckdb

• Starts an interactive SQL shell


• By default, creates an in-memory DB
Creating a Database File
• Persistent database stored in file
• Start with file:
duckdb mydb.duckdb

• Reopen anytime with same command


Basic CLI Commands
• .help → list all commands
• .open file → open another database
• .quit → exit CLI
• .schema → show schema
Running SQL Queries
• Example:

CREATE TABLE users(id INTEGER, name TEXT);


INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob');
SELECT * FROM users;

• CLI prints tabular results


Importing CSV Data
• Use built-in read_csv_auto:
CREATE TABLE trips AS SELECT * FROM
read_csv_auto('trips.csv’);

• Automatically detects schema


Inspecting Data
• .tables → list all tables
• .schema tablename → show table schema
• PRAGMA table_info('trips'); → query metadata
Querying Data
• Standard SQL supported:
SELECT COUNT(*) FROM trips;
SELECT city, AVG(duration) FROM trips GROUP BY city;

• Aggregate & group functions


Filtering and Joins
• Filtering with WHERE:
SELECT * FROM trips WHERE duration > 60;

• Joining:
SELECT u.name, t.city FROM users u JOIN trips t ON u.id =
t.user_id;
Using Parquet Files
• DuckDB reads/writes Parquet natively
• Example:
SELECT * FROM 'data.parquet’;

• Write query results: COPY (SELECT * FROM trips) TO


'out.parquet';
Exporting Results
• CSV export:

COPY trips TO 'trips.csv' (HEADER, DELIMITER ',’);

• Parquet export (efficient for large data)


CLI Output Options
• Change output format:
• .mode csv → output in CSV format
• .mode markdown → for docs
• .mode box → nice table boxes
Views and Temporary Tables
• Create view:
• CREATE VIEW active_trips AS SELECT * FROM trips WHERE
duration > 60;

• Temporary tables auto-cleaned after session


Transactions
• Start: BEGIN TRANSACTION;
• Multiple inserts/updates
• Commit: COMMIT;
• Rollback: ROLLBACK;
Extensions
• Load extensions dynamically:

INSTALL httpfs;
LOAD httpfs;
Enables S3/HTTP access
Querying Remote Data
• Example with HTTPFS:

SELECT * FROM read_csv_auto('https://.../data.csv’);

• Works with Parquet/CSV over HTTP or S3


Performance Features
• Vectorized execution engine
• Columnar storage
• Parallel query execution
• Efficient for analytics on laptops
Wrap Up
• DuckDB CLI is lightweight yet powerful

• Ideal for:
• - Ad-hoc SQL queries
• - Data science exploration
• - Fast analytics without servers

You might also like