0% found this document useful (0 votes)
23 views2 pages

Lab4 Partitioning

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views2 pages

Lab4 Partitioning

Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Lab 4: Partition by List

In PostgreSQL, the PARTITION BY LIST is a method to create table partitions based on a


specific list of values for a designated partition key. This can be useful for organizing data into
partitions that have a logical relationship, making it easier to manage and query the data.
Here's a basic example of creating a partitioned table using PARTITION BY LIST:

CREATE TABLE sales (


sale_id serial PRIMARY KEY,
product_id INT,
sale_date DATE,
amount DECIMAL
) PARTITION BY LIST (product_id);

CREATE TABLE sales_electronics PARTITION OF sales FOR VALUES IN (1, 2, 3);


CREATE TABLE sales_clothing PARTITION OF sales FOR VALUES IN (4, 5, 6);
CREATE TABLE sales_books PARTITION OF sales FOR VALUES IN (7, 8, 9);

In this example, the sales table is partitioned based on the product_id column. Three
partitions (sales_electronics, sales_clothing, and sales_books) are created, each
containing a specific range of product IDs.
Partitioning in PostgreSQL, such as using PARTITION BY LIST, can be beneficial for large
datasets. In the context of the TPC-H benchmark, you might want to partition tables to optimize
certain types of queries. Let's consider an example with the lineitem table from TPC-H, where
we might partition by the shipdate column.

First, let's create the partitioned table:


CREATE TABLE lineitem (
l_orderkey INT,
l_partkey INT,
l_suppkey INT,
l_linenumber INT,
l_quantity DECIMAL,
l_extendedprice DECIMAL,
l_discount DECIMAL,
l_tax DECIMAL,
l_returnflag CHAR(1),
l_linestatus CHAR(1),
l_shipdate DATE,
l_commitdate DATE,
l_receiptdate DATE,
l_shipinstruct VARCHAR(25),
l_shipmode VARCHAR(10),
l_comment VARCHAR(44)
) PARTITION BY LIST (EXTRACT(YEAR FROM l_shipdate));

This example partitions the lineitem table based on the shipdate column,
creating separate partitions for each year. Now, let's assume you have a
specific query that you want to optimize. For instance, if you often query for
the total extended price of items shipped in a particular year, you might
rewrite the query to take advantage of partitioning.
-- Original query
SELECT SUM(l_extendedprice)
FROM lineitem
WHERE EXTRACT(YEAR FROM l_shipdate) = 1992;

-- Optimized query using partitioning


SELECT SUM(l_extendedprice)
FROM lineitem_1992;

By rewriting the query to reference the specific partition


(lineitem_1992), PostgreSQL can directly access the relevant partition,
potentially improving query performance.

Keep in mind that the effectiveness of partitioning depends on your specific workload and query
patterns. Additionally, it's important to regularly analyze and maintain partitions to ensure optimal
performance.
Remember to adapt the partitioning strategy based on your actual data distribution and the queries
you frequently run. In real-world scenarios, you might want to experiment with different partition
keys and strategies to find the most suitable solution for your specific use case.

QUERY PLAN
------------------------------------------------------------------------
Aggregate (cost=1000.00..1200.00 rows=1 width=8)
-> Partitioned Table Scan on lineitem (cost=0.00..1000.00 rows=10000 width=8)
Partition Cond: (l_shipdate >= '1992-01-01'::date AND l_shipdate < '1993-01-
01'::date)

You might also like