Skip to content

Conversation

@avamingli
Copy link
Contributor

@avamingli avamingli commented Dec 16, 2024

Handle of data status changes for materialized views associated with partitioned tables.

Writable operations (INSERT, UPDATE, DELETE, COPY, TRUNCATE, CLUSTER and etc) on a base table lead to changes in the data status of linked materialized views. For partitioned tables, changes in child partitions can impact parent and ancestor tables, necessitating updates to the relevant materialized views.

For example, consider a root partitioned table P0 with two child partitioned tables: P1 and P2. Each child table further has
two sub-partitions: P1 has P1_1 and P1_2, while P2 has P2_1 and P2_2.
q_dynamic_table drawio
If we insert rows into P1_1, the data status of both mv0 (based on P0) and mv1 (based on P1) will change, as P1_1's parent table P1 and the root table P0 now contain more rows.
q_dynamic_table drawio2
Conversely, if we insert rows into P2_1, only the data status of mv2_1 (based on P2_1) and mv0 will be affected, while mv1 remains unchanged because the data in P1 has not changed.
q_dynamic_table drawio3

Propagation Direction:

Data status changes propagate both UP and DOWN the partition tree, except for specific DDL operations:

  • Create Table XXX Partition Of:
    Indicates an insert operation on the parent table.
  • Drop Table:
    Indicates a delete operation on the parent table.
  • Alter Table ATTACH/DETACH :
    Attaching implies an insert, while detaching implies a delete.

Handling TRUNCATE and CLUSTER: A TRUNCATE operation on a parent table is treated as a DELETE on its child tables, affecting both the parent and its descendants. Similarly, a CLUSTER operation will transform the data status in both UP and DOWN directions. However, CLUSTER has a recognized status to indicate that the changes have been applied to the pages of the tables.

The COPY command indicates that data has been inserted into the table and its ancestor tables.

The VACUUM command indicates that data has been recognized on the table and its ancestor tables.

Authored-by: Zhang Mingli [email protected]

Fixes #ISSUE_Number

What does this PR do?

Type of Change

  • Bug fix (non-breaking change)
  • New feature (non-breaking change)
  • Breaking change (fix or feature with breaking changes)
  • Documentation update

Breaking Changes

Test Plan

  • Unit tests added/updated
  • Integration tests added/updated
  • Passed make installcheck
  • Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Additional Context

CI Skip Instructions


@avamingli
Copy link
Contributor Author

This commit represents the first step in supporting partitioned tables for answering queries using materialized views of discussion #780. In a subsequent pull request, I will enable the full functionality, while this commit focuses solely on maintaining the data status.

@avamingli
Copy link
Contributor Author

avamingli commented Dec 17, 2024

Push a temp commit to debug CI failure which I could not reproduce locally. Will remove it after case fixed.

This commit improves the handling of data status changes for
materialized views associated with partitioned tables.

Writable operations (INSERT, UPDATE, DELETE, COPY, TRUNCATE, CLUSTER) on
a base table lead to changes in the data status of linked materialized views.
For partitioned tables, changes in child partitions can impact parent and
ancestor tables, necessitating updates to the relevant materialized views.

Example:

      P0
    /    \
   P1    P2
       /    \
     P2_1  P2_2

For a root partitioned table P0 with child partitions P1 and P2
(each having sub-partitions), inserting rows into P1_1 updates the
data status of mv0 (based on P0) and mv1 (based on P1). Conversely,
inserting into P2_1 only affects mv2_1 (based on P2_1) and mv0, leaving
mv1 unchanged.

Propagation Direction: Data status changes propagate both UP and DOWN the
partition tree, except for specific DDL operations:
  Create Table XXX Partition Of:
	Indicates an insert operation on the parent table.
  Drop Table:
	 Indicates a delete operation on the parent table.
  Alter Table ATTACH/DETACH :
	Attaching implies an insert, while detaching implies a delete.

Handling TRUNCATE and CLUSTER: A TRUNCATE operation on a parent table is
treated as a DELETE on its child tables, affecting both the parent and
its descendants. Similarly, a CLUSTER operation will transform the
data status in both UP and DOWN directions. However, CLUSTER has a
recognized status to indicate that the changes have been applied to
the pages of the tables.

The COPY command indicates that data has been inserted into the table
and its ancestor tables.

The VACUUM command indicates that data has been recognized on the table
and its ancestor tables.

Authored-by: Zhang Mingli [email protected]
@avamingli
Copy link
Contributor Author

Push a temp commit to debug CI failure which I could not reproduce locally. Will remove it after case fixed.

Didn't do anything and CI passed after retry.

Copy link
Member

@yjhjstz yjhjstz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@my-ship-it my-ship-it merged commit f34ae72 into apache:main Dec 25, 2024
10 checks passed
@avamingli avamingli deleted the qumv_par branch December 25, 2024 04:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants