Implement of Directory table #390
commit message?
postgres=# create directory table tbl;
We need to support default tablespace here?
postgres=# create directory table tbl tablespace pg_default;
Implement the directory table feature in this commit. A directory table is a
new kind of relation used to organize unstructured data files in a specified
tablespace. The data files themselves are stored in that tablespace, while the
tuples that record each file's metadata (relative_path, md5, size, etc.) are
stored in a normal table.
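To make the split between file data and metadata concrete, here is a minimal Python sketch of the kind of record a directory table row might hold for one file. It is an illustration only, not the code in this commit: `FileMetadata` and `describe_file` are hypothetical names, and the only fields shown are the ones named above (relative_path, size, md5).

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FileMetadata:
    """Hypothetical stand-in for one metadata tuple of a directory table."""
    relative_path: str
    size: int
    md5: str


def describe_file(root: str, relative_path: str) -> FileMetadata:
    """Compute size and MD5 for a file stored under `root` (the tablespace
    directory in this sketch), reading it in chunks to bound memory use."""
    full_path = os.path.join(root, relative_path)
    digest = hashlib.md5()
    with open(full_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return FileMetadata(
        relative_path=relative_path,
        size=os.path.getsize(full_path),
        md5=digest.hexdigest(),
    )


if __name__ == "__main__":
    # Write a small file into a temporary directory and describe it.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "file1"), "wb") as f:
            f.write(b"hello")
        print(describe_file(root, "file1"))
```

The file bytes stay on disk under the root directory; only the small record would be stored as a tuple.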
Both local and remote directory tables are supported. A local directory table
uses a local tablespace, while a remote directory table uses a DFS tablespace,
which is implemented in our enterprise extension.
Files are uploaded to a directory table with COPY BINARY ... FROM. The
directory_table UDF returns file content, and the remove_file UDF removes a
file from a directory table. We also implement a tool called cbload for
uploading files to a directory table. To support DFS directory tables, we
introduce new catalog tables such as gp_storage_server and
gp_storage_user_mapping, which are shared across all databases.
The following examples illustrate typical usage.
-- Create a storage server, oss_server, that points to the endpoint
CREATE STORAGE SERVER oss_server OPTIONS
(protocol 'qingstor', endpoint 'pek3b.qingstor.com', https 'true', virtual_host 'false');
-- Create a user mapping to access oss_server
CREATE STORAGE USER MAPPING FOR CURRENT_USER STORAGE SERVER oss_server OPTIONS
(accesskey 'KGCPPHVCHRDSYFEAWLLC', secretkey '0SJIWiIATh6jOlmAas23q6hOAGBI1BnsnvgJmTs');
-- Create a local tablespace
CREATE TABLESPACE dirtable_spc LOCATION '/data/dirtable_spc';
-- Create a local directory table
CREATE DIRECTORY TABLE dirtable TABLESPACE dirtable_spc;
-- Upload a file to the directory table with COPY BINARY
COPY BINARY dirtable FROM '/data/file1.csv' 'file1';
-- Query the directory table
SELECT * FROM dirtable;
SELECT * FROM directory_table('dirtable');
-- Remove a file from the directory table
SELECT remove_file('dirtable', 'file1');
Co-authored-by: Mu Guoqing [email protected]
Reviewed-by: Yang Yu [email protected]
Yang Jianghua [email protected]