Delta Lake With Azure Databricks

The document outlines SQL commands for optimizing and managing Delta Lake tables, including operations like Z-Ordering, schema evolution, and adding or renaming columns. It demonstrates creating a new table with clustering, updating schemas, and reordering columns. Additionally, it includes commands for reorganizing the table and setting properties for Delta Lake functionality.

Uploaded by

Woody Woodpecker

Optimize

%sql
optimize delta_catalog.raw.ext_table_dml;
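OPTIMIZE compacts many small files into fewer, larger ones. A minimal pure-Python sketch of the underlying bin-packing idea (the 128 MB target and greedy strategy here are illustrative only, not Databricks internals):

```python
# Conceptual sketch of small-file compaction: greedily pack files into
# groups close to a target size. Delta's OPTIMIZE does something similar
# at the storage layer (its actual target sizes and strategy differ).
TARGET = 128  # illustrative target file size in MB

def compact(file_sizes_mb, target=TARGET):
    """Group files into bins whose total size stays at or under `target`."""
    bins, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        if current_size + size > target and current:
            bins.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        bins.append(current)
    return bins

files = [10, 40, 90, 5, 60, 20]  # many small files
print(compact(files))            # fewer, larger groups
```

Fewer, larger files mean fewer file-open operations and fewer statistics entries to scan at query time.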

ZOrder by

%sql
optimize delta_catalog.raw.ext_table_dml zorder by id;
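ZORDER BY maps column values onto a space-filling (Z-order/Morton) curve, so rows with nearby values end up in the same files and data skipping can prune more files. A rough sketch of the bit-interleaving idea for two columns (purely conceptual; the actual Databricks implementation is more involved):

```python
def morton_2d(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order key.

    Sorting rows by this key keeps points that are close in (x, y)
    close together on disk, which is what enables data skipping.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # even bit positions: x
        key |= ((y >> i) & 1) << (2 * i + 1)  # odd bit positions: y
    return key

points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]
print(sorted(points, key=lambda p: morton_2d(*p)))
```

This is why ZORDER helps most when queries filter on several of the clustered columns at once, whereas a plain sort only helps the leading column.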

Liquid Clustering

%sql
create table delta_catalog.raw.liq_table
(
id int,
amount double,
name string
)
using delta
location 'abfss://[email protected]/liq_table'
cluster by (id)
Schema Evolution

from pyspark.sql.functions import *

my_data = [(1,'food',10),(2,'drink',20),(3,'food',30),(4,'drink',40)]
my_schema = "id INT, category STRING, amt INT"

df = spark.createDataFrame(my_data, schema=my_schema)
df.display()

df.write.format('delta')\
    .mode('append')\
    .option("path","abfss://[email protected]/sch_tbl")\
    .save()

df_new = df.union(spark.createDataFrame([(5,'food',50),(6,'drink',60)], schema=my_schema))
df_new = df_new.withColumn("flag", lit(1))
df_new.display()

Merge Schema

df_new.write.format('delta')\
    .mode('append')\
    .option("path","abfss://[email protected]/sch_tbl")\
    .option("mergeSchema", "true")\
    .save()

df = spark.read.format('delta').load('abfss://[email protected]/sch_tbl')
df.display()

Explicit Schema Update

Add a Column

%sql
alter table delta_catalog.raw.ext_table_dml
add columns flag string;

%sql
select * from delta_catalog.raw.ext_table_dml

Add a Column After

%sql
alter table delta_catalog.raw.ext_table_dml
add columns newcol string after id;

Reordering Columns

%sql
alter table delta_catalog.raw.ext_table_dml
alter column newcol after flag;

Rename Columns

%sql
ALTER TABLE delta_catalog.raw.ext_table_dml
SET TBLPROPERTIES (
'delta.minReaderVersion' = '2',
'delta.minWriterVersion' = '5',
'delta.columnMapping.mode' = 'name'
);

%sql
ALTER TABLE delta_catalog.raw.ext_table_dml RENAME COLUMN newcol TO newflag;

REORG Command

%sql
reorg table delta_catalog.raw.ext_table_dml apply (purge);

%sql
select * from delta_catalog.raw.ext_table_dml
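The mergeSchema write above appends rows that carry an extra flag column; Delta reconciles the table schema by keeping the existing columns in place and appending the new ones. A tiny pure-Python sketch of that reconciliation rule (illustrative only, not Delta's implementation):

```python
def merge_schema(table_cols, incoming_cols):
    """Reconcile two schemas the way mergeSchema does conceptually:
    keep the table's columns in their original order, then append
    any columns that only the incoming data has."""
    merged = list(table_cols)
    existing = {name for name, _ in table_cols}
    for name, dtype in incoming_cols:
        if name not in existing:
            merged.append((name, dtype))
    return merged

table = [("id", "int"), ("category", "string"), ("amt", "int")]
incoming = table + [("flag", "int")]
print(merge_schema(table, incoming))
```

Existing rows simply read the new column as null, which is why this kind of additive evolution is safe on append.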
