Delta Lake With Azure Databricks

The document outlines SQL commands for optimizing and managing Delta Lake tables, including operations like Z-Ordering, schema evolution, and adding or renaming columns. It demonstrates creating a new table with clustering, updating schemas, and reordering columns. Additionally, it includes commands for reorganizing the table and setting properties for Delta Lake functionality.

Uploaded by

Woody Woodpecker

Optimize

%sql
optimize delta_catalog.raw.ext_table_dml;
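OPTIMIZE compacts many small files into fewer, larger ones. A minimal pure-Python sketch of the underlying bin-packing idea (the 128 MB target and greedy strategy here are illustrative only, not Databricks internals):

```python
# Conceptual sketch of small-file compaction: greedily pack files into
# groups close to a target size. Delta's OPTIMIZE does something similar
# at the storage layer (its actual target sizes and strategy differ).
TARGET = 128  # illustrative target file size in MB

def compact(file_sizes_mb, target=TARGET):
    """Group files into bins whose total size stays at or under `target`."""
    bins, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        if current_size + size > target and current:
            bins.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        bins.append(current)
    return bins

files = [10, 40, 90, 5, 60, 20]  # many small files
print(compact(files))            # fewer, larger groups
```

Fewer, larger files mean fewer file-open operations and fewer statistics entries to scan at query time.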

ZOrder by

%sql
optimize delta_catalog.raw.ext_table_dml zorder by id;
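ZORDER BY maps column values onto a space-filling (Z-order/Morton) curve, so rows with nearby values end up in the same files and data skipping can prune more files. A rough sketch of the bit-interleaving idea for two columns (purely conceptual; the actual Databricks implementation is more involved):

```python
def morton_2d(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order key.

    Sorting rows by this key keeps points that are close in (x, y)
    close together on disk, which is what enables data skipping.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # even bit positions: x
        key |= ((y >> i) & 1) << (2 * i + 1)  # odd bit positions: y
    return key

points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]
print(sorted(points, key=lambda p: morton_2d(*p)))
```

This is why ZORDER helps most when queries filter on several of the clustered columns at once, whereas a plain sort only helps the leading column.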

Liquid Clustering

%sql
create table delta_catalog.raw.liq_table
(
id int,
amount double,
name string
)
using delta
location 'abfss://[email protected]/liq_table'
cluster by (id)
Schema Evolution

from pyspark.sql.functions import *

my_data = [(1,'food',10),(2,'drink',20),(3,'food',30),(4,'drink',40)]
my_schema = "id INT, category STRING, amt INT"

df = spark.createDataFrame(my_data, schema=my_schema)
df.display()

df.write.format('delta')\
    .mode('append')\
    .option("path","abfss://[email protected]/sch_tbl")\
    .save()

df_new = df.union(spark.createDataFrame([(5,'food',50),(6,'drink',60)], schema=my_schema))
df_new = df_new.withColumn("flag", lit(1))
df_new.display()

Merge Schema

df_new.write.format('delta')\
    .mode('append')\
    .option("path","abfss://[email protected]/sch_tbl")\
    .option("mergeSchema", "true")\
    .save()

df = spark.read.format('delta').load('abfss://[email protected]/sch_tbl')
df.display()

Explicit Schema Update

Add a Column

%sql
alter table delta_catalog.raw.ext_table_dml
add columns flag string;

%sql
select * from delta_catalog.raw.ext_table_dml

Add a Column After

%sql
alter table delta_catalog.raw.ext_table_dml
add columns newcol string after id;

Reordering Columns

%sql
alter table delta_catalog.raw.ext_table_dml
alter column newcol after flag;

Rename Columns

%sql
ALTER TABLE delta_catalog.raw.ext_table_dml
SET TBLPROPERTIES (
'delta.minReaderVersion' = '2',
'delta.minWriterVersion' = '5',
'delta.columnMapping.mode' = 'name'
);

%sql
ALTER TABLE delta_catalog.raw.ext_table_dml RENAME COLUMN newcol TO newflag;

REORG Command

%sql
reorg table delta_catalog.raw.ext_table_dml apply (purge);

%sql
select * from delta_catalog.raw.ext_table_dml
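The mergeSchema write above appends rows that carry an extra flag column; Delta reconciles the table schema by keeping the existing columns in place and appending the new ones. A tiny pure-Python sketch of that reconciliation rule (illustrative only, not Delta's implementation):

```python
def merge_schema(table_cols, incoming_cols):
    """Reconcile two schemas the way mergeSchema does conceptually:
    keep the table's columns in their original order, then append
    any columns that only the incoming data has."""
    merged = list(table_cols)
    existing = {name for name, _ in table_cols}
    for name, dtype in incoming_cols:
        if name not in existing:
            merged.append((name, dtype))
    return merged

table = [("id", "int"), ("category", "string"), ("amt", "int")]
incoming = table + [("flag", "int")]
print(merge_schema(table, incoming))
```

Existing rows simply read the new column as null, which is why this kind of additive evolution is safe on append.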
