0% found this document useful (0 votes)
181 views9 pages

ADF Data Flow Cheat Sheet

The document is a cheat sheet for Azure Data Factory (ADF) Data Flow, detailing various transformations used in data engineering. It covers basic transformations like Filter, Select, and Aggregate, as well as advanced techniques such as Join, Derived Column, and Sort. Additionally, it discusses specialized transformations like Union, Conditional Split, and Lookup, along with advanced control transformations like Alter Row, Exists, and Pivot.

Uploaded by

Sumanth Kulkarni
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
181 views9 pages

ADF Data Flow Cheat Sheet

The document is a cheat sheet for Azure Data Factory (ADF) Data Flow, detailing various transformations used in data engineering. It covers basic transformations like Filter, Select, and Aggregate, as well as advanced techniques such as Join, Derived Column, and Sort. Additionally, it discusses specialized transformations like Union, Conditional Split, and Lookup, along with advanced control transformations like Alter Row, Exists, and Pivot.

Uploaded by

Sumanth Kulkarni
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 9

Azure Data Engineering

ADF Data Flow

Cheat Sheet
Part 1 - The Basics

1. Filter Transformation
- What It Does: Filters rows in your dataset based on
conditions you define.
- Use Case: Remove rows with null values or filter data
based on date ranges or specific conditions (e.g., only
select customers from a certain country).
- Pro Tip: Combine multiple conditions using AND/OR
logic to create complex filters.

2. Select Transformation
- What It Does: Chooses which columns to include in the
data flow output.
- Use Case: Streamline datasets by only keeping the
necessary columns for downstream activities.
- Pro Tip: You can also rename columns here, making it
easier to handle field mappings.
3. Aggregate Transformation-
- What It Does: Performs aggregation operations like sum,
average, min, max, count, and more on grouped data.
- Use Case: Summarize sales data by region or calculate
average transaction values.
- Pro Tip: Use group-by functions to calculate aggregated
values for different data categories.
Part 2 - Advanced
Techniques

1. Join Transformation
- What It Does: Combines two datasets based on a
common key, allowing you to merge records.
- Use Case: Join sales data with customer details using a
unique customer ID.
- Pro Tip: ADF supports various types of joins (inner, left,
right, full) to ensure you get the desired output.

2. Derived Column Transformation


- What It Does: Creates new columns by applying
expressions to existing data fields.
- Use Case: Generate calculated fields like “Total Price”
from individual unit prices and quantities.
- Pro Tip: Use ADF’s built-in functions like substring, date,
or math functions to create dynamic columns.
3. Sort Transformation
- What It Does: Sorts data by one or more columns, either
ascending or descending.
- Use Case: Organize your data for reports, or for
downstream systems that require sorted input.
- Pro Tip: Sorting can be memory-intensive, so only sort
when necessary and on relevant columns.
Part 3 - Specialized
Transformations

1. Union Transformation
- What It Does: Combines multiple datasets into a single
dataset by appending rows.
- Use Case: Union sales data from different regions or
months into one master dataset.
- Pro Tip: Ensure the column schemas of the datasets
match or use “Select” transformations to align them.

2. Conditional Split Transformation


- What It Does: Divides your dataset into multiple streams
based on conditions (think of it as an "if-else" for data).
- Use Case: Separate valid and invalid records or route
data to different processes based on status.
- Pro Tip: Use multiple conditions to split data in more
granular ways, improving the flexibility of your data
processing.
3. Lookup Transformation
- What It Does: Looks up values from another dataset and
adds them as new columns in your current dataset.
- Use Case: Enrich customer data by looking up additional
information like loyalty status or demographics.
- Pro Tip: Always ensure that the dataset being looked up
has unique keys for accurate results.
Part 4 - Advanced Control

1. Alter Row Transformation


- What It Does: Controls row-level operations like
inserting, updating, deleting, or skipping rows in sink
operations.
- Use Case: Use this transformation to perform UPSERT
operations into databases.
- Pro Tip: Combine with conditions to dynamically decide
how each row should be processed
(insert/update/delete).

2. Exists Transformation
- What It Does: Determines whether rows from one
stream exist in another stream, returning only rows that
match or don’t match.
- Use Case: Filter records that exist in one dataset but not
another (e.g., records that exist in source data but not in
destination).
- Pro Tip: Use for deduplication or filter out already
processed records from new data.
3. Pivot Transformation
- What It Does: Converts rows into columns, essentially
rotating your data for better analytics and reporting.
- Use Case: Create cross-tabular reports (e.g., sales per
product, per region).
- Pro Tip: Use aggregate functions in combination with
Pivot to summarize data effectively.

You might also like