Copy Activity in Azure Data Factory

The document outlines the process of converting a CSV file to JSON format using the Copy Activity in Azure Data Factory. It details steps including creating a storage account, uploading the CSV file, setting up linked services and datasets, and creating a pipeline to perform the conversion. The final steps involve running the pipeline and ensuring the JSON file has properly formatted keys.


How to convert a CSV file to JSON format with the Copy Activity in Azure Data Factory

Create a storage account to hold the CSV file.

Create a container named “landing” and upload the CSV file.


Create a linked service to connect to the storage account.

Linked services are much like connection strings: they define the connection information that Data Factory needs in order to connect to external resources.

Save the “SAS URL” and “token” from the storage account; we use them to authenticate while creating the linked service, as shown below.

A service shared access signature (SAS) delegates access to a resource in just one of the storage
services: Azure Blob Storage, Azure Queue Storage, Azure Table Storage, or Azure Files. The URI
for a service-level SAS consists of the URI to the resource for which the SAS will delegate access,
followed by the SAS token.
As the CSV file is in a Blob storage account, we select Azure Blob Storage as the data store for the linked service.
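For reference, the linked service that the UI generates can be inspected in its JSON (Code) view. A minimal sketch is shown below; the name "LS_BlobStorage_SAS" and the SAS URL value are placeholders, not values taken from this walkthrough.

{
  "name": "LS_BlobStorage_SAS",
  "properties": {
    "type": "AzureBlobStorage",
    "typeProperties": {
      "sasUri": {
        "type": "SecureString",
        "value": "https://<storage-account>.blob.core.windows.net/?<sas-token>"
      }
    }
  }
}

In practice the SAS token is better kept out of the definition, for example by storing it in Azure Key Vault and referencing it from the linked service.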
We need to create a dataset that points to the exact source location.

A dataset is a named view of data that references the data you want to use, and it can be consumed in activities as an input or an output.

Datasets identify data within different data stores, such as tables, files, folders, and documents. In short, datasets are references to data source locations.


Select the data store:
The data store is the storage service that holds the data; the connection information for it is kept in the linked service we created earlier.

We need to select the exact file format we are pointing to. In our case it is CSV (comma-separated values).
We need to select the exact root location from which we are picking up the CSV file.
We need to enable “AutoResolveIntegrationRuntime”, if it is not already enabled.

Note: By default, Azure Data Factory uses the “AutoResolveIntegrationRuntime”, which resolves the region automatically.

The integration runtime (IR) is the compute infrastructure used by Azure Data Factory to provide data integration capabilities across different network environments.
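As a side note, if an integration runtime is selected explicitly, it appears in the linked service JSON as a connectVia reference under properties. The fragment below is only a sketch; with the default AutoResolveIntegrationRuntime this block can usually be omitted entirely.

"connectVia": {
  "referenceName": "AutoResolveIntegrationRuntime",
  "type": "IntegrationRuntimeReference"
}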
Select the linked service to connect to the storage account.

Select the root folder.
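At this point the source dataset is complete. Its JSON definition looks roughly like the sketch below; the dataset name "Copy_CSV_to_JSON" matches the one referenced later in this walkthrough, while the linked service name and the file name "input.csv" are placeholders. Note that "firstRowAsHeader" is left at its default of false here, which we come back to at the end.

{
  "name": "Copy_CSV_to_JSON",
  "properties": {
    "type": "DelimitedText",
    "linkedServiceName": {
      "referenceName": "LS_BlobStorage_SAS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "landing",
        "fileName": "input.csv"
      },
      "columnDelimiter": ",",
      "firstRowAsHeader": false
    }
  }
}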


Next, we need to create another dataset to point to the destination location.
We need the CSV data to be written as JSON at the destination.
Select the destination location where the JSON file should land.
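The destination (sink) dataset uses the JSON format. A minimal sketch of its definition is shown below; the dataset name, the container "output", and the file name "output.json" are placeholders.

{
  "name": "DestinationDataset_JSON",
  "properties": {
    "type": "Json",
    "linkedServiceName": {
      "referenceName": "LS_BlobStorage_SAS",
      "type": "LinkedServiceReference"
    },
    "typeProperties": {
      "location": {
        "type": "AzureBlobStorageLocation",
        "container": "output",
        "fileName": "output.json"
      }
    }
  }
}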
Two datasets have been created successfully. Now we need to create a pipeline that picks up the CSV file, converts it to JSON, and moves it to the Azure storage account.

A pipeline in ADF is a logical grouping of activities that together perform a task. The activities in a pipeline define actions to perform on your data.

Data Factory supports three types of activities: data movement activities, data transformation activities, and control activities. Each activity can take zero or more input datasets and produce one or more output datasets.

Copy Activity:

Create a new pipeline:

Data Factory uses the Copy activity to move data from a source data store to a sink data store.
Select the source and sink datasets.
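With the source and sink datasets selected, the pipeline reduces to a single Copy activity. The JSON that ADF generates looks roughly like the sketch below; the pipeline and activity names are placeholders, and the dataset names follow the earlier sketches.

{
  "name": "PL_Copy_CSV_to_JSON",
  "properties": {
    "activities": [
      {
        "name": "Copy CSV to JSON",
        "type": "Copy",
        "inputs": [
          { "referenceName": "Copy_CSV_to_JSON", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "DestinationDataset_JSON", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "DelimitedTextSource" },
          "sink": { "type": "JsonSink" }
        }
      }
    ]
  }
}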
Click on Debug to run the pipeline manually.
We can see the status of the pipeline in the Debug tab.
After the pipeline has succeeded, we can see that the JSON file was created successfully.
CSV file

JSON File
We can notice that in the JSON file the keys are not printed in the proper format: because “First row as header” was not enabled on the source dataset, the copy used generated column names instead of the CSV header values as keys.
We need to select “First row as header” in the dataset “Copy_CSV_to_JSON”.
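In the dataset's JSON this corresponds to a single property change inside typeProperties, as sketched below (the location values are the same placeholders used earlier):

"typeProperties": {
  "location": {
    "type": "AzureBlobStorageLocation",
    "container": "landing",
    "fileName": "input.csv"
  },
  "columnDelimiter": ",",
  "firstRowAsHeader": true
}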

Now run the pipeline again to see the changes in the file.


CSV File:

JSON File with proper keys:
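To illustrate the difference with hypothetical data (not the file used in this walkthrough): without the header option the sink produces generic keys such as Prop_0 and Prop_1, whereas with “First row as header” enabled the CSV header values become the JSON keys, for example:

{"id":"1","name":"Alice","city":"Hyderabad"}
{"id":"2","name":"Bob","city":"Chennai"}

Whether the objects are written one per line or wrapped in an array depends on the file pattern setting of the JSON sink.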


!!!!! Successfully converted the CSV file to a JSON file using the Copy activity in a pipeline !!!!!

For more updates, follow me on LinkedIn: LinkedIn.com/in/amulya1003
