DP-700
Question: 1 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
To display the first question in this case study, click the Next button. Use the buttons in the left pane to explore
the content of the case study before you answer the questions.
Clicking these buttons displays information such as business requirements, existing environment, and problem
statements. If the case study has an All Information tab, note that the information displayed is identical to the
information displayed on the subsequent tabs. When you are ready to answer a question, click the Question button
to return to the question.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient connectivity errors, which cause the data exports to fail.
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
Items that relate to data ingestion must meet the following requirements:
The items must be source controlled alongside other workspace items.
Ingested data must land in the bronze layer of Lakehouse1 in the Delta format.
No changes other than changes to the file formats must be implemented before the data lands in the bronze layer.
Development effort must be minimized and a built-in connection must be used to import the source data.
In the event of a connectivity error, the ingestion processes must attempt the connection again.
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.
The data analysts must only have read access to the Delta tables in the gold layer.
The data analysts must NOT have access to the data in the bronze and silver layers.
The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that the data analysts can access the gold layer lakehouse.
Answer: C
Explanation:
C: Share the lakehouse with the DataAnalysts group and grant the Read all data permission. This approach ensures that the data analysts have the read access they need to the Delta tables in the gold layer while meeting the requirement that they have no access to the data in the bronze and silver layers.
Because the sharing grant covers only reading the data exposed by the lakehouse (for example, through its SQL analytics endpoint), the analysts receive the necessary and sufficient access to query the gold layer while adhering to the principle of least privilege.
Question: 2 CertyIQ
You have a Fabric workspace.
You have semi-structured data.
You need to read the data by using T-SQL, KQL, and Apache Spark. The data will only be written by using Spark.
What should you use to store the data?
A.a lakehouse
B.an eventhouse
C.a datamart
D.a warehouse
Answer: B
Explanation:
B. An eventhouse.
An eventhouse is designed for event-based and streaming data and can store semi-structured data. Although the data will only be written by using Apache Spark, an eventhouse supports querying the data through T-SQL, KQL, and Spark, which satisfies the requirement to read the data with all three engines.
Question: 3 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?
Answer: B
Explanation:
B: a data pipeline.
A data pipeline is the most suitable tool for moving data between different sources and destinations. In this
case, you need to copy data from your on-premises Microsoft SQL Server database (Database1) to your Fabric
warehouse (Warehouse1). A data pipeline can efficiently handle this task by allowing you to define and
manage the data transfer process.
Question: 4 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
You need to copy data from Database1 to Warehouse1.
Which item should you use?
Answer: B
Explanation:
B: a data pipeline.
A data pipeline is specifically designed for orchestrating and automating data movement between different sources and destinations. In this scenario, a pipeline can be configured to copy data from the on-premises Microsoft SQL Server database (Database1) to the Fabric warehouse (Warehouse1), using the on-premises data gateway for secure connectivity.
Question: 5 CertyIQ
You have a Fabric F32 capacity that contains a workspace. The workspace contains a warehouse named DW1 that
is modelled by using MD5 hash surrogate keys.
DW1 contains a single fact table that has grown from 200 million rows to 500 million rows during the past year.
You have Microsoft Power BI reports that are based on Direct Lake. The reports show year-over-year values.
Users report that the performance of some of the reports has degraded over time and some visuals show errors.
You need to resolve the performance issues. The solution must meet the following requirements:
Provide the best query performance.
Minimize operational costs.
What should you do?
Answer: C
Explanation:
C. Enable V-Order.
V-Order is a write-time optimization applied to the Parquet files that back Delta tables. It applies sorting, row-group distribution, dictionary encoding, and compression so that Fabric compute engines and Direct Lake semantic models can read the data faster, which directly addresses the degraded performance of the reports on the growing fact table. Because V-Order improves compression and read speed rather than adding resources, it also minimizes operational costs and avoids expensive capacity scaling.
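For context, V-Order takes effect when Delta or Parquet data is written. The following minimal PySpark sketch shows the Spark-side session setting for Delta tables written from a Fabric notebook; the property name, source path, and table name are assumptions for illustration, and V-Order on the warehouse itself is managed separately.
# Minimal sketch (Fabric notebook, PySpark); "spark" is provided by the notebook runtime.
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")   # assumed Fabric Spark setting name
df = spark.read.parquet("Files/staging/fact_sales")          # hypothetical source files
df.write.format("delta").mode("overwrite").saveAsTable("FactSales_VOrdered")  # hypothetical table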
Question: 6 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains a warehouse named DW1. DW1 contains the following tables and
columns.
You need to create an output that presents the summarized values of all the order quantities by year and product.
The results must include a summary of the order quantities at the year level for all the products.
How should you complete the code? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Key details:
GROUP BY ROLLUP produces the year-level subtotal rows in addition to the per-year, per-product rows, which satisfies the requirement for summarized order quantities at different grouping levels.
SUM(SO.OrderQty) calculates the total order quantities.
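The completed answer is T-SQL, but the same grouping logic can be sketched in PySpark to show what ROLLUP produces. The table and column names below are hypothetical stand-ins for the exhibit, which is not reproduced here.
from pyspark.sql import functions as F

sales = spark.table("SalesOrders")                    # hypothetical sales order table
summary = (sales
    .rollup("OrderYear", "ProductName")               # per-product rows plus year-level subtotal rows
    .agg(F.sum("OrderQty").alias("TotalOrderQty"))
    .orderBy("OrderYear", "ProductName"))
summary.show()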
Question: 7 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Data is ingested into Lakehouse1 as
one flat table. The table contains the following columns.
You plan to load the data into a dimensional model and implement a star schema. From the original flat table, you
create two tables named FactSales and DimProduct. You will track changes in DimProduct.
You need to prepare the data.
Which three columns should you include in the DimProduct table? Each correct answer presents part of the
solution.
NOTE: Each correct selection is worth one point.
A.Date
B.ProductName
C.ProductColor
D.TransactionID
E.SalesAmount
F.ProductID
Answer: BCF
Explanation:
B. ProductName: This attribute describes the product and is crucial for understanding and analyzing the data
related to each product.
C. ProductColor: This attribute provides additional information about the product, which can be useful for
analysis, reporting, and segmentation.
F. ProductID: This is the unique identifier for each product and serves as the primary key for the DimProduct
table. It's essential for establishing the relationship between the FactSales table and the DimProduct table.
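A minimal PySpark sketch of the split described above; the flat table name is hypothetical, and the columns match the option list in the question.
flat = spark.table("SalesFlat")   # hypothetical name for the ingested flat table

# Product dimension: one row per product with its descriptive attributes.
dim_product = (flat
    .select("ProductID", "ProductName", "ProductColor")
    .dropDuplicates())

# Fact table: transactional measures plus the ProductID foreign key.
fact_sales = flat.select("Date", "TransactionID", "ProductID", "SalesAmount")

dim_product.write.format("delta").mode("overwrite").saveAsTable("DimProduct")
fact_sales.write.format("delta").mode("overwrite").saveAsTable("FactSales")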
Question: 8 CertyIQ
You have a Fabric workspace named Workspace1 that contains a notebook named Notebook1.
In Workspace1, you create a new notebook named Notebook2.
You need to ensure that you can attach Notebook2 to the same Apache Spark session as Notebook1.
What should you do?
Answer: A
Explanation:
A.Enable high concurrency for notebooks: High concurrency allows multiple notebooks to share the same
Apache Spark session. This setting ensures that different notebooks can run simultaneously within the same
session, facilitating collaboration and efficient resource usage.
Question: 9 CertyIQ
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1. Lakehouse1
contains the following tables:
Orders -
Customer -
Employee -
The Employee table contains Personally Identifiable Information (PII).
A data engineer is building a workflow that requires writing data to the Customer table, however, the user does
NOT have the elevated permissions required to view the contents of the Employee table.
You need to ensure that the data engineer can write data to the Customer table without reading data from the
Employee table.
Which three actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
Answer: DEF
Explanation:
D. Assign the Contributor role to the data engineer for Workspace1:
Assigning the Contributor role to the data engineer for Workspace1 grants them the necessary permissions to write data to the Customer table in Lakehouse1. However, since the data engineer does not have elevated permissions to view the Employee table, they won't be able to access its content.
E. Migrate the Employee table from Lakehouse1 to Lakehouse2:
Moving the Employee table, which contains Personally Identifiable Information (PII), to a separate Lakehouse2
helps ensure that the data engineer cannot accidentally or intentionally access it. This action keeps sensitive
data segregated from the data engineer's operational environment.
F. Create a new workspace named Workspace2 that contains a new lakehouse named Lakehouse2:
By creating a new workspace and lakehouse for the Employee table, you further isolate the sensitive data.
The data engineer can still perform their tasks in Workspace1 without accessing Workspace2, ensuring secure
data handling and compliance with privacy requirements.
Question: 10 CertyIQ
You have a Fabric warehouse named DW1. DW1 contains a table that stores sales data and is used by multiple sales
representatives.
You plan to implement row-level security (RLS).
You need to ensure that the sales representatives can see only their respective data.
Which warehouse object do you require to implement RLS?
A.STORED PROCEDURE
B.CONSTRAINT
C.SCHEMA
D.FUNCTION
Answer: D
Explanation:
To implement row-level security (RLS) in a Fabric warehouse such as DW1, you need a FUNCTION to define the filtering logic. Specifically, an inline table-valued function is created as the security predicate and referenced by a security policy to determine which rows each user can access.
Reference:
https://learn.microsoft.com/en-us/fabric/data-warehouse/tutorial-row-level-security#2-define-security-policies
Question: 11 CertyIQ
HOTSPOT -
You have a Fabric workspace named Workspace1_DEV that contains the following items:
10 reports
Four notebooks -
Three lakehouses -
Answer: No/Yes/No
Explanation:
1. Data from the semantic models will be deployed to the target stage.
Answer: No
Semantic models are only deployed to the target stage in the form of metadata. The deployment process
does not copy actual data; instead, only the structural and configuration metadata (e.g., model schema and
measures) is deployed. The target stage will require a refresh to fetch the data into the semantic models.
Reference: Microsoft Learn - Item Properties Copied During Deployment
2. The Dataflow Gen1 dataflows will be deployed to the target stage.
Answer: Yes
Dataflow Gen1 objects are included in the deployment pipeline and are fully deployed to the target stage, including their configurations. This ensures that the Dataflow Gen1 dataflows can run in the target environment without requiring manual configuration.
3. The refresh schedules will be deployed to the target stage.
Answer: No
The deployment process does not copy or deploy refresh schedules for datasets, semantic models, or other items. Although metadata for the items is deployed, refresh schedules must be manually recreated or configured in the target stage. This limitation is highlighted in Microsoft's documentation.
Reference: Microsoft Learn - Item Properties Copied During Deployment
Question: 12 CertyIQ
You have a Fabric deployment pipeline that uses three workspaces named Dev, Test, and Prod.
You need to deploy an eventhouse as part of the deployment process.
What should you use to add the eventhouse to the deployment process?
A.GitHub Actions
B.a deployment pipeline
C.an Azure DevOps pipeline
Answer: B
Explanation:
B. a deployment pipeline.
Deployment Pipeline: In Microsoft Fabric, a deployment pipeline is specifically designed for managing and
deploying resources across different environments (Dev, Test, and Prod). It allows you to automate the
deployment process, ensuring consistency and efficiency. By using a deployment pipeline, you can easily
include the eventhouse in your deployment process and manage its promotion through the different stages
(Dev, Test, Prod).
Reference:
https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/get-started-with-deployment-pipelines?tabs=from-fabric%2Cnew%2Cstage-settings-new
https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/understand-the-deployment-process?tabs=new
Question: 13 CertyIQ
You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse1.
You plan to deploy Warehouse1 to a new workspace named Workspace2.
As part of the deployment process, you need to verify whether Warehouse1 contains invalid references. The
solution must minimize development effort.
What should you use?
Answer: B
Explanation:
Microsoft Fabric's deployment pipelines provide a built-in mechanism to manage and validate the deployment
of artifacts like warehouses. When you use a deployment pipeline to move Warehouse1 from one workspace
(Workspace1) to another (Workspace2), the pipeline automatically checks for issues such as invalid
references or missing dependencies during the deployment process.
Question: 14 CertyIQ
You have a Fabric workspace that contains a Real-Time Intelligence solution and an eventhouse.
Users report that from OneLake file explorer, they cannot see the data from the eventhouse.
You enable OneLake availability for the eventhouse.
What will be copied to OneLake?
A.only data added to new databases that are added to the eventhouse
B.only the existing data in the eventhouse
C.no data
D.both new data and existing data in the eventhouse
E.only new data added to the eventhouse
Answer: D
Explanation:
When you enable OneLake availability for the eventhouse, all existing data in the eventhouse is copied to
OneLake, ensuring that users have access to historical data. Additionally, any new data added to the
eventhouse after enabling OneLake availability will also be synchronized and accessible through OneLake.
This ensures seamless integration of past and future data for users leveraging OneLake file explorer.
Question: 15 CertyIQ
You have a Fabric workspace named Workspace1.
You plan to integrate Workspace1 with Azure DevOps.
You will use a Fabric deployment pipeline named deployPipeline1 to deploy items from Workspace1 to higher
environment workspaces as part of a medallion architecture. You will run deployPipeline1 by using an API call from
an Azure DevOps pipeline.
You need to configure API authentication between Azure DevOps and Fabric.
Which type of authentication should you use?
A.service principal
B.Microsoft Entra username and password
C.managed private endpoint
D.workspace identity
Answer: A
Explanation:
A. service principal.
Service Principal: A service principal is a security identity used by applications, services, and automation tools
to access specific Azure resources. It provides a secure way to authenticate and authorize API calls between
Azure DevOps and Fabric. By using a service principal, you can grant the necessary permissions to
deployPipeline1 to interact with the Fabric workspace (Workspace1) and deploy items to higher environments.
This approach ensures secure and managed access without relying on individual user credentials.
Question: 16 CertyIQ
You have a Google Cloud Storage (GCS) container named storage1 that contains the files shown in the following
table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the shortcuts shown in the following table.
A.Stores only
B.Products only
C.Stores and Products only
D.Products, Stores, and Trips
E.Trips only
F.Products and Trips only
Answer: C
Explanation:
When the cache for shortcuts is enabled in a Fabric workspace, files read through external shortcuts (such as GCS shortcuts) are cached in OneLake after they are first accessed, and subsequent reads within the cache retention period are served from the cache instead of the remote storage. Whether the data for a given shortcut is retrieved from the cache therefore depends on whether its files have been accessed since the cache was enabled and whether the retention period has elapsed.
Question: 17 CertyIQ
You have a Fabric workspace named Workspace1 that contains an Apache Spark job definition named Job1.
You have an Azure SQL database named Source1 that has public internet access disabled.
You need to ensure that Job1 can access the data in Source1.
What should you create?
Answer: B
Explanation:
Managed Private Endpoint: This allows secure and private communication between Azure services without
exposing data to the public internet. By creating a managed private endpoint, you can establish a direct
connection between the Apache Spark job in Workspace1 and the Azure SQL database (Source1) while
keeping public internet access disabled. This approach ensures that data transfer happens securely within the
Azure network.
To ensure that Job1 can access the data in Source1, you need to create a managed private endpoint. This will
allow the Spark job to securely connect to the Azure SQL database without requiring public internet access.
Question: 18 CertyIQ
You have an Azure Data Lake Storage Gen2 account named storage1 and an Amazon S3 bucket named storage2.
You have the Delta Parquet files shown in the following table.
You have a Fabric workspace named Workspace1 that has the cache for shortcuts enabled. Workspace1 contains a
lakehouse named Lakehouse1. Lakehouse1 has the following shortcuts:
A shortcut to ProductFile aliased as Products
A shortcut to StoreFile aliased as Stores
A shortcut to TripsFile aliased as Trips
The data from which shortcuts will be retrieved from the cache?
Answer: B
Explanation:
When the cache for shortcuts is enabled in a Fabric workspace, files read through eligible external shortcuts (such as Amazon S3 shortcuts) are cached in OneLake after they are first accessed. Subsequent reads within the cache retention period are served from the local cache instead of the original storage location, which improves performance.
Reference:
https://learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts
Question: 19 CertyIQ
HOTSPOT -
You have a Fabric workspace named Workspace1 that contains the items shown in the following table.
For Model1, the Keep your Direct Lake data up to date option is disabled.
You need to configure the execution of the items to meet the following requirements:
Notebook1 must execute every weekday at 8:00 AM.
Notebook2 must execute when a file is saved to an Azure Blob Storage container.
Model1 must refresh when Notebook1 has executed successfully.
How should you orchestrate each item? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Question: 20 CertyIQ
A.workspace Admin
B.domain admin
C.domain contributor
D.Fabric admin
Answer: D
Explanation:
Fabric Admin: Possesses the highest level of permissions within the Fabric environment, enabling the creation
of domains and subdomains, as well as the assignment of resources to those subdomains.
Question: 21 CertyIQ
You have a Fabric workspace named Workspace1 that contains a warehouse named DW1 and a data pipeline named
Pipeline1.
You plan to add a user named User3 to Workspace1.
You need to ensure that User3 can perform the following actions:
View all the items in Workspace1.
Update the tables in DW1.
The solution must follow the principle of least privilege.
You already assigned the appropriate object-level permissions to DW1.
Which workspace role should you assign to User3?
A.Admin
B.Member
C.Viewer
D.Contributor
Answer: B
Explanation:
Member: This role allows users to view and interact with all the items in the workspace. When combined with
the already assigned object-level permissions to DW1, it ensures that User3 can update the tables in DW1.
Question: 22 CertyIQ
You have a Fabric capacity that contains a workspace named Workspace1. Workspace1 contains a lakehouse
named Lakehouse1, a data pipeline, a notebook, and several Microsoft Power BI reports.
A user named User1 wants to use SQL to analyze the data in Lakehouse1.
You need to configure access for User1. The solution must meet the following requirements:
Provide User1 with read access to the table data in Lakehouse1.
Prevent User1 from using Apache Spark to query the underlying files in Lakehouse1.
Prevent User1 from accessing other items in Workspace1.
What should you do?
A.Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.
B.Assign User1 the Viewer role for Workspace1. Share Lakehouse1 with User1 and select Read all SQL endpoint
data.
C.Share Lakehouse1 with User1 directly and select Build reports on the default semantic model.
D.Assign User1 the Member role for Workspace1. Share Lakehouse1 with User1 and select Read all SQL
endpoint data.
Answer: A
Explanation:
A. Share Lakehouse1 with User1 directly and select Read all SQL endpoint data.
Share Lakehouse1 with User1 directly and select Read all SQL endpoint data: This approach grants User1
read access specifically to the table data in Lakehouse1 through the SQL endpoint, without giving them
broader permissions in Workspace1 or access to other items. By directly sharing Lakehouse1 and selecting the
"Read all SQL endpoint data" option, you ensure User1 can use SQL to analyze the data while preventing them
from using Apache Spark to query the underlying files.
Question: 23 CertyIQ
DRAG DROP -
You are implementing the following data entities in a Fabric environment:
Entity1: Available in a lakehouse and contains data that will be used as a core organization entity
Entity2: Available in a semantic model and contains data that meets organizational standards
Entity3: Available in a Microsoft Power BI report and contains data that is ready for sharing and reuse
Entity4: Available in a Power BI dashboard and contains approved data for executive-level decision making
Your company requires that specific governance processes be implemented for the data.
You need to apply endorsement badges to the entities based on each entity’s use case.
Which badge should you apply to each entity? To answer, drag the appropriate badges to the correct entities. Each
badge may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll
to view content.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
1. Master data.
Refers to authoritative data that is central to business operations, often stored in a master data management system. Assigned to Entity1, the core organization entity.
2. Certified.
Indicates that an entity (such as a dataset or report) has been officially validated by an authority in the organization. Assigned to Entity2, which contains data that meets organizational standards.
3. Promoted.
Indicates that an entity is recommended for use but is not fully certified. This badge is usually applied when an item is considered useful but has not gone through a formal approval process. Assigned to Entity3, which is endorsed for use but not yet fully certified.
4. Cannot be endorsed.
Indicates that an entity does not qualify for endorsement (either Promoted or Certified). Assigned to Entity4, meaning it has not met the standards for endorsement.
Question: 24 CertyIQ
HOTSPOT -
You have three users named User1, User2, and User3.
You have the Fabric workspaces shown in the following table.
You have a security group named Group1 that contains User1 and User3.
The Fabric admin creates the domains shown in the following table.
Explanation:
The "Yes" option is selected, meaning User3 does have Viewer access to Workspace3.
The Viewer role allows read-only access to the workspace but does not permit modifications.
The "Yes" option is selected, meaning User3 has Domain Contributor permissions in Domain1.
The Domain Contributor role typically allows managing content within a domain but does not grant full admin
rights.
The "No" option is selected, meaning User2 does NOT have Contributor access to Workspace3.
The Contributor role would allow editing content in the workspace, but since "No" is selected, User2 lacks
these permissions.
Question: 25 CertyIQ
You have two Fabric workspaces named Workspace1 and Workspace2.
You have a Fabric deployment pipeline named deployPipeline1 that deploys items from Workspace1 to
Workspace2. DeployPipeline1 contains all the items in Workspace1.
You recently modified the items in Workspace1.
The workspaces currently contain the items shown in the following table.
Items in Workspace1 that have the same name as items in Workspace2 are currently paired.
You need to ensure that the items in Workspace1 overwrite the corresponding items in Workspace2. The solution
must minimize effort.
What should you do?
Answer: D
Explanation:
When items in Workspace1 and Workspace2 are paired and you run the deployment pipeline (deployPipeline1),
the pipeline will automatically update the paired items in Workspace2 with the changes made in Workspace1.
This means that the modifications in Workspace1 will overwrite the corresponding items in Workspace2
without requiring any additional steps.
Question: 26 CertyIQ
You have a Fabric workspace named Workspace1 that contains a data pipeline named Pipeline1 and a lakehouse
named Lakehouse1.
You have a deployment pipeline named deployPipeline1 that deploys Workspace1 to Workspace2.
You restructure Workspace1 by adding a folder named Folder1 and moving Pipeline1 to Folder1.
You use deployPipeline1 to deploy Workspace1 to Workspace2.
What occurs to Workspace2?
Answer: A
Explanation:
A. Folder1 is created, Pipeline1 moves to Folder1, and Lakehouse1 is deployed.
Folder1 is created: The deployment pipeline will replicate the structure of Workspace1 in Workspace2,
including the creation of Folder1.
Pipeline1 moves to Folder1: Since Pipeline1 was moved to Folder1 in Workspace1, it will be deployed to Folder1
in Workspace2.
Lakehouse1 is deployed: Lakehouse1 is part of Workspace1 and will be deployed to Workspace2 as part of the
deployment process.
Question: 27 CertyIQ
DRAG DROP -
Your company has a team of developers. The team creates Python libraries of reusable code that is used to
transform data.
You create a Fabric workspace named Workspace1 that will be used to develop extract, transform, and load (ETL)
solutions by using notebooks.
You need to ensure that the libraries are available by default to new notebooks in Workspace1.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of
actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
1. Create an environment. An environment is where libraries, dependencies, and Spark configurations are managed.
2. Install the required libraries in the environment so that the reusable Python packages are available for development and execution.
3. Set the environment as the default environment for Workspace1 so that new notebooks attach to it automatically.
Question: 28 CertyIQ
You have a Fabric workspace that contains a lakehouse and a notebook named Notebook1. Notebook1 reads data
into a DataFrame from a table named Table1 and applies transformation logic. The data from the DataFrame is then
written to a new Delta table named Table2 by using a merge operation.
You need to consolidate the underlying Parquet files in Table1.
Which command should you run?
A.VACUUM
B.BROADCAST
C.OPTIMIZE
D.CACHE
Answer: C
Explanation:
OPTIMIZE: This command compacts small files into larger ones and optimizes the layout of data in a Delta table. Running OPTIMIZE on Table1 consolidates the underlying Parquet files and improves the performance of subsequent read and write operations on the table.
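A minimal sketch of running the command from a Fabric notebook; the Spark SQL call below illustrates the OPTIMIZE/VACUUM distinction and assumes Table1 is a table in the attached lakehouse.
# Compact the small Parquet files that back Table1 into larger files.
spark.sql("OPTIMIZE Table1")

# By contrast, VACUUM only removes files that are no longer referenced by the Delta log
# after the retention period; it does not consolidate the remaining files.
# spark.sql("VACUUM Table1")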
Question: 29 CertyIQ
You have five Fabric workspaces.
You are monitoring the execution of items by using Monitoring hub.
You need to identify in which workspace a specific item runs.
Which column should you view in Monitoring hub?
A.Start time
B.Capacity
C.Activity name
D.Submitter
E.Item type
F.Job type
G.Location
Answer: G
Explanation:
Location: This column displays the workspace where the item is being executed, helping you pinpoint the
exact workspace of the item.
Reference:
https://learn.microsoft.com/en-us/training/modules/monitor-fabric-items/3-use-monitor-hub
Question: 30 CertyIQ
You have a Fabric workspace that contains a warehouse named DW1. DW1 is loaded by using a notebook named
Notebook1.
You need to identify which version of Delta was used when Notebook1 was executed.
What should you use?
A.Real-Time hub
B.OneLake data hub
C.the Admin monitoring workspace
D.Fabric Monitor
E.the Microsoft Fabric Capacity Metrics app
Answer: D
Explanation:
D. Fabric Monitor.
Fabric Monitor: This tool provides detailed monitoring and logging capabilities for various components within
a Fabric workspace, including notebooks and data processing tasks. By using Fabric Monitor, you can track
and analyze the execution details of Notebook1, including the version of Delta used during its execution. This
information is crucial for debugging, auditing, and ensuring compatibility across different versions of Delta.
Question: 31 CertyIQ
DRAG DROP -
You have a Fabric workspace that contains a warehouse named Warehouse1.
In Warehouse1, you create a table named DimCustomer by running the following statement.
You need to set the Customerkey column as a primary key of the DimCustomer table.
Which three code segments should you run in sequence? To answer, move the appropriate code segments from
the list of code segments to the answer area and arrange them in the correct order.
Answer:
Explanation:
ALTER TABLE is required because adding or dropping a primary key constraint modifies the table definition.
The constraint is declared as a NONCLUSTERED primary key, meaning the physical ordering of the data is not changed.
NOT ENFORCED is required because primary key constraints in a Fabric warehouse (as in other data warehousing engines such as Azure Synapse Analytics) are not enforced; they are declared for modeling and query optimization while allowing faster data ingestion.
Question: 32 CertyIQ
You have a Fabric workspace that contains a semantic model named Model1.
You need to dynamically execute and monitor the refresh progress of Model1.
What should you use?
Answer: D
Explanation:
Semantic link in a notebook: This approach allows you to dynamically execute operations and monitor the
refresh progress of the semantic model (Model1) within the interactive and flexible environment of a
notebook. By using a semantic link, you can write custom scripts to trigger the refresh process and track its
progress in real-time. This method provides a high degree of control and visibility over the operations on your
semantic model.
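A minimal sketch of the semantic-link approach, assuming the sempy.fabric helpers refresh_dataset and list_refresh_requests are available in the notebook environment (check the semantic link documentation for the exact signatures). Model1 comes from the question; everything else is illustrative.
import sempy.fabric as fabric   # semantic link (sempy) ships with Fabric notebooks

# Trigger a refresh of the semantic model (helper name assumed; returns a request id).
request_id = fabric.refresh_dataset(dataset="Model1")

# Monitor the refresh progress by inspecting the refresh requests for the model (helper name assumed).
display(fabric.list_refresh_requests(dataset="Model1"))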
Question: 33 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: B
Explanation:
The answer is B (No) because sort by sorts values in descending order by default (https://learn.microsoft.com/en-us/kusto/query/sort-operator?view=microsoft-fabric). The query must specify asc to sort the values as required. The duplicated project operator at the end does not affect the final result.
Question: 34 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: B
Explanation:
The default sorting order in KQL is descending (desc), not ascending (asc).
The solution does not explicitly specify asc in the order by clause, so the results will be sorted in descending
order by default.
The requirement is to sort the data by No_Bikes in ascending order, which is not achieved without explicitly
specifying asc.
A. Yes: This would be incorrect because the solution fails to meet the requirement of sorting in ascending
order due to the default descending behavior in KQL.
Important Tip:
Always explicitly specify the sorting order (asc or desc) in KQL to avoid confusion, especially since its default
behavior differs from SQL.
Question: 35 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: A
Explanation:
The provided code segment correctly filters the data for the neighborhood "Sands End" where the number of
bikes (No_Bikes) is at least 15. It then explicitly sorts the results by No_Bikes in ascending order using sort by
No_Bikes asc and projects the required columns (BikepointID, Street, Neighbourhood, No_Bikes,
No_Empty_Docks, Timestamp). This meets all the stated goals of the problem.
B. No: This would be incorrect because the solution explicitly specifies asc in the sort by clause, ensuring the
data is ordered by No_Bikes in ascending order as required.
Important Tip:
Always ensure that the sorting order is explicitly specified in KQL to match the requirements, as the default
behavior might differ from other query languages like SQL.
Reference:
https://learn.microsoft.com/en-us/kusto/query/sort-operator?view=microsoft-fabric
Question: 36 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a Fabric eventstream that loads data into a table named Bike_Location in a KQL database. The table
contains the following columns:
BikepointID -
Street -
Neighbourhood -
No_Bikes -
No_Empty_Docks -
Timestamp -
You need to apply transformation and filter logic to prepare the data for consumption. The solution must return
data for a neighbourhood named Sands End when No_Bikes is at least 15. The results must be ordered by No_Bikes
in ascending order.
A.Yes
B.No
Answer: B
Explanation:
The provided solution uses SQL syntax (SELECT, FROM, WHERE, ORDER BY), but the scenario specifies that
the data is in a KQL (Kusto Query Language) database. KQL and SQL have different syntax and functions. The
correct KQL syntax should be used to filter and sort the data in a KQL database.
A. Yes: This would be incorrect because the solution uses SQL syntax instead of KQL, which is not applicable
in this context.
Important Tip:
Always use the appropriate query language for the database you are working with. In this case, KQL should be
used instead of SQL to interact with the KQL database. The correct KQL query would use filter, sort by, and
project as shown in previous examples.
Question: 37 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
•Sales Date
•Author
•Price
•Units
•SKU
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
•Sales
•Fabric Admins
•Streaming Admins
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.
You need to ensure that processes for the bronze and silver layers run in isolation.
Answer: B
Explanation:
While disabling high concurrency (Option A) might seem like it isolates processes, it's not the recommended
approach for managing isolation in layered architectures like bronze and silver. By creating a custom pool
(Option B), you can allocate dedicated resources to each layer, ensuring they run independently without
interfering with one another. Custom pools give you fine-grained control over resource allocation, making
them the ideal solution for this scenario.
Question: 38 CertyIQ
DRAG DROP
-
Case Study
-
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview
-
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
•Sales Date
•Author
•Price
•Units
•SKU
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
•Sales
•Fabric Admins
•Streaming Admins
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.
You need to ensure that the authors can see only their respective sales data.
How should you complete the statement? To answer, drag the appropriate values to the correct targets. Each value
may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view
content.
Answer:
Explanation:
SCHEMABINDING: Ensures the function is bound to the schema of the referenced objects. Required for RLS
functions.
Question: 39 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have an Azure key vault named KeyVault1 that contains secrets.
You have a Fabric workspace named Workspace1. Workspace1 contains a notebook named Notebook1 that performs
the following tasks:
You plan to add functionality to Notebook1 that will use the Fabric API to monitor the semantic model refreshes.
You need to retrieve the registered application ID and secret from KeyVault1 to generate the authentication token.
Solution: Use notebookutils.credentials.getSecret and specify the key vault URL and the key vault secret.
A.Yes
B.No
Answer: B
Explanation:
The method notebookutils.credentials.getSecret() in Microsoft Fabric does not accept a Key Vault URL.
Instead, it requires the name of a linked service (which securely points to the Key Vault) and the name of the
secret.
Question: 40 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have an Azure key vault named KeyVault1 that contains secrets.
You have a Fabric workspace named Workspace1. Workspace1 contains a notebook named Notebook1 that performs
the following tasks:
You plan to add functionality to Notebook1 that will use the Fabric API to monitor the semantic model refreshes.
You need to retrieve the registered application ID and secret from KeyVault1 to generate the authentication token.
Solution: You use the following code segment:
Use notebookutils.credentials.putSecret and specify the key vault URL and key vault secret.
A.Yes
B.No
Answer: B
Explanation:
You need to retrieve the registered application ID and secret from KeyVault1 to generate the authentication
token. The function notebookutils.credentials.putSecret is used to store a secret into a secret scope — not
retrieve it.
Question: 41 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have an Azure key vault named KeyVault1 that contains secrets.
You have a Fabric workspace named Workspace1. Workspace1 contains a notebook named Notebook1 that performs
the following tasks:
You plan to add functionality to Notebook1 that will use the Fabric API to monitor the semantic model refreshes.
You need to retrieve the registered application ID and secret from KeyVault1 to generate the authentication token.
Solution: Use notebookutils.credentials.getSecret and specify the key vault URL and the name of a linked service.
A.Yes
B.No
Answer: A
Explanation:
A. Yes.
The notebookutils.credentials.getSecret function is designed to retrieve secrets from Azure Key Vault in a
Fabric environment. By specifying the key vault URL and the name of a linked service, you can successfully
access the registered application ID and secret stored in KeyVault1. This method ensures secure retrieval and
meets the goal for generating the authentication token.
Question: 42 CertyIQ
DRAG DROP
-
You have two Fabric notebooks named Load_Salesperson and Load_Orders that read data from Parquet files in a
lakehouse. Load_Salesperson writes to a Delta table named dim_salesperson. Load_Orders writes to a Delta table
named fact_orders and is dependent on the successful execution of Load_Salesperson.
You need to implement a pattern to dynamically execute Load_Salesperson and Load_Orders in the appropriate
order by using a notebook.
How should you complete the code? To answer, drag the appropriate values to the correct targets. Each value may
be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view
content.
Answer:
Explanation:
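The completed code is not reproduced here, but the pattern it describes can be sketched with the Fabric notebook utilities: run Load_Salesperson first and run Load_Orders only after it completes successfully. The notebook names come from the question; the timeout values are illustrative.
# Orchestration notebook (PySpark). notebookutils is available in Fabric notebooks.
notebookutils.notebook.run("Load_Salesperson", 600)   # 600-second timeout (illustrative)

# Reaching this line means the previous call returned without raising an exception,
# so dim_salesperson is loaded before fact_orders.
notebookutils.notebook.run("Load_Orders", 600)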
Question: 43 CertyIQ
HOTSPOT
-
You have a Fabric workspace named Workspace1 that contains a warehouse named Warehouse2.
How should you complete the statement? To answer, select the appropriate options in the answer area.
Explanation:
partial(prefix, padding, suffix): exposes the first and last n characters of a text field and replaces the middle with a custom padding string such as 'xxx'.
Question: 44 CertyIQ
HOTSPOT
-
You are building a data orchestration pattern by using a Fabric data pipeline named Dynamic Data Copy as shown
in the exhibit. (Click the Exhibit tab.)
Dynamic Data Copy does NOT use parametrization.
You need to configure the ForEach activity to receive the list of tables to be copied.
How should you complete the pipeline expression? To answer, select the appropriate options in the answer area.
Answer:
Explanation:
A Lookup activity is typically used to retrieve (or "look up") data from a data source (like a database), for
example, fetching a list of tables, schemas, or specific records.
output.value
The Lookup activity returns its results inside the value property of its output. This is where the actual data
retrieved by the lookup is stored, typically an array that can then be consumed by other activities, such as the
items list of a ForEach loop or a Copy activity.
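A hedged sketch of the ForEach Items expression, assuming the preceding Lookup activity is named Lookup1 (the activity name is an assumption):

@activity('Lookup1').output.value

Inside the loop, each iteration can then reference the current entry with an expression such as @item().TableName, where TableName is a hypothetical property returned by the lookup.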
Question: 45 CertyIQ
HOTSPOT
-
You have a Fabric workspace that contains a warehouse named Warehouse1. Warehouse1 contains a table named
DimCustomers. DimCustomers contains the following columns:
•CustomerName
•CustomerID
•BirthDate
•EmailAddress
How should you complete the statement? To answer, select the appropriate options in the answer area.
Answer:
Explanation:
default() replaces the actual value with a fixed masking value that is based on the column's data type (for
example, xxxx for string columns and 1900-01-01 for date columns).
partial(1, "@", 5) exposes the first character and the last five characters of a string column and replaces the
characters in between with the @ padding string.
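A hedged T-SQL sketch of the two masking functions, assuming they are applied to the BirthDate and EmailAddress columns of DimCustomers (the column-to-function mapping is an assumption based on the column data types):

-- Replace BirthDate with the type-based default mask.
ALTER TABLE dbo.DimCustomers
ALTER COLUMN BirthDate ADD MASKED WITH (FUNCTION = 'default()');

-- Expose the first character and last five characters of EmailAddress; pad the middle with @.
ALTER TABLE dbo.DimCustomers
ALTER COLUMN EmailAddress ADD MASKED WITH (FUNCTION = 'partial(1, "@", 5)');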
Question: 46 CertyIQ
You have a Fabric workspace named Workspace1 that contains the following items:
Your company requires that specific governance processes be implemented for the items.
Which items can you endorse in Fabric?
Answer: B
Explanation:
Semantic model: Yes. Semantic models are frequently used in Power BI and are key for analytics and reporting,
so they can be endorsed (promoted or certified).
Report: Yes. Reports can be promoted or certified to guide users toward reliable content.
Question: 47 CertyIQ
You have a Fabric workspace named Workspace1.
You need to configure source control for Workspace1 to use GitHub. The solution must follow the principle of least
privilege.
Which permissions do you require to ensure that you can commit code to GitHub?
Answer: C
Explanation:
To commit code to GitHub while adhering to the principle of least privilege, you need permissions limited to
Contents (Read and write) to access and update the repository's content. This ensures you can perform the
required actions without granting unnecessary permissions like Actions, which are not needed for committing
code.
Question: 48 CertyIQ
You have a Fabric workspace named Workspace1.
You plan to configure Git integration for Workspace1 by using an Azure DevOps Git repository.
An Azure DevOps admin creates the required artifacts to support the integration of Workspace1.
Answer: A
Explanation:
To configure Git integration for a Microsoft Fabric workspace with an Azure DevOps Git repository, you need
to provide details about the organization, project, Git repository, and branch. These details ensure that the
workspace is correctly linked to the desired repository and branch for version control and collaboration.
Question: 49 CertyIQ
You have a Fabric workspace that contains a lakehouse and a semantic model named Model1.
You use a notebook named Notebook1 to ingest and transform data from an external data source.
You need to execute Notebook1 as part of a data pipeline named Pipeline1. The process must meet the following
requirements:
Which three actions should you perform? Each correct answer presents part of the solution.
A.Place the Semantic model refresh activity after the Notebook activity and link the activities by using the On
success condition.
B.From the Schedule settings of Pipeline1, set the time zone to UTC.
C.Set the Retry setting of the Notebook activity to 2.
D.From the Schedule settings of Notebook1, set the time zone to UTC.
E.Set the Retry setting of the Semantic model refresh activity to 2.
F.Place the Semantic model refresh activity after the Notebook activity and link the activities by using an On
completion condition.
Answer: ABC
Explanation:
A. Place the Semantic model refresh activity after the Notebook activity and link the activities by using the On
success condition. B. From the Schedule settings of Pipeline1, set the time zone to UTC. C. Set the Retry
setting of the Notebook activity to 2.
A - On Success condition: Ensures proper sequencing. The semantic model refresh activity will only run if the
Notebook activity is successful.
B - Time zone setting for Pipeline1: By configuring the time zone to UTC, the scheduling of the pipeline
becomes consistent and clear across global teams or systems.
C - Retry setting for Notebook: Setting a retry count helps ensure robustness, as transient failures can
automatically trigger retries to avoid manual intervention.
Question: 50 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
You plan to create a data pipeline named Pipeline1 to ingest data into Lakehouse1. You will use a parameter named
param1 to pass an external value into Pipeline1. The param1 parameter has a data type of int.
You need to ensure that the pipeline expression returns param1 as an int value.
A."@pipeline().parameters.param1"
B."@ pipeline().parameters.param1 "
C."@ pipeline().parameters.[param1] "
D."@@ pipeline().parameters.param1 "
Answer: A
Explanation:
The expression "@pipeline().parameters.param1" returns the parameter in its native data type, so an int
parameter is returned as an int. The string interpolation form "@{pipeline().parameters.param1}" evaluates the
expression and then converts the result to a string, which does not meet the requirement.
Question: 51 CertyIQ
You have a Fabric workspace named Workspace1 that contains a lakehouse named Lakehouse1. Workspace1
contains the following items:
•A Dataflow Gen2 dataflow that copies data from an on-premises Microsoft SQL Server database to Lakehouse1
•A notebook that transforms files and loads the data to Lakehouse1
•A data pipeline that loads a CSV file to Lakehouse1
You need to develop an orchestration solution in Fabric that will load each item one after the other. The solution
must be scheduled to run every 15 minutes.
A.notebook
B.warehouse
C.Dataflow Gen2 dataflow
D.data pipeline
Answer: D
Explanation:
D. data pipeline.
A data pipeline is designed for orchestrating and scheduling workflows in Fabric. It enables you to load items
sequentially (one after the other) and can be set to run on a defined schedule, such as every 15 minutes. This
makes it the ideal choice for your requirement.
Question: 52 CertyIQ
You are building a Fabric notebook named MasterNotebook1 in a workspace. MasterNotebook1 contains the
following code.
You need to ensure that the notebooks are executed in the following sequence:
1. Notebook_03
2. Notebook_01
3. Notebook_02
Which two actions should you perform? Each correct answer presents part of the solution.
A.Move the declaration of Notebook_02 to the bottom of the Directed Acyclic Graph (DAG) definition.
B.Add dependencies to the execution of Notebook_03.
C.Split the Directed Acyclic Graph (DAG) definition into three separate definitions.
D.Add dependencies to the execution of Notebook_02.
E.Change the concurrency to 3.
F.Move the declaration of Notebook_03 to the top of the Directed Acyclic Graph (DAG) definition.
Answer: DF
Explanation:
D. Add dependencies to the execution of Notebook_02: In a runMultiple DAG definition, an activity starts only
after the activities listed in its dependencies have completed. Declaring a dependency for Notebook_02 (on
Notebook_01) ensures that it runs last, after Notebook_01.
F. Move the declaration of Notebook_03 to the top of the Directed Acyclic Graph (DAG) definition: Placing
Notebook_03 first aligns the declaration order with the intended execution order, so Notebook_03 is started
before Notebook_01 and Notebook_02.
Question: 53 CertyIQ
You have a Fabric workspace that contains a data pipeline named Pipeline1 as shown in the exhibit. (Click the
Exhibit tab.)
What will occur the next time Pipeline1 runs?
A.Copy_kdi will run first, and then Execute procedure1 will run.
B.Execute procedure1 will run first, and then Copy_kdi will run.
C.Execute procedure1 will run and Copy_kdi will be skipped.
D.Copy_kdi will run and Execute procedure1 will be skipped.
E.Both activities will run simultaneously.
F.Both activities will be skipped.
Answer: D
Explanation:
Based on the pipeline configuration shown in the exhibit, Copy_kdi will run and Execute procedure1 will be
skipped, typically because the Execute procedure1 activity is deactivated or because its dependency condition
is not satisfied when the pipeline runs.
Question: 54 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
POS1 contains a product list and related data. The data comes from the following three tables:
•Products
•ProductCategories
•ProductSubcategories
In the data, products are related to product subcategories, and subcategories are related to product categories.
Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.
•Lakehouse1: Will store both raw and cleansed data from the sources
•Lakehouse2: Will serve data in a dimensional model to users for analytical queries
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Items that relate to data ingestion must meet the following requirements:
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.
•The data engineers must have read and write access to all the lakehouses, including the underlying files.
•The data analysts must only have read access to the Delta tables in the gold layer.
•The data analysts must NOT have access to the data in the bronze and silver layers.
•The data engineers must be able to commit changes to source control in WorkspaceA.
You need to ensure that WorkspaceA can be configured for source control.
Which two actions should you perform? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
A.From Tenant setting, set Users can synchronize workspace items with their Git repositories to Enabled.
B.From Tenant setting, set Users can sync workspace items with GitHub repositories to Enabled.
C.Configure WorkspaceA to use a Premium Per User (PPU) license.
D.Assign WorkspaceA to Cap1.
Answer: AD
Explanation:
A. From Tenant setting, set Users can synchronize workspace items with their Git repositories to Enabled.
This tenant-level setting enables Git integration for workspaces. Because the repository is hosted in Azure
DevOps, the separate GitHub tenant setting is not required.
D. Assign WorkspaceA to Cap1. Git integration requires the workspace to be on a supported capacity, and the
Fabric items that WorkspaceA will contain (lakehouses, data pipelines, and notebooks) require a Fabric
capacity such as Cap1. A Premium Per User (PPU) license does not support these items.
Question: 55 CertyIQ
HOTSPOT
-
You have a Fabric workspace that contains a warehouse named Warehouse1. Warehouse1 contains a table named
Customer. Customer contains the following data.
You have an internal Microsoft Entra user named User1 that has an email address of [email protected].
You need to provide User1 with access to the Customer table. The solution must prevent User1 from accessing the
CreditCard column.
How should you complete the statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
GRANT: SELECT.
The SELECT permission allows a user to query data from a table or view. In this case, the permission is granted
on an explicit column list of the Customer table (every column except CreditCard), which implements
column-level security.
TO: [User1].
This is the correct syntax for referencing a database principal (such as a user or database role) that receives
the permission. The square brackets [] are T-SQL delimiters that ensure the identifier is treated literally.
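A hedged T-SQL sketch of the column-level grant, assuming hypothetical column names CustomerID, CustomerName, and EmailAddress alongside CreditCard, and assuming the principal is referenced by the user's UPN (the exact principal name depends on how User1 was added to the warehouse):

-- Grant SELECT only on the columns User1 may read; CreditCard is deliberately omitted.
GRANT SELECT ON dbo.Customer (CustomerID, CustomerName, EmailAddress)
TO [[email protected]];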
Question: 56 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
Sales Date -
Author -
Price -
Units -
SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:
Sales -
Fabric Admins -
Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Answer: B
Explanation:
B. Create a shortcut.
Create a Shortcut: Creating a shortcut in the lakehouse allows you to link to external data sources without
making a copy of the data. This means you can make the book reviews available in the lakehouse by creating a
shortcut to the location where the book reviews are stored. The data remains in its original location but is
accessible from the lakehouse, meeting the requirement of not duplicating the data.
Question: 57 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
Sales Date -
Author -
Price -
Units -
SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:
Sales -
Fabric Admins -
Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Answer: E
Explanation:
E. Configure incremental refresh for the dataflow. Set Refresh rows from the past to 1 Month.
This approach ensures minimal data transfer while keeping the refresh scope limited to the most recent and
relevant data (1 month), which is aligned with the requirement to minimize data transfer.
Question: 58 CertyIQ
HOTSPOT -
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
Answer:
Explanation:
The bronze layer is typically the raw data ingestion layer in a medallion architecture.
A Copy activity in a data pipeline is commonly used to ingest and store raw data in the bronze layer of the
lakehouse. This choice ensures efficient and scalable ingestion from various sources.
A notebook is then used to apply transformations, perform data validation, and enrich the raw ingested data.
This choice aligns with the goal of refining, structuring, and preparing the data before moving it to the gold
layer.
Question: 59 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
A.Create a workspace identity and enable high concurrency for the notebooks.
B.Create a shortcut and ensure that caching is disabled for the workspace.
C.Create a workspace identity and use the identity in a data pipeline.
D.Create a shortcut and ensure that caching is enabled for the workspace.
Answer: D
Explanation:
Enabling caching for the workspace will help minimize egress costs by reducing the amount of data that
needs to be transferred across clouds. Creating a shortcut ensures that the raw data is not duplicated in the
lakehouse.
Question: 60 CertyIQ
HOTSPOT -
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
Answer:
Explanation:
Joins: the first join is a LEFT OUTER JOIN so that all products are retained, even if they are not assigned to a
subcategory. The second join is an INNER JOIN so that categories and subcategories that are not linked to any
product are excluded, because they are not analytically relevant.
WHERE clause: IsActive = 1, so that only active products are included in the product dimension in the gold
layer. In the POS1 product data, ProductID values are unique, and categories and subcategories without
assigned products must be omitted to maintain analytical relevance.
Question: 61 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
A.ForEach
B.Copy data
C.WebHook
D.Stored procedure
Answer: AB
Explanation:
ForEach: This activity allows you to iterate over a collection of items and execute activities for each item. In
this context, it can be used to process multiple datasets or files within the bronze layer, ensuring that each
file is appropriately handled and transformed.
Copy Data: This activity is fundamental in pipelines for data movement. It enables you to copy data from a
source to a destination, such as moving data from a staging area to the bronze layer. The Copy Data activity
can read the MAR1 data from its source and write it to the bronze layer, ensuring the data is properly ingested.
Question: 62 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains a warehouse named Warehouse1. Warehouse1 contains the following
tables and columns.
You need to denormalize the tables and include the ContractType and StartDate columns in the Employee table.
The solution must meet the following requirements:
Ensure that the StartDate column is of the date data type.
Ensure that all the rows from the Employee table are preserved and include any matching rows from the Contract
table.
Ensure that the result set displays the total number of employees per contract type for all the contract types that
have more than two employees.
How should you complete the statement? To answer, select the appropriate options in the answer area.
The CONVERT function is used to explicitly convert data types in SQL Server.
A LEFT OUTER JOIN ensures all employees are included, even if they do not have a corresponding contract.
If some employees do not have contracts, this join type ensures they are still listed with NULL contract values.
HAVING is used because COUNT(DISTINCT EmployeeID) is an aggregate function, and aggregate functions
cannot be used in WHERE.
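A hedged T-SQL sketch that puts the three elements together, assuming hypothetical key columns EmployeeID and ContractID for the join:

-- Denormalized employee rows: all employees preserved, contract attributes added, StartDate cast to date.
SELECT
    e.EmployeeID,
    c.ContractType,
    CONVERT(date, c.StartDate) AS StartDate
FROM Employee AS e
LEFT OUTER JOIN Contract AS c
    ON e.ContractID = c.ContractID;

-- Employee counts per contract type, limited to contract types with more than two employees.
SELECT
    c.ContractType,
    COUNT(DISTINCT e.EmployeeID) AS EmployeeCount
FROM Employee AS e
LEFT OUTER JOIN Contract AS c
    ON e.ContractID = c.ContractID
GROUP BY c.ContractType
HAVING COUNT(DISTINCT e.EmployeeID) > 2;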
Question: 63 CertyIQ
HOTSPOT -
You have an Azure Event Hubs data source that contains weather data.
You ingest the data from the data source by using an eventstream named Eventstream1. Eventstream1 uses a
lakehouse as the destination.
You need to batch ingest only rows from the data source where the City attribute has a value of Kansas. The filter
must be added before the destination. The solution must minimize development effort.
What should you use for the data processor and filtering? To answer, select the appropriate options in the answer
area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Selecting "An eventstream with an external data source" means data is coming from an external system such
as IoT devices, logs, or real-time telemetry.
This is appropriate when dealing with real-time ingestion from sources like Azure Event Hubs, IoT Hub, or
Kafka.
2.Filtering: An eventstream processor.
Filtering in streaming systems typically happens during real-time data ingestion to remove irrelevant or
unnecessary events before further processing.
An eventstream processor can be used to apply transformations, filtering, and aggregations dynamically.
This ensures that only relevant data moves forward in the pipeline.
Question: 64 CertyIQ
You have a Fabric workspace that contains an eventstream named Eventstream1. Eventstream1 processes data
from a thermal sensor by using event stream processing, and then stores the data in a lakehouse.
You need to modify Eventstream1 to include the standard deviation of the temperature.
Which transform operator should you include in the Eventstream1 logic?
A.Expand
B.Group by
C.Union
D.Aggregate
Answer: B
Explanation:
The Group by transform operator contains the Standard deviation aggregation. The Aggregate transform
operator only contains Average, Max, Min and Sum aggregation.
Reference:
https://learn.microsoft.com/en-us/fabric/real-time-intelligence/event-streams/process-events-using-event-processor-editor?pivots=standard-capabilities#group-by
Question: 65 CertyIQ
You have an Azure event hub. Each event contains the following fields:
BikepointID -
Street -
Neighbourhood -
Latitude -
Longitude -
No_Bikes -
No_Empty_Docks -
You need to ingest the events. The solution must only retain events that have a Neighbourhood value of Chelsea,
and then store the retained events in a Fabric lakehouse.
What should you use?
Answer: B
Explanation:
B. an eventstream.
Eventstream: An eventstream is specifically designed for processing and managing events in real-time. It
allows you to filter, transform, and route events efficiently. In this scenario, you can configure the
eventstream to retain only the events where the Neighbourhood value is "Chelsea" and then store the filtered
events in a Fabric lakehouse. This approach ensures that only the relevant events are ingested, adhering to
the requirement to retain only specific events based on the Neighbourhood value.
Question: 66 CertyIQ
HOTSPOT -
You are building a data loading pattern for Fabric notebook workloads.
You have the following code segment:
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Statement 1: No. In many data loading strategies, especially incremental loads or merge operations, the target
table is not always overwritten. Instead, new data is appended, updated, or merged based on keys; overwriting
usually happens only in full refresh scenarios.
Statement 2: No. The merge operation (such as SQL MERGE or Delta Lake MERGE INTO) runs only when its
conditions are met, such as the presence of new or changed data. If there is no data to update or merge, it may
not execute, so it does not always run.
Statement 3: Yes. "The loading pattern supports both full and incremental loading requirements." A
well-designed loading pattern supports both: full loads replace the entire dataset, while incremental loads
append or update only changed records.
Question: 67 CertyIQ
HOTSPOT -
You have a Fabric workspace that contains two lakehouses named Lakehouse1 and Lakehouse2. Lakehouse1
contains staging data in a Delta table named Orderlines. Lakehouse2 contains a Type 2 slowly changing dimension
(SCD) dimension table named Dim_Customer.
You need to build a query that will combine data from Orderlines and Dim_Customer to create a new fact table
named Fact_Orders. The new table must meet the following requirements:
Enable the analysis of customer orders based on historical attributes.
Enable the analysis of customer orders based on the current attributes.
How should you complete the statement? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
The condition OrderDate >= the dimension row's valid-from date ensures that the order falls on or after the
start of that customer row's validity period.
The condition OrderDate < the dimension row's valid-to date ensures that the order occurred strictly before
the row expired, preventing orders from being matched to a customer row that was no longer in effect.
Together, these conditions match each order to the Type 2 dimension row that was current on the order date.
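A hedged T-SQL sketch of the Type 2 lookup, assuming hypothetical column names CustomerID, CustomerSK, ValidFrom, and ValidTo in Dim_Customer and CustomerID and OrderDate in Orderlines, and assuming both tables are reachable from the same SQL analytics endpoint:

SELECT
    o.*,
    c.CustomerSK  -- surrogate key of the dimension row that was current on the order date
FROM Orderlines AS o
JOIN Dim_Customer AS c
    ON o.CustomerID = c.CustomerID
    AND o.OrderDate >= c.ValidFrom
    AND o.OrderDate < c.ValidTo;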
Question: 68 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
In an external data source, you have data files that are 500 GB each. A new file is added every day.
You need to ingest the data into Lakehouse1 without applying any transformations. The solution must meet the
following requirements:
Trigger the process when a new file is added.
Provide the highest throughput.
Which type of item should you use to ingest the data?
A.Eventstream
B.Dataflow Gen2
C.Streaming dataset
D.Data pipeline
Answer: D
Explanation:
D. Data pipeline.
Data pipeline: A data pipeline is designed to handle large-scale data ingestion and movement efficiently. It
can be configured to automatically trigger the ingestion process when a new file is added to the external data
source, ensuring that the data is ingested into Lakehouse1 as soon as it becomes available. Data pipelines are
optimized for high throughput, making them suitable for handling large files (like the 500 GB files mentioned)
and ensuring the process is both fast and efficient.
Question: 69 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1.
In an external data source, you have data files that are 500 GB each. A new file is added every day.
You need to ingest the data into Lakehouse1 without applying any transformations. The solution must meet the
following requirements:
Trigger the process when a new file is added.
Provide the highest throughput.
Which type of item should you use to ingest the data?
A.Data pipeline
B.Environment
C.KQL queryset
D.Dataflow Gen2
Answer: A
Explanation:
A. Data pipeline.
Data Pipeline: Data pipelines in Fabric are designed for high-throughput data ingestion and can be triggered
automatically when new files are added to the external data source. They are optimized for moving large
volumes of data efficiently and can handle the ingestion of 500 GB files without applying transformations.
Question: 70 CertyIQ
You have a Fabric workspace that contains an eventhouse and a KQL database named Database1. Database1 has
the following:
A.
B.
C.
D.
Answer: BD
Explanation:
Record B loads because it conforms to the updated schema (string DeviceId, StreamData with temperature).
Record D loads because it conforms to the original schema (guid DeviceId, no temperature in StreamData).
Question: 71 CertyIQ
HOTSPOT -
You have a Fabric workspace.
You are debugging a statement and discover the following issues:
Sometimes, the statement fails to return all the expected rows.
The PurchaseDate output column is NOT in the expected format of mmm dd, yy.
You need to resolve the issues. The solution must ensure that the data types of the results are retained. The results
can contain blank cells.
How should you complete the statement? To answer, select the appropriate options in the answer area.
1. try_cast(item_name as varchar(20))
Purpose: Attempts to convert item_name to varchar(20). If the conversion fails, it returns NULL instead of
raising an error, so rows with unconvertible values are no longer lost.
2. convert(varchar, purchase_date, 7)
Purpose: Style 7 of the CONVERT function formats the date as Mon dd, yy (for example, Apr 03, 25), which
matches the expected mmm dd, yy output format.
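A hedged sketch of the corrected pattern, assuming a hypothetical table named Purchases that contains the item_name and purchase_date columns:

SELECT
    try_cast(item_name AS varchar(20)) AS ItemName,     -- returns NULL instead of failing the query
    convert(varchar, purchase_date, 7) AS PurchaseDate  -- style 7 produces Mon dd, yy, e.g. Apr 03, 25
FROM Purchases;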
Question: 72 CertyIQ
You are developing a data pipeline named Pipeline1.
You need to add a Copy data activity that will copy data from a Snowflake data source to a Fabric warehouse.
What should you configure?
Answer: C
Explanation:
Enable Staging: When copying data from a Snowflake data source to a Fabric warehouse, enabling staging
can significantly improve the efficiency and reliability of the data transfer process. Staging involves
temporarily storing the data in an intermediate location before loading it into the final destination. This
approach helps in handling large datasets and complex transformations, ensuring that the data is transferred
smoothly without interruptions. It also allows for more manageable and optimized data movement,
particularly when dealing with different data storage systems like Snowflake and Fabric.
Question: 73 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You change the join type to kind=outer.
Does this meet the goal?
A.Yes
B.No
Answer: B
Explanation:
No. An outer join can be more computationally intensive than an inner join because it needs to process all rows
from both tables and include rows that don't have matching entries.
Question: 74 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You change project to extend.
A.Yes
B.No
Answer: B
Explanation:
No. The `project` operator is used to select specific columns, whereas `extend` is used to add new calculated
columns to the result set. They serve different purposes.
Question: 75 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You move the filter to line 02.
A.Yes
B.No
Answer: A
Explanation:
Yes. By applying the `where` clause early in the query, you reduce the number of rows processed in
subsequent operations, which improves performance.
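A hedged KQL sketch of the optimization, assuming hypothetical column names DeviceId and EventTime in Stream and Location in Reference (the filter and join key are placeholders):

Stream
| where EventTime > ago(1h)              // filter early, before the join, to reduce the rows processed
| join kind=inner Reference on DeviceId
| project DeviceId, EventTime, Location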
Question: 76 CertyIQ
Note: This question is part of a series of questions that present the same scenario. Each question in the series
contains a unique solution that might meet the stated goals. Some question sets might have more than one correct
solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not
appear in the review screen.
You have a KQL database that contains two tables named Stream and Reference. Stream contains streaming data
in the following format.
You need to reduce how long it takes to run the KQL queryset.
Solution: You add the make_list() function to the output columns.
A.Yes
B.No
Answer: B
Explanation:
No. The `make_list()` function aggregates values into a list, which can be useful for certain types of analysis
but does not inherently improve query performance.
Question: 77 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
•Sales Date
•Author
•Price
•Units
•SKU
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
•Sales
•Fabric Admins
•Streaming Admins
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.
You need to create a workflow for the new book cover images.
Which two components should you include in the workflow? Each correct answer presents part of the solution.
Answer: CD
Explanation:
C. A blob storage action: This is essential for storing and managing the book cover images.
D. A data pipeline: This helps in processing and transferring the images efficiently.
Question: 78 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
•Sales Date
•Author
•Price
•Units
•SKU
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
•Sales
•Fabric Admins
•Streaming Admins
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.
What should you recommend that the data engineering team use to ingest the SEO data?
Answer: D
Explanation:
D. an eventstream.
Microsoft Fabric Eventstream is a modern tool designed specifically for real-time data ingestion and
processing scenarios in Fabric. It supports:
Near-real-time data capture from various sources (Azure Event Hubs, IoT hubs, custom endpoints, and so on).
This makes an eventstream the ideal choice for near-real-time ingestion of the streamed SEO data while using
only Fabric items.
Question: 79 CertyIQ
HOTSPOT
-
You have a Fabric warehouse named DW1 that contains four staging tables named ProductCategory,
ProductSubcategory, Product, and SalesOrder. ProductCategory, ProductSubcategory, and Product are used
often in analytical queries.
You need to implement a star schema for DW1. The solution must minimize development effort.
Which design approach should you use? To answer, select the appropriate options in the answer area.
Answer:
Explanation:
In dimensional modeling, especially when designing a star schema, it's common to denormalize hierarchies
like Product > Subcategory > Category into one dimension table (e.g., DimProduct).
This simplifies relationships, speeds up queries, and is optimal for analytical workloads.
Having one dimension (e.g., DimProduct) containing all relevant attributes makes slicing and dicing in reports
easier.
The best practice for joining dimension and fact tables is to use a surrogate key or a system-generated unique
identifier (such as ProductID).
This ensures efficiency, uniqueness, and referential integrity between the fact (SalesOrder) and dimension
(Product) tables.
Question: 80 CertyIQ
HOTSPOT
-
Dataset1: This dataset will be added to Fabric and will have a unique primary key between the source and the
destination. The unique primary key will be an integer and will start from 1 and have an increment of 1.
Dataset2: This dataset contains semi-structured data that uses bulk data transfer. The dataset must be handled in
one process between the source and the destination. The data transformation process will include the use of
custom visuals to understand and work with the dataset in development mode.
Dataset3: This dataset is in a lakehouse. The data will be bulk loaded. The data transformation process will include
row-based windowing functions during the loading process.
You need to identify which type of item to use for the datasets. The solution must minimize development effort and
use built-in functionality, when possible.
What should you identify for each dataset? To answer, select the appropriate options in the answer area.
Explanation:
Dataset1: A Dataflow Gen2 dataflow.
Dataflow Gen2 is used to ingest, transform, and load (ETL) data by using a visual, low-code experience, pulling
in data from various sources (for example, databases and files), and it can generate the incrementing integer
key during the load.
Dataset2: A notebook.
Notebooks support rich, code-driven data processing by using languages such as PySpark, Scala, or SQL in a
Spark environment. They are suitable for semi-structured data, bulk transfer in a single process, and the use
of custom visuals during development.
Dataset3: A T-SQL statement.
T-SQL (Transact-SQL) is the language for querying SQL-based engines such as the lakehouse SQL analytics
endpoint. T-SQL statements are highly performant for structured, relational data operations, including
row-based windowing functions during loading.
Question: 81 CertyIQ
HOTSPOT
-
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Lakehouse1 contains a table named
Status_Target that has the following columns:
•Key
•Status
•LastModified
The data source contains a table named Status_Source that has the same columns as Status_Target.
Status_Source is used to populate Status_Target.
In a notebook named Notebook1, you load Status_Source to a DataFrame named sourceDF and Status_Target to a
DataFrame named targetDF.
You need to implement an incremental loading pattern by using Notebook1. The solution must meet the following
requirements:
•For all the matching records that have the same value of key, update the value of LastModified in Status_Target
to the value of LastModified in Status_Source.
•Insert all the records that exist in Status_Source that do NOT exist in Status_Target.
•Set the value of Status in Status_Target to inactive for all the records that were last modified more than seven
days ago and that do NOT exist in Status_Source.
How should you complete the statement? To answer, select the appropriate options in the answer area.
1. whenMatchedUpdate()
Selected for when existing records match between sourceDF and targetDF based on Key.
Action:
Update the LastModified field in the target table with the one from the source table.
.whenMatchedUpdate(set = {"LastModified": "sourceDF.LastModified"})
Meaning:
If a record with the same Key exists, update its LastModified value to the new one.
2. whenNotMatchedInsert()
Action:
.whenNotMatchedInsert(
    values = {
        "Key": "sourceDF.Key",
        "LastModified": "sourceDF.LastModified",
        "Status": "sourceDF.Status"
    }
)
Meaning:
If a Key is found in sourceDF but not in targetDF, insert the full new row (Key, LastModified, Status).
3. whenNotMatchedBySourceUpdate()
Action:
.whenNotMatchedBySourceUpdate(
    condition = "targetDF.LastModified < date_sub(current_date(), 7)",
    set = {"Status": "'inactive'"}
)
Meaning:
If a record exists in targetDF but is missing from the incoming sourceDF, and it was last modified more than seven
days ago, set its Status to "inactive".
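Putting the clauses together, a minimal sketch of the full merge, assuming Status_Target is registered as a Delta table and that sourceDF was loaded in Notebook1 as described. The aliases reuse the sourceDF and targetDF names so the clause examples above read naturally, and the seven-day filter expression is an assumption that matches the requirement:

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Open the target as a DeltaTable so the merge clauses can modify it in place.
target = DeltaTable.forName(spark, "Status_Target")

(
    target.alias("targetDF")
    .merge(sourceDF.alias("sourceDF"), "targetDF.Key = sourceDF.Key")
    # Matching keys: refresh LastModified from the source.
    .whenMatchedUpdate(set = {"LastModified": "sourceDF.LastModified"})
    # Keys that exist only in the source: insert the full row.
    .whenNotMatchedInsert(values = {
        "Key": "sourceDF.Key",
        "Status": "sourceDF.Status",
        "LastModified": "sourceDF.LastModified"
    })
    # Keys that exist only in the target and were last modified more than
    # seven days ago: mark them inactive.
    .whenNotMatchedBySourceUpdate(
        condition = "targetDF.LastModified < date_sub(current_date(), 7)",
        set = {"Status": "'inactive'"}
    )
    .execute()
)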
Question: 82 CertyIQ
DRAG DROP
-
You are building a data loading pattern by using a Fabric data pipeline. The source is an Azure SQL database that
contains 25 tables. The destination is a lakehouse.
In a warehouse, you create a control table named Control.Object as shown in the exhibit. (Click the Exhibit tab.)
You need to build a data pipeline that will support the dynamic ingestion of the tables listed in the control table by
using a single execution.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of
actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
Add a Lookup activity to query Control.Object and generate a list of the schemas and tables to copy.
The Lookup activity is used to retrieve metadata — in this case, information from Control.Object (likely a
control table or database containing a list of source schemas and table names).
This activity returns a list of tables and schemas that need to be copied.
Think of Lookup as fetching a dynamic list. Instead of hardcoding table names, you retrieve them
automatically.
Add a ForEach activity to iterate over the list of tables and copy the source data to the lakehouse Delta
tables.
The ForEach activity is used to loop through the list generated by the Lookup activity.
For each table/schema combination, you perform operations — in this case, copying data into the lakehouse
(into Delta tables, which are optimized tables supporting ACID transactions and fast querying).
ForEach loops allow you to automate operations over multiple tables without manually repeating the same
logic for each table.
Inside the ForEach loop, the Copy Data activity will copy the actual data from the source system into the Delta
table in the lakehouse.
Each table in the list will be copied one by one as the loop runs.
Each iteration uses parameters (table name, schema, etc.) from the current item being looped over.
Question: 83 CertyIQ
You are implementing a medallion architecture in a Fabric lakehouse.
You plan to create a dimension table that will contain the following columns:
•ID
•CustomerCode
•CustomerName
•CustomerAddress
•CustomerLocation
•ValidFrom
•ValidTo
You need to ensure that the table supports the analysis of historical sales data by customer location at the time of
each sale.
A.Type 2
B.Type 0
C.Type 1
D.Type 3
Answer: A
Explanation:
A. Type 2.
Type 2 slowly changing dimensions allow you to keep a full history of changes over time. In your case, since
the goal is to analyze historical sales data by customer location at the time of each sale, you'll need to
preserve every past change to the customer's location. Type 2 achieves this by creating a new row in the
dimension table for each change, including the date range for when the record is valid.
Question: 84 CertyIQ
You have a Fabric workspace that contains an eventstream named EventStream1. EventStream1 outputs events to
a table named Table1 in a lakehouse. The streaming data is sourced from motorway sensors and represents the
speed of cars.
You need to add a transformation to EventStream1 to average the car speeds. The speeds must be grouped by non-
overlapping and contiguous time intervals of one minute. Each event must belong to exactly one window.
A.sliding
B.hopping
C.tumbling
D.session
Answer: C
Explanation:
C. Tumbling.
Tumbling windows divide the data stream into fixed, non-overlapping, and contiguous time intervals, such as
one-minute windows in this case. Each event belongs to exactly one window, making tumbling windows ideal
for calculating averages or other aggregate metrics over defined intervals of time.
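For illustration only (the window is configured in the eventstream transformation UI rather than in code), an equivalent tumbling-window aggregation in PySpark Structured Streaming, using a placeholder rate source and assumed column names, would look like this:

from pyspark.sql import SparkSession
from pyspark.sql.functions import avg, col, window

spark = SparkSession.builder.getOrCreate()

# Placeholder stream; in practice the events would come from the sensor feed.
speeds = (
    spark.readStream.format("rate").load()
    .withColumnRenamed("value", "speed")
    .withColumnRenamed("timestamp", "event_time")
)

# A 1-minute window with no slide interval is a tumbling window: fixed,
# non-overlapping, contiguous intervals, so each event lands in exactly one window.
avg_speeds = (
    speeds
    .groupBy(window(col("event_time"), "1 minute"))
    .agg(avg("speed").alias("avg_speed"))
)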
Question: 85 CertyIQ
HOTSPOT
-
You have a table in a Fabric lakehouse that contains the following data.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
Answer:
Explanation:
Line 01 will replace all the null and empty values in the CustomerName column with the Unknown value.
Yes
This line describes a data cleaning operation where missing or empty entries in a column are filled with a
default value, "Unknown".
This is a common transformation in tools like Power Query (M Language), SQL, or Python (e.g.,
fillna("Unknown") in pandas).
Line 02 will extract the value before the @ character and generate a new column named Username.
No
This statement describes splitting email addresses to derive usernames.
However, the answer is No because Line 02 does not create a new column named Username. This commonly happens
when a line modifies the existing column in place or fails to assign the result to a new column.
Unless the code explicitly creates a new column (for example, Username = Text.BeforeDelimiter(Email, "@") in
Power Query), the statement is inaccurate.
Line 03 will extract the year value from the OrderDate column and keep only the first occurrence for each
year.
No.
Extracting the year from OrderDate is plausible, but keeping only the first occurrence for each year would also
require a grouping and filtering step.
Unless Line 03 explicitly includes logic such as grouping by year and selecting the first row, it does not do what
this statement claims; the statement assumes extra behavior that is not guaranteed by merely extracting the year.
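For reference, a hedged PySpark sketch of expressions that WOULD satisfy statements 1 and 2 (the exhibit code is not shown, so the column names and sample rows below are assumptions):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample; the real lakehouse table from the exhibit is not shown.
df = spark.createDataFrame(
    [("Alice", "[email protected]"), (None, "[email protected]"), ("", "[email protected]")],
    ["CustomerName", "Email"],
)

cleaned = (
    df
    # Statement 1: replace null or empty CustomerName values with "Unknown".
    .withColumn(
        "CustomerName",
        F.when(
            F.col("CustomerName").isNull() | (F.col("CustomerName") == ""),
            F.lit("Unknown"),
        ).otherwise(F.col("CustomerName")),
    )
    # Statement 2: a NEW column named Username holding the text before the @.
    .withColumn("Username", F.split(F.col("Email"), "@").getItem(0))
)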
Question: 86 CertyIQ
DRAG DROP
-
In Eventhouse1, you plan to create a table named DeviceStreamData in a KQL database. The table will contain data
based on the following sample.
You need to use a KQL query to develop the solution for Eventhouse1.
Which three code segments should you run in sequence? To answer, move the appropriate code segments from
the list of code segments to the answer area and arrange them in the correct order.
Answer:
Explanation:
The code segments define the table columns:
TimeStamp:datetime - holds the date and time of each event.
DeviceId:string - holds the identifier of the device that produced the event.
StreamData:dynamic - the dynamic data type in KQL is used for semi-structured data, such as JSON, so this column
stores complex or nested event payloads that vary in structure.
Assembled in sequence, the segments form a command along the lines of:
.create table DeviceStreamData (TimeStamp: datetime, DeviceId: string, StreamData: dynamic)
Question: 87 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1.
You have an on-premises Microsoft SQL Server database named Database1 that is accessed by using an on-
premises data gateway.
Answer: A
Explanation:
A. a data pipeline.
A data pipeline is designed to copy and transform data between sources and destinations, such as
transferring data from an on-premises Microsoft SQL Server database to a Fabric warehouse like Warehouse1.
It can seamlessly leverage the on-premises data gateway for connectivity and ensure efficient movement of
data.
Question: 88 CertyIQ
You have a Fabric warehouse named DW1 that contains a Type 2 slowly changing dimension (SCD) dimension table
named DimCustomer. DimCustomer contains 100 columns and 20 million rows. The columns are of various data
types, including int, varchar, date, and varbinary.
You need to identify incoming changes to the table and update the records when there is a change. The solution
must minimize resource consumption.
Answer: A
Explanation:
Using a hash function is an efficient way to identify changes, as it minimizes resource consumption. By
generating and comparing hash values for attributes, you can quickly detect differences between the source
table and the target table without comparing each attribute directly, which can be resource-intensive.
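A hedged PySpark sketch of the idea (the DataFrame, the key column name, and the helper function are assumptions used only for illustration):

from pyspark.sql import DataFrame, functions as F

def add_row_hash(df: DataFrame, key_cols: list) -> DataFrame:
    # Build one SHA-256 hash per row over every non-key column. Comparing this
    # single RowHash between the incoming rows and DimCustomer is much cheaper
    # than comparing all 100 columns individually. Columns (including varbinary)
    # are cast to string so they can be concatenated; nulls become empty strings.
    tracked = [c for c in df.columns if c not in key_cols]
    return df.withColumn(
        "RowHash",
        F.sha2(
            F.concat_ws("||", *[F.coalesce(F.col(c).cast("string"), F.lit("")) for c in tracked]),
            256,
        ),
    )

# Usage (hypothetical): hash both the incoming rows and DimCustomer, join on the
# business key, and treat rows whose RowHash values differ as changed records.
# incoming = add_row_hash(incoming_df, ["CustomerID"])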
Question: 89 CertyIQ
You have an Azure SQL database named DB1.
In a Fabric workspace, you deploy an eventstream named EventStreamDB1 to stream record changes from DB1 into
a lakehouse.
Answer: D
Explanation:
Change Data Capture (CDC) is a feature used to track changes (inserts, updates, and deletes) in a database
table and make those changes available in a way that they can be consumed by other systems or processes,
such as EventStreamDB1. When events are not being propagated, it typically means that the system
responsible for capturing changes (in this case, CDC) is not enabled or configured.
Question: 90 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
POS1 contains a product list and related data. The data comes from the following three tables:
•Products
•ProductCategories
•ProductSubcategories
In the data, products are related to product subcategories, and subcategories are related to product categories.
Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.
•Lakehouse1: Will store both raw and cleansed data from the sources
•Lakehouse2: Will serve data in a dimensional model to users for analytical queries
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Items that relate to data ingestion must meet the following requirements:
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
Requirements. Data Transformation
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from the product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.
•The data engineers must have read and write access to all the lakehouses, including the underlying files.
•The data analysts must only have read access to the Delta tables in the gold layer.
•The data analysts must NOT have access to the data in the bronze and silver layers.
•The data engineers must be able to commit changes to source control in WorkspaceA.
You need to recommend a solution to resolve the MAR1 connectivity issues. The solution must minimize
development effort.
Answer: B
Explanation:
Configuring retries for the Copy data activity is a straightforward solution that minimizes development effort
while addressing connectivity issues. By enabling retries, the pipeline can automatically attempt to reconnect
and complete the operation without requiring additional complex configurations or manual intervention.
Question: 91 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Contoso, Ltd. is an online retail company that wants to modernize its analytics platform by moving to Fabric. The
company plans to begin using Fabric for marketing analytics.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Contoso has an F64 capacity named Cap1. All Fabric users are allowed to create items.
Contoso has two workspaces named WorkspaceA and WorkspaceB that currently use Pro license mode.
Contoso has a point of sale (POS) system named POS1 that uses an instance of SQL Server on Azure Virtual
Machines in the same Microsoft Entra tenant as Fabric. The host virtual machine is on a private virtual network that
has public access blocked. POS1 contains all the sales transactions that were processed on the company’s
website.
The company has a software as a service (SaaS) online marketing app named MAR1. MAR1 has seven entities. The
entities contain data that relates to email open rates and interaction rates, as well as website interactions. The
data can be exported from MAR1 by calling REST APIs. Each entity has a different endpoint.
Contoso has been using MAR1 for one year. Data from prior years is stored in Parquet files in an Amazon Simple
Storage Service (Amazon S3) bucket. There are 12 files that range in size from 300 MB to 900 MB and relate to
email interactions.
POS1 contains a product list and related data. The data comes from the following three tables:
•Products
•ProductCategories
•ProductSubcategories
In the data, products are related to product subcategories, and subcategories are related to product categories.
Contoso has a Microsoft Entra tenant that has the following mail-enabled security groups:
The company has an existing Azure DevOps organization and creates a new project for repositories that relate to
Fabric.
Existing Environment. User Problems
The VP of marketing at Contoso requires analysis on the effectiveness of different types of email content. It
typically takes a week to manually compile and analyze the data. Contoso wants to reduce the time to less than
one day by using Fabric.
The data engineering team has successfully exported data from MAR1. The team experiences transient
connectivity errors, which causes the data exports to fail.
•Lakehouse1: Will store both raw and cleansed data from the sources
•Lakehouse2: Will serve data in a dimensional model to users for analytical queries
The new lakehouses must follow a medallion architecture by using the following three layers: bronze, silver, and
gold. There will be extensive data cleansing required to populate the MAR1 data in the silver layer, including
deduplication, the handling of missing values, and the standardizing of capitalization.
Each layer must be fully populated before moving on to the next layer. If any step in populating the lakehouses
fails, an email must be sent to the data engineers.
The use of email data from the Amazon S3 bucket must meet the following requirements:
Items that relate to data ingestion must meet the following requirements:
Lakehouses, data pipelines, and notebooks must be stored in WorkspaceA. Semantic models, reports, and
dataflows must be stored in WorkspaceB.
Once a week, old files that are no longer referenced by a Delta table log must be removed.
In the POS1 product data, ProductID values are unique. The product dimension in the gold layer must include only
active products from the product list. Active products are identified by an IsActive value of 1.
Some product categories and subcategories are NOT assigned to any product. They are NOT analytically relevant
and must be omitted from the product dimension in the gold layer.
•The data engineers must have read and write access to all the lakehouses, including the underlying files.
•The data analysts must only have read access to the Delta tables in the gold layer.
•The data analysts must NOT have access to the data in the bronze and silver layers.
•The data engineers must be able to commit changes to source control in WorkspaceA.
You need to recommend a solution for handling old files. The solution must meet the technical requirements.
Answer: C
Explanation:
The VACUUM command is typically used to handle old files in a data lake or Delta table environment. It
removes files that are no longer referenced by the current state of the table. This is essential for cleaning up
outdated files and optimizing storage usage, while still preserving the technical requirements that ensure
data integrity and compliance with retention policies.
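A minimal sketch of the weekly cleanup run from a scheduled notebook (the table name is a placeholder; in practice the job would loop over the lakehouse tables):

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder table name for illustration.
tbl = DeltaTable.forName(spark, "Lakehouse1.Sales")

# VACUUM deletes data files that are no longer referenced by the Delta log.
# The default retention threshold is 7 days; pass a value in hours to override.
tbl.vacuum()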
Question: 92 CertyIQ
DRAG DROP
-
You need to build a KQL query to compare the MeterReading value of each row to the previous row based on the
Timestamp value.
How should you complete the query? To answer, drag the appropriate values to the correct targets. Each value may
be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view
content.
Explanation:
sort by Timestamp.
This line sorts the filtered results (the Kansas readings) by the values in the Timestamp column, which also
serializes the rows so that window functions can reference neighboring rows.
extend.
The extend operator adds new columns to the result set; this is where the comparison to the previous row happens,
typically with the prev() function, for example PreviousReading = prev(MeterReading).
project.
The project operator selects which columns to keep in the final output and the order in which they should appear.
Question: 93 CertyIQ
HOTSPOT
-
You need to recommend a Fabric streaming solution that will use the sources shown in the following table.
The solution must minimize development effort.
What should you include in the recommendation for each source? To answer, select the appropriate options in the
answer area.
Answer:
Explanation:
Data pipelines are used for orchestrating and scheduling data movement and transformation tasks.
Apache Spark Structured Streaming is designed for real-time stream processing at high scale and is appropriate
when code-based processing is required.
Streaming dataflows in Microsoft Fabric provide a no-code/low-code interface for ingesting and transforming
streaming data, which makes them a good fit for lightweight streaming pipelines that must minimize development
effort.
Question: 94 CertyIQ
HOTSPOT -
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Existing Environment. Fabric Environment
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
Existing Environment. Data Processing
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Existing Environment. Sales Data
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
Sales Date -
Author -
Price -
Units -
SKU -
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
Existing Environment. Security Groups
Litware has the following security groups:
Sales -
Fabric Admins -
Streaming Admins -
Existing Environment. Performance Issues
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Answer:
Explanation:
Use the queryinsights.frequently_run_queries view and filter on number_of_failed_runs > 1. This is the only view
that exposes the fields referenced in the SELECT and WHERE clauses, and it lets the data engineering team find
queries that caused more than one failure, for example:
SELECT * FROM queryinsights.frequently_run_queries WHERE number_of_failed_runs > 1;
https://learn.microsoft.com/en-us/sql/relational-databases/system-views/queryinsights-frequently-run-queries-transact-sql?view=fabric&preserve-view=true
Question: 95 CertyIQ
Case Study -
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview. IT Structure -
The company’s IT department has a team of data analysts and a team of data engineers that use analytics
systems.
The data engineers perform the ingestion, transformation, and loading of data. They prefer to use Python or SQL to
transform the data.
The data analysts query data and create semantic models and reports. They are qualified to write queries in Power
Query and T-SQL.
Products -
ProductCategories -
ProductSubcategories -
In the data, products are related to product subcategories, and subcategories are related to product categories.
Answer: A
Explanation:
Schedule a data pipeline that calls other data pipelines: This approach allows you to orchestrate and
manage the population of medallion layers efficiently. By scheduling a main data pipeline that calls other data
pipelines, you can ensure that each step in the data processing workflow is executed in the correct sequence.
This method provides better modularity and manageability, as each sub-pipeline can focus on a specific layer
or task within the medallion architecture.
Question: 96 CertyIQ
DRAG DROP -
You have a Fabric eventhouse that contains a KQL database. The database contains a table named TaxiData. The
following is a sample of the data in TaxiData.
You need to build two KQL queries. The solution must meet the following requirements:
One of the queries must partition RunningTotalAmount by VendorID.
The other query must create a column named FirstPickupDateTime that shows the first value of each hour from
tpep_pickup_datetime partitioned by payment_type.
How should you complete each query? To answer, drag the appropriate values to the correct targets. Each value may
be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view
content.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
row_cumsum() - computes the cumulative sum of a column and can restart the running total when a condition is met,
which is how RunningTotalAmount is partitioned by VendorID.
row_window_session() - returns the value at the start of each session of records, where sessions are defined by
time gaps or other conditions; this is how FirstPickupDateTime shows the first tpep_pickup_datetime of each hour
partitioned by payment_type.
Question: 97 CertyIQ
HOTSPOT -
You are processing streaming data from an external data provider.
You have the following code segment.
For each of the following statements, select Yes if the statement is true. Otherwise, select No.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Question: 98 CertyIQ
You have a Fabric workspace that contains a lakehouse named Lakehouse1. Lakehouse1 contains a Delta table
named Table1.
You analyze Table1 and discover that Table1 contains 2,000 Parquet files of 1 MB each.
You need to minimize how long it takes to query Table1.
What should you do?
Answer: C
Explanation:
OPTIMIZE Command: Running the OPTIMIZE command on a Delta table helps to combine smaller files into
larger ones, which can significantly improve query performance. This process, known as compaction, reduces
the number of Parquet files that need to be read during a query, thereby decreasing query latency. In your
case, with 2,000 Parquet files of 1 MB each, running OPTIMIZE will consolidate these files into fewer, larger
files, making queries faster and more efficient.
VACUUM Command: The VACUUM command cleans up old versions of data files that are no longer needed,
which helps to free up storage space and maintain the performance of the Delta table. After running
OPTIMIZE, it's a good practice to run VACUUM to remove any obsolete files and further streamline the data
storage.
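A minimal sketch of running the compaction from a notebook, assuming Table1 can be referenced by name from Spark:

from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

tbl = DeltaTable.forName(spark, "Lakehouse1.Table1")

# Compact the ~2,000 small Parquet files into fewer, larger files.
tbl.optimize().executeCompaction()

# Optional follow-up: remove the old, now-unreferenced files
# (default retention is 7 days).
tbl.vacuum()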
Question: 99 CertyIQ
You have a Fabric workspace that contains a warehouse named Warehouse1. Data is loaded daily into Warehouse1
by using data pipelines and stored procedures.
You discover that the daily data load takes longer than expected.
You need to monitor Warehouse1 to identify the names of users that are actively running queries.
Which view should you use?
A.sys.dm_exec_connections
B.sys.dm_exec_requests
C.queryinsights.long_running_queries
D.queryinsights.frequently_run_queries
E.sys.dm_exec_sessions
Answer: E
Explanation:
sys.dm_exec_sessions: This view provides detailed information about all active user sessions, including the user
name (login_name), session ID, status, and login time. By querying it, you can identify which users are currently
connected and actively running queries against Warehouse1.
A.VACUUM
B.COMPUTE
C.OPTIMIZE
D.CLONE
Answer: A
Explanation:
The VACUUM command is used to clean up old files that are no longer in use, which fits the requirement of
removing files that are older than seven days. This command is typically used in data lake environments to
delete files that are no longer needed by the system, ensuring that storage is efficiently managed.
The default retention period for the VACUUM command is 7 days, therefore it will remove files older than 7
days.
A.From Monitoring hub, select the latest failed run of Pipeline1, and then view the output JSON.
B.From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.
C.From Real-time hub, select Fabric events, and then review the details of Microsoft.Fabric.ItemReadFailed.
D.From Real-time hub, select Fabric events, and then review the details of Microsoft.Fabric.ItemUpdateFailed.
Answer: B
Explanation:
B. From Monitoring hub, select the latest failed run of Pipeline1, and then view the input JSON.
Monitoring hub: The Monitoring hub provides detailed logs and information about the execution of your data
pipelines. By selecting the latest failed run of Pipeline1, you can access the execution details and diagnose
the issue.
View the input JSON: The input JSON contains the parameters, configurations, and the dynamic SQL query
used for the Copy data activity. By examining the input JSON, you can identify the specific SQL query that was
executed at the time the pipeline failed. This information is crucial for troubleshooting the issue and
understanding why the pipeline keeps failing.
A.Real-Time hub
B.Monitoring hub
C.the job history from the application run
D.Spark History Server
E.the run series from the details of the application run
Answer: E
Explanation:
The run series from the details of the application run: This option allows you to view a detailed timeline of the
jobs that were executed during the last run of Notebook1. The run series provides a chronological view of all
the jobs, including their start and end times, which enables you to visualize the execution timeline effectively.
What should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Runtime logs.
Runtime logs provide detailed error messages and timestamps when the error occurred.
Data insights.
Data insights summarize metrics such as the total number of errors, throughput, and performance statistics.
This is a case study. Case studies are not timed separately. You can use as much exam time as you would like to
complete each case. However, there may be additional case studies and sections on this exam. You must manage
your time to ensure that you are able to complete all questions included on this exam in the time provided.
To answer the questions included in a case study, you will need to reference information that is provided in the
case study. Case studies might contain exhibits and other resources that provide more information about the
scenario that is described in the case study. Each question is independent of the other questions in this case study.
At the end of this case study, a review screen will appear. This screen allows you to review your answers and to
make changes before you move to the next section of the exam. After you begin a new section, you cannot return
to this section.
Overview -
Litware, Inc. is a publishing company that has an online bookstore and several retail bookstores worldwide. Litware
also manages an online advertising business for the authors it represents.
Litware has a Fabric workspace named Workspace1. High concurrency is enabled for Workspace1.
The company has a data engineering team that uses Python for data processing.
The retail bookstores send sales data at the end of each business day, while the online bookstore constantly
provides logs and sales data to a central enterprise resource planning (ERP) system.
Litware implements a medallion architecture by using the following three layers: bronze, silver, and gold. The sales
data is ingested from the ERP system as Parquet files that land in the Files folder in a lakehouse. Notebooks are
used to transform the files in a Delta table for the bronze and silver layers. The gold layer is in a warehouse that
has V-Order disabled.
Litware has image files of book covers in Azure Blob Storage. The files are loaded into the Files folder.
Month-end sales data is processed on the first calendar day of each month. Data that is older than one month
never changes.
In the source system, the sales data refreshes every six hours starting at midnight each day.
The sales data is captured in a Dataflow Gen1 dataflow. When the dataflow runs, new and historical data is
captured. The dataflow captures the following fields of the source:
•Sales Date
•Author
•Price
•Units
•SKU
A table named AuthorSales stores the sales data that relates to each author. The table contains a column named
AuthorEmail. Authors authenticate to a guest Fabric tenant by using their email address.
•Sales
•Fabric Admins
•Streaming Admins
Business users perform ad-hoc queries against the warehouse. The business users indicate that reports against
the warehouse sometimes run for two hours and fail to load as expected. Upon further investigation, the data
engineering team receives the following error message when the reports fail to load: “The SQL query failed while
running.”
The data engineering team wants to debug the issue and find queries that cause more than one failure.
When the authors have new book releases, there is often an increase in sales activity. This increase slows the data
ingestion process.
The company’s sales team reports that during the last month, the sales data has NOT been up-to-date when they
arrive at work in the morning.
Litware recently signed a contract to receive book reviews. The provider of the reviews exposes the data in
Amazon Simple Storage Service (Amazon S3) buckets.
Litware plans to manage Search Engine Optimization (SEO) for the authors. The SEO data will be streamed from a
REST API.
Litware plans to implement a version control solution in Fabric that will use GitHub integration and follow the
principle of least privilege.
To control data platform costs, the data platform must use only Fabric services and items. Additional Azure
resources must NOT be provisioned.
What should you do to optimize the query experience for the business users?
A.Enable V-Order.
B.Create and update statistics.
C.Run the VACUUM command.
D.Introduce primary keys.
Answer: B
Explanation:
Creating and updating statistics helps optimize query performance by providing the query engine with
accurate information about the data distribution. This allows the engine to generate efficient query execution
plans, ultimately improving the query experience for business users.
While monitoring Warehouse1, you discover that query performance has degraded during the last 60 minutes.
You need to isolate all the queries that were run during the last 60 minutes. The results must include the username
of the users that submitted the queries and the query statements.
Answer: B
Explanation:
The queryinsights schema in Microsoft Fabric provides detailed information about query execution, including
the username of the users who submitted the queries and the query statements themselves. By using the
relevant views from the queryinsights schema, you can isolate and analyze all queries executed during the
specified time frame, which is essential for troubleshooting performance issues.
You need to monitor the refresh history of Model1 and visualize the refresh history in a chart.
Explanation:
B. a notebook
In Microsoft Fabric, a notebook is the most flexible tool for monitoring the refresh history of a semantic model
(Model1) and visualizing that data in a chart: the refresh history can be retrieved programmatically (for example,
through semantic link or the Power BI REST API) and then plotted with a charting library in the same notebook.
A.Enable V-Order.
B.Create statistics.
C.Drop statistics.
D.Disable V-Order.
Answer: D
Explanation:
D. Disable V-Order.
V-Order is an optimized format for query performance in analytical workloads. However, since DW1 is write-
intensive and the staging tables are primarily read once and recreated, enabling V-Order could increase the
write overhead. Disabling V-Order minimizes the load time in this specific scenario, as it eliminates the cost
associated with reorganizing data into the V-Order format.
Thank you
Thank you for being so interested in the premium exam material.
I'm glad to hear that you found it informative and helpful.
If you have any feedback or thoughts on the material, I would love to hear them.
Your insights can help us improve our writing and better understand our readers.
Best of Luck
You have worked hard to get to this point, and you are well-prepared for the exam
Keep your head up, stay positive, and go show that exam what you're made of!