PL-300 Exam Prep – Part 1
Divya Panseriya
linkedin.com/in/divya-panseriya/

❑ Agenda
01 Prepare the Data
02 Model the Data
03 Visualize and Analyze the Data
04 Deploy and Maintain Assets
Question Types:
The exam typically includes 40–60 questions in a variety of formats:
• Multiple-choice questions (single answer)
• Multiple-choice questions (multiple answers)
• Drag-and-drop questions
• Build list / reorder questions
• Case studies with multiple questions

Exam Duration:
The total exam duration is approximately 100–120 minutes.

Passing Score:
The passing score for the PL-300 exam is 700 out of 1000 points.
Skills measured as of April 23, 2024
➢ Prepare the data (25–30%)
➢ Model the data (25–30%)
➢ Visualize and analyze the data (25–30%)
➢ Deploy and maintain assets (15–20%)
❑ Part One: Prepare the Data (25–30%)

1.1. Get data from data sources
• Identify and connect to a data source
• Change data source settings, including credentials, privacy levels, and data source locations
• Select a shared dataset, or create a local dataset
• Choose between DirectQuery, Import, and Dual mode
• Change the value in a parameter

1.2. Clean the data
• Evaluate data, including data statistics and column properties
• Resolve inconsistencies, unexpected or null values, and data quality issues
• Resolve data import errors

1.3. Transform and load the data
• Select appropriate column data types
➢ Identify and connect to a data source
✓ Understand different data sources such as databases (SQL Server, MySQL), Excel files, SharePoint lists, web APIs, etc.
✓ Use Power BI Desktop to connect to these data sources using built-in connectors.
➢ Connect to data in a relational database
Your next step is to enter your database server name and a database name in the SQL Server database window. The two data connectivity mode options are Import (selected by default, recommended) and DirectQuery. In most cases, you select Import. Other advanced options are also available in the SQL Server database window.
When connecting to a database in Power BI's Power Query, you typically have three main options for providing credentials:

1. Windows: Use your Windows credentials (username and password) to authenticate with the database.
2. Database: Use database-specific credentials (username and password) that you provide to authenticate directly with the database.
3. Microsoft Account: Uses your Microsoft account credentials to authenticate; suitable for databases that support Azure Active Directory authentication.

In the Power Query Navigator window, the three main options are:
1. Load
2. Transform Data
3. Cancel
➢ Import data by writing an SQL query

Import Data with an SQL Query:
• Write an SQL query to specify the required tables and columns.

Steps to Write an SQL Query in Power BI:
• Open the SQL Server database window.
• Enter the server and database names.
• Expand the options by selecting the arrow next to Advanced options.
• Write your SQL query in the SQL statement box.
• Click OK.

Example Query:
• Use a SELECT statement to load the ID, NAME, and SALESAMOUNT columns from the SALES table.
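The steps above can be sketched as a Power Query M expression with an inline native query; the server and database names here are placeholders, not values from the original slides:

```m
let
    // Hypothetical server/database names — replace with your own
    Source = Sql.Database(
        "myserver.database.windows.net",
        "SalesDB",
        [Query = "SELECT ID, NAME, SALESAMOUNT FROM SALES"]
    )
in
    Source
```

Passing the statement in the `Query` option is what Power BI generates when you fill in the SQL statement box under Advanced options.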
➢ Get data from a NoSQL database
A NoSQL database, also known as non-SQL or non-relational, is flexible and doesn't use tables to store data.
Connect to a NoSQL database (Azure Cosmos DB)
On the Preview Connector window, select Connect and then enter your database credentials. In this example, on
the Azure Cosmos DB window, you can enter the database details. You can specify the Azure Cosmos DB account
endpoint URL that you want to get the data from (you can get the URL from the Keys blade of your Azure portal).
Alternatively, you can enter the database name, collection name or use the navigator to select the database and
collection to identify the data source.
➢ Get data from online services
To support their daily operations, organizations frequently use a range of software applications, such as SharePoint, OneDrive, Dynamics 365, Google Analytics, and so on. These applications produce their own data. Power BI can combine the data from multiple applications to create more meaningful insights and reports.
After you've entered your URL, select OK.
Power BI needs to authorize the connection to
SharePoint, so sign in with your Microsoft
account and then select Connect.
➢ Change data source settings, including credentials, privacy levels, and data source locations
➢ Changing Data Source Settings in Power Query

Method 1:
• Select the table in Power Query.
• Go to the Home ribbon.
• Select Data source settings.

Method 2: Update the Data Source:
• Go to the Query Settings panel on the right side of the screen.
• Select the settings icon next to Source, or double-click Source.

In the window that displays:
• Update the server and database details.
• Select OK.
Privacy levels

When connecting to a data source in Power BI's Power Query, you can set the privacy level for the data source to control how data is combined. The three main privacy levels available are:

1. Public: Data is considered non-sensitive and can be freely combined with other public data sources.
2. Organizational: Data is considered sensitive within an organization and requires the use of credentials, but can be combined with other organizational data sources.
3. Private: Data is considered highly confidential and sensitive, and cannot be combined with other data sources, to ensure data privacy and security.
➢ Select a shared dataset

Shared Datasets: These are datasets published to the Power BI service that can be reused by multiple reports or dashboards. Shared datasets promote reusability and consistency across different reports.
➢ Choose between DirectQuery, Import, and Dual mode

The three different types of storage modes you can choose from:
1. DirectQuery: Connects to the data source live, without importing data into Power BI.
2. Import: Imports data into Power BI and stores it in a highly compressed, in-memory cache.
3. Dual (Composite) mode: Allows tables to operate in both Import and DirectQuery modes, depending on the context.

You can access storage modes by switching to the Model view, selecting a data table, and, in the resulting Properties pane, selecting the mode you want to use from the Storage mode drop-down list.
DirectQuery
Pros:
• Real-Time Data: Always fetches the most up-to-date data from the source.
• Reduced Memory Usage: Does not store data in Power BI, so it uses less memory.
• Data Size: Ideal for very large datasets where importing the data is impractical.
Cons:
• Performance: Slower performance, since each interaction queries the data source.
• Limitations: Certain DAX functions and complex transformations are not supported.
When to Use:
• When you need real-time data updates.
• When working with very large datasets that cannot be imported into Power BI.

Import Mode in Power BI
Pros:
• Performance: Faster performance, as data is stored in-memory.
• Full Functionality: Supports all Power BI features, including complex DAX calculations.
• Offline Capability: Reports can be used even when the data source is offline.
Cons:
• Data Staleness: Data is only as current as the last refresh, which might be outdated depending on the refresh schedule.
• Memory Usage: Consumes memory to store the data in Power BI.
When to Use:
• When performance is a critical factor.
• When working with datasets that fit comfortably in memory.
• When full DAX functionality is required.

Dual Mode in Power BI
Pros:
• Flexibility: Combines the benefits of both DirectQuery and Import modes.
• Performance and Real-Time: Uses in-memory data for performance and direct queries for real-time needs.
Cons:
• Complexity: More complex to manage and optimize.
• Memory Usage: Uses memory for imported data while also querying the data source.
When to Use:
• When you need a mix of real-time data and high performance.
• When specific tables need to be frequently updated in real time, while others can be imported for better performance.
➢ Change the value in a parameter

Created in Power Query Editor:
1. Open Power BI Desktop: Launch Power BI Desktop and open your report.
2. Go to Transform Data: Click the "Transform Data" button in the Home tab to open the Power Query Editor.
3. Manage Parameters: In the Power Query Editor, go to the "Manage Parameters" drop-down in the Home tab and select "Edit Parameters".
4. Select and Edit the Parameter:
   • A dialog box will appear listing all the parameters defined in your report.
   • Select the parameter you want to change.
   • Modify the value of the parameter as needed.
   • Click "OK" to apply the changes.
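Under the hood, a parameter created through Manage Parameters is just an M expression carrying parameter metadata. A minimal sketch, assuming a hypothetical text parameter named Region (the name and value are invented for illustration):

```m
// What the Region parameter looks like in the Advanced Editor
"East" meta
[
    IsParameterQuery = true,
    Type = "Text",
    IsParameterQueryRequired = true
]
```

Editing the parameter in the dialog simply replaces the current value (`"East"` here) while keeping the metadata record.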
Modified in the Power BI Service:
1. Navigate to the Power BI service and sign in with your credentials.
2. Click the ellipsis (three dots) next to the dataset and select "Settings".
3. Edit Parameters:
   • On the settings page, you will find a section for "Parameters".
   • Click "Edit Parameters".
   • Modify the parameter values as needed.
4. Apply Changes:
   • Save the changes to apply the new parameter values.
   • Depending on the changes, you may need to refresh the dataset to see the updated data.

When to Use Parameters:
• When you need dynamic and flexible data models that can be adjusted without altering the underlying data source.
• When creating reports that require user input to filter and analyze data.
• When performing scenario or what-if analysis to understand different business outcomes.
❑ Prepare the Data
1.2 Clean the data
• Evaluate data, including data statistics and column properties
• Resolve inconsistencies, unexpected or null values, and data quality issues
• Resolve data import errors
❑ Clean the data
➢ Evaluate data, including data statistics and column properties

1. Column Distribution

Distinct values are all the different values in a column, including duplicates and null values, while unique values do not include duplicates or nulls.
Therefore, distinct in this table tells you the total count of how many values are present, while unique tells you how many of those values appear only once.
Note: By default, Power Query examines the first 1,000 rows of your data set. To change this, select the profiling status in the status bar and select Column profiling based on entire data set.
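The same per-column statistics can also be produced programmatically with M's built-in `Table.Profile` function; a minimal sketch, where `Sales` is a placeholder for one of your own queries:

```m
let
    // Profile every column of a (hypothetical) Sales query.
    // The result has one row per column, with Min, Max, Average,
    // Count, NullCount, and DistinctCount statistics.
    Stats = Table.Profile(Sales)
in
    Stats
```

This can be handy when you want profiling results as a table you can filter or export, rather than hovering over the column quality bar.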
2. Column Quality
Column quality shows you the percentages of data that is valid, in error, and empty. In an ideal situation,
you want 100 percent of the data to be valid.
3. Column Profile
Column profile gives you a more in-depth look into the statistics within the columns, for the first 1,000 rows of data. It provides several different values, including the count of rows, which is important when verifying whether the import of your data was successful. For example, if your original database had 100 rows, you could use this row count to verify that 100 rows were, in fact, imported correctly. The profile also shows how many rows Power BI has deemed to be outliers, empty rows and strings, and the min and max, which tell you the smallest and largest value in a column, respectively. This distinction is particularly important for numeric data, because it will immediately notify you if the maximum value is beyond what your business identifies as a "maximum", so you can focus your efforts when delving deeper into the data. For a text column, the minimum value is the first value and the maximum value is the last value in alphabetical order.
➢ Resolve inconsistencies, unexpected or null values, and data quality issues
• Remove Null Values: filter out rows that contain nulls.
• Replace Null Values: substitute nulls with a default value using Replace Values in Power Query.
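The Replace Values step can be sketched in M as follows; `Source` stands for the previous query step, and the column name is invented for illustration:

```m
let
    // Replace nulls in a hypothetical SalesAmount column with 0
    NoNulls = Table.ReplaceValue(
        Source,                  // previous step in the query
        null,                    // value to find
        0,                       // replacement value
        Replacer.ReplaceValue,   // replace whole cell values
        {"SalesAmount"}          // columns to search
    )
in
    NoNulls
```

Using `Replacer.ReplaceText` instead would do substring replacement within text cells; `Replacer.ReplaceValue` swaps the entire cell value, which is what you want for nulls.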
➢ Resolve data import errors
Here are some more specific and common error messages that users often encounter in Power BI:

• Data Type Error
For instance, if you're pulling data from your organization's SQL Server, you might see a query timeout error. These timeouts can be configured for any timespan, from as little as five seconds to as much as 30 minutes or more.
Strategies to Avoid Query Timeouts
• Load Selective Data: Fetch only necessary columns and a subset of rows to reduce the initial data load.
• Divide Data Retrieval: Split data fetching into smaller queries (e.g., half the columns in separate queries).
• Merge Queries: Use Power Query to combine data subsets after retrieval.
• Apply Transformations: Perform required transformations and aggregations on the merged data.
• Optimize Query Steps: Review and optimize each query step to minimize processing time.
• Utilize the Data Model: Establish efficient relationships between tables in Power BI's data model.
• Monitor Performance: Use Performance Analyzer to identify and resolve performance bottlenecks.
Couldn't find file
Usually, this error is caused by the file moving locations or the permissions to the file changing. If the cause is the former, you need to find the file and change the source settings.
Open Power Query by selecting the Transform Data button in Power BI.
Highlight the query that is creating the error.
You may encounter the "We couldn't find any data formatted as a table" error while importing data from Microsoft Excel. Power BI expects to find data formatted as a table in Excel.
Resolving the "We couldn't find any data formatted as a table" error in Power BI:
1. Select Data: Highlight the data you want to import in Excel.
2. Format as Table: Press Ctrl+T to format the selected data as a table, and ensure the first row serves as column headers.
3. Import Data: Try importing the data from Excel again in Power BI.
Data type errors
When you import data into Power BI, columns may appear blank. This happens because of an error in interpreting the data type in Power BI. The resolution to this error is unique to the data source. For instance, if you're importing data from SQL Server and see blank columns, you could try to convert the column to the correct data type in the query.

Instead of using this query:

SELECT CustomerPostalCode FROM Sales.Customers

use an explicit conversion:

SELECT CAST(CustomerPostalCode AS varchar(10)) AS CustomerPostalCode FROM Sales.Customers
Question:
You are importing data from a SQL Server database into Power BI. Some columns appear blank due to data type interpretation issues. Which SQL query modification can resolve this problem?
A) SELECT CustomerName FROM Sales.Customers
B) SELECT CAST(CustomerPostalCode as varchar(10)) AS CustomerPostalCode FROM Sales.Customers
C) SELECT CONVERT(CustomerPostalCode, varchar(10)) FROM Sales.Customers
D) SELECT CustomerPostalCode::varchar(10) FROM Sales.Customers
Answer:
B) SELECT CAST(CustomerPostalCode as varchar(10)) AS CustomerPostalCode FROM Sales.Customers
❑ 1.3 Transform and load the data
• Select appropriate column data types
• Create and transform columns
• Transform a query
• Design a star schema that contains facts and dimensions
• Identify when to use reference or duplicate queries and the resulting impact
• Merge and append queries
• Identify and create appropriate keys for relationships
• Configure data loading for queries
➢ Select appropriate column data types
Choose suitable data types (text, numeric, date/time, etc.) for each
column to accurately represent the data and optimize performance.
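In M, setting column types is one `Table.TransformColumnTypes` step; a minimal sketch, where `Source` is the previous step and the column names are invented for illustration:

```m
let
    // Hypothetical typed columns: text, currency, and date
    Typed = Table.TransformColumnTypes(
        Source,
        {
            {"CustomerName", type text},
            {"SalesAmount", Currency.Type},
            {"OrderDate", type date}
        }
    )
in
    Typed
```

Choosing `Currency.Type` (fixed decimal) over `type number` avoids floating-point rounding in financial columns, at the cost of four fixed decimal places.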
➢ Create and transform columns
• Column From Examples: Ideal for transforming data where you can demonstrate a clear pattern, such as formatting dates or extracting text.
• Custom Column: Best for scenarios requiring sophisticated calculations or data transformations that need precise control, such as complex aggregations or conditional logic.
• Conditional Column: Perfect for quickly adding new columns based on simple IF-THEN-ELSE conditions, such as categorizing data or flagging outliers.
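A Conditional Column generates M along these lines; the column and category names below are invented for illustration, and `Source` stands for the previous step:

```m
let
    // Hypothetical: flag orders as "High" or "Normal" based on SalesAmount
    WithCategory = Table.AddColumn(
        Source,
        "SalesCategory",
        each if [SalesAmount] > 1000 then "High" else "Normal",
        type text
    )
in
    WithCategory
```

A Custom Column produces the same `Table.AddColumn` call, but you write the `each …` expression yourself instead of building it in a dialog.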
➢ Transform columns
• Remove Columns: Exclude unnecessary columns from the dataset to focus on essential data.
• Use First Row As Headers: Set the first row of data as the column headers for better data interpretation.
• Duplicate Columns: Create additional copies of existing columns to facilitate different analyses or transformations.
• Split Columns: Divide concatenated or combined data in a single column into separate columns for better analysis.
• Remove Duplicates: Eliminate duplicate rows to ensure data integrity and accuracy in visualizations.
• Replace Values: Substitute specific values with new ones across the dataset to clean and standardize data.
• Change Type: Convert the data types of columns (e.g., from string to numeric) to enable correct calculations and visualizations.
• Group By: Aggregate data based on specified column(s) to summarize information for reporting.
• Unpivot Columns: Transform columns into rows to normalize data and simplify analysis.
• Rename: Change column names to improve clarity and align with business terminology.
• Uppercase / Lowercase: Convert text to a uniform case (uppercase or lowercase) for consistency in visualizations.
• Trim / Capitalize Each Word: Remove leading and trailing spaces, or capitalize the first letter of each word in text data.
• Clean (Remove Unprintables): Remove non-printable characters to ensure data cleanliness and avoid processing errors.
• Replace Errors: Correct or replace erroneous data entries to maintain data accuracy and reliability.
➢ Transform a query: Merge and Append Queries
Merge Queries:
Merging queries in Power BI allows you to combine data from different tables based on matching values in one or more columns.

You can also choose how to join the two tables together, a process similar to JOIN statements in SQL. The join options include:
• Inner Join: Includes only rows that have matching values in both tables.
• Left (Outer) Join: Includes all rows from the left table and matching rows from the right table.
• Right (Outer) Join: Includes all rows from the right table and matching rows from the left table.
• Full (Outer) Join: Includes all rows from both tables, matching where possible and placing nulls where there is no match.
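A merge can be sketched in M with `Table.NestedJoin`; the table names, key column, and expanded columns here are placeholders for illustration:

```m
let
    // Hypothetical merge of SalesData with ProductInfo on ProductID
    Merged = Table.NestedJoin(
        SalesData, {"ProductID"},
        ProductInfo, {"ProductID"},
        "ProductInfo",      // name of the new nested-table column
        JoinKind.Inner      // join type: Inner / LeftOuter / RightOuter / FullOuter
    ),
    // Expand the product details out of the nested column
    Expanded = Table.ExpandTableColumn(Merged, "ProductInfo", {"ProductName", "Category"})
in
    Expanded
```

Changing `JoinKind.Inner` to `JoinKind.LeftOuter` (and so on) corresponds directly to the join options in the Merge dialog.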
Append Queries:
Appending queries in Power BI allows you to stack data from multiple tables vertically (row-wise).
Example Scenario:
• Merge Queries: Suppose you have two tables, SalesData and ProductInfo. You want to merge SalesData with ProductInfo based on the ProductID column to add product details to each sale record.
• Append Queries: Suppose you have two tables, SalesQ1 and SalesQ2. You want to append SalesQ2 data below SalesQ1 data to create a consolidated sales dataset.
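The append half of the scenario can be sketched in M with `Table.Combine` (SalesQ1 and SalesQ2 are the scenario's placeholder query names):

```m
let
    // Stack the quarterly sales tables row-wise;
    // columns are matched by name, and any column missing
    // from one table is filled with nulls in its rows
    Combined = Table.Combine({SalesQ1, SalesQ2})
in
    Combined
```

Because columns are matched by name, make sure both queries use identical column names and types before appending.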
➢ Use Advanced Editor to modify M code
To modify data shaping steps in Power Query, you can use the Advanced Editor to view and edit the underlying M code.
After creating steps to clean data, select the View ribbon of Power Query and then select Advanced Editor. The Advanced Editor window opens, showing the query's M code.
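What you see in the Advanced Editor is a single `let … in` expression in which each applied step is a named expression. A minimal sketch with an invented file path and step names:

```m
let
    // Each Applied Step becomes one named expression
    Source = Excel.Workbook(File.Contents("C:\Data\Sales.xlsx"), true),
    // Navigate to a (hypothetical) table named "Sales" in the workbook
    SalesTable = Source{[Item = "Sales", Kind = "Table"]}[Data],
    // Example cleaning step: drop rows with a null SalesAmount
    Filtered = Table.SelectRows(SalesTable, each [SalesAmount] <> null)
in
    Filtered
```

Renaming a step in the editor requires updating every later step that references it, since steps chain by name.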
➢ Design a star schema that contains facts and dimensions
Fact Table:
• Definition: The fact table in a star schema contains quantitative data (facts or metrics) for a business process or event. It typically includes foreign keys that reference the primary keys of associated dimension tables.

Dimension Tables:
• Definition: Dimension tables in a star schema contain descriptive attributes related to the facts in the fact table. They provide context and meaning to the measures in the fact table.

Relationship Keys:
• Fact Table's Foreign Keys: The fact table contains foreign keys that reference the primary keys of dimension tables. These foreign keys establish relationships between the fact table and dimension tables.
• Dimension Table's Primary Keys: Each dimension table has a primary key that uniquely identifies each record within that dimension.
Fact vs. Dimension Tables:
• Fact Table:
  • Contains numerical measures (facts or metrics) that represent business processes or events.
  • Often contains foreign keys to link to dimension tables.
  • Designed for efficient querying of aggregated data.
• Dimension Table:
  • Contains descriptive attributes that provide context to the measures in the fact table.
  • Each dimension table typically represents a specific aspect of business entities (e.g., time, product, location, employee).
  • Designed for detailed analysis and filtering.

Relationship Keys:
• Primary Keys (PK):
  • Primary keys are unique identifiers for each record in a dimension table.
  • Used as the basis for establishing relationships with fact tables (via foreign keys).
• Foreign Keys (FK):
  • Foreign keys in a fact table reference the primary keys of dimension tables.
  • Establish relationships between the fact table (quantitative data) and dimension tables (descriptive attributes).

In summary, a star schema organizes data into a centralized fact table surrounded by dimension tables, linked through primary and foreign keys, to facilitate efficient querying and analysis of business data.
Snowflake Schema vs. Star Schema:
• Star Schema:
  • Denormalized structure with dimension tables directly linked to a central fact table.
  • Easier to understand and query; typically better performance for most queries.
  • Suitable for scenarios where performance and simplicity in query design are priorities.
• Snowflake Schema:
  • Normalized structure with dimension tables further broken down into multiple related tables.
  • Reduces redundancy and supports more flexible data updates.
  • More complex query structure and potential performance trade-offs due to increased joins.

In summary, while both snowflake and star schemas organize data in a data warehouse, they differ primarily in their approach to normalization. Snowflake schemas prioritize data integrity and storage efficiency through normalization, whereas star schemas prioritize query performance and simplicity through denormalization. Choosing between them depends on specific business requirements and data characteristics.
Referencing Queries:
Definition: A referenced query in Power BI creates a new query that links back to the original query. Any changes made to the original query automatically update the referenced query.
Considerations:
• Changes in the original query affect all referenced queries, which may not be desirable if you need independent datasets.
• Dependency on the original query's performance and structure.
Duplicating Queries:
Definition: Duplicating a query in Power BI creates an entirely separate copy of the query, including all applied transformation steps. Changes made to the original query do not affect duplicated queries.
Considerations:
• Increased storage and potentially duplicated effort in maintaining separate copies of data.
• May lead to inconsistencies if updates are needed across multiple versions of the same data.

Choosing Between Referencing and Duplicating:
• Dependence vs. Independence: Use referencing when consistency across reports or dashboards is crucial. Opt for duplicating when you need independence and flexibility in data manipulation.
• Efficiency vs. Customization: References are efficient for maintaining synchronized data, while duplicates offer customization and snapshot capabilities.
• Space and Performance: References save space but are dependent on the original query's performance. Duplicates provide flexibility but require careful management to avoid redundancy.
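The difference is visible in the generated M. Shown side by side as two separate queries (the original query name `Sales` and the connection details are invented for illustration):

```m
// Referenced query: its Source step points at the original query,
// so it always reflects the original's current output
let
    Source = Sales
in
    Source

// Duplicated query: a full, independent copy of the original's steps,
// evaluated without any link back to the original
let
    Source = Sql.Database("myserver", "SalesDB"),   // placeholder connection
    SalesTable = Source{[Schema = "dbo", Item = "Sales"]}[Data]
in
    SalesTable
```

In practice you rarely edit this by hand; the Reference and Duplicate commands in the Queries pane generate it for you.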
➢ Identify and create appropriate keys for relationships
Summary:
• Key Columns: Unique identifiers that ensure data integrity and establish relationships between tables.
• Primary Key: A unique identifier within a table.
• Foreign Key: A column referencing a primary key in another table.
• Usage: Critical for relational databases to maintain integrity, optimize performance, and enable efficient querying and reporting in tools like Power BI.
How Keys Are Used:
1. Establishing Relationships:
   • Primary Key: Defines the unique identifier for each row in a table.
   • Foreign Key: Establishes relationships between tables by referencing the primary key of another table.
2. Ensuring Data Integrity:
   • Primary Key: Ensures that each row is uniquely identified, preventing duplicate records.
   • Foreign Key: Enforces referential integrity, meaning values in the foreign key column must exist in the referenced primary key column.
3. Improving Query Performance:
   • Keys are used by database engines to optimize query performance, especially when joining tables or retrieving specific rows based on relationships.
➢ Configure Data Loading for Queries
• Enable Load: When you enable the load for a query, the data from that query will be loaded into your Power BI model when you refresh or load data into your report.
• By enabling or disabling the load for queries and managing duplicates effectively (especially if you have multiple queries loading similar data), you can control how data flows into your Power BI model, ensuring data integrity and optimizing performance during data refreshes and report creation.
Q&A: PL-300 Part 1
Question 1: You plan to get data for a Power BI dataset from flat files. You need to select a location for the files. The location must provide the ability to configure scheduled refresh for the dataset by using Microsoft 365 credentials. Which two location types should you recommend? Each correct answer presents a complete solution.
• A) OneDrive for Business
• B) Google Drive
• C) SharePoint – Team Sites
• D) Dropbox
Answer:
• A) OneDrive for Business
• C) SharePoint – Team Sites

Question 2: You need to make changes to your data sources. Which three changes are supported by the Data Source Settings in the Power Query interface? Each correct answer presents a complete solution.
• A) clearing permissions
• B) editing permissions
• C) modifying the file path
• D) changing the data source type
Answer:
• A) clearing permissions
• B) editing permissions
• C) modifying the file path

Question 3: You have a Power BI dataset that gets data from a table in a SQL Server database. From which view in Power BI Desktop can you modify the storage mode of the table?
• A) Report view
• B) Data view
• C) Model view
• D) Home view
Answer:
• C) Model view
Question 4: You plan to add data to Power BI Desktop from a new data source. You are evaluating whether you should use the DirectQuery storage mode or the Import storage mode. What are two benefits of using Import instead of DirectQuery? Each correct answer presents a complete solution.
• A) Full support for the Q&A Power BI service
• B) Real-time data updates
• C) Full support for the Quick Insights Power BI service
• D) Reduced load on the Power BI service
Answer:
• A) Full support for the Q&A Power BI service
• C) Full support for the Quick Insights Power BI service

Question 5: You create a Power BI data source which uses a SQL SELECT statement. The SQL statement queries multiple tables in a SQL Server database and includes subqueries. After you import data from the data source into Power BI, you notice that one of the columns in the resulting dataset appears blank. You verify that the source table does include data. What should you do to resolve the issue?
• A) Use the CAST function in the SQL statement.
• B) Use the GROUP BY function in the SQL statement.
• C) Refresh the data source in Power BI.
• D) Check for data type mismatches in the source table.
Answer:
• A) Use the CAST function in the SQL statement.
1.2 Clean the Data linkedin.com/in/divya-panseriya/
Question 6: From Power BI Desktop, you create a data source by importing an Excel file that contains 10,000 rows. You
plan to identify data anomalies within the data source. You need to ensure that column distribution considers all rows in the
Excel file. What should you do?
y a
ri
•A) In the Power Query Editor window, enable the Column Profile view.
e
•B) In the Power Query Editor window, enable the Column Quality view.
•C) In the Data view, enable the Column Distribution option.
n s
a
•D) In the Report view, enable the Column Statistics option.
P
Answer:
•A) In the Power Query Editor window, enable the Column Profile view.
Question 7: From Power BI Desktop, you create a data source by importing an Excel file. You plan to identify data anomalies within the data source. Which interface should you use to display the column distribution and column quality graphs?
•A) Report view
•B) Data view
•C) Model view
•D) Power Query Editor
Answer:
•D) Power Query Editor
Question 8: You have a query that retrieves data from a table that contains more than 8,000 rows of data. In Power Query Editor, you notice that the column statistics for each column show a count of exactly 1,000. You need to ensure that the column statistics for each column show statistics based on all rows that are returned by the query. What should you do?
•A) From the query window, select Column profiling on the entire dataset.
•B) From the query window, select Column profiling on the first 1,000 rows.
•C) From the query window, enable Column profiling.
•D) From the query window, disable Column profiling.
Answer:
•A) From the query window, select Column profiling on the entire dataset.
Question 9: You are analyzing query data by using Power Query Editor. You need to ensure that the column statistics are based on an analysis of the entire dataset. What should you do?
•A) From the status bar, change profiling status to entire dataset.
•B) From the status bar, change profiling status to first 1,000 rows.
•C) From the status bar, change profiling status to last 1,000 rows.
•D) From the status bar, change profiling status to sample dataset.
Answer:
•A) From the status bar, change profiling status to entire dataset.
Question 10: You import an Excel file into Power BI Desktop and begin to analyze the data in Power Query Editor. You need to identify outliers in a text column within the data source. Which information should you use from Power Query Editor?
•A) The min and max values in Column profile
•B) The top and bottom entries in Value distribution
•C) The distinct count in Column statistics
•D) The unique count in Column statistics
Answer:
•B) The top and bottom entries in Value distribution
1.3 Transform and Load the Data
Question 11: You have a Power BI data source that contains the following tables:
•ProductCategory: Lists the product category ID and category name
•ProductSubcategory: Lists the product subcategory ID, subcategory name, and the related category ID
•ProductList: Lists the product ID, name, and subcategory ID
You need to optimize the data to create a dimension for use in a star schema data model. How should the tables be transformed before loading into the model?
•A) Merge the queries to create a single loaded table for Product.
•B) Append the queries to create a single loaded table for Product.
•C) Keep the tables as they are.
•D) Create a relationship between the tables.
Answer:
•A) Merge the queries to create a single loaded table for Product.
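The merge in the answer can be sketched in plain Python. All table contents below are hypothetical sample data; the point is that joining ProductList to ProductSubcategory and ProductCategory flattens the snowflaked tables into one denormalized Product dimension row per product, which is what a star schema expects.

```python
# Hypothetical sample data mirroring the three tables in the question.
categories = {1: "Bikes"}                   # category ID -> category name
subcategories = {10: ("Road Bikes", 1)}     # subcat ID -> (name, category ID)
products = [(100, "Road-150", 10)]          # (product ID, name, subcat ID)

# "Merge the queries": join product -> subcategory -> category so each
# product row carries its subcategory and category names directly.
product_dim = []
for prod_id, prod_name, sub_id in products:
    sub_name, cat_id = subcategories[sub_id]
    product_dim.append((prod_id, prod_name, sub_name, categories[cat_id]))

print(product_dim)  # [(100, 'Road-150', 'Road Bikes', 'Bikes')]
```

Appending (option B) would stack the rows of the three tables instead of joining them, which is why Merge is the correct transformation here.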
Question 12: You have an Excel spreadsheet that contains two columns as follows:
•Category: Contains names of categories
•Subcategory: Contains names of subcategories for each category
You import the Excel spreadsheet into Power BI Desktop. You need to transform the data to meet the following requirements:
•Have multiple columns, where one column represents one category
•Have a single row that includes the total count of subcategories for each category
Which transformation should you use?
•A) Pivot Columns
•B) Unpivot Columns
•C) Transpose Columns
•D) Group By Columns
Answer:
•A) Pivot Columns
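A rough sketch of what Pivot Columns with a Count aggregation produces, using hypothetical Category/Subcategory rows in plain Python (the data is invented for the illustration):

```python
from collections import Counter

# Hypothetical rows from the two-column worksheet: (Category, Subcategory).
rows = [
    ("Bikes", "Road"), ("Bikes", "Mountain"),
    ("Clothing", "Gloves"),
]

# Pivoting on Category with a Count aggregation yields one column per
# category and a single row of subcategory counts.
pivoted = dict(Counter(category for category, _ in rows))
print(pivoted)  # {'Bikes': 2, 'Clothing': 1}
```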
Question 13: You connect Power Query Editor to a database table. You need to remove the Row ID column. Your solution must ensure that new columns do NOT display in the table model during a scheduled refresh in the future. Which transformation should you use?
•A) Use the Remove Columns command.
•B) Use the Select Columns command and choose the columns to keep.
•C) Use the Delete Columns command.
•D) Use the Hide Columns command.
Answer:
•B) Use the Select Columns command and choose the columns to keep.
Question 14: You create a Power BI Desktop dataset by importing and transforming a data source based on multiple tables in a SQL Server database. You need to modify the relationship between the tables. Which interface should you use in Power BI Desktop?
•A) Report view
•B) Data view
•C) Model view
•D) Home view
Answer:
•C) Model view

Question 15: True or False: Parameters in Power BI allow you to dynamically change filter values and data sources within a report.
A) True
B) False
Answer:
A) True
Question 16: What is a key difference between referencing and duplicating a query in Power BI Desktop?
•A) Referencing creates a new copy of the query with its own data source
•B) Duplicating creates a new query that shares the original query's data
•C) Referencing creates a linked query that updates with the original
•D) Duplicating creates a new query that is independent of the original
Answer: C
Question 17: Inside the Power Query Editor, there is an option called "Column Distribution" that shows the total count of distinct values and unique values for all columns in the selected query. Within the Geography Key column, there are 348 distinct values and 74 unique values. Select the answer below that correctly describes the column distribution.
Options:
A) The Geography Key column contains 348 different values, with 74 of those values appearing only once.
B) The Geography Key column contains 74 different values, with 348 of those values appearing only once.
C) The Geography Key column contains 74 different values, each appearing multiple times, with 348 of those values appearing only once.
D) The Geography Key column contains 348 different values, with 74 of those values appearing multiple times.
Answer:
A) The Geography Key column contains 348 different values, with 74 of those values appearing only once.
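The distinct-versus-unique distinction behind this answer is easy to verify in code. A minimal Python sketch with a hypothetical, much smaller column: "distinct" counts the different values, while "unique" counts only the values that appear exactly once.

```python
from collections import Counter

# Hypothetical small column standing in for Geography Key.
geography_key = [5, 5, 7, 8, 8, 8, 9]

counts = Counter(geography_key)
distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values appearing once

print(distinct, unique)  # 4 2
```

Here 5, 7, 8, and 9 are the four distinct values, but only 7 and 9 are unique, matching how Column Distribution reports 348 distinct and 74 unique in the question.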
Question 18: You've been tasked with creating a data model for your company ColorJoy. You have an Excel file containing data for orders, products, customers, and sales regions all in one worksheet titled ColorJoyData. Listed below are the steps needed to create the customer dimension.
1- Merge the Customer ID back to ColorJoyData
2- Remove all columns except for those specific to Customer
3- Add an index column. Rename to Customer Key
4- Multi-select Customer columns, then remove duplicates
5- Remove columns used in newly created dimension table
6- Duplicate the ColorJoyData query and rename the duplicated query to Customer
Select the answer choice that lists the steps needed to create a dimension table in the correct order.
A) 6, 2, 4, 3, 5, 1
B) 6, 3, 4, 2, 1, 5
C) 6, 2, 3, 4, 5, 1
D) 6, 3, 2, 4, 5, 1
Answer:
A) 6, 2, 4, 3, 5, 1
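The first part of the build (steps 6, 2, 4, 3: duplicate the query, keep only the Customer columns, remove duplicates, add an index key) can be sketched in plain Python. The sample rows and column names below are invented for the illustration, and the index starts at 0 like Power Query's "Index Column From 0".

```python
# Hypothetical ColorJoyData rows containing everything in one worksheet.
color_joy_data = [
    {"OrderID": 1, "CustomerID": "C1", "CustomerName": "Ana", "Product": "Kit"},
    {"OrderID": 2, "CustomerID": "C1", "CustomerName": "Ana", "Product": "Set"},
    {"OrderID": 3, "CustomerID": "C2", "CustomerName": "Ben", "Product": "Kit"},
]

# Steps 6 and 2: duplicate the query, keep only the Customer columns.
customer = [{"CustomerID": r["CustomerID"], "CustomerName": r["CustomerName"]}
            for r in color_joy_data]

# Step 4: remove duplicate customer rows (first occurrence wins).
seen, deduped = set(), []
for r in customer:
    key = (r["CustomerID"], r["CustomerName"])
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# Step 3: add an index column as the surrogate Customer Key.
customer_dim = [{"CustomerKey": i, **r} for i, r in enumerate(deduped)]
print(customer_dim)
```

The remaining steps then wire the new key back into the fact data so the dimension and fact tables can relate on Customer Key.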
Question 19:
Which Power Query feature allows you to add a new column even when you are uncertain about the specific transformations to apply?
A) Add Custom Column
B) Add Conditional Column
C) Column From Examples
D) Invoke Custom Function
Answer:
C) Column From Examples