0% found this document useful (0 votes)
6 views5 pages

Lab Session 3

The document outlines a comprehensive lab session for analyzing customer orders, including data cleaning, transformation, and visualization tasks. Key tasks involve correcting data types, creating new columns for total price and order status, and generating various charts and pivot tables to summarize sales data by city and product category. Additionally, it includes advanced data manipulation techniques such as outlier detection, profit estimation, and dynamic filtering for report generation.

Uploaded by

y23cd116
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views5 pages

Lab Session 3

The document outlines a comprehensive lab session for analyzing customer orders, including data cleaning, transformation, and visualization tasks. Key tasks involve correcting data types, creating new columns for total price and order status, and generating various charts and pivot tables to summarize sales data by city and product category. Additionally, it includes advanced data manipulation techniques such as outlier detection, profit estimation, and dynamic filtering for report generation.

Uploaded by

y23cd116
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 5

Lab Session 3: Customer Orders Analysis

1. Remove unnecessary columns (if any extra).


2. Check and correct data types for:
o OrderDate as Date
o Quantity and UnitPrice as Whole Number
3. Create a new column TotalPrice = Quantity × UnitPrice.
4. Add a conditional column named OrderStatusFlag:
o Delivered → 1
o Pending → 0
o Cancelled → -1
5. Extract Month and Year from the OrderDate column.
6. Add a column HighValueOrder:
o If TotalPrice > ₹40,000 → “Yes”
o Else → “No”
7. Remove duplicate rows based on OrderID.

8. Group by City and calculate:


o Total Orders
o Total Quantity
o Total Revenue (Sum of TotalPrice)
9. Pivot Table:
o Rows: City,
o Columns: Status,
o Values: Count of Orders
10. Unpivot the above pivoted columns to bring back original structure.

Assume a second table CityRegion.csv:


City Region
Hyderabad South
Chennai South
Mumbai West
Delhi North
Bangalore South
Kolkata East

11. Import CityRegion.csv and merge with your main dataset using the City column to bring
in Region.

12. Create a bar chart showing:

 X-axis: Product
 Y-axis: Total Sales (TotalPrice)

13. Create a pie chart for the distribution of order Status.


14. Use a slicer for HighValueOrder (Yes/No) to filter all visuals.
15. Use a stacked bar chart to show total sales by Region and Status.
16. Create a line chart to show monthly total sales trend.

Level-2

1. Detect and remove outliers:


o Identify orders where Quantity > 10 or UnitPrice > ₹75,000.
o Remove them from the dataset assuming they are data entry errors.
2. Categorizeproductsbypricerange:
Create a column PriceCategory:
o Low (<= ₹15,000)
o Medium (₹15,001 - ₹35,000)
o High (> ₹35,000)
3. Create a Profit Estimate column:
Assuming profit is 18% of TotalPrice, create a new column EstimatedProfit.
4. Group by Month-Year and Product, and calculate:
o Total Orders
o Total Revenue
o Avg. Quantity per Order

5. Given another table named ProductCategory.csv:

Product Category
Mobile Electronics
TV Electronics
AC Appliances
Washing Machine Appliances
Fridge Appliances
Laptop Electronics
Tablet Electronics

Merge this with the main table using the Product column and bring in the Category.

6. Using the merged table from above, find:


o Total Revenue by Category
o Average Unit Price per Category

Create a matrix visual that shows:

o Rows: Region
o Columns: Status
o Values: Count of Orders and Sum of Revenue
8. Use a slicer for:
o City
o OrderStatusFlag
o PriceCategory
So users can dynamically filter the report.
9. Create a line chart showing:
o Total monthly sales trend
o With a line per Product Category
10. Export only Delivered orders from South region in June 2024 as a new table.
Use filters and transformations to achieve this inside Power Query.

Level-3

1. Rename the following columns:


o CustomerName to Customer
o SalesDate to OrderDate
2. Remove the top 5 rows and bottom 10 rows from the table.
3. Remove duplicate rows based on OrderID.
4. Keep only rows where the Status column equals "Delivered".
5. Sort the data by OrderDate in descending order.
6. Change the data types of:
o OrderDate → Date
o Quantity → Whole Number
o UnitPrice → Decimal Number
7. Replace "Pending" in Status with "In Progress".
8. Trim and clean text in the Customer column.
9. Split the Customer column by space into FirstName and LastName.
10. Extract:

 Year from OrderDate


 Month from OrderDate
 Day from OrderDate

11. Unpivot the columns: Quantity, UnitPrice, and TotalPrice.


12. Pivot the Status column using Quantity as values.
13. Group data by Region and calculate the total TotalPrice.
14. Round the UnitPrice to the nearest 100.
15. Format Product column to:

 UPPERCASE
 lowercase
 Capitalize Each Word

16. Add a custom column TotalPrice = Quantity * UnitPrice.


17. Add an index column starting from 1001.
18. Add a column that calculates the number of characters in Product name.

19. Add a conditional column PriceLevel:

 If TotalPrice > 50000, then "High"


 If between 20000 and 50000, then "Medium"
 Else "Low"

20. Add a column that extracts the first 3 letters of City.


21. Add a column showing number of days between today and OrderDate.
22. Add a column using "Standardize" to normalize Quantity.

23. Use “From Example” to create a column that extracts domain from Email field.
24. Add a column that combines Customer, City, and Product with hyphens (e.g., Rajesh-
Chennai-Laptop).
25. Use a date transformation to add a column for End of Month from OrderDate.

You might also like