SQL
Part A: Quick Booster (Short Notes + Q&A)
1. SQL Joins
INNER JOIN → Returns only matching rows from both tables.
LEFT JOIN → All rows from left + matching rows from right; NULL if no match.
RIGHT JOIN → All rows from right + matching rows from left; NULL if no match.
FULL OUTER JOIN → All rows from both, NULLs for missing matches.
Q: What is the difference between INNER JOIN and LEFT JOIN?
A: INNER JOIN gives only matched records. LEFT JOIN gives all records from the left
table even if there is no match in the right table.
2. WHERE vs HAVING
WHERE → Filters rows before GROUP BY.
HAVING → Filters aggregated groups after GROUP BY.
3. GROUP BY & ORDER BY
GROUP BY → Used for summarizing (SUM, AVG, COUNT).
ORDER BY → Used for sorting results ascending/descending.
4. Aggregate Functions
SUM() → Adds values
AVG() → Calculates average
MIN() / MAX() → Smallest / Largest value
COUNT() → Counts rows
5. Subqueries
Independent Subquery: Runs separately.
Correlated Subquery: Depends on outer query.
6. Common Queries
Second Highest Salary
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Find Duplicates
SELECT name, COUNT(*)
FROM employees
GROUP BY name
HAVING COUNT(*) > 1;
Top 5 Customers by Sales
SELECT customer_id, SUM(sales) AS total_sales
FROM orders
GROUP BY customer_id
ORDER BY total_sales DESC
LIMIT 5;
7. COUNT(*) vs COUNT(column)
COUNT(*) → Counts all rows (including NULLs).
COUNT(column) → Counts only non-NULL values.
8. Window Functions
ROW_NUMBER() → Unique number for each row.
RANK() → Ranking with gaps.
DENSE_RANK() → Ranking without gaps.
Example:
SELECT name, salary,
RANK() OVER (ORDER BY salary DESC) AS rank
FROM employees;
9. Indexes
Clustered Index → Sorts and stores data physically. Only one per table.
Non-Clustered Index → Separate from data; multiple allowed.
10. Normalization
Breaking tables into smaller tables to reduce redundancy and improve data
integrity.
Part B: Detailed Explanations + Examples
Joins with Example
Employees Table
EmpID Name DeptID
1 Ravi 10
2 Sita 20
3 Raju NULL
Departments Table
DeptID DeptName
10 HR
20 IT
30 Finance
INNER JOIN
SELECT e.Name, d.DeptName
FROM Employees e
INNER JOIN Departments d
ON e.DeptID = d.DeptID;
👉 Output: Ravi-HR, Sita-IT
LEFT JOIN
SELECT e.Name, d.DeptName
FROM Employees e
LEFT JOIN Departments d
ON e.DeptID = d.DeptID;
👉 Output: Ravi-HR, Sita-IT, Raju-NULL
2. WHERE vs HAVING Example
Sales Table
Region Sales
North 1000
South 2000
East 800
SELECT Region, SUM(Sales) AS TotalSales
FROM Sales
GROUP BY Region
HAVING SUM(Sales) > 1000;
👉 Output: South (2000)
3. Subquery Example
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
👉 Finds employees earning above average salary.
Correlated Subquery
SELECT e1.Name, e1.Salary
FROM employees e1
WHERE e1.Salary > (
SELECT AVG(e2.Salary)
FROM employees e2
WHERE e1.DeptID = e2.DeptID
);
👉 Finds employees earning above their department’s average.
4. Window Function Example
SELECT Name, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) AS SalaryRank
FROM Employees;
Ranks employees by salary, with no gaps in rank numbers.
5. Indexes
Q: Why use Indexes?
A: To improve query performance.
Example: If table has 1 million rows, index on EmpID helps quick searching instead of
scanning all rows.
6. CTE (Common Table Expression)
WITH HighSalary AS (
SELECT name, salary
FROM employees
WHERE salary > 50000
SELECT * FROM HighSalary;
Temporary result set that can be used like a table.
7. Scenario Question
Q: Sales report in SQL shows 1000 rows, but Excel has 980. What will you
do?
Answer:
Check filters/joins in SQL query.
Ensure Excel didn’t miss rows due to blank cells or import limit.
Use COUNT in SQL and compare with Excel row count.
8. HR-style Question
Q: Tell me about a project where you used SQL.
Answer:
"In a Sales Analysis project, I used SQL to extract and clean raw sales data from the
database. I wrote queries to calculate total sales by region, top customers, and
average order value. This cleaned data was then used in Excel for validation and in
Power BI for dashboard creation."
Part C: Must-Read Extra SQL Topics for Interviews
1. SQL Categories
DDL (Data Definition Language)
o Defines structure of database objects.
o Commands:
CREATE → create table/database
ALTER → modify table structure
DROP → delete table/database
TRUNCATE → remove all rows but keep structure
Q: Difference between DELETE and TRUNCATE?
A:
DELETE → Removes specific rows, can use WHERE, can be rolled back.
TRUNCATE → Removes all rows, faster, cannot use WHERE, minimal logging.
DML (Data Manipulation Language)
o Manages data inside tables.
o Commands:
INSERT → Add data
UPDATE → Modify data
DELETE → Remove data
DQL (Data Query Language)
o SELECT → Fetch data
DCL (Data Control Language)
o GRANT, REVOKE → Permissions
TCL (Transaction Control Language)
o COMMIT → Save changes
o ROLLBACK → Undo
o SAVEPOINT → Save partial transaction
2. Keys & Constraints
Primary Key
o Uniquely identifies each row.
o Only one per table.
o Cannot contain NULL.
CREATE TABLE Employees(
EmpID INT PRIMARY KEY,
Name VARCHAR(50)
);
Foreign Key
o References primary key in another table.
o Ensures referential integrity.
CREATE TABLE Orders(
OrderID INT PRIMARY KEY,
EmpID INT,
FOREIGN KEY (EmpID) REFERENCES Employees(EmpID)
);
Unique Key
o Ensures all values in a column are unique.
o Allows one NULL value.
Composite Key
o Combination of two or more columns used as a primary key.
Check Constraint
o Restricts column values.
Age INT CHECK (Age >= 18)
Default Constraint
o Assigns default value if no value provided.
Salary DECIMAL(10,2) DEFAULT 30000
3. Transactions Example
BEGIN TRANSACTION;
UPDATE Accounts SET Balance = Balance - 1000 WHERE AccID = 101;
UPDATE Accounts SET Balance = Balance + 1000 WHERE AccID = 102;
IF @@ERROR <> 0
ROLLBACK TRANSACTION;
ELSE
COMMIT TRANSACTION;
👉 Ensures money transfer happens fully or not at all.
4. Normalization
1NF: No repeating groups, atomic values.
2NF: 1NF + no partial dependency.
3NF: 2NF + no transitive dependency.
Q: Why normalization?
A: To reduce data redundancy and improve integrity.
5. Views & Stored Procedures
View → Virtual table (saved SELECT query).
CREATE VIEW HighSalary AS
SELECT Name, Salary FROM Employees WHERE Salary > 50000;
Stored Procedure → Predefined set of SQL statements.
CREATE PROCEDURE GetTop5Customers
AS
SELECT TOP 5 CustomerID, SUM(Sales)
FROM Orders
GROUP BY CustomerID
ORDER BY SUM(Sales) DESC;
Excel
Part A: Quick Booster (Short Notes + Q&A)
1. VLOOKUP vs XLOOKUP
VLOOKUP → Works vertically only, searches from left to right.
XLOOKUP → Works vertically & horizontally, can search left/right/up/down,
supports error handling.
Q: Why is XLOOKUP better than VLOOKUP?
A: XLOOKUP doesn’t need column index number, can search in both directions, and is
more flexible.
2. IF, AND, OR
IF:
=IF(A2>=35,"Pass","Fail")
IF + AND:
=IF(AND(A2>=35,B2>=35),"Pass","Fail")
IF + OR:
=IF(OR(A2="Hyderabad",A2="Chennai"),"South Zone","Other")
3. COUNT Functions
COUNT → Counts only numbers.
COUNTA → Counts numbers + text + non-empty cells.
COUNTIF → Counts with a condition.
=COUNTIF(A2:A10,">=50")
4. Basic Stats Functions
AVERAGE → Mean value
MEDIAN → Middle value
MODE → Most frequent value
MIN / MAX → Lowest / Highest value
STDEV → Standard deviation
5. Pivot Tables
Summarize large data easily.
Steps: Insert → PivotTable → Drag fields into Rows, Columns, Values.
Example: Region-wise total sales.
6. Conditional Formatting
Highlight cells using rules.
Example: Highlight duplicates → Conditional Formatting → Duplicate Values.
Example: Data Bars for visual comparison.
7. Charts
Column Chart → Compare categories.
Line Chart → Trends over time.
Pie Chart → Percentage distribution.
8. Data Cleaning Tools
Remove Duplicates
Trim spaces: =TRIM(A2)
Text to Columns → Split data
Flash Fill → Auto-complete patterns
9. Data Validation
Restrict input (example: marks between 0–100).
Add dropdown lists for input control.
10. Named Ranges & Tables
Named ranges for easy formulas.
Converting range → Table (Ctrl + T) for dynamic analysis.
Part B: Detailed Explanations + Examples
1. VLOOKUP & XLOOKUP Example
Data:
RollNo Name Marks
101 Ravi 75
102 Sita 82
103 Raju 65
VLOOKUP Example:
=VLOOKUP(102,A2:C4,3,FALSE)
Output: 82
XLOOKUP Example:
=XLOOKUP(102,A2:A4,C2:C4)
Output: 82
2. IF with AND/OR Example
Data:
Nam Mat Scienc
e h e
Ravi 40 60
Sita 30 80
Formula:
=IF(AND(B2>=35,C2>=35),"Pass","Fail")
Ravi → Pass
Sita → Fail
3. COUNTIF Example
Data:
Name Marks
Ravi 82
Sita 60
Raju 45
Formula:
=COUNTIF(B2:B4,">=60")
Output: 2
4. Pivot Table Example
Sales Data:
Regio Sale
n s
North 2000
South 3000
East 1500
Steps: Insert → PivotTable → Drag Region → Rows, Sales → Values
Output:
North → 2000
South → 3000
East → 1500
Total = 6500
5. Conditional Formatting Example
To highlight marks below 50:
Home → Conditional Formatting → New Rule → Format cells < 50 (Red Fill).
6. Data Validation Example
Restrict Age between 18 and 60:
Data → Data Validation → Whole Number → Between 18 and 60.
7. Scenario Interview Question
Q: Your Excel dataset has blank rows, duplicates, and spelling mistakes. How will you
clean it?
Answer:
"I’ll first remove duplicates using the Remove Duplicates tool, trim unwanted spaces
using TRIM, and apply Data Validation for uniform spellings. I’ll also handle blanks with
filters and use Flash Fill for corrections."
8. HR-style Question
Q: Tell me how you used Excel in your project.
Answer:
"In a Sales Analysis project, I used Excel PivotTables to summarize region-wise sales,
applied conditional formatting to highlight top-performing regions, and used formulas
like IF and VLOOKUP to validate data before connecting it to Power BI dashboards."
Power BI
Part A: Quick Booster (Short Notes + Q&A)
1. What is Power BI?
Power BI is a Business Intelligence tool by Microsoft used for data
visualization, reporting, and analysis.
2. Power BI Components
Power Query → Data cleaning & transformation.
Power Pivot → Data modeling.
DAX (Data Analysis Expressions) → Formulas/calculations.
Power BI Desktop → Build reports.
Power BI Service → Publish & share reports online.
Power BI Mobile → View dashboards on mobile.
3. Difference: Power Query vs Power Pivot
Power Query → For ETL (Extract, Transform, Load).
Power Pivot → For creating relationships, measures, calculations.
4. DAX Basics
Calculated Column → Row-level calculation.
Measure → Aggregate calculation used in visuals.
Examples:
Total Sales = SUM(Sales[Amount])
Average Profit = AVERAGE(Sales[Profit])
5. Relationships in Power BI
One-to-Many → Most common.
Many-to-Many → Allowed in Power BI but use with caution.
Active/Inactive relationships.
6. Visualizations
Bar/Column Chart → Compare categories.
Line Chart → Trends.
Pie/Donut → Percentages.
Map → Geographical analysis.
Table/Matrix → Detailed view.
7. Filters in Power BI
Visual-level Filter → Only affects one visual.
Page-level Filter → Affects entire page.
Report-level Filter → Affects entire report.
Slicers → User-friendly filter.
8. Row-Level Security (RLS)
Restricts data visibility for users.
[Region] = "South"
9. Publishing
After building in Desktop → Publish to Power BI Service.
Can share dashboards with team members.
10. Common Interview Question
Q: Difference between Calculated Column and Measure?
A:
Calculated Column → Stored in data model, works row by row.
Measure → Calculated at runtime, uses less memory, best for aggregation.
Part B: Detailed Explanations + Examples
1. Power BI Workflow
1. Connect → Import data from SQL, Excel, CSV, APIs.
2. Transform → Clean using Power Query.
3. Model → Create relationships between tables.
4. Create → Build visuals using fields, measures.
5. Publish → Share using Power BI Service.
2. Power Query Example
Suppose sales data has blanks and wrong text case.
Steps:
o Remove rows with nulls.
o Use Transform → Format → Capitalize Each Word.
o Merge columns if needed.
3. DAX Example
Data:
OrderID Amount Profit
101 200 50
102 300 70
103 150 40
Measures:
Total Sales = SUM(Sales[Amount])
Profit Margin = DIVIDE(SUM(Sales[Profit]), SUM(Sales[Amount]))
Shows both total sales and profit margin % in report.
4. Relationship Example
Tables:
Orders → OrderID, CustomerID, Amount
Customers → CustomerID, Name, Region
Relationship: Orders[CustomerID] → Customers[CustomerID] (One-to-Many).
5. Visualization Example
Bar Chart: Region-wise sales.
Pie Chart: Category-wise sales % share.
Line Chart: Month-wise sales trend.
6. Row-Level Security Example
If you want South Region Manager to see only South data:
Manage Roles → Add Role →
[Region] = "South"
When Manager logs in → only South Region visible.
7. Scenario Interview Question
Q: Your Power BI dashboard is loading very slowly. What steps will you take?
Answer:
"First, I’ll check if too much data is being loaded into the model and remove
unnecessary columns. Then, I’ll prefer measures instead of calculated columns for
efficiency. I’ll also use proper relationships and apply filters at the query level instead
of loading everything into Power BI."
8. HR-style Question
Q: Explain a project where you used Power BI.
Answer:
"In a Sales Analysis project, I used SQL to extract order data, cleaned it in Power
Query, created relationships with customer data, and built a dashboard with KPIs like
Total Sales, Top 5 Customers, and Profit Margin. I added slicers for Region and Date for
interactive analysis, and published the report to Power BI Service for management
review."
Data Analyst Interview – Final
Revision Booster
(SQL + Excel + Power BI)
SQL Quick Revision
Basics
DDL: CREATE, ALTER, DROP, TRUNCATE
DML: INSERT, UPDATE, DELETE
DQL: SELECT
TCL: COMMIT, ROLLBACK, SAVEPOINT
DELETE vs TRUNCATE
DELETE → removes specific rows, can ROLLBACK
TRUNCATE → removes all rows, faster, no WHERE
Keys & Constraints
Primary Key → Unique + Not NULL
Foreign Key → References another table
Unique Key → Unique values, allows one NULL
Check → Restrict values (Age ≥ 18)
Default → Assigns default value
Joins
INNER JOIN → Common rows
LEFT JOIN → All from Left, matched from Right
RIGHT JOIN → All from Right, matched from Left
FULL JOIN → All rows from both
GROUP BY & HAVING
WHERE → filters rows
HAVING → filters groups after aggregation
Aggregate Functions
SUM(), AVG(), COUNT(), MIN(), MAX()
Subqueries
Independent → Normal
Correlated → Depends on outer query
Second Highest Salary
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
Window Functions
RANK() OVER (ORDER BY Salary DESC)
Normalization
1NF: Atomic values
2NF: No partial dependency
3NF: No transitive dependency
Excel Quick Revision
Key Formulas
IF → =IF(A2>=35,"Pass","Fail")
IF + AND → =IF(AND(A2>=35,B2>=35),"Pass","Fail")
COUNTIF → =COUNTIF(A2:A10,">=60")
VLOOKUP → =VLOOKUP(102,A2:C4,3,FALSE)
XLOOKUP → =XLOOKUP(102,A2:A4,C2:C4)
Basic Stats
AVERAGE, MEDIAN, MODE, MIN, MAX, STDEV
Pivot Tables
Drag Region → Rows, Sales → Values
Quickly summarize large data
Conditional Formatting
Highlight duplicates, top 10, less than 50
Data Cleaning
Remove Duplicates
TRIM → =TRIM(A2)
Text to Columns, Flash Fill
Data Validation
Restrict inputs → Eg: Age 18–60
Dropdown lists
Charts
Bar → Compare
Line → Trend
Pie → Share %
Power BI Quick Revision
Workflow
Connect → Transform → Model → Visualize → Publish
Components
Power Query → Clean & transform data
Power Pivot → Create relationships
DAX → Formulas
Common DAX
Total Sales = SUM(Sales[Amount])
Profit Margin = DIVIDE(SUM(Sales[Profit]), SUM(Sales[Amount]))
Relationships
One-to-Many (most common)
Use CustomerID, OrderID
Filters
Visual-level, Page-level, Report-level
Slicers for user control
Row-Level Security
[Region] = "South"
Visuals
Bar → Categories
Line → Trend
Pie → Share %
Map → Geo analysis
Performance Optimization
Load only required data
Use Measures (not Calculated Columns)
Apply filters in Power Query
Common Scenario Questions
Q1. SQL result rows ≠ Excel rows.
Check SQL filters/joins, Excel import limits, blanks, COUNT for cross-check.
Q2. Excel dataset has duplicates and blanks.
Remove Duplicates, TRIM, Data Validation, Flash Fill.
Q3. Power BI dashboard is slow.
Load only needed columns, use measures, filter in Power Query.
Q4. Tell me about a project using these tools.
"In a Sales Analysis project, I extracted data with SQL, validated & cleaned in Excel,
and built a Power BI dashboard with KPIs like Total Sales, Top Customers, and Profit
Margin. Added slicers for interactive analysis and published it for management."