0% found this document useful (0 votes)
15 views22 pages

Data Analytics Interview

The document provides a comprehensive overview of SQL, Excel, and Power BI, including key concepts, functions, and examples. It covers SQL joins, aggregate functions, subqueries, and normalization, as well as Excel functions like VLOOKUP, pivot tables, and data validation. Additionally, it explains Power BI components, DAX basics, and data visualization techniques, along with common interview questions and scenarios.

Uploaded by

dnaresh2323
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
15 views22 pages

Data Analytics Interview

The document provides a comprehensive overview of SQL, Excel, and Power BI, including key concepts, functions, and examples. It covers SQL joins, aggregate functions, subqueries, and normalization, as well as Excel functions like VLOOKUP, pivot tables, and data validation. Additionally, it explains Power BI components, DAX basics, and data visualization techniques, along with common interview questions and scenarios.

Uploaded by

dnaresh2323
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 22

SQL

Part A: Quick Booster (Short Notes + Q&A)

1. SQL Joins

 INNER JOIN → Returns only matching rows from both tables.

 LEFT JOIN → All rows from left + matching rows from right; NULL if no match.

 RIGHT JOIN → All rows from right + matching rows from left; NULL if no match.

 FULL OUTER JOIN → All rows from both, NULLs for missing matches.

Q: What is the difference between INNER JOIN and LEFT JOIN?


A: INNER JOIN gives only matched records. LEFT JOIN gives all records from the left
table even if there is no match in the right table.

2. WHERE vs HAVING

 WHERE → Filters rows before GROUP BY.

 HAVING → Filters aggregated groups after GROUP BY.

3. GROUP BY & ORDER BY

 GROUP BY → Used for summarizing (SUM, AVG, COUNT).

 ORDER BY → Used for sorting results ascending/descending.

4. Aggregate Functions

 SUM() → Adds values

 AVG() → Calculates average

 MIN() / MAX() → Smallest / Largest value

 COUNT() → Counts rows

5. Subqueries

 Independent Subquery: Runs separately.

 Correlated Subquery: Depends on outer query.

6. Common Queries

 Second Highest Salary


SELECT MAX(salary)

FROM employees

WHERE salary < (SELECT MAX(salary) FROM employees);

 Find Duplicates

SELECT name, COUNT(*)

FROM employees

GROUP BY name

HAVING COUNT(*) > 1;

 Top 5 Customers by Sales

SELECT customer_id, SUM(sales) AS total_sales

FROM orders

GROUP BY customer_id

ORDER BY total_sales DESC

LIMIT 5;

7. COUNT(*) vs COUNT(column)

 COUNT(*) → Counts all rows (including NULLs).

 COUNT(column) → Counts only non-NULL values.

8. Window Functions

 ROW_NUMBER() → Unique number for each row.

 RANK() → Ranking with gaps.

 DENSE_RANK() → Ranking without gaps.

Example:

SELECT name, salary,

RANK() OVER (ORDER BY salary DESC) AS rank

FROM employees;

9. Indexes

 Clustered Index → Sorts and stores data physically. Only one per table.
 Non-Clustered Index → Separate from data; multiple allowed.

10. Normalization

 Breaking tables into smaller tables to reduce redundancy and improve data
integrity.

Part B: Detailed Explanations + Examples


Joins with Example

Employees Table

EmpID Name DeptID

1 Ravi 10

2 Sita 20

3 Raju NULL

Departments Table

DeptID DeptName

10 HR

20 IT

30 Finance

INNER JOIN

SELECT e.Name, d.DeptName

FROM Employees e

INNER JOIN Departments d

ON e.DeptID = d.DeptID;

👉 Output: Ravi-HR, Sita-IT

LEFT JOIN

SELECT e.Name, d.DeptName

FROM Employees e
LEFT JOIN Departments d

ON e.DeptID = d.DeptID;

👉 Output: Ravi-HR, Sita-IT, Raju-NULL

2. WHERE vs HAVING Example

Sales Table

Region Sales

North 1000

South 2000

East 800

SELECT Region, SUM(Sales) AS TotalSales

FROM Sales

GROUP BY Region

HAVING SUM(Sales) > 1000;

👉 Output: South (2000)

3. Subquery Example

SELECT name, salary

FROM employees

WHERE salary > (SELECT AVG(salary) FROM employees);

👉 Finds employees earning above average salary.

Correlated Subquery

SELECT e1.Name, e1.Salary

FROM employees e1

WHERE e1.Salary > (

SELECT AVG(e2.Salary)

FROM employees e2

WHERE e1.DeptID = e2.DeptID

);

👉 Finds employees earning above their department’s average.


4. Window Function Example

SELECT Name, Salary,

DENSE_RANK() OVER (ORDER BY Salary DESC) AS SalaryRank

FROM Employees;

Ranks employees by salary, with no gaps in rank numbers.

5. Indexes

Q: Why use Indexes?


A: To improve query performance.
Example: If table has 1 million rows, index on EmpID helps quick searching instead of
scanning all rows.

6. CTE (Common Table Expression)

WITH HighSalary AS (

SELECT name, salary

FROM employees

WHERE salary > 50000

SELECT * FROM HighSalary;

Temporary result set that can be used like a table.

7. Scenario Question
Q: Sales report in SQL shows 1000 rows, but Excel has 980. What will you
do?
Answer:

 Check filters/joins in SQL query.

 Ensure Excel didn’t miss rows due to blank cells or import limit.

 Use COUNT in SQL and compare with Excel row count.

8. HR-style Question
Q: Tell me about a project where you used SQL.
Answer:
"In a Sales Analysis project, I used SQL to extract and clean raw sales data from the
database. I wrote queries to calculate total sales by region, top customers, and
average order value. This cleaned data was then used in Excel for validation and in
Power BI for dashboard creation."

Part C: Must-Read Extra SQL Topics for Interviews

1. SQL Categories

 DDL (Data Definition Language)

o Defines structure of database objects.

o Commands:

 CREATE → create table/database

 ALTER → modify table structure

 DROP → delete table/database

 TRUNCATE → remove all rows but keep structure

Q: Difference between DELETE and TRUNCATE?


A:

 DELETE → Removes specific rows, can use WHERE, can be rolled back.

 TRUNCATE → Removes all rows, faster, cannot use WHERE, minimal logging.

 DML (Data Manipulation Language)

o Manages data inside tables.

o Commands:

 INSERT → Add data

 UPDATE → Modify data

 DELETE → Remove data

 DQL (Data Query Language)

o SELECT → Fetch data

 DCL (Data Control Language)

o GRANT, REVOKE → Permissions

 TCL (Transaction Control Language)

o COMMIT → Save changes


o ROLLBACK → Undo

o SAVEPOINT → Save partial transaction

2. Keys & Constraints

 Primary Key

o Uniquely identifies each row.

o Only one per table.

o Cannot contain NULL.

CREATE TABLE Employees(

EmpID INT PRIMARY KEY,

Name VARCHAR(50)

);

 Foreign Key

o References primary key in another table.

o Ensures referential integrity.

CREATE TABLE Orders(

OrderID INT PRIMARY KEY,

EmpID INT,

FOREIGN KEY (EmpID) REFERENCES Employees(EmpID)

);

 Unique Key

o Ensures all values in a column are unique.

o Allows one NULL value.

 Composite Key

o Combination of two or more columns used as a primary key.

 Check Constraint

o Restricts column values.

Age INT CHECK (Age >= 18)

 Default Constraint

o Assigns default value if no value provided.

Salary DECIMAL(10,2) DEFAULT 30000


3. Transactions Example

BEGIN TRANSACTION;

UPDATE Accounts SET Balance = Balance - 1000 WHERE AccID = 101;

UPDATE Accounts SET Balance = Balance + 1000 WHERE AccID = 102;

IF @@ERROR <> 0

ROLLBACK TRANSACTION;

ELSE

COMMIT TRANSACTION;

👉 Ensures money transfer happens fully or not at all.

4. Normalization

 1NF: No repeating groups, atomic values.

 2NF: 1NF + no partial dependency.

 3NF: 2NF + no transitive dependency.

Q: Why normalization?
A: To reduce data redundancy and improve integrity.

5. Views & Stored Procedures

 View → Virtual table (saved SELECT query).

CREATE VIEW HighSalary AS

SELECT Name, Salary FROM Employees WHERE Salary > 50000;

 Stored Procedure → Predefined set of SQL statements.

CREATE PROCEDURE GetTop5Customers

AS

SELECT TOP 5 CustomerID, SUM(Sales)

FROM Orders

GROUP BY CustomerID

ORDER BY SUM(Sales) DESC;


Excel
Part A: Quick Booster (Short Notes + Q&A)

1. VLOOKUP vs XLOOKUP

 VLOOKUP → Works vertically only, searches from left to right.

 XLOOKUP → Works vertically & horizontally, can search left/right/up/down,


supports error handling.

Q: Why is XLOOKUP better than VLOOKUP?


A: XLOOKUP doesn’t need column index number, can search in both directions, and is
more flexible.
2. IF, AND, OR

 IF:

=IF(A2>=35,"Pass","Fail")

 IF + AND:

=IF(AND(A2>=35,B2>=35),"Pass","Fail")

 IF + OR:

=IF(OR(A2="Hyderabad",A2="Chennai"),"South Zone","Other")

3. COUNT Functions

 COUNT → Counts only numbers.

 COUNTA → Counts numbers + text + non-empty cells.

 COUNTIF → Counts with a condition.

=COUNTIF(A2:A10,">=50")

4. Basic Stats Functions

 AVERAGE → Mean value

 MEDIAN → Middle value

 MODE → Most frequent value

 MIN / MAX → Lowest / Highest value

 STDEV → Standard deviation

5. Pivot Tables

 Summarize large data easily.

 Steps: Insert → PivotTable → Drag fields into Rows, Columns, Values.

 Example: Region-wise total sales.

6. Conditional Formatting

 Highlight cells using rules.

 Example: Highlight duplicates → Conditional Formatting → Duplicate Values.

 Example: Data Bars for visual comparison.


7. Charts

 Column Chart → Compare categories.

 Line Chart → Trends over time.

 Pie Chart → Percentage distribution.

8. Data Cleaning Tools

 Remove Duplicates

 Trim spaces: =TRIM(A2)

 Text to Columns → Split data

 Flash Fill → Auto-complete patterns

9. Data Validation

 Restrict input (example: marks between 0–100).

 Add dropdown lists for input control.

10. Named Ranges & Tables

 Named ranges for easy formulas.

 Converting range → Table (Ctrl + T) for dynamic analysis.

Part B: Detailed Explanations + Examples

1. VLOOKUP & XLOOKUP Example

Data:

RollNo Name Marks

101 Ravi 75

102 Sita 82

103 Raju 65

VLOOKUP Example:

=VLOOKUP(102,A2:C4,3,FALSE)

Output: 82

XLOOKUP Example:

=XLOOKUP(102,A2:A4,C2:C4)
Output: 82

2. IF with AND/OR Example

Data:

Nam Mat Scienc


e h e

Ravi 40 60

Sita 30 80

Formula:

=IF(AND(B2>=35,C2>=35),"Pass","Fail")

Ravi → Pass
Sita → Fail

3. COUNTIF Example

Data:

Name Marks

Ravi 82

Sita 60

Raju 45

Formula:

=COUNTIF(B2:B4,">=60")

Output: 2

4. Pivot Table Example

Sales Data:

Regio Sale
n s

North 2000

South 3000

East 1500

Steps: Insert → PivotTable → Drag Region → Rows, Sales → Values


Output:

 North → 2000
 South → 3000

 East → 1500

 Total = 6500

5. Conditional Formatting Example

 To highlight marks below 50:


Home → Conditional Formatting → New Rule → Format cells < 50 (Red Fill).

6. Data Validation Example

 Restrict Age between 18 and 60:


Data → Data Validation → Whole Number → Between 18 and 60.

7. Scenario Interview Question

Q: Your Excel dataset has blank rows, duplicates, and spelling mistakes. How will you
clean it?
Answer:
"I’ll first remove duplicates using the Remove Duplicates tool, trim unwanted spaces
using TRIM, and apply Data Validation for uniform spellings. I’ll also handle blanks with
filters and use Flash Fill for corrections."

8. HR-style Question

Q: Tell me how you used Excel in your project.


Answer:
"In a Sales Analysis project, I used Excel PivotTables to summarize region-wise sales,
applied conditional formatting to highlight top-performing regions, and used formulas
like IF and VLOOKUP to validate data before connecting it to Power BI dashboards."
Power BI
Part A: Quick Booster (Short Notes + Q&A)

1. What is Power BI?

 Power BI is a Business Intelligence tool by Microsoft used for data


visualization, reporting, and analysis.

2. Power BI Components

 Power Query → Data cleaning & transformation.

 Power Pivot → Data modeling.

 DAX (Data Analysis Expressions) → Formulas/calculations.

 Power BI Desktop → Build reports.

 Power BI Service → Publish & share reports online.

 Power BI Mobile → View dashboards on mobile.

3. Difference: Power Query vs Power Pivot

 Power Query → For ETL (Extract, Transform, Load).

 Power Pivot → For creating relationships, measures, calculations.


4. DAX Basics

 Calculated Column → Row-level calculation.

 Measure → Aggregate calculation used in visuals.

Examples:

Total Sales = SUM(Sales[Amount])

Average Profit = AVERAGE(Sales[Profit])

5. Relationships in Power BI

 One-to-Many → Most common.

 Many-to-Many → Allowed in Power BI but use with caution.

 Active/Inactive relationships.

6. Visualizations

 Bar/Column Chart → Compare categories.

 Line Chart → Trends.

 Pie/Donut → Percentages.

 Map → Geographical analysis.

 Table/Matrix → Detailed view.

7. Filters in Power BI

 Visual-level Filter → Only affects one visual.

 Page-level Filter → Affects entire page.

 Report-level Filter → Affects entire report.

 Slicers → User-friendly filter.

8. Row-Level Security (RLS)

 Restricts data visibility for users.

[Region] = "South"

9. Publishing

 After building in Desktop → Publish to Power BI Service.


 Can share dashboards with team members.

10. Common Interview Question

Q: Difference between Calculated Column and Measure?


A:

 Calculated Column → Stored in data model, works row by row.

 Measure → Calculated at runtime, uses less memory, best for aggregation.

Part B: Detailed Explanations + Examples

1. Power BI Workflow

1. Connect → Import data from SQL, Excel, CSV, APIs.

2. Transform → Clean using Power Query.

3. Model → Create relationships between tables.

4. Create → Build visuals using fields, measures.

5. Publish → Share using Power BI Service.

2. Power Query Example

Suppose sales data has blanks and wrong text case.

 Steps:

o Remove rows with nulls.

o Use Transform → Format → Capitalize Each Word.

o Merge columns if needed.

3. DAX Example

Data:

OrderID Amount Profit

101 200 50

102 300 70

103 150 40

Measures:
Total Sales = SUM(Sales[Amount])

Profit Margin = DIVIDE(SUM(Sales[Profit]), SUM(Sales[Amount]))

Shows both total sales and profit margin % in report.

4. Relationship Example

Tables:

 Orders → OrderID, CustomerID, Amount

 Customers → CustomerID, Name, Region

Relationship: Orders[CustomerID] → Customers[CustomerID] (One-to-Many).

5. Visualization Example

 Bar Chart: Region-wise sales.

 Pie Chart: Category-wise sales % share.

 Line Chart: Month-wise sales trend.

6. Row-Level Security Example

If you want South Region Manager to see only South data:


Manage Roles → Add Role →

[Region] = "South"

When Manager logs in → only South Region visible.

7. Scenario Interview Question

Q: Your Power BI dashboard is loading very slowly. What steps will you take?
Answer:
"First, I’ll check if too much data is being loaded into the model and remove
unnecessary columns. Then, I’ll prefer measures instead of calculated columns for
efficiency. I’ll also use proper relationships and apply filters at the query level instead
of loading everything into Power BI."

8. HR-style Question

Q: Explain a project where you used Power BI.


Answer:
"In a Sales Analysis project, I used SQL to extract order data, cleaned it in Power
Query, created relationships with customer data, and built a dashboard with KPIs like
Total Sales, Top 5 Customers, and Profit Margin. I added slicers for Region and Date for
interactive analysis, and published the report to Power BI Service for management
review."

Data Analyst Interview – Final


Revision Booster
(SQL + Excel + Power BI)

SQL Quick Revision


Basics

 DDL: CREATE, ALTER, DROP, TRUNCATE

 DML: INSERT, UPDATE, DELETE

 DQL: SELECT

 TCL: COMMIT, ROLLBACK, SAVEPOINT

DELETE vs TRUNCATE

 DELETE → removes specific rows, can ROLLBACK

 TRUNCATE → removes all rows, faster, no WHERE

Keys & Constraints

 Primary Key → Unique + Not NULL

 Foreign Key → References another table

 Unique Key → Unique values, allows one NULL

 Check → Restrict values (Age ≥ 18)

 Default → Assigns default value


Joins

 INNER JOIN → Common rows

 LEFT JOIN → All from Left, matched from Right

 RIGHT JOIN → All from Right, matched from Left

 FULL JOIN → All rows from both

GROUP BY & HAVING

 WHERE → filters rows

 HAVING → filters groups after aggregation

Aggregate Functions

 SUM(), AVG(), COUNT(), MIN(), MAX()

Subqueries

 Independent → Normal

 Correlated → Depends on outer query

Second Highest Salary

SELECT MAX(salary)

FROM employees

WHERE salary < (SELECT MAX(salary) FROM employees);

Window Functions

RANK() OVER (ORDER BY Salary DESC)

Normalization

 1NF: Atomic values

 2NF: No partial dependency

 3NF: No transitive dependency

Excel Quick Revision


Key Formulas

 IF → =IF(A2>=35,"Pass","Fail")

 IF + AND → =IF(AND(A2>=35,B2>=35),"Pass","Fail")

 COUNTIF → =COUNTIF(A2:A10,">=60")

 VLOOKUP → =VLOOKUP(102,A2:C4,3,FALSE)

 XLOOKUP → =XLOOKUP(102,A2:A4,C2:C4)

Basic Stats

 AVERAGE, MEDIAN, MODE, MIN, MAX, STDEV

Pivot Tables

 Drag Region → Rows, Sales → Values

 Quickly summarize large data

Conditional Formatting

 Highlight duplicates, top 10, less than 50

Data Cleaning

 Remove Duplicates

 TRIM → =TRIM(A2)

 Text to Columns, Flash Fill

Data Validation

 Restrict inputs → Eg: Age 18–60

 Dropdown lists

Charts

 Bar → Compare

 Line → Trend

 Pie → Share %
Power BI Quick Revision
Workflow

 Connect → Transform → Model → Visualize → Publish

Components

 Power Query → Clean & transform data

 Power Pivot → Create relationships

 DAX → Formulas

Common DAX

Total Sales = SUM(Sales[Amount])

Profit Margin = DIVIDE(SUM(Sales[Profit]), SUM(Sales[Amount]))

Relationships

 One-to-Many (most common)

 Use CustomerID, OrderID

Filters

 Visual-level, Page-level, Report-level

 Slicers for user control

Row-Level Security

[Region] = "South"

Visuals

 Bar → Categories

 Line → Trend

 Pie → Share %

 Map → Geo analysis

Performance Optimization
 Load only required data

 Use Measures (not Calculated Columns)

 Apply filters in Power Query

Common Scenario Questions

Q1. SQL result rows ≠ Excel rows.


Check SQL filters/joins, Excel import limits, blanks, COUNT for cross-check.

Q2. Excel dataset has duplicates and blanks.


Remove Duplicates, TRIM, Data Validation, Flash Fill.

Q3. Power BI dashboard is slow.


Load only needed columns, use measures, filter in Power Query.

Q4. Tell me about a project using these tools.


"In a Sales Analysis project, I extracted data with SQL, validated & cleaned in Excel,
and built a Power BI dashboard with KPIs like Total Sales, Top Customers, and Profit
Margin. Added slicers for interactive analysis and published it for management."

You might also like