Data Analyst Interview Questions & Answers
SQL Questions & Answers
1. Write a query to find the second highest salary:
SQL Query:
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
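The query can be checked with a minimal sketch using Python's built-in sqlite3 module. The table contents below are made-up sample rows, used only for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 90000), ("Bob", 120000), ("Cy", 120000), ("Dee", 75000)],
)

# Same query as above: the highest salary strictly below the overall maximum.
second_highest = conn.execute(
    "SELECT MAX(salary) FROM employees "
    "WHERE salary < (SELECT MAX(salary) FROM employees)"
).fetchone()[0]

print(second_highest)  # 90000: the two 120000 rows count as one salary level
conn.close()
```

If every employee has the same salary, the query returns NULL (None in Python), which is worth mentioning in an interview.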
2. Explain different types of joins:
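- INNER JOIN: Returns only rows that have a match in both tables.
- LEFT JOIN: Returns all rows from the left table plus matching rows from the right; unmatched right-side columns are NULL.
- RIGHT JOIN: Returns all rows from the right table plus matching rows from the left; unmatched left-side columns are NULL.
- FULL JOIN: Returns all rows from both tables, with NULLs wherever one side has no match.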
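A small sketch of the difference between INNER and LEFT JOIN, run through Python's sqlite3 module. The customers and orders tables are hypothetical. RIGHT and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: Bob has no orders, and order customer_id 4 has no customer.
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 50), (1, 20), (3, 75), (4, 10);
""")

# INNER JOIN keeps only customers that have at least one order.
inner_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name, o.amount"
).fetchall()

# LEFT JOIN keeps every customer; Bob appears with amount NULL (None).
left_rows = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id "
    "ORDER BY c.name, o.amount"
).fetchall()

print(inner_rows)
print(left_rows)
conn.close()
```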
3. Write a query to calculate average sales by region:
SELECT region, AVG(sales) AS average_sales
FROM sales_data
GROUP BY region;
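To see the query in action, here is a minimal sketch using sqlite3 with made-up rows. It also shows that AVG skips NULL values instead of treating them as zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_data (region TEXT, sales REAL)")
# Hypothetical rows; one West row has a missing (NULL) sales value.
conn.executemany(
    "INSERT INTO sales_data VALUES (?, ?)",
    [("East", 100.0), ("East", 300.0), ("West", 50.0), ("West", None)],
)

# Same query as above, collected into a {region: average} dictionary.
averages = dict(
    conn.execute(
        "SELECT region, AVG(sales) AS average_sales "
        "FROM sales_data GROUP BY region"
    ).fetchall()
)

print(averages["East"])  # 200.0
print(averages["West"])  # 50.0, because the NULL row is ignored by AVG
conn.close()
```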
4. Handling NULL values in SQL:
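- Use IS NULL / IS NOT NULL to filter on NULLs; a comparison such as = NULL never matches.
- COALESCE() returns its first non-NULL argument, so it can replace NULL with a default value.
- IFNULL() (MySQL) or NVL() (Oracle) are vendor-specific two-argument equivalents.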
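A short sketch of these points using sqlite3, which supports IS NULL and COALESCE(). The employees table and its bonus column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, bonus INTEGER)")
# Hypothetical rows; Bob's bonus is missing.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)", [("Ann", 500), ("Bob", None)]
)

# IS NULL finds the row with the missing bonus.
missing = conn.execute(
    "SELECT name FROM employees WHERE bonus IS NULL"
).fetchall()

# "= NULL" evaluates to unknown for every row, so nothing is returned.
equals_null = conn.execute(
    "SELECT name FROM employees WHERE bonus = NULL"
).fetchall()

# COALESCE substitutes 0 where bonus is NULL.
filled = conn.execute(
    "SELECT name, COALESCE(bonus, 0) FROM employees ORDER BY name"
).fetchall()

print(missing)      # [('Bob',)]
print(equals_null)  # []
print(filled)       # [('Ann', 500), ('Bob', 0)]
conn.close()
```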
Excel Questions & Answers
1. How do you use VLOOKUP?
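=VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup])
The lookup value must be in the first column of table_array; set range_lookup to FALSE for an exact match.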
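2. How do you use INDEX-MATCH?
=INDEX(array, MATCH(lookup_value, lookup_array, 0))
The 0 in MATCH requests an exact match. Unlike VLOOKUP, the lookup column does not have to be the first column.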
3. Explain Pivot Tables:
Pivot tables summarize large datasets by grouping and aggregating data.
Example: Calculate total sales by region or average sales by product.
4. Cleaning Data in Excel:
- Remove duplicate rows with Data > Remove Duplicates.
- Use TRIM() to remove extra spaces.
- Use IF() or IFERROR() to handle errors or missing values.
Python/R Questions & Answers
1. Handling Missing Data in Python:
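import pandas as pd
df = pd.DataFrame({"sales": [100, None, 250]})  # Small example frame with one missing value
df.dropna()   # Returns a copy without rows that contain missing values
df.fillna(0)  # Returns a copy with missing values replaced by 0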
2. Calculate Mean, Median, and Mode:
import statistics
data = [1, 2, 3, 4, 5, 5]
print(statistics.mean(data), statistics.median(data), statistics.mode(data))
3. Merging Two Datasets in Python:
pd.merge(df1, df2, on='key', how='inner') # Inner join on the shared 'key' column; how also accepts 'left', 'right', 'outer'
Data Visualization & Analytical Thinking
1. Tools for Visualization:
- Tableau, Power BI, Matplotlib, Seaborn, ggplot2 (R).
2. Choosing the Right Chart:
- Bar charts for comparisons.
- Line charts for trends.
- Scatter plots for relationships.
- Pie charts for proportions.
3. Analyzing a 20% Sales Drop:
- Break down the sales trend by region and by product to locate where the drop occurred.
- Check external factors like seasonality or competitors.
- Review internal strategies like pricing and marketing.
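The first step, locating the drop, can be sketched in a few lines of Python. The regional totals below are made-up numbers chosen to give a 20% overall decline.

```python
# Hypothetical sales totals by region for two periods (made-up numbers).
previous = {"North": 400, "South": 350, "West": 250}
current = {"North": 390, "South": 180, "West": 230}

total_prev = sum(previous.values())
total_curr = sum(current.values())
overall_change = (total_curr - total_prev) / total_prev

# Each region's contribution to the overall change, as a share of prior sales.
contribution = {
    region: (current[region] - previous[region]) / total_prev
    for region in previous
}

# The region with the most negative contribution explains most of the drop.
worst_region = min(contribution, key=contribution.get)

print(f"{overall_change:.0%}")  # -20%
print(worst_region)             # South
```

The same breakdown can be repeated by product or channel before looking at external and internal causes.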
4. Handling Outliers:
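Use the IQR (interquartile range) rule, which flags points more than 1.5 x IQR below Q1 or above Q3, or Z-scores, which flag points far from the mean in standard-deviation units.
Then decide whether to remove, transform, or keep the outliers based on the context.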
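A minimal sketch of both detection methods using only Python's statistics module. The data and the Z-score cutoff of 2 are assumptions for illustration; a cutoff of 3 is also common, but in a very small sample a single extreme value inflates the standard deviation and may not reach it.

```python
import statistics

# Hypothetical measurements with one obvious outlier.
data = [10, 12, 11, 13, 12, 11, 10, 95]

# IQR rule: flag values more than 1.5 * IQR outside the quartiles.
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
iqr_outliers = [x for x in data if x < lower or x > upper]

# Z-score rule: flag values more than z_cutoff sample standard deviations
# from the mean. The cutoff of 2 is an assumption chosen for this small sample.
z_cutoff = 2
mean = statistics.mean(data)
stdev = statistics.stdev(data)
z_outliers = [x for x in data if abs((x - mean) / stdev) > z_cutoff]

print(iqr_outliers)  # [95]
print(z_outliers)    # [95]
```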
A/B Testing & Behavioral Questions
1. Designing an A/B Test:
- Define your hypothesis.
- Randomly split users into control and treatment groups.
- Measure key metrics (e.g., click-through rate).
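One common way to split users is deterministic hashing, so a user always lands in the same group. This is a sketch under assumptions: the function name assign_group and the experiment label "exp_1" are hypothetical.

```python
import hashlib

def assign_group(user_id: str, experiment: str = "exp_1") -> str:
    """Assign a user to 'control' or 'treatment' based on a stable hash."""
    # Hashing the experiment name with the user id keeps assignments
    # independent across experiments and repeatable across sessions.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "treatment" if int(digest, 16) % 2 else "control"

# Hypothetical user ids; the split should be roughly even.
groups = [assign_group(f"user_{i}") for i in range(1000)]
print(groups.count("control"), groups.count("treatment"))
```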
2. Ensuring Statistical Significance:
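- Choose a significance level (e.g., 0.05) before running the test.
- Calculate the p-value and compare it with the significance level.
- Ensure the sample size is large enough to detect the effect you expect.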
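For a conversion-style metric, these steps can be sketched with a two-proportion z-test. This is one possible test, not the only valid choice. The function name and the visitor and conversion counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical results: control converts 200/2000, treatment 250/2000.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
alpha = 0.05
significant = p_value < alpha

print(round(z, 2), round(p_value, 3), significant)
```

With these made-up counts the p-value falls below 0.05, so the difference would be called statistically significant at that level.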
3. Describing Complex Data to Non-Technical Audiences:
- Example: "I created a dashboard in Tableau to visualize sales trends for the marketing team using simple charts and avoiding technical jargon."
4. Ensuring Analysis Accuracy:
- Double-check calculations.
- Cross-check with stakeholders.
- Validate results using alternative methods.
Case Study & Tools
1. Identifying Profitable Customer Segments:
- Analyze customer purchase history and demographics.
- Calculate metrics like Customer Lifetime Value (CLV).
- Identify high spenders or frequent buyers.
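A minimal sketch of the last step, using total historical spend as a rough stand-in for CLV. The order data and the threshold of three orders for a "frequent buyer" are assumptions.

```python
from collections import defaultdict

# Hypothetical purchase history: (customer_id, order_amount).
orders = [
    ("c1", 120.0), ("c2", 15.0), ("c1", 80.0),
    ("c3", 250.0), ("c2", 25.0), ("c1", 100.0),
]

total_spend = defaultdict(float)
order_count = defaultdict(int)
for customer_id, amount in orders:
    total_spend[customer_id] += amount
    order_count[customer_id] += 1

# Rank customers by total spend, highest first.
ranked = sorted(total_spend, key=total_spend.get, reverse=True)

# Frequent buyers: customers with at least 3 orders (assumed threshold).
frequent = [c for c, n in order_count.items() if n >= 3]

print(ranked)    # ['c1', 'c3', 'c2']
print(frequent)  # ['c1']
```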
2. Handling Large Datasets:
- Use SQL or Python (Pandas, NumPy) for efficient processing.
- Sample the data if possible.
- Utilize cloud-based tools like BigQuery or Spark.
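When a file is too large to load at once, one option is to stream it row by row and keep only running aggregates. The sketch below uses an in-memory string as a stand-in for a large CSV file. The column names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Stand-in for a large file; in practice this would be an opened CSV file.
raw = io.StringIO("region,sales\nEast,100\nWest,50\nEast,300\n")

# DictReader yields one row at a time, so memory use stays flat
# no matter how many rows the file contains.
totals = defaultdict(float)
for row in csv.DictReader(raw):
    totals[row["region"]] += float(row["sales"])

print(dict(totals))  # {'East': 400.0, 'West': 50.0}
```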
3. Data Cleaning Tools:
- Python (Pandas), Excel, or OpenRefine.
4. Handling Duplicate Data:
- In Python: df.drop_duplicates()
- In Excel: Use the "Remove Duplicates" feature.