Comprehensive list of interview questions
● OLS assumptions and the tests related to most of the assumptions
● How will you proceed with OLS regression in the presence of heteroscedasticity?
● Difference between linear regression and logistic regression
● Explain Credit Risk
● In COVID times, how will you estimate demand for a product, given that past
information is no longer useful?
● How to check whether two distributions are different or not?
● How to check the normality assumption?
● What is meant by the term “Linear” in linear regression?
● How to treat outliers if they are not to be removed?
● How to check for multicollinearity, and how to choose statistically which variable to
omit in case of multicollinearity between two variables?
● How do the coefficients differ between simple and multiple linear regression?
● Probability question: probability of getting an odd number on one die and an even number
on the other die when 2 dice are thrown
● What is R square?
● What is adjusted R square?
● Probability question: you have 9 pairs of socks, each pair a different color. What is the probability of
picking 2 socks of different colors (replacement is allowed)?
● Guesstimate: How many barber shops are there in Delhi?
● Case: You are an OLA cab driver. Your profit has fallen by 20%. Give 10 reasons for this.
● Why is the homoscedasticity assumption necessary, and what happens if it doesn’t hold?
● What are robust standard errors?
● How do they fix the problem of heteroscedasticity?
● What is the F-test, and what is the assumption behind it?
● Can the F-test be used for simple regression, and why?
● What is the t-test, and what is the assumption behind it?
● Given that you are coming to a technical company, how can I be sure that you will be able
to cope with the technical rigours of the job?
● What is the intuition behind maximum likelihood?
● How will you evaluate whether an estimator is good enough or not?
● Suppose you are given 7 independent variables affecting a dependent variable. You have to rank
them in terms of importance. How will you do it?
● What factors will you consider if you want to issue a credit card to an individual?
● Suppose a transaction takes place through a credit card. What factors will you use to
determine whether the transaction is fraudulent or not?
● How many zeros are there in 100!?
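The three counting questions in the list above (dice, socks, zeros in 100!) can be checked by brute force. The sketch below is only one way to do it and rests on stated assumptions: the dice question is read as "one die odd, the other even, in either order", the socks question as 18 socks in 9 colours drawn with replacement, and the 100! question as asking for trailing zeros. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product


def p_one_odd_one_even():
    """P(one die shows odd and the other even), in either order."""
    outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely rolls
    # The parities differ exactly when the sum of the two faces is odd.
    favourable = [roll for roll in outcomes if sum(roll) % 2 == 1]
    return Fraction(len(favourable), len(outcomes))


def p_different_colours(pairs=9, replacement=True):
    """P(two drawn socks differ in colour), one colour per pair (assumption)."""
    socks = [colour for colour in range(pairs) for _ in range(2)]
    positions = range(len(socks))
    if replacement:
        draws = list(product(positions, repeat=2))  # same sock may be drawn twice
    else:
        draws = list(permutations(positions, 2))    # two distinct socks
    different = [d for d in draws if socks[d[0]] != socks[d[1]]]
    return Fraction(len(different), len(draws))


def trailing_zeros_of_factorial(n):
    """Count trailing zeros of n! by counting factors of 5 in 1..n."""
    count, power = 0, 5
    while power <= n:
        count += n // power
        power *= 5
    return count


print(p_one_odd_one_even())                      # 1/2
print(p_different_colours(9, replacement=True))  # 8/9
print(trailing_zeros_of_factorial(100))          # 24
```

If the dice question is instead read as "a specific die is odd and the other is even", the answer halves to 1/4. Without replacement the socks answer becomes 16/17.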
Interview Question
1. Walk me through your resume
2. What do you understand by data science?
3. What is big data?
4. How did you get attracted towards data science?
5. What is a random variable (r.v.)? A variable whose value depends upon the outcome of an
experiment. Example: tossing a coin, where the outcome is either heads or tails.
6. Probability distribution: the probability distribution of an r.v. X is the set of values taken by the r.v. and their associated
probabilities.
7. Why do you need normality in data?
8. Central limit theorem
9. Law of large numbers
10. When and how do you deal with missing values?
11. How do you deal with outliers?
12. What are the ways to test the significance of the variables?
13. What is correlation, and what are its types?
14. Can you get correlation between a continuous variable and a dummy variable?
15. What are the conditions for a distribution to be binomial? How do you calculate the
probability in a binomial distribution? (Asked with an example of tossing a coin.)
16. What are the conditions for a distribution to be Poisson?
17. Explain linear regression: why and how does it work (intuition)?
18. What is meant by OLS? How is it a good way to find estimators?
19. CLRM assumptions and their violations (I told him all the assumptions, the violations and the remedies
for each violation; I was then asked to explain the working behind each remedy and how it
actually fixes the issue).
20. Multicollinearity
21. How to measure it?
22. Tests of multicollinearity: Spearman correlation, Kendall’s tau. Why is chi-squared used for
these tests? (Which metric is used when?)
23. Steps to be taken to tackle multicollinearity
24. What tests do you use to check for autocorrelation?
25. What is the difference between autocorrelation and multicollinearity?
26. How to deal with the autocorrelation problem?
27. What is VIF? What is the R-squared in VIF? What is the auxiliary regression that you have to
run for autocorrelation?
28. Consequences, detection methods, and remedies for heteroscedasticity (tests of
heteroscedasticity: process and intuition; which test is the best and why)
29. How does the variance-covariance matrix respond to robust estimators, and how is
heteroscedasticity corrected for (explain the workings)?
30. Why do you assume that the mean of the error term is zero?
31. Variance bias tradeoff
32. Is linear regression a high-bias model or a high-variance model?
33. ANOVA table, ESS, TSS, RSS
34. Regression through the origin
35. Why is ESS not expressed in terms of the intercept coefficient?
36. Hypothesis testing: what are the null and alternative hypotheses?
37. Type 1 error and type 2 error
38. Which is the most important assumption in OLS according to you?
39. Why do you minimise the sum of squares of the error terms? Why square?
40. Derive the coefficients a, b of linear regression.
41. What are good information measures for testing a linear regression model? Explain the
intuition of R-squared. (He asked how I would react if a model had an R-squared of -0.3; it was a trick
question because R-squared cannot be negative for an OLS fit with an intercept.) What is the need for adjusted R-squared? (This was
because I mentioned adjusted R-squared as a good information measure.) Formula for
adjusted R^2, and what do you mean by n and k in that formula? Can R-squared be negative?
What is its range?
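For the R-squared questions at the end of the list above, a minimal sketch of the usual textbook formulas may help. It assumes R^2 = 1 - RSS/TSS and adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), with n the number of observations and k the number of regressors excluding the intercept. Some texts count the intercept in k and write n - k in the denominator. The function names and sample numbers are illustrative.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - RSS/TSS, where TSS is measured around the mean of y."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - rss / tss


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k regressors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


y = [1, 2, 3, 4]
print(r_squared(y, [1, 2, 3, 4]))          # 1.0  (perfect predictions)
print(r_squared(y, [4, 3, 2, 1]))          # -3.0 (worse than predicting the mean)
print(adjusted_r_squared(0.5, n=11, k=2))  # 0.375
```

The second call shows that 1 - RSS/TSS goes negative once the predictions are worse than the sample mean. That can happen out of sample or in a regression through the origin. An in-sample OLS fit with an intercept keeps R^2 between 0 and 1, while adjusted R^2 can still be negative.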
● There are 2 jugs with capacities of 5 L and 4 L, and a 20 L bucket. Fill exactly 7 L without
measuring it.
● There are 8 batteries, of which 4 work and 4 don’t. You need 2 working batteries to
make the remote work. What is the minimum number of tries needed to find a working pair?
● What do you think of Nirmala Sitharaman as an FM?
● What is R-square? Does a high R-square entail that the model has high explanatory
power?
● A model has an R-square of 0.2. What would you deduce from such a low
R-square?
● What is hypothesis testing?
● Why do we test for null hypothesis, not alternative hypothesis?
● What is the p-value formula?
● What is a test statistic?
● When do we use the Z-statistic?
● How would you rate yourself in Python?
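For the test-statistic, Z-statistic and p-value questions in the list above, here is a minimal sketch of a one-sample Z-test. It assumes the population standard deviation is known, which is the textbook condition for using Z rather than t, and a two-sided alternative. The function names and sample numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def two_sided_p_value(z):
    """P(|Z| >= |z|) under the null, with Z standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical sample: mean 52 from n = 100 draws, H0: mu = 50, sigma = 10.
z = z_statistic(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(z)                               # 2.0
print(round(two_sided_p_value(z), 4))  # 0.0455
```

At a 5% significance level this p-value would lead to rejecting the null hypothesis. A one-sided test would use a single tail instead of doubling it.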
Some indicators for assessing the performance of a company: lower attrition rate,
volume growth and revenue growth.
What is insider trading? The sale and purchase of securities on the basis of
access to unpublished price-sensitive information; such practices
give an unfair advantage to the entity that is privy to such details.
Who is a guarantor? Someone who guarantees that the loan will be
repaid to the lender. Hence, if the borrower is not able to pay or defaults
on the loan, the lender has every right to pursue the guarantor for
recovery. If you do not pay, or do not receive any communication from the
lender because of a change of address or
for any other reason, this impacts your CIBIL score. Any adverse impact on
the CIBIL score reduces or even eliminates your chance of taking a loan in future,
and it takes time to improve the score and become creditworthy again.
Estimate the total number of dry cleaners in Japan
a) Assume there are two million people in Japan.
b) Estimate the size of market by segmenting the population.
c) Assume the population consists of 25% adult men, 25% adult women,
and 50% children.
d) Assume children have no dry cleaning and only 25% of adults use dry
cleaning.
e) Estimate the average number of “units” of clothing each man and
woman brings weekly to the cleaners.
f) For this case, assume that 3 shirts/blouses and 1 suit are brought to
the cleaners each week.
g) Thus the total size of the market (per week) is one million units of
clothing (1 million adults x 25% x 4 units per person).
h) Estimate the average number of units a dry cleaner can handle per week.
i) Assume that the average dry cleaner has two workers who typically
handle 20–30 customers (or 80–120 units of clothing) per hour.
j) If the average dry cleaner is open eight hours a day, 5 days/week, they
typically handle 3200–4800 units per week (80–120 units x 8 hours x 5
days).
k) Divide the total market size by the average units handled per dry cleaner
to find the total number of dry cleaners.
l) There are between 208 and 312 dry cleaners in Japan.
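The arithmetic of the guesstimate above can be reproduced with a short sketch. All inputs are the case's own stated assumptions (two million people, half of them adults, 25% of adults using dry cleaning, 4 units each per week, 80–120 units per hour, 8 hours a day, 5 days a week). The function and parameter names are illustrative.

```python
def estimate_dry_cleaners(
    population=2_000_000,      # the case's assumed population
    adult_percent=50,          # 25% men + 25% women
    users_percent=25,          # share of adults who use dry cleaning
    units_per_user=4,          # 3 shirts/blouses + 1 suit per week
    units_per_hour=(80, 120),  # low and high throughput of one cleaner
    hours_per_day=8,
    days_per_week=5,
):
    """Return the (low, high) estimate of the number of dry cleaners."""
    adults = population * adult_percent // 100
    users = adults * users_percent // 100
    weekly_units = users * units_per_user  # total market size per week
    low_capacity, high_capacity = (
        rate * hours_per_day * days_per_week for rate in units_per_hour
    )
    # Higher capacity per cleaner means fewer cleaners are needed, and vice versa.
    return weekly_units // high_capacity, weekly_units // low_capacity


print(estimate_dry_cleaners())  # (208, 312)
```

Keeping every assumption as a named parameter makes it easy to see which input drives the answer. For example, doubling the share of adults who use dry cleaning doubles both ends of the range.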