UNIT 1 Introduction to Data Science
PART-A
1. List any five skills required of a data analyst. [K1] [NOV/DEC 2023]
2. Outline the significance of Exploratory Data Analysis. [K1, K4] [NOV/DEC 2023]
3. Tabulate the differences between univariate, bivariate and multivariate analysis. Give
an example of each. [K4] [NOV/DEC 2023]
4. Define Big Data and Data Science. [K1]
5. What is meant by RDBMS? [K2]
6. List the characteristics of Big Data. [K1]
7. Who coined the terms Data Science and Big Data? [K4]
8. State the 4Vs of Big Data. [K1]
9. State the abilities needed for a Data Scientist. [K3]
10. Illustrate the purpose of SQL, NoSQL and SPARQL databases. [K3]
11. Why is Python used in Data Science? [K4]
12. List any 4 benefits of Data Science and Big Data. [K3]
13. State a few applications of Data Science. [K3]
14. Explore the various data types supported for Data Science. [K2]
15. Differentiate between structured and unstructured data with examples. [K4]
16. What is meant by machine-generated data? Give some examples. [K3]
17. Define graph theory. [K1]
18. List the six steps of the Data Science process. [K1]
19. Define the terms: a) Data Cleansing b) Data Integration [K1]
20. List the various components of the Big Data ecosystem. [K2]
21. State the need for distributed file systems. Give some examples. [K2]
22. Differentiate between horizontal and vertical scaling. [K4]
23. Illustrate the purpose of HDFS. [K3]
24. What is meant by Neural Networks? [K1]
25. What are the various types or categories of databases? [K3]
26. List some Big Data scheduling tools. [K3]
27. Define service programming. [K1]
28. List the pros and cons of the structured approach. [K1]
29. Define the agile process. [K1]
30. State the purpose of using a project charter. [K2]
31. List a few data repositories used to store data. [K3]
32. Define: a) Interpretation error b) Data entry error c) Redundant whitespace [K1]
33. What do you mean by outliers? [K1]
34. What are the techniques available to handle missing values? [K2]
35. List some common errors and a possible solution for each. [K2]
36. What are the possible ways to combine information from different data sets? [K3]
37. State the process involved in appending or stacking tables. [K4]
38. Draw a Pareto diagram or 80-20 diagram for an example of your choice. [K1]
39. List the components of model building. [K4]
40. Illustrate the purpose of each data science component. [K3]
41. Discuss the various job roles and skills required for Data Science. [K2]
42. Define the terms: a) People Analytics b) Text Mining c) EDA [K1]
43. State the process of brushing and linking. [K1]
44. What is the process for converting a normal variable into a dummy variable, and why is it done?
[K3, K4]
45. List a few challenges of Big Data. [K4]
46. What are the tools used for Big Data? [K3]
PART-B & C
1. Describe in detail the benefits and uses of Data Science and Big Data. [K1]
2. Explain the facets of data (or) What are the various forms of data? Explain each
type with an example. [K2, K4]
3. Differentiate between structured and unstructured data with examples. [K4]
4. Briefly discuss: a) Natural language b) Machine-generated
c) Graph-based d) Streaming e) Audio, video and images [K2]
5. Briefly explain the steps involved in the data science process. [K1]
6. Write short notes on:
a) Setting the research goals
b) Retrieving data
c) Data preparation. [K1]
7. What is meant by a project charter? How is it created? Explain each requirement. [K1,
K4]
8. State the purpose of cleansing, integrating and transforming data. [K4]
9. Describe the process involved in exploratory data analysis. [K1]
10. Describe the data modeling process in detail. [K4]
11. How are presentation and automation achieved in the data science process? Explain the
process with a neat sketch. [K4]
12. Write short notes on: a) Model and variable selection b) Model execution c) Model
diagnostics and comparison. [K1]