Reg. No.:
P18PECS021
(Only the above code is to be shaded in the answer book)
Bharath Institute of Higher Education & Research, Chennai – 73
Ph.D. Department, May/June 2025
P18PECS021 – Data Science
Time: 3 Hrs                                     Maximum: 100 Marks

Part A (10 x 2 = 20)
Answer All Questions
1. What is Data Science?
2. Explain the concept of 'Technology' in the context of data science.
3. What are the different sources of data in data science?
4. Define API and its role in data collection.
5. Define variance and its significance in data analysis.
6. What is the Central Limit Theorem (CLT)?
7. What are the main types of data visualization?
8. Explain the role of "retinal variables" in data visualization.
9. List two applications of data science in the healthcare industry.
10. What is Bokeh and how is it used for data visualization in Python?
1. What is Data Science?
Answer: Data Science is an interdisciplinary field that uses scientific
methods, processes, algorithms, and systems to extract knowledge and
insights from structured and unstructured data.
Example: Netflix uses data science to analyze users' watch history
and recommend personalized movie suggestions.
2. Explain the concept of 'Technology' in the context of Data Science.
Answer: Technology in Data Science refers to the tools, software,
and frameworks used to collect, store, process, analyze, and visualize
data.
Example: Python and R are popular programming languages used
for data analysis, machine learning, and visualization in Data
Science.
3. What are the different sources of data in Data Science?
Answer: Data can come from multiple sources, including:
- Structured sources (databases like MySQL)
- Unstructured sources (social media posts, text files)
- Sensor data (IoT devices)
- Web data (scraped web pages)
Example: Google Analytics collects web data to help businesses
understand user behavior on their websites.
4. Define API and its role in data collection.
Answer: An API (Application Programming Interface) allows
different software applications to communicate and exchange data.
Example: The Twitter API enables developers to fetch tweets, analyze
trends, and perform sentiment analysis.
5. Define variance and its significance in data analysis.
Answer: Variance measures the average squared deviation of data points
from the mean (Var = Σ(xᵢ − μ)² / n), indicating how spread out the data is.
Example: If a company's monthly sales vary widely, the variance will
be high, indicating inconsistency in performance.
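A minimal Python sketch, using the standard-library statistics module and made-up monthly sales figures, showing the two common variance conventions:

```python
import statistics

monthly_sales = [120, 80, 150, 60, 190]  # hypothetical figures

mean = statistics.mean(monthly_sales)                 # 120
var_population = statistics.pvariance(monthly_sales)  # divides by n: 2200
var_sample = statistics.variance(monthly_sales)       # divides by n - 1: 2750

print(mean, var_population, var_sample)
```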
6. What is the Central Limit Theorem (CLT)?
Answer: CLT states that the distribution of the sample means will
approximate a normal distribution as the sample size increases,
regardless of the population distribution (provided the population
variance is finite).
Example: If a researcher collects multiple samples of student heights
from different schools, their average height distribution will tend
toward a normal curve.
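A small NumPy simulation (with arbitrary parameters) that illustrates the theorem using a deliberately skewed population:

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw from a heavily skewed (exponential) population: clearly not normal.
population = rng.exponential(scale=2.0, size=100_000)

# Take 10,000 samples of size 50 and record each sample's mean.
sample_means = np.array([
    rng.choice(population, size=50).mean() for _ in range(10_000)
])

# The sample means cluster around the population mean (about 2.0) in a
# near-normal bell shape, even though the population itself is skewed.
print(population.mean(), sample_means.mean(), sample_means.std())
```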
7. What are the main types of data visualization?
Answer: Common data visualization types include:
- Bar Charts (Comparing categories)
- Line Graphs (Trends over time)
- Scatter Plots (Relationships between variables)
- Heatmaps (Patterns in large datasets)
Example: A company might use a line graph to track its monthly
revenue growth.
8. Explain the role of "retinal variables" in data visualization.
Answer: Retinal variables, such as size, shape, color, and orientation,
help in encoding data visually to improve understanding.
Example: Heatmaps use color intensity to show variations in values,
making patterns more recognizable.
9. List two applications of data science in the healthcare industry.
Answer:
- Disease Prediction: AI models analyze patient symptoms and
medical history to predict illnesses early.
- Medical Imaging Analysis: Machine learning aids in detecting
anomalies in X-rays and MRIs.
Example: IBM Watson helps doctors diagnose diseases by analyzing
vast amounts of medical data.
10. What is Bokeh and how is it used for data visualization in Python?
Answer: Bokeh is a Python library that creates interactive and
visually appealing visualizations for web applications.
Example: A data analyst can use Bokeh to build an interactive
dashboard displaying real-time sales trends.
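A minimal Bokeh sketch with hypothetical sales figures; show() renders an interactive plot (with a pan/zoom toolbar) in the browser:

```python
from bokeh.plotting import figure, show

# Hypothetical monthly sales figures.
months = [1, 2, 3, 4, 5, 6]
sales = [120, 135, 150, 145, 170, 190]

p = figure(title="Monthly Sales", x_axis_label="Month",
           y_axis_label="Sales")
p.line(months, sales, line_width=2)

show(p)  # opens an interactive HTML plot in the browser
```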
Part B (5 x 6 = 30)
Answer either (a) or (b) from each question
1. (a) Discuss the role of Data Science in modern industries and how it
drives decision-making processes. (or)
(b) Explain the importance of the Data Science Process and its stages
in transforming raw data into actionable insights.
2. (a) Analyze the challenges involved in collecting data from multiple
sources and propose strategies for managing and integrating these
data sources. (or)
(b) Discuss the significance of APIs in modern data collection. How
do they facilitate data exchange across systems?
3. (a) Examine the concept of central tendency in statistics. How do
measures like mean, median, and mode contribute to understanding
data? (or)
(b) Discuss the Central Limit Theorem (CLT) and its significance in
statistical analysis and hypothesis testing.
4. (a) Describe the different types of data visualizations and the types
of data each is best suited for. (or)
(b) Evaluate the importance of mapping variables to visual
encodings in data visualization and how it impacts the clarity of
insights.
5. (a) Discuss the various applications of data science in sectors like
healthcare, finance, and e-commerce, and how they contribute to
solving industry-specific challenges. (or)
(b) Analyze the role of Bokeh in Python for creating interactive
visualizations. How does it differ from other visualization libraries
like Matplotlib and Seaborn?
1. (a) Role of Data Science in Modern Industries & Decision-
Making
Data Science plays a crucial role in modern industries by
analyzing large volumes of data to uncover patterns, trends, and
insights that drive business decisions. Companies across various
sectors use Data Science to optimize operations, improve
customer experience, and increase profitability.
Key Roles in Different Industries:
- Healthcare: Predict disease outbreaks, analyze patient
records for personalized treatments.
- Finance: Detect fraudulent transactions, assess credit risks.
- Retail & E-commerce: Enhance customer recommendations,
forecast demand trends.
- Manufacturing: Optimize supply chain efficiency, reduce
production downtime.
- Marketing: Perform sentiment analysis, target
advertisements effectively.
Example:
Netflix uses Data Science to analyze user preferences and
recommend personalized content, leading to increased
engagement and subscriptions.
1. (b) Importance of the Data Science Process
The Data Science process is a structured approach for
converting raw data into actionable insights. It consists of
multiple stages that ensure data-driven decision-making.
Stages of the Data Science Process:
1. Data Collection: Gathering relevant data from sources like
APIs, databases, and web scraping.
2. Data Cleaning: Removing inconsistencies, missing values,
and errors.
3. Exploratory Data Analysis (EDA): Understanding data
patterns and relationships using visualization.
4. Feature Engineering: Creating meaningful variables for
better predictive models.
5. Model Building & Training: Applying Machine Learning
algorithms to derive insights.
6. Evaluation & Deployment: Assessing model accuracy and
integrating insights into decision-making.
7. Monitoring & Improvement: Continuously refining models
based on new data.
Example:
A retail company may use Data Science to forecast sales by
analyzing past trends, seasonal effects, and customer buying
habits.
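A minimal end-to-end sketch of these stages, assuming a hypothetical sales.csv with month, ad_spend, and sales columns (pandas and scikit-learn):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# 1. Collection: load a hypothetical sales history file.
df = pd.read_csv("sales.csv")  # assumed columns: month, ad_spend, sales

# 2. Cleaning: drop rows with missing values.
df = df.dropna()

# 3. EDA: quick summary statistics.
print(df.describe())

# 4-5. Feature selection and model training.
X = df[["month", "ad_spend"]]
y = df["sales"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LinearRegression().fit(X_train, y_train)

# 6. Evaluation: R^2 on held-out data before deployment.
print("R^2:", model.score(X_test, y_test))
```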
2. (a) Challenges in Collecting Data from Multiple Sources &
Solutions
Data collection from various sources presents several challenges,
including inconsistencies, privacy concerns, and integration
difficulties.
Challenges:
- Data Format Variability: Different sources use different
formats, making integration complex.
- Data Accuracy & Quality Issues: Incorrect, incomplete, or
duplicated data can mislead analytics.
- Scalability Concerns: Handling large datasets requires
efficient infrastructure.
- Security & Privacy Regulations: Compliance with data
protection laws (e.g., GDPR).
Strategies to Overcome Challenges:
- Standardizing data formats using ETL (Extract, Transform,
Load) processes.
- Implementing data validation techniques to ensure quality.
- Using cloud storage solutions for scalability.
- Adopting encryption and authentication for secure data
handling.
Example:
A company aggregating data from social media, website
analytics, and customer surveys must harmonize different
formats before drawing insights.
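As an illustration of the ETL idea above, a small pandas sketch (with made-up web-analytics and survey records) that standardizes two source formats before combining them:

```python
import pandas as pd

# Two hypothetical sources with different field names and date formats.
web = pd.DataFrame({"visit_date": ["2025-01-05"], "visitors": [420]})
survey = pd.DataFrame({"Date": ["05/01/2025"], "Respondents": [35]})

# Transform: rename to one schema and parse dates consistently.
web = web.rename(columns={"visit_date": "date", "visitors": "count"})
web["date"] = pd.to_datetime(web["date"])
survey = survey.rename(columns={"Date": "date", "Respondents": "count"})
survey["date"] = pd.to_datetime(survey["date"], dayfirst=True)

# Load: combine into a single standardized table.
combined = pd.concat([web.assign(source="web"),
                      survey.assign(source="survey")])
print(combined)
```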
2. (b) Significance of APIs in Modern Data Collection
APIs (Application Programming Interfaces) facilitate seamless
data exchange between applications and systems, enabling real-
time data retrieval and automation.
How APIs Help in Data Collection:
- Automated Data Access: Allows direct extraction without
manual entry.
- Interoperability Across Systems: Enables different platforms
to communicate.
- Scalability & Efficiency: Handles high-volume data requests
dynamically.
- Real-Time Insights: Provides up-to-date information for
decision-making.
Example:
The Twitter API allows businesses to fetch live tweets, analyze
trends, and monitor customer sentiments to refine marketing
strategies.
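A hedged sketch of API-based collection with Python's requests library; the endpoint URL and parameters below are purely hypothetical, and real APIs such as Twitter/X additionally require authentication tokens:

```python
import requests

# Hypothetical REST endpoint (not a real service).
url = "https://api.example.com/v1/posts"
params = {"query": "data science", "limit": 50}

response = requests.get(url, params=params, timeout=10)
response.raise_for_status()   # fail loudly on HTTP errors

posts = response.json()       # structured data, ready for analysis
print(len(posts))
```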
3. (a) Concept of Central Tendency in Statistics
Central tendency describes how data points are distributed
around a central value, helping summarize data with key
metrics: Mean, Median, and Mode.
Definitions:
- Mean (Average): Sum of all values divided by total count.
- Median: Middle value when data is sorted.
- Mode: Most frequently occurring value.
Example:
For exam scores of [60, 70, 80, 90, 100]:
- Mean = (60+70+80+90+100)/5 = 80
- Median = 80 (middle value)
- Mode = none here, since every score occurs exactly once; if 80
appeared more than once, it would be the mode.
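A quick check with Python's standard-library statistics module (the list is altered slightly so that a mode exists):

```python
import statistics

scores = [60, 70, 80, 80, 90, 100]  # made-up marks; 80 repeats

print(statistics.mean(scores))    # 80
print(statistics.median(scores))  # 80 (average of the two middle values)
print(statistics.mode(scores))    # 80 (the most frequent value)
```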
3. (b) Central Limit Theorem (CLT) in Statistics
The CLT states that the distribution of sample means
approximates a normal distribution as the sample size
increases, regardless of the original data's distribution.
Significance of CLT:
- Helps estimate population characteristics from sample data.
- Forms the foundation of hypothesis testing and confidence
intervals.
- Enables predictions in financial risk analysis, healthcare, and
business intelligence.
Example:
If multiple small samples of customer purchasing amounts are
taken, their average will eventually form a normal distribution.
4. (a) Types of Data Visualizations & Their Uses
Data visualization helps make complex data understandable
through graphical representations. Different types serve various
purposes:
- Bar Chart: Comparing categorical data (e.g., sales by
product category).
- Line Graph: Showing trends over time (e.g., stock prices).
- Scatter Plot: Illustrating relationships (e.g., correlation
between height and weight).
- Heatmap: Displaying patterns in a large dataset (e.g., website
user activity).
Example:
A heatmap visualizing user clicks on an e-commerce website
helps identify popular sections.
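A minimal Matplotlib sketch, using made-up revenue figures, showing two of these chart types side by side:

```python
import matplotlib.pyplot as plt

# Hypothetical revenue figures.
months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [10, 12, 9, 15]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.bar(months, revenue)               # bar chart: comparing categories
ax1.set_title("Revenue by Month")

ax2.plot(months, revenue, marker="o")  # line graph: trend over time
ax2.set_title("Revenue Trend")

plt.tight_layout()
plt.show()
```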
4. (b) Mapping Variables to Visual Encodings
Retinal variables (size, shape, color, orientation) help in visual
encoding, improving clarity and interpretation.
Impact of Proper Mapping:
- Enhances user comprehension.
- Avoids misleading interpretations.
- Highlights key insights effectively.
Example:
Using different colors for data points in a scatter plot to
distinguish between groups improves readability.
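A small Matplotlib sketch (hypothetical data) that uses two retinal variables at once, color for group membership and size for magnitude:

```python
import matplotlib.pyplot as plt

# Hypothetical data: color encodes group, marker size encodes weight.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 1, 5, 3, 6]
group = ["A", "A", "B", "B", "A", "B"]
colors = ["tab:blue" if g == "A" else "tab:orange" for g in group]
sizes = [40, 80, 120, 60, 100, 150]

plt.scatter(x, y, c=colors, s=sizes)
plt.title("Color and size as retinal variables")
plt.show()
```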
5. (a) Applications of Data Science in Various Industries
Data Science has revolutionized numerous industries:
- Healthcare: Disease prediction, personalized treatment
recommendations.
- Finance: Fraud detection, stock market trend analysis.
- E-commerce: Customer preference analysis,
recommendation engines.
Example:
Amazon uses predictive analytics to suggest products based on
browsing and purchase history.
5. (b) Role of Bokeh in Python for Interactive Visualizations
Bokeh is a Python visualization library known for its interactive
and web-based visualizations.
Comparison with Other Libraries:
- Matplotlib: Suitable for static plots (e.g., reports).
- Seaborn: Enhances statistical visualizations (e.g., correlation
matrices).
- Bokeh: Best for dynamic dashboards and interactive
applications.
Example:
A data analyst can use Bokeh to create interactive stock market
graphs where users can zoom, hover, and filter data.
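A hedged sketch of Bokeh's interactivity, adding hover tooltips that a static Matplotlib figure cannot provide; the price data is made up:

```python
from bokeh.plotting import figure, show
from bokeh.models import HoverTool

# Hypothetical daily closing prices.
days = [1, 2, 3, 4, 5]
price = [101.2, 103.5, 102.8, 105.1, 104.3]

p = figure(title="Closing Price", x_axis_label="Day",
           y_axis_label="Price")
p.line(days, price, line_width=2)

# Interactivity that static output lacks: tooltips on hover.
p.add_tools(HoverTool(tooltips=[("day", "@x"), ("price", "@y")]))

show(p)
```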
Part C (5 x 10 = 50)
Answer Five questions out of Seven
1. Discuss the core concepts of Data Science and how they are applied
across various industries such as healthcare, finance, and retail.
2. Analyze the different sources of data used in Data Science and
explain how they contribute to the data collection process.
3. Explain the concept of central tendency and its importance in
statistical analysis. Discuss how measures like mean, median, and
mode are used to summarize a dataset.
4. Discuss the importance of data visualization in Data Science and
explain how it aids in decision-making.
5. Analyze the various applications of Data Science in business,
particularly in areas like marketing, customer segmentation, and
fraud detection.
6. Discuss the challenges of visualizing time-series data and the
techniques that can be used to overcome them. How can
visualization help in identifying trends and patterns in such data?
7. Explain the importance of structure learning techniques in graph
mining. How do constraint-based and score-based algorithms differ,
and when would you use each approach?
1. Core Concepts of Data Science and Their Applications Across
Industries
Data Science is a multidisciplinary field that involves extracting
meaningful insights from structured and unstructured data
using statistical, machine learning, and computational
techniques. Its core concepts include data collection, cleaning,
analysis, visualization, and interpretation.
Applications Across Industries:
- Healthcare: Data Science helps in predictive analytics,
personalized treatment, and medical imaging. For example, AI-
driven diagnostics assist radiologists in detecting diseases like
cancer from medical scans.
- Finance: Fraud detection, risk assessment, and algorithmic
trading are major applications. Banks utilize machine learning
models to detect anomalies in transaction patterns.
- Retail: Data Science enhances customer segmentation,
inventory management, and personalized marketing strategies.
Companies like Amazon use recommendation systems to provide
personalized product suggestions.
2. Sources of Data in Data Science and Their Contribution to
Data Collection
Data Science relies on various sources of data that contribute to
the data collection process:
- Structured Data: Organized into rows and columns (e.g.,
relational databases, spreadsheets).
- Unstructured Data: Includes images, text, videos, social
media posts, and logs.
- Real-Time Data: Streaming data from sensors, financial
transactions, or user interactions.
- Public Data: Open-source datasets such as government
statistics, research papers, or census information.
Each type of data contributes to a holistic analysis, enabling
deeper insights into trends, behaviors, and anomalies.
3. Central Tendency and Its Importance in Statistical Analysis
Central tendency measures summarize a dataset by identifying a
central value around which the data is distributed. The three key
measures include:
- Mean: The arithmetic average of a dataset, used in academic
grading and financial metrics.
- Median: The middle value, useful in skewed distributions
like income levels.
- Mode: The most frequently occurring value, applied in
categorical data analysis.
Understanding central tendency is crucial for decision-making,
as it provides a representative value that simplifies complex
datasets.
4. Importance of Data Visualization in Data Science and Its Role
in Decision-Making
Data visualization translates complex data into graphical
representations, making it easier to identify patterns, trends, and
outliers.
Advantages:
- Enhanced Understanding: Graphs, charts, and heatmaps
provide intuitive insights.
- Improved Decision-Making: Businesses use dashboards for
real-time monitoring of performance.
- Simplified Communication: Visual data is more accessible to
stakeholders with varying technical expertise.
For example, a retailer analyzing sales performance using bar
graphs can easily identify seasonal trends and adjust inventory
accordingly.
5. Applications of Data Science in Business: Marketing,
Customer Segmentation, and Fraud Detection
Businesses leverage Data Science to enhance operations and
profitability:
- Marketing: Predictive analytics optimizes advertisement
targeting and pricing strategies.
- Customer Segmentation: Clustering techniques group
customers based on preferences, leading to personalized
promotions.
- Fraud Detection: Banks implement anomaly detection
models to flag suspicious transactions, preventing financial
losses.
For example, Netflix uses Data Science to recommend content
based on viewing history, increasing user engagement.
6. Challenges of Visualizing Time-Series Data and Overcoming
Them
Time-series data, which tracks variables over time, poses unique
visualization challenges:
- Data Overload: Large datasets may clutter graphs.
- Seasonality and Trends: Identifying recurring patterns can
be complex.
- Noise in Data: External fluctuations obscure meaningful
insights.
Techniques to Overcome Challenges:
- Smoothing Techniques: Moving averages help reduce noise.
- Interactive Visualizations: Dynamic graphs allow zooming
into specific time frames.
- Feature Engineering: Extracting time-based features
improves pattern recognition.
Visualizing stock market trends using candlestick charts helps
traders make informed investment decisions.
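A short pandas sketch of the moving-average technique above, using a synthetic noisy daily series:

```python
import numpy as np
import pandas as pd

# A noisy hypothetical daily series: gentle upward trend plus noise.
rng = np.random.default_rng(0)
dates = pd.date_range("2025-01-01", periods=90, freq="D")
values = np.linspace(100, 120, 90) + rng.normal(0, 5, 90)
series = pd.Series(values, index=dates)

# A 7-day moving average smooths the noise so the trend stands out.
smoothed = series.rolling(window=7).mean()

print(series.head(8))
print(smoothed.head(8))  # first 6 values are NaN (incomplete window)
```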
7. Importance of Structure Learning in Graph Mining and
Comparison of Algorithms
Graph mining helps uncover relationships in interconnected
data, such as social networks and transportation systems.
Structure Learning in Graph Mining:
It involves identifying underlying patterns in graphs to optimize
decision-making.
Comparison of Algorithms:
- Constraint-Based Algorithms: Use conditional independence tests to
decide which edges can exist (e.g., the PC algorithm for learning
Bayesian networks).
- Score-Based Algorithms: Search over candidate structures to maximize
a scoring function (e.g., BIC or Maximum Likelihood Estimation).
Constraint-based approaches are preferable when domain
knowledge is available, while score-based methods work well for
complex data with unknown relationships.
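A toy, self-contained sketch of the score-based idea: comparing BIC-style scores of candidate parent sets for one variable on synthetic linear-Gaussian data. This is purely illustrative; real libraries search over full DAG structures:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
c = rng.normal(size=n)
b = 2 * a + rng.normal(size=n)  # true structure: a -> b, c unrelated
data = {"a": a, "b": b, "c": c}

def bic(child, parents):
    """BIC of a linear-Gaussian model child ~ parents (up to constants)."""
    y = data[child]
    cols = [data[p] for p in parents] + [np.ones(n)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    loglik = -0.5 * n * np.log(resid.var())
    return loglik - 0.5 * len(cols) * np.log(n)  # penalize parameter count

for parents in [[], ["a"], ["c"], ["a", "c"]]:
    print(parents, round(bic("b", parents), 1))
# The true parent set ["a"] scores best: adding the irrelevant "c"
# is penalized more than it improves the fit.
```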
******