Ch.4.Data Science X-1

Uploaded by

manojthaware1972
Copyright
© © All Rights Reserved

STD-X AI PART-B

UNIT. 4 – DATA SCIENCE


Q1. Distinguish between data acquisition and data exploration.
Data Acquisition deals with collecting or obtaining data from different sources. What is collected
depends on the needs of the project. For example, traffic data for a traffic light management system.
Data Exploration, on the other hand, deals with finding patterns/trends in the acquired data. The AI
system will use these findings for solving the given problem.

Q2. How are NumPy arrays better than Python lists?


NumPy arrays have a number of advantages over Python lists. They are more convenient to use for
numerical work, they are much faster than Python lists for operations on large amounts of data, and
they consume less memory because their data structure takes less space.
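
As a rough illustration of these points, the sketch below doubles every value in a list and in a NumPy array and compares how much memory the data takes. The sample values and variable names are assumptions made for this example only.

```python
import sys
import numpy as np

# Illustrative sample data (assumed): the integers 0..999
values = list(range(1000))
arr = np.array(values, dtype=np.int64)

# Convenience: arithmetic applies to the whole array at once
doubled_arr = arr * 2

# A plain list needs an explicit loop (here, a list comprehension)
doubled_list = [v * 2 for v in values]

# Memory: the array stores raw 8-byte integers side by side,
# while the list stores references to separate Python int objects
array_bytes = arr.nbytes
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(array_bytes, list_bytes)
```

The exact list size depends on the Python version, but it is normally several times larger than the array. Speed can be compared in a similar way with the standard `timeit` module.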

Q3. Explain some errors that can occur during data collection.
Some of the errors possible in data collection are:
1. Collection of incorrect values
2. Collecting invalid or null values
3. Data missing from dataset cells
4. Out of range data in the datasets
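
Some of these errors can be spotted with simple checks. The sketch below is a minimal example in plain Python; the readings, the valid range and the function name `classify` are all hypothetical.

```python
# Hypothetical temperature readings in degrees Celsius (assumed data)
readings = [21.5, None, 19.0, 250.0, "n/a", 22.1]

# Assumed valid range for this example
VALID_MIN, VALID_MAX = -50.0, 60.0

def classify(value):
    """Label a single reading with the kind of problem it shows, if any."""
    if value is None:
        return "missing"
    if not isinstance(value, (int, float)):
        return "invalid"
    if not VALID_MIN <= value <= VALID_MAX:
        return "out of range"
    return "ok"

labels = [classify(v) for v in readings]
print(labels)
```

Note that an incorrect value that still looks reasonable (for example 21.5 recorded instead of 12.5) cannot be caught by checks like these; it has to be verified against the source.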

Q4. Explain the KNN model.


KNN or K-Nearest Neighbours is an algorithm used for supervised machine learning. KNN can be
used for both regression and classification tasks. This method examines the labels of a selected
number (K) of data points nearest to a target data point to predict the class into which the target
data point falls.
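
The idea can be sketched in a few lines of plain Python. This is a minimal illustration for classification, assuming straight-line (Euclidean) distance and a simple majority vote; the function name `knn_predict` and the sample points are hypothetical.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, target, k=3):
    """Predict the label of `target` by majority vote among its k nearest neighbours."""
    # Distance from the target to every labelled training point
    distances = [
        (math.dist(point, target), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The most common label among those neighbours is the prediction
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D data: three points of class "A", three of class "B"
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (2, 2)))
print(knn_predict(points, labels, (6, 5)))
```

For a regression task, the vote would be replaced by the average of the neighbours' numeric values.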

Q5. Write a short note on the Five-Factor Model.


The Five-Factor Model is used for measuring the following five personality traits:
Openness: The trait of an individual to accept new things
Conscientiousness: The trait of being particular, organised, watchful etc.
Extroversion: The trait related to being social and outgoing
Agreeableness: The trait measuring friendliness of people with each other
Neuroticism: The trait related to controlling one's own emotions

Q6. In what ways can an AI capture data?


AI systems can collect or capture data from multiple sources. For example, eye-tracking sensors can
be used for capturing eye movement and body language, smartphones can be used for
capturing the usage data of the user, smart cameras can be used for capturing video data, data
streams can be used for capturing live data, etc.

Q7. What is data visualisation? How can we use Python for data visualisation?
Data visualisation is a technique for presenting data graphically in order to understand it and get
insights from it. The two most basic forms of data visualisation are graphs and charts. Python,
through plotting libraries such as Matplotlib, provides different plots for data visualisation, like
Line Plot, Histogram, Scatter Plot, Bar Chart, Box and Whisker Plot, etc.
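
A minimal sketch of two of these plots, assuming the Matplotlib library is installed. The marks, labels and output file name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script also runs without a display
import matplotlib.pyplot as plt

# Hypothetical marks of ten students (assumed data)
marks = [35, 42, 48, 55, 55, 61, 67, 70, 78, 90]

# One figure with two plots side by side
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line plot: marks in the order they were recorded
ax_line.plot(range(1, len(marks) + 1), marks, marker="o")
ax_line.set_title("Line plot")
ax_line.set_xlabel("Student")
ax_line.set_ylabel("Marks")

# Histogram: how many students fall into each range of marks
ax_hist.hist(marks, bins=5)
ax_hist.set_title("Histogram")
ax_hist.set_xlabel("Marks")
ax_hist.set_ylabel("Number of students")

fig.tight_layout()
fig.savefig("marks_plots.png")  # hypothetical output file name
```

In an interactive session, the `matplotlib.use("Agg")` line can be dropped and `plt.show()` called instead of saving to a file.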

Q8. What are Pandas and SciPy? Why are they used?
Pandas and SciPy are Python libraries that assist in scientific computation and data analysis.
Pandas deals with operations on and manipulation of structured data. It provides facilities for data
cleaning and preparation. SciPy is one of the commonly used libraries for advanced-level science and
engineering functions. It provides access to Linear Algebra, Optimisation, Sparse Matrices, etc.
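
The short sketch below shows one typical task for each library, assuming both are installed: Pandas fills in a missing value in a small table, and SciPy solves a pair of linear equations. The names, marks and equations are invented for this example.

```python
import numpy as np
import pandas as pd
from scipy import linalg

# Pandas: clean a small, hypothetical table with one missing value
df = pd.DataFrame({"name": ["Asha", "Ravi", "Meena"],
                   "marks": [80.0, None, 90.0]})
# Fill the gap with the mean of the values that are present
df["marks"] = df["marks"].fillna(df["marks"].mean())

# SciPy (linear algebra): solve the system  2x + y = 5,  x + 3y = 10
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
x, y = linalg.solve(A, b)

print(df)
print(x, y)
```

Filling with the mean is only one possible cleaning choice; dropping the incomplete row with `dropna()` is another.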

Q9. Explain at least three different uses of data science in the financial sector.
The uses of data science in the financial sector include:
Risk Analytics: this deals with analysing the risk associated with financial transactions like providing
loans or insuring something.
Fraud Detection: analysis of big data allows the financial sector to reduce frauds and scams.
Personalised Services: the financial sector uses data science for providing personalised services to
clients at the best possible rate.

Q10. What is problem scoping?


Problem scoping is part of the planning stage of an AI project. It starts once the problem is
identified and includes establishing goals, identifying stakeholders, finding out what is currently
being done to deal with the problem, and identifying the ethical concerns related to the problem.
Problem scoping establishes the limits of the problem.

Worksheet -1
Find out at least three uses of Data Science in the following sectors:

1. Education: Providing adaptive learning opportunities to students. Assisting in improving teacher
assessments.
2. Hospitality: Assisting hotels in predicting demand and customer behaviour. Automated dynamic
pricing systems for ensuring maximum revenue generation.
3. Defence: Assisting in smart targeting of enemy installations. Assisting in threat risk assessments.
4. Real Estate: Providing help with property valuations. Risk mitigation based on predictive models.
5. Law Enforcement: Smart surveillance systems. Assisting with criminal identification.

Worksheet -2
Recently, several unauthorised persons were seen roaming around in your colony. For dealing
with this situation, you have been asked to design a smart security system, which will allow only
authorised people to enter the colony.

Q1. What data will you need for designing and implementing the system?

1. The images of all the individuals authorised to enter the colony


2. The details of all the individuals authorised to enter the colony

Q2. From where will you collect this data?


1. Colony records or asking the authorised individuals to provide information.

Worksheet -3
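Mean: sum of all the terms / number of terms
Median: the {(n+1)/2}th value of the data arranged in order, where n = number of values in the data
set. When n is even, the median is the average of the two middle values.
Mode: the most frequently occurring value/observation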
Standard Deviation: $\sigma = \sqrt{\dfrac{\sum |x - \mu|^2}{N}}$, where $\sum$ = sum of, $x$ = value in the data set,
$\mu$ = mean of the data set and $N$ = number of data points
Variance: $S^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1}$, where $S^2$ = sample variance, $x_i$ = the value of one
observation, $\bar{x}$ = the mean value of all observations and $n$ = the number of observations.
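
These measures can be checked with Python's built-in `statistics` module. The data set below is made up for illustration. Note that `pstdev` divides by N (as in the standard deviation formula above), while `variance` divides by n - 1 (as in the sample variance formula above).

```python
import statistics

# Hypothetical data set (assumed for illustration)
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # sum of terms / number of terms
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequently occurring value
std_dev = statistics.pstdev(data)     # population standard deviation (divides by N)
variance = statistics.variance(data)  # sample variance (divides by n - 1)

print(mean, median, mode, std_dev, variance)
```

Here the mean is 5, the median is 4.5 (the average of the two middle values, 4 and 5), the mode is 4, the standard deviation is 2 and the sample variance is 32/7, about 4.57.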

Q. Browse the internet and find out the names of five personality structure models like the Big Five
Model. Also, find out the names of the people who introduced each model.

1. Sixteen Personality Factor Questionnaire (Raymond Cattell)
2. Myers–Briggs Type Indicator (Katharine Cook Briggs and Isabel Briggs Myers)
3. Keirsey Temperament Sorter (David Keirsey)
4. Enneagram of Personality (Oscar Ichazo is generally recognised for making this model known)
5. Type A and Type B personality theory (Meyer Friedman and Ray Rosenman)

Worksheet -4

Q. Find out five uses of the K-Nearest Neighbour model.

1. Smart surveillance, for example detecting hidden packages at the bottom of shopping
carts.
2. Creating recommendation systems based on what the customer buys/watches/listens to, etc.
3. Finding documents which are semantically similar.
4. Predicting illnesses, for example breast cancer cases.
5. Stock market predictions for trading.
