DELHI PUBLIC SCHOOL

BANGALORE EAST

WORKSHEETS
2023 – 2024
ARTIFICIAL INTELLIGENCE

NAME :
CLASS: XII SECTION :
CONTENTS

1. UNIT 1: CAPSTONE PROJECT
2. UNIT 2: MODEL LIFE CYCLE
3. UNIT 3: STORYTELLING THROUGH DATA


DELHI PUBLIC SCHOOL BANGALORE - EAST
ARTIFICIAL INTELLIGENCE
UNIT I: CAPSTONE PROJECT
NAME: CLASS:XII SEC: DATE:

A capstone project is a culminating assignment on which students usually work during their final year in school or at the end of the academic program. It requires different intellectual activities. This project helps young people learn how to find and analyze information and how to work with it efficiently.
An AI project follows these six steps:
1) Problem definition i.e. Understanding the problem
2) Data gathering
3) Feature definition
4) AI model construction
5) Evaluation & refinements
6) Deployment

What is Design Thinking?


Design thinking is a non-linear, iterative process that teams use to understand users, challenge
assumptions, redefine problems and create innovative solutions to prototype and test. Involving five
phases—Empathize, Define, Ideate, Prototype and Test—it is most useful to tackle problems that are
ill-defined or unknown.

Stage 1: Empathize—Research Your Users' Needs

Here, you should gain an empathetic understanding of the problem you’re trying to solve,
typically through user research. Empathy is crucial to a human-centered design process such as
design thinking because it allows you to set aside your own assumptions about the world and gain
real insight into users and their needs.

Stage 2: Define—State Your Users' Needs and Problems

It’s time to accumulate the information gathered during the Empathize stage. You then analyze
your observations and synthesize them to define the core problems you and your team have
identified. These definitions are called problem statements. You can create personas to help keep
your efforts human-centered before proceeding to ideation.

Stage 3: Ideate—Challenge Assumptions and Create Ideas

Now, you’re ready to generate ideas. The solid background of knowledge from the first two
phases means you can start to “think outside the box”, look for alternative ways to view the
problem and identify innovative solutions to the problem statement you’ve
created. Brainstorming is particularly useful here.

Stage 4: Prototype—Start to Create Solutions

This is an experimental phase. The aim is to identify the best possible solution for each problem
found. Your team should produce some inexpensive, scaled-down versions of the product (or
specific features found within the product) to investigate the ideas you’ve generated. This could
involve simply paper prototyping.

Stage 5: Test—Try Your Solutions Out

Evaluators rigorously test the prototypes. Although this is the final phase, design thinking is
iterative: Teams often use the results to redefine one or more further problems. So, you can return
to previous stages to make further iterations, alterations and refinements – to find or rule out
alternative solutions.

Overall, you should understand that these stages are different modes which contribute to the entire
design project, rather than sequential steps. Your goal throughout is to gain the deepest
understanding of the users and what their ideal solution/product would be.

3. Analytic approach:
Those who work in the domain of AI and Machine Learning solve problems and answer questions
through data every day. They build models to predict outcomes or discover underlying patterns, all to
gain insights leading to actions that will improve future outcomes.

• If the question is to determine probabilities of an action, then a predictive model might be
used.

• If the question is to show relationships, a descriptive approach may be required.

• Statistical analysis applies to problems that require counts: if the question requires a yes/no
answer, then a classification approach to predicting a response would be suitable.
4. Data requirement:

In this phase the data requirements are revised and decisions are made as to whether or not the
collection requires more or less data. Once the data ingredients are collected, the data scientist will
have a good understanding of what they will be working with.
Techniques such as descriptive statistics and visualization can be applied to the data set, to
assess the content, quality, and initial insights about the data. Gaps in data will be identified and
plans to either fill or make substitutions will have to be made.

5. Modeling Approach:

Data Modelling focuses on developing models that are either descriptive or predictive.
• An example of a descriptive model might examine things like: if a person did this, then
they're likely to prefer that.
• A predictive model tries to yield yes/no, or stop/go type outcomes. These models are based on
the analytic approach that was taken, either statistically driven or machine learning driven.

The data scientist will use a training set for predictive modelling. A training set is a set of
historical data in which the outcomes are already known. The training set acts like a gauge to
determine if the model needs to be calibrated. In this stage, the data scientist will try out
different algorithms to ensure that the variables in play are actually required.
Constant refinement, adjustment and tweaking are necessary within each step to ensure the
outcome is solid. The framework is geared to do 3 things:
• First, understand the question at hand.
• Second, select an analytic approach or method to solve the problem.
• Third, obtain, understand, prepare, and model the data.

6. How to validate model quality:


Train-Test Split Evaluation:

The train-test split is a technique for evaluating the performance of a machine learning algorithm.
It can be used for classification or regression problems and can be used for any supervised
learning algorithm.
The procedure involves taking a dataset and dividing it into two subsets. The first subset is used to
fit the model and is referred to as the training dataset. The second subset is not used to train the
model; instead, the input element of the dataset is provided to the model, then predictions are
made and compared to the expected values. This second dataset is referred to as the test dataset.

• Train Dataset: Used to fit the machine learning model.
• Test Dataset: Used to evaluate the fit machine learning model.

The objective is to estimate the performance of the machine learning model on new data: data not
used to train the model.
How to Configure the Train-Test Split:

The procedure has one main configuration parameter, which is the size of the train and test sets.
This is most commonly expressed as a percentage between 0 and 1 for either the train or test
datasets. For example, a training set with the size of 0.67 (67 percent) means that the remainder
percentage 0.33 (33 percent) is assigned to the test set. There is no optimal split percentage.

You must choose a split percentage that meets your project’s objectives with considerations that
include:
Computational cost in training the model.
Computational cost in evaluating the model.
Training set representativeness.
Test set representativeness.

Nevertheless, common split percentages include:


Train: 80%, Test: 20%
Train: 67%, Test: 33%
Train: 50%, Test: 50%
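
As an illustration, here is a minimal Python sketch of the train-test split using scikit-learn's train_test_split (assuming scikit-learn is installed; the dataset below is placeholder data generated for this example, not from the booklet):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder dataset: 100 rows, 4 input features, binary labels
X, y = make_classification(n_samples=100, n_features=4, random_state=1)

# Hold out 33% of the rows as the test set (Train: 67%, Test: 33%)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=1)

model = LogisticRegression()
model.fit(X_train, y_train)               # fit only on the training dataset

predictions = model.predict(X_test)       # predict on the unseen test inputs
print("Test accuracy:", accuracy_score(y_test, predictions))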

Introduce concept of cross validation:

Machine learning is an iterative process. You will face choices about which predictive variables to use, what
types of models to use, what arguments to supply to those models, etc. We make these choices in a data-
driven way by measuring the model quality of the various alternatives.

The Cross-Validation Procedure:


In cross-validation, we run our modeling process on different subsets of the data to get multiple
measures of model quality. For example, we could have 5 folds or experiments. We divide the
data into 5 pieces, each being 20% of the full dataset.

Cross-validation gives a more accurate measure of model quality, which is especially important if you
are making a lot of modeling decisions. However, it can take more time to run, because it estimates
models once for each fold, so it is doing more total work.
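
A minimal sketch of 5-fold cross-validation with scikit-learn's cross_val_score (again using placeholder data generated purely for illustration):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder dataset generated for illustration
X, y = make_classification(n_samples=100, n_features=4, random_state=1)

# cv=5 divides the data into 5 folds; each fold is held out once for evaluation
scores = cross_val_score(LogisticRegression(), X, y, cv=5, scoring="accuracy")
print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())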

7. Metrics of model quality by simple Math and examples:

Performance metrics like classification accuracy and root mean squared error can give you a clear
objective idea of how good a set of predictions is, and in turn how good the model is that generated
them.

This is important as it allows you to tell the difference and select among:

Different transforms of the data used to train the same machine learning model.
Different machine learning models trained on the same data.
Different configurations for a machine learning model trained on the same data.

All the algorithms in machine learning rely on minimizing or maximizing a function, which we call the
"objective function". The group of functions that are minimized are called "loss functions". A loss
function is a measure of how well a prediction model is able to predict the expected outcome. The
most commonly used method of finding the minimum point of a function is "gradient descent".
Think of the loss function as an undulating mountain; gradient descent is like sliding down the
mountain to reach its lowest point.
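
To make the mountain analogy concrete, here is a tiny toy sketch (not part of the original booklet) of gradient descent minimizing a mean squared error loss for a one-parameter model y = w * x; the data points are made up so that the true value of w is 3:

# Toy data chosen so that the true relationship is y = 3 * x
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

w = 0.0                 # start from an arbitrary guess for the parameter
learning_rate = 0.01

for step in range(200):
    # Gradient of the MSE loss mean((w*x - y)**2) with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w = w - learning_rate * grad        # "slide downhill" along the loss surface

print("Learned w (should be close to 3):", round(w, 3))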

Loss functions can be broadly categorized into 2 types: Classification and Regression Loss.

Classification:

Log Loss:

Log Loss is the most important classification metric based on probabilities. It’s hard to interpret raw
log-loss values, but log-loss is still a good metric for comparing models. For any given problem, a
lower log loss value means better predictions.
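
A small sketch of computing log loss with scikit-learn's log_loss function; the true labels and predicted probabilities below are made-up values for illustration:

from sklearn.metrics import log_loss

# Made-up true labels and predicted probabilities of the positive class
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.6, 0.1]

# Lower is better; confident but wrong predictions are penalised heavily
print("Log loss:", log_loss(y_true, y_prob))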

Focal Loss:

A Focal Loss function addresses class imbalance during training in tasks like object detection.
Focal loss applies a modulating term to the cross-entropy loss in order to focus learning on hard
misclassified examples. It is a dynamically scaled cross entropy loss, where the scaling factor
decays to zero as confidence in the correct class increases. Intuitively, this scaling factor can
automatically down-weight the contribution of easy examples during training and rapidly focus the
model on hard examples.
Exponential Loss:

The exponential loss is convex and grows exponentially for negative values, which makes it more
sensitive to outliers. The exponential loss is used in the AdaBoost algorithm (a statistical
classification meta-algorithm). The principal attraction of exponential loss in the context of additive
modelling is computational convenience.

Hinge Loss:

The hinge loss is a specific type of cost function that incorporates a margin or distance from the
classification boundary into the cost calculation. Even if new observations are classified correctly,
they can incur a penalty if the margin from the decision boundary is not large enough.
KL Divergence Loss:

KL divergence, in simple terms, is a measure of how two probability distributions (say ‘p’ and ‘q’)
differ from each other, which is exactly what we care about while calculating the loss function.
Here ‘q’ is the probability distribution that the neural network model will predict, whereas ‘p’ is the
true distribution (in a multiclass classification problem, ‘p’ is the one-hot encoded vector and ‘q’
is the softmax output from the dense layer).

Regression:

Log cosh Loss:

Computes the logarithm of the hyperbolic cosine of the prediction error.

Quantile Loss:

A quantile is the value below which a fraction of observations in a group falls. For example, a prediction
for quantile 0.9 should over-predict 90% of the time.
8. RMSE (Root Mean Squared Error)

In machine learning, when we want to look at the accuracy of our model, we take the root mean
square of the error between the test values and the predicted values. Mathematically:
For a single value:
Let a = (predicted value - actual value)^2
Let b = mean of a = a (for a single value)
Then RMSE = square root of b
For a wider set of n values, RMSE is defined as:
RMSE = sqrt( (1/n) * Σ (predicted_i - actual_i)^2 ), i.e. the square root of the average of the squared differences between predicted and actual values.

MSE (Mean Squared Error) :

Mean Square Error (MSE) is the most commonly used regression loss function. MSE is the mean of the
squared differences between the target values and the predicted values.

Why use mean squared error?

MSE is sensitive towards outliers, and given several examples with the same input feature values, the optimal
prediction will be their mean target value. This should be compared with Mean Absolute Error, where the
optimal prediction is the median. MSE is thus good to use if you believe that your target data, conditioned on
the input, is normally distributed around a mean value, and when it is important to penalize outliers heavily.
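
A short sketch computing MSE and RMSE with NumPy; the actual and predicted values are made-up numbers for illustration:

import numpy as np

# Made-up actual and predicted values
actual = np.array([3.0, 5.0, 2.5, 7.0])
predicted = np.array([2.5, 5.0, 3.0, 8.0])

mse = np.mean((predicted - actual) ** 2)    # mean of the squared errors
rmse = np.sqrt(mse)                         # square root of the MSE

print("MSE :", mse)
print("RMSE:", rmse)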

DELHI PUBLIC SCHOOL BANGALORE - EAST
ARTIFICIAL INTELLIGENCE
CAPSTONE PROJECT - WORKSHEET
NAME: CLASS:XII SEC: DATE:

A. Choose the correct option:


1. Which of the following is not a part of Design Thinking?
a. Prototype
b. Empathize
c. Sympathize
d. Define
2. A __________ is a project where students must research a topic independently to get a deep
understanding of the subject matter.
a. AI model
b. Culminating report
c. Senior report
d. Capstone
3. An optimum AI model should have a __________.
a. Mean Square Error
b. Mean Absolute Error
c. Quantile loss
d. Root Mean Square Error
4. To determine whether an email we received is spam or not, the __________ technique is used.
a. Decision tree
b. Classification
c. Regression
d. Clustering
5. The train-test split is a technique for evaluating the performance of a machine learning
algorithm. Which machine learning algorithm can it be used for?
a. Regression
b. Clustering
c. Classification
d. Deep learning

I. Only a
II. Only b
III. Both a and c
IV. Both b and d
6. The primary way to collect data is __________.
a. Experiment
b. Survey
c. Interview
d. Observation
7. Which one is NOT a Regression loss?
a. Log Loss
b. Mean Absolute Error
c. Log cosh Loss
d. Quantile Loss
8. A regression function predicts a __________ and classification predicts a label.
a. Output
b. Quantity
c. Loss
d. Logic
9. Which of the following are common split percentages between Train & Test data?
a. Train : 50% , Test : 50%
b. Train : 5% , Test : 95%
c. Train : 67% , Test : 33%
d. Train : 80% , Test : 20%
I. a and b
II. c and d
III. a, c and d
IV. a, b and c
10. Which stage in Design Thinking is missing from [Prototype, Ideate, Test, Define]?
a. Evaluation
b. Empathize
c. Evolution
d. Enrichment

B. Fill in the blanks:
1. The __________ dataset is used to evaluate the model and adjust it as necessary.
2. __________ means handling missing or invalid values, removing duplicates, applying
correct formats after the data has been collected.
3. __________ cannot be a negative value.
4. The data scientist will use __________ for predictive modeling.
C. State whether the following statements are true or false.
1. The problem solving methodology is iterative in nature.
2. There are 2 types of loss functions namely regression losses and classification losses.
3. Cross validation techniques divides the provided dataset into 2 subsets namely training dataset
and testing dataset.
4. Historical data in which the desired outcome is already known is called __________.
5. The methodology for model building and deployment is an __________ process.
D. Answer the following:
1. Define Capstone project
2. List down the various steps under AI project
3. What do you mean by Design thinking? List down & explain the stages of design thinking.
4. Write the steps which are involved in Problem decomposition.
5. Write the Train-Test split procedure in Python.
6. List down some of the sources from where the data can be gathered for data analysis?
7. What is Cross-validation?
8. What is the use of a loss function? What are the 2 different categories of loss functions?
9. Define the following:
a. MSE (Mean Squared Error)
b. RMSE ( Root Mean Square Error)
10. What are hyperparameters? Explain with an example.

*******************

DELHI PUBLIC SCHOOL BANGALORE - EAST
ARTIFICIAL INTELLIGENCE
UNIT 2 - MODEL LIFE CYCLE
NAME: CLASS:XII SEC: DATE:
The AI Project Cycle has the following three phases:

Phase I: Project Planning & Data Collection


Phase II: Design & Testing
Phase III: Deployment & Maintenance

The following stages are involved in each phase.

Phase I:

1. Problem Scoping
2. Data Acquisition
3. Data Exploration

Phase II:

1. Evaluation
2. Data Modelling

Phase III:

1. Deployment
2. Feedback

Problem Scoping:

Before beginning to build a solution, it is critical to first understand the problem description and business limitations.

1. What is Problem Scoping?

Whenever we start any work, certain problems are always associated with the work or
process. These problems can be small or big; sometimes we ignore them, and sometimes we
need urgent solutions. Problem scoping is the process by which we figure out the
problem that we need to solve.

2. The 4Ws

The 4Ws of Problem Scoping:

The 4Ws are very helpful in problem scoping. They are:

1. Who? – Refers to who is facing the problem and who the stakeholders of the problem are
2. What? – Refers to what the problem is and how you know about the problem
3. Where? – Relates to the context, situation or location of the problem
4. Why? – Refers to why we need to solve the problem and what benefits the stakeholders get
after solving it

After understanding and writing the problems, set your goals, and make them your AI project target.
Write your goals for your selected theme.
Suppose you have selected the theme of agriculture; then write how AI will help farmers solve their
problems.

1. Determine what will be a good time for seeding?


2. Determine what will be a good time for harvesting?
3. Determine when and how much fertilizer will be applied to the selected crop?
There can be more such goals!
Now think and apply the 4Ws strategy for each problem or goal.

Your final problem statement will look like the following table:

Who – Stakeholders: Farmers, Fertilizer Producers, Labourers, Tractor Companies

What – The problem / Issue / Need: Determine what will be a good time for seeding or crop harvesting

Where – Context / Situation: Decide the mature age for the crop and determine its time

Why – Ideal Solution / Benefits: Take the crop on time and supply it against market demand on time

Introduction to Data Acquisition:

Data Acquisition consists of two words:

1. Data: Data refers to the raw facts, figures, or statistics collected for reference or analysis.
2. Acquisition: Acquisition refers to acquiring data for the project.

Classification of Data:

Now observe the following classification of data; we will discuss each category in detail:

Basic Data:
Basically, data is classified into two categories:

1. Numeric Data: Mainly used for computation. Numeric data can be classified into the following:
o Discrete Data: Discrete data contains only integer values; it doesn’t have any decimal or
fractional part. Countable data can be considered discrete data, for example 132
customers, 126 students, etc.
o Continuous Data: It represents data that can take any value within a range. Measurable
(uncountable) data falls in this category, for example 10.5 kg, 100.50 km, etc.
2. Text Data: Mainly used to represent names, collections of words, phrases, textual information,
etc.

These data can be Qualitative or Quantitative.

Qualitative Data: Text, Sound, Videos, Images
Quantitative Data: Numbers

Structural Classification:
The data which is going to be fed into the system to train the model, or is already fed in the system,
can have a specific set of constraints, rules or a unique pattern; this is the basis of the structural classification.

The structure classification is divided into 3 categories:

1. Structured Data: As discussed, structured data has a specific pattern or set of rules. It has a simple
structure and is stored in specific forms such as tabular form. Examples: a cricket scoreboard, your
school timetable, an exam datasheet, etc.
2. Unstructured Data: Data that doesn’t have any specific pattern or constraints and can be stored in
any form is known as unstructured data. Most of the data that exists in the world is unstructured.
Examples: YouTube videos, Facebook photos, dashboard data of any reporting tool, etc.
3. Semi-Structured Data: It is a combination of both structured and unstructured data. Some of the data
can have a structure like a database, whereas other data can have markers and tags to identify its
structure.
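
To illustrate the three structural categories side by side, here is a small hypothetical Python sketch (the records and field names are invented for this example):

import json

# Structured: rows with a fixed set of columns, like a table in a database or spreadsheet
structured_rows = [
    {"name": "Asha", "class": "XII", "marks": 91},
    {"name": "Ravi", "class": "XII", "marks": 85},
]

# Semi-structured: tagged data (here JSON) where fields may vary from record to record
semi_structured = json.dumps({"name": "Asha", "tags": ["AI", "capstone"], "notes": "optional field"})

# Unstructured: free text, images, audio or video with no fixed fields at all
unstructured = "Great lesson on data classification today!"

print(structured_rows[0]["marks"])
print(semi_structured)
print(unstructured)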

Other Classification:
This classification is sub divided into the following branches:

1. Time-Stamped Data: This structure helps the system to predict the next best action. It follows a
specific time order to define the sequence. This time can be the time at which the data was captured,
processed or collected.
2. Machine Data: The result or output of a specific program, system or technology is considered machine
data. It consists of data related to a user’s interaction with the system, like the user’s logged-in session
data, specific search records, and user engagement such as comments, likes and shares.
3. Spatiotemporal Data: Data which contains information related to geographical location and time is
considered spatiotemporal data. It records the location through GPS along with time-stamped data for
where and when the event is captured or the data is collected.

4. Open Data: It is freely available data for everyone. Anyone can reuse this kind of data.
5. Real-time Data: Data which is available as the event happens is considered real-time data.
6. Big Data: You may hear this word most often. Data which cannot be stored or handled by traditional
data collection software like DBMS or RDBMS can be considered Big Data. Big Data is itself a very
deep topic.

Example of Data Acquisition:


This example continues from the one discussed in the problem scoping stage.

• Now, as you interact with the authorities, you get to know that some people are allowed to enter the area
where the diamond is kept.
• Some of them being – the maintenance people; officials; VIPs, etc.
• Now, your challenge is to make sure that no unauthorised person enters the premises.
• For this, you: (choose one)
o Get photographs of all the authorised people.
o Get photographs of all the unauthorised people.
o Get photographs of the premises in which the diamond has been kept.
o Get photographs of all the visitors

Data Features:

Data features refer to the type of data you want to collect. Three terms are associated with this:

1. Training Data: The portion of the collected data that is fed into the system for the model to learn
from is known as training data.
2. Testing Data: The portion of the data held back from training and used to check the model’s output
and evaluate it is known as testing data.
3. Validation Set: Data the model has not been trained on, used to tune hyperparameters.

Methods of Data Acquisition:


The most common methods of data acquisition are:

1. Surveys: Through Google Forms, MS Teams Forms or any other interface


2. Web Scraping: Some software tools are Scrapy, ScrapeHero Cloud, ParseHub, OutWit Hub, Visual
Web Ripper, [Link]
3. Sensors: To convert physical parameters to electrical signals, to condition those signals into a
form that can be digitized, and to convert the conditioned sensor signals to digital
values
4. Cameras: To capture images
5. Observations: Way of gathering data by watching behaviour, events, or noting physical
characteristics in their natural setting
6. API (Application Program Interface)
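
As an illustration of the API method, here is a minimal sketch using Python's requests library; the URL and parameters shown are hypothetical placeholders, not a real service:

import requests

# Hypothetical placeholder endpoint - substitute a real API you have access to
url = "https://api.example.com/v1/weather"
params = {"city": "Bengaluru", "units": "metric"}

response = requests.get(url, params=params, timeout=10)
response.raise_for_status()       # raise an error if the server returned a failure code

data = response.json()            # many APIs return JSON, which becomes a Python dict
print(data)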

Types of Big Data:


There are three types of big data:

Structured: Having a pattern, usually stored in tabular form and accessed by applications like MS Excel
or a DBMS. Examples: employees’ data of a company, the result dataset of a board.
Semi-structured: No well-defined structure, but the data is categorized using some meta tags.
Examples: an HTML page, CSV files.
Unstructured: Without any structure, or not defined in any framework. Examples: audio/video files,
social media posts.

Data Exploration:

Data Exploration refers to the techniques and tools used to visualize data through complex statistical
methods.

Need of data visualization:

• Quickly get a sense of the trends, relationships and patterns contained within the data.
• Define strategy for which model to use at a later stage.
• Communicate the same to others effectively.
• To visualise data, we can use various types of visual representations.

Data Visualization tools:


● Microsoft Excel ● Tableau ● Qlikview ● Datawrapper ● Google Data Studio
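
Besides the listed tools, simple charts can also be drawn in Python with matplotlib (not one of the tools named above); here is a minimal sketch using made-up monthly sales figures:

import matplotlib.pyplot as plt

# Made-up monthly sales figures for illustration
months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [120, 150, 90, 180, 210]

plt.bar(months, sales)                        # a bar chart makes the trend easy to see
plt.title("Monthly sales (sample data)")
plt.xlabel("Month")
plt.ylabel("Units sold")
plt.show()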

Modelling:

AI Modelling refers to developing algorithms, also called models, which can be trained to produce
intelligent outputs; that is, writing code to make a machine artificially intelligent.

Types of AI models:

A Rule-Based model refers to setting up rules and training the model accordingly. It follows an explicitly
written algorithm or code to train, test and validate the data.

A Learning-Based model refers to identifying the data by its attributes and behaviour and training the
model accordingly. There is no fixed, hand-written rule set to train, test and validate the data; the model
learns from the past behaviour and attributes received from the data.

Types of learning:
There are three types of learning:

1. Supervised
2. Unsupervised
3. Reinforcement

Testing / Evaluation:

Once a model has been created and trained, it must be properly tested to calculate its efficiency
and performance. Accordingly, the model is evaluated using testing data and its efficiency is
assessed.

The set of measurements will differ depending on the problem you are working on. For regression
problems, MSE or MAE are commonly used. For a balanced dataset, accuracy may be a useful choice for
evaluating a classification model. For imbalanced datasets, the F1 score is useful.

A separate validation dataset is used for evaluation during training. It monitors how well our model
generalises, avoiding bias & overfitting.
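
A small sketch of computing these classification metrics with scikit-learn, using made-up true and predicted labels:

from sklearn.metrics import accuracy_score, f1_score

# Made-up true labels and model predictions for a binary classification task
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Accuracy:", accuracy_score(y_true, y_pred))   # fraction of correct predictions
print("F1 score:", f1_score(y_true, y_pred))         # balances precision and recall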

While the fundamental testing concepts are fully applicable in AI development projects, there are
additional considerations too. These are as follows:
• The volume of test data can be large, which presents complexities.
• Human biases in selecting test data can adversely impact the testing phase, therefore, data
validation is important.
• Your testing team should test the AI and ML algorithms keeping model validation, successful
learnability, and algorithm effectiveness in mind.
• Regulatory compliance testing and security testing are important since the system might deal
with sensitive data, moreover, the large volume of data makes performance testing crucial.
• You are implementing an AI solution that will need to use data from your other systems,
therefore, systems integration testing assumes importance.
• Test data should include all relevant subsets of training data, i.e., the data you will use for
training the AI system.
• Your team must create test suites that help you validate your ML models.

Deployment and Maintenance:

Finally, we come to the model deployment stage. This means that we must implement it in an environment with a
web interface or some kind of application where the new data can flow and our ML models can show the analysis
in the new interface.

An artificial intelligence solution that predicts energy consumption for energy providers would take related data,
analyze it, and send its prediction to a web portal or app for companies to view and act on. Such tools simplify the
decision-making process for end-users.

However, just because you’ve launched your AI solution live doesn’t mean the project is done. As in the previous
steps, an equally important part is monitoring, reviewing, and making sure that your solution continues to deliver
the desired results.

Most likely some adjustments and alterations will be required. This will depend on your customer and staff
feedback or on trial and error. New data may be entered into the model to ensure that the results are accurate and
up-to-date.

DELHI PUBLIC SCHOOL BANGALORE - EAST
ARTIFICIAL INTELLIGENCE
UNIT 2 - MODEL LIFE CYCLE - WORKSHEET
NAME: CLASS:XII SEC: DATE:

A. Choose the correct option:


1. Which of the following is not an AI development platform?
a. Google Cloud
b. EVA
c. IBM Watson
d. BigML
2. Which of the following is not a part of AI life cycle?
a. Problem Scoping
b. Deployment
c. Iteration
d. Testing
3. During the __________ stage, the various AI development platforms like Scikit-learn, Watson
Studio, etc. are evaluated.
a. Problem Scoping
b. Deployment
c. Acquire
d. Design
4. Which language is most suitable for developing AI?
a. Python
b. HTML
c. Kotlin
d. Swift
5. Choose the correct option.
a. Scope > Acquire > Explore > Prepare > Model > Assess > Deploy > Batch
b. Scope > Acquire > Prepare > Assess > Deploy > Batch > Real time > Explore
c. Scope > Acquire > Explore > Prepare > Model > Deploy > Real time >Batch
d. Scope > Acquire > Explore >Model >Prepare > Assess > Deploy > Batch
6. In AI development, which framework is used?
a. Tkinter

b. Matplotlib
c. PyCharm
d. Scikit-learn
7. Which of the following statements is/are incorrect?
a. The volume of test data can be large, which presents complexities.
b. Testing team should test the AI & ML algorithms keeping model validation, successful
learnability and algorithm effectiveness in mind.
c. Test data should include all irrelevant subsets of training data, that is the data you will
use for training the AI system.
i. All are incorrect
ii. b
iii. c
iv. a, b & c
B. Answer the following:
1. Explain the following:
a. Data Exploration
b. Modeling
2. Explain the terms overfitting, underfitting and perfect fit in terms of model testing.
3. What should be considered during testing phase?
4. Explain briefly the 4Ws of problem scoping.
C. Fill in the blanks:
1. During the __________ phase, you need to evaluate the various AI development platforms.
2. The __________ refers to the type of data you want to collect.
3. For a balanced dataset, __________ is a useful choice for evaluating a classification
model.
4. Two sources of authentic data are __________ and __________.
5. The __________ block in the 4W Problem Canvas refers to the stakeholders.
6. The __________ is used to assist us in identifying the important factors associated with
the problem.
D. Match & choose:
a. Open languages - i. ML techniques, GAN framework
b. Open frameworks - ii. Python, Scala
c. Approaches - iii. Watson, Azure
d. Tools to help - iv. Scikit-learn, TensorFlow

DELHI PUBLIC SCHOOL BANGALORE - EAST
ARTIFICIAL INTELLIGENCE
UNIT III: STORYTELLING THROUGH DATA
NAME: CLASS:XII SEC: DATE:

A well-told story is an inspirational narrative that is crafted to engage the audience across boundaries and
cultures, as stories have an impact that isn’t possible with data alone. Data can be persuasive, but stories are
much more so. They change the way we interact with data, transforming it from a dry collection of
“facts” into something that can be entertaining, engaging, thought-provoking, and inspiring change.

Each data point holds some information which may be unclear and contextually deficient on its own.
Visualizations of such data are therefore subject to interpretation (and misinterpretation). However, stories
are more likely to drive action than statistics and numbers alone. Therefore, when data is told in the form of a
narrative, it reduces ambiguity, connects data with context, and describes a specific interpretation,
communicating the important messages in the most effective ways. The steps involved in telling an effective
data story are given below:
• Understanding the audience
• Choosing the right data and visualisations
• Drawing attention to key information
• Developing a narrative
• Engaging your audience

Storytelling with Data:


Data storytelling is a structured approach for communicating insights drawn from data, and
invariably involves a combination of three key elements: data, visuals, and narrative. When the
narrative is accompanied by data, it helps explain to the audience what’s happening in the data
and why a particular insight has been generated. When visuals are applied to data, they can
enlighten the audience to insights that they wouldn’t perceive without the charts or graphs.

Finally, when narrative and visuals are merged together, they can engage or even entertain an
audience. When you combine the right visuals and narrative with the right data, you have a data
story that can influence and drive change.

By the numbers: How to tell a great story with your data?

Presenting the data as a series of disjointed charts and graphs could result in the audience struggling to
understand it – or worse, coming to the wrong conclusions entirely. Thus, the importance of a narrative
comes from the fact that it explains what is going on within the data set.

Some easy steps that can assist in finding compelling stories in the data sets are as follows:

Step 1: Get the data and organise it.


Step 2: Visualize the data.
Step 3: Examine data relationships.
Step 4: Create a simple narrative embedded with conflict.

Data storytelling has acquired a place of importance because:

• It is an effective tool to transmit human experience. Narrative is the way we simplify and make
sense of a complex world. It supplies context, insight, interpretation—all the things that make
data meaningful, more relevant and interesting.
• No matter how impressive an analysis, or how high-quality the data, it is not going to compel
change unless the people involved understand what is explained through a story.
• Stories that incorporate data and analytics are more convincing than those based entirely on
anecdotes or personal experience.
• It helps to standardize communications and spread results.
• It makes information memorable and easier to retain in the long run.

DELHI PUBLIC SCHOOL BANGALORE - EAST
ARTIFICIAL INTELLIGENCE
UNIT III: STORYTELLING THROUGH DATA – WORKSHEET

NAME: CLASS:XII SEC: DATE:


Answer the following:
1. What is data storytelling?
2. __________ is the first step involved in Data Storytelling.
3. When visuals are applied to data, they provide __________ to the audience.
4. List down the steps involved in telling an effective data story.
5. A well-told story is an inspirational narrative that is crafted to engage the audience across
__________.
6. Which chart type is good for numeric data visualization?
7. Why has data storytelling acquired a place of importance?
8. List down the key elements of storytelling.
9. Identify the elements and name them.

10. Consider the following, analyze the data and convert it into tabular form.

********
