UNIT-2
DATA LITERACY
INTRODUCTION TO DATA LITERACY
• Data literacy means knowing how to understand, work
with, and talk about data. It's about being able to
collect, analyze, and show data in ways that make
sense.
Reference Video:
https://www.youtube.com/watch?v=yhO_t-c3yJY
• The Data Pyramid is made up of the different
stages of working with data
LET’S UNDERSTAND DATA PYRAMID WITH A
SIMPLE TRAFFIC LIGHT EXAMPLE:
● Data is available in a raw form. Data
in this form is not very useful.
● Data is processed to give us
information about the world.
● Information about the world leads to
knowledge of how things are
happening.
● Wisdom allows us to understand why
things are happening in a particular
way.
HOW TO BECOME DATA LITERATE?
• Every dataset tells a story, but we must be careful before believing the story.
• A data-literate person is someone who can interact with data to understand the world around
them.
• Scenario: Buying a Video game online
• Data literacy helps people research products while shopping online.
How do you decide the following things when you are shopping online?
● Which is the cheapest product available?
● Which product is liked by the users the most?
● Does a particular product meet all the requirements?
A data-literate person can:
● Filter the category as per the requirement. If the budget is low, sort the products by
price from low to high
● Check the user ratings of the products
● Check for specific requirements in the product
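The three checks above can be sketched in a few lines of Python. This is only an illustration: the product list, prices, ratings, and the "PC, at most 1200" requirement are all made-up assumptions, not data from a real shop.

```python
# Hypothetical catalogue of video games; every value is invented for illustration.
products = [
    {"name": "Game A", "price": 1499, "rating": 4.2, "platform": "PC"},
    {"name": "Game B", "price": 999, "rating": 4.6, "platform": "Console"},
    {"name": "Game C", "price": 799, "rating": 3.9, "platform": "PC"},
    {"name": "Game D", "price": 1199, "rating": 4.8, "platform": "PC"},
]

# Which is the cheapest product? Sort by price, low to high.
by_price = sorted(products, key=lambda p: p["price"])
cheapest = by_price[0]["name"]

# Which product do users like the most? Take the highest user rating.
best_rated = max(products, key=lambda p: p["rating"])["name"]

# Does a product meet the requirements? Assumed here: runs on PC, costs at most 1200.
matches = [
    p["name"]
    for p in products
    if p["platform"] == "PC" and p["price"] <= 1200
]

print(cheapest, best_rated, matches)
```

Notice that the cheapest product and the best-rated product are not the same one, which is why a data-literate shopper checks more than one filter.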
DATA LITERACY PROCESS FRAMEWORK
• The data literacy framework provides guidance on using data efficiently and
with all levels of awareness.
• The data literacy framework is an iterative process.
WHAT ARE DATA SECURITY AND PRIVACY?
HOW ARE THEY RELATED TO AI?
WHAT IS DATA PRIVACY?
• Data privacy, also referred to as information privacy, is concerned
with the proper handling of sensitive data, including personal data
and other confidential data such as certain financial data and
intellectual property data, to meet regulatory requirements and to
protect the confidentiality and immutability of the data.
The following best practices can help you ensure data privacy:
● Understand what data you have collected, how it is handled,
and where it is stored.
● Collect only the data that is necessary for a project.
● Obtain user consent before collecting data. This is of utmost
importance.
WHAT IS DATA SECURITY?
• Data security is the practice of protecting digital information from
unauthorized access, corruption, or theft throughout its entire
lifecycle.
Why is it important?
Due to the rising amount of data in the cloud, there is an increased risk
of cyber threats.
The main reasons why data security is more important now
are:
• Cyber-attacks can affect everyone
• Rapid technological change is likely to increase cyber-attacks
BEST PRACTICES FOR CYBER SECURITY
Do’s
• Use strong, unique passwords with a mix of characters for each account.
• Activate Two-Factor Authentication (2FA) for added security.
• Download software from trusted sources and scan files before opening.
• Prioritize websites with "https://" for secure logins.
• Keep your browser, OS, and antivirus updated regularly.
• Adjust social media privacy settings for limited visibility to close contacts.
• Always lock your screen when away.
• Connect only with trusted individuals online.
• Use secure Wi-Fi networks.
• Report online bullying to a trusted adult immediately.
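To make "a mix of characters" from the first point concrete, here is a minimal Python sketch of a password check. The function name `has_character_mix` and the minimum length of 12 are assumptions chosen for illustration, not an official rule, and passing this check does not by itself make a password unique or safe.

```python
import string


def has_character_mix(password, min_length=12):
    """Return True if the password is long enough and mixes character types.

    min_length=12 is an assumed threshold for this sketch.
    """
    return (
        len(password) >= min_length
        and any(c.islower() for c in password)            # lowercase letter
        and any(c.isupper() for c in password)            # uppercase letter
        and any(c.isdigit() for c in password)            # digit
        and any(c in string.punctuation for c in password)  # symbol
    )


print(has_character_mix("password"))
print(has_character_mix("Blue!Kite7_River"))
```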
Don’ts
• Don't share personal info like your real name or phone number.
• Don't send pictures to strangers or post them on social media.
• Don't open emails or attachments from unknown sources.
• Don't respond to suspicious requests for personal info like bank account
details.
• Don't share your passwords or the answers to your security questions.
• Don't copy copyrighted software without permission.
• Don't cyberbully or use offensive language online.
ACQUIRING DATA, PROCESSING, AND
INTERPRETING DATA
• Types of data
Numeric Data is further classified as:
● Continuous data is numeric data that can take any value within a
range, including fractions. E.g., height, weight, temperature, voltage
● Discrete data is numeric data that contains only whole numbers and
cannot be fractional. E.g., the number of students in the class: it can
only be a whole number, not a decimal
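A short Python sketch may help tell the two kinds of numeric data apart. The sample values and the helper `looks_discrete` are assumptions made for this example. The check is only a rough hint, because continuous data can happen to contain whole values too (a height of exactly 160.0 cm is still continuous data).

```python
# Made-up sample values for illustration.
heights_cm = [152.4, 160.0, 148.75]   # continuous: fractions are possible
students_per_class = [32, 28, 35]     # discrete: only whole numbers make sense


def looks_discrete(values):
    """Return True if every value is a whole number."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(heights_cm))
print(looks_discrete(students_per_class))
```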
TYPES OF DATA USED IN THREE DOMAINS
OF AI:
DATA ACQUISITION/ACQUIRING DATA
• Data Acquisition, also known as acquiring data, refers to the procedure of
gathering data. This involves searching for datasets suitable for training AI
models. The process typically comprises three key steps:
ACQUIRING DATA – SAMPLE DATA DISCOVERY
• Let’s say we want to collect data for making a computer vision (CV)
model for a self-driving car
● We will require pictures of roads and the objects on roads
● We can search and download this data from the internet
● This process is called data discovery
ACQUIRING DATA – SAMPLE DATA AUGMENTATION
• Data augmentation means increasing the amount of data
by adding copies of existing data with small changes
● The subject of the image does not change, but we get new data
from the image by changing parameters such as color
and brightness
● New data is added by slightly changing the existing data
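The idea can be sketched in plain Python, assuming a tiny grayscale image stored as a list of rows where each number is a pixel brightness from 0 to 255. The pixel values and the helper `change_brightness` are hypothetical. Real projects usually rely on an image library instead.

```python
# A tiny 2x3 grayscale "image"; the pixel values are invented for illustration.
image = [
    [10, 120, 250],
    [40, 200, 90],
]


def change_brightness(img, delta):
    """Return a copy with every pixel shifted by delta, kept within 0..255."""
    return [[max(0, min(255, px + delta)) for px in row] for row in img]


brighter = change_brightness(image, 30)
darker = change_brightness(image, -30)

# One original image has become three training examples.
augmented = [image, brighter, darker]
print(len(augmented), brighter, darker)
```

The original image is left untouched. Each changed copy is added as new data.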
ACQUIRING DATA – SAMPLE DATA GENERATION
• Data generation refers to generating or recording data using
sensors
● Recording temperature readings of a building is an example
of data generation
● Recorded data is stored in a computer in a suitable form
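Recording and storing readings might look like the following Python sketch. No real sensor is used here: `read_temperature_sensor` is a hypothetical stand-in that returns made-up values, and the CSV layout is just one possible "suitable form".

```python
import csv
import io
import random


def read_temperature_sensor(rng):
    """Stand-in for a real sensor: returns a made-up reading in degrees Celsius."""
    return round(rng.uniform(20.0, 30.0), 1)


rng = random.Random(0)   # fixed seed so the sketch is repeatable
buffer = io.StringIO()   # in-memory file; a real logger would open a file on disk
writer = csv.writer(buffer)
writer.writerow(["reading_number", "temperature_c"])

# Record five readings, one row per reading.
for reading_number in range(1, 6):
    writer.writerow([reading_number, read_temperature_sensor(rng)])

rows = buffer.getvalue().splitlines()
print(rows)
```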
SOURCES OF DATA
• Primary Data Sources: Some of the sources for primary
data include surveys, interviews, experiments, etc. The
data generated from an experiment is an example of
primary data. Here is an Excel sheet showing the data
collected for students of a class.
• Secondary Data Sources—Secondary data collection
obtains information from external sources, rather than
generating it personally. Some sources for secondary
data collection include:
BEST PRACTICES FOR ACQUIRING DATA
• Checklist of factors that make data good or bad
Data acquisition from websites
ETHICAL CONCERNS IN DATA
ACQUISITION
• While gathering data and choosing datasets, certain ethical
issues can be addressed before they occur.
FEATURES OF DATA AND DATA PREPROCESSING
• Usability of Data: There are three primary factors determining the
usability of data:
• Structure- Defines how data is stored.
• Cleanliness- Clean data is free from duplicates, missing values,
outliers, and other anomalies that may affect its reliability and
usefulness for analysis. In this particular example, duplicate
values are removed after cleaning the data.
• Accuracy- Accuracy indicates how well the data matches
real-world values, ensuring reliability. Accurate data
closely reflects actual values without errors, enhancing
the quality and trustworthiness of the dataset. In this
particular example, we are comparing data gathered
from measuring the length of a small box in centimeters.
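The cleanliness factor can be illustrated with a small Python sketch that drops duplicate rows and rows with missing values. The student records below are invented for this example, and skipping a row is only one possible way to handle a missing value.

```python
# Made-up records: one exact duplicate and one missing value.
records = [
    {"name": "Asha", "marks": 82},
    {"name": "Ravi", "marks": 74},
    {"name": "Asha", "marks": 82},     # duplicate of the first row
    {"name": "Meena", "marks": None},  # missing value
]

cleaned = []
seen = set()
for record in records:
    key = (record["name"], record["marks"])
    # Skip rows with a missing value and rows we have already kept.
    if record["marks"] is None or key in seen:
        continue
    seen.add(key)
    cleaned.append(record)

print(cleaned)
```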
FEATURES OF DATA
• Data features are the characteristics or properties of the data.
They describe each piece of information in a dataset.
• For example, in a table of student records, features could
include things like the student's name, age, or grade.
• In a photo dataset, features might be the colors present in
each image. These features help us understand and analyze
the data.
• In AI models, we need two types of features: independent and
dependent.
• Independent features are the input to the model—they're the information we
provide to make predictions.
• Dependent features, on the other hand, are the outputs or results of the
model—they're what we're trying to predict.
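As a rough sketch, splitting a dataset into independent and dependent features could look like this in Python. The student records and the column names `hours_studied`, `attendance`, and `passed` are assumptions made for illustration. `X` and `y` are the names commonly used for inputs and outputs.

```python
# Made-up student records for illustration.
students = [
    {"hours_studied": 2, "attendance": 70, "passed": False},
    {"hours_studied": 5, "attendance": 90, "passed": True},
    {"hours_studied": 4, "attendance": 85, "passed": True},
]

independent_features = ["hours_studied", "attendance"]  # inputs to the model
dependent_feature = "passed"                            # what we want to predict

# X holds the inputs, y holds the matching outputs, row by row.
X = [[s[name] for name in independent_features] for s in students]
y = [s[dependent_feature] for s in students]

print(X)
print(y)
```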
DATA PROCESSING AND DATA
INTERPRETATION
• Data processing and interpretation have become very
important in today’s world.
Can you answer this?
• Niki has 7 candies, and Ruchi has 4 candies
• How many candies do Niki and Ruchi have in total?
• We can answer this question using data processing
• Who should get more candies so that both Niki and Ruchi
have an equal number of candies?
• How many candies should they get?
• We can answer this question using data interpretation
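The candy questions can be worked through in Python. The variable names are chosen for this sketch. The first step is plain calculation (processing), and the second step uses the result to decide who needs more candies and how many (interpretation).

```python
niki = 7
ruchi = 4

# Data processing: operate on the raw numbers.
total = niki + ruchi

# Data interpretation: use the numbers to answer "who" and "how many".
who_gets_more = "Ruchi" if ruchi < niki else "Niki"
candies_needed = abs(niki - ruchi)

print(total)
print(who_gets_more, candies_needed)
```

Niki and Ruchi have 11 candies in total, and Ruchi should get 3 more candies so that both have 7.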
Data Processing
▪ Data processing helps computers understand raw data.
▪ Use of computers to perform different operations on data
is included under data processing.
Data Interpretation
▪ It is the process of making sense out of data that has been
processed.
▪ The interpretation of data helps us answer critical
questions using data.
UNDERSTANDING SOME KEYWORDS RELATED TO DATA
• Acquire Data: Acquiring data means collecting data from various data
sources.
• Data Processing: After raw data is collected, it is processed to
derive meaningful information from it.
• Data Analysis: Data analysis means examining each component of
the data in order to draw conclusions.
• Data Interpretation: It means being able to explain what these
findings/conclusions mean in a given context.
• Data Presentation: In this step, you select, organize, and group
ideas and evidence in a logical way.
METHODS OF DATA INTERPRETATION
• How to interpret Data?
• Based on the two types of data, there are two ways to interpret data-
● Quantitative Data Interpretation
● Qualitative Data Interpretation
QUALITATIVE DATA INTERPRETATION
● Qualitative data tells us about the emotions and feelings of people.
● Qualitative data interpretation is focused on the insights and motivations of people.
Data Collection Methods – Qualitative Data Interpretation
Record keeping: This method uses existing reliable documents and other similar
sources of information as the data source. It is similar to going to a library.
Observation: In this method, participants, including their behavior and emotions, are
observed carefully.
Case Studies: In this method, data is collected from case studies.
Focus groups: In this method, data is collected from a group discussion on a relevant
topic.
Longitudinal Studies: This data collection method is performed on the same data
source repeatedly over an extended period.
One-to-One Interviews: In this method, data is collected using a one-to-one interview.
5 STEPS TO QUALITATIVE DATA ANALYSIS
1. Collect Data
2. Organize
3. Assign codes to the data collected
4. Analyze your data
5. Reporting
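Step 3, giving codes to the collected data, means labelling each response with the themes it talks about. A very simple Python sketch is shown below. The themes, keywords, and responses are all invented for illustration, and real qualitative coding is usually done by people reading the responses, not by keyword matching alone.

```python
# Hypothetical themes (codes) and the keywords that signal them.
code_keywords = {
    "price": ["expensive", "cheap", "cost"],
    "quality": ["quality", "durable", "broke"],
}


def assign_codes(response):
    """Return the sorted list of codes whose keywords appear in the response."""
    text = response.lower()
    return sorted(
        code
        for code, keywords in code_keywords.items()
        if any(word in text for word in keywords)
    )


responses = [
    "The game was too expensive",
    "Great quality, but it cost a lot",
    "I liked the story",
]
coded = {response: assign_codes(response) for response in responses}
print(coded)
```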
QUANTITATIVE DATA INTERPRETATION
• Quantitative data interpretation is made on
numerical data
• It helps us answer questions like “when,” “how
many,” and “how often”
• For example: the number of likes on an Instagram post
answers the question “how many”
DATA COLLECTION METHODS-
QUANTITATIVE DATA INTERPRETATION
• Interviews: Quantitative interviews play a key role in collecting
information.
• Polls: A poll is a type of survey that asks simple questions to
respondents. Polls are usually limited to one question.
• Observations: Quantitative data can be collected through
observations in a particular time period
• Longitudinal Studies: A type of study conducted over a long time
• Survey: Surveys can be conducted for a large number of people to
collect quantitative data.
4 STEPS TO QUANTITATIVE DATA
ANALYSIS
1. Relate measurement scales with
variables
2. Connect descriptive statistics with data
3. Decide a measurement scale
4. Represent data in an appropriate format
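Step 2 can be tried out with Python's built-in `statistics` module. The list of likes below is made up for this sketch. Mean, median, and mode are three common descriptive statistics.

```python
import statistics

# Made-up number of likes on five Instagram posts.
likes = [12, 15, 15, 20, 38]

mean_likes = statistics.mean(likes)      # average value
median_likes = statistics.median(likes)  # middle value when sorted
mode_likes = statistics.mode(likes)      # most frequent value

print(mean_likes, median_likes, mode_likes)
```

Here the mean (20) is higher than the median (15) because one post with 38 likes pulls the average up.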
TYPES OF DATA INTERPRETATION
• Textual DI
▪ The data is mentioned in the text form, usually in a paragraph.
▪ Used when the data is not large and can be easily comprehended by
reading.
▪ Textual presentation is not suitable for large data.
• Tabular DI
▪ Data is represented systematically in the form of rows and columns.
▪ Title of the Table (Item of Expenditure) contains the description of the
table content.
▪ Column Headings (Year; Salary; Fuel and Transport; Bonus; Interest on
Loans; Taxes) contain the description of the information contained in
the columns.
• Graphical DI
• Bar Graphs: In a Bar Graph, data is represented using
vertical or horizontal bars.
• Pie Charts: A Pie Chart is a circular chart divided into
various sections (think of a cake cut into slices). Each
slice represents the portion of the entire pie allocated to
one category, so the size of each section is proportional to
the corresponding value.
• Line Graphs: A line graph is created by connecting various
data points. It shows the change in quantity over time.
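"Proportional to the corresponding value" means that each slice of a pie chart gets an angle of value ÷ total × 360 degrees. The Python sketch below works this out. The category names and amounts are hypothetical values chosen for illustration.

```python
# Hypothetical expenditure per category (the amounts are invented).
expenses = {"Salary": 50, "Fuel and Transport": 20, "Bonus": 10, "Taxes": 20}

total = sum(expenses.values())

# Angle of each slice in degrees: its share of the total times 360.
angles = {category: value * 360 / total for category, value in expenses.items()}

for category, angle in angles.items():
    print(category, angle)
```

The angles of all slices always add up to 360 degrees, the full circle.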
IMPORTANCE OF DATA INTERPRETATION