Introduction to Data Analytics
By Merin Prakash
Introduction
• Data refers to the information that is collected from a particular
source, such as a survey, experiment, or database.
• A variable is a characteristic, number, or quantity that can be
measured or counted, and whose value can change between data
units or over time.
For example, income is a variable because it can vary between people
or businesses, and it can also change over time for each person or
business.
• Data is made up of different variables.
Analytics
• Analytics is the systematic computational analysis of data or statistics. It is
used for the discovery, interpretation, and communication of meaningful
patterns in data
• Organizations may apply analytics to business data to describe, predict, and
improve business performance.
• Analytics helps organizations create value by solving problems effectively
and assisting in decision-making.
How much time do consumers spend using mobile media?
Data Analytics
• Data analytics helps a business optimize its performance, perform more efficiently, maximize profit,
or make more strategically guided decisions.
Data analytics uses various tools to process and analyze data effectively:
• Spreadsheets (e.g., Excel) – Ideal for organizing, calculating, and creating simple charts or pivot
tables.
• Data Visualization Tools (e.g., Tableau, Power BI) – Transform data into interactive visuals to
identify trends and patterns.
• Reporting Tools (e.g., SAP Crystal Reports) – Generate structured and automated reports from
datasets.
• Data Mining Programs (e.g., RapidMiner) – Extract patterns and insights from large datasets using
AI and machine learning.
• Open-Source Languages (e.g., Python, R) – Enable advanced analysis, automation, and modelling
with extensive libraries.
These tools collectively simplify data interpretation and decision-making.
Applications
Marketing optimization
• Marketing organizations use analytics to determine the outcomes of
campaigns or efforts, and to guide decisions for investment and
consumer targeting.
• Demographic studies, customer segmentation, and other techniques
allow marketers to use large amounts of consumer purchase, survey
and panel data to understand and communicate marketing strategy.
• People Analytics refers to using data about people's behavior to
better understand how they work and improve how companies
manage their workforce. It helps organizations make decisions based
on facts and patterns instead of just intuition.
• Portfolio analytics
A common application of business analytics is portfolio analysis. In this,
a bank or lending agency has a collection of accounts of
varying value and risk.
The accounts may differ by the social status (wealthy, middle-class,
poor, etc.) of the holder, the geographical location, its net value, and
many other factors.
• Risk analytics
Predictive models in the banking industry are developed to bring
certainty across the risk scores for individual customers. Credit scores
are built to predict an individual's delinquency behaviour and are
widely used to evaluate the credit worthiness of each applicant.
Digital analytics
• Digital analytics is a set of business and technical activities that
define, create, collect, verify or transform digital data into reporting,
research, analyses, recommendations, optimizations, predictions, and
automation. This also includes the SEO (search engine optimization)
where the keyword search is tracked and that data is used for
marketing purposes.
Challenges
• Big Data Challenges: Managing massive, complex, and constantly changing
data sets, along with rapid data accumulation in online transactional
systems.
• Unstructured Data Issues: Difficulty analyzing and storing varied formats of
unstructured data, which require significant transformation for effective
use.
• Technological Limitations: Challenges in implementing advanced analytics
innovations and massively parallel processing to enhance data analysis.
• Educational Analytics Problems: Educators face difficulties in
understanding, interpreting, and effectively using complex student
performance data for decision-making
Role of Analytics in Industry
Applications of Data Analytics
1. Transportation
• Data analytics can help reduce traffic congestion
and improve travel by analysing large data sets to
create alternative routes, which can minimize
accidents.
• Travel companies also use data analytics to
enhance travel packages by understanding
customer preferences from social media, boosting
both customer satisfaction and business growth.
2. Education
• Data analytics allows policymakers to
enhance education by personalizing learning
and optimizing resource management based
on student needs and usage patterns.
3. Internet web search results
• Search engines like Google,
Amazon e-commerce search,
Bing, etc., use analytics to
arrange data and deliver the
best search results.
4. Marketing and digital
advertising
• Marketers use data analytics to
understand the audience and get
high conversion rates
• To understand the audience,
digital ad experts use analytics to
know the intended audience’s
likes, dislikes, age, race, gender,
and other features.
6. Security
• Security personnel use data analytics
(especially predictive analytics) to find
future cases of crimes or security
breaches. They can also investigate
past or ongoing attacks.
• Some cities use data analytics to
monitor areas with high crime rates.
They monitor crime patterns and
predict future crime possibilities from
these patterns.
7. Fraud detection
• Data analytics is widely used across
industries like pharmaceuticals,
banking, finance, tax, and retail to
detect fraud, using predictive
analysis to assess tax return
reliability and monitor customer
interactions for signs of bank fraud.
Current Trends in Data Analytics
1. AI-Powered Data Analytics:
• AI-Driven Insights: AI algorithms are enhancing data analysis
capabilities, enabling faster and more accurate insights.
• Automated Data Preparation: AI tools are automating data cleaning,
transformation, and feature engineering, saving time and effort.
• Predictive Analytics: AI models are improving forecasting accuracy,
helping businesses make informed decisions.
2. Data-Centric AI:
• Focus on Data Quality: Organizations are prioritizing data quality and
governance to ensure reliable AI models.
• Data Labelling: Accurate data labelling is crucial for training AI models,
and automated tools are emerging to streamline this process.
• Data Privacy and Security: Robust data privacy and security measures
are essential to protect sensitive data and comply with regulations.
3. Metadata-Driven Data Fabric:
• Unified Data Access: Data fabrics provide a unified view of data across
various sources, improving data accessibility and integration.
• Self-Service Data Discovery: Metadata-driven data fabrics empower
users to find and understand relevant data without relying on
technical experts.
• Data Lineage and Impact Analysis: Tracking where data originates and
how it moves helps organizations understand data flows and the
potential impact of changes.
4. Edge Computing:
• Real-Time Analytics: Edge computing enables real-time data
processing and analysis at the source, reducing delay and improving
decision-making.
• IoT Integration: Edge computing is essential for processing and
analysing data generated by IoT devices.
• Privacy-Preserving Analytics: Edge computing can help protect
sensitive data by processing it locally instead of sending it to
centralized servers.
5. Augmented Analytics:
• Natural Language Processing (NLP): NLP techniques are enabling
users to interact with data using natural language queries.
• Automated Insights Generation: AI-powered tools can automatically
generate insights and recommendations from data.
• Visualizations and Storytelling: Augmented analytics enhances data
visualization capabilities, making it easier to communicate insights to
diverse audiences.
6. Generative AI:
• AI-Generated Data: Generative AI models can create artificially
generated data that mimics real-world data for training and testing AI
models.
• Automated Report Generation: AI can automate the generation of
reports and presentations, saving time and effort.
• Personalized Insights: Generative AI can tailor insights and
recommendations to individual users based on their preferences and
needs.
Different types of Analytics
Descriptive analytics
• Descriptive analytics focuses on summarizing past data to understand
what has happened.
• It uses historical data to create reports, visualizations, and dashboards.
• Visualization: Bar charts, line graphs, and pie charts.
Examples:
• Business: A retail store analyses monthly sales data to identify trends in
customer purchases.
• Healthcare: A hospital reviews patient admission records to determine
the most common illnesses treated over the past year.
• Education: A university generates a report on student performance in
exams to identify pass and fail rates.
Frequency distribution
• The frequency of a value is the number of times it occurs in a dataset.
• A frequency distribution is the pattern of frequencies of a variable.
It's the number of times each possible value of a variable occurs in a
dataset
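As a minimal sketch of a frequency distribution, the Python snippet below counts how often each value occurs in a small list. The `grades` list is made-up illustration data, not taken from any real dataset.

```python
from collections import Counter

# Hypothetical letter grades for ten students (illustrative data only).
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A"]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(grades)

# Print the distribution, most frequent value first.
for value, count in frequency.most_common():
    print(f"{value}: {count}")
```

The frequencies always add up to the number of observations, which is a quick sanity check on any frequency table.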
Central Tendency Measures
• Central tendency measures are used to describe the typical or central
value of a dataset. Some commonly used measures include:
• Mean: This is the average value obtained by adding up all values in a
dataset and then dividing by the total number of values. For example,
consider a dataset of exam scores: 80, 85, 90, 75, 88. The mean score
= (80 + 85 + 90 + 75 + 88) / 5 = 83.6.
• Median: This is the middle value when the dataset is arranged in
ascending order. For the same exam scores dataset: 75, 80, 85, 88, 90.
The median score = 85.
• Mode: This is the value that appears most frequently in the dataset.
In the exam scores dataset, there is no mode as each value appears
only once.
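The three measures above can be checked with Python's standard `statistics` module. This is only a sketch using the exam scores from the slide; the variable names are chosen for illustration.

```python
import statistics

# Exam scores from the example above.
scores = [80, 85, 90, 75, 88]

mean_score = statistics.mean(scores)      # sum of values divided by their count
median_score = statistics.median(scores)  # middle value after sorting

# multimode returns every value tied for the highest frequency.
modes = statistics.multimode(scores)

# When every value appears once, all values tie, so there is no single mode.
has_unique_mode = len(modes) == 1

print(mean_score, median_score, has_unique_mode)
```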
Measure of Dispersion
• Measures of Dispersion measure the scattering of the data.
• It tells us how the values are distributed in the data set.
Variability Measures
These measures indicate the spread or dispersion of data points around the
central value. Common variability measures include:
• Range: It defines the difference between the maximum and minimum
values in the dataset. Using the exam scores dataset: Range = Maximum
score – Minimum score = 90 – 75 = 15.
• Variance: This measures the average squared deviation from the mean. It
indicates how much the values in a dataset vary from the mean.
• Standard Deviation: This is the square root of the variance. It expresses
the typical distance of data points from the mean, in the same units as the data.
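A short sketch of these variability measures for the same exam scores, using Python's standard `statistics` module. It assumes the five scores are the whole population; if they were a sample from a larger group, `statistics.variance` and `statistics.stdev` (which divide by n - 1) would be the usual choice.

```python
import statistics

# Exam scores from the example above.
scores = [80, 85, 90, 75, 88]

# Range: maximum minus minimum.
value_range = max(scores) - min(scores)

# Population variance: mean of the squared deviations from the mean.
population_variance = statistics.pvariance(scores)

# Population standard deviation: square root of the population variance.
population_sd = statistics.pstdev(scores)

# Sample variance divides by n - 1 instead of n.
sample_variance = statistics.variance(scores)

print(value_range, population_variance, round(population_sd, 2), sample_variance)
```

For these scores the squared deviations from the mean of 83.6 sum to 149.2, so the population variance is 149.2 / 5 = 29.84 and the sample variance is 149.2 / 4 = 37.3.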
Benefits
• Clear Summaries: Descriptive analytics offers a straightforward
summary of what has happened, making it easier to understand past
performance.
• Trend Identification: By revealing trends and patterns, it helps in
making informed decisions based on historical data.
• Enhanced Understanding: It provides valuable insights into past
performance, helping organizations learn from previous experiences.
Limitations:
• Historical Focus: Descriptive analytics only deals with past data and
cannot forecast future events or trends.
• Lack of Causation: It may not explain why certain trends occurred,
limiting the ability to understand the underlying causes behind
observed patterns.
Real-World Examples
• Sales Reports: Businesses frequently generate monthly or quarterly
sales reports that showcase performance metrics, customer
behaviour, and revenue trends. These reports help in tracking
progress and making informed decisions about sales strategies.
• Web Analytics: Analyzing website traffic and user behaviour provides
insights into how visitors interact with a site. By examining metrics
such as page views, bounce rates, and engagement levels, businesses
can optimize their online presence and improve user experience.
1. Data collection
• The first step in the descriptive analytics process is to gather relevant
data from various sources. This data could be sourced from
databases, spreadsheets, surveys, or other structured or unstructured
data repositories.
2. Cleaning and preparation
• Data collection sets the stage but must be followed by thorough data
cleansing and preparation to ensure accurate and reliable analysis.
• This step involves identifying and resolving issues such as missing
values, inconsistencies, duplicates, and outliers. Data cleaning ensures
the data is high quality, reliable, and ready for further analysis.
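A minimal sketch of two common cleaning steps, removing duplicates and filling missing values. The records are hypothetical, and filling gaps with the median is only one possible strategy, assumed here for illustration.

```python
import statistics

# Hypothetical raw survey records: (respondent_id, age).
# None marks a missing value; respondent 3 was entered twice.
raw_records = [(1, 21), (2, None), (3, 19), (3, 19), (4, 22)]

# Step 1: drop exact duplicate records while keeping the original order.
deduplicated = list(dict.fromkeys(raw_records))

# Step 2: fill missing ages with the median of the known ages
# (an assumed imputation rule, not the only valid one).
known_ages = [age for _, age in deduplicated if age is not None]
median_age = statistics.median(known_ages)
cleaned = [
    (respondent_id, age if age is not None else median_age)
    for respondent_id, age in deduplicated
]

print(cleaned)
```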
3. Exploration
• In this step, data analysts explore the data to understand its
characteristics better and identify initial patterns or trends.
• This can be achieved through various techniques such as summary
statistics, data visualization, and exploratory data analysis.
• Summary statistics, including measures such as mean, median, mode,
and standard deviation, provide an overview of the data’s central
tendencies and dispersion.
• Data visualization techniques such as charts, graphs, and histograms
help visualize the distribution and relationships within the data,
making it easier to identify patterns or anomalies.
4. Segmentation
• Data segmentation involves dividing the dataset into meaningful
subsets based on specific criteria.
• This segmentation can be done based on variables such as
demographics, geographic location, time periods, or product
categories
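Segmentation can be sketched as grouping records by one variable and then summarizing each group. The sales records and region names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records: (region, amount).
sales = [
    ("North", 120),
    ("South", 80),
    ("North", 200),
    ("East", 50),
    ("South", 70),
]

# Divide the dataset into subsets keyed by region.
segments = defaultdict(list)
for region, amount in sales:
    segments[region].append(amount)

# Summarize each segment with its total.
totals = {region: sum(amounts) for region, amounts in segments.items()}

print(totals)
```

The same pattern works for any segmenting variable (time period, product category, age band) by changing the key used for grouping.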
5. Summary
• Descriptive analytics aims to summarize data to provide key insights.
• This involves calculating summary measures such as averages, totals,
percentages, or ratios relevant to the subject being analyzed.
6. Historical trend analysis
• Descriptive analytics includes analysing historical trends to
understand how variables or metrics have changed over time. This
analysis reveals patterns, seasonality, or long-term trends.
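One simple way to expose a long-term trend is a moving average, which smooths short-term fluctuations. The sketch below uses hypothetical monthly sales figures and an assumed window of three months.

```python
# Hypothetical monthly sales for six consecutive months.
monthly_sales = [100, 120, 90, 130, 150, 140]


def moving_average(values, window):
    """Return the mean of each consecutive `window`-sized slice of `values`."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Each entry averages three consecutive months.
trend = moving_average(monthly_sales, window=3)

print([round(value, 1) for value in trend])
```

A rising sequence of averages suggests an upward trend even though individual months go up and down.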
7. Data reporting and
visualization
• The insights and findings derived from the descriptive analytics
process must be communicated effectively. This is typically done
through reports or visual dashboards
Data
Structured Data
• Structured data is information that has been formatted and
transformed into a well-defined data model.
Examples :
• Name
• Age
• Profession
• Citizenship
• Nationality
Demographic Data
Demographic data is data related to personal and geographic attributes,
like:
• Age
• Current Location
• Email
• Mailing Address
• Name
• Telephone number
Firmographic Data
Firmographic data is data related to companies, used for account-based
marketing (ABM) campaigns. This data can include:
• Company Address
• Company Name
• Industry
• Number of Employees
• Revenue
Behavioural Data
Behavioural data is data related to deeper insights into your customers.
This allows brands to do more effective audience segmentation,
targeting, and behavioural marketing. Insights include:
• Email Open Rates
• Product and Service Usage Patterns
• Purchase Patterns
• Social Media Engagement
• Videos and Content Consumed
• Web Activity history
Transactional Data
Transactional data is data related to how a customer transacts with
your business, including:
• Credit Card Payments
• Insurance Claims
• Invoices
• Purchase Orders
• Sales Orders
• Shipping Documents
Unstructured data
• Unstructured data is information that does not follow a predefined data
model or format. It can be images, videos, songs, speech recordings, etc.
The diversity of formats is practically limitless.
Differences
Source of Unstructured Data
• Social media platforms and messages are the largest sources of unstructured data. Unlike
computers, people don’t like keeping everything unified.
• Different types of unstructured data include:
• Audio Files
• Images
• Video
• PDFs
• PPTs
• Social Media Posts, Comments, and Likes
• Word documents
Some examples of unstructured
data include:
• Emails—while certain fields, like the sender and timestamp, are structured, the
email body itself is unstructured text.
• Photos and videos, because multimedia files are usually stored as raw data and
lack predefined fields.
• Audio files (e.g., recordings of customer service calls, podcasts, and music files)
• Text documents (e.g., PDFs, Word documents, and open-ended survey
responses)
• Social media content, such as posts, tweets, comments, and other user-
generated content—all of which are unstructured and vary widely in format.
• Call center transcripts or recordings; while voice interactions can be analyzed
for sentiment or trends, they are naturally unstructured.
Semi-Structured Data
• Semi-structured data is a type of information that is a combination of
structured and unstructured data, and is not strictly defined by a fixed
structure.
• It uses tags or markers to separate data elements, and is often found
in formats like JSON, XML, and CSV files
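A small sketch of why JSON counts as semi-structured: keys label each data element, but records are not forced to share the same fields. The records below are hypothetical.

```python
import json

# Hypothetical customer records as JSON text (illustrative data only).
# Keys act as the tags or markers; the two records do not share a fixed schema.
raw = """
[
  {"name": "Asha", "age": 29, "email": "asha@example.com"},
  {"name": "Ravi", "interests": ["cricket", "music"]}
]
"""

records = json.loads(raw)

# Collect every field name that appears in at least one record.
all_fields = sorted({key for record in records for key in record})

# Missing fields must be handled explicitly because no schema guarantees them.
ages = [record.get("age") for record in records]

print(all_fields, ages)
```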
Different Types of Semi-Structured Data
• Compressed Files
• Emails (unstructured body text, but with structured data like subject
line and send date)
• Images (that include metadata)
• Webpages
Examples of Semi – Structured data
• Emails
• XML, JSON, and CSV files
• HTML
• NoSQL databases
• Log files
Example
Database, Data Lake, and Data
Warehouse
These systems serve different purposes in data storage and
management:
• Database: A system to store, organize, and retrieve structured data
efficiently for transactional or operational use.
• Data Lake: A centralized repository to store vast amounts of raw,
unstructured, semi-structured, and structured data.
• Data Warehouse: A structured system optimized for analytics and
reporting, storing processed and organized data.
Data Warehouse Platforms
Cloud-Based Data Warehouses
• Amazon Redshift
• Google BigQuery
• Snowflake
• Microsoft Azure Synapse Analytics
• IBM Db2 Warehouse on Cloud
Industry-Specific Examples
• Healthcare: Cerner’s data warehouse platform for patient data analytics.
• Retail: Walmart’s custom-built data warehouse for inventory and customer
analytics.
• Finance: JP Morgan Chase's advanced data warehouse for fraud detection
and risk management.
Terms in Data Analytics
Artificial intelligence
• Artificial intelligence (AI) is a field of study that focuses on creating
machines and computers that can mimic human intelligence, such as
learning, reasoning, and acting.
• AI systems learn and improve by analysing large amounts of data to
identify patterns and relationships.
Examples of AI
• Chess-playing computers - Use search algorithms and, in modern engines, deep learning to
evaluate positions and choose moves
• Self-driving cars - Use deep learning and computer vision to perceive the road and make
driving decisions
• Medical diagnosis - AI can identify anomalies in scans to help triangulate diagnoses from a
patient's symptoms and vitals
• Retail - AI can power user personalization, product recommendations, shopping assistants,
and facial recognition for payments
Deep learning
Deep learning is a type of machine learning that uses multi-layered neural
networks to teach computers to perform tasks by learning from examples.
Applications include:
• Image recognition
• Medical image analysis
• Self-driving cars
Natural language processing (NLP)
• Natural language processing (NLP) is a branch of artificial intelligence
that allows computers to understand, interpret, and generate human
language. Applications include:
• Machine translation
• Sentiment analysis
• Chatbots
• Natural language generation
Augmented analytics
• Augmented analytics is a process that uses artificial intelligence (AI),
machine learning (ML), and natural language processing (NLP) to
improve data analytics.
Examples
• Agriculture - Farmers can use augmented analytics to make sense of
data on water use, soil temperature, and crop growth.
• Smart cities - Augmented analytics can help simplify large amounts of
data collected by cities to improve transportation and natural disaster
management
Internet of Things (IoT)
• The Internet of Things (IoT) refers to a network of physical devices
—"things"—that are embedded with sensors, software, and other
technologies to collect, exchange, and process data over the internet.
Applications of IoT
• Devices like smart thermostats, security cameras, and voice assistants
automate home functions.
• Wearable devices monitor health parameters like heart rate or blood
pressure.
Edge analytics
• Edge analytics is a method of analysing data close to where it's
generated, rather than sending it to a centralized location for
processing.
• This approach is becoming more popular as the Internet of Things
(IoT) model of connected devices becomes more widespread.
Predictive analytics
• Predictive analytics uses statistical algorithms and machine learning
techniques to process historical data to anticipate future events or
outcomes.
• A simple use case is extracting patterns and relationships from large
datasets to identify trends, patterns, and probabilities.
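As a minimal sketch of prediction from historical data, the snippet below fits a straight line by ordinary least squares and uses it to estimate an unseen case. The study-hours and score data are hypothetical, and a linear relationship is an assumption made for illustration; real predictive models are usually validated on held-out data.

```python
# Hypothetical historical data: hours studied and the resulting exam score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]


def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)

# Use the fitted pattern to anticipate the score for 6 hours of study.
predicted_score = slope * 6 + intercept

print(round(slope, 2), round(intercept, 2), round(predicted_score, 2))
```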
Prescriptive analytics
• Prescriptive analytics is a data analytics technique that uses data,
machine learning, and statistical algorithms to recommend the best
course of action for a given situation.
• It's the most advanced type of data analytics and is used to answer
the question, "What should we do next?"
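In its simplest form, a prescriptive step scores each candidate action and recommends the best one. The sketch below assumes the costs and expected revenues are already known (in practice they would come from predictive models); the action names and figures are hypothetical.

```python
# Hypothetical candidate actions with assumed cost and expected revenue.
actions = {
    "do_nothing": {"cost": 0, "expected_revenue": 0},
    "email_campaign": {"cost": 100, "expected_revenue": 400},
    "discount_offer": {"cost": 250, "expected_revenue": 600},
}


def expected_profit(name):
    """Expected revenue minus cost for the named action."""
    action = actions[name]
    return action["expected_revenue"] - action["cost"]


# Recommend the action with the highest expected profit.
recommended = max(actions, key=expected_profit)

print(recommended, expected_profit(recommended))
```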
Benefits of prescriptive analytics
• Better decision-making
• Enhanced operational efficiency
• Risk mitigation and fraud detection
• Improved customer experience
Big data analytics
• Big data analytics refers to the systematic processing and analysis of
large and complex data sets (known as big data) to extract valuable
insights.
• Big data analytics allows for the uncovering of trends, patterns and
correlations in large amounts of raw data to help analysts make data-
informed decisions.
Comparison between Big data analytics and
Traditional data analytics
1. Data Type:
• Traditional Data Analytics: Handles structured data, typically organized
in rows and columns in relational databases.
• Big Data Analytics: Works with structured, semi-structured, and
unstructured data, including text, images, videos, and sensor data.
2. Data Volume:
• Traditional Data Analytics: Limited to relatively smaller and
manageable data volumes.
• Big Data Analytics: Deals with massive data sets that exceed the
capacity of traditional systems.
3. Tools Used:
• Traditional Data Analytics: Relies on tools like SQL and relational
database management systems (RDBMS).
• Big Data Analytics: Utilizes advanced tools like Hadoop, Spark, and
NoSQL databases for distributed processing and storage.
4. Complexity:
• Traditional Data Analytics: Simpler analysis, focusing on well-
structured and formatted data.
• Big Data Analytics: Involves analysing complex and diverse data sets
requiring advanced techniques.
5. Analysis Techniques:
• Traditional Data Analytics: Based on statistical methods and basic querying.
• Big Data Analytics: Employs advanced methods like machine learning, artificial
intelligence, and data mining.
6. Speed and Real-Time Capability:
• Traditional Data Analytics: Designed for batch processing, may not handle
real-time analytics efficiently.
• Big Data Analytics: Often supports real-time processing for immediate insights.
8. Use Cases:
• Traditional Data Analytics: Suitable for structured reporting, basic
trend analysis, and operational decision-making.
• Big Data Analytics: Applied in predictive analytics, customer
behaviour analysis, fraud detection, and Internet of Things (IoT)
applications.
Big Data Types
• Social Networks and Web data – Facebook, YouTube, Blogs
• Transactions data and Business Processes data – Credit Card Transactions,
Flight Bookings, etc.
• Customer Master Data – Facial Recognition, Name, DOB
• Machine-Generated Data – Data from IoT, Sensors, Trackers, etc.
• Human-Generated Data – Biometric Data, Human-Machine Interaction Data,
Audio, Video.
Quantitative data
• This data is numerical, countable, or measurable, and is used to
answer questions about how many, how much, or how often.
Examples
• Quantitative data: Age, weight, height, length, population, size, and
other numerical values
Qualitative data
• This data is descriptive, interpretation-based, and relates to language,
and is used to answer questions about why, how, or what happened
behind certain behaviours.
Examples
• Qualitative data: Color, smell, taste, touch or feeling, typology, and
shapes
Levels of Measurements
• There are four different scales of measurement. The data can be
defined as being one of the four scales. The four types of scales are:
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale
Nominal Scale
• A nominal scale is the 1st level of measurement scale in which the
numbers serve as “tags” or “labels” to classify or identify the objects.
A nominal scale usually deals with non-numeric variables or with
numbers that carry no quantitative value.
Example:
What is your gender?
• M- Male
• F- Female
Characteristics of Nominal Scales
• The categories are represented by their names and are not
necessarily ordered.
• The order of the categories is irrelevant.
• The only permissible aspect of numbers in the nominal scale is
“counting.”
Examples
• Gender: Male or female
• Location: Country, state, city, or neighbourhood
• Political affiliation: Republican, Democrat, or Independent
• Ice cream flavour: Chocolate, strawberry, and so on
• Jersey numbers: 2, 16, 84, and so on
• Food type: Sweet, sour, bitter, and salty
• Hair colour: Blonde, brunette, black, red, and so on
• Eye colour: Black, brown, blue, green, and so on
• Favourite pet: Dog, cat, fish, bird, and so on
Ordinal Scale
• The ordinal scale is the 2nd level of measurement that reports the
ordering and ranking of data without establishing the degree of
variation between them. Ordinal represents the “order.”
• Ordinal data is known as qualitative data or categorical data. It can be
grouped, named and also ranked.
Characteristics of the Ordinal
Scale
• The ordinal scale shows the relative ranking of the variables
• It identifies and describes the magnitude of a variable
• Along with the information provided by the nominal scale, ordinal scales give
the rankings of those variables
• The interval properties are not known
• The surveyors can quickly analyse the degree of agreement concerning the
identified order of variables
Examples of Ordinal Scale
• Socioeconomic status: For example, "low income", "middle income", "high income".
• Education level: For example, "high school", "BS", "MS", "PhD".
• Income level: For example, "less than 50K", "50K-100K", "over 100K".
• Satisfaction rating: For example, "extremely dislike", "dislike", "neutral", "like", "extremely like".
• Medals in the Olympics: For example, bronze, silver, and gold.
• Letter grades for student test scores: For example, A, B, C, D, and F.
Likert Scale
• A Likert scale is a rating system that asks respondents to choose from
a range of answers to measure their attitudes, opinions, or
perceptions
Example
• "Strongly agree," "Agree," "Neutral," "Disagree," "Strongly disagree"
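Likert responses are often coded as integers before analysis. The sketch below uses the five labels from the example with an assumed 1 to 5 coding; the `responses` list is hypothetical. Because the scale is ordinal, the median is reported rather than the mean.

```python
import statistics

# Assumed coding of the five Likert labels (1 = lowest agreement).
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

# Hypothetical answers from five respondents to one statement.
responses = ["Agree", "Strongly agree", "Neutral", "Agree", "Disagree"]

# Translate each label into its numeric code.
coded = [LIKERT_CODES[answer] for answer in responses]

# The median respects the ordering without assuming equal gaps between labels.
median_code = statistics.median(coded)

print(coded, median_code)
```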
Interval Scale
• It is defined as a quantitative measurement scale in which the
difference between two values is meaningful.
Characteristics of Interval Scale:
• The interval scale is quantitative as it can quantify the difference
between the values
• It allows calculating the mean and median of the variables
• To find the difference between two measurements, you can subtract
one value from the other
• The interval scale is widely used in statistics because it lets
numerical values be assigned to assessments such as feelings,
calendar dates, etc.
Ratio scale
• Ratio scale is a type of variable measurement scale which is
quantitative in nature. It allows any researcher to compare the
intervals or differences.
• Ratio scale is the 4th level of measurement and possesses a zero point
or character of origin. This is a unique feature of this scale.
Examples of Ratio Scale
Measurements
• Weight: A measurement of 0 kg indicates the absence of weight. If
one object weighs 10 kg and another 5 kg, we can say the first object
is twice as heavy as the second.
• Height: A height of 0 cm represents no height. If one person is 180 cm
tall and another is 90 cm tall, the first person is twice as tall as the
second.
• Age: A value of 0 years means no age (beginning of life). A person
aged 40 years is twice as old as someone aged 20 years.
• Income: An income of $0 indicates no earnings. Someone earning
$100,000 earns 10 times as much as someone earning $10,000.
Characteristics of Ratio Scale:
• Ratio scale has a feature of absolute zero
• It doesn’t have negative numbers, because of its zero-point feature
• It affords unique opportunities for statistical analysis. The variables
can be orderly added, subtracted, multiplied, divided. Mean, median,
and mode can be calculated using the ratio scale.
• Ratio scale has unique and useful properties. One such feature is that
it allows unit conversions, such as kilograms to grams or metres to
centimetres, without changing the ratios between values.
Example:
An example of a ratio scale is:
What is your weight in Kgs?
• Less than 55 kgs
• 55 – 75 kgs
• 76 – 85 kgs
• 86 – 95 kgs
• More than 95 kgs
Difference Between Interval Scale and Ratio Scale
Aspect | Interval Scale | Ratio Scale
Definition | Measures with equal intervals between values but no absolute zero. | Measures with equal intervals between values and an absolute zero.
Zero Point | The zero point is arbitrary (e.g., 0°F doesn’t mean no temperature). | The zero point is absolute and indicates the absence of the variable (e.g., 0 kg means no weight).
Comparison | Allows addition and subtraction, but ratios are not meaningful. | Allows addition, subtraction, multiplication, and division. Ratios are meaningful.
Examples | Temperature (Celsius, Fahrenheit), calendar years (e.g., 2023). | Weight, height, income, age, distance.
Mathematical Operations | Can calculate differences but not ratios (e.g., 30°F is not twice as hot as 15°F). | Can calculate both differences and ratios (e.g., 20 kg is twice as heavy as 10 kg).
Nature of Zero | Zero is relative (it doesn’t mean "nothing"). | Zero is absolute (it means "nothing" or "none").
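A quick numerical sketch of why ratios are meaningful on a ratio scale but not on an interval scale: changing the unit of an interval measurement changes the ratio, while changing the unit of a ratio measurement does not. The function name is chosen for illustration.

```python
def fahrenheit_to_celsius(degrees_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (degrees_f - 32) * 5 / 9


# Interval scale: the "ratio" of 30°F to 15°F depends on the unit used.
ratio_fahrenheit = 30 / 15
ratio_celsius = fahrenheit_to_celsius(30) / fahrenheit_to_celsius(15)

# Ratio scale: 20 kg to 10 kg gives the same ratio in kilograms and in grams.
ratio_kg = 20 / 10
ratio_g = (20 * 1000) / (10 * 1000)

print(ratio_fahrenheit, round(ratio_celsius, 3), ratio_kg, ratio_g)
```

The Fahrenheit ratio is 2, but the same two temperatures in Celsius give a completely different ratio, so "twice as hot" has no fixed meaning. The weight ratio stays at 2 in either unit.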
Independent Variable (IV)
• The variable that is manipulated or changed by the researcher to
observe its effect on another variable.
• Example:
• In a study on how teaching methods affect student performance:
• Independent Variable: Teaching Method (e.g., traditional vs. modern).
• In a marketing study on advertisement duration:
• Independent Variable: Duration of Advertisement (e.g., 30 seconds, 1
minute).
Dependent Variable (DV)
• The variable that is measured or observed to assess the impact of changes in
the independent variable.
Example:
• In the same teaching study:
• Dependent Variable: Student Performance (e.g., test scores).
• In the marketing study:
• Dependent Variable: Sales Growth (e.g., percentage increase in sales).
“The IV causes a change in the DV. It is not possible that DV could cause any change in IV.”
Examples
• How does the amount of sleep impact test scores?
• Independent Variable: Time spent on sleeping before the exam
• Dependent Variable: Test Score
• What is the effect of fast food on blood pressure?
• Independent Variable: Consumption of fast food
• Dependent Variable: Blood Pressure
• What is the effect of caffeine on sleep?
• Independent Variable: the amount of caffeine consumed
• Dependent Variable: Sleep
Sample Questionnaire - Explore the factors influencing student performance.
Section A: Demographic Information
• Gender:
• Male
• Female
• Other
• Age (in years): __________
• Grade/Year of Study:
• High School (+1, +2)
• Undergraduate
• Postgraduate
• Field of Study:
• Science
• Commerce
• Arts
• Other: __________
Demographic Variables (Q1–Q4):
• Use Frequencies to analyse distributions (e.g., percentage of males and
females, age range).
• Use Descriptive Statistics for continuous variables (e.g., mean age).
Section B: Academic Background
• What was your percentage/grade in the last academic year?
• Below 50%
• 50–60%
• 61–70%
• 71–80%
• Above 80%
• How many hours per day do you dedicate to self-study (excluding classes)?
• Less than 1 hour
• 1–2 hours
• 3–4 hours
• More than 4 hours
• How frequently do you attend your classes?
• Rarely (Less than 50%)
• Sometimes (50–75%)
• Always (More than 75%)
• Do you participate in extracurricular activities (sports, arts, clubs, etc.)?
• Yes
• No
How to Analyse This Questionnaire in SPSS
Academic Background:
• Frequencies for categorical responses (e.g., grade distribution, self-
study hours).
• Compare groups using t-tests or ANOVA (e.g., performance vs. study
hours).
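To make the group comparison concrete, here is a minimal sketch of a one-way ANOVA F statistic computed by hand in Python. It assumes the raw percentage was recorded for each student (the questionnaire itself only collects bands), and the scores and the helper name `one_way_anova_f` are hypothetical. SPSS additionally reports the p-value, which this sketch does not compute.

```python
from statistics import mean

# Hypothetical last-year percentages, grouped by daily self-study hours.
groups = {
    "Less than 1 hour": [52, 58, 61],
    "1-2 hours": [60, 66, 69],
    "3-4 hours": [72, 75, 81],
}

def one_way_anova_f(samples):
    """Return the F statistic for a one-way ANOVA over lists of scores."""
    all_scores = [x for s in samples for x in s]
    grand_mean = mean(all_scores)
    k = len(samples)      # number of groups
    n = len(all_scores)   # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

f_stat = one_way_anova_f(list(groups.values()))
print(f"F = {f_stat:.2f}")
```

A large F value means the group means differ by more than the spread within each group would suggest. With only two groups, a t-test answers the same question.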
Section C: Learning Environment (IV)
Rate the following statements based on your level of agreement. Use the
scale:
(1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly
Agree)
• I find the teaching methods in my institution effective.
1 2 3 4 5
• I have access to adequate learning resources (library, online materials, etc.).
1 2 3 4 5
• My classroom environment motivates me to learn.
1 2 3 4 5
• My teachers provide regular feedback on my academic progress.
1 2 3 4 5
Section D: Study Habits and Support (IV)
• How many hours per day do you dedicate to self-study (excluding classes)?
• Less than 1 hour
• 1–2 hours
• 3–4 hours
• More than 4 hours
• Do you have access to a quiet study space at home?
• Yes
• No
• Rate the following statements based on your level of agreement:
(1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly Agree)
• My teachers explain topics clearly.
• I feel supported by my teachers in my learning journey.
• I have access to sufficient learning materials (books, online resources, etc.).
• The classroom environment encourages learning.
Section E: Academic Performance (DV)
• What was your percentage/grade in the last academic year?
• Below 50%
• 50–60%
• 61–70%
• 71–80%
• Above 80%
Rate your agreement with the following statements:
(1 = Strongly Disagree, 2 = Disagree, 3 = Neutral, 4 = Agree, 5 = Strongly
Agree)
• I feel satisfied with my academic performance. 1 2 3 4 5
• I usually meet the academic goals I set for myself. 1 2 3 4 5
• My performance has improved over the last year. 1 2 3 4 5
Mapping Variables to Analysis
• Independent Variables:
• Demographic Information: Gender, Age, Field of Study
• Socio-Economic Factors: Residence, Income, Parental Education
• Study Habits: Self-study hours, study space, classroom environment
• External Influences: Extracurricular participation, part-time jobs, sleep, social
media usage
• Dependent Variable:
• Academic Performance
How to Analyse This Questionnaire in SPSS
• Use Correlation to explore relationships (e.g., teaching effectiveness
vs. academic performance).
• Compare groups using t-tests or ANOVA (e.g., performance vs. study
hours).
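As an illustration of the correlation step, the sketch below computes a Pearson correlation in plain Python between a learning-environment score and a performance score. Averaging the Likert items into one composite score per respondent is an assumption made for this example, and all ratings shown are invented.

```python
from math import sqrt
from statistics import mean

# Hypothetical Likert ratings (1-5) per respondent:
# four learning-environment items (IV) and three performance items (DV).
respondents = [
    {"environment": [4, 5, 4, 4], "performance": [4, 4, 5]},
    {"environment": [2, 3, 2, 2], "performance": [2, 3, 2]},
    {"environment": [3, 3, 4, 3], "performance": [3, 3, 4]},
    {"environment": [5, 4, 5, 5], "performance": [5, 4, 4]},
    {"environment": [1, 2, 2, 1], "performance": [2, 2, 1]},
]

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Assumption: each construct is summarised as the mean of its items.
env_scores = [mean(r["environment"]) for r in respondents]
perf_scores = [mean(r["performance"]) for r in respondents]

r = pearson_r(env_scores, perf_scores)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one. Correlation alone does not show that the IV causes the DV.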
Human Resources (HR)
• The Role of Emotional Intelligence in Employee Retention
• Independent Variables: Emotional intelligence dimensions (self-
awareness, self-regulation, empathy, social skills).
• Dependent Variables: Employee retention rates, job satisfaction.
HRM
Topic | Independent Variable (IV) | Dependent Variable (DV)
Effect of Flexible Working Hours on Employee Happiness | Flexible working hours (yes/no) | Employee happiness (1-5 scale)
Impact of Teamwork on Job Satisfaction | Level of teamwork (low, medium, high) | Job satisfaction (1-5 scale)
Role of Communication in Reducing Workplace Conflicts | Frequency of team communication (weekly, monthly) | Workplace conflict level (low, medium)
Influence of Workload on Employee Stress Levels | Workload intensity (light, moderate, heavy) | Employee stress levels (1-10 scale)
Impact of Supervisor Support on Employee Retention | Supervisor support perception (yes/no) | Employee retention (years in company)
Finance
Topic | Independent Variable (IV) | Dependent Variable (DV)
How Budgeting Affects Monthly Savings | Budgeting practice (yes/no) | Monthly savings amount (in currency)
Do Discounts Increase Spending? | Discounts offered (percentage) | Spending amount (in currency)
Impact of Financial Education on Saving Habits | Financial education (yes/no) | Monthly savings (in currency)
Relationship Between Investment Knowledge and Risk-Taking | Investment knowledge (low, medium, high) | Risk-taking behavior (low, medium, high)
Effect of Inflation Awareness on Personal Savings | Inflation awareness (yes/no) | Monthly savings (in currency)
Marketing
Topic | Independent Variable (IV) | Dependent Variable (DV)
Effect of Social Media Advertising on Brand Awareness | Social media ad frequency (daily, weekly, rarely) | Brand awareness (measured on a 1-10 scale)
Impact of Discounts on Consumer Purchase Decisions | Discount percentage (10%, 20%, 30%) | Purchase volume (number of items bought)
Influence of Product Packaging on Consumer Preference | Packaging attractiveness (low, medium, high) | Consumer preference (1-5 scale)
Effect of Celebrity Endorsements on Brand Loyalty | Presence of celebrity endorsement (yes/no) | Brand loyalty (measured on repeat purchases)
Relationship Between Online Reviews and Purchase Intent | Online review rating (1-5 stars) | Purchase intent (likelihood on a 1-10 scale)