Introduction Business Analytics Merged

The document discusses the importance of business analytics in decision-making and strategy development, highlighting its role in interpreting large data volumes. It outlines various types of analytics, common mistakes in data analysis, and ethical considerations in data collection and usage. The document emphasizes the need for informed consent, data security, and the balance between innovation and privacy rights.

Uploaded by

Sora Boru Guyo

Wollo University, KIOT,

College of Informatics, Department of Information System


Data and Business Analytics (InSy4143)
By: Kedir Abdu (M.Sc)

First Draft

Kombolcha, Ethiopia
2017 E.C
Table of Contents
Chapter One
The Growing Role of Business Analytics
1.1 Introduction to business analytics
1.2 Roles of business analytics in modern business analytics
1.3 Common mistakes in data analysis
1.4 Ethics of data collection, management, usage and privacy
1.5 Summary
1.6 Summary Questions
Chapter 2
Big Data Collection and Ethics
2.1 Definition of Big Data
2.2 Difference between Big Data and Other Types of Information
2.3 How to Collect Big Data
2.4 Ethical Implications of Collecting and Using Big Data
2.5 Opportunities and Challenges of Big Data for Decision-Makers and Business Organizations
2.6 Summary Questions
Chapter 3
Answering Business Questions with Data analytics
3.1 Quantitative strategies to answer for business questions
3.1.1 Predicting analyzed data
3.1.1 Evaluating information
3.1.2 Identifying causal relationship
3.1 Difference between correlation and causation
3.2 Data mining vs. Data Analysis
3.3 Summary Questions
Chapter 4
The Basic Tools of Business Analytics
Chapter One
The Growing Role of Business Analytics

1.1 Introduction to business analytics


Business analytics is a powerful tool in today’s marketplace that can be used to make decisions
and craft business strategies. Across industries, organizations generate vast amounts of data which,
in turn, has heightened the need for professionals who are data literate and know how to interpret
and analyze that information.

Business analytics is the process of using quantitative methods to derive meaning from data in order
to make informed business decisions.

Business analytics bridges the gap between information technology and business by using
analytics to provide data-driven recommendations. The business part requires deep business
understanding, while the analytics part requires an understanding of data, statistics and computer
science.

There are four primary methods of business analytics:


Descriptive: The interpretation of historical data to identify trends and patterns
Diagnostic: The interpretation of historical data to determine why something has happened
Predictive: The use of statistics to forecast future outcomes
Prescriptive: The application of testing and other techniques to determine which outcome will
yield the best result in a given scenario

Figure 1 Types of Data Analytics

Reading Assignment: What are the top technical and non-technical skills of a business analyst?

Business Analytics vs. Data Science

To understand what business analytics is, it’s also important to distinguish it from data science.
While both processes analyze data to solve business problems, the difference between business
analytics and data science lies in how data is used.

Business analytics is concerned with extracting meaningful insights from and visualizing data to
facilitate the decision-making process, whereas data science is focused on making sense of raw
data using algorithms, statistical models, and computer programming. Despite their differences,
both business analytics and data science glean insights from data to inform business decisions.

Comparing Business Analytics to Data Science

Business Analytics | Data Science
Business analytics is the statistical study of business data to gain insights. | Data science is the study of data using statistics, algorithms and technology.
Uses mostly structured data. | Uses both structured and unstructured data.
Does not involve much coding. It is more statistics oriented. | Coding is widely used. This field is a combination of traditional analytics practice with good computer science knowledge.
The whole analysis is based on statistical concepts. | Statistics is used at the end of analysis following coding.
Studies trends and patterns specific to business. | Studies almost every trend and pattern.
Top industries where business analytics is used: finance, healthcare, marketing, retail, supply chain, telecommunications. | Top industries/applications where data science is used: e-commerce, finance, machine learning, manufacturing.

To better understand how data insights can drive organizational performance, here are some of the
ways firms have benefitted from using business analytics.
1.2 Roles of business analytics in modern business analytics
1. More Informed Decision-Making
Business analytics can be a valuable resource when approaching an important strategic decision.
For example, an organization that manufactures and sells clothing might analyze data about how
different sizes sold in the previous years.
2. Greater Revenue
Companies that embrace data and analytics initiatives can experience significant financial returns.
Example: Supermarket Dynamic Pricing
A supermarket chain collects sales data from different stores and uses business analytics to adjust
prices dynamically.
How It Works:
Descriptive Analytics: Analyzes past sales trends to see which products sell best at different
times.
Predictive Analytics: Forecasts customer demand based on weather, holidays, and shopping
patterns.
Prescriptive Analytics: Suggests the best price for each product to maximize sales and profits.
Results:

If demand is high, prices are slightly increased to maximize revenue.
If demand is low, discounts are offered to clear inventory and boost sales.
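The pricing rule in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the supermarket's actual method: the function name `suggest_price`, the 20% demand tolerance, and the 5% markup and 15% discount are hypothetical values chosen for the example, and the sales figures are made up.

```python
def suggest_price(base_price, forecast_demand, typical_demand,
                  markup=0.05, discount=0.15, tolerance=0.20):
    """Return an adjusted price from a demand forecast (illustrative rule only).

    markup, discount, and tolerance are hypothetical tuning values.
    """
    ratio = forecast_demand / typical_demand
    if ratio > 1 + tolerance:        # demand well above normal: raise slightly
        return round(base_price * (1 + markup), 2)
    if ratio < 1 - tolerance:        # demand well below normal: discount
        return round(base_price * (1 - discount), 2)
    return round(base_price, 2)      # demand near normal: keep the price


# Descriptive step: average past weekly sales (made-up numbers).
past_weekly_sales = [100, 120, 110, 90, 80]
typical = sum(past_weekly_sales) / len(past_weekly_sales)

# Predictive step: assumed demand forecasts for two upcoming weeks.
# Prescriptive step: the rule above turns each forecast into a price.
holiday_price = suggest_price(10.00, forecast_demand=150, typical_demand=typical)
slow_week_price = suggest_price(10.00, forecast_demand=60, typical_demand=typical)
```

With these made-up numbers the holiday week gets a small increase and the slow week gets a discount. A real system would replace the fixed forecasts with a model that uses weather, holidays, and shopping patterns.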
3. Improved Operational Efficiency
Beyond financial gains, analytics can be used to fine-tune business processes and operations.

In a recent KPMG report on emerging trends in infrastructure, it was found that many firms now
use predictive analytics to anticipate maintenance and operational issues before they become larger
problems.

A mobile network operator surveyed noted that it leverages data to foresee outages seven days
before they occur. Armed with this information, the firm can prevent outages by more effectively
timing maintenance, enabling it to not only save on operational costs, but ensure it keeps assets at
optimal performance levels.

1.3 Common mistakes in data analysis


As of recent estimates, there are around 54 million data analysts worldwide. This number includes
professionals working in various industries like finance, healthcare, retail, and technology.

With the increasing demand for data-driven decision-making, the number of data analysts is
expected to grow significantly in the coming years.

Data analysis is a vital part of business management, with 53% of businesses saying data access is
now more critical. Data analytics can help you make improvements that help both customers and
employees. But it’s not enough to know how to analyze data; you also need to know how to avoid
mistakes.

Mistakes in data analysis can be costly. They can lead you to make poor decisions or overlook
something important.

Finding and preparing data are the most common data activities, which over 90% of data analysts
perform. These are also the tasks most prone to mistakes. In fact, analysts waste over 44% of their
time each week on unsuccessful activities. So, to be productive, you need to know which common
mistakes to avoid. Let’s take a look.

1. Sample is biased or too small

If your sample is too small or biased towards one group, you may miss important information or
draw incorrect conclusions. For instance, say you are user testing app functionality. Only testing
with right-handed people will miss usability issues for left-handed people. To avoid this, use a
sample that is large and varied enough to give a full picture of your customers. You should also
look at the demographics of your target audience and make sure your sample matches those
demographics. That way, your sample is more likely to be representative of your customers.
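One simple way to check whether a sample matches the target demographics is to compare each group's share of the sample with its share of the audience. The sketch below is a minimal illustration: the function name `representation_gaps`, the 5-point tolerance, and the handedness figures are assumptions made for the example, not values from a real study.

```python
def representation_gaps(sample_counts, population_shares, max_gap=0.05):
    """Compare each group's share of the sample with its share of the audience.

    Returns {group: (sample_share, population_share)} for every group whose
    shares differ by more than max_gap (a hypothetical tolerance).
    """
    total = sum(sample_counts.values())
    gaps = {}
    for group, expected in population_shares.items():
        observed = sample_counts.get(group, 0) / total
        if abs(observed - expected) > max_gap:
            gaps[group] = (round(observed, 3), expected)
    return gaps


# Made-up usability-test sample: no left-handed testers were recruited.
sample = {"right-handed": 40, "left-handed": 0}
# Assumed handedness shares in the target audience (illustrative figures).
audience = {"right-handed": 0.90, "left-handed": 0.10}

flagged = representation_gaps(sample, audience)
```

Here both groups are flagged, because left-handed users are missing entirely and right-handed users are over-represented as a result.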

2. Goals and objectives are not clearly defined

Your goals and objectives shape all aspects of your analysis, from collecting data to writing your
report. So, before you start, you need to define the goal of your analysis and your objectives based
on that goal. For instance, your goal could be to compare the performance of your new multi-line
office phone system with your old single line phone system. Your objectives could then be to:

Collect data on key performance indicators for the same month for your new and old phone
systems.

Test whether there is a significant difference between the KPIs for your new and old phone
systems.

Prepare a stakeholder report detailing your findings.
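The second objective, testing for a significant difference between the two systems, could be approached in many ways. The sketch below uses a permutation test, which is one possible choice and not necessarily the test the author has in mind. The KPI ("missed calls per day"), the data, the function name `permutation_p_value`, and the number of permutations are all assumptions for illustration.

```python
import random
from statistics import mean


def permutation_p_value(old, new, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Repeatedly shuffles the pooled values and counts how often a random
    split produces a mean difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(new) - mean(old))
    pooled = list(old) + list(new)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[len(old):]) - mean(pooled[:len(old)]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Made-up daily "missed call" counts for the same month on each system.
old_system = [14, 16, 15, 17, 13, 18, 16, 15, 17, 14]
new_system = [9, 8, 10, 7, 11, 9, 8, 10, 9, 7]

p_value = permutation_p_value(old_system, new_system)
```

A small p-value suggests the difference between the two systems is unlikely to be due to chance alone. It does not by itself show that the new system caused the improvement.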

3. Confusing correlation with causation

There are many reasons two variables may correlate, such as:

A causes B, or vice versa.

A and B are both caused by another factor, C.

A causes C, which causes B, or vice versa.

The correlation is purely a coincidence.

To find out whether two factors are causally related, you should look at the context. Are there any
other factors that could cause the correlation? Don’t assume a causal connection without conducting
more research.
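The second case in the list above, where A and B are both caused by another factor C, can be demonstrated with simulated data. This is a sketch under stated assumptions: the variables are synthetic, and the helper `pearson` is written out by hand for the example. A and B never influence each other, yet they correlate strongly because both depend on C.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)

# C drives both A and B; A and B have no direct effect on each other.
c = [rng.gauss(0, 1) for _ in range(2000)]
a = [ci + rng.gauss(0, 0.5) for ci in c]   # A = C + independent noise
b = [ci + rng.gauss(0, 0.5) for ci in c]   # B = C + independent noise

r_ab = pearson(a, b)   # strong correlation despite no causal link

# Subtracting C works here only because A and B were built as C + noise;
# with real data you would have to estimate C's effect first.
a_resid = [ai - ci for ai, ci in zip(a, c)]
b_resid = [bi - ci for bi, ci in zip(b, c)]
r_resid = pearson(a_resid, b_resid)   # close to zero once C is removed
```

Once the shared cause is accounted for, the apparent relationship between A and B largely disappears, which is why checking for other factors matters before claiming causation.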

4. Using the wrong benchmarks for comparison
Benchmarks let you compare your results with those of a different organization or product, but
using the wrong benchmark can hide a genuine increase or decrease in your metric or KPI.
For example, let’s say you compare your small business instant messaging engagement with the
engagement for a large business. You might think your engagement is much lower than it should
be. But if you compare your engagement with another small business, you may find your
engagement is actually above-average.
5. Presenting results without adequate context
When you write your analytical report, you need to put your results into context.
How do they relate to your goals and objectives?
How do they compare to the results of similar studies?
Where do your results fit in the wider market?

Context helps you and your readers interpret your results and gauge their significance. You
should conduct market research both before and after your analysis, and keep up-to-date with the
latest industry trends.

6. Using unreliable data

There are many reasons data may be unreliable, including:

Missing or duplicated data

Abnormal or incorrect values

Rounding errors

Data is second-hand or out-of-date

To ensure your data is high quality, you need to check that it is: Complete, Unique, Consistent,
Valid, Accurate, and Timely. Use data from the original source, and make sure it's no more than
one or two years old. You should also check your data for missing values and other errors before
beginning your analysis.
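Some of these checks are easy to automate before analysis begins. The following sketch counts missing values, duplicate identifiers, and out-of-date records in a small list of records. The function name `quality_report`, the field names, the two-year age limit expressed as 730 days, and the sample rows are all hypothetical.

```python
from datetime import date


def quality_report(records, key, required, today, max_age_days=730):
    """Count simple quality problems in a list of dict records.

    key, required, and max_age_days are hypothetical parameters: the
    identifying field, the fields that must be filled in, and the maximum
    acceptable age of a record (about two years here).
    """
    seen = set()
    report = {"missing": 0, "duplicate": 0, "stale": 0}
    for row in records:
        if any(row.get(field) in (None, "") for field in required):
            report["missing"] += 1          # completeness check
        if row[key] in seen:
            report["duplicate"] += 1        # uniqueness check
        seen.add(row[key])
        if (today - row["updated"]).days > max_age_days:
            report["stale"] += 1            # timeliness check
    return report


# Made-up customer records.
rows = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 5, 1)},
    {"id": 2, "email": "",              "updated": date(2024, 6, 1)},  # missing email
    {"id": 2, "email": "b@example.com", "updated": date(2024, 6, 1)},  # duplicate id
    {"id": 3, "email": "c@example.com", "updated": date(2019, 1, 1)},  # out of date
]

report = quality_report(rows, key="id", required=["email"], today=date(2025, 1, 1))
```

Checks for validity and accuracy usually need domain rules (allowed ranges, reference data) and are not covered by this sketch.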

7. Not standardizing the data

Data analysts get data from a variety of sources, including spreadsheets, SaaS apps, and cloud
databases. This data is usually formatted in different ways. For instance, some data might be in
percentages and some in fractions. If you don’t standardize how the data is formatted, it can
affect the results of your analysis.

You need to ensure all your data is labeled and formatted the same way. That way, it’s easier to
catalog and compare. Some programs will automatically format the data for you, for example,
Excel has an AutoFormat option.
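As a small illustration of standardizing mixed formats, the sketch below converts values written as percentages, fractions, or decimals into a single representation (a proportion between 0 and 1). The function name `to_proportion` and the input strings are assumptions for the example.

```python
from fractions import Fraction


def to_proportion(value):
    """Convert '45%', '9/20', or '0.45' style strings to a float proportion."""
    text = str(value).strip()
    if text.endswith("%"):
        return float(text[:-1]) / 100      # percentage -> proportion
    if "/" in text:
        return float(Fraction(text))       # fraction -> proportion
    return float(text)                     # already a decimal


raw = ["45%", "9/20", "0.45", " 12.5% "]
clean = [to_proportion(v) for v in raw]    # [0.45, 0.45, 0.45, 0.125]
```

After conversion, the first three values are recognizably the same number, which would not be obvious if they were compared as raw strings.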

8. Not fully understanding your metrics and KPIs

Before you start your analysis, you need to ensure you’re clear on what a KPI is and which ones
are relevant to your study. You should also write a short definition of what each metric means.
This will help both you and your readers since metrics can have different labels and meanings. For
instance, bounce rate can mean:

The percentage of website visitors who leave after only viewing one page.

The percentage of emails that couldn’t be delivered to the addresses on your mailing list.

Defining your KPIs beforehand ensures they are clear to you and your readers.
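Writing each definition down as a small function is one way to make it unambiguous. The two functions below are a sketch of the two meanings of bounce rate described above. The function names and the figures are made up for illustration.

```python
def web_bounce_rate(pages_per_visit):
    """Share of website visits that viewed exactly one page."""
    bounces = sum(1 for pages in pages_per_visit if pages == 1)
    return bounces / len(pages_per_visit)


def email_bounce_rate(sent, undelivered):
    """Share of sent emails that could not be delivered."""
    return undelivered / sent


# Made-up figures: five site visits and one mailing of 200 emails.
web_rate = web_bounce_rate([1, 3, 1, 5, 2])               # 2 of 5 visits
email_rate = email_bounce_rate(sent=200, undelivered=6)   # 6 of 200 emails
```

The two rates share a name but measure unrelated things, so a report should state which definition it uses.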

9. Visualizing data the wrong way

There are many ways you can visualize data, from tables to pie charts. Visualizing your data helps
you see patterns and relationships more clearly. You can also use these visualizations in a report,
infographic, or business communication guide. But if you pick the wrong visualization method, you
could end up with a misleading picture of your data.

To choose the right visualization, think about how the data is related and how many variables you
have. You can use color to distinguish between variables or highlight key findings. Plus, you can
use size to indicate value or emphasize importance. Play around with different visualizations until
you find the one that makes the most sense.

Introduction: The Rise of Data Collection

From online shopping and social media to smart devices and artificial intelligence, our lives are
increasingly interconnected, generating vast amounts of data. However, this exponential growth
in data collection raises important ethical considerations regarding privacy, consent, and the
responsible use of personal information.

The Importance of Privacy in the Digital Era

Privacy is a fundamental human right that safeguards personal autonomy, dignity, and freedom. In
the digital era, the concept of privacy has evolved as data is constantly collected, stored, and
analyzed. Preserving privacy is crucial to protecting individuals' identities, preventing
unauthorized access, and ensuring the confidentiality and security of personal information.

1.4 Ethics of data collection, management, usage and privacy


Ethical Challenges in Data Collection

a) Informed Consent: Obtaining informed consent from individuals is crucial before collecting
their personal data. Ethical data collection practices involve transparent disclosure of the purpose,
scope, and potential uses of the data, enabling individuals to make informed choices about sharing
their information.

b) Data Minimization: Collecting only the necessary data minimizes the risk of misuse and
unauthorized access. Ethical data collectors should strive to limit the collection of personal
information to what is relevant and essential for the intended purpose.

c) Data Security and Protection: Safeguarding data against unauthorized access, breaches, and
cyber threats is a vital ethical responsibility. Implementing robust security measures, encryption
protocols, and data anonymization techniques helps protect personal information from
exploitation.
d) Transparency and Accountability: Ethical data collection requires transparency and
accountability from organizations. Individuals should be informed about how their data will be
used, who will have access to it, and the steps taken to ensure its security. Organizations should
also be accountable for adhering to privacy regulations and ethical data practices.
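Data minimization and identifier protection can be illustrated with a short sketch. It keeps only the fields an analysis needs and replaces the direct identifier with a keyed hash. Note that this is pseudonymization, which is weaker than full anonymization, and that the function name `minimize_and_pseudonymize`, the field names, the record, and the key are all hypothetical.

```python
import hashlib
import hmac


def minimize_and_pseudonymize(record, needed_fields, id_field, secret_key):
    """Keep only the needed fields and replace the identifier with a keyed hash.

    This is pseudonymization: whoever holds secret_key can still link
    records belonging to the same person.
    """
    token = hmac.new(secret_key, str(record[id_field]).encode("utf-8"),
                     hashlib.sha256).hexdigest()
    reduced = {field: record[field] for field in needed_fields}
    reduced["customer_token"] = token
    return reduced


# Illustrative key only; a real key would come from a secrets store.
SECRET_KEY = b"example-key-for-illustration-only"

# Made-up customer record containing more personal data than the analysis needs.
customer = {"email": "user@example.com", "name": "Example User",
            "phone": "0000000000", "city": "Kombolcha", "basket_total": 420.50}

safe = minimize_and_pseudonymize(customer, ["city", "basket_total"], "email", SECRET_KEY)
```

The resulting record contains no name, phone number, or email address, but the same customer always maps to the same token, so purchases can still be grouped per customer.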
Balancing Innovation and Privacy Rights
While data collection fuels innovation and drives advancements, it is essential to strike a balance
between progress and privacy rights. Organizations must adopt responsible data collection
practices, prioritize privacy by design, and implement privacy-enhancing technologies. Privacy
regulations and ethical frameworks play a crucial role in guiding data collection practices and
ensuring accountability.
Building a Responsible Data Collection Culture
a) Data Ethics Training: Promoting awareness and understanding of data ethics among employees
is essential. Training programs can educate individuals about privacy rights, ethical data handling,
and the responsible use of personal information.
b) Privacy-First Mindset: Organizations should adopt a privacy-first mindset, placing individuals'
privacy and data protection at the core of their operations. Privacy impact assessments and ethical
guidelines can help evaluate the potential risks and ethical implications of data collection
initiatives.
c) User Empowerment: Empowering individuals with control over their data and providing clear
options for consent management fosters trust and respects privacy preferences. Giving
individuals the ability to access, correct, and delete their personal data promotes user autonomy
and strengthens ethical data practices.

1.5 Summary
Introduction to Business Analytics:
Business analytics is a powerful tool for decision-making and strategy development, helping
organizations interpret large volumes of data. It bridges the gap between business needs and
technology by providing data-driven recommendations.
Types of Business Analytics:
• Descriptive Analytics – Identifies trends and patterns in historical data.
• Diagnostic Analytics – Determines why past events occurred.
• Predictive Analytics – Uses statistical methods to forecast future outcomes.
• Prescriptive Analytics – Suggests the best course of action based on testing and simulations.

Business Analytics vs. Data Science:
Business analytics focuses on extracting insights for decision-making using structured data and
statistical concepts.
Data science involves coding, algorithm development, and handling both structured and
unstructured data to uncover patterns.
Both fields contribute to business decision-making but differ in methodology and focus areas.
Benefits of Business Analytics:
More Informed Decision-Making: Analytics supports important strategic decisions. Example: A
clothing manufacturer analyzes how different sizes sold in previous years.
Greater Revenue: Companies that embrace data and analytics initiatives can experience significant
financial returns. Example: A supermarket chain uses sales data from its stores to adjust prices
dynamically.
Improved Operational Efficiency: Predictive analytics helps companies anticipate maintenance
and operational issues, preventing costly downtime.

Common Mistakes in Data Analysis:

• Biased or Small Sample Size – Leads to inaccurate conclusions.
• Undefined Goals and Objectives – Results in unfocused analysis.
• Confusing Correlation with Causation – Incorrectly assuming relationships between variables.
• Using Wrong Benchmarks – Comparing metrics to irrelevant data can be misleading.
• Lack of Context in Results – Leads to misinterpretation of data.
• Unreliable Data Sources – Leads to inaccurate insights.
• Lack of Standardization – Different data formats can cause errors.
• Misunderstanding KPIs – Metrics must be clearly defined.
• Poor Data Visualization – Choosing incorrect visualization methods can distort findings.

Ethical Considerations in Data Collection:

• Informed Consent – Users must be aware of how their data will be used.
• Data Minimization – Only relevant and necessary data should be collected.
• Data Security & Protection – Preventing breaches and unauthorized access is essential.
• Transparency & Accountability – Organizations should clearly disclose their data usage practices.

Balancing Innovation and Privacy Rights:


Companies must find a balance between leveraging data for innovation and respecting user
privacy. Regulations and ethical guidelines help ensure responsible data collection.

Building a Responsible Data Culture:

• Data Ethics Training: Educating employees on ethical data practices.
• Privacy-First Mindset: Organizations should prioritize data protection.
• User Empowerment: Allowing users to control their data fosters trust.

1.6 Summary Questions:


1. What is business analytics, and how does it help organizations?
2. What are the four types of business analytics, and how do they differ?
3. How does business analytics compare to data science?
4. What are some key benefits of using business analytics?
5. How does business analytics generate greater revenue?
6. What are some common mistakes in data analysis, and how can they impact decision-making?
7. Why is it important to distinguish correlation from causation in data analysis?
8. What ethical concerns arise in data collection and analytics?
9. How can companies balance innovation with privacy rights?
10. What strategies can organizations adopt to build a responsible data collection culture?

Chapter 2

Big Data Collection and Ethics

2.1 Definition of Big Data


Big Data refers to extremely large and complex datasets that cannot be processed effectively
with traditional data-processing techniques.
The Vs of big data
Big data definitions may vary slightly, but big data is almost always described in terms of volume,
velocity, and variety. These characteristics are often referred to as the “3 Vs of big data” and
were first defined in 2001 by analyst Doug Laney at META Group (later acquired by Gartner).
Volume
As its name suggests, the most common characteristic associated with big data is its high
volume. This describes the enormous amount of data that is available for collection and produced
from a variety of sources and devices on a continuous basis.
Velocity
Big data velocity refers to the speed at which data is generated. Today, data is often produced in
real time or near real time, and therefore, it must also be processed, accessed, and analyzed at the
same rate to have any meaningful impact.
Variety
Data is heterogeneous, meaning it can come from many different sources and can be structured,
unstructured, or semi-structured. More traditional structured data (such as data in spreadsheets or
relational databases) is now supplemented by unstructured text, images, audio, video files, or
semi-structured formats like sensor data that can’t be organized in a fixed data schema.
In addition to these three original Vs, three others are often mentioned in relation to
harnessing the power of big data: veracity, variability, and value.
Veracity: Big data can be messy, noisy, and error-prone, which makes it difficult to control the
quality and accuracy of the data. Large datasets can be unwieldy and confusing, while smaller
datasets could present an incomplete picture. The higher the veracity of the data, the more
trustworthy it is.
Variability: The meaning of collected data is constantly changing, which can lead to
inconsistency over time. These shifts include not only changes in context and interpretation but
also data collection methods based on the information that companies want to capture and
analyze.

Chapter2: Big Data Collection and Ethics 12


Compiled By: Kedir Abdu Email:[email protected]
Value: It’s essential to determine the business value of the data you collect. Big data must
contain the right data and then be effectively analyzed in order to yield insights that can help
drive decision-making.
Big data examples
Data can be a company’s most valuable asset. Using big data to reveal insights can help you
understand the areas that affect your business—from market conditions and customer purchasing
behaviors to your business processes.
Here are some big data examples that are helping transform organizations across every industry:
• Tracking consumer behavior and shopping habits to deliver hyper-personalized retail
product recommendations tailored to individual customers
• Monitoring payment patterns and analyzing them against historical customer activity
to detect fraud in real time
• Combining data and information from every stage of an order’s shipment journey with
hyperlocal traffic insights to help fleet operators optimize last-mile delivery
• Using AI-powered technologies like natural language processing to analyze unstructured
medical data (such as research reports, clinical notes, and lab results) to gain new insights
for improved treatment development and enhanced patient care
• Using image data from cameras and sensors, as well as GPS data, to detect potholes and
improve road maintenance in cities
• Analyzing public datasets of satellite imagery and geospatial datasets to visualize,
monitor, measure, and predict the social and environmental impacts of supply chain
operations
These are just a few ways organizations are using big data to become more data-driven so they
can adapt better to the needs and expectations of their customers and the world around them.
2.2 Difference between Big Data and Other Types of Information
Big Data differs from traditional data in several key ways:
Size: Traditional data is typically smaller and can be stored in databases, whereas Big Data
involves vast amounts of information.
Structure: Traditional data is often structured (e.g., in relational databases), while Big Data
includes unstructured and semi-structured formats (e.g., social media posts, videos, sensor data).
Processing Methods: Traditional data can be handled using standard database management
systems (DBMS), whereas Big Data requires advanced tools like Hadoop, Spark, and NoSQL
databases.
2.3 How to Collect Big Data
Big Data is collected from various sources, including:

• Social Media: Platforms like Twitter, Facebook, and Instagram provide user-generated data.
• Sensors and IoT Devices: Devices like smart meters, wearables, and industrial sensors
continuously generate data.
• Web Scraping: Automated extraction of data from websites.
• Transactional Data: Information from online purchases, banking transactions, and
customer interactions.
• Surveys and Research: Large-scale surveys and academic studies contribute to Big Data.
• Public and Government Databases: Open data sources from government agencies and
institutions.
How does big data work?
The central concept of big data is that the more visibility you have into anything, the more
effectively you can gain insights to make better decisions, uncover growth opportunities, and
improve your business model.
Making big data work requires three main actions:
• Integration: Big data systems collect terabytes, and sometimes even petabytes, of raw data
from many sources. This data must be received, processed, and transformed into the format
that business users and analysts need before they can start analyzing it.
• Management: Big data needs big storage, whether in the cloud, on-premises, or both.
Data must be stored in whatever form is required, and it must be processed and made
available in real time. Increasingly, companies are turning to cloud solutions to take
advantage of their compute capacity and scalability.
• Analysis: The final step is analyzing and acting on big data; otherwise, the investment
won’t be worth it. Beyond exploring the data itself, it’s also critical to communicate and
share insights across the business in a way that everyone can understand. This
includes using tools to create data visualizations like charts, graphs, and
dashboards.

2.4 Ethical Implications of Collecting and Using Big Data


The collection and use of Big Data raise several ethical concerns:
• Privacy Issues: Unauthorized collection or misuse of personal information.
• Data Security: The risk of data breaches exposing sensitive information.
• Bias and Discrimination: AI models trained on biased data may lead to unfair outcomes.
• Informed Consent: Users may not always be aware that their data is being collected.
• Surveillance and Autonomy: Excessive data collection can lead to concerns over mass
surveillance and personal freedom.
• Transparency and Accountability: Organizations must disclose how data is collected,
used, and stored.
2.5 Opportunities and Challenges of Big Data for Decision-Makers and Business
Organizations
Opportunities:
• Improved Decision-Making: Data-driven insights help businesses and policymakers make
informed choices.
• Enhanced Customer Experience: Personalization based on user behavior improves
satisfaction.
• Operational Efficiency: Automation and predictive analytics optimize workflows and
reduce costs.
• Innovation and Competitive Advantage: Data analytics drives new product development
and market trends.
• Public Health and Safety: Big Data aids in medical research, disaster response, and crime
prevention.
Challenges of implementing big data analytics
While big data has many advantages, it does present some challenges that organizations must be
ready to tackle when collecting, managing, and taking action on such an enormous amount of
data.
The most commonly reported big data challenges include:
• Lack of data talent and skills. Data scientists, data analysts, and data engineers are in short
supply, and are some of the most highly sought after (and highly paid) professionals in
the IT industry. Lack of big data skills and experience with advanced data tools is one of
the primary barriers to realizing value from big data environments.
• Speed of data growth. Big data, by nature, is always rapidly changing and increasing.
Without a solid infrastructure in place that can handle your processing, storage, network,
and security needs, it can become extremely difficult to manage.
• High costs. Infrastructure, storage, and analytics require significant investment.
• Problems with data quality. Data quality directly impacts the quality of decision-making,
data analytics, and planning strategies. Raw data is messy and can be difficult to curate.
Having big data doesn’t guarantee results unless the data is accurate, relevant, and properly
organized for analysis. This can slow down reporting, but if not addressed, you can end up
with misleading results and worthless insights.
• Compliance violations. Big data contains a lot of sensitive data and information, making it
a tricky task to continuously ensure data processing and storage meet data privacy and
regulatory requirements, such as data localization and data residency laws.
• Integration complexity. Most companies work with data siloed across various systems and
applications across the organization. Integrating disparate data sources and making data
accessible for business users is complex, but vital, if you hope to realize any value from
your big data.
• Security concerns. Big data contains valuable business and customer information, making
big data stores high-value targets for attackers. Since these datasets are varied and complex,
it can be harder to implement comprehensive strategies and policies to protect them.

2.6 Summary Questions:


1. What is Big Data?
2. What are the key characteristics of Big Data, and how do they influence data processing
and analysis?
3. How does Big Data differ from traditional data in terms of size, structure, and processing
methods?
4. What are the main sources of Big Data collection, and what role do they play in business
analytics?
5. Discuss the ethical concerns associated with Big Data collection and usage. How can
businesses address these challenges?
6. What are the opportunities and challenges of implementing Big Data analytics in
decision-making and business organizations?

Chapter 3
Answering Business Questions with Data Analytics

3.1 Quantitative strategies for answering business questions

3.1.1 Predicting analyzed data

Predictive analytics is the use of data to predict future trends and events. It uses historical data to
forecast potential scenarios that can help drive strategic decisions.
📘 Example: Predicting Monthly Sales for a Store

🎯 Goal: Use past business data to predict next month's sales.

Step 1: Collect Historical Data

Marketing_Spend ($)   Store_Footfall   Avg_Discount (%)   Monthly_Sales ($)
1000                  2000             10                 25,000
500                   1500             5                  18,000
1500                  2500             12                 30,000
400                   1200             4                  15,000
1100                  2100             8                  26,500

Step 2: Analyze the Data

Patterns we discover:
More marketing spend = more foot traffic = more sales
Higher discounts boost sales (to a point)
Foot traffic is a strong predictor

Step 3: Prediction Formula (Simplified)
A basic predictive model (like linear regression) might learn something like:

Monthly_Sales ≈ 10 * Marketing_Spend + 5 * Store_Footfall + 100 * Avg_Discount - 15000

The coefficients in this formula are illustrative only. They have not been fitted to the table
above: for the first row the formula gives 10*1000 + 5*2000 + 100*10 - 15000 = 6,000, not 25,000.

Step 4: Make a Prediction

🔍 For a new month:

Marketing_Spend = $800
Store_Footfall = 1800
Avg_Discount = 7%

📐 Plug into the formula:

Sales ≈ 10*800 + 5*1800 + 100*7 - 15000
= 8000 + 9000 + 700 - 15000
= 17,700 - 15000
= $2,700

👉 Predicted Monthly Sales = $2,700

This figure is far below every month in the table, which is a reminder that a prediction is
only as good as the model behind it: in practice the coefficients must be estimated from the
historical data before the formula is used.
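
The idea of learning a prediction formula from past data can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in these notes: it assumes a one-variable model (sales explained by footfall alone, since foot traffic is described above as a strong predictor and five rows are too few to estimate three coefficients reliably), and the function name fit_line is hypothetical. The data values are the five rows of the example table.

```python
# Sketch: estimate Monthly_Sales = intercept + slope * Store_Footfall
# by ordinary least squares, using the five months from the example table.
footfall = [2000, 1500, 2500, 1200, 2100]
sales = [25000, 18000, 30000, 15000, 26500]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(footfall, sales)

# Predict sales for a new month with 1800 visitors (assumed input).
predicted_sales = intercept + slope * 1800
print(f"slope={slope:.2f}, intercept={intercept:.2f}, prediction={predicted_sales:.0f}")
```

Because the coefficients here are estimated from the table, the prediction falls inside the range of sales actually observed. A fuller model would add marketing spend and discount as predictors, which needs more months of data than the example provides.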

3.1.2 Evaluating information

Evaluating information is the process of reviewing and analyzing data to ensure it is reliable,
relevant, and suitable for solving a specific business problem. Before you can trust your data for
analysis, predictions, or strategic decisions, you need to evaluate it. Bad or misleading data leads
to poor business outcomes.

✅ Key Aspects of Evaluating Information:

Criteria       What It Means in Business Analytics
Accuracy       Is the data correct? Are there typos, duplicates, or errors?
Relevance      Is the data related to the business question you’re answering?
Completeness   Are there missing values? Do you have all the necessary variables?
Timeliness     Is the data up to date? Does it reflect current trends and customer behavior?
Consistency    Is the data format and logic uniform across sources?
Credibility    Is the source of data trustworthy? Internal systems? Third-party reports?

🏢 Business Example:

Let’s say you're trying to forecast product sales.

❌ If your data includes incomplete customer records, outdated sales figures, or duplicate
transactions, your predictions will be unreliable.

✅ Evaluating the data first helps you clean it, fill gaps, and focus only on relevant variables,
ensuring better insights and decisions.

🔧 Tools & Techniques Used:


Data validation (Excel, SQL, Python)
Exploratory data analysis (Pandas, Tableau, Power BI)
Descriptive statistics (mean, median, null counts)
Data cleaning scripts (removing duplicates, fixing formats)
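
As a rough illustration of the checks listed above, the following Python sketch counts missing values (completeness), duplicate transactions (accuracy), and outdated rows (timeliness). The records, field names, cutoff date, and the function evaluate_records are all hypothetical and exist only for this example.

```python
from datetime import date

# Hypothetical sales records; field names are assumptions for illustration.
records = [
    {"order_id": 1, "customer": "A", "amount": 120.0, "order_date": date(2024, 5, 1)},
    {"order_id": 2, "customer": "B", "amount": None, "order_date": date(2024, 5, 3)},
    {"order_id": 2, "customer": "B", "amount": None, "order_date": date(2024, 5, 3)},   # duplicate
    {"order_id": 3, "customer": None, "amount": 80.0, "order_date": date(2021, 1, 15)},  # outdated
]


def evaluate_records(rows, required_fields, cutoff):
    """Return simple data-quality counts for a list of record dictionaries."""
    # Completeness: how many rows are missing each required field?
    null_counts = {
        field: sum(1 for row in rows if row.get(field) is None)
        for field in required_fields
    }
    # Accuracy: how many rows repeat an order_id that was already seen?
    seen_ids = set()
    duplicates = 0
    for row in rows:
        if row["order_id"] in seen_ids:
            duplicates += 1
        seen_ids.add(row["order_id"])
    # Timeliness: how many rows are older than the cutoff date?
    outdated = sum(1 for row in rows if row["order_date"] < cutoff)
    return {"null_counts": null_counts, "duplicates": duplicates, "outdated": outdated}


report = evaluate_records(records, ["customer", "amount"], cutoff=date(2024, 1, 1))
print(report)
```

A report like this does not fix anything by itself; it tells the analyst which rows to clean, fill, or drop before the data is used for forecasting.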

3.1.3 Identifying causal relationships

Identifying a causal relationship means figuring out whether one thing actually causes another
to happen: not just that the two happen together (which would be correlation), but that one
directly influences the other.

3.1.4 Difference between correlation and causation
In analytics, correlation and causation both describe relationships between variables. However, the
two terms are not interchangeable and have significant differences. Causation indicates that one
event causes another. Correlation only identifies that a relationship exists between two events or
outcomes.

In a situation where two variables respond similarly to an event, you may assume that one
caused the other or that the two are somehow directly connected. However, this
isn’t always the case, so it is important to be able to distinguish between correlation and
causation. The rest of this section describes each term and explains how to tell them apart
when describing the relationship between variables.

What is correlation?

Correlation measures the linear relationship between variables. In a positive correlation, when the
value of one variable goes up, the other does as well. When one variable goes down, the other
variable descends, too.

A negative correlation describes the opposite—as one variable goes up, the other goes down, with
the two variables moving in opposite directions. If no relationship exists between variables, you
would say zero correlation is present.

You can represent the strength of the relationship between variables using a correlation coefficient
ranging from -1 to +1, where the closer the coefficient is to zero, the weaker the correlation
is:

1 = Perfect positive correlation

0.5 = Moderate positive correlation

0 = Zero correlation

-0.5 = Moderate negative correlation

-1 = Perfect negative correlation

You can also use scatter plots to visualize correlations. With a positive correlation, the points
on the scatter plot trend upward from left to right; with a negative correlation, they trend
downward from left to right. A scatter plot of variables with no correlation
will have points that appear spread throughout the graph.

There are limits to how much you can learn from correlations, as correlation alone
isn’t enough to prove causation. Additionally, the correlation coefficient only captures linear
relationships between variables, so a strong non-linear relationship can still produce a coefficient near zero.
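
The correlation coefficient and its linear-only limitation can be illustrated with a short Python sketch. It assumes the Pearson coefficient (the usual choice for the -1 to +1 scale described above); the function name pearson_r is hypothetical, the footfall and sales figures come from the monthly sales example in section 3.1.1, and the curved dataset is made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Nearly linear relationship: store footfall vs. monthly sales.
footfall = [2000, 1500, 2500, 1200, 2100]
sales = [25000, 18000, 30000, 15000, 26500]
r_linear = pearson_r(footfall, sales)

# Perfect but non-linear relationship: y is exactly x squared (made-up data).
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]
r_curved = pearson_r(xs, ys)

print(f"footfall vs sales: r = {r_linear:.3f}")
print(f"x vs x^2:          r = {r_curved:.3f}")
```

The first coefficient is close to +1, while the second is zero even though y is fully determined by x. A scatter plot would reveal the curved pattern that the coefficient misses.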

Even when variables are strongly correlated, it doesn’t prove a change in one variable caused the
change in the other. To be able to do that, you must establish causation.

What is causation?

Causation occurs when one variable is directly responsible for the change in the other. In other
words, a change in one variable causes a change in another variable. Causation can be more
challenging to prove than correlation and requires experimentation using both independent and
controlled variables.

In order to prove causation, you need a properly designed experiment that demonstrates these three
conditions:

Temporal sequencing: Temporal sequencing states that X, referring to the variable causing the
change, comes before Y, the variable that changes.

✔️ Example:
The company launches the marketing campaign on May 1st. Online sales begin increasing
steadily starting May 3rd.
The campaign (X) occurred before the increase in sales (Y), not the other way around.

Non-spurious relationship: A non-spurious relationship means that you can demonstrate with
certainty that the relationship between X and Y couldn’t occur simply by chance.

✔️ Example:

The marketing team analyzes data across multiple months and finds that every time similar
campaigns are run, sales increase. When no campaigns are active, sales remain flat. Statistical
analysis confirms the correlation is strong and unlikely to be random.
This consistent pattern suggests the campaign is likely contributing to sales growth.

Elimination of alternative causes: By eliminating alternative causes, you are stating that the
relationship between X and Y isn’t due to other outside variables that aren’t considered part of the
experiment.

✔️ Example:
The team checks for other variables: there were no discounts, holidays, or product launches
during the time sales went up, so none of these factors can explain the spike.
With these variables eliminated, it’s more reasonable to conclude that the marketing campaign is
causing the increase in sales.

If your experiment fails to demonstrate temporal sequencing, show a non-spurious relationship, or
eliminate plausible alternative causes, you can’t claim causation. This is what makes causation
harder to establish than correlation: all three conditions must hold, not just a statistical association.
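
The "non-spurious relationship" condition can be checked with a simple statistical test. The sketch below uses an exact permutation test, which is one possible choice and not necessarily the analysis the marketing team in the example would run. The sales figures and the function name exact_permutation_test are hypothetical. Note that the test addresses chance only; temporal sequencing and alternative causes still have to be examined separately.

```python
from itertools import combinations

# Hypothetical monthly online sales (in thousands) with and without a campaign.
campaign_sales = [120, 135, 128, 140]
baseline_sales = [100, 98, 105, 102]


def mean(values):
    return sum(values) / len(values)


def exact_permutation_test(treated, control):
    """Return (observed difference in means, one-sided p-value).

    The p-value is the share of all possible relabelings of the pooled data
    whose difference in means is at least as large as the observed one.
    """
    observed = mean(treated) - mean(control)
    pooled = treated + control
    k = len(treated)
    at_least_as_large = 0
    total = 0
    for chosen in combinations(range(len(pooled)), k):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-9:
            at_least_as_large += 1
    return observed, at_least_as_large / total


observed_diff, p_value = exact_permutation_test(campaign_sales, baseline_sales)
print(f"difference in mean sales: {observed_diff}, p-value: {p_value:.4f}")
```

A small p-value means a difference this large would rarely appear if the campaign labels were assigned at random, which supports (but does not by itself prove) a real effect.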

3.2 Data Mining vs. Data Analysis


Introduction

In the age of big data, organizations are increasingly relying on techniques like data mining
and data analysis to make informed decisions. Both processes are integral to transforming raw
data into actionable insights, yet they serve distinct purposes and are often confused with one
another. This topic delves into the difference between data mining and data analysis, exploring
their definitions, processes, applications, and how they contribute to data-driven decision-
making.

What is Data Mining?

Data mining is the process of discovering hidden patterns and relationships within large datasets.
It uses algorithms and analytical methods to sift through vast amounts of
data and surface insights that may not be immediately obvious. The primary
goal of data mining is to extract valuable information that can predict outcomes and support
decision-making.

Data mining is often associated with business intelligence, where it plays a critical role in
identifying trends, customer preferences, and potential risks. For instance, companies use data
mining for fraud detection, marketing campaigns, and sales data analysis. It is a crucial component
of knowledge discovery, where data is transformed from unstructured or semi-structured formats
into meaningful insights.

Key components of data mining include:

• Data Preparation: Transforming raw data into a format suitable for mining.
• Data Exploration: Examining datasets to identify potential patterns.
• Data Mining Techniques: Employing methods like clustering, classification, and
association to discover patterns.
• Pattern Evaluation: Assessing the discovered patterns to determine their significance.
• Knowledge Representation: Presenting the findings in a comprehensible format, such as
data visualization.
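
To make the association technique and the pattern evaluation step more concrete, here is a minimal Python sketch that measures one candidate rule using support and confidence, two standard measures in association rule mining. The shopping baskets and the function names support and confidence are made up for this illustration; real data mining tools search over many candidate rules automatically.

```python
# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing antecedent, the share that also contain consequent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


# Evaluate the candidate rule "customers who buy bread also buy milk".
rule_support = support({"bread", "milk"}, baskets)
rule_confidence = confidence({"bread"}, {"milk"}, baskets)
print(f"support = {rule_support:.2f}, confidence = {rule_confidence:.2f}")
```

Here bread and milk appear together in 3 of 5 baskets, and 3 of the 4 baskets with bread also contain milk. Whether that is significant enough to act on is the pattern evaluation question.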

What is Data Analysis?

Data analysis, on the other hand, refers to the systematic examination of datasets to interpret data,
draw conclusions, and support decision-making. Unlike data mining, which focuses on discovering
hidden patterns, data analysis is concerned with understanding the data at hand, testing
hypotheses, and making data-driven decisions based on the results.

The data analysis process typically involves:

Data Collection: Gathering data from various sources, including structured and unstructured
data.

Data Preparation: Cleaning and organizing data for analysis.

Exploratory Data Analysis (EDA): Investigating data characteristics, identifying outliers, and
summarizing data distributions.

Statistical Analysis: Applying statistical methods to test hypotheses and draw conclusions.

Data Visualization: Creating visual representations like bar charts to communicate findings.

Interpretation: Drawing meaningful insights from the analysis to inform decision-making.
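
The exploratory step of this process can be sketched with Python's standard statistics module. The daily order counts below are hypothetical, and the 1.5 x IQR rule used to flag outliers is one common convention, assumed here for illustration rather than prescribed by these notes.

```python
import statistics

# Hypothetical daily order counts; the last value looks suspicious.
daily_orders = [12, 14, 15, 15, 16, 17, 18, 95]

# Summarize the distribution.
summary = {
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "stdev": statistics.stdev(daily_orders),
}

# Flag outliers with the 1.5 * IQR rule (an assumed convention).
q1, _, q3 = statistics.quantiles(daily_orders, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in daily_orders if x < lower_fence or x > upper_fence]

print(summary)
print("outliers:", outliers)
```

The mean is pulled well above the median by the single extreme value, which is exactly the kind of signal EDA is meant to surface before any statistical analysis or forecasting is attempted.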

Data analysis is widely used across industries, including finance, healthcare, and human
resources. For example, data scientists analyze historical data to forecast future trends, optimize
marketing campaigns, and improve customer satisfaction. In essence, data analysis provides a
thorough understanding of the available data, enabling organizations to make informed
decisions.

Data Mining vs. Data Analysis: Key Differences

While both data mining and data analysis are essential for extracting meaningful information
from data, they differ from each other in several ways:

Primary Focus:

Data Mining: Focuses on discovering hidden patterns within large datasets to predict outcomes.

Data Analysis: Focuses on examining and interpreting data to draw conclusions and test
hypotheses.

Process:

Data Mining: Involves a more complex and algorithm-driven process, including the use of
machine learning and artificial intelligence.

Data Analysis: Involves a systematic approach to exploring and interpreting data, often relying
on statistical methods.

Data Structure:

Data Mining: Can work with both structured and unstructured data.

Data Analysis: Primarily deals with structured data, though it can also handle semi-structured
data.

Applications:

Data Mining: Used in fraud detection, predictive analytics, and discovering patterns in customer
behavior.

Data Analysis: Applied in business intelligence, financial forecasting, and sales data analysis.

Outcome:

Data Mining: Aims to uncover unknown insights that can influence future behavior and trends.

Data Analysis: Aims to provide a deeper understanding of the current data, enabling
better-informed decision-making.

The Intersection of Data Mining and Data Analysis

While data mining and data analysis have distinct roles, they often intersect in the broader field of
data science. Data scientists use both techniques to extract valuable insights from data, helping
organizations make data-driven decisions. For instance, a data scientist might use data mining
techniques to identify a pattern in customer behavior and then apply data analysis to understand
the factors driving that behavior.

Additionally, data mining can serve as a precursor to data analysis. After discovering a pattern
through data mining, an analyst might conduct further analysis to validate the findings and
understand the underlying causes. This iterative process of mining and analysis is crucial for
making accurate predictions and informed decisions.

Tools and Technologies in Data Mining and Data Analysis

Both data mining and data analysis leverage a wide range of tools and technologies to process and
interpret data. Some popular tools include:

Data Mining Tools: RapidMiner, KNIME, Weka, SAS Enterprise Miner, and Apache Mahout.

Data Analysis Tools: Sprinkle Data, Microsoft Excel, R, Python, Tableau, and SPSS.

These tools enable professionals to handle large datasets, mine data warehouses, perform complex
calculations, and visualize data for easier interpretation. The choice of tool often depends on the
specific needs of the project and the complexity of the data.

The Role of Data Mining and Data Analysis in Business Intelligence

Business intelligence (BI) relies heavily on both data mining and data analysis to transform data
into actionable insights. By combining these techniques, organizations can not only uncover
patterns in their various data sets but also understand the implications of these patterns on their
business.

For example, in the retail industry, data mining can help identify trends in customer preferences,
while data analysis can provide insights into the effectiveness of marketing strategies. Together,
these insights enable businesses to optimize their operations, improve customer satisfaction, and
drive growth.

Future Trends in Data Mining and Data Analysis

As the volume of data continues to grow, the importance of data mining and data analysis will
only increase. Emerging technologies such as artificial intelligence and machine learning are
enhancing these processes, making it possible to analyze larger datasets and uncover more
complex patterns between data points.

In the future, we can expect to see more advanced data mining and predictive analysis algorithms
that can handle big data and unstructured data more efficiently. Additionally, the integration of
predictive analytics with data mining and data analysis will enable organizations to not only
understand their data but also predict future outcomes with greater accuracy.

Conclusion

Understanding the difference between data mining and data analysis is crucial for leveraging these
techniques effectively. While data mining focuses on discovering hidden patterns and predicting
outcomes, data analysis is concerned with examining data sets and interpreting them to make
informed decisions. Together, they form the backbone of data science and are essential for
data-driven decision-making in today's business environment.

As organizations continue to generate vast amounts of data, the ability to extract meaningful
insights through data mining and data analysis will become increasingly valuable. By using data
analytics and staying at the forefront of these techniques, businesses can gain a competitive edge,
uncover new opportunities, and navigate the complexities of the digital age.

3.3 Summary Questions
1. What is predictive analytics, and how can it be used to forecast future business
outcomes?
2. Why is it important to evaluate information before using it for data analysis or decision-
making? What key factors should be checked during evaluation?
3. What does identifying a causal relationship mean in the context of business analytics, and
how is it different from identifying a correlation?
4. What are the three essential conditions required to prove a causal relationship in business
experiments? Provide an example.
5. How can correlation be misleading if it's mistaken for causation? Give a business-related
example.
6. What is the difference between data mining and data analysis in terms of their goals,
processes, and applications?
7. How do data mining and data analysis work together in the broader scope of business
intelligence and decision-making?
8. What tools and technologies are commonly used for data mining and data analysis? How
do they support analytics processes?
9. Why is it important for businesses to understand the difference between data mining and
data analysis when working with large datasets?
10. How are trends like artificial intelligence and machine learning shaping the future of data
mining and data analysis?

Chapter 4

The Basic Tools of Business Analytics

To be continued…
