Business Analysis Using Python
Contents
Overview of Data Analytics:
Definition of Data Analytics,
Importance in decision-making and business intelligence,
Types of data analytics: Descriptive, Diagnostic, Predictive, and
Prescriptive.
Python for Data Analytics:
Python installation and setup,
Introduction to Python Integrated Development
Environments (IDEs),
Basics of Python programming (variables, data types, loops, and
functions)
Overview of Data Analytics
Definition of Data Analytics:
Data analytics is the practice of collecting, processing, and
examining raw data to identify trends, draw conclusions, and
extract meaningful information. It relies on a range of techniques
and tools that transform data into insights that support
decision-making.
Steps in Data Analysis
Define Data Requirements: This involves determining how the
data will be grouped or categorized. Data can be segmented based
on various factors such as age, demographic, income, or gender,
and can consist of numerical values or categorical data.
Data Collection: Data is gathered from different sources,
including computers, online platforms, cameras, environmental
sensors, or through human personnel.
Data Organization: Once collected, the data needs to be
organized in a structured format to facilitate analysis. This could
involve using spreadsheets or specialized software designed for
managing and analyzing statistical data.
Data Cleaning: Before analysis, the data undergoes a cleaning
process to ensure accuracy and reliability. This involves identifying
and removing any duplicate or erroneous entries, as well as
addressing any missing or incomplete data. Cleaning the data helps
to mitigate potential biases and errors that could affect the analysis
results.
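The cleaning step described above can be sketched in plain Python. This is only a minimal illustration: the records, field names, and the choice to drop incomplete rows (rather than fill them in) are assumptions made for the example, not part of the original material.

```python
# Hypothetical survey records; None marks a missing value.
records = [
    {"id": 1, "age": 34, "income": 52000},
    {"id": 2, "age": None, "income": 48000},   # missing age
    {"id": 1, "age": 34, "income": 52000},     # exact duplicate of the first row
    {"id": 3, "age": 29, "income": None},      # missing income
    {"id": 4, "age": 41, "income": 61000},
]


def clean(rows):
    """Drop exact duplicates, then drop rows with any missing value."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key in seen:
            continue                      # skip duplicate entries
        seen.add(key)
        if any(value is None for value in row.values()):
            continue                      # skip incomplete entries
        cleaned.append(row)
    return cleaned


cleaned_records = clean(records)
print(cleaned_records)
```

Dropping incomplete rows is the simplest policy; depending on the analysis, filling missing values with a default or an average may be more appropriate.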
Importance of Data Analytics in Decision-Making and Business Intelligence
In today’s digital world, data is considered one of the most valuable assets for any business.
Data analytics plays a crucial role in decision-making and business intelligence by providing
actionable insights that drive informed strategies. The importance of data analytics
includes:
Informed Decision-Making: Data analytics helps organizations make decisions based on
facts, reducing reliance on intuition or guesswork. Leaders can use data-driven insights to
allocate resources more efficiently, optimize processes, and identify opportunities for
growth.
Improved Business Intelligence: Business intelligence (BI) is about gathering and
analyzing data to understand market trends, consumer behavior, and internal operations.
Data analytics allows businesses to convert raw data into meaningful information, helping
organizations stay competitive and make strategic choices.
Cost Reduction: By identifying inefficiencies or wasteful spending, companies can optimize
their operations and reduce costs. Analytics helps businesses prioritize resources and
optimize spending.
Predicting Trends: Analytics provides a view into future trends, enabling businesses to
prepare and adapt accordingly, rather than reacting to changes after they happen.
Types of Data Analytics
Descriptive Analytics:
Descriptive analytics is the most basic level of analytics and is reportedly used
by almost 90% of organizations. It answers the question "What has happened?" by
analyzing real-time and historical data. It helps organizations understand past
successes and failures and provides insights for future decision-making.
Example -
Take DMart as an example. By looking at each product's sales history, we can find
out which products sold the most or are in high demand. Based on these sales
trends, we can decide to stock larger quantities of those items for the coming year.
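A descriptive question such as "which products sold the most?" can be answered with a simple count. The sketch below uses a made-up sales log and Python's built-in Counter; the product names and quantities are purely illustrative.

```python
from collections import Counter

# Hypothetical sales log: one entry per unit sold.
sales = ["rice", "soap", "rice", "oil", "rice", "soap"]

units_sold = Counter(sales)

# List products from most sold to least sold.
for product, count in units_sold.most_common():
    print(f"{product}: {count}")

best_seller = units_sold.most_common(1)[0][0]
print("Best seller:", best_seller)
```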
Diagnostic Analytics:
Diagnostic analytics works hand in hand with descriptive analytics. Where descriptive
analytics finds out what happened in the past, diagnostic analytics finds out why it
happened, what measures were taken at the time, and how frequently it has
happened. It gives a detailed explanation of a particular scenario by examining
behavior patterns.
Example -
Take DMart again. Suppose we want to find out why a particular product is in high
demand: is it because of the brand, or because of the quality? Diagnostic analytics
helps answer questions like these.
Predictive Analytics:
Predictive analytics uses the information obtained from descriptive and
diagnostic analytics to estimate what is likely to happen in the future.
This does not mean we have become fortune-tellers: by looking at past
trends and behavioral patterns, we forecast what might happen.
Example -
Good examples are the recommender systems of Amazon and Netflix. When you
buy a product on Amazon, the payment page shows recommendations along the
lines of "customers who purchased this item also purchased...". These
recommendations are based on customers' past purchase behavior: analysts
use that behavior to build associations between products, and those
associations drive the recommendations you see.
Similarly, when you watch a movie or web series on Netflix, it recommends
other movies and series. These recommendations are based on past data and
trends: the system identifies which titles have gained a lot of public
interest and builds its recommendations from that.
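The idea of building associations between products from past purchases can be illustrated by counting how often two items appear in the same order. This is a deliberately simple co-occurrence sketch with invented orders; it is not how Amazon or Netflix actually build their recommendations, and the function name also_bought is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical past orders, each a set of products bought together.
orders = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"laptop", "mouse"},
    {"phone", "charger"},
]

# Count how often each pair of products appears in the same order.
pair_counts = Counter()
for order in orders:
    for pair in combinations(sorted(order), 2):
        pair_counts[pair] += 1


def also_bought(product):
    """Return items bought together with `product`, most frequent first."""
    related = Counter()
    for (a, b), n in pair_counts.items():
        if a == product:
            related[b] += n
        elif b == product:
            related[a] += n
    return [item for item, _ in related.most_common()]


print(also_bought("phone"))
print(also_bought("laptop"))
```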
Prescriptive Analytics:
Prescriptive analytics is an advanced extension of predictive
analytics. A prediction usually leaves several possible courses of
action, and it is not always obvious which one will work.
Prescriptive analytics helps identify the best option. Where
predictive analytics forecasts what is likely to happen,
prescriptive analytics recommends the actions needed to achieve
the forecasted outcome. It is the highest level of analytics and is
used to choose the optimal solution by drawing on descriptive,
diagnostic, and predictive data.
Example -
A good example is Google's self-driving car: using past trends and
forecasted data, it decides when to turn or when to slow down,
much like a human driver.
Python Introduction
What is Python?
Python is a popular programming language. It was created by Guido van
Rossum, and released in 1991.
It is used for:
o web development (server-side),
o software development,
o mathematics,
o system scripting.
What can Python do?
Python can be used on a server to create web applications.
Python can be used alongside software to create workflows.
Python can connect to database systems. It can also read and modify files.
Python can be used to handle big data and perform complex mathematics.
Python can be used for rapid prototyping, or for production-ready software
development.
Why Python?
Python works on different platforms (Windows, Mac, Linux,
Raspberry Pi, etc).
Python has a simple syntax similar to the English language.
Python has syntax that allows developers to write programs with
fewer lines than some other programming languages.
Python runs on an interpreter system, meaning that code can be
executed as soon as it is written. This means that prototyping can
be very quick.
Python can be treated in a procedural way, an object-oriented
way or a functional way.
At the time of writing, the latest stable version of Python was
3.13.5, released in June 2025. This release includes important
bug fixes and optimizations that improve the language's stability
and performance.
How to Install Python in Windows?
To install Python on your system, follow these steps.
Step 1: Select Version to Install Python
On your Windows machine, visit the official Python downloads page,
https://www.python.org/downloads/. Locate a reliable version of
Python 3, preferably version 3.10.11, which was used in testing this
tutorial. Choose the correct link for your device from the options
provided, either Windows installer (64-bit) or Windows
installer (32-bit), and download the executable file.
Step 2: Running the Executable Installer
Run the downloaded installer and click the Install Now button. The
setup will start installing Python on your Windows system. When the
setup completes, Python is installed and the installer displays a
success message.
Step 3: Verify the Python Installation in Windows
Close the installer window after the installation succeeds. You can
check that Python was installed correctly by using either
the command line or the Integrated Development and Learning
Environment (IDLE), which you may have installed. To access
the command line, click on the Start menu and type "cmd" in the
search bar, then click on Command Prompt. In the Command Prompt
window, type python --version and press Enter; the installed version
number is displayed.
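The installation can also be checked from inside Python itself, for example in IDLE. The short sketch below only uses the standard library; the exact version and path it prints depend on your machine.

```python
import platform
import sys

# Version of the interpreter that is running this code, e.g. "3.10.11".
print("Python version:", platform.python_version())

# Full path of the interpreter, useful when several versions are installed.
print("Interpreter path:", sys.executable)
```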
Introduction to Python Integrated Development Environments (IDEs)
An Integrated Development Environment (IDE) for Python is a software
application that provides a comprehensive set of tools and features to
facilitate the development of Python applications. IDEs aim to streamline
the coding process by integrating various functionalities into a single
environment, enhancing productivity and simplifying the development
workflow.
For Python, IDEs help by combining:
Code Editor: Where you write your Python code with features like
syntax highlighting and auto-completion.
Interpreter/Compiler: Executes the Python code.
Debugger: Helps find and fix errors.
Build Automation Tools: To streamline running and packaging your
code.
Version Control Integration: Helps manage code changes.
Popular Python IDEs and Editors
PyCharm: A powerful IDE specifically for Python with professional
and community editions.
Visual Studio Code (VS Code): A lightweight but powerful editor
with Python extensions.
Jupyter Notebook: Ideal for data science and interactive coding
with code, text, and visuals combined.
Spyder: Popular among scientists and engineers for scientific
computing.
IDLE: Python’s built-in simple IDE, great for beginners.
Essential Python Libraries
Syntax of an if...else statement:
if condition:
    # code block if condition is True
else:
    # code block if condition is False
Example:
num = int(input("Enter a number: "))
if num > 0:
    print("Positive number")
else:
    print("Negative number or zero")
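The example above groups zero and negative numbers into one branch. If the three cases need to be told apart, an elif branch can be added. The sketch below wraps the check in a function (the name describe is just an illustration) so it can be tried with several values without typing input.

```python
def describe(num):
    """Classify a number as positive, zero, or negative."""
    if num > 0:
        return "Positive number"
    elif num == 0:
        return "Zero"
    else:
        return "Negative number"


for value in (5, 0, -3):
    print(value, "->", describe(value))
```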
Function and object method calls
What is a function?
A function is a block of reusable code that performs a specific
task.
It’s defined independently using the def keyword.
It can accept input parameters and optionally return a result.
Syntax:
def function_name(parameters):
    # function body
    return value  # optional
Syntax to call a function:
function_name(arguments)
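As a minimal sketch of the two syntax patterns above, the following defines a function and then calls it. The function name add_numbers and the values passed to it are illustrative only.

```python
def add_numbers(a, b):
    """Return the sum of two numbers."""
    return a + b


# Call the function with two arguments and store the returned value.
result = add_numbers(3, 4)
print(result)  # prints 7
```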
Syntax to define a method (a function defined inside a class and called on an object):
class ClassName:
    def method_name(self):
        # Method that can use self to access instance attributes
        pass
Example:
class Employee:
    def __init__(self, name, salary):
        self.name = name      # Employee's name
        self.salary = salary  # Employee's salary

    def display(self):
        print(f"Employee: {self.name}, Salary: ${self.salary}")
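To call a method, first create an object from the class and then use dot notation on that object. The sketch below repeats the Employee class so that it runs on its own; the employee name and salary are made-up values.

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name      # Employee's name
        self.salary = salary  # Employee's salary

    def display(self):
        print(f"Employee: {self.name}, Salary: ${self.salary}")


# Create an object (instance) of the class, then call its method.
emp = Employee("Asha", 50000)
emp.display()  # prints: Employee: Asha, Salary: $50000
```

Note that self is not passed explicitly in the call; Python supplies the object automatically.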
Example:
Python For Loop with String
This code uses a for loop to iterate over a string and print each
character on a new line. The loop assigns each character to the
variable i and continues until all characters in the string have
been processed.
s = "Geeks"
for i in s:
    print(i)
Using range() with For Loop
The range() function is commonly used with for loops to generate a
sequence of numbers. It can take one, two, or three arguments:
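The three forms are range(stop), range(start, stop), and range(start, stop, step). In every form the stop value itself is excluded. A short sketch:

```python
# One argument: counts from 0 up to, but not including, stop.
print(list(range(5)))         # [0, 1, 2, 3, 4]

# Two arguments: counts from start up to, but not including, stop.
print(list(range(2, 6)))      # [2, 3, 4, 5]

# Three arguments: counts from start to stop in steps of step.
print(list(range(1, 10, 3)))  # [1, 4, 7]

# Typical use in a for loop.
for i in range(3):
    print("Iteration", i)
```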
Working with Python Libraries
Contents
Introduction to key Libraries
Numpy
What is NumPy?
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms,
and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open source project and
you can use it freely.
NumPy stands for Numerical Python.
NumPy aims to provide an array object that is up to 50x faster than traditional
Python lists.
The array object in NumPy is called ndarray. It comes with many supporting
functions that make working with it easy.
Arrays are used very frequently in data science, where speed and resources are
very important.
Installation of NumPy
If you have Python and PIP already installed on a system, then
installation of NumPy is very easy.
Install it using this command:
C:\Users\Your Name>pip install numpy
Import NumPy
Once NumPy is installed, import it in your applications by adding the
import keyword:
import numpy
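Example
import numpy
arr = numpy.array([1, 2, 3, 4, 5])  # create an array from a Python list
print(arr)
NumPy as np
NumPy is usually imported under the np alias.
alias: In Python, an alias is an alternate name for referring to the same thing.
Create an alias with the as keyword while importing:
import numpy as np
Now the NumPy package can be referred to as np instead of numpy.
Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5])  # create an array from a Python list
print(arr)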
Data Types in NumPy
NumPy has some extra data types and refers to data types with a single
character, such as i for integers and u for unsigned integers.
Below is a list of all data types in NumPy and the characters used to
represent them.
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - string
U - unicode string
V - fixed chunk of memory for other type ( void )
Checking the Data Type of an Array
The NumPy array object has a property called dtype that returns
the data type of the array:
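import numpy as np
arr = np.array([1, 2, 3, 4])  # array of integers
print(arr.dtype)
Output: int64 (the default integer type can differ by platform and NumPy version)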
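The one-character codes listed earlier can also be passed to the dtype argument when creating an array, and the dtype.kind attribute reports the code of an existing array. The following is a small sketch assuming NumPy is installed; the array contents are arbitrary.

```python
import numpy as np

# Request a specific data type with a character code.
floats = np.array([1, 2, 3], dtype="f")   # 'f' gives single-precision floats
ints = np.array([1, 2, 3], dtype="i4")    # 'i4' gives 4-byte integers

# Let NumPy infer the type, then inspect its one-character kind.
flags = np.array([True, False])
words = np.array(["apple", "kiwi"])

print(floats.dtype)      # float32
print(ints.dtype)        # int32
print(flags.dtype.kind)  # b
print(words.dtype.kind)  # U
```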
Pandas
What is Pandas?
Pandas is a Python library used for working with data sets.
It has functions for analyzing, cleaning, exploring, and
manipulating data.
The name "Pandas" refers to both "Panel Data" and
"Python Data Analysis". The library was created by Wes McKinney in
2008.
Why Use Pandas?
Pandas allows us to analyze big data and draw conclusions
based on statistical theories.
Pandas can clean messy data sets, and make them readable and
relevant.
Relevant data is very important in data science.
Installation of Pandas
If you have Python and PIP already installed on a system, then
installation of Pandas is very easy.
Install it using this command:
C:\Users\Your Name>pip install pandas
Import Pandas
Once Pandas is installed, import it in your applications by adding
the import keyword:
import pandas
Example
import pandas
mydataset = {
    'cars': ["BMW", "Volvo", "Ford"],
    'passings': [3, 7, 2]
}
myvar = pandas.DataFrame(mydataset)
print(myvar)
Output:
    cars  passings
0    BMW         3
1  Volvo         7
2   Ford         2
What is a Series?
A Pandas Series is like a column in a table.
It is a one-dimensional array holding data of any type.
Example
Create a simple Pandas Series from a list:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a)
print(myvar)
Output
0    1
1    7
2    2
dtype: int64
Create Labels
With the index argument, you can name your own labels.
Example
Create your own labels:
import pandas as pd
a = [1, 7, 2]
myvar = pd.Series(a, index = ["x", "y", "z"])
print(myvar)
Output
x    1
y    7
z    2
dtype: int64
Pandas DataFrames
A Pandas DataFrame is a two-dimensional data structure, like a
two-dimensional array, or a table with rows and columns.
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)
Locate Row: Pandas uses the loc attribute to return one or more
specified row(s).
import pandas as pd
data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45]
}
# load data into a DataFrame object:
df = pd.DataFrame(data)
print(df.loc[0])
Output:
calories    420
duration     50
Name: 0, dtype: int64
Load Files Into a DataFrame: If your data sets are stored in a
file, Pandas can load them into a DataFrame.
import pandas as pd
df = pd.read_csv('data.csv')
print(df)
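The file data.csv is not included with this material. To try read_csv without a file on disk, the same call can read from an in-memory text buffer. The sketch below assumes pandas is installed and uses made-up CSV content that mirrors the calories and duration columns from the earlier example.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (illustrative values).
csv_text = "calories,duration\n420,50\n380,40\n390,45\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)

# loc with a list of labels returns several rows as a DataFrame.
print(df.loc[[0, 1]])
```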
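Matplotlib
import matplotlib.pyplot as plt
xpoints, ypoints = [0, 6], [0, 250]  # illustrative values; the original definitions are missing
plt.plot(xpoints, ypoints)
plt.show()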
Seaborn
Seaborn is a Python library for creating statistical graphics. It is built on top of
Matplotlib and integrates closely with pandas data structures. Seaborn aims to make
data visualization a central part of exploring and understanding datasets by providing
a high-level interface for drawing informative and attractive statistical plots.
It offers a wide range of specialized plots for different types of data and analysis,
including:
Relational plots: Scatter plots, line plots (e.g., scatterplot, lineplot).
Distributional plots: Histograms, kernel density estimates, box plots, violin plots (e.g.,
histplot, kdeplot, boxplot, violinplot).
Categorical plots: Bar plots, count plots, point plots (e.g., barplot, countplot,
pointplot).
Multi-variate plots: Joint plots, pair plots for exploring relationships between multiple
variables (e.g., jointplot, pairplot).
Importing:
It is conventionally imported with the alias sns:
import seaborn as sns
import matplotlib.pyplot as plt
Scikit
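Lists
# Empty list
x = []
print(x)
# List of numbers (illustrative values)
y = [1, 2, 3]
print(y)
# List with mixed data types (illustrative values)
z = [1, "two", 3.0]
print(z)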
The list data type has several useful methods. Some commonly used
methods of list objects are:
1. append(): Adds a single element to the end of a list.
Syntax:
list.append(element)
Example
x = [1, 2, 3]
x.append(4)
print(x)
# Output: [1, 2, 3, 4]
2. sort():
Sorts the list in-place (modifies the original list) in ascending
order by default.
Syntax: list.sort()
Example:
x = [3, 1, 4, 2]
x.sort()
print(x)
# Output: [1, 2, 3, 4]
3. Extend vs Append
append(element) adds a single element to the end of the list.
extend(iterable) adds all elements from another iterable (like
another list) to the end.
Example:
lst = [1, 2, 3]
lst.append([4, 5])
print(lst)
# Output: [1, 2, 3, [4, 5]] ← Adds the whole list as one element
lst = [1, 2, 3]
lst.extend([4, 5])
print(lst)
# Output: [1, 2, 3, 4, 5] ← Adds elements individually
4. Insert
insert(i, x) puts x at index i, pushing the rest to the right.
Example:
lst = [1, 3, 4]
lst.insert(1, 2)
print(lst)
# Output: [1, 2, 3, 4]
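Returning to sort() from the list above: ascending order is only the default. The reverse argument sorts in descending order, and the built-in sorted() function returns a new sorted list while leaving the original unchanged. A short sketch with arbitrary values:

```python
x = [3, 1, 4, 2]
x.sort(reverse=True)       # sort in place, largest first
print(x)                   # [4, 3, 2, 1]

words = ["banana", "fig", "apple"]
by_length = sorted(words, key=len)  # new list, ordered by string length
print(by_length)           # ['fig', 'apple', 'banana']
print(words)               # ['banana', 'fig', 'apple'] (original is unchanged)
```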
Tuples
Set Items
Set items are unordered, unchangeable, and do not allow duplicate values.
Unordered
Unordered means that the items in a set do not have a defined order.
Set items can appear in a different order every time you use them, and
cannot be referred to by index or key.
Unchangeable
Set items are unchangeable, meaning that we cannot change the items
after the set has been created. However, you can remove items and
add new items.
Duplicates Not Allowed
Sets cannot have two items with the same value.
Example:
thisset = {"apple", "banana", "cherry", "apple"}
print(thisset)
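Because duplicates are dropped, the set above holds three items even though four values were written. The sketch below confirms this; sorted() is used only to print the items in a predictable order, since a set itself has no defined order.

```python
thisset = {"apple", "banana", "cherry", "apple"}

# The repeated "apple" is stored only once.
print(len(thisset))     # 3

# sorted() returns a list, giving a stable order for display.
print(sorted(thisset))  # ['apple', 'banana', 'cherry']
```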
Get the Length of a Set
To determine how many items a set has, use the len() function.
Example:
thisset = {"apple", "banana", "cherry"}
print(len(thisset))
Output: 3
Check if an Item Exists
To check whether an item is present in a set, use the in keyword.
print("banana" in thisset)
Add Items
To add one item to a set use the add() method.
thisset.add("orange")
print(thisset)
Python Dictionaries
Output
Prajjwal
Prajjwal
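The output above shows the same value printed twice, which is consistent with reading one key in two different ways. The code that produced it is not included here, so the following is only a sketch under that assumption: the dictionary, its keys, and the age value are hypothetical.

```python
# Hypothetical dictionary; only the name value comes from the output above.
d = {"name": "Prajjwal", "age": 22}

# Access a value with square brackets (raises KeyError if the key is missing).
print(d["name"])

# Access a value with get() (returns None, or a default, if the key is missing).
print(d.get("name"))
print(d.get("city"))             # None
print(d.get("city", "unknown"))  # unknown
```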
Adding and Updating Dictionary Items
We can add new key-value pairs or update existing keys by using
assignment.
Example
d = {1: 'Geeks', 2: 'For', 3: 'Geeks'}
# Adding a new key-value pair
d["age"] = 22
# Updating an existing value
d[1] = "Python dict"
print(d)
Output
{1: 'Python dict', 2: 'For', 3: 'Geeks', 'age': 22}
Removing Dictionary Items
We can remove items from a dictionary using the following methods:
del: Removes an item by key.
pop(): Removes an item by key and returns its value.
clear(): Empties the dictionary.
popitem(): Removes and returns the last key-value pair.
d = {1: 'Geeks', 2: 'For', 3: 'Geeks', 'age':22}
# Using del to remove an item
del d["age"]
print(d)
# Using pop() to remove an item and return the value
val = d.pop(1)
print(val)
# Using popitem() to remove and return
# the last key-value pair.
key, val = d.popitem()
print(f"Key: {key}, Value: {val}")
# Clear all items from the dictionary
d.clear()
print(d)
Output
{1: 'Geeks', 2: 'For', 3: 'Geeks'}
Geeks
Key: 3, Value: Geeks