Data Analytics
Data Analytics
Data analytics is the science of analyzing raw data to find trends and answer question.
FUTURE SCOPE
With the increasing amount of data being generated every day, data analysts are expected to
continue to be demand in the future.
Python
Python is both compiled and interpreted, object-oriented, high-level programming language
with dynamic semantics.
Guido Van Rossum named Python from the British sketch-comedy series Monty Python’s Flying
Circus, of which he was a big fan.
Features of Python
Object
Oriented
Easy to learn & Dynamically
Use Typed
Interpreted
Extensible
language
Cross Platform
Large Standard Libray
Language
Data Science
Web Development
Data Engineering
Machine Learning
Artificial Intelligence
Data Analytics
Installing PyCharm
Print Statement
To call a print function in python we just need to write print followed by parentheses () and value
written inside quotation marks “”.
Input -
Print (“Hello World”)
Output -
Hello World
Comment in Python
Single Line comments
To add single line comment, #hash is used.
Python completely ignores anything written after #.
Multiline comments
To add multiline comments in python, triple Quotations are used.
Variables
Variables are placeholder, which can store a value.
Input –
A = “hello world”
Print (a)
Output –
hello world
User-inputs
To ask for the input from the user. Default datatype is string
Input:
Name = input(“enter your name here”)
Print(Name)
Output:
Entered name by the user:
Input:
Age =int(input(“enter your age here”)
Print(age)
Output:
Entered age by the user
TypeCasting and Subtypes
Conversion of one datatype to another is called as type-casting.
Implicit Type Conversion: where python itself converts one datatype to another.
Explicit Type Conversion: where the user converts one data to another
Problem Solving
Write a program to display a person’s name, age and address in three different lines.
Write a program to details from a student for ID-card and then print it in different lines.
Write a program to take an user input as integer then convert into float.
Operators and Operands
Operators indicates what operation is to be performed while Operands indicates on what
x+y=0
Types of Operators:
1. Arithmetic Operators
2. Comparison Operators
3. Logical Operators
4. Assignment Operators
5. Identity Operators
6. Membership Operators
7. Bitwise Operators
Floor Division (//) 8/3 = for getting values before decimals = 8//3 = 2
Comparison Operators
< Less than
!= Not Equal to
== Equal to
Logical Operators
Operator Meaning Example
And True if both the operands are true x and y 3>2 and 3>7
Assignment Operators
Assignment Operators are used in Python to assign value to variables.
a = 6 is a simple assignment operator that assign the value 6 on the right to the variable a on the left.
= x=6 x=6
+= x+=6 x=x+6
-= x-=6 x=x–6
*= x*=6 x = x*6
Identity Operators
Identity Operators are used to compare items to see if they are the same object with the same
memory address.
Types:
1. Is
2. Is not
Bitwise Operators
These Operators are used to compare the Binary number
Types:
1 = on
0 = off
AND Operator:
Operation Result
0&0 0
1&0 0
0&1 0
1&1 1
Binary number
10
2 10 0
2 2 0 10 =1010
1 1 8 =1000
8 1000
15
2 15 1
2 7 1
2 3 1
1 1
OR Operator:
Operation Result
1|0 1 10 =1010
0|1 1 8 =1000
1|1 1 10 1010
XOR Operator:
Operation Result
1^0 1 10 =1010
0^1 1 8 =1000
1^1 0 2 0010
10>>2
10 = 1010 = 0010 = 2
10>>1
10 = 1010 = 0101 = 5
Add 0 in end
10<<1
10 = 1010 = 10100 = 20
10<<2
10 = 1010 = 101000 = 40
Membership Operators
Membership Operators are used to check the presence of the sequence in an object.
Types:
1. In
2. Not in
Conditional Statements
Conditional Statement allows computer to execute a certain condition only if it is true.
1. If the Statement
2. If-else Statement
3. If-elif-else Statement
4. Nested Statement
5. Short hand If Statement
6. Short hand If-else Statement
If the Statement
The If statement is the most fundamental decision-making statement
If expression
Statement
Mark = 87
If (condition) :
Body of if
If - Else Statement
If-else statement is used when you want to give two conditions to the computer.
if condition:
else:
A = 10
If (condition):
(body of if)
Else:
(body of else)
If-elif-else Statement
In this case, the if condition is evaluated first. if it is false, the elif statement will be executed, if it also
comes false then else statement will be executed.
If (condition):
(body of if)
Elif (condition):
(body of elif)
Else:
(body of else)
Nested IF Statement
A Nested IF statement is one in which an If statement is nestled inside another If statement. This is
used when a variable must be processed more than once. The Nested if statement in Python has the
following syntax: if (condition1):
if (condition 2):
If (condition1):
Body of if
If (condition2)
Body of if
Else:
Body of else
if condition: statement
if (condition): Body of if
Problem Solving
1. Write a program to check if a number is positive.
5. Write a program to check if a number is a single digit number, 2- digit number and so on.., up to 5
digits.
Introduction to Loops
A loop means to repeat something in the exact same way.
1. For loop
2. While loop
3. While True
4. Nested loop
For Loop
For loop is a loop that repeat something in a given range.
The range has a starting point, ending point and step/gap in it.
+1 is added to the ending point while defining a range.
1,6 = range
2 = gap
n=7
While loop
While Loop executes till the given condition is true.
In while loop, the increment is done inside the loop.
While (condition):
(Body of while)
(Increment)
While True
It is an infinite loop
To break a while true loop, break statement is used.
while True:
print("hello")
this creates infinite loop
while True:
num1 = int(input("Enter a number here: "))
num2 = int(input("Enter another number here: "))
print(num1+num2)
repeat = input("Do you want to stop the program: ")
if repeat == "yes":
break
when repeat is yes only that case this loop stop. For stop this loop must be given break statement.
Nested loop
A loop inside a loop is called is called as nested loop.
Nested loops are also used to solve patter problems.
For loop:
For loop:
(body of inner for loop)
(body of outer for loop)
Break Statement:
Break Statement is used when you want to destroy a loop at a certain condition and come out of the
loop.
for i in range(1,11):
if i == 5:
continue
else:
print(i)
Problem Solving
Write a program to find sum of first 10 odd numbers using while loop.
Write a program to convert the whole string into lower and upper cases.
Write a program to convert the following string into a title.
Write a program to find the index number of “fit in”.
1
12
123
1234
12345
Write a program to display this patter
11111
2222
333
44
5
Write a program to display this patter
*
**
***
****
*****
Write a program to display this patter
1
21
321
4321
54321
Write a program to display this patter
*
**
***
****
*****
****
***
**
*
Write a program to display this pattern
1
24
369
4 8 12 16
5 10 15 20 25
6 12 18 24 30 36
7 14 21 28 35 42 49
8 16 24 32 40 48 56 64
Write a program to display this following pattern
String Manipulation/Functions
Strings are the combination of number, symbols and letters, enclosed inside doubles quotations.
a = (“hello world”)
print (a)
a = “hello world”
1) Length
print(len(a)) = 11
2) Count
print(a.count(“o”) = 2
3) Upper
print(a.upper())
4) Lower
print(a.lower())
5) Index
print(a.index(“o”))
6) Capitalize
print(a.capitalize())
7) Casefold
print(
8) Find
print(a.find(“l”))
9) Format
print(a.format(Name))
10) Center
print(name.center(10))
print(name.center(10,’*’))
isalpha - Reture True if all characters in the string are in the alphabet
endswith() - Reture true if the string ends with the specified value
startswith() - Reture true if the string starts with the specified value
swapcase() - Swaps cases, lower case becomes upper case and vice versa
replace() - Reture a string where a specified value is replaced with a specified value
rindex() - Searches the string for a specified value and retures the last position of where it
was found
rfind() - Searches the string for a specified value and retures the last position of where it
was found
Slicing in Strings
First colon “:” is for range of printing & Second colon “:” is for giving gap in printing
-num before the : is using for backward counting of range & -num after the :is using for reverse
printing or +num for gapping
Problem Solving
Q. Write a program to get Fibonacci series up to 10 numbers.
Palindrome integer = 131 111 1221 read from starting or end it would be same. 123 134 = it is not
Palindrome intreger.
A = "OOTD.YOLO.ASAP.BRB.GTG.OTW"
Write a program to separate the following string into coma(,) separated values.
Z = "F.R.I.E.N.D.S."
Write a program to check if every word in a string begin with a capital letter.
Introduction to lists
List:
Lists are the collection of ordered and mutable data.
Slicing List:
First colon “:” is for range of printing & Second colon “:” is for giving gap in printing
-num before the : is using for backward counting of range & -num after the :is using for reverse
printing or +num for gapping
List Iteration:
Iteration using For Loop
List Functions:
To find the length of a list
To count an occurence of a particular element
To add to the list
To add to a specific location
To remove from a list
To remove from a certain location
to create a copy of a list
to access an element
to entend the list
to reverse the list
to sort the list
to clear all the data from list
List Comprehension:
l1 = [30,40,50,60]
l2 = []
for i in l1:
if i>45:
l2.append(i)
print(l1,"\n",l2)
l3 = [i for i in l1 if i>45]
print(l3)
Problem Solving
A = ["Ross", "Rachel","Monica","Joe"]
B = [13,7,12,10]
For tuples no brackets are mandatory. By choice one can use parenthese.
The value inside a Tuple is separated by coma(,).
Once created, tuples cannot be changed.
Multiple datatypes can be written inside a tuples.
-num before the : is using for backward counting of range & -num after the :is using for reverse
printing or +num for gapping
a.count(“--”)
a.index(“--”)
Use a.append(“**”)
Iteration in Dictionary
Printing all the key names one by one
Dictionary Functions
get item
keys values
copy setdefault
update pop
popitem clear
Nested Dictionary
Employee = {1: {“Name”:”John”,”age”:24}
2:{“Name”:”Lily”,”Gender”:”Female”}}
Problem Solving
Write a python program to sort a dictionary by value.
Write a python script to print a dictionary where the keys are numbers between 1 to 15 and the
values are square of keys.
Sets Functions
add isdisjoint
pop issubset
remove issuperset
discard update
copy clear
Union
Difference
Difference update
Intersection
Intersection Update
Symmetric Differerce
Problem Solving
Write a program to find max and min in a set.
Write a Python program to remove an item from a set if it is present in the set.
Functions help break our program into smaller parts and helps it look more organized and
manageable.
Functions
(Body) Var ()
Arguments are the values passed to the parameters while calling the function.
In simple word, recursion means a function can call itself, giving us a benefit of looping through data
in order to get a result.
Disadv.:
Global Variables are not restricted to one block of code and be changed throughout the program.
Problem Solving
Write a function to find maximum of three number in Python.
Write a Python function to create and print a list where the values are square of numbers between 1
and 30.
Write a Python function that takes a number as a parameter and check if the number is prime or not.
%B = Month
%Y = year
%p = pm/am
%M = mintues
%S = Second
%f = microsec
Random
Math
Creation of Modules
To create a module, all you need to do is create a .py file in a similar path to your python file. Inside
that file, you can add required function you need your program to perform.
To call the module inside your program, all you need to do is use import keyword followed by the
name of your .py file.
Main.py demo.py
a = demo.add(2,3) return(x+y)
print(a)
NumPy is a package that define a multi-dimension array object and associates fast math functions
that operate on
It also has function for working in domain of linear Algebra, Fourier Transformation and Matrices.
Arrays:
An array is defined as a collection of items that are stored at contiguous memory locations.
It is a container which can hold a fixed number of items, and these items should be of the same type.
A combination of arrays saves a lot of time, The array can reduce the overall size of the code.
A list can store different datatypes, while you can’t do that in an Array.
A list is easier to modify since a list store a list store each element individually, it is easier to add and
delete an element than an array does.
In Lists one can have nested data with different size, while you can’t do the same in Array.
Creation, Indexing and Slicing of NumPy Arrays:
NumPy - Creating Arrays, Slicing and Attributes
Inspection an Arrays
Function:
g=a–b
np.subtract(a,b)
b +a n
p.add(b,a)
a/b
np.divide(a,b)
a*b
np.multiply(a,b)
np.exp(b)
np.sqrt(b)
np.pow(a)
np.sort()
np.where(condition)
np.searchsorted()
fa = condition
new = arr[fa]
Here are just a few of the thing that pandas does well:
The name “Pandas” has reference to both “Panel Data”, and “Python Data Analysis” and was created
by Wes McKinney in 2008.
Pandas Applications
Easy handling of missing data.
Size Mutability: Columns can be inserted and deleted from DataFrame and higher dimensional
objects.
Automatic and explicit data alignment: object can be explicitly aligned to a set of labels, or the user
can simply ignore the labels and let Series, DataFrame, etc.
Intelligent label-based slicing, fancy indexing, and subsetting of large data sets.
Series in Pandas
Pandas Series is one-dimensional labelled array capable of holding data of any type (integer, string,
float, python objects, etc.) The axis labels are collectively called index. Pandas Series is nothing but a
column in an excel sheet.
The object supports both integer and label-based indexing and provides a host of method for
performing operations involving the index.
DataFrames
Pandas DataFrame is two-dimensional size-mutable, potentially heterogeneous tabular data
structure with labelled axes (rows and columns). A Data frame is a two-dimensional data structure,
i.e., data is aligned in a tabular fashion in row and column. Pandas DataFrame consists of three
principal components, the data, row, and columns.
data["Bonus"] = (data["Salary"]/100)*20
print(data)
Group By in Pandas
gp = data.groupby("Job Title").agg({"EEID":"count"})
gp = data.groupby("Department").agg({"EEID":"count"})
gp = data.groupby(["Department","Gender"]).agg({"EEID":"count"})
gp1 = data.groupby("Country").agg({"Age":"mean"})
gp2 = data.groupby("Country").agg({"Annual Salary":"max"}
gp3 = data.groupby(["Country","Gender"]).agg({"Annual Salary":"max"})
gp4 = data.groupby(["Country","Gender"]).agg({"Annual Salary":"max","Age":"min"})
print(df.pivot(index="keys",columns="Names",values="Houses"))
print(df.pivot(index="keys",columns="Names",values=["Houses","Grades"]))
Melt
print(pd.melt(df,id_vars=["Names"],value_vars=["Houses"]))
print(pd.melt(df,id_vars=["Names"],value_vars=["Houses","Grades"]))
print(pd.melt(df,id_vars=["Names"],value_vars=["Houses","Grades"],var_name="Houses&Gr
ades",value_name="values"))
Introduction to Matplotlib
Data Visualization
In the world of Big Data, data visualization tools and technologies are essential to analyze massive
amounts of information and make data-driven decisions.
Name Apples
Apples Eaten This Month
Jill 13
16
Jack 11
14
Susan 7
12
Adrian 10
10
Sam 6
8
Seth 14
6
Maria 10
4
Jamal 9
2
0
Jill Jack Susan Adrian Sam Seth Maria Jamal
Matplotlib:
Matplotlib is a low-level graph plotting library in python that serves as a visualization utility.
Matplotlib Chart:
Bar Plot Matplotlib
import matplotlib.pyplot as plt
y = [98,67,88,95,88]
x = ["Part1","Part2","Part3","Part4","Part5"]
color = ["Red","Green","Blue","Yellow","Orange"]
plt.bar(x,y,color = color)
or
plt.bar(x,y,color = "red")
or
plt.ylabel("Popularity",fontsize = 17)
plt.show
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_excel("D:/Smriti/Expenses.xlsx")
df = pd.DataFrame(data)
print(df)
grouped_by = df.groupby("Payment Mode")["Amount"].sum()
print(grouped_by)
plt.bar(grouped_by.index,grouped_by.values)
plt.show()
Line Plot-Matplotlib
import pandas as pd
data = pd.read_excel("D:/Smriti/Expenses.xlsx")
df = pd.DataFrame(data)
#print(df)
grouped_by = df.groupby("Category")["Amount"].sum()
print(grouped_by)
plt.plot(grouped_by.index,grouped_by.values)
plt.show()
# plt.plot(x,y,marker ="o")
#plt.plot(x,y,marker ="*")
#plt.plot(x,y,marker ="D")
plt.legend()
plt.show()
Scatter Plot-Matplotlib
import numpy as np
x = np.random.randint(1,10,50)
y = np.random.randint(10,100,50)
color = np.random.randint(10,100,50)
size = np.random.randint(10,100,50)
print(x,y)
plt.colorbar()
plt.show()
data = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
#print(data,type(data))
df = pd.DataFrame(data)
size = df["Age"]
plt.show()
or
plt.scatter(df["Age"],df["EEID"],s = 10)
plt.show()
x = [22,35,30,3,10]
c = ["red","purple","blue","Magenta","orange"]
ex =[0,0.1,0,0,0]
plt.show()
data = pd.read_excel("D:/Smriti/Expenses.xlsx")
df = pd.DataFrame(data)
print(group_by)
ex = [0,0,0.1]
plt.show()
Q3 = 75% Q3 = 3n+1/4
Q1 = 25% Q1 = n+1/4
Q1 = 10/4 = 2.5th
Q3 = 3*10/4 = 7.5th
24 is outlier
Histogram Matplotlib
data = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
df = pd.DataFrame(data)
plt.show()
Violin Plot-Matplotlib
plt.violinplot(df["Annual Salary"],showmedians = True)
plt.show()
plt.plot(df1["Age"])
plt.show()
plt.show()
grouped = df.groupby("Category")[["Calories","Protein","Fat"]].agg("mean")
print(grouped)
plt.stackplot(df["Category"].unique(),grouped["Calories"],grouped["Protein"],grouped["Fat"])
OR
plt.stackplot(grouped.index,grouped["Calories"],grouped["Protein"],grouped["Fat"])
plt.show()
print(group)
plt.show()
Legend Matplotlib
plt.figure(figsize= [5,3])
plt.plot(x,y,label = "Male")
plt.plot(x,y1,label = "Female")
#plt.legend(["a1","a2"] ncols = 2) a1,a2...is for label name; ncols is for create 2 column in label
plt.show()
Subplot Matplotlib
import matplotlib.pyplot as plt
x = [1,2,3,4,5]
y = [45,34,56,23,45]
plt.plot(x,y)
x = [5,6,7,8,9] import matplotlib.pyplot as plt
y = [67,50,66,56,82] x = [1,2,3,4,5]
plt.subplot(2,2,2) y = [45,34,56,23,45]
plt.bar(x,y) plt.subplot(1,2,1)
plt.title("Age")
x = [2,4,6,8,10] plt.bar(x,y)
y = [57,50,60,55,60]
plt.subplot(2,2,3) x = [5,6,7,8,9]
plt.scatter(x,y) y = [67,50,66,56,82]
plt.subplot(1,2,2)
plt.plot(x,y)
x = [1,3,5,7,9] plt.title("Weight")
plt.subplot(2,2,4) plt.show()
plt.step(x,y)
plt.show()
In the world of Big Data, data visualization tools and technologies are essential to analyze massive
amounts of information and make data-driven decisions.
Seaborn
Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface
for drawing attractive and informative statistical graphics.
Seaborn Chart
import pandas as pd
data = {"Days":[1,2,3,4,5],
"NOP":[50,40,60,30,44]}
df = pd.DataFrame(data)
data = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
#print(data)
color = sns.color_palette("GnBu")
plt.show()
data = sns.load_dataset("tips")
print(data)
#estimator = "mean","median","sum"
plt.plot()
plt.show()
data = sns.load_dataset("titanic")
print(data)
#discrete = True
plt.show()
Scatter Plot Seaborn
data = sns.load_dataset("tips")
#print(data)
plt.show()
data = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
#print(data)
plt.show()
Heatmap Seaborn
data = sns.load_dataset("tips")
gp = data.groupby("day").agg({"tip":"mean"})
sns.heatmap(gp)
data = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
sns.heatmap(gp)
plt.show()
df = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
plt.show()
Violin Plot Seaborn
data = sns.load_dataset("tips")
plt.show()
df = pd.read_excel("D:/Smriti/Employee_Data.xlsx")
plt.show()
plt.show()
sns.pairplot(data,hue = "day")
plt.show()
data = sns.load_dataset("iris")
sns.pairplot(data,hue = "species")
plt.show()
#stripplot use for bar data show in dots form (x axis may be string value)
#dodge = show hue and x value separate/ jitter = gap b/w dots
plt.show()
data = sns.load_dataset("tips")
#scatterplot use for bar data show in dots form on adject data point.(both num value)
plt.show()
plt.show()
sns.set_style(style = "darkgrid")
plt.show()
sns.palplot(sns.color_palette())
plt.show()
a.map(sns.barplot,"day", "tip")
plt.show()
plt.show()
plt.show()
as compare…
data = sns.load_dataset("tips")
plt.show()
plt.show()
plt.show()
sns.histplot(data,x= "total_bill")
plt.show()
Introduction to MySQL for Data Analytics
The software used to store, manage, query, and retrieve data stored in a relational database is
called a relational database management system (RDBMS).
The RDBMS provides an interface between users and applications and the database, as well as
administrative functions for managing data storage, access and performance.
Go Rust C
Reliability
Scalability
High Availability
Security
Flexibility
Installation of MySQL
Download MySQL >>> 2nd File Download and install follow video
Import CSV file >>> Convert file into csv format >>> Schemas >>> Demo >>> click on Table >>> Table
Data Import >>> Select File >>> Next >>> Drop Table (Right) >>>Next >>> Next >>> Execute >>
>Next>>>Finish
select *from demo.>>> Select File(ESD) >>>select *from demo.esd >>>press Ctrl + Enter
User: root
Password: 12346789
For Everything
select Gender, FullName, AnnualSalary, Country from demo.esd where Gender = "Female" or
Country = "China";
select * from demo.esd where not Gender = "Female";
select * from demo.esd order by Department, Age asc; #Both are arrange in ascending order
case
end as Production_Details
from classicmodels.products;
case
end as Order_Type
from classicmodels.orderdetails;
Group By in MySQL
select Department, count(EEID) as Total_Employes from demo.esd group by Department;
Order Shipping
613 Laptop
411 Tablet
on classicmodels.products.productCode = classicmodels.orderdetails.productCode;
on classicmodels.products.productCode = classicmodels.orderdetails.productCode
group by classicmodels.products.productName;
Left join means all data 1st table and common data of both table.
select classicmodels.products.productName,classicmodels.orderdetails.quantityOrdered
from classicmodels.products left join classicmodels.orderdetails
on classicmodels.products.productCode = classicmodels.orderdetails.productCode;
Right join means all data 2nd table and common data of both table.
select classicmodels.products.productName,classicmodels.orderdetails.quantityOrdered
on classicmodels.products.productCode = classicmodels.orderdetails.productCode;
Cross join means all data 1st and 2nd table including common and uncommon data.
select classicmodels.products.productName,
classicmodels.products.quantityInStock,classicmodels.orderdetails.quantityOrdered from
classicmodels.products cross join classicmodels.orderdetails
on classicmodels.products.productCode = classicmodels.orderdetails.productCode;
union
union all means show 1st and 2nd table all different data common data are show twics.
union all
intersect
except means show all diff. data from 1st and 2nd table excepting common data.
except
Subqueries in MySQL
select avg(creditLimit) from classicmodels.customers;
Firstly execute inner query within bracket then execute outer query.
Views in MySQL
For creating of virtual table
group by country;
begin
end &&
Delimiter ;
call classicmodels.get_data();
Delimiter &&
begin
end &&
Delimiter ;
call classicmodels.get_limit(3)
Delimiter &&
create procedure get_credit(out var int)
begin
end &&
Delimiter ;
call classicmodels.get_credit(@abc);
select @adc;
Delimiter &&
begin
end &&
Delimiter ;
set @m = 125;
call classicmodels.get_name(@m);
select @m;
from expense.customer
group by FirstName,Occupation,EducationLevel;
select*, row_number()
select*, rank()
over(partition by product_detail.Product_Name
select*, dense_rank()
over(partition by product_detail.Product_Name
#ranking on the basis like we have 1-5 Bachalors then 6- 12 Graduction in that case all bachalors
have 1st rank then Graduction have 6th rank
#ranking on the basis like we have 1-5 Bachalors then 6- 12 Graduction in that case all bachalors
have 1st rank then Graduction have 2th rank
Introduction to MS Excel Beginner’s Guide
Ribbon have diff. tools
Formula bar
Spreadsheet grid
Status bar
Basic Function
Auto fill Double click and drag (For creation of series)(Sheet1 1,2,3… series)
Flash fill Write in 2 rows flash patter in entire rows >>> Enter(Sheet 2 First Name, Last Name)
Data >>> text to column >>> Delimited >>> Next >>> Basis(Comma,etc) >>> Next >>> Finish(sheet 2
Country , Abbreviations)
Data Validation
Select cells >>> Data >>> Data Validation >>> Choose Date for dates >>> enter start and end >>> if
you will write out of range date, it show error. (sheet 1 Date 2(01-01-2023 to 31-01-2023))
Data >>> Data Validation >>> Choose Text length >>> enter min and max >>> if you will write out of
range date, it show error. (sheet 1 Sub Category 2 (2-20 letter))
Data >>> Data Validation >>> Choose Whole Number >>> enter min and max >>> if you will write out
of range date, it show error. (sheet 1 Amount2 (10 to 50000 max amount))
Data >>> Data Validation >>> Choose List >>> Make list >>> if you will write out of range date, it show
error. (sheet 1 Payment Mode2 (UPI, Cash, Card))
Select cells >>> Data >>> Data Validation >>> Error Alert >>> Write Title >>> Write Error Msg >>> if
you will write out of range date, it show this error. (sheet 1 Date 2 (Title: Invalid Date, Error Msg:
Date should be between 01-01-2023 to 31-01-2023))
Data Connectors
BY LINK
Data >>> From Web >>> Paste link From Chorme for export data from website >>> Select table which
you want and then load >>> Right click on query on table >>> go to LOAD TO >>> Select Table and
select cell where you want show >>> Then OKK. {Table 1}
BY OTHER SOURCE
Get Data >>> Other Source >>> From ODBC >>> select MySQL >>> enter user password (root
12346789) >>> okk >>> select Table which you want from data >>> Then Load {customers}
For ODBC >>> Download ODBC MySQL Driver (64 bit first link) >>> open and accept term condition
>>> choose normal 1st option >> and then okk
Then Search App (ODBC Data source 64 bit) >>> Go to Drive tap on MySQL ODBC 9.0 Unicode driver
>>> Okk >>> Go to User DSN >>Add >>> MySQL ODBC 9.0 Unicode driver >>> enter name, username,
password >>> Then Okk (After this process you while connect with ms excel)
Conditional Formatting
Firstly Select Cells >>> Home >>> Conditional Formatting >>> Highlight Cells Rules >>> A Date
Occurring >>> Select where you need highlighting or on which basis (sheet 3 {Next Month} data must
be in current date, it use for checking last month, yesterday,…etc date)
Conditional Formatting >>> Highlight Cells Rules >>> Text that Contains >>> Select where you need
highlighting or on which basis (sheet 3 {Food}, it use for checking any particular data)
Highlight Cells Rules >>> Greater than, Less than, Between, Equal to, A Date Occurring, Text that
Contains, Duplicate Value
Top/Bottom Rules >>> Top/Bottom 10 item, Top/Bott 10%, Above Avg, Below Avg.
Firstly Select Cells >>> Home >>> Conditional Formatting >>> Data Bars (Price)
Basic Formatting
Home >>> Size Increasing, Changing Font Style, Bold (Ctrl +B), Italic (ctrl +I), Underline (Ctrl +U), Text
Color, Font Color, Diff. types of borders, Diff. types of Alignment, Merger & Center, Wrap Text, Diff.
formate of number like date, %, Currency, time, etc., Reducing or increasing points in number of Data
or any Particular part of data.
Sorting Formatting
Select table >>> Home >>> Editing >>> Sort & Filter >>> Smallest to largest or Largest to Smallest
Home >>> Editing >>> Sort & Filter >>> Custom Sort >>> Add level (Column, Sort On, Order)
Filtering Data
Home >>> Editing >>> Sort & Filter or use Ctrl + Shift + L >>> Add Filter icon >>> Text or Number or
Date Filter shows on basis of color, Greater or less than no., Start with or end with or contain
alphabet, this month or next month etc.
Select cells >>> Home >>> Find & Select >>> Go to Special >>> Blank >>> Okk
Then >>> Ctrl + - >>> Delete >>> Shift cells up >>> okk (Sheet 6)
Select table >>> F5 (Go to) >>> Special >>> Blank >>> Okk >>> Ctrl + - >>> Delete >>> Entire row
(Data)
Select cells >>> F5 (Go to) >>> Special >>> Blank >>> Okk >>> = Cell of downward >>> Ctrl+ Enter
(Sheet 6 B-c)
Select cells >>> F5 (Go to) >>> Special >>> Blank >>> Okk >>> Anything write(NY) >>> Ctrl+ Enter
(Sheet 6 C-c)
Select cells >>> F5 (Go to) >>> Special >>> Blank >>> Okk >>> Take avg of data >>> write >>> Ctrl+
Enter (Sheet 6 D-c)
Data >>> Data Tools >>> Text to column >>> Delimited >>> Comma, Other, Anything Where
to u need separate value
Text Functions
Concatenate:
It is use for merge diff. column text in a column.
=CONCATENATE(text1,text2,text3…)
Lower:
It is use for show text in the small letters.
=LOWER(text)
Upper:
It is use for show text in the Capital letters.
=UPPER(text)
Proper:
It is use for show value in the proper letters 1st Alphabet is Big and then all in small.
=PROPER(text)
Length:
It is use for count no. of length in text.
= =LEN(text)
Left:
It is use of write left value of text.
=LEFT (text, num_char)
Right:
It is use of write right value of text.
=RIGHT (text, num_char)
Mid:
It is use of write mid value of text.
=MID (text,start_num, num_char)
Find:
Find text or any alphabet in text. It is sensitive case must be careful use of Small and big
alphabet.
=FIND (find text, within text, [start num])
Search:
Search text or any alphabet in text. It is not sensitive case use any alphabet Small or big.
=SEARCH (find text, within text, [start num])
Replace:
Replace text or any alphabet in text.
=REPLACE(text, start num, num char, new text)
Substitute:
Substitute of text or any alphabet in text. Use Instance for how many time you need to use
subsitute.
=REPLACE(text, old text, new text,[instance num])
In customer (2) sheet
If:
Given a condition and if it is right show text of ‘Value if True’ otherwise show text of ‘Value if
False’.
=If (logical text, [ Value if True], [ Value if False])
And:
And is use with if function for give Multiple condition. If both condition is satisfied show text
of ‘Value if True’ otherwise show text of ‘Value if False’.
=If (and (logical1, logical2….), [ Value if True], [ Value if False])
Or:
Or is use with if function for give Multiple condition. If any one condition is satisfied show
text of ‘Value if True’ otherwise show text of ‘Value if False’.
=If (or (logical1, logical2….), [ Value if True], [ Value if False])
Esd (4) sheet
=sum(array)
=Count(array)
Countif: it is use for count with criteria
Sheet 9
XLookup
Xlookup is use to look related data to in constant to available data in data.
Ex. If we want Salary or Job Title of any emp Id. This Emp Id is lookup value, emp Id array is lookup
array and Salary or Job Title array is return array.
Hlook is also use for hortizontal Value finding hlookup don't able find upward value on the basis of
downward value.
=HLOOKUP (lookup value, Table array, Row index no., [range lookup])
And also for return array use additional xlookup and create list for lookup array
=XLOOKUP(F27,Table20[EEID],(XLOOKUP(F28,Table20[[#Headers],[FullName]:[Annual
Salary]],Table20[[FullName]:[Annual Salary]])))
Sheet 10
Power Query
Data >>> Get Data >>> From File or else >>> Select a file >>> Transform data >>> Close Load from
power query
Query >>> Edit >>> Opened Power Query >>> Transform >>> click on heading bar on 123 >>> whole
number >>> Replace value
Query >>> Edit >>> Opened Power Query >>> Transform >>> click on heading bar on 123 >>> Text
>>> Replace value
Query >>> Edit >>> Opened Power Query >>> Transform >>> click on heading bar on 123 >>> whole
number >>> Home >>> Remove rows >>> with error
(Postal Code)
Query >>> Edit >>> Opened Power Query >>> Add column >>> Custom column >>> Enter column
name >>> Enter formula
(Full Name
Power Query >>> Add column >>> From Text >>> Extract >>> First Characters >>> 3
(Transform is use for existing column and add column is use for new column)
Sheet product
Power Query >>> Add Column >>> Custom Column >>> (Name = Day Diff., formula =Select shipped
date – order date) >>> Duration >>> Days
Sheet order
For combine for data add excel file in folder >>> go to Data in excel >>> And refresh all.
After importing data Diagram View show modelling lines, cardinality, flow direction.
Importance
File >>> Option >>> Customize Ribbon >>> Select developer >>> okk
File >>> Option >>> Add ins >>> Com add-ins >>> Microsoft Power Pivot
Power pivot >>> Manage >>> open Power Pivot >>> From Other Sources >>> Text File(If csv) or Excel
File >>> Browse (Location) & tick in Use first row as column header >>> next show preview >>> ok
>>> Finish >>> View >>> Diagram view >>> it show all data.
Create a excel file with multiple sheet which have link same where (cs) >>> data >>> From file >>>
excel >>> import >>> open power query navigator >>> Select multiple time >>> select all sheets >>>
Transform data >>> Close and load to
Data tab >>> Data modelling >>> go to Diagram view and move tables >>> Create relations between
diff. data >>> tap on column name and black line join with another column name where you want to
build relation >>> employee no. To employee no. show 1 to 1 relation & sales rep employee to
employee no. show 1 to many relation bcoz in customer data show employee no. multiple times in
sales rep employee where 1 employee represent many customers.
May be 1 table data have relation with multiple tables. But we can build one relation only with one
column basis with another table, we can create multiple column relation with 2 same table its called
data modelling.
Benefit
Customer Sheet >>> Insert >>> Pivot Table from data model >>> new worksheet >>> Employee id add
in row label and customer number in Value as Count
bnhvvn