Data Python

This cheat sheet provides methods and code examples for conducting exploratory data analysis using Python. It covers various techniques such as correlation matrices, scatter plots, regression plots, box plots, grouping attributes, and creating pivot tables. Additionally, it includes methods for calculating the Pearson coefficient and p-value for attribute pairs.

Uploaded by

ayushman2258r

Data Analysis with Python

Cheat Sheet: Exploratory Data Analysis

Package/Method | Description | Code Example

Package/Method: Complete dataframe correlation
Description: Correlation matrix created using all the attributes of the dataset.
Code Example:
df.corr()

Package/Method: Specific attribute correlation
Description: Correlation matrix created using specific attributes of the dataset.
Code Example:
df[['attribute1','attribute2',...]].corr()
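
The two correlation entries above can be tried on a small table. This is a minimal sketch, assuming a made-up dataframe whose column names (engine_size, price, doors, body_style) are purely illustrative. Recent pandas versions raise an error when corr() meets non-numeric columns, so the sketch selects the numeric columns first.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "engine_size": [1.0, 2.0, 3.0, 4.0],
    "price": [10.0, 20.0, 30.0, 40.0],
    "doors": [4, 2, 4, 2],
    "body_style": ["sedan", "coupe", "sedan", "coupe"],
})

# Correlation matrix over every numeric attribute.
# select_dtypes drops the text column so corr() only sees numbers.
full_corr = df.select_dtypes(include="number").corr()

# Correlation matrix restricted to two chosen attributes.
pair_corr = df[["engine_size", "price"]].corr()

print(full_corr)
print(pair_corr)
```

Both results are square dataframes indexed by attribute name, so a single coefficient can be read with .loc, for example pair_corr.loc["engine_size", "price"].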

Package/Method: Scatter Plot
Description: Create a scatter plot of the data points, with the independent variable along the x-axis and the dependent variable along the y-axis.
Code Example:
from matplotlib import pyplot as plt
plt.scatter(df['attribute_1'], df['attribute_2'])
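
A minimal sketch of the scatter plot entry, assuming a made-up dataframe in which engine_size plays the independent variable and price the dependent one. The column names and the Agg backend line are assumptions for illustration; drop the backend line when working in a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
from matplotlib import pyplot as plt
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "engine_size": [1.0, 2.0, 3.0, 4.0],
    "price": [12.0, 19.0, 33.0, 41.0],
})

fig, ax = plt.subplots()
# Independent variable on the x-axis, dependent variable on the y-axis.
points = ax.scatter(df["engine_size"], df["price"])
ax.set_xlabel("engine_size")
ax.set_ylabel("price")
```

In an interactive session, plt.show() displays the figure; fig.savefig can write it to a file instead.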

Package/Method: Regression Plot
Description: Uses the dependent and independent variables in a pandas dataframe to create a scatter plot with a generated linear regression line for the data.
Code Example:
import seaborn as sns
sns.regplot(x='attribute_1', y='attribute_2', data=df)
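
The regression plot entry might look like this on a small table. It is a sketch under stated assumptions: seaborn is installed, the dataframe and its column names are made up, and the Agg backend line is only there for runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "engine_size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "price": [12.0, 19.0, 33.0, 41.0, 48.0],
})

fig, ax = plt.subplots()
# Draws the data points plus a fitted linear regression line on ax.
sns.regplot(x="engine_size", y="price", data=df, ax=ax)
```

Passing ax explicitly keeps the plot on a known axes object, which is convenient when several plots are drawn in one script.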

Package/Method: Box plot
Description: Create a box-and-whisker plot from a pandas dataframe using the dependent and independent variables.
Code Example:
import seaborn as sns
sns.boxplot(x='attribute_1', y='attribute_2', data=df)
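
A minimal sketch of the box plot entry, assuming seaborn is installed and using a made-up dataframe in which body_style is a categorical attribute and price a numeric one. The names are illustrative, and the Agg backend line is only needed without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
from matplotlib import pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "body_style": ["sedan", "sedan", "sedan", "coupe", "coupe", "coupe"],
    "price": [10.0, 14.0, 18.0, 20.0, 26.0, 35.0],
})

fig, ax = plt.subplots()
# One box per body_style category, summarising the spread of price.
sns.boxplot(x="body_style", y="price", data=df, ax=ax)
```

Box plots of this kind are mostly useful when the x attribute has a handful of categories, since each category gets its own box.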

Package/Method: Grouping by attributes
Description: Select a group of attributes of a dataset to create a subset of the data.
Code Example:
df_group = df[['attribute_1','attribute_2',...]]
Package/Method: GroupBy statements
Description:
a. Group the data by different categories of an attribute, displaying the average value of numerical attributes with the same category.
b. Group the data by different categories of multiple attributes, displaying the average value of numerical attributes with the same category.
Code Example:
# a) group by one attribute
df_group = df.groupby(['attribute_1'], as_index=False).mean()
# b) group by multiple attributes
df_group = df.groupby(['attribute_1','attribute_2'], as_index=False).mean()
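
The subset and GroupBy entries can be combined on a small table. This is a sketch assuming a made-up dataframe with illustrative column names. Because recent pandas versions raise an error when mean() meets text columns, the sketch passes numeric_only=True wherever a non-key text column is still present.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "drive_wheels": ["fwd", "fwd", "rwd", "rwd"],
    "body_style": ["sedan", "coupe", "sedan", "sedan"],
    "price": [10.0, 20.0, 30.0, 50.0],
    "make": ["a", "b", "c", "d"],
})

# Subset of the data holding only the attributes of interest.
df_group = df[["drive_wheels", "body_style", "price"]]

# a) Average of the numeric attributes per drive_wheels category.
#    numeric_only=True skips body_style, which is text.
by_one = df_group.groupby(["drive_wheels"], as_index=False).mean(numeric_only=True)

# b) Average per (drive_wheels, body_style) combination.
by_two = df_group.groupby(
    ["drive_wheels", "body_style"], as_index=False
).mean(numeric_only=True)

print(by_one)
print(by_two)
```

With as_index=False the grouping attributes stay as ordinary columns, which is the shape that pivot expects later on.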

Package/Method: Pivot Tables
Description: Create pivot tables for better representation of data based on parameters.
Code Example:
grouped_pivot = df_group.pivot(index='attribute_1', columns='attribute_2')
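
A minimal sketch of the pivot entry, assuming a made-up grouped table like the one a two-attribute GroupBy produces. The column names are illustrative. Each (index, columns) pair must appear only once in the input, and combinations that are absent show up as NaN in the result.

```python
import pandas as pd

# Hypothetical grouped data: one row per (drive_wheels, body_style) pair.
df_group = pd.DataFrame({
    "drive_wheels": ["fwd", "fwd", "rwd", "rwd"],
    "body_style": ["coupe", "sedan", "coupe", "sedan"],
    "price": [20.0, 10.0, 35.0, 40.0],
})

# Rows become drive_wheels categories, columns become body_style categories.
# The remaining attribute (price) fills the cells, giving columns such as
# ("price", "coupe") and ("price", "sedan").
grouped_pivot = df_group.pivot(index="drive_wheels", columns="body_style")

print(grouped_pivot)
```

A single cell can be read with a tuple column key, for example grouped_pivot.loc["rwd", ("price", "sedan")].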

Package/Method: Pseudocolor plot
Description: Create a heatmap image using a pseudocolor plot (or pcolor) with the pivot table as data.
Code Example:
from matplotlib import pyplot as plt
plt.pcolor(grouped_pivot, cmap='RdBu')
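
The pseudocolor entry might be sketched as follows. The pivot table here is a made-up 2x2 dataframe with illustrative labels, and the Agg backend line is an assumption for runs without a display. The tick labelling is an optional extra, not part of the cheat sheet entry.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook
from matplotlib import pyplot as plt
import pandas as pd

# Hypothetical pivot table: rows are drive_wheels, columns are body_style.
grouped_pivot = pd.DataFrame(
    [[20.0, 10.0], [35.0, 40.0]],
    index=["fwd", "rwd"],
    columns=["coupe", "sedan"],
)

fig, ax = plt.subplots()
# Each cell of the pivot table becomes one coloured rectangle.
mesh = ax.pcolor(grouped_pivot, cmap="RdBu")
fig.colorbar(mesh, ax=ax)

# Optional: put the category names at the centre of each cell.
ax.set_xticks([0.5, 1.5])
ax.set_xticklabels(grouped_pivot.columns)
ax.set_yticks([0.5, 1.5])
ax.set_yticklabels(grouped_pivot.index)
```

The colour bar maps the colours back to values, which makes the heatmap readable without the underlying table.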

Package/Method: Pearson Coefficient and p-value
Description: Calculate the Pearson Coefficient and p-value of a pair of attributes.
Code Example:
from scipy import stats
pearson_coef, p_value = stats.pearsonr(df['attribute_1'], df['attribute_2'])
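
A minimal sketch of the Pearson entry, assuming SciPy is installed and using a made-up dataframe with illustrative column names. The coefficient lies between -1 and 1, and a small p-value suggests the observed correlation is unlikely under the assumption of no linear relationship.

```python
import pandas as pd
from scipy import stats

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "engine_size": [1.0, 2.0, 3.0, 4.0, 5.0],
    "price": [12.0, 19.0, 33.0, 41.0, 48.0],
})

# pearsonr returns the correlation coefficient and its two-sided p-value.
pearson_coef, p_value = stats.pearsonr(df["engine_size"], df["price"])

print("Pearson coefficient:", pearson_coef)
print("p-value:", p_value)
```

Both inputs must have the same length and contain no missing values, so it may be necessary to drop or fill NaN entries first.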
