Exploratory Data Analysis
Reference : [Link]
learning/lecture/KYAbU/introduction-to-exploratory-data-analysis-eda
Exploratory Data Analysis (EDA) is the process of analysing a data set to summarise its main features, typically with summary statistics and visualisations.
Sampling Data Frames
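# Draw 5 random rows; replace=True means the same row can be drawn more than once
sample = data.sample(n=5, replace=True)
# Print the last three columns of the sampled rows
print(sample.iloc[:, -3:])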
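A minimal, self-contained sketch of the sampling step. The `data` frame here is a hypothetical stand-in with synthetic values and assumed column names, and `random_state` is added only so the draw is repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `data`: synthetic values, assumed column names.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "sepal_length": rng.normal(5.8, 0.8, 30),
    "sepal_width": rng.normal(3.0, 0.4, 30),
    "petal_length": rng.normal(3.7, 1.7, 30),
    "species": np.repeat(["setosa", "versicolor", "virginica"], 10),
})

# Draw 5 rows with replacement, so a row may appear more than once.
sample = data.sample(n=5, replace=True, random_state=42)

# Keep all sampled rows but only the last three columns.
last_three = sample.iloc[:, -3:]
print(last_three)
```

With `replace=False` (the pandas default) each row can be drawn at most once, so `n` cannot exceed the number of rows.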
Visualization Libraries
Matplotlib
Pandas (via Matplotlib)
Seaborn
Scatter plot with Matplotlib
import matplotlib.pyplot as plt
plt.plot(data.x, data.y, ls='', marker='o')
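As a rough, runnable version of the scatter plot idea: the frame below is synthetic, and its `x` and `y` columns are assumptions that only mirror the names used in the notes. The `Agg` backend is selected so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic example data with the column names used in the notes.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
data = pd.DataFrame({"x": x, "y": 2 * x + rng.normal(scale=0.5, size=50)})

fig, ax = plt.subplots()
# ls='' suppresses the connecting line, so only the 'o' markers are drawn.
lines = ax.plot(data.x, data.y, ls="", marker="o")
ax.set(xlabel="x", ylabel="y")
```

Drawing markers without a line through `plot` gives the same picture as a basic `ax.scatter(data.x, data.y)` call.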
Histograms
plt.hist(data.x, bins=25)
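A small sketch of what the histogram call returns, using synthetic values in place of `data.x` (an assumption made for illustration only).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for data.x.
rng = np.random.default_rng(0)
x = rng.normal(size=200)

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn bar patches.
counts, edges, patches = ax.hist(x, bins=25)
print(len(counts), len(edges))
```

There is always one more bin edge than there are bins, and the counts add up to the number of observations.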
Customising Plots
import numpy as np
fig, ax = plt.subplots()
ax.barh(np.arange(10), [Link][:10])
# Setting position of ticks and tick labels
ax.set_yticks(np.arange(0.4, 10.4, 1.0))
ax.set_yticklabels(np.arange(1, 11))
ax.set(xlabel='xlabel', ylabel='ylabel', title='Title')
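The customisation steps can be sketched end to end as follows. The plotted column (`sepal_width`) and its values are assumptions, since the notes do not show which series is drawn. This sketch places the ticks directly at the bar positions rather than at a 0.4 offset, because Matplotlib 2.0 and later centre bars on their positions by default.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column choice is an assumption.
rng = np.random.default_rng(0)
data = pd.DataFrame({"sepal_width": rng.normal(3.0, 0.4, 30)})

fig, ax = plt.subplots()
positions = np.arange(10)
# Horizontal bars for the first ten observations.
bars = ax.barh(positions, data["sepal_width"].iloc[:10])

# Put one tick on each bar and label the bars 1..10 instead of 0..9.
ax.set_yticks(positions)
ax.set_yticklabels(np.arange(1, 11))

# Axis labels and title can be set in a single call.
ax.set(xlabel="xlabel", ylabel="ylabel", title="Title")
```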
Customising Plots by Group
data.groupby('species').mean().plot(color=['red', 'blue', 'black', 'green'],
                                    fontsize=10.0, figsize=(4, 4))
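To make the grouped plot concrete, here is a sketch on a synthetic frame with four assumed measurement columns, one per colour in the list. Each column becomes one line, and the x-axis holds the group labels.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Synthetic stand-in for `data`; column names are assumptions.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "sepal_length": rng.normal(5.8, 0.8, 30),
    "sepal_width": rng.normal(3.0, 0.4, 30),
    "petal_length": rng.normal(3.7, 1.7, 30),
    "petal_width": rng.normal(1.2, 0.7, 30),
    "species": np.repeat(["setosa", "versicolor", "virginica"], 10),
})

# One row per species, one column per measurement.
means = data.groupby("species").mean()

# pandas draws one line per column, coloured in the order given.
ax = means.plot(color=["red", "blue", "black", "green"],
                fontsize=10.0, figsize=(4, 4))
```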
Using Seaborn
import seaborn as sns
sns.pairplot(data, hue='species', height=3)  # 'size' was renamed to 'height' in seaborn 0.9
Hexbin
A hexbin plot shows the density of points: each hexagonal bin is coloured by the number of observations that fall inside it.
sns.jointplot(x=data['sepal_length'], y=data['sepal_width'], kind='hex')
Facet Grid
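A facet grid draws a histogram of one feature in a separate panel for each species; repeat the plot statement to look at different features.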
# first plot statement
plot = sns.FacetGrid(data, col='species', margin_titles=True)
plot.map(plt.hist, 'sepal_width', color='green')
# second plot statement
plot = sns.FacetGrid(data, col='species', margin_titles=True)
plot.map(plt.hist, 'sepal_length', color='blue')
Example EDA Analysis
This analysis uses a data set of sepal and petal measurements for iris species, which is saved in
'iris_data.csv'.
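A possible first pass over such a file is sketched below. Because 'iris_data.csv' itself is not reproduced in these notes, the sketch reads a few illustrative rows from an in-memory string; the column names and the species label format are assumptions about the file layout. With the real file, `pd.read_csv('iris_data.csv')` would take the place of the `StringIO` call.

```python
import io
import pandas as pd

# Illustrative rows only; the header and label format are assumed.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
"""
data = pd.read_csv(io.StringIO(csv_text))

# Basic structure: number of rows and columns, and the type of each column.
print(data.shape)
print(data.dtypes)

# How many observations there are of each species.
species_counts = data["species"].value_counts()
print(species_counts)

# Summary statistics for the numeric columns.
print(data.describe())
```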