Lab Exercise 2

Uploaded by

srujalprusty555

LAB EXERCISE FOR ML LAB (18-09-24)

1. Write a Python program to load the iris data from a given csv file into a dataframe and
print the shape of the data, type of the data and first 3 rows.
import pandas as pd
data = pd.read_csv("iris.csv")
print("Shape of the data:")
print(data.shape)
print("\nData Type:")
print(type(data))
print("\nFirst 3 rows:")
print(data.head(3))
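2. Write a Python program using Scikit-learn to print the keys, number of rows and
columns, feature names and the description of the Iris data.
from sklearn.datasets import load_iris
# load_iris() returns a dict-like Bunch holding the data, targets and metadata
iris_data = load_iris()
print("\nKeys of Iris dataset:")
print(iris_data.keys())
print("\nNumber of rows and columns of Iris dataset:")
print(iris_data.data.shape)
print("\nFeature names of Iris dataset:")
print(iris_data.feature_names)
print("\nDescription of Iris dataset:")
print(iris_data.DESCR)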
3. Write a Python program to get the number of observations, missing values and nan
values.
import pandas as pd
iris = pd.read_csv("iris.csv")
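# info() prints its summary (row count, non-null counts) itself and returns None
iris.info()
# Number of missing (NaN) values in each column
print(iris.isnull().sum())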
4. Write a Python program to view basic statistical details like percentile, mean, std etc.
of iris data.
import pandas as pd
data = pd.read_csv("iris.csv")
print(data.describe())
5. Write a Python program to drop Id column from a given Dataframe and print the
modified part. Call iris.csv to create the Dataframe.
import pandas as pd
data = pd.read_csv("iris.csv")
print("Original Data:")
print(data.head())
new_data = data.drop('Id',axis=1)
print("After removing id column:")
print(new_data.head())
6. Write a Python program to access first four cells from a given Dataframe using the
index and column labels. Call iris.csv to create the Dataframe.
import pandas as pd
data = pd.read_csv("iris.csv")
print("Original Data:")
print(data.head())
# Select all rows and the four measurement columns by their column labels
x = data.loc[:, ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']].values
print(x)
7. Write a Python program to create a plot to get a general Statistics of Iris data.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
iris.describe().plot(kind="area", fontsize=16, figsize=(15, 8), table=True,
                     colormap="Accent")
plt.xlabel('Statistics')
plt.ylabel('Value')
plt.title("General Statistics of Iris Dataset")
plt.show()
8. Write a Python program to create a Bar plot to get the frequency of the three species
of the Iris data.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# plt.subplots() returns a (figure, axes) pair, so unpack both
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
sns.countplot(x='Species', data=iris, ax=ax)
plt.title("Iris Species Count")
plt.show()
9. Write a Python program to create a Pie plot to get the frequency of the three species
of the Iris data.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# plt.subplots() returns a (figure, axes) pair, so unpack both
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
iris['Species'].value_counts().plot.pie(explode=[0.1, 0.1, 0.1], autopct='%1.1f%%',
                                        shadow=True, ax=ax)
plt.title("Iris Species %")
plt.show()
10. Write a Python program to create a graph to find relationship between the sepal length
and width.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# The first plot() call returns a matplotlib Axes; the later calls draw on it via ax=fig
fig = iris[iris.Species == 'Iris-setosa'].plot(
    kind='scatter', x='SepalLengthCm', y='SepalWidthCm', color='orange', label='Setosa')
iris[iris.Species == 'Iris-versicolor'].plot(
    kind='scatter', x='SepalLengthCm', y='SepalWidthCm', color='blue', label='versicolor', ax=fig)
iris[iris.Species == 'Iris-virginica'].plot(
    kind='scatter', x='SepalLengthCm', y='SepalWidthCm', color='green', label='virginica', ax=fig)
fig.set_xlabel("Sepal Length")
fig.set_ylabel("Sepal Width")
fig.set_title("Sepal Length VS Width")
fig=plt.gcf()
fig.set_size_inches(12,8)
plt.show()

11. Write a Python program to create a graph to find relationship between the petal length
and width.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
fig = iris[iris.Species == 'Iris-setosa'].plot.scatter(
    x='PetalLengthCm', y='PetalWidthCm', color='orange', label='Setosa')
iris[iris.Species == 'Iris-versicolor'].plot.scatter(
    x='PetalLengthCm', y='PetalWidthCm', color='blue', label='versicolor', ax=fig)
iris[iris.Species == 'Iris-virginica'].plot.scatter(
    x='PetalLengthCm', y='PetalWidthCm', color='green', label='virginica', ax=fig)
fig.set_xlabel("Petal Length")
fig.set_ylabel("Petal Width")
fig.set_title("Petal Length VS Width")
fig=plt.gcf()
fig.set_size_inches(12,8)
plt.show()
12. Write a Python program to create a graph to see how SepalLength, SepalWidth,
PetalLength and PetalWidth are distributed.
import pandas as pd
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
# Drop id column
new_data = iris.drop('Id',axis=1)
new_data.hist(edgecolor='black', linewidth=1.2)
fig=plt.gcf()
fig.set_size_inches(12,12)
plt.show()
13. Write a Python program to create a jointplot to describe individual distributions on the
same plot between Sepal length and Sepal width.
Note: jointplot - Draws a plot of two variables with bivariate and univariate graphs.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
fig=sns.jointplot(x='SepalLengthCm', y='SepalWidthCm',
data=iris, color='blue')
plt.show()

14. Write a Python program to create a jointplot using "hexbin" to describe individual
distributions on the same plot between Sepal length and Sepal width.
Note:
The bivariate analogue of a histogram is known as a "hexbin" plot, because it shows
the counts of observations that fall within hexagonal bins. This plot works best with
relatively large datasets. It's available through the matplotlib plt.hexbin function and
as a style in jointplot(). It looks best with a white background.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
iris = pd.read_csv("iris.csv")
fig=sns.jointplot(x='SepalLengthCm', y='SepalWidthCm', kind="hex", color="red",
data=iris)
plt.show()
15. Write a Python program to create a jointplot using "kde" to describe individual
distributions on the same plot between Sepal length and Sepal width.
Note:
The kernel density estimation (kde) procedure visualizes a bivariate distribution. In
seaborn, this kind of plot is shown with a contour plot and is available as a style in
jointplot().
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
iris = pd.read_csv("iris.csv")
fig=sns.jointplot(x='SepalLengthCm', y='SepalWidthCm', kind="kde", color='cyan',
data=iris)
plt.show()
16. Write a Python program to create a jointplot and add regression and kernel density fits
using "reg" to describe individual distributions on the same plot between Sepal length
and Sepal width.
17. Write a Python program to draw a scatterplot, then add a joint density estimate to
describe individual distributions on the same plot between Sepal length and Sepal
width.
18. Write a Python program to create a jointplot using "kde" to describe individual
distributions on the same plot between Sepal length and Sepal width and use '+' sign
as marker.
Note:
The kernel density estimation (kde) procedure visualizes a bivariate distribution. In
seaborn, this kind of plot is shown with a contour plot and is available as a style in
jointplot().
19. Write a Python program to create a pairplot of the iris data set and check which flower
species seems to be the most separable.
20. Write a Python program to create a box plot (or box-and-whisker plot) which shows
the distribution of quantitative data in a way that facilitates comparisons between
variables or across levels of a categorical variable of the iris dataset. Use seaborn.
