VIRTUAL LABS: BATCH-7
Roll Number:
MACHINE LEARNING VIRTUAL LAB
TITLE: Wheat Seed Prediction
Batch Number-7
Dataset Link: [Link]
Description of the Dataset:
The dataset comprises kernels belonging to three different varieties of wheat:
Kama, Rosa and Canadian, 70 elements each, randomly selected for the experiment. High-quality
visualization of the internal kernel structure was obtained using a soft X-ray
technique. It is non-destructive and considerably cheaper than other, more sophisticated
imaging techniques such as scanning microscopy or laser technology. The images were recorded
on 13x18 cm X-ray KODAK plates. Studies were conducted using combine-harvested wheat
grain originating from experimental fields explored at the Institute of Agrophysics of the
Polish Academy of Sciences in Lublin. The data set can be used for classification
and cluster analysis tasks. The Wheat Seeds Dataset involves predicting the species from
measurements of seeds of different wheat varieties. It is a multiclass (3-class)
classification problem. The number of observations for each class is balanced. There are 210
observations with 7 input variables and 1 output variable. The variable names are as follows:
1. Area.
2. Perimeter.
3. Compactness.
4. Length of kernel.
5. Width of kernel.
6. Asymmetry coefficient.
7. Length of kernel groove.
8. Class (1, 2, 3).
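Compactness (variable 3) is not an independent measurement. In the UCI description of this dataset it is derived from the area A and perimeter P as C = 4*pi*A / P^2, so a perfectly circular outline gives C = 1. A minimal sketch of that formula follows; the function name and the sample measurements are illustrative assumptions and are not taken from the lab's CSV.

```python
import math


def compactness(area: float, perimeter: float) -> float:
    """Return C = 4*pi*A / P**2 (equals 1.0 for a perfect circle)."""
    if perimeter <= 0:
        raise ValueError("perimeter must be positive")
    return 4.0 * math.pi * area / perimeter ** 2


# Illustrative kernel measurements (assumed values, not read from the lab's data file).
c = compactness(area=15.26, perimeter=14.84)
print(round(c, 3))  # 0.871
```

Because compactness is a function of area and perimeter, it is strongly related to those two variables, which is worth keeping in mind when interpreting feature selection results.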
SOURCE CODE: Classification using Logistic Regression
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import confusion_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import (classification_report,
    accuracy_score, f1_score, precision_score, recall_score,
    roc_curve, roc_auc_score)
df = pd.read_csv('[Link]')
y = df.loc[:, "Type"].values
X = df.drop(['Type'], axis=1)
selector = SelectKBest(score_func=f_classif, k=5)
X_selected = selector.fit_transform(X, y)
selected_indices = selector.get_support(indices=True)
selected_features = X.columns[selected_indices]
print(selected_features)
scaler = MinMaxScaler(feature_range=(0, 1))
X_selected_normalized = scaler.fit_transform(X_selected)
x_train, x_test, y_train, y_test = train_test_split(
    X_selected_normalized, y, test_size=0.33, random_state=123)
lr = LogisticRegression(solver='liblinear', max_iter=1000)
lr.fit(x_train, y_train)
y_pred_train = lr.predict(x_train)
print("Confusion Matrix - Training Data:")
print(confusion_matrix(y_train, y_pred_train))
print("---------------------------------------")
print("Training Accuracy:", accuracy_score(y_train,
y_pred_train))
y_pred = lr.predict(x_test)
print("Confusion Matrix - Test Data:")
print(confusion_matrix(y_test, y_pred))
print("---------------------------------------")
print("Testing Accuracy:", accuracy_score(y_test, y_pred))
print("---------------Performance Metrics--------------")
cm1 = confusion_matrix(y_test, y_pred)
sns.heatmap(cm1, annot=True)
plt.xlabel('Predicted Values')
plt.ylabel('Actual Values')
plt.title("Confusion Matrix")
plt.show()
print(classification_report(y_test,y_pred))
print("-------------------Validation----------------------")
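# One new sample, in the order of the five selected features:
# Area, Perimeter, KernelLength, KernelWidth, KernelGroove.
# Note: these raw values are not passed through `scaler`, although the
# model was trained on min-max scaled features.
data = [[17.89, 18.99, 4.777, 6.6, 5.678]]
df_test = pd.DataFrame(data)
result = lr.predict(df_test)
print(result)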
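In the listing above, the feature selector and the scaler are fitted on the full dataset before the split, and the validation sample is predicted without scaling. One common way to keep those steps consistent is to wrap them in a scikit-learn Pipeline that is fitted on the training split only. The sketch below is a hedged illustration, not the lab's method: it uses synthetic data from make_classification as a stand-in for the wheat CSV, the default solver instead of 'liblinear', and a stratified split, all of which are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the wheat data (assumption): 210 rows, 7 features, 3 classes.
X, y = make_classification(n_samples=210, n_features=7, n_informative=5,
                           n_redundant=2, n_classes=3, random_state=123)
y = y + 1  # relabel 0, 1, 2 as 1, 2, 3 to mirror the lab's class labels

# Split first, so that selection and scaling never see the test rows.
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=123, stratify=y)

model = Pipeline([
    ("select", SelectKBest(score_func=f_classif, k=5)),
    ("scale", MinMaxScaler(feature_range=(0, 1))),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(x_train, y_train)

y_pred = model.predict(x_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Testing Accuracy:", test_accuracy)

# A new sample is supplied with all 7 raw measurements; the pipeline applies
# the fitted selection and scaling before the classifier predicts.
new_sample = x_test[:1]
print(model.predict(new_sample))
```

With this arrangement a new sample is always transformed with the statistics learned from the training data, so there is no separate scaling step to forget at prediction time.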
OUTPUTS:
Index(['Area', 'Perimeter', 'KernelLength', 'KernelWidth',
'KernelGroove'], dtype='object')
Confusion Matrix - Training Data:
[[43 3 3]
[ 1 44 0]
[ 2 0 37]]
---------------------------------------
Training Accuracy: 0.9323308270676691
Confusion Matrix - Test Data:
[[12 5 0]
[ 0 23 0]
[ 3 0 23]]
---------------------------------------
Testing Accuracy: 0.8787878787878788
-------------Performance Metrics------------------------
              precision    recall  f1-score   support

           1       0.80      0.71      0.75        17
           2       0.82      1.00      0.90        23
           3       1.00      0.88      0.94        26

    accuracy                           0.88        66
   macro avg       0.87      0.86      0.86        66
weighted avg       0.89      0.88      0.88        66
-------------------Validation----------------------
[2]
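The per-class figures in the classification report can be derived directly from the test confusion matrix printed above, where rows are actual classes and columns are predicted classes. The following sketch recomputes them with NumPy; the variable names are illustrative, and the matrix is copied from the test output.

```python
import numpy as np

# Test confusion matrix from the output (rows: actual, columns: predicted).
cm = np.array([[12, 5, 0],
               [0, 23, 0],
               [3, 0, 23]])

true_pos = np.diag(cm)
precision = true_pos / cm.sum(axis=0)  # column sums: samples predicted as each class
recall = true_pos / cm.sum(axis=1)     # row sums: samples that truly belong to each class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_pos.sum() / cm.sum()

for label, p, r, f in zip((1, 2, 3), precision, recall, f1):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.4f}")
```

For class 1, for example, 12 of the 15 samples predicted as class 1 are correct (precision 0.80), and 12 of the 17 actual class 1 samples are recovered (recall 0.71); the 5 class 1 seeds predicted as class 2 account for most of the test errors.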
RESULT: Thus, the classification of wheat seeds using logistic regression was performed
successfully and verified.
VRSEC Dept of AI&DS