Scikit-learn Code-Mixed Beginner Guide

What is Scikit-learn?

Scikit-learn is a powerful Python library for building supervised and unsupervised machine learning models. It makes data preprocessing, model building, evaluation, and related steps very easy.
1. Loading Data

from sklearn import datasets

# Load the built-in iris flower dataset
iris = datasets.load_iris()
X = iris.data    # feature matrix
y = iris.target  # class labels

Here X holds the features (e.g. petal length and width), and y holds the labels (flower type).
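Before going further, it can help to look at what was loaded. The sketch below only uses attributes that the object returned by load_iris() provides (feature_names, target_names); the printed shape is the size of the standard iris dataset.

```python
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

print(X.shape)             # (150, 4): 150 flowers, 4 features each
print(iris.feature_names)  # names of the four feature columns
print(iris.target_names)   # class names that the integer labels 0, 1, 2 stand for
```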
2. Train-Test Split

from sklearn.model_selection import train_test_split

# Hold out 20% of the samples for testing; random_state makes the split repeatable
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

This step splits the data into a training part and a test part. The model learns only from the training data, and the test data is used to measure its performance.
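A variation you may want to try is a stratified split, which is not used in the guide's own code. Passing stratify=y (an existing train_test_split parameter) asks for the same class proportions in both parts. This is a minimal sketch on the iris data:

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in the training and test parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)  # 120 training rows and 30 test rows
print(np.bincount(y_test))          # number of test samples in each class
```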


3. Preprocessing

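Standardization:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# Learn the mean and standard deviation from the training data only
X_train = scaler.fit_transform(X_train)
# Reuse the training statistics on the test data
X_test = scaler.transform(X_test)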

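Missing Value Handling:

from sklearn.impute import SimpleImputer

# Replace each missing value with the mean of its column
imputer = SimpleImputer(strategy='mean')
X_train = imputer.fit_transform(X_train)
# Fill the test data with the same training-set means
X_test = imputer.transform(X_test)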

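Normalization:

from sklearn.preprocessing import Normalizer

# Rescale each sample (row) to unit norm.
# The result is stored separately so that X_train still matches X_test.
normalizer = Normalizer()
X_norm = normalizer.fit_transform(X_train)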

Binarization:

from sklearn.preprocessing import Binarizer

# Values above the threshold become 1; all other values become 0
binarizer = Binarizer(threshold=0.0)
X_bin = binarizer.fit_transform(X_train)

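Label Encoding:

from sklearn.preprocessing import LabelEncoder

# Convert class labels (e.g. strings) to integers.
# The iris labels are already 0, 1, 2, so this leaves them unchanged.
encoder = LabelEncoder()
y = encoder.fit_transform(y)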
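The preprocessing steps above can also be chained so that each one is fitted on the training data and then reused on the test data. The sketch below is one way to do that with sklearn.pipeline.Pipeline, which the guide itself does not use. The iris data has no missing values, so a few entries are blanked out artificially here, purely for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X = X.copy()
X[::10, 0] = np.nan  # artificially remove some values (illustration only)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Impute first, then scale; the step names are arbitrary labels
preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

X_train_prep = preprocess.fit_transform(X_train)  # statistics come from training data only
X_test_prep = preprocess.transform(X_test)        # the same statistics are reused here

print(np.isnan(X_train_prep).any())        # False: no missing values remain
print(X_train_prep.mean(axis=0).round(6))  # every column is centered near 0
```

Imputing before scaling means the scaler never sees missing values, and calling only transform on the test data keeps test information out of the fitted statistics.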
4. Model Building

from sklearn.neighbors import KNeighborsClassifier

# Classify each sample by a majority vote among its 3 nearest training samples
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

Here the model is trained using the KNN (k-nearest neighbors) algorithm.
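To see what KNN does with a single sample, you can ask the fitted model for the nearest neighbors directly. This is a small sketch using the classifier's kneighbors method; the manual vote shown here is only an illustration of the idea, assuming the default uniform weighting.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

sample = X_test[:1]                            # one test flower, kept 2-dimensional
distances, indices = model.kneighbors(sample)  # its 3 closest training samples
neighbor_labels = y_train[indices[0]]          # labels of those neighbors
vote = np.bincount(neighbor_labels).argmax()   # most common label among them

print(neighbor_labels)
print(vote, model.predict(sample)[0])          # the manual vote and the model's prediction
```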


5. Prediction

y_pred = model.predict(X_test)

Using the test data, the model predicts which class each flower belongs to.
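Besides the predicted class, the classifier can report how confident it is. The sketch below uses predict_proba, which for KNN with default settings is the share of neighbors that belong to each class, and maps the integer predictions back to flower names.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

y_pred = model.predict(X_test)           # one integer class label per test flower
proba = model.predict_proba(X_test[:3])  # class probabilities for the first 3 flowers

print(iris.target_names[y_pred[:3]])     # predicted labels as flower names
print(proba)                             # one row per flower; each row sums to 1
```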
6. Evaluation

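from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Fraction of test samples that were predicted correctly
print("Accuracy:", accuracy_score(y_test, y_pred))
# Counts of true classes (rows) against predicted classes (columns)
print(confusion_matrix(y_test, y_pred))
# Precision, recall, and F1-score for each class
print(classification_report(y_test, y_pred))

This step shows how well the model predicts.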
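The accuracy and the confusion matrix are closely related: the diagonal of the matrix counts the correct predictions. As a quick check, the sketch below recomputes the accuracy by hand from the matrix and compares it with accuracy_score.

```python
import numpy as np
from sklearn import datasets
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
y_pred = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train).predict(X_test)

cm = confusion_matrix(y_test, y_pred)
# Rows are true classes and columns are predicted classes,
# so the diagonal holds the correctly classified samples.
manual_accuracy = np.trace(cm) / cm.sum()

print(manual_accuracy, accuracy_score(y_test, y_pred))  # the two values agree
```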


7. Grid Search

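from sklearn.model_selection import GridSearchCV

# Candidate values to try for n_neighbors
params = {"n_neighbors": [1, 2, 3, 4]}
# Evaluate each candidate with 4-fold cross-validation on the training data
grid = GridSearchCV(model, param_grid=params, cv=4)
grid.fit(X_train, y_train)
print(grid.best_score_)   # best mean cross-validated accuracy
print(grid.best_params_)  # parameter values that achieved it

Grid search tries every value listed in params and reports the best-performing parameters.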
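One possible refinement, not part of the guide's code, is to put the scaler and the classifier into a single Pipeline before searching. Each cross-validation fold then fits the scaler on its own training portion only. In the sketch below the step names "scale" and "knn" are arbitrary labels chosen for this example; the "knn__n_neighbors" key follows scikit-learn's step-name, double underscore, parameter-name convention.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter name>
params = {"knn__n_neighbors": [1, 2, 3, 4]}
grid = GridSearchCV(pipe, param_grid=params, cv=4)
grid.fit(X_train, y_train)

# After fitting, the grid behaves like the best model refitted on all training data
test_score = grid.score(X_test, y_test)
print(grid.best_params_)
print(test_score)  # accuracy on the held-out test set
```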


8. PCA

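from sklearn.decomposition import PCA

# A fraction between 0 and 1 asks PCA to keep enough components to explain that share of the variance
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_train)

PCA is a technique for reducing dimensionality. With n_components=0.95, it keeps the smallest number of components that together explain at least 95% of the variance.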
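To check what PCA actually kept, you can inspect the fitted object. This sketch standardizes the training data first (a common choice before PCA, assumed here rather than taken from the guide) and then prints the number of components and the variance they explain.

```python
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

X_train_scaled = StandardScaler().fit_transform(X_train)

pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_train_scaled)

print(pca.n_components_)                    # how many components were kept
print(pca.explained_variance_ratio_)        # share of variance explained by each one
print(pca.explained_variance_ratio_.sum())  # total share, at least 0.95
print(X_pca.shape)                          # same rows, fewer columns
```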


Practice Task

1. Load the Titanic dataset and preprocess it (handle missing values, encode categorical data).

2. Build a model (e.g. Decision Tree, Logistic Regression).

3. Check the accuracy.

4. Use grid search to find the best parameters.

5. Print the confusion matrix and classification report.
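As a starting point for steps 1 and 2, the sketch below shows one possible way to handle mixed numeric and categorical columns with ColumnTransformer. It assumes pandas is installed. The rows are hypothetical, hand-made values in the style of the Titanic data (not real records), and the column names are assumptions; replace the DataFrame with the real dataset when you do the task.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical rows for illustration only; load the real Titanic data here instead
df = pd.DataFrame({
    "Age": [22.0, 38.0, np.nan, 35.0, 54.0, np.nan, 27.0, 14.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86, 8.46, 11.13, 30.07],
    "Sex": ["male", "female", "female", "female", "male", "male", "female", "female"],
    "Embarked": ["S", "C", "S", "S", "S", np.nan, "S", "C"],
    "Survived": [0, 1, 1, 1, 0, 0, 1, 1],
})
X = df.drop(columns="Survived")
y = df["Survived"]

# Numeric columns: fill missing values with the median, then standardize
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# Categorical columns: fill missing values with the most frequent category, then one-hot encode
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", categorical, ["Sex", "Embarked"]),
])

clf = Pipeline([("prep", preprocess), ("model", LogisticRegression())])
clf.fit(X, y)
predictions = clf.predict(X)
print(predictions)  # one 0/1 prediction per row
```

With the real dataset you would split into training and test parts first, then continue with accuracy, grid search, and the reports from steps 3 to 5.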
