Machine Learning: Data Loading, Splitting, and Model Training
This document covers the basic steps of building a machine learning model in Python using pandas and
scikit-learn: loading data, splitting it, training a classifier, and evaluating its predictions.
1. Importing Required Libraries:
-------------------------------
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
2. Loading CSV Data using pandas:
---------------------------------
data = pd.read_csv('your_file.csv')
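Before going further, it can help to confirm that the file loaded as expected. The sketch below is only an illustration: it reads a small inline CSV (a hypothetical stand-in for 'your_file.csv', with made-up column names) and prints the shape, the inferred column types, and the number of missing values per column.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for 'your_file.csv';
# the column names and values are invented for illustration.
csv_text = """age,income,target
25,40000,0
32,52000,1
47,81000,1
51,,0
"""

data = pd.read_csv(io.StringIO(csv_text))

print(data.shape)    # (number of rows, number of columns)
print(data.dtypes)   # column types inferred by pandas

# Count missing values in each column; scikit-learn's LogisticRegression
# does not accept NaN, so these need handling before training.
missing_per_column = data.isna().sum()
print(missing_per_column)
```

If any column reports missing values, they have to be filled in or the affected rows dropped before the model is trained.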
3. Splitting Features (X) and Target (y):
-----------------------------------------
# Assuming 'target' is the column you want to predict
X = data.drop('target', axis=1)
y = data['target']
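A quick check that the separation worked, sketched on a small made-up DataFrame (the column names here are assumptions, not taken from any real file). It uses the `columns=` keyword of `drop`, which is equivalent to passing `axis=1`.

```python
import pandas as pd

# Made-up data for illustration only.
data = pd.DataFrame({
    "age": [25, 32, 47, 51],
    "income": [40000, 52000, 81000, 63000],
    "target": [0, 1, 1, 0],
})

# drop() returns a new DataFrame; 'data' itself is left unchanged.
X = data.drop(columns=["target"])
y = data["target"]

print(list(X.columns))  # feature columns only
print(y.name)           # name of the target column
```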
4. Splitting into Training and Testing Sets:
--------------------------------------------
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
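Here test_size=0.2 holds out 20% of the rows for testing, and random_state=42 makes the shuffle reproducible. When the classes are imbalanced, one option is to also pass stratify=y so that both sets keep roughly the same class proportions. The sketch below uses synthetic data (an assumption for illustration) with 20 positive rows out of 100.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced data for illustration: 80 zeros and 20 ones.
X = pd.DataFrame({"feature": np.arange(100)})
y = pd.Series([0] * 80 + [1] * 20, name="target")

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# With stratify=y, the share of positives is preserved in both sets.
print(len(X_train), len(X_test))
print(y_train.mean(), y_test.mean())
```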
5. Training the Model (Logistic Regression):
--------------------------------------------
model = LogisticRegression()
model.fit(X_train, y_train)
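With default settings, LogisticRegression can emit a ConvergenceWarning when features are on very different scales. One common remedy, shown here as a sketch rather than a required step, is to standardize the features in a pipeline and raise max_iter. The data comes from make_classification, so it is synthetic, and the value max_iter=1000 is an arbitrary choice for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for the X and y built from the CSV.
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The scaler is fitted on the training data only, then the same
# transformation is applied automatically whenever the pipeline predicts.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

train_accuracy = model.score(X_train, y_train)
print("Training accuracy:", train_accuracy)
```

Because the pipeline exposes the same fit and predict methods as a plain model, the remaining steps work unchanged.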
6. Predicting on Test Data:
---------------------------
y_pred = model.predict(X_test)
7. Evaluating the Model:
------------------------
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
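Accuracy is the fraction of test rows predicted correctly, which can hide how the errors are distributed between classes. As a supplementary sketch, a confusion matrix breaks the same predictions down by true and predicted class. The labels below are hand-written for illustration and do not come from a trained model.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hand-written labels for illustration only.
y_test = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)

print("Accuracy:", accuracy)
print(cm)
```

In this example four of the six predictions are correct, with one error in each class.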
Make sure to replace 'your_file.csv' and 'target' with the actual file name and column name you're using.