
Linear Regression Lab Report 4

The document outlines a lab report detailing the process of data analysis using Python libraries such as NumPy, Matplotlib, and Pandas. It describes steps including data importation, null value checking, encoding categorical variables, and splitting the dataset into training and testing sets. Finally, it discusses training a Linear Regression model and comparing predicted results with actual outcomes.

LAB 04 REPORT

Code:

Explanations:
Importing the libraries used in this project.
NumPy, Matplotlib and pandas are imported under their usual short names np, plt and pd.

Code & Output:

Explanations:
Importing the dataset with pandas and
displaying it.
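
A minimal sketch of this step. The report does not give the file name or the data, so the CSV text, the column names and the values below are hypothetical stand-ins with the same shape as the lab dataset (4 feature columns and 1 target column).

```python
import io

import pandas as pd

# Hypothetical CSV content; the real lab file is not named in the report.
csv_text = """spend_a,spend_b,spend_c,city,profit
10.0,5.0,20.0,Dhaka,50.0
12.0,6.0,18.0,Khulna,55.0
9.0,4.0,25.0,Dhaka,48.0
15.0,7.0,30.0,Sylhet,70.0
11.0,5.5,22.0,Khulna,53.0
14.0,6.5,28.0,Sylhet,66.0
"""

# read_csv accepts a path or any file-like object, so StringIO stands in
# for a file on disk here.
dataset = pd.read_csv(io.StringIO(csv_text))

# Show the first rows to confirm that the columns were parsed as expected.
print(dataset.head())
```

With a real file, the StringIO object would be replaced by the path to the CSV.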

Code & Output:


Explanations:
There are 5 columns: 4 are independent and 1 is dependent. So we store the
independent columns in x and the dependent column in y. Then we display
the elements of y.
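
The separation could be sketched as follows, assuming the dependent column is the last one. The DataFrame and its column names are hypothetical, and iloc is one common way to slice by position, not necessarily the call used in the lab.

```python
import pandas as pd

# Hypothetical data: 4 independent columns followed by 1 dependent column.
dataset = pd.DataFrame({
    "spend_a": [10.0, 12.0, 9.0, 15.0],
    "spend_b": [5.0, 6.0, 4.0, 7.0],
    "spend_c": [20.0, 18.0, 25.0, 30.0],
    "city": ["Dhaka", "Khulna", "Dhaka", "Sylhet"],
    "profit": [50.0, 55.0, 48.0, 70.0],
})

# Every column except the last one goes into x (the independent variables).
x = dataset.iloc[:, :-1].values

# The last column goes into y (the dependent variable).
y = dataset.iloc[:, -1].values

print(y)
```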
Code & Output:

Here we check whether the dataset contains any null or NaN values. isnull() is a pandas
method that marks every missing value, and sum() is a pandas method that adds up those
marks for each column. In the result, every column has a count of 0, so there are no
null values in the whole dataset.
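
A small sketch of the null check on a hypothetical DataFrame. The second frame, with one value deliberately blanked out, is added only to show what a non-zero count would look like; it is not part of the lab data.

```python
import numpy as np
import pandas as pd

# Hypothetical data with no missing values.
dataset = pd.DataFrame({
    "spend_a": [10.0, 12.0, 9.0],
    "city": ["Dhaka", "Khulna", "Dhaka"],
    "profit": [50.0, 55.0, 48.0],
})

# isnull() gives a True/False table; sum() counts the True values per column.
null_counts = dataset.isnull().sum()
print(null_counts)

# Summing the per-column counts gives the total for the whole dataset.
total_nulls = int(null_counts.sum())

# Illustration only: blank out one cell and count again.
with_gap = dataset.copy()
with_gap.loc[0, "spend_a"] = np.nan
gap_counts = with_gap.isnull().sum()
print(gap_counts)
```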

Code & Output:

There is a text/string column in my dataset, so we need to encode its values as numbers
that the model can work with. Before encoding, we need to check whether the column is an
independent variable or the dependent variable. If it is an independent variable, we can
use OneHotEncoder; if it is the dependent variable, we can use
LabelEncoder. For one-hot encoding, we import ColumnTransformer and
OneHotEncoder from sklearn and pass the encoder and the column to transform as arguments
to the ColumnTransformer object. fit_transform() then replaces the text/string values with their encoded values.
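
The encoding step might look like the sketch below. The feature values, the position of the text column (index 3) and the transformer name "encoder" are assumptions, and sparse_threshold=0 is added here only to force a dense array in the output.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical feature matrix: three numeric columns and one text column.
x = np.array([
    [10.0, 5.0, 20.0, "Dhaka"],
    [12.0, 6.0, 18.0, "Khulna"],
    [9.0, 4.0, 25.0, "Dhaka"],
    [15.0, 7.0, 30.0, "Sylhet"],
    [11.0, 5.5, 22.0, "Khulna"],
    [14.0, 6.5, 28.0, "Sylhet"],
], dtype=object)

# One-hot encode column 3 and pass the other columns through unchanged.
ct = ColumnTransformer(
    transformers=[("encoder", OneHotEncoder(), [3])],
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense array
)
x_encoded = np.array(ct.fit_transform(x))

# The encoded columns come first: one 0/1 column per distinct city.
print(x_encoded)

# LabelEncoder is meant for a dependent (target) column with text labels.
labels = LabelEncoder().fit_transform(["no", "yes", "no"])
print(labels)
```

Three distinct cities produce three 0/1 columns, so the four original columns become six.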

Code & Output:


Explanation:
Here we split the dataset into two parts, a training set and a testing set. The usual ratio
is 80:20 (80% of the data used for training and 20% for testing). To split, we
import train_test_split from sklearn and pass it our arguments:
test_size=0.2 reserves 20% of the data for testing, and random_state sets the seed for the
shuffle that train_test_split performs before splitting, so the test rows are drawn at
random but the same rows are picked on every run.
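
A short sketch of the split on hypothetical arrays. The seed value 0 is an arbitrary choice made for this example; the report does not say which value was used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 rows with 4 features each, and one target per row.
x = np.arange(40).reshape(10, 4)
y = np.arange(10)

# Keep 20% of the rows for testing; the seed makes the shuffle repeatable.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

print(x_train.shape, x_test.shape)
```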

What if I don't use random_state?

Ans: The data is still shuffled, because train_test_split shuffles by default, but the split
is different on every run, so the results cannot be reproduced. The real risk for my model
is splitting without any shuffling. Suppose I have 100 rows and split them 80:20 for
training and testing. Without shuffling, the first 80 rows go to training and the
remaining 20 rows go to testing. Those 20 rows may all belong to the same value/category. If
we test the model on only one category, we do not know how it reacts to the other
categories, so it is hard to tell whether the model is actually good or not. So we shuffle
the dataset, which lets the testing data come from anywhere in the dataset, and we set
random_state so that the shuffle is the same every time.
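
The difference can be seen on a hypothetical dataset that is sorted by category. The labels "A" and "B" and the seed value are assumptions made for this sketch.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical sorted data: 80 rows of category "A", then 20 rows of "B".
x = np.arange(100).reshape(-1, 1)
y = np.array(["A"] * 80 + ["B"] * 20)

# Without shuffling, the test set is simply the last 20 rows.
_, _, _, y_test_ordered = train_test_split(x, y, test_size=0.2, shuffle=False)
print(set(y_test_ordered))

# With shuffling (the default) and a fixed seed, test rows can come from
# anywhere in the dataset, and repeating the call gives the same rows.
_, x_test_1, _, y_test_1 = train_test_split(x, y, test_size=0.2, random_state=0)
_, x_test_2, _, y_test_2 = train_test_split(x, y, test_size=0.2, random_state=0)
print(np.array_equal(x_test_1, x_test_2))
```

The unshuffled test set contains only "B", so a model evaluated on it is never tested on category "A".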

Now we import LinearRegression from sklearn and train our model on x_train and
y_train.
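
A minimal training sketch. The training data is generated from a made-up linear rule (the coefficients and the intercept of 10.0 are hypothetical), so that the fitted values can be checked against known numbers; the variable name regressor is also an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data that follows an exact linear rule.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 100, size=(40, 4))
true_coef = np.array([2.0, -1.0, 0.5, 3.0])
y_train = x_train @ true_coef + 10.0

# fit() estimates one coefficient per feature plus an intercept.
regressor = LinearRegression()
regressor.fit(x_train, y_train)

print(regressor.coef_)
print(regressor.intercept_)
```

Because the data has no noise, the fitted coefficients and intercept recover the rule that generated it; real data would only be approximated.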

Now we want to see the predictions of our model. The model was trained on x_train and
y_train, and its predictions are stored in y_pred. We want to see two digits after the
decimal point, so we use precision=2; to see n digits, use precision=n. The
next line shows the actual results and our predicted results.
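
One way this step could look, on hypothetical noisy data. It assumes that precision=2 refers to np.set_printoptions and that the two result arrays are placed in adjacent columns with np.concatenate; the report does not show the actual calls.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical data: a linear rule plus some random noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, size=(50, 4))
y = x @ np.array([2.0, -1.0, 0.5, 3.0]) + 10.0 + rng.normal(0, 5, size=50)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# Train on the training set, then predict for the rows held out for testing.
regressor = LinearRegression().fit(x_train, y_train)
y_pred = regressor.predict(x_test)

# Print floats with two digits after the decimal point.
np.set_printoptions(precision=2)

# Column 0 holds the predictions, column 1 the actual values.
comparison = np.concatenate(
    (y_pred.reshape(-1, 1), y_test.reshape(-1, 1)), axis=1
)
print(comparison)
```

Each printed row pairs one prediction with the actual value for the same test row, which makes large errors easy to spot by eye.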
