Lab 11,12

The document outlines two experiments: one on statistical analysis using NumPy in Python, detailing various statistical functions like mean, median, standard deviation, and variance, and the second on implementing simple linear regression to predict salary based on years of experience. It provides code examples for calculating statistical measures and constructing a regression line using Python libraries such as NumPy and Pandas. The experiments demonstrate practical applications of statistical methods and data analysis in Python programming.

Uploaded by

Prince Yadav

EXPERIMENT NO.

Program Name: Statistical Analysis using NumPy in Python Programming Language

Implementation: Statistics is concerned with collecting data and then analyzing it. It includes methods
for collecting samples, describing the data, and drawing conclusions from it. NumPy is the fundamental
package for scientific computing in Python, so its statistical functions are a natural tool for this work.
NumPy contains various statistical functions for data analysis. They can find the minimum or maximum
of an array's elements and compute basic statistical measures such as the standard deviation and
variance.

NumPy is equipped with the following statistical functions:

1. np.min() - Determines the minimum value of the elements along a specified axis.
2. np.max() - Determines the maximum value of the elements along a specified axis.
3. np.mean() - Determines the mean value of the data set.
4. np.median() - Determines the median value of the data set.
5. np.std() - Determines the standard deviation.
6. np.var() - Determines the variance.
7. np.ptp() - Returns the range of values (maximum minus minimum) along an axis.
8. np.average() - Determines the weighted average.
9. np.percentile() - Determines the nth percentile of the data along the specified axis.

Finding maximum and minimum of array in NumPy

The NumPy np.min() and np.max() functions determine the minimum and maximum value of
array elements along a specified axis. When no axis is given, they return the minimum and maximum
over the whole array.

import numpy as np
arr = np.array([[1, 23, 78], [98, 60, 75], [79, 25, 48]])
print(arr)
# Minimum function
print(np.min(arr))
# Maximum function
print(np.max(arr))
Output
[[ 1 23 78]
 [98 60 75]
 [79 25 48]]
1
98
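
The example above does not pass an axis, so it may help to see what the axis argument changes. The following is a minimal sketch that reuses the same array; the variable names col_min and row_max are illustrative and not part of the original lab.

```python
import numpy as np

arr = np.array([[1, 23, 78], [98, 60, 75], [79, 25, 48]])

# axis=0 collapses the rows, giving one result per column
col_min = np.min(arr, axis=0)
# axis=1 collapses the columns, giving one result per row
row_max = np.max(arr, axis=1)

print(col_min)  # [ 1 23 48]
print(row_max)  # [78 98 79]
```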

Prince Yadav IT3 2100270130133

Finding Mean, Median, Standard Deviation and Variance in NumPy
Mean
The mean is the sum of the elements divided by the number of elements, given by the following formula:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

np.mean() adds all the items of the array and then divides the total by the number of elements.
We can also specify the axis along which the mean is calculated.

import numpy as np
a = np.array([5, 6, 7])
print(a)
print(np.mean(a))
Output
[5 6 7]
6.0
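
The text mentions that the mean can be taken along an axis, which the one-dimensional example cannot show. Here is a small sketch on a hypothetical 2 x 3 array (the array m and the variable names are assumptions made for illustration).

```python
import numpy as np

# Hypothetical 2 x 3 array, chosen only to make the axis behaviour visible
m = np.array([[1, 2, 3], [4, 5, 6]])

overall = np.mean(m)             # mean of all six elements
per_column = np.mean(m, axis=0)  # one mean per column
per_row = np.mean(m, axis=1)     # one mean per row

print(overall)     # 3.5
print(per_column)  # [2.5 3.5 4.5]
print(per_row)     # [2. 5.]
```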
Median
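The median is the middle element of the sorted array. The formula differs for odd and even numbers of
elements, where $x_{(k)}$ denotes the k-th smallest value:

$$\text{median} = \begin{cases} x_{((n+1)/2)} & \text{if } n \text{ is odd} \\ \frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right) & \text{if } n \text{ is even} \end{cases}$$

np.median() can calculate the median for both one-dimensional and multi-dimensional arrays. The median
separates the higher half of the data values from the lower half.

import numpy as np
a = np.array([5, 6, 7])
print(a)
print(np.median(a))
Output
[5 6 7]
6.0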
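
Since the odd and even cases differ, a short sketch with one array of each kind may make the distinction concrete. The arrays below are hypothetical and deliberately unsorted, to show that np.median() orders the values itself.

```python
import numpy as np

odd = np.array([7, 5, 6])      # sorted: 5, 6, 7 -> middle element
even = np.array([8, 5, 7, 6])  # sorted: 5, 6, 7, 8 -> average of 6 and 7

median_odd = np.median(odd)
median_even = np.median(even)

print(median_odd)   # 6.0
print(median_even)  # 6.5
```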
Standard Deviation
The standard deviation is the square root of the average of the squared deviations from the mean. The
formula for the standard deviation is:

$$\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

import numpy as np
a = np.array([5, 6, 7])
print(a)
print(np.std(a))
Output
[5 6 7]
0.816496580927726
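
As a check on the definition, the standard deviation can be computed step by step and compared with np.std(). This is only a sketch; the names deviations and manual_std are illustrative.

```python
import numpy as np

a = np.array([5, 6, 7])

# Deviations from the mean: [-1, 0, 1]
deviations = a - a.mean()
# Square root of the average squared deviation
manual_std = np.sqrt(np.mean(deviations ** 2))

print(manual_std)
print(np.isclose(manual_std, np.std(a)))  # True
```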
Variance

The variance is the average of the squared deviations from the mean. The formula is:

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$

import numpy as np
a = np.array([5, 6, 7])
print(a)
print(np.var(a))
Output
[5 6 7]
0.6666666666666666
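
By default np.var() divides by n, which matches the result above. Many textbooks instead divide by n - 1 when the data is a sample; the lab does not discuss this, so the sketch below is an aside that uses NumPy's ddof parameter to show the difference.

```python
import numpy as np

a = np.array([5, 6, 7])

population_var = np.var(a)      # divides by n (the default, ddof=0)
sample_var = np.var(a, ddof=1)  # divides by n - 1

print(population_var)  # 0.6666666666666666
print(sample_var)      # 1.0
```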
NumPy Average Function

The NumPy np.average() function determines the weighted average and also works on multi-dimensional
arrays. The weighted average is calculated by multiplying each element by its weight, summing the
products, and dividing by the sum of the weights. The weights are specified separately. If no weights are
given, it produces the same output as the mean.

import numpy as np
a = np.array([5, 6, 7])
print(a)
# Without weights: same as the mean
print(np.average(a))
# With weights: gives the weighted average
wt = np.array([8, 2, 3])
print(np.average(a, weights=wt))
Output
[5 6 7]
6.0
5.615384615384615
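
The weighted result above can be reproduced by hand, which may help clarify what np.average() does with the weights. A minimal sketch, using the same values; the name manual is illustrative.

```python
import numpy as np

a = np.array([5, 6, 7])
wt = np.array([8, 2, 3])

# (5*8 + 6*2 + 7*3) / (8 + 2 + 3) = 73 / 13
manual = np.sum(a * wt) / np.sum(wt)

print(manual)
print(np.isclose(manual, np.average(a, weights=wt)))  # True
```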
NumPy Percentile Function

The np.percentile() function computes the q-th percentile of the data along the specified axis. It has the
following syntax:

np.percentile(input, q, axis)

The accepted parameters are:

• input: the input array.
• q: the percentile to compute, between 0 and 100.
• axis: the axis along which the calculation is performed.

import numpy as np
a = np.array([2, 10, 20])
print(a)
print(np.percentile(a, 10, 0))
Output
[ 2 10 20]
3.6
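
The value 3.6 is not an element of the array because np.percentile() interpolates linearly between neighbouring values by default. The sketch below reproduces that interpolation by hand for this one-dimensional case; the variable names are illustrative, and it assumes the default interpolation method.

```python
import numpy as np

a = np.array([2, 10, 20])
q = 10

s = np.sort(a)
# Fractional position of the q-th percentile within the sorted data
pos = (q / 100) * (len(s) - 1)
lower = int(np.floor(pos))
upper = min(lower + 1, len(s) - 1)
frac = pos - lower

# Linear interpolation between the two neighbouring values
manual = s[lower] + frac * (s[upper] - s[lower])

print(manual)
print(np.isclose(manual, np.percentile(a, q)))  # True
```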
NumPy Peak-to-Peak Function

The NumPy np.ptp() function determines the range of values (maximum minus minimum) along an axis.

import numpy as np
a = np.array([[2, 10, 20], [6, 10, 60]])
print(np.ptp(a, 0))
Output
[ 4  0 40]
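
Peak-to-peak is simply the maximum minus the minimum, which the following sketch confirms on the same array. It also shows the result along the other axis; the names col_range and row_range are illustrative.

```python
import numpy as np

a = np.array([[2, 10, 20], [6, 10, 60]])

# Range per column, computed by hand as max - min
col_range = np.max(a, axis=0) - np.min(a, axis=0)
# Range per row, using np.ptp directly
row_range = np.ptp(a, axis=1)

print(col_range)  # [ 4  0 40]
print(row_range)  # [18 54]
```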


EXPERIMENT NO. 12

Program Name: Implementation of Linear Regression using Python Programming Language

Implementation:
Simple Linear Regression in Python
Simple linear regression is a statistical method that we can use to find a relationship between two variables
and make predictions. The two variables are typically denoted x and y. The independent variable, the one
used to predict the dependent variable, is denoted x. The dependent variable, or the outcome, is denoted y.

A simple linear regression model produces a line of best fit, also called the regression line. You may have
heard about drawing the line of best fit through a scatter plot of data. For example, let's say we have a
scatter plot showing how years of experience affect salaries. Imagine drawing a line through the points to
predict the trend.

The simple linear regression equation we will use is:

$$y = \beta_0 + \beta_1 x$$

The constant is the y-intercept (𝜷0), the point where the regression line crosses the y-axis. The beta
coefficient (𝜷1) is the slope and describes the relationship between the independent variable and the
dependent variable. The coefficient can be positive or negative and gives the change in the dependent
variable for every 1-unit change in the independent variable.

For example, let's say we have a regression equation of y = 2 + 0.5x. For every 1-unit increase in the
independent variable (x), there will be a 0.50 increase in the dependent variable (y).
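
The interpretation of the slope can be checked with a couple of lines of plain Python. This is a minimal sketch; the function name predict and its default arguments are hypothetical and simply encode the example equation y = 2 + 0.5x.

```python
def predict(x, b0=2.0, b1=0.5):
    """Evaluate the example regression line y = b0 + b1 * x."""
    return b0 + b1 * x

at_4 = predict(4)
step = predict(5) - predict(4)  # effect of a 1-unit increase in x

print(at_4)  # 4.0
print(step)  # 0.5
```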

Simple Linear Regression Using Python

For this example, we will be using salary data from Kaggle. The data consists of two columns: years of
experience and the corresponding salary.

First, we will import the Python packages that we will need for this analysis. All we will need is NumPy,
to help with the math calculations; Pandas, to store and manipulate the data; and Matplotlib (optional), to
plot the data.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Next, we will load in the data and then assign each column to its appropriate variable. For this example,
we will be using the years of experience to predict the salary, so the dependent variable will be the salary
(y) and the independent variable will be the years of experience (x).
data = pd.read_csv('Salary_Data.csv')
x = data['YearsExperience']
y = data['Salary']
To get a look at the data we can use the .head() method provided by Pandas, which will show us the first
few rows of the data.

print(data.head())

   YearsExperience   Salary
0              1.1  39343.0
1              1.3  46205.0
2              1.5  37731.0
3              2.0  43525.0
4              2.2  39891.0

A scatter plot of the data shows a positive linear relationship between Years of Experience and Salary,
meaning that as a person gains more experience, they also get paid more.
Calculating the Regression Line

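While we could spend all day guessing the slope and intercept of the linear regression line, luckily there
are formulas that we can use to quickly make these calculations.

To estimate the slope 𝜷1 of the data we will use the following formula:

$$\beta_1 = \frac{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}$$

To estimate the intercept 𝜷0, we can use the following formula:

$$\beta_0 = \bar{y} - \beta_1 \bar{x}$$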
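
For readers wondering where these two formulas come from, here is a brief derivation sketch. It assumes the line of best fit is defined by ordinary least squares, that is, by minimizing the sum of squared residuals $S$; the lab itself does not state the criterion, so treat this as one standard justification rather than the author's own argument.

```latex
S(\beta_0, \beta_1) = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2

% Setting the partial derivative with respect to \beta_0 to zero:
\frac{\partial S}{\partial \beta_0} = -2 \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right) = 0
\;\Longrightarrow\; \beta_0 = \bar{y} - \beta_1 \bar{x}

% Setting the partial derivative with respect to \beta_1 to zero
% and substituting \beta_0 = \bar{y} - \beta_1 \bar{x}:
\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{n} x_i \left( y_i - \bar{y} - \beta_1 (x_i - \bar{x}) \right) = 0
\;\Longrightarrow\; \beta_1 = \frac{\sum_{i=1}^{n} x_i (y_i - \bar{y})}{\sum_{i=1}^{n} x_i (x_i - \bar{x})}
= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
```

The last equality holds because $\sum_{i}(y_i - \bar{y}) = 0$ and $\sum_{i}(x_i - \bar{x}) = 0$, so subtracting $\bar{x}$ from each $x_i$ factor leaves both sums unchanged.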

Now we will translate these two formulas into Python to calculate the regression line. First I will
show the full function, then I will break it down further.

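def linear_regression(x, y):
    # Number of observations (kept for reference; the formulas below do not need it)
    N = len(x)
    x_mean = x.mean()
    y_mean = y.mean()

    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    B1_num = ((x - x_mean) * (y - y_mean)).sum()
    B1_den = ((x - x_mean)**2).sum()
    B1 = B1_num / B1_den

    # Intercept: the fitted line passes through (x_mean, y_mean)
    B0 = y_mean - (B1 * x_mean)

    # Regression line as text, with coefficients rounded to 3 decimal places
    reg_line = 'y = {} + {}x'.format(round(B0, 3), round(B1, 3))

    return (B0, B1, reg_line)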

First, we will use the len() function to get the number of observations in our dataset and assign it to
the N variable. We can then calculate the mean of both x and y by simply using the .mean() method.

N = len(x)
x_mean = x.mean()
y_mean = y.mean()

Now we can begin to calculate the slope 𝜷1. To shorten these lines of code, we can calculate
the numerator and denominator of the slope formula first, then divide the numerator by the denominator
and assign the result to a variable named B1. We can just follow the slope formula given above.

B1_num = ((x - x_mean) * (y - y_mean)).sum()
B1_den = ((x - x_mean)**2).sum()
B1 = B1_num / B1_den

Now that we have calculated the slope 𝜷1, we can use the formula for the intercept 𝜷0.

B0 = y_mean - (B1 * x_mean)

Now if we apply this linear_regression() function to our data, it will return the intercept, the slope and the
regression line rounded to 3 decimal places.
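
Since the lab's output for the full dataset is not included here, the sketch below shows one way the calculation could be exercised and sanity-checked. It uses only the five rows printed by data.head() as NumPy arrays, so the coefficients it prints are not representative of the full Kaggle dataset. The compact linear_regression helper, the predict helper, and the cross-check against np.polyfit are assumptions added for illustration.

```python
import numpy as np

def linear_regression(x, y):
    """Closed-form slope and intercept for simple linear regression."""
    x_mean, y_mean = x.mean(), y.mean()
    B1 = ((x - x_mean) * (y - y_mean)).sum() / ((x - x_mean) ** 2).sum()
    B0 = y_mean - B1 * x_mean
    return B0, B1

# Only the five rows shown by data.head(), not the full dataset
years = np.array([1.1, 1.3, 1.5, 2.0, 2.2])
salary = np.array([39343.0, 46205.0, 37731.0, 43525.0, 39891.0])

B0, B1 = linear_regression(years, salary)
print('y = {} + {}x'.format(round(B0, 3), round(B1, 3)))

# Cross-check against NumPy's least-squares polynomial fit of degree 1,
# which returns the coefficients with the highest degree first
slope, intercept = np.polyfit(years, salary, 1)
print(np.isclose(B1, slope), np.isclose(B0, intercept))

# Hypothetical helper: predicted salary for a given number of years
def predict(x_new):
    return B0 + B1 * x_new

# The fitted line always passes through the point (mean of x, mean of y)
print(np.isclose(predict(years.mean()), salary.mean()))
```

With so few points the fitted slope can differ greatly from the one obtained on the full data; the checks only confirm that the formulas are implemented consistently.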
