Lab - Manual FDS

The document describes the contents and objectives of a data science laboratory course. The course objectives are to understand Python libraries for data science such as NumPy, Pandas, and Matplotlib, and to learn statistical analysis, data visualization, and machine learning techniques. The experiments cover downloading and using data science packages, working with NumPy arrays, reading and exploring data, and applying techniques such as regression, correlation, and plotting. Students learn to use Python libraries for data science, apply statistical measures, perform descriptive analytics, and present data visually.

Uploaded by

Prabha K

DATA SCIENCE LABORATORY

DATA SCIENCE
LAB MANUAL
(Under Revision)

Prepared & Consolidated

by
Vignesh L S

CS3362 DATA SCIENCE LABORATORY (Under Revision)

COURSE OBJECTIVES:
• To understand the Python libraries for data science.
• To understand the basic statistical and probability measures for data science.
• To learn descriptive analytics on the benchmark data sets.
• To apply correlation and regression analytics on standard data sets.
• To present and interpret data using visualization packages in Python.
LIST OF EXPERIMENTS:
1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels and
Pandas packages.
2. Working with NumPy arrays
3. Working with Pandas data frames
4. Reading data from text files, Excel and the web and exploring various commands for
doing descriptive analytics on the Iris data set.
5. Use the diabetes data set from UCI and the Pima Indians Diabetes data set for performing the
following:
• Univariate analysis: Frequency, Mean, Median, Mode, Variance, Standard Deviation,
Skewness and Kurtosis.
• Bivariate analysis: Linear and logistic regression modeling
• Multiple Regression analysis
• Also compare the results of the above analysis for the two data sets.
6. Apply and explore various plotting functions on UCI data sets.
• Normal curves
• Density and contour plots
• Correlation and scatter plots
• Histograms
• Three dimensional plotting
7. Visualizing Geographic Data with Basemap

LIST OF EQUIPMENT: (30 Students per Batch)

Tools: Python, NumPy, SciPy, Matplotlib, Pandas, statsmodels, seaborn, plotly, bokeh
Note: Example data sets include UCI, Iris, and Pima Indians Diabetes.

COURSE OUTCOMES:
At the end of this course, the students will be able to:
• CO1: Make use of the Python libraries for data science.
• CO2: Make use of the basic statistical and probability measures for data science.
• CO3: Perform descriptive analytics on the benchmark data sets.
• CO4: Perform correlation and regression analytics on standard data sets.
• CO5: Present and interpret data using visualization packages in Python.

1. Download, install and explore the features of NumPy, SciPy, Jupyter, Statsmodels
and Pandas packages.
Aim
To download and install Python and its packages using pip.
Procedure
Install Python Data Science Packages
Python is a high-level, general-purpose programming language with data science and
machine learning packages. As a first step, install Python for Windows, MacOS, or Linux.
Python Packages
The power of Python is in the packages that are available through either the pip or conda
package managers. This section is an overview of some of the best packages for machine
learning and data science and how to install them.
The packages below are commonly used for data science and machine
learning. You may need to install them from the terminal, Anaconda prompt,
command prompt, or from the Jupyter Notebook. If you have multiple versions of Python or
have specific dependencies, then use an environment manager such as pyenv. For most users,
a single installation is typically sufficient. The Python package manager pip provides all of the
packages (such as gekko) that we need for this course. If there is an administrative access
error, install to the local profile with the --user flag (for example, pip install --user gekko).

pip install gekko

Gekko
Gekko provides an interface to gradient-based solvers for machine learning and
optimization of mixed-integer, differential algebraic equations, and time series models.
Gekko provides exact first and second derivatives through automatic differentiation and
discretization with simultaneous or sequential methods.

pip install gekko

Keras
Keras provides an interface for artificial neural networks. Other backend packages were
supported until version 2.4, after which the 2.x releases supported only TensorFlow. Keras 3
again supports multiple backends (TensorFlow, JAX, and PyTorch). The backend is installed
separately, for example with pip install tensorflow.

pip install keras

Matplotlib
The package matplotlib generates plots in Python.

pip install matplotlib

NumPy
NumPy is a numerical computing package for mathematics, science, and engineering. Many
data science packages use NumPy as a dependency.

pip install numpy

OpenCV
OpenCV (Open Source Computer Vision Library) is a package for real-time computer vision,
originally developed by Intel.

pip install opencv-python

Pandas
Pandas visualizes and manipulates data tables. There are many functions that allow efficient
manipulation for the preliminary steps of data analysis problems.

pip install pandas

Plotly
Plotly renders interactive plots with HTML and JavaScript. Plotly Express is included with
Plotly.

pip install plotly

PyTorch
PyTorch enables deep learning, computer vision, and natural language processing.
It was originally developed by Facebook's AI Research lab (FAIR).

pip install torch

Scikit-Learn
Scikit-Learn (or sklearn) includes a wide variety of classification, regression and clustering
algorithms including neural network, support vector machine, random forest, gradient
boosting, k-means clustering, and other supervised or unsupervised learning methods.

pip install scikit-learn

SciPy
SciPy is a general-purpose package for mathematics, science, and engineering and extends
the base capabilities of NumPy.

pip install scipy

Seaborn
Seaborn is built on matplotlib and produces detailed plots in a few lines of code.

pip install seaborn

Statsmodels
Statsmodels is a package for exploring data, estimating statistical models, and performing
statistical tests. It includes descriptive statistics, statistical tests, plotting functions, and result
statistics.

pip install statsmodels

TensorFlow
TensorFlow is an open source machine learning platform with particular focus on training
and inference of deep neural networks. It was originally developed by the Google Brain team.

pip install tensorflow
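
After running the pip commands above, it can help to confirm which packages are actually available. The following is a minimal sketch, not part of the original manual, that uses the standard-library importlib.metadata module (Python 3.8 or later) to look up installed versions. The PACKAGES list and the installed_versions helper are illustrative names chosen here; edit the list to match the packages you installed.

```python
from importlib import metadata

# Distribution names as they are passed to "pip install" in this section.
# This list is an assumption for illustration; adjust it as needed.
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "statsmodels", "seaborn"]


def installed_versions(names):
    """Return {name: version string}, with None for packages that are missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(PACKAGES)
for name, version in report.items():
    print(f"{name:12s} {version or 'not installed'}")
```

Any package reported as "not installed" can be added with the matching pip install command from the list above.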

2. Working with NumPy arrays

Create a NumPy ndarray Object
NumPy is used to work with arrays. The array object in NumPy is called ndarray.

Example
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))

To create an ndarray, we can pass a list, tuple or any array-like object into the np.array()
function, and it will be converted into an ndarray:

Example
Use a tuple to create a NumPy array:
import numpy as np
arr = np.array((1, 2, 3, 4, 5))
print(arr)
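
As a quick check of the statement above, the sketch below builds the same array from a list, a tuple, and a range object (the range is an extra array-like input chosen here for illustration) and compares the results with np.array_equal.

```python
import numpy as np

# Three different array-like inputs holding the same values.
from_list = np.array([1, 2, 3, 4, 5])
from_tuple = np.array((1, 2, 3, 4, 5))
from_range = np.array(range(1, 6))

print(type(from_list))                         # <class 'numpy.ndarray'>
print(np.array_equal(from_list, from_tuple))   # True
print(np.array_equal(from_list, from_range))   # True
```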

Dimensions in Arrays
A dimension in arrays is one level of array depth (nested arrays).
0-D Arrays
0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.

Example
Create a 0-D array with value 42
import numpy as np
arr = np.array(42)
print(arr)

1-D Arrays
An array that has 0-D arrays as its elements is called a uni-dimensional or 1-D array.
These are the most common and basic arrays.

Example
Create a 1-D array containing the values 1,2,3,4,5:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)

2-D Arrays
An array that has 1-D arrays as its elements is called a 2-D array. These are often used to
represent matrices or 2nd-order tensors.

Example
Create a 2-D array containing two arrays with the values 1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)

3-D Arrays
An array that has 2-D arrays (matrices) as its elements is called a 3-D array.
These are often used to represent a 3rd-order tensor.

Example
Create a 3-D array with two 2-D arrays, both containing two arrays with the values
1,2,3 and 4,5,6:
import numpy as np
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)
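
Because each dimension is one level of nesting, indexing into the 3-D array one level at a time removes one dimension per index. The sketch below illustrates this with the same array; the variable names matrix, vector, and scalar are chosen here for readability.

```python
import numpy as np

arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])

matrix = arr[0]        # first 2-D array inside the 3-D array
vector = arr[0, 1]     # second row of that matrix: [4 5 6]
scalar = arr[0, 1, 2]  # third value of that row: 6

# Each extra index removes one level of depth.
print(arr.ndim, matrix.ndim, vector.ndim, np.ndim(scalar))  # 3 2 1 0
```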

Check Number of Dimensions

NumPy arrays provide the ndim attribute, which returns an integer that tells us how many
dimensions the array has.

Example
Check how many dimensions the arrays have:
import numpy as np
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
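
The ndim value is closely related to the shape attribute, which is not covered above but is a standard ndarray attribute: shape is a tuple with one entry per dimension, so ndim always equals len(shape). A minimal sketch using the same four arrays:

```python
import numpy as np

arrays = {
    "a": np.array(42),
    "b": np.array([1, 2, 3, 4, 5]),
    "c": np.array([[1, 2, 3], [4, 5, 6]]),
    "d": np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]]),
}

# Print the name, number of dimensions, and shape of each array.
for name, arr in arrays.items():
    print(name, arr.ndim, arr.shape)
# a 0 ()
# b 1 (5,)
# c 2 (2, 3)
# d 3 (2, 2, 3)
```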

Higher Dimensional Arrays

An array can have any number of dimensions.
When the array is created, you can define the number of dimensions by using the ndmin
argument.

Example
Create an array with 5 dimensions and verify that it has 5 dimensions:
import numpy as np
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('number of dimensions :', arr.ndim)

In this array the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element
that is the vector, the 3rd dim has 1 element that is the matrix containing the vector, the 2nd dim
has 1 element that is a 3-D array, and the 1st dim has 1 element that is a 4-D array.
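
The description above can be verified by inspecting the shape of the array: ndmin adds leading dimensions of length 1 in front of the original data. A short sketch (the variable name inner is chosen here for illustration):

```python
import numpy as np

arr = np.array([1, 2, 3, 4], ndmin=5)

# Four leading dimensions of length 1, then the original 4 elements.
print(arr.shape)  # (1, 1, 1, 1, 4)

# Indexing through the four outer dimensions recovers the original vector.
inner = arr[0, 0, 0, 0]
print(inner)      # [1 2 3 4]
```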

3. Working with Pandas data frames

A Pandas DataFrame is a 2 dimensional data structure, like a 2 dimensional array, or a table
with rows and columns.

Example
Create a simple Pandas DataFrame:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
#load data into a DataFrame object:
df = pd.DataFrame(data)
print(df)

Result
   calories  duration
0       420        50
1       380        40
2       390        45

Locate Row
As you can see from the result above, the DataFrame is like a table with rows and columns.
Pandas uses the loc attribute to return one or more specified rows.

Example
Return row 0:
#refer to the row index:

print(df.loc[0])

Result
calories    420
duration     50
Name: 0, dtype: int64
Note: This example returns a Pandas Series.

Example
Return row 0 and 1:
#use a list of indexes:
print(df.loc[[0, 1]])

Result
   calories  duration
0       420        50
1       380        40
Note: When a list of indexes is passed inside [], the result is a Pandas DataFrame.
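
The difference between the two notes above can be checked directly with type(). The sketch below rebuilds the same DataFrame and shows that a single label gives a Series while a list of labels gives a DataFrame, even when the list holds only one label. The variable names are chosen here for illustration.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data)

one_row = df.loc[0]            # single label: returns a Series
two_rows = df.loc[[0, 1]]      # list of labels: returns a DataFrame
one_row_frame = df.loc[[0]]    # one-element list: still a DataFrame

print(type(one_row).__name__)        # Series
print(type(two_rows).__name__)       # DataFrame
print(type(one_row_frame).__name__)  # DataFrame
```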

Named Indexes
With the index argument, you can name your own indexes.

Example
Add a list of names to give each row a name:
import pandas as pd
data = {
"calories": [420, 380, 390],
"duration": [50, 40, 45]
}
df = pd.DataFrame(data, index = ["day1", "day2", "day3"])
print(df)

Result
      calories  duration
day1       420        50
day2       380        40
day3       390        45

Locate Named Indexes

Use the named index in the loc attribute to return the specified row(s).

Example
Return "day2":
#refer to the named index:
print(df.loc["day2"])

Result
calories    380
duration     40
Name: day2, dtype: int64
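
Once rows have names, loc looks rows up by label. Position-based lookup is still available through the iloc attribute, which is not covered above but is part of the standard pandas API. The following sketch compares the two and also shows that a label slice with loc includes both end labels.

```python
import pandas as pd

data = {
    "calories": [420, 380, 390],
    "duration": [50, 40, 45],
}
df = pd.DataFrame(data, index=["day1", "day2", "day3"])

by_label = df.loc["day2"]     # lookup by row name
by_position = df.iloc[1]      # lookup by integer position (second row)

print(by_label.equals(by_position))  # True
print(by_label.name)                 # day2

# A label slice with loc includes both the start and the end label.
first_two = df.loc["day1":"day2"]
print(len(first_two))                # 2
```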

Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.

Example
Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd
df = pd.read_csv('data.csv')
print(df)
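
The example above needs a file named data.csv in the working directory. To try read_csv without creating a file, the sketch below passes an in-memory text buffer instead; read_csv accepts file-like objects as well as paths. The CSV contents are an assumption made here for illustration, reusing the calories and duration values from the earlier examples.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for data.csv.
csv_text = "calories,duration\n420,50\n380,40\n390,45\n"

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)    # (3, 2)
print(df.dtypes)   # column types inferred from the text
print(df.head())   # first rows of the table
```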
