EX.NO:1
DATE:
Install the Data Analysis and Visualization Tool: R / Python / Tableau Public / Power BI
AIM:
To install and set up a data analysis and visualization environment using
Python (with Jupyter Notebook) or other tools such as R, Tableau Public, or
Power BI, enabling the user to perform data analysis, visualization, and basic
data science operations.
ALGORITHM / PROCEDURE:
A. Installation of Python (Anaconda Distribution)
1. Download Anaconda:
Visit https://www.anaconda.com/products/distribution
Choose the installer for your operating system (Windows / macOS /
Linux).
2. Install Anaconda:
Run the installer and follow on-screen instructions.
Optionally select the option to add Anaconda to the system PATH (the installer marks this as not recommended; Anaconda Navigator and the Anaconda Prompt work without it).
3. Launch Jupyter Notebook:
Open Anaconda Navigator → Click on Jupyter Notebook.
Create a new Python notebook (.ipynb).
4. Verify Installation:
Import key data science libraries such as NumPy, Pandas, and Matplotlib, as in the verification sketch below.
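A minimal verification sketch, assuming a default Anaconda install (the library names are standard; the printed version numbers will vary by distribution):

# Run in a new notebook cell to confirm the core libraries load.
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt

print("NumPy:", np.__version__)
print("Pandas:", pd.__version__)
print("Matplotlib:", matplotlib.__version__)

# Quick end-to-end check: build a small Series and plot it.
pd.Series(np.random.randn(50).cumsum()).plot(title='Installation check')
plt.show()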
B. Installation of R (optional alternative)
1. Download and install R from https://cran.r-project.org/.
2. Download and install RStudio (IDE) from
https://posit.co/download/rstudio-desktop/.
3. Launch RStudio and test with a simple script.
C. Installation of Tableau Public / Power BI
Tableau Public: Download from https://public.tableau.com/en-us/s/ and
install.
Power BI: Download from Microsoft Store or
https://powerbi.microsoft.com/.
Load a sample dataset and verify visualization functionality.
RESULT:
The data analysis and visualization environment was successfully
installed and configured using Python (Anaconda Distribution).
Basic data science libraries such as NumPy, Pandas, Matplotlib, and
Seaborn were verified and tested with a sample visualization.
EX.NO:2
DATE:
Perform Exploratory Data Analysis (EDA) with an Email Dataset — Import, Visualize, and Derive Insights
AIM:
To perform Exploratory Data Analysis (EDA) on an email dataset by
importing exported email data into a Pandas DataFrame, cleaning and
exploring it, visualizing patterns, and deriving meaningful insights about the
data.
ALGORITHM / PROCEDURE:
Data Collection:
Export email data (e.g., from Gmail using Google Takeout or from
Outlook) in a .csv or .xlsx format.
The dataset may include columns like Sender, Receiver, Subject, Date,
Time, Message Length, Folder (Inbox/Sent/Spam), etc.
Import the Dataset:
Load the dataset into a Pandas DataFrame using
pd.read_csv() or pd.read_excel().
Data Cleaning:
Check for missing values, duplicates, and invalid entries.
Convert date/time columns to datetime format.
Exploratory Data Analysis (EDA):
Display dataset information (.info(), .describe()).
Check distributions and relationships between variables.
Analyze patterns such as:
Number of emails sent/received per day.
Most frequent senders/receivers.
Common keywords in subjects.
Visualization:
Use Matplotlib and Seaborn for data visualization.
Create plots such as:
Bar charts for most frequent senders.
Line charts for emails per day.
Pie chart for email categories.
Word cloud for subject keywords (optional).
Derive Insights:
Summarize findings from the EDA and visualizations (a short insights sketch follows the program below).
PROGRAM:
# Importing required libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from wordcloud import WordCloud
# Step 1: Load the dataset
data = pd.read_csv('emails.csv')
# Step 2: Display basic info
print("Dataset Info:")
print(data.info())
print("\nFirst 5 Records:")
print(data.head())
# Step 3: Data Cleaning
data.drop_duplicates(inplace=True)
data['Date'] = pd.to_datetime(data['Date'], errors='coerce')
# Step 4: Basic EDA
print("\nSummary Statistics:")
print(data.describe())
# Number of emails per day
emails_per_day = data['Date'].dt.date.value_counts().sort_index()
# Step 5: Visualization
# 1. Emails per day
plt.figure(figsize=(10,5))
emails_per_day.plot(kind='line', color='blue')
plt.title('Number of Emails per Day')
plt.xlabel('Date')
plt.ylabel('Count')
plt.grid(True)
plt.show()
# 2. Top 10 Senders
plt.figure(figsize=(10,5))
top_senders = data['From'].value_counts().head(10)
sns.barplot(x=top_senders.index, y=top_senders.values, palette='viridis')
plt.title('Top 10 Senders')
plt.xlabel('Sender')
plt.ylabel('Email Count')
plt.xticks(rotation=45)
plt.show()
# 3. Word Cloud for Subject Keywords
text = " ".join(str(subj) for subj in data['Subject'].dropna())
wordcloud = WordCloud(width=800, height=400,
                      background_color='white').generate(text)
plt.figure(figsize=(10,5))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Most Common Words in Email Subjects')
plt.show()
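The procedure's final step, deriving insights, can also be scripted. A minimal sketch continuing the program above, assuming the same DataFrame and column names (Date, From, Subject):

# Step 6: Derive simple insights from the cleaned data
from collections import Counter

busiest_day = emails_per_day.idxmax()
print("Busiest day:", busiest_day, "with", emails_per_day.max(), "emails")
print("Most frequent sender:", data['From'].value_counts().idxmax())

# Most common subject words (crude whitespace tokenization; refine as needed)
words = " ".join(data['Subject'].dropna().astype(str)).lower().split()
print("Top subject words:", Counter(words).most_common(5))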
OUTPUT:
1. Dataset Info
Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 1000 non-null datetime64[ns]
1 From 1000 non-null object
2 To 1000 non-null object
3 Subject 980 non-null object
4 Folder 1000 non-null object
dtypes: datetime64[ns](1), object(4)
2. First 5 Records
Date From To Subject Folder
3. Summary Statistics
Summary Statistics:
Date
count 1000
unique 300
top 2023-07-15
freq 15
Name: Date, dtype: object
RESULT:
Exploratory Data Analysis (EDA) was successfully performed on the email
dataset.
Different visualizations helped identify:
The most frequent senders and active dates.
Distribution of emails over time.
Common keywords in email subjects.
This experiment demonstrates how EDA techniques reveal trends and
patterns in textual and time-based data.
EX.NO:3
DATE:
Working with NumPy Arrays, Pandas DataFrames, and Basic Plots using Matplotlib
AIM:
To understand and demonstrate the creation, manipulation, and analysis
of NumPy arrays and Pandas DataFrames, and to visualize data using basic
Matplotlib plots such as line plots, bar charts, histograms, and scatter plots.
ALGORITHM / PROCEDURE:
Import Required Libraries
Import the numpy, pandas, and matplotlib.pyplot libraries.
Create and Manipulate NumPy Arrays
Create 1D and 2D arrays using np.array().
Perform basic operations: addition, multiplication, slicing, and reshaping.
Create and Explore Pandas DataFrames
Create a DataFrame using a dictionary or CSV file.
Display rows and columns using .head(), .tail(), .info(),
.describe().
Perform column operations and basic statistical analysis.
Visualize Data Using Matplotlib
Create different types of plots:
Line plot
Bar chart
Histogram
Scatter plot
Add titles, labels, legends, and grid for better readability.
Interpret the Output
Analyze the visualizations and draw conclusions.
PROGRAM:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Part 1: Working with NumPy
# Creating arrays
arr1 = np.array([10, 20, 30, 40, 50])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print("1D Array:", arr1)
print("2D Array:\n", arr2)
# Array operations
print("Array Sum:", arr1.sum())
print("Array Mean:", arr1.mean())
print("Array Slicing:", arr1[1:4])
print("Reshaped 2D Array:\n", arr2.reshape(3, 2))
# Part 2: Working with Pandas
# Creating a DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [24, 27, 22, 32, 29],
    'Marks': [88, 92, 95, 70, 85]
}
df = pd.DataFrame(data)
print("\nDataFrame:\n", df)
# Display basic info
print("\nDataFrame Info:")
print(df.info())
print("\nStatistical Summary:\n", df.describe())
# Part 3: Data Visualization using Matplotlib
# Line Plot
plt.figure(figsize=(6,4))
plt.plot(df['Name'], df['Marks'], marker='o', color='green')
plt.title('Line Plot - Student Marks')
plt.xlabel('Name')
plt.ylabel('Marks')
plt.grid(True)
plt.show()
# Bar Chart
plt.bar(df['Name'], df['Age'], color='orange')
plt.title('Bar Chart - Student Ages')
plt.xlabel('Name')
plt.ylabel('Age')
plt.show()
# Histogram
plt.hist(df['Marks'], bins=5, color='skyblue', edgecolor='black')
plt.title('Histogram - Marks Distribution')
plt.xlabel('Marks')
plt.ylabel('Frequency')
plt.show()
# Scatter Plot
plt.scatter(df['Age'], df['Marks'], color='red')
plt.title('Scatter Plot - Age vs Marks')
plt.xlabel('Age')
plt.ylabel('Marks')
plt.grid(True)
plt.show()
OUTPUT:
1. NumPy Array Output
1D Array: [10 20 30 40 50]
2D Array:
[[1 2 3]
[4 5 6]]
Array Sum: 150
Array Mean: 30.0
Array Slicing: [20 30 40]
Reshaped 2D Array:
[[1 2]
[3 4]
[5 6]]
2. Pandas DataFrame Output
Name Age Marks
0 Alice 24 88
1 Bob 27 92
2 Charlie 22 95
3 David 32 70
4 Eva 29 85
RESULT:
Successfully demonstrated the creation and manipulation of NumPy arrays and
Pandas DataFrames, along with visualization of data using Matplotlib.
The experiment highlights how Python’s core data science libraries enable
numerical computation, tabular data handling, and effective graphical
representation of information.
EX.NO:4
DATE:
Data Cleaning and Visualization: Exploring Filters and Plot Features in R
AIM:
To understand and demonstrate how to filter variables (columns) and rows in R
for data cleaning purposes, and to apply various plotting features on sample
datasets for visual analysis.
ALGORITHM / PROCEDURE:
Load Required Libraries
Use the tidyverse package for data manipulation (dplyr) and visualization
(ggplot2).
Load Sample Dataset
Use built-in datasets such as mtcars, iris, or diamonds.
Explore and Inspect Data
View the structure, summary, and first few rows of the dataset using functions like
head(), str(), summary().
Apply Variable (Column) Filters
Select specific columns using the select() function from dplyr.
Remove unnecessary columns for cleaning.
Apply Row Filters
Use the filter() function to include or exclude rows based on specific conditions.
Example: select only cars with mpg > 20 from the mtcars dataset.
Handle Missing or Invalid Data (if any)
Use na.omit() or is.na() to remove or identify missing values.
Visualization using ggplot2
Create various plots using ggplot():
Bar Plot
Histogram
Scatter Plot
Box Plot
Add titles, axis labels, and color themes for clarity.
Analyze and Interpret the Output (a Pandas comparison sketch follows this procedure).
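For cross-reference with the Python experiments above, the same column and row filtering can be sketched in Pandas on a small hand-built sample (the experiment itself uses R and dplyr; the sample values below mirror the first rows of mtcars):

import pandas as pd

# Tiny stand-in for the mtcars columns used in this experiment
cars = pd.DataFrame({
    'mpg': [21.0, 21.0, 22.8, 21.4, 18.7],
    'cyl': [6, 6, 4, 6, 8],
    'hp': [110, 110, 93, 110, 175],
    'wt': [2.620, 2.875, 2.320, 3.215, 3.440],
})

selected = cars[['mpg', 'cyl', 'hp', 'wt']]   # column filter, like dplyr::select()
filtered = selected[selected['mpg'] > 20]     # row filter, like dplyr::filter()
cleaned = filtered.dropna()                   # like na.omit()
print(cleaned)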
PROGRAM (R):
# Load required libraries
library(dplyr)
library(ggplot2)
# Step 1: Load sample dataset
data("mtcars")
# Step 2: Explore dataset
head(mtcars)
str(mtcars)
summary(mtcars)
# Step 3: Variable (Column) Filtering
selected_data <- select(mtcars, mpg, cyl, hp, wt)
print("Selected Columns:")
print(head(selected_data))
# Step 4: Row Filtering
filtered_data <- filter(selected_data, mpg > 20)
print("Filtered Rows where mpg > 20:")
print(filtered_data)
# Step 5: Handling Missing Values (if any)
cleaned_data <- na.omit(filtered_data)
# Step 6: Visualization
# (a) Scatter Plot - Horsepower vs. MPG
ggplot(cleaned_data, aes(x = hp, y = mpg)) +
geom_point(color = "blue", size = 3) +
ggtitle("Scatter Plot: Horsepower vs. Mileage (mpg)") +
xlab("Horsepower (hp)") +
ylab("Miles per Gallon (mpg)") +
theme_minimal()
# (b) Bar Plot - Count of Cylinders
ggplot(cleaned_data, aes(x = factor(cyl))) +
geom_bar(fill = "orange") +
ggtitle("Bar Plot: Number of Cars by Cylinders") +
xlab("Cylinders") +
ylab("Count") +
theme_bw()
# (c) Box Plot - Weight vs. MPG
ggplot(cleaned_data, aes(x = factor(cyl), y = wt, fill = factor(cyl))) +
geom_boxplot() +
ggtitle("Box Plot: Weight Distribution by Cylinders") +
xlab("Cylinders") +
ylab("Weight (wt)") +
theme_classic()
OUTPUT:
1. Data Inspection
'data.frame': 32 obs. of 11 variables:
$ mpg: Miles per gallon (numeric)
$ cyl: Number of cylinders (numeric)
$ disp: Displacement (numeric)
$ hp : Horsepower (numeric)
$ drat: Rear axle ratio (numeric)
$ wt : Weight (numeric)
2. Filtered Data Example
Selected Columns:
                   mpg cyl  hp    wt
Mazda RX4         21.0   6 110 2.620
Mazda RX4 Wag     21.0   6 110 2.875
Datsun 710        22.8   4  93 2.320
Hornet 4 Drive    21.4   6 110 3.215
Hornet Sportabout 18.7   8 175 3.440
Valiant           18.1   6 105 3.460
Filtered Rows (mpg > 20):
   mpg cyl  hp    wt
1 21.0   6 110 2.620
2 21.0   6 110 2.875
3 22.8   4  93 2.320
4 21.4   6 110 3.215
... (14 rows in total satisfy mpg > 20)
RESULT:
Successfully explored row and variable filtering techniques in R for data
cleaning using dplyr functions such as select() and filter().
Additionally, different visualization techniques were applied using ggplot2,
providing insights into relationships among key variables.
The experiment demonstrates the use of R as a powerful tool for data wrangling
and visualization.
EX.NO:5
DATE:
Perform Time Series Analysis and Apply Various Visualization Techniques
AIM:
To perform time series analysis on a dataset and apply various visualization
techniques to understand trends, seasonality, and patterns using R.
ALGORITHM :
Start the R environment (RStudio or R GUI).
Load the necessary libraries for time series analysis and visualization:
ggplot2 for plotting.
forecast for time series analysis and forecasting.
tseries for additional time series functions.
Import or load a time series dataset:
Use a built-in dataset like AirPassengers, or import your own dataset
using read.csv().
Convert the dataset into a time series object using the ts() function if it is
not already in time series format.
Display the dataset information:
Use functions such as head(), summary(), and str() to understand
the structure and statistics of the data.
Plot the original time series using the plot() function to visualize the overall
trend and fluctuations over time.
Decompose the time series into its components:
Use the decompose() or stl() functions to separate the data into
trend, seasonal, and random (residual) components (a Python comparison sketch follows this algorithm).
Plot the decomposed components to study their behavior.
Apply visualization techniques using ggplot2 and forecast package
functions:
autoplot() for enhanced plots.
ggseasonplot() to visualize seasonal patterns across years.
ggsubseriesplot() to identify monthly or periodic variations.
(Optional) Perform forecasting:
Use auto.arima() or ets() to fit forecasting models.
Predict future values using the forecast() function.
Visualize the forecast with autoplot().
Interpret the visualizations to identify:
Trends (increasing or decreasing behavior over time).
Seasonality (repeating patterns at regular intervals).
Residuals (random fluctuations).
Stop the program and record the observations and results.
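For cross-reference with the Python experiments above, the same decomposition idea can be sketched with pandas and statsmodels (a synthetic monthly series stands in for AirPassengers; the experiment itself uses R):

import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly data with an upward trend and a mid-year seasonal peak
idx = pd.date_range('1949-01-01', periods=144, freq='MS')
values = [112 + 2 * i + (30 if i % 12 in (5, 6, 7) else 0) for i in range(144)]
series = pd.Series(values, index=idx)

# Separate trend, seasonal, and residual components, then plot all of them
result = seasonal_decompose(series, model='additive', period=12)
result.plot()
plt.show()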
PROGRAM (in R):
# Load required libraries
library(ggplot2)
library(forecast)
library(tseries)
# Load a sample time series dataset
data("AirPassengers")
# Display dataset information
print(head(AirPassengers))
summary(AirPassengers)
# Basic time series plot
plot(AirPassengers,
main = "Monthly Air Passengers Data",
ylab = "Number of Passengers",
xlab = "Year",
col = "blue")
# Decompose the time series
decomposed <- decompose(AirPassengers)
plot(decomposed)
# Seasonal plot using ggplot2 (forecast package)
autoplot(AirPassengers) +
ggtitle("Time Series Plot of Air Passengers") +
xlab("Year") + ylab("Passengers")
# Seasonal pattern visualization
ggseasonplot(AirPassengers, year.labels = TRUE, year.labels.left = TRUE) +
ggtitle("Seasonal Plot: Air Passengers")
# Subseries plot
ggsubseriesplot(AirPassengers) +
ggtitle("Subseries Plot: Air Passengers")
# Optional: Forecasting
fit <- auto.arima(AirPassengers)
forecast_values <- forecast(fit, h = 12)
autoplot(forecast_values)
OUTPUT:
1. Display of Dataset (First Few Records):
> head(AirPassengers)
[1] 112 118 132 129 121 135
2. Summary of the Dataset:
> summary(AirPassengers)
Min. 1st Qu. Median Mean 3rd Qu. Max.
104.0 180.0 265.5 280.3 360.5 622.0
RESULT:
Time series analysis was successfully performed on the AirPassengers dataset.
The analysis revealed a strong upward trend and clear seasonal patterns in the
data.
Various visualization techniques such as decomposition, seasonal, and
forecasting plots effectively demonstrated these time-based behaviors.
EX.NO:6
DATE:
Perform Data Analysis and Representation on a Map Using Various Map Datasets with Mouse Rollover Effect and User Interaction
AIM:
To analyze spatial data and represent it visually on an interactive map using
R.
The experiment demonstrates mouse rollover effects, user interactions (zoom,
pan, popups), and data visualization using different map datasets.
ALGORITHM:
Start the R environment (RStudio or R GUI).
Install and load the required libraries:
leaflet – for interactive map visualization.
dplyr – for data manipulation.
sf – for handling spatial data.
maps or rnaturalearth – for world or country-level geographic data.
Load a map dataset:
Use built-in world/country shapefiles or import custom data using
st_read().
Example: Use rnaturalearth to get a world map or maps for U.S.
states.
Perform data analysis:
Analyze attributes such as population, density, or GDP associated with
geographic regions.
Use dplyr to summarize or group the data as needed.
Merge the geographic data with analytical results to create a data frame
containing spatial and statistical data.
Create an interactive map using leaflet:
Initialize the map with leaflet() and addTiles().
Add polygons or markers representing each region.
Use color palettes (colorNumeric or colorBin) to visualize data
intensity.
Add interactivity:
Use addPopups() or addLabelOnlyMarkers() for tooltips and
rollovers.
Add legends, layers, and zoom controls for better user experience.
Display the map:
Render the interactive map in the RStudio Viewer or a web browser (a Python comparison sketch follows this algorithm).
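For cross-reference with the Python experiments above, the same rollover-and-zoom interactivity can be sketched with folium, Python's wrapper around the same Leaflet library (hypothetical sample cities and density figures; the experiment itself uses R):

import folium

# Hypothetical sample data: (name, latitude, longitude, people per km²)
cities = [
    ('Delhi', 28.61, 77.21, 11300),
    ('Beijing', 39.90, 116.40, 1300),
    ('Sydney', -33.87, 151.21, 430),
]

m = folium.Map(location=[20, 0], zoom_start=2)   # world view; zoom and pan built in
for name, lat, lon, density in cities:
    folium.CircleMarker(
        location=[lat, lon],
        radius=8,
        fill=True,
        tooltip=f"{name}: {density} people/km²",  # shown on mouse rollover
    ).add_to(m)
m.save('map.html')  # open in a browser to zoom, pan, and hover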
PROGRAM (in R):
# Load necessary libraries
library(leaflet)
library(dplyr)
library(rnaturalearth)
library(rnaturalearthdata)
library(sf)
# Load world map data
world <- ne_countries(scale = "medium", returnclass = "sf")
# Create a sample dataset (population density; the Natural Earth table has no
# area column, so compute one from the geometry first)
data <- world %>%
  mutate(area_km2 = as.numeric(st_area(geometry)) / 1e6,
         pop_density = pop_est / area_km2)
# Define color palette based on population density
pal <- colorNumeric(palette = "YlOrRd", domain = data$pop_density)
# Create interactive map
leaflet(data = data) %>%
addTiles() %>%
addPolygons(
fillColor = ~pal(pop_density),
weight = 1,
opacity = 1,
color = "white",
dashArray = "3",
fillOpacity = 0.7,
highlight = highlightOptions(
weight = 3,
color = "#666",
dashArray = "",
fillOpacity = 0.7,
bringToFront = TRUE
),
label = ~paste0(name, ": ", round(pop_density, 2), " people/km²"),
labelOptions = labelOptions(
style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "13px",
direction = "auto"
)
) %>%
addLegend(
pal = pal,
values = ~pop_density,
opacity = 0.7,
title = "Population Density",
position = "bottomright"
)
OUTPUT:
1. Map Visualization Output:
o A world map appears in the RStudio Viewer or default web browser.
o Each country is filled with a color gradient representing population
density (calculated as population per square kilometer).
2. Color Coding:
o Countries with higher population density appear in dark red, while
those with lower density appear in light yellow.
3. Mouse Rollover Effect:
o When the mouse pointer is moved over a country:
The country region highlights with a bold border.
A tooltip label appears showing:
Country Name: Population Density
Example:
India: 420.52 people/km²
China: 380.11 people/km²
Australia: 3.24 people/km²
4. User Interaction Features:
The user can:
o Zoom in and out of the map using the zoom control buttons.
o Pan or drag the map to view different regions.
o Hover over countries to view their details dynamically.
5. Legend Output:
A color legend is displayed at the bottom right corner of the map.
It indicates how color intensity corresponds to population density:
Yellow → Low Density
Orange → Moderate Density
Red → High Density
Sample Console Output (if any):
> library(leaflet)
> library(rnaturalearth)
> library(dplyr)
> leaflet(data = data)
(the map widget renders in the RStudio Viewer; leaflet prints no further console output)
RESULT:
An interactive world map was successfully generated using the Leaflet library in
R.
The map displayed population density for each country with color-coded
visualization, along with mouse rollover effects, zooming, and user interaction
controls, enabling effective and interactive spatial data analysis.
EX.NO:7
DATE:
Perform Exploratory Data Analysis (EDA) on the Wine Quality Dataset
AIM:
To perform Exploratory Data Analysis (EDA) on the Wine Quality dataset to
understand its structure, identify patterns, detect missing values, and analyze
relationships between various chemical properties and wine quality using R.
ALGORITHM:
Start RStudio or R environment.
Load required libraries:
ggplot2 for visualization.
dplyr for data manipulation.
corrplot for correlation analysis.
readr for reading CSV data.
Import the dataset:
Load the Wine Quality Dataset (e.g., winequality-red.csv or
winequality-white.csv) using read.csv() or
readr::read_csv().
Inspect the dataset:
Use functions such as head(), str(), summary(), and dim() to
understand data structure and summary statistics.
Check for missing values:
Use sum(is.na(data)) to find missing or null values.
Perform univariate analysis:
Plot histograms, boxplots, or density plots for numerical variables to study
their distributions.
Perform bivariate analysis:
Use scatter plots and correlation matrices to analyze relationships between
predictors and wine quality.
Calculate correlation matrix:
Compute correlations using cor() and visualize them using corrplot() (compare the Pandas sketch after this algorithm).
Perform feature analysis:
Identify features most strongly correlated with quality.
Draw inferences and observations based on the visualizations and statistical
results.
End.
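For cross-reference with the Python experiments above, the first inspection steps of this EDA can be sketched in Pandas as well (assumes the UCI red-wine file, which is semicolon-separated; the experiment itself uses R):

import pandas as pd

wine = pd.read_csv('winequality-red.csv', sep=';')

print(wine.shape)                # expect (1599, 12) for the red-wine file
print(wine.isna().sum().sum())   # total missing values (0 for this dataset)
# Features most strongly correlated with quality, strongest first
print(wine.corr()['quality'].sort_values(ascending=False))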
PROGRAM (in R):
# Load required libraries
library(ggplot2)
library(dplyr)
library(corrplot)
library(readr)
# Import the dataset (the UCI copy of this file is semicolon-separated; in
# that case use read_delim("winequality-red.csv", delim = ";") instead)
wine <- read_csv("winequality-red.csv")
# View the structure and summary of the dataset
str(wine)
summary(wine)
dim(wine)
# Check for missing values
sum(is.na(wine))
# Univariate Analysis
ggplot(wine, aes(x = quality)) +
geom_bar(fill = "skyblue") +
ggtitle("Distribution of Wine Quality") +
xlab("Quality") + ylab("Count")
# Boxplot for alcohol vs quality
ggplot(wine, aes(x = as.factor(quality), y = alcohol, fill = as.factor(quality))) +
geom_boxplot() +
ggtitle("Alcohol Content vs Wine Quality") +
xlab("Wine Quality") + ylab("Alcohol (%)")
# Correlation matrix
corr_matrix <- cor(wine %>% select(-quality))
corrplot(corr_matrix, method = "color", type = "upper",
         tl.col = "black", tl.srt = 45)
# Scatter plot: alcohol vs density
ggplot(wine, aes(x = alcohol, y = density, color = as.factor(quality))) +
geom_point(alpha = 0.6) +
ggtitle("Alcohol”)
OUTPUT:
1. Dataset Summary:
> dim(wine)
[1] 1599 12
> head(wine)
fixed.acidity volatile.acidity citric.acid residual.sugar chlorides ...
7.4 0.70 0.00 1.9 0.076
...
2. Missing Values Check:
> sum(is.na(wine))
[1] 0
RESULT:
Exploratory Data Analysis (EDA) was successfully performed on the Wine
Quality Dataset using R.
The analysis revealed that alcohol, sulphates, and citric acid are positively
correlated with wine quality, while volatile acidity negatively
affects quality.
The dataset is clean, with no missing values, and visualizations effectively
represent important data patterns.