Pandas Assignment

The document provides instructions for several tasks using pandas and other Python libraries to analyze data. It includes: 1) Reading an XML and CSV file, finding and removing duplicate records from the XML data, and printing summaries of the CSV data. 2) Converting certain columns in the CSV to categorical data types. 3) Adding a new column to the CSV for total time, and printing data meeting criteria. 4) Counting flavor profiles by region in the CSV. 5) Finding and filling missing state values in the CSV. 6) Demonstrating regular expressions, stemming, stop word removal and bag of words modeling on text data.

Uploaded by

hetgoti4911

Python for Data Science 3150713

Practical-7
Assignment For pandas library

URL for test.xml file.

https://drive.google.com/file/d/1FqOWhY2XNYkHwCBYOjhAILCzVUo9QEp6/view?usp=sharing

Read the xml file (test.xml) and create a dataframe from it and do the following.
Find and print duplicate records.
Remove duplicates and save data in other dataframe.
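A minimal sketch of the XML task follows. The structure of test.xml is not shown in the assignment, so the inline XML string and its tag names (records, record, id, name) are hypothetical stand-ins; with the real file, its path would be passed to pd.read_xml instead.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for test.xml (the real tag names are not given).
xml_text = """<records>
  <record><id>1</id><name>A</name></record>
  <record><id>2</id><name>B</name></record>
  <record><id>1</id><name>A</name></record>
</records>"""

# parser="etree" uses the standard library parser, so lxml is not required.
df = pd.read_xml(StringIO(xml_text), parser="etree")

# Rows that repeat an earlier row (the first copy is not flagged).
duplicates = df[df.duplicated()]
print(duplicates)

# Keep one copy of each record in a separate dataframe.
deduped = df.drop_duplicates().reset_index(drop=True)
print(deduped)
```

Passing keep=False to duplicated() would list every copy of a repeated record rather than only the later ones.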

URL for the file for this assignment.

https://drive.google.com/file/d/1CNAdqFZ-Amji8kOMd4GovivK8UKVLQ-p/view?usp=sharing

Read the csv file (indian_food.csv). Treat the value -1 as a missing (NA) value, i.e. replace -1
with NaN when reading the csv file.
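One way to do this is the na_values parameter of pd.read_csv. The sketch below reads from an inline string so that it runs on its own; the rows are hypothetical sample data, and the column names are taken from the assignment text (the real file may name them differently).

```python
from io import StringIO

import pandas as pd

# Hypothetical rows standing in for indian_food.csv.
csv_text = """name,diet,prep_time,cooking_time,flavor_profile,course,state,region
Balu shahi,vegetarian,45,25,sweet,dessert,West Bengal,East
Boondi,vegetarian,80,30,sweet,dessert,Rajasthan,West
Chicken Tikka,non vegetarian,-1,30,spicy,starter,Punjab,North
Aloo tikki,vegetarian,5,20,-1,main course,-1,-1
"""

# na_values=[-1] marks every -1 as NaN while the file is parsed.
df = pd.read_csv(StringIO(csv_text), na_values=[-1])

# Number of missing values per column.
print(df.isna().sum())
```

With the real file, StringIO(csv_text) would be replaced by the path to indian_food.csv.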

Print the first and last 10 records of dataframe, also print column names and summary of data.
Print information about data such as data types of each column.
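The inspection steps map onto standard DataFrame methods. This is a sketch on a small made-up dataframe (12 hypothetical rows) rather than on the assignment file.

```python
import pandas as pd

# Hypothetical data: 12 rows so that head(10) and tail(10) differ.
df = pd.DataFrame({
    "name": [f"item_{i}" for i in range(12)],
    "prep_time": range(10, 130, 10),
    "cooking_time": range(5, 65, 5),
})

first_ten = df.head(10)   # first 10 records
last_ten = df.tail(10)    # last 10 records
print(first_ten)
print(last_ten)

print(df.columns.tolist())  # column names
print(df.describe())        # summary statistics of numeric columns
df.info()                   # data type and non-null count of each column
```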

Convert the columns course, diet, flavor_profile, state and region to the categorical data type and
print the data types of the dataframe using the info function.

Categories are defined as follows.

Course ['dessert' 'main course' 'starter' 'snack']
Flavor_profile ['sweet' 'spicy' 'bitter' 'sour']
State ['West Bengal' 'Rajasthan' 'Punjab' 'Uttar Pradesh' 'Odisha' 'Maharashtra' 'Uttarakhand'
'Assam' 'Bihar' 'Andhra Pradesh' 'Karnataka' 'Telangana' 'Kerala' 'Tamil Nadu' 'Gujarat' 'Tripura'
'Manipur' 'Nagaland' 'NCT of Delhi' 'Jammu & Kashmir' 'Chhattisgarh' 'Haryana' 'Madhya Pradesh'
'Goa']
Region ['East' 'West' 'North' nan 'North East' 'South' 'Central']
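A possible approach uses pandas' CategoricalDtype with the category lists above. In this sketch the sample rows are hypothetical, and because no list is given for diet and the state list is long, those two columns simply infer their categories from the data (an assumption, not something the assignment specifies).

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample rows.
df = pd.DataFrame({
    "course": ["dessert", "snack", "main course"],
    "diet": ["vegetarian", "vegetarian", "non vegetarian"],
    "flavor_profile": ["sweet", "spicy", None],
    "state": ["Goa", "Punjab", None],
    "region": ["West", "North", None],
})

# Explicit category lists taken from the assignment text.
explicit = {
    "course": ["dessert", "main course", "starter", "snack"],
    "flavor_profile": ["sweet", "spicy", "bitter", "sour"],
    "region": ["East", "West", "North", "North East", "South", "Central"],
}
for col, cats in explicit.items():
    df[col] = df[col].astype(CategoricalDtype(categories=cats))

# Categories inferred from the values present (assumption for this sketch).
for col in ["diet", "state"]:
    df[col] = df[col].astype("category")

df.info()
```

Missing values stay NaN after the conversion; NaN is never itself a category.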

Print name of items with course as dessert.

Print count of items with flavor_profile with sweet type.

Print name of items with cooking_time < prep_time.

Print summary of data grouped by diet column.

Print average cooking_time & prep_time for vegetarian diet type.
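The five queries above can be sketched with boolean masks and groupby. The dataframe below is hypothetical sample data using the column names from the assignment.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "name": ["Balu shahi", "Boondi", "Chicken Tikka", "Aloo tikki"],
    "diet": ["vegetarian", "vegetarian", "non vegetarian", "vegetarian"],
    "prep_time": [45, 80, 20, 5],
    "cooking_time": [25, 30, 30, 20],
    "flavor_profile": ["sweet", "sweet", "spicy", "spicy"],
    "course": ["dessert", "dessert", "starter", "main course"],
})

# Names of items whose course is dessert.
desserts = df.loc[df["course"] == "dessert", "name"]
print(desserts)

# Count of items with a sweet flavor profile.
sweet_count = (df["flavor_profile"] == "sweet").sum()
print(sweet_count)

# Names of items that cook faster than they take to prepare.
quick_cook = df.loc[df["cooking_time"] < df["prep_time"], "name"]
print(quick_cook)

# Summary statistics of the time columns, grouped by diet.
by_diet = df.groupby("diet")[["prep_time", "cooking_time"]].describe()
print(by_diet)

# Average cooking and preparation time for vegetarian items.
veg_means = df.loc[df["diet"] == "vegetarian", ["cooking_time", "prep_time"]].mean()
print(veg_means)
```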

S.V.I.T 210410107130

Insert a new column with column name as total_time which contains sum of cooking_time &
prep_time into existing dataframe.
Print name,cooking_time,prep_time,total_time of items with total_time >=500.
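A short sketch of the derived column and the filter, on hypothetical rows (the names and times are made up for illustration):

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "name": ["Item A", "Item B", "Item C"],
    "prep_time": [500, 10, 80],
    "cooking_time": [120, 720, 30],
})

# New column: element-wise sum of the two time columns.
df["total_time"] = df["cooking_time"] + df["prep_time"]

# Items whose total time is at least 500.
long_items = df.loc[
    df["total_time"] >= 500,
    ["name", "cooking_time", "prep_time", "total_time"],
]
print(long_items)
```

If either time is NaN, the sum is NaN as well and the row is excluded by the comparison.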
Print the count of items for each flavor_profile per region.
# e.g.
# region   flavor_profile
# Central  spicy              2
#          sweet              1
# East     spicy              5
#          sweet             20
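Output of this shape can be produced by grouping on both columns and taking the group sizes. The rows below are hypothetical, so the counts differ from the example above.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "region": ["Central", "Central", "Central", "East", "East", "East"],
    "flavor_profile": ["spicy", "spicy", "sweet", "spicy", "sweet", "sweet"],
})

# Number of rows in each (region, flavor_profile) group.
counts = df.groupby(["region", "flavor_profile"]).size()
print(counts)
```

If the two columns are categorical, passing observed=True to groupby restricts the result to combinations that actually occur in the data.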

Find & print records with missing data in the state column.
Fill missing data in the state column with "-".
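A sketch with isna and fillna on hypothetical rows, assuming the state column is a plain string column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; NaN marks a missing state.
df = pd.DataFrame({
    "name": ["Item A", "Item B", "Item C", "Item D"],
    "state": ["Punjab", np.nan, "Goa", np.nan],
})

# Records whose state is missing.
missing_state = df[df["state"].isna()]
print(missing_state)

# Replace the missing states with "-".
df["state"] = df["state"].fillna("-")
print(df)
```

If state has already been converted to a categorical column, "-" has to be added as a category first, for example with df["state"].cat.add_categories("-"), before fillna will accept it.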

Write regular expression,

To extract phone numbers (+dd-ddd-dddd) from the following text
“Hey my number is +01-555-1212 & his number is +01-770-1410”
To extract email addresses from the following text.
“You can contact to [email protected] or to [email protected]”.
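A possible pair of patterns using the re module is sketched below. The phone pattern follows the digit grouping of the sample numbers (two, three, then four digits). The email addresses in the text above are redacted, so the addresses in the sketch are hypothetical placeholders, and the email pattern is a deliberately simple one rather than a full validator.

```python
import re

phone_text = "Hey my number is +01-555-1212 & his number is +01-770-1410"

# Literal "+", two digits, "-", three digits, "-", four digits.
phone_pattern = re.compile(r"\+\d{2}-\d{3}-\d{4}")
phones = phone_pattern.findall(phone_text)
print(phones)

# Hypothetical addresses; the originals are not recoverable from the text.
email_text = "You can contact to alice@example.com or to bob.smith@example.org"

# Local part, "@", domain labels, a dot, and a top-level domain of 2+ letters.
email_pattern = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
emails = email_pattern.findall(email_text)
print(emails)
```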

Demonstrate stemming & stop word removal using nltk library for content given below.

[“Most of the world will make decisions by either guessing or using their gut. They will be
either lucky or wrong.”,
“The goal is to turn data into information and information into insight.”]
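A minimal sketch with nltk's PorterStemmer and its English stop word corpus follows. Two choices here are assumptions made so the sketch runs without extra downloads: tokens are split with a simple regular expression instead of nltk's tokenizer, and a small hand-written stop word set (FALLBACK_STOP_WORDS, a hypothetical name) is used if the stopwords corpus has not been installed with nltk.download("stopwords").

```python
import re

from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

texts = [
    "Most of the world will make decisions by either guessing or using "
    "their gut. They will be either lucky or wrong.",
    "The goal is to turn data into information and information into insight.",
]

# Small fallback list, used only when the nltk corpus is unavailable.
FALLBACK_STOP_WORDS = {
    "the", "of", "will", "by", "or", "their", "they", "be",
    "is", "to", "and", "into", "most",
}

try:
    stop_words = set(stopwords.words("english"))
except LookupError:
    stop_words = FALLBACK_STOP_WORDS

stemmer = PorterStemmer()

# Lowercase, split into alphabetic tokens, then drop stop words.
filtered = [
    [tok for tok in re.findall(r"[a-z]+", text.lower()) if tok not in stop_words]
    for text in texts
]

# Reduce each remaining token to its stem.
stemmed = [[stemmer.stem(tok) for tok in tokens] for tokens in filtered]

for tokens, stems in zip(filtered, stemmed):
    print(tokens)
    print(stems)
```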

Using the 20 newsgroups dataset, create and demonstrate a bag of words model. Also convert the raw
newsgroup documents into a matrix of TF-IDF features.
