
Web Scraping (Requests)

Group 7: No Name

Group Members:

Name	Matric
Madina Suraya binti Zharin	A20EC0203
Nur Izzah Mardhiah binti Rashidi	A20EC0116
Tan Yong Sheng	A20EC0157
Chloe Racquelmae Kennedy	A20EC0026

About Requests

With the Requests library, we can fetch the content at a given URL. Requests is a good starting point for anyone new to web scraping, and it works just as well when the site offers an API. It sends a GET request to the web server, which responds with the HTML content of the requested page.

It is easy to understand and takes little practice to learn.
It builds query strings for you, so you rarely need to write them into your URLs by hand.
It supports authentication and handles cookies and sessions reliably.
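The points above can be illustrated with a minimal sketch. The URL and query parameters are placeholders and the helper names `build_url` and `fetch_html` are hypothetical; none of them come from our notebook. The first helper only prepares a request to show how Requests encodes the query string, and the second sends the actual GET request.

```python
import requests

# Placeholder URL and parameters for illustration only (not the page we scraped).
BASE_URL = "https://example.com/sale/sneakers"
QUERY = {"page": 1, "sort": "price"}


def build_url(base_url, params):
    """Let Requests encode the query string instead of writing it by hand."""
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url


def fetch_html(base_url, params=None, timeout=10):
    """Send a GET request and return the HTML of the page as text."""
    response = requests.get(base_url, params=params, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text
```

Calling `fetch_html(BASE_URL, QUERY)` would download the page. For several requests that should share cookies, a `requests.Session()` object can be used in place of the module-level `requests.get`.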

Purpose

However, Requests alone is not enough for web scraping: it only downloads the page, so we also need a library that can parse the HTML. In this notebook, we use the Beautiful Soup library to parse the downloaded document and extract the text from its div tags.
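As a rough sketch of the parsing step, the example below runs Beautiful Soup on a small inline HTML string. The markup, the class names (`product-tile`, `product-name`, `price-new`, `price-old`) and the product values are invented for illustration; the real page uses its own structure, which has to be inspected in the browser first.

```python
from bs4 import BeautifulSoup

# Made-up markup: class names and values are placeholders, not the real page.
SAMPLE_HTML = """
<div class="product-tile">
  <h3 class="product-name">Sneaker A</h3>
  <span class="price-new">RM 199.00</span>
  <span class="price-old">RM 299.00</span>
</div>
<div class="product-tile">
  <h3 class="product-name">Sneaker B</h3>
  <span class="price-new">RM 249.00</span>
</div>
"""


def text_or_none(tile, tag, css_class):
    """Return the stripped text of the first matching tag, or None if absent."""
    node = tile.find(tag, class_=css_class)
    return node.get_text(strip=True) if node else None


def parse_products(html):
    """Collect one dictionary per product div found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tile in soup.find_all("div", class_="product-tile"):
        rows.append({
            "Product Name": text_or_none(tile, "h3", "product-name"),
            "Price New": text_or_none(tile, "span", "price-new"),
            "Price Old": text_or_none(tile, "span", "price-old"),
        })
    return rows
```

Returning `None` for a missing tag keeps incomplete products visible as null values, so they can be handled later during data cleaning instead of crashing the scraper.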

We chose the Puma website because it is the Chinese New Year (CNY) season and the store is running a sale. We therefore wanted to see whether there is any interesting data (Product Name, Price New, Price Old) on their sneakers.

Results

We extracted 36 items. However, some of them are duplicates and some contain null values.

Product Name
Price New = price after CNY sale discount
Price Old = the original price without any discount

We then performed some data cleaning before storing the data in a CSV file, puma_sneakers_women_sale.csv, which we also uploaded to this folder.
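The cleaning step could look roughly like the sketch below. It is an assumption about what "data cleaning" involves here (dropping rows with a missing field, then dropping exact duplicates) and it uses only the Python standard library; the notebook's actual steps may differ. The helper names `clean_rows` and `write_csv` are hypothetical.

```python
import csv

FIELDS = ["Product Name", "Price New", "Price Old"]


def clean_rows(rows):
    """Drop rows with any missing field, then drop exact duplicates (keep first)."""
    seen = set()
    cleaned = []
    for row in rows:
        values = tuple(row.get(field) for field in FIELDS)
        # Treat None and empty strings as null values.
        if any(v is None or str(v).strip() == "" for v in values):
            continue
        if values in seen:
            continue
        seen.add(values)
        cleaned.append(dict(zip(FIELDS, values)))
    return cleaned


def write_csv(rows, handle):
    """Write the cleaned rows, with a header line, to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

To save the result, open the output file with `open("puma_sneakers_women_sale.csv", "w", newline="", encoding="utf-8")` and pass the handle to `write_csv`.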
