WEB SCRAPING
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
BY
B.GANESH-19VE1A05C6
(2019-2023)
UNDER THE GUIDANCE OF
SRILATHA PULI
ASSISTANT PROFESSOR
INDEX
INTRODUCTION
HARDWARE AND SOFTWARE REQUIREMENTS
ADVANTAGES
IMPLEMENTATION
TECHNOLOGIES
DISADVANTAGES
CONCLUSION
REFERENCES
FUTURE SCOPE
INTRODUCTION
What is Web Scraping?
Web scraping is an automatic method to obtain large amounts of data from
websites. Most of this data is unstructured data in an HTML format which is
then converted into structured data in a spreadsheet or a database so that it can
be used in various applications.
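The conversion from HTML to structured data can be sketched as follows. This is a minimal illustration using only the Python standard library; the sample table, the class name TableParser and the function name html_table_to_csv are assumptions made for this example and do not come from any particular scraping tool.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (for example whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the rows of an HTML table as CSV text (spreadsheet-ready)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample page content, used only for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Price</th></tr>
  <tr><td>Pen</td><td>10</td></tr>
  <tr><td>Notebook</td><td>45</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

The CSV text can be saved to a file and opened in a spreadsheet, or loaded into a database table.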
INTRODUCTION
Types of Web Scrapers:
• Browser Extension Web Scrapers
• Software Web Scrapers
• Cloud Web Scrapers
• Local Web Scrapers
HARDWARE AND SOFTWARE REQUIREMENTS
• Data discovery.
• Data extraction.
• Extraction scale.
• Data output.
ADVANTAGES
• Time efficient.
• Complete automation.
• Cost efficiency is one of the common advantages of data scraping.
• Does not impact user experience.
• Data accuracy.
DISADVANTAGES
• Outdated information.
• Problems with automation.
• Rare disadvantages of data scraping are speed and protection policies.
IMPLEMENTATION
• Web Scraping has multiple applications across various industries.
Let’s check out some of these now!
1. Price Monitoring
Companies can use web scraping to scrape product data for their own products and
for competing products, and see how it affects their pricing strategies. They can
use this data to set the optimal price for their products and obtain maximum
revenue.
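As a rough sketch of price monitoring, the snippet below compares our own price with competitor prices after they have been scraped. The price strings, the shop names and the helper names parse_price and price_report are hypothetical and chosen only for this example; a real project would also have to handle different currencies and price formats.

```python
import re


def parse_price(text):
    """Turn a scraped price string such as 'Rs. 1,299.00' into a float."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no price found in {text!r}")
    return float(match.group().replace(",", ""))


def price_report(own_price, competitor_prices):
    """Compare our price with competitor prices (all in the same currency)."""
    own = parse_price(own_price)
    prices = [parse_price(p) for p in competitor_prices.values()]
    lowest = min(prices)
    return {
        "own": own,
        "lowest_competitor": lowest,
        "average_competitor": sum(prices) / len(prices),
        "gap_to_lowest": own - lowest,  # positive: we are more expensive
    }


# Hypothetical scraped values, used only for illustration.
report = price_report(
    "Rs. 1,299.00",
    {"shop_a": "Rs. 1,249", "shop_b": "Rs. 1,350.50"},
)
print(report)
```

A positive gap_to_lowest means at least one competitor sells the product for less, which is the kind of signal a pricing strategy can react to.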
2. Market Research
Companies can use web scraping for market research. High-quality web scraped data
obtained in large volumes can be very helpful for companies in analyzing consumer
trends and understanding which direction the company should move in the future.
3. News Monitoring
Web scraping news sites can provide detailed reports on the current news to a
company. This is even more essential for companies that are frequently in the news
or that depend on daily news for their day-to-day functioning. After all, news
reports can make or break a company in a single day!
4. Email Marketing
Companies can also use web scraping for email marketing. They can collect email
IDs from various sites using web scraping and then send bulk promotional and
marketing emails to the people who own these email IDs.
5. Sentiment Analysis
If companies want to understand the general sentiment about their products among
their consumers, then Sentiment Analysis is a must. Companies can use web scraping
to collect data from social media websites such as Facebook and Twitter and find
out what the general sentiment about their products is. This helps them create
products that people desire and move ahead of their competition.
TECHNOLOGIES
Generally, web scraping involves three steps:
1. We send a GET request to the server and receive a response in the form of
web content.
2. We parse the HTML code of the website following a tree structure path.
3. We use a Python library to search the parse tree.
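The three steps above can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: the function names fetch_html and extract_links, the class name LinkCollector and the User-Agent string are made up for this example. Note also that html.parser reports tags one by one as events instead of building a full parse tree, so it stands in here for a tree-building library such as Beautiful Soup.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Step 1: send a GET request and return the response body as text."""
    # The User-Agent value is a placeholder chosen for this sketch.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkCollector(HTMLParser):
    """Steps 2 and 3: walk the HTML tags and keep the ones we search for."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # We search for <a> tags and keep the value of their href attribute.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Parse the HTML text and return every link target found in it."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Hypothetical page content, so the sketch runs without network access.
SAMPLE_PAGE = '<p><a href="/about">About</a> <a href="/contact">Contact</a></p>'
print(extract_links(SAMPLE_PAGE))
```

On a live site the two parts are chained, for example extract_links(fetch_html("https://example.com")).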
• I know what you think: web scraping looks good on paper but is actually more
complex in practice. We need to write code to get the data we want, which
limits it to people who are skilled at programming. As an alternative, there
are web scraping tools that automate web data extraction.
• A web scraping tool loads the URLs given by the user and renders the entire
website. As a result, you can extract web data with simple point-and-click
and save it to your computer in a usable format, without coding.
• For example, you might want to extract posts and comments from Twitter. All
you have to do is paste the URL into the scraper, select the desired posts and
comments, and execute. This saves the time and effort of mundane
copy-and-paste work.
DISADVANTAGES
• Data Analysis of Data Retrieved through Scraping the Web
• Difficult to Analyze
• Speed and Protection Policies
CONCLUSION
• Don’t break the web: Denial of Service attacks
• Don’t steal: Copyright and fair use
• Better be safe than sorry
• Be nice: ask and share
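One way to follow the "don't break the web" rule is to respect a site's robots.txt file and to pause between requests. The sketch below uses urllib.robotparser from the Python standard library. The robots.txt content, the URLs, the user agent name and the helper names build_rules, allowed_urls and crawl_politely are assumptions made for this example.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a rule set that can be queried."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def allowed_urls(rules, urls, user_agent="example-scraper"):
    """Keep only the URLs that robots.txt allows for this user agent."""
    return [url for url in urls if rules.can_fetch(user_agent, url)]


def crawl_politely(rules, urls, handle, user_agent="example-scraper",
                   default_delay=1.0, sleep=time.sleep):
    """Call handle(url) for each allowed URL, pausing after each one."""
    # Use the site's Crawl-delay if it declares one, else a default pause.
    delay = rules.crawl_delay(user_agent) or default_delay
    for url in allowed_urls(rules, urls, user_agent):
        handle(url)
        sleep(delay)


rules = build_rules(ROBOTS_TXT)
urls = [
    "https://example.com/products",
    "https://example.com/private/data",
]
print(allowed_urls(rules, urls))
```

In a real scraper, handle would download and parse the page, and the robots.txt text would come from the site itself instead of a string in the program.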
REFERENCES
• https://en.wikipedia.org/wiki/Web_scraping
• https://schoolofdata.org/handbook/courses/scraping/
• https://doc.scrapy.org/en/latest/
• https://www.import.io/
• https://librarycarpentry.org/
• https://datacarpentry.org/
FUTURE SCOPE
• Marketing: As we go forward, marketing will become an even more competitive
exercise.
• Sentiment Analysis: At present, a trend has started in which sentiment
analysis plays a part in arriving at a strategy.
• Way Forward: Whether perfectly legal or not, web scraping has grown into an
essential requirement for a set of stakeholders of the Internet.
Any Questions?