Posts

Showing posts with the label scraping with python

Upvotocracy bot creation project (selenium + post api + webscraping)

Introduction to the project concept: After a long time with machine learning, I recently got interested in using web scraping again for an exciting project. One of my friends, Anthony Ettinger, has started a new website called upvotocracy, which is a reddit clone with a number of interesting changes in voting, moderation (there is none in upvotocracy), and other things. So I planned to create a bot that posts automated content to his site to increase my karma. For now, this is allowed and is not considered illegal. I planned to initially populate the FakeNews thread with news from The Onion using my bot. Later on we will focus on further progress in a similar direction. A word of caution: this was a test project, and actually posting with this process ended up spamming the site unnecessarily, which resulted in me being heavily thrashed by the whole community. So don't try this on the actual platform. The goal of writing this post is to provide a knowled...

Web scraping: things you will not know before getting your hands dirty

What is web scraping? Web scraping is an important part of data science. Data science is not only the use and analysis of already created, usable data; part of it is also searching for and creating that data. Many times you will come across a general question in the data science field for which you cannot find ready-made data in CSV or JSON. In that case, the solution is web scraping. Web scraping means scraping the web, i.e. programmatically extracting the information you need from one or more web pages. Web scraping also extends to more elaborate software called web crawlers, or spiders. Web scraping can be done in many languages. For this and the coming posts, I am going to use Python. I will show examples where the easiest approach to web scraping works and where it does not. In later posts, we will also explore somewhat more complex methods and modules for web scraping. Introduction to methods: There are broadly two methods to do web...
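To make "programmatically extracting information from a web page" concrete, here is a minimal sketch using only Python's standard library. It is not the method from the full post: the `HeadlineParser` class, the `extract_headlines` helper, and the `SAMPLE_HTML` string are all made up for illustration, and the page is a hard-coded string so the example runs without any network access.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Hypothetical helper: collects the text inside every <h2> tag."""

    def __init__(self):
        super().__init__()
        self._in_h2 = False
        self.headlines = []

    def handle_starttag(self, tag, attrs):
        # Start a new, empty headline when an <h2> opens.
        if tag == "h2":
            self._in_h2 = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_h2 = False

    def handle_data(self, data):
        # Text nested in child tags (e.g. <em>) is still appended
        # while we are inside an <h2>.
        if self._in_h2:
            self.headlines[-1] += data


def extract_headlines(html):
    """Return the stripped text of every <h2> in the given HTML string."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headlines]


# Assumed sample page; a real scraper would download this HTML first.
SAMPLE_HTML = """
<html><body>
<h2>First headline</h2><p>Some text</p>
<h2>Second <em>headline</em></h2>
</body></html>
"""

if __name__ == "__main__":
    for headline in extract_headlines(SAMPLE_HTML):
        print(headline)
```

In a real scraper the HTML would come from an HTTP request rather than a string, and a dedicated parsing library would likely be more convenient; the point here is only that scraping boils down to fetching markup and pulling out the pieces you care about.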