FACULTY OF INFORMATION SYSTEMS
Course:
Web Data Analysis
(3 credits)
Lecturer: Nguyen Thon Da Ph.D.
LECTURER’S INFORMATION
Chapter 10
Working with Web-Based APIs,
Beautiful Soup and Selenium
(Part 2)
Web Data Analysis :: Thon-Da Nguyen Ph.D.
MAIN CONTENTS
Using Selenium for web scraping (cont.)
Hypertext Markup Language: HTML
Using Your Browser as a Development Tool
Cascading Style Sheets: CSS
The Beautiful Soup Library
Scraping JavaScript
Python Selenium – Find Elements by XPATH
To find HTML elements by XPath (a language for locating nodes in an HTML document) using Selenium
in Python, call the find_elements() method, passing By.XPATH as the first argument and the XPath
expression as the second argument. Code: find_elements(By.XPATH, "xpath_value")
find_elements() returns all HTML elements that match the given XPath expression, as a list. If
no element in the document matches the expression, find_elements() returns
an empty list.
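The behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `find_by_xpath` and the `driver` argument (any object exposing Selenium's `find_elements()`, such as a WebDriver) are introduced here for illustration, and the fallback `By` class only mirrors the string value that Selenium's `By.XPATH` holds, so the sketch still imports where Selenium is not installed.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    # In Selenium, By.XPATH is the plain string "xpath".
    class By:
        XPATH = "xpath"


def find_by_xpath(driver, xpath):
    """Return every element matching `xpath` (hypothetical helper).

    find_elements() does not raise when nothing matches;
    it returns an empty list, so the result can be tested directly.
    """
    return driver.find_elements(By.XPATH, xpath)
```

A typical call might be `paragraphs = find_by_xpath(driver, "//p")`, followed by `if not paragraphs:` to handle the no-match case.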
Python Selenium – Get the parent element
To get the parent element of a given element in Selenium Python, call the find_element() method on
the given element, passing By.XPATH for the by parameter and '..' for the value parameter.
If myelement is the WebElement object whose parent we want, the
call is myelement.find_element(By.XPATH, '..')
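The parent lookup above can be wrapped in a small helper. This is only a sketch: the name `get_parent` is hypothetical, `element` is assumed to be a WebElement (or any object with the same `find_element()` method), and the fallback `By` class just mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def get_parent(element):
    """Return the parent of `element` using the relative XPath step '..'."""
    # '..' is evaluated relative to `element`, not to the whole document.
    return element.find_element(By.XPATH, "..")
```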
Python Selenium - Get the child elements
To get the child elements of a given element in Selenium Python, call the find_elements() method on
the given element, passing By.XPATH for the by parameter and '*' for the value parameter.
If myelement is the WebElement object whose child
elements we want, the call is myelement.find_elements(By.XPATH, '*')
This call returns a list of WebElement objects, one per direct child.
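As a rough illustration of the child lookup, the sketch below collects the tag names of an element's direct children. The helper name `child_tag_names` is hypothetical; `tag_name` is the standard WebElement property, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def child_tag_names(element):
    """Return the tag names of the direct children of `element`."""
    # The relative XPath '*' selects child elements only, not deeper descendants.
    children = element.find_elements(By.XPATH, "*")
    return [child.tag_name for child in children]
```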
Python Selenium - Get all the sibling elements
To get all the sibling elements of a given element in Selenium Python, call the find_elements() method
on the given element, passing By.XPATH for the by parameter and
'following-sibling::* | preceding-sibling::*' for the value parameter. If myelement is the WebElement
object whose sibling elements we want, the call is
myelement.find_elements(By.XPATH, "following-sibling::* | preceding-sibling::*")
This call returns a list of WebElement objects containing the sibling elements.
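The sibling lookup above could be packaged like this. It is a sketch rather than a definitive implementation: `get_siblings` is a hypothetical name, `element` is assumed to be a WebElement, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"

# The union operator '|' combines the siblings after and before the element.
SIBLINGS_XPATH = "following-sibling::* | preceding-sibling::*"


def get_siblings(element):
    """Return all sibling elements of `element` (the element itself is excluded)."""
    return element.find_elements(By.XPATH, SIBLINGS_XPATH)
```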
Python Selenium - Get the next sibling element
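One common way to reach the next sibling is the relative XPath `following-sibling::*[1]`, which keeps only the first element after the current one. The sketch below assumes that expression; the helper name `get_next_sibling` is hypothetical, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def get_next_sibling(element):
    """Return the element immediately after `element`, or None if there is none."""
    # find_elements() is used so that a missing sibling gives [] instead of an exception.
    matches = element.find_elements(By.XPATH, "following-sibling::*[1]")
    return matches[0] if matches else None
```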
Python Selenium - Get the previous sibling element
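The previous sibling can be reached in a similar way with `preceding-sibling::*[1]`. Because `preceding-sibling` is a reverse axis in XPath, position 1 is the sibling closest to the current element. The helper name `get_previous_sibling` is hypothetical, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def get_previous_sibling(element):
    """Return the element immediately before `element`, or None if there is none."""
    # [1] on a reverse axis selects the nearest preceding sibling.
    matches = element.find_elements(By.XPATH, "preceding-sibling::*[1]")
    return matches[0] if matches else None
```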
Python Selenium - Get all the next sibling elements
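Dropping the position filter gives every sibling that follows the element: `following-sibling::*`. A minimal sketch under that assumption; `get_next_siblings` is a hypothetical name, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def get_next_siblings(element):
    """Return all sibling elements that come after `element` (possibly an empty list)."""
    return element.find_elements(By.XPATH, "following-sibling::*")
```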
Python Selenium - Get all the previous sibling elements
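Likewise, `preceding-sibling::*` selects every sibling that comes before the element. A minimal sketch under that assumption; `get_previous_siblings` is a hypothetical name, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def get_previous_siblings(element):
    """Return all sibling elements that come before `element` (possibly an empty list)."""
    return element.find_elements(By.XPATH, "preceding-sibling::*")
```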
Python Selenium - XPath for parent element
If you already have a web element myelement and want to get its parent element
using XPath, use the following code: myelement.find_element(By.XPATH, "..")
Python Selenium - XPath for all child elements
Python Selenium - XPath for all sibling elements
Python Selenium - XPath for the next immediate sibling element
Python Selenium - XPath for all the next following sibling elements
Python Selenium - XPath for the previous sibling element
Python Selenium - XPath for all the previous sibling elements
Python Selenium - XPath for all the next sibling elements (using class)
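One possible way to restrict the following siblings to a given class is to add a predicate on `@class`. The sketch below is an assumption, not the only form: the helper name `next_siblings_with_class` and the `class_name` parameter are hypothetical, the class name is assumed to contain no quote characters, and the fallback `By` class only mirrors the string held by Selenium's `By.XPATH`.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:
    # Assumption: stand-in used only when Selenium is not installed.
    class By:
        XPATH = "xpath"


def next_siblings_with_class(element, class_name):
    """Return the following siblings of `element` whose class list contains `class_name`."""
    # Padding @class with spaces matches whole class names only,
    # so "item" does not match "item-large".
    xpath = (
        "following-sibling::*"
        f"[contains(concat(' ', normalize-space(@class), ' '), ' {class_name} ')]"
    )
    return element.find_elements(By.XPATH, xpath)
```

When an element is known to carry exactly one class, the simpler predicate `following-sibling::*[@class='name']` may be enough.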