This document covers Chapter 10 of the course "Web Data Analysis", which deals with using Selenium for web scraping and for interacting with web pages programmatically. It shows how to use Selenium to find elements by XPath and to get an element's parent, children, siblings, next sibling and previous sibling. Code examples demonstrate how to locate elements and their relatives using the Selenium Python API and XPath queries. The chapter also introduces HTML, CSS and the Beautiful Soup library for web scraping.
Slide10 Part2
FACULTY OF INFORMATION SYSTEMS
Course: Web Data Analysis (3 credits)
Lecturer: Nguyen Thon Da Ph.D.
LECTURER’S INFORMATION
Chapter 10 Working with Web-Based APIs, Beautiful Soup and Selenium (Part 2)
Web Data Analysis :: Thon-Da Nguyen Ph.D.
MAIN CONTENTS
- Using Selenium for web scraping (cont.)
- Hypertext Markup Language: HTML
- Using Your Browser as a Development Tool
- Cascading Style Sheets: CSS
- The Beautiful Soup Library
- Scraping JavaScript
Python Selenium – Find Elements by XPATH
To find HTML elements by XPath (a query language for locating nodes in an HTML document) using Selenium in Python, call the find_elements() method and pass By.XPATH as the first argument and the XPath value as the second argument.
Code: find_elements(By.XPATH, "xpath_value")
The find_elements() method returns all the HTML elements that satisfy the given XPath value, as a list. If no element in the document matches the given XPath value, find_elements() returns an empty list.
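The call described above can be wrapped in a small helper. This is only a sketch: texts_by_xpath is a hypothetical name, driver is assumed to be a Selenium WebDriver that has already loaded a page, and the try/except fallback for By is an addition so the helper can be loaded on a machine without Selenium installed.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def texts_by_xpath(driver, xpath):
    """Return the visible text of every element matching `xpath`.

    `driver` is assumed to be a Selenium WebDriver (a WebElement works too).
    The result is an empty list when nothing matches, because
    find_elements() itself returns an empty list in that case.
    """
    elements = driver.find_elements(By.XPATH, xpath)
    return [element.text for element in elements]
```

With a live browser session, a call such as texts_by_xpath(driver, "//p") would collect the text of every paragraph on the current page.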
Python Selenium – Get the parent element
To get the parent element of a given element in Selenium Python, call the find_element() method on the given element and pass By.XPATH for the by parameter and '..' for the value parameter. If myelement is the WebElement object whose parent we would like to find, the call is:
myelement.find_element(By.XPATH, '..')
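A minimal sketch of the parent lookup as a reusable function. get_parent is a hypothetical helper name, element is assumed to be a WebElement obtained earlier, and the fallback By class is an addition that only matters when Selenium is not installed.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def get_parent(element):
    """Return the parent of `element`.

    '..' is the XPath abbreviation for the parent of the context node,
    and calling find_element() on a WebElement evaluates the XPath
    relative to that element.
    """
    return element.find_element(By.XPATH, "..")
```

Because the result is itself a WebElement, the helper can be applied repeatedly to walk further up the tree, for example get_parent(get_parent(myelement)) for the grandparent.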
Python Selenium - Get the child elements
To get the child elements of a given element in Selenium Python, call the find_elements() method on the given element and pass By.XPATH for the by parameter and '*' for the value parameter. If myelement is the WebElement object whose child elements we would like to find, the call is:
myelement.find_elements(By.XPATH, '*')
The above method call returns a list of WebElement objects.
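As an illustration, the child lookup can be used to list the tag names of an element's direct children. child_tag_names is a hypothetical helper, element is assumed to be a WebElement, and the fallback By class is only there so the sketch loads without Selenium.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def child_tag_names(element):
    """Return the tag names of the direct children of `element`.

    The relative XPath '*' selects child elements only, not deeper
    descendants; './/*' would select every descendant instead.
    """
    children = element.find_elements(By.XPATH, "*")
    return [child.tag_name for child in children]
```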
Python Selenium - Get all the sibling elements
To get all the sibling elements of a given element in Selenium Python, call the find_elements() method on the given element and pass By.XPATH for the by parameter and 'following-sibling::* | preceding-sibling::*' for the value parameter. If myelement is the WebElement object whose sibling elements we would like to find, the call is:
myelement.find_elements(By.XPATH, "following-sibling::* | preceding-sibling::*")
The above method call returns a list of WebElement objects containing the sibling elements.
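The union expression can be kept in one place and reused. In this sketch, SIBLINGS_XPATH and get_siblings are hypothetical names, element is assumed to be a WebElement, and the fallback By class is an addition for machines without Selenium.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"

# '|' is the XPath union operator: siblings after the element
# combined with siblings before it. The element itself is excluded.
SIBLINGS_XPATH = "following-sibling::* | preceding-sibling::*"


def get_siblings(element):
    """Return every sibling element of `element` as a list."""
    return element.find_elements(By.XPATH, SIBLINGS_XPATH)
```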
Python Selenium - Get the next sibling element
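One way to get the next sibling, sketched under assumptions: the XPath following-sibling::*[1] selects the nearest sibling after the element. get_next_sibling is a hypothetical helper, element is assumed to be a WebElement, and the fallback By class only matters when Selenium is not installed. The helper uses find_elements() so that a missing sibling yields None; find_element() would raise NoSuchElementException instead.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def get_next_sibling(element):
    """Return the sibling immediately after `element`, or None if it is the last."""
    # [1] keeps only the first node on the following-sibling axis,
    # so the list holds at most one element.
    matches = element.find_elements(By.XPATH, "following-sibling::*[1]")
    return matches[0] if matches else None
```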
Python Selenium - Get the previous sibling element
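A comparable sketch for the previous sibling, assuming element is a WebElement. On the reverse axis preceding-sibling, positions are counted backwards from the element, so [1] is the sibling immediately before it. get_previous_sibling is a hypothetical name and the fallback By class is an addition for machines without Selenium.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def get_previous_sibling(element):
    """Return the sibling immediately before `element`, or None if it is the first."""
    # preceding-sibling is a reverse axis: [1] means the closest
    # preceding sibling, not the first child of the parent.
    matches = element.find_elements(By.XPATH, "preceding-sibling::*[1]")
    return matches[0] if matches else None
```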
Python Selenium - Get all the next sibling elements
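Dropping the position predicate returns every sibling that comes after the element. A minimal sketch, assuming element is a WebElement; get_next_siblings is a hypothetical helper and the fallback By class is only there so the code loads without Selenium.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def get_next_siblings(element):
    """Return all siblings that follow `element` (an empty list if there are none)."""
    return element.find_elements(By.XPATH, "following-sibling::*")
```

Replacing * with a tag name, as in following-sibling::li, would restrict the result to following siblings with that tag.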
Python Selenium - Get all the previous sibling elements
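The mirror image uses the preceding-sibling axis without a position predicate. Again a sketch under assumptions: element is a WebElement, get_previous_siblings is a hypothetical helper, and the fallback By class is an addition for machines without Selenium.

```python
try:
    from selenium.webdriver.common.by import By
except ImportError:  # Selenium not installed: mirror the one constant used below
    class By:
        XPATH = "xpath"


def get_previous_siblings(element):
    """Return all siblings that precede `element` (an empty list if there are none)."""
    return element.find_elements(By.XPATH, "preceding-sibling::*")
```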
Python Selenium - XPath for parent element
If you already have a web element myelement and want to get its parent element using XPath, use the following code:
myelement.find_element(By.XPATH, "..")
Python Selenium - XPath for all child elements
Python Selenium - XPath for all sibling elements
Python Selenium - XPath for the next immediate sibling element
Python Selenium - XPath for all the next following sibling elements
Python Selenium - XPath for the previous sibling element
Python Selenium - XPath for all the previous sibling elements
Python Selenium - XPath for all the next sibling elements (using class)