You can use Selenium to scrape data from specific elements of a web page. Let’s take the same example from our previous post: How to web scrape with Python Selenium?
There, we used this Python code (with Selenium) to give the content time to load by adding a fixed waiting time:
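from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
import time
options = Options()
options.add_argument("--headless") # Run Chrome without a visible window
# Selenium 4 takes the driver path through a Service object instead of executable_path
driver = webdriver.Chrome(service=Service("PATH_TO_CHROMEDRIVER"), options=options) # Setting up the Chrome driver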
driver.get("https://demo.scrapingbee.com/content_loads_after_5s.html")
time.sleep(6) # Sleep for 6 seconds
print(driver.page_source)
driver.quit()
And we got this result:
...
This is content
...
Now, we can further improve our code to extract only the content itself instead of printing the whole HTML source. The page still loads in full; we just select the element we care about and read its text. To do that, we can run this code:
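from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
import time
options = Options()
options.add_argument("--headless") # Run Chrome without a visible window
# Selenium 4 takes the driver path through a Service object instead of executable_path
driver = webdriver.Chrome(service=Service("PATH_TO_CHROMEDRIVER"), options=options) # Setting up the Chrome driver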
driver.get("https://demo.scrapingbee.com/content_loads_after_5s.html")
time.sleep(6) # Sleep for 6 seconds
element = driver.find_element(By.ID, 'content')
print(element.text)
driver.quit()
This time the result is just the element’s text, This is content, instead of the page’s HTML code.
For more information about Python & Selenium, make sure to check this thorough blog article: Web Scraping using Selenium and Python
