Selenium-python · How-To · Beginner · 4 min read

How to Scrape Data Using Selenium: Simple Guide with Example

To scrape data using Selenium, you first set up a WebDriver to open a browser, then use the find_element or find_elements method with a locator such as By.CSS_SELECTOR to find the elements that hold your data. Finally, extract the text or attributes from those elements to get the data you want.

📐 Syntax

Here is the basic syntax to scrape data using Selenium:

  • driver = webdriver.Chrome(): Starts the Chrome browser.
  • driver.get(url): Opens the webpage at the given URL.
  • element = driver.find_element(By.CSS_SELECTOR, 'selector'): Finds a single element using a CSS selector.
  • elements = driver.find_elements(By.CSS_SELECTOR, 'selector'): Finds multiple elements matching the selector.
  • text = element.text: Gets the visible text inside the element.
  • driver.quit(): Closes the browser when done.
```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Start browser
driver = webdriver.Chrome()

# Open webpage
driver.get('https://example.com')

# Find element
element = driver.find_element(By.CSS_SELECTOR, 'h1')

# Get text
text = element.text

# Close browser
driver.quit()
```
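The plural find_elements from the list above returns every match at once, which is how repeated items (links, rows, paragraphs) are scraped. A minimal sketch, with the URL and the 'p' selector as placeholders; the text extraction is split into a small helper, and the function that launches real Chrome is left as a commented usage line:

```python
def visible_texts(elements):
    """Collect the visible text of each element, skipping empty strings."""
    return [el.text for el in elements if el.text]

def scrape_paragraphs(url='https://example.com'):
    """Open url in Chrome and return the text of every <p> element."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get(url)
    # find_elements returns a (possibly empty) list and never raises
    # NoSuchElementException, so zero matches is safe to handle
    texts = visible_texts(driver.find_elements(By.CSS_SELECTOR, 'p'))
    driver.quit()
    return texts

# Usage (requires Chrome + chromedriver):
# print(scrape_paragraphs())
```

Note that the texts are read before driver.quit(): once the browser is closed, the element handles are no longer usable.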

💻 Example

This example opens the example.com homepage, finds the main heading <h1>, and prints its text content.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Initialize Chrome WebDriver
driver = webdriver.Chrome()

# Open the webpage
driver.get('https://example.com')

# Locate the main heading element
heading = driver.find_element(By.CSS_SELECTOR, 'h1')

# Print the text inside the heading
print('Heading text:', heading.text)

# Close the browser
driver.quit()
```
Output

```
Heading text: Example Domain
```
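Besides .text, you can read any HTML attribute with get_attribute(); for instance, the example.com page also contains a "More information..." link whose href can be scraped the same way. A minimal sketch using the same structure as before — the pairing logic is a browser-independent helper:

```python
def link_pairs(anchors):
    """Turn anchor elements into (visible text, href) tuples."""
    return [(a.text, a.get_attribute('href')) for a in anchors]

def scrape_links(url='https://example.com'):
    """Open url in Chrome and return (text, href) for every <a> element."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get(url)
    # get_attribute reads any HTML attribute, not just the visible text
    pairs = link_pairs(driver.find_elements(By.CSS_SELECTOR, 'a'))
    driver.quit()
    return pairs

# Usage (requires Chrome + chromedriver):
# print(scrape_links())
```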

⚠️ Common Pitfalls

  • Not waiting for elements: Pages may load slowly, so elements might not be ready. Use explicit waits like WebDriverWait to wait for elements.
  • Wrong locators: Using unstable locators like absolute XPaths can break your scraper. Prefer CSS selectors or stable attributes.
  • Not closing browser: Forgetting driver.quit() can leave browser processes running.
  • Ignoring page navigation: If scraping multiple pages, ensure navigation completes before scraping.
```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Initialize Chrome WebDriver
driver = webdriver.Chrome()

# Open the page before looking for elements
driver.get('https://example.com')

# Wrong way: find the element immediately, without waiting
# element = driver.find_element(By.CSS_SELECTOR, 'div.content')  # May fail if not loaded yet

# Right way: wait up to 10 seconds until the element is present
# ('div.content' is a placeholder selector for your target page)
wait = WebDriverWait(driver, 10)
element = wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'div.content')))

# Close browser
driver.quit()
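The "not closing browser" pitfall above is easiest to avoid with try/finally, which guarantees driver.quit() runs even when a locator fails mid-scrape. A minimal sketch; the driver, locator strategy, and selector are passed in so the same function works for any page:

```python
def scrape_first_text(driver, url, by, selector):
    """Open url, return the first matching element's text, and always quit."""
    try:
        driver.get(url)
        return driver.find_element(by, selector).text
    finally:
        # Runs even if get() or find_element() raises, so no browser
        # process is left behind
        driver.quit()

# Usage (requires Chrome + chromedriver):
# from selenium import webdriver
# from selenium.webdriver.common.by import By
# print(scrape_first_text(webdriver.Chrome(), 'https://example.com',
#                         By.CSS_SELECTOR, 'h1'))
```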

📊 Quick Reference

Remember these key points when scraping with Selenium:

  • Use By.CSS_SELECTOR or By.XPATH to locate elements.
  • Use element.text to get visible text.
  • Use explicit waits (WebDriverWait) to handle dynamic pages.
  • Always close the browser with driver.quit().
  • Keep locators simple and stable for reliable scraping.

Key Takeaways

  • Set up Selenium WebDriver and open the target webpage before scraping.
  • Use stable locators like CSS selectors and explicit waits to find elements reliably.
  • Extract data using element properties like .text or .get_attribute().
  • Always close the browser with driver.quit() to free resources.
  • Handle dynamic content by waiting for elements to load before scraping.