How to Get Page Source in Selenium: Syntax and Example

To get the page source in Selenium, use the driver.page_source property, which returns the HTML of the page currently loaded in the browser as a string. You can then inspect, search, or save that HTML for testing or debugging.

Syntax

The syntax is a single attribute access: driver.page_source, where driver is your WebDriver instance. Because page_source is a property and not a method, it takes no parentheses.

  • driver: Your Selenium WebDriver object controlling the browser.
  • page_source: Property that fetches the HTML source code of the loaded page.
python
page_html = driver.page_source

Example

This example shows how to open a webpage using Selenium, get its page source, and print the first 500 characters of the HTML content.

python
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

# Setup Chrome options
options = Options()
options.add_argument('--headless')  # Run browser in headless mode

# Setup Chrome driver service (adjust path to your chromedriver)
service = Service(executable_path='./chromedriver')

# Create WebDriver instance
with webdriver.Chrome(service=service, options=options) as driver:
    driver.get('https://example.com')
    page_html = driver.page_source
    print(page_html[:500])  # Print first 500 chars of page source
Output
<!doctype html>
<html>
<head>
  <title>Example Domain</title>
  <meta charset="utf-8" />
  <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1" />
  <style type="text/css">
  body {
    background-color: #f0f0f2;
    margin: 40px;
    font-family: "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
  }
  </style>
</head>
<body>
<div>
  <h1>Example Domain</h1>
  <p>This domain is for use in illustrative examples in documents.</p>
  <p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
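
Since page_source is an ordinary string, saving it for later inspection needs only the standard library. The helper below is a minimal sketch: save_page_source and the snapshots/example.html path are illustrative names, not part of Selenium, and the function assumes only that the object passed in exposes a page_source attribute.

```python
from pathlib import Path


def save_page_source(driver, path):
    """Write driver.page_source to `path` as UTF-8 and return its length in characters."""
    html = driver.page_source                          # plain str with the current page's HTML
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)   # create missing folders
    target.write_text(html, encoding="utf-8")
    return len(html)


# Hypothetical usage inside the `with webdriver.Chrome(...) as driver:` block:
#     save_page_source(driver, "snapshots/example.html")
```

Writing with an explicit UTF-8 encoding avoids errors on pages that contain non-ASCII characters when the platform default encoding is something else.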

Common Pitfalls

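  • Reading too early: By default driver.get() returns once the browser reports the page as loaded, but reading page_source before calling get(), or while scripts are still rendering, can return blank or incomplete HTML.
  • Using the wrong driver instance: page_source reflects only the browser session controlled by the driver you call it on, so make sure that is the session that loaded your page.
  • Expecting dynamic content: page_source shows the HTML as it stands at the moment you read it, so content that JavaScript adds after the initial load may be missing until it has rendered.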
python
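from selenium import webdriver

driver = webdriver.Chrome()
try:
    # Wrong: reading page_source before navigating anywhere
    print(driver.page_source)  # Source of the blank start page, not your site

    # Right: navigate first, then read the source
    driver.get('https://example.com')
    print(driver.page_source)
finally:
    driver.quit()  # Always close the browser, even if an error occurs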
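
For the dynamic-content pitfall, one option is to poll page_source until an expected fragment appears. The sketch below is a hand-rolled loop that relies only on the page_source property. The names wait_for_in_source and marker, along with the timeout values, are illustrative and not part of Selenium.

```python
import time


def wait_for_in_source(driver, marker, timeout=10.0, poll_interval=0.5):
    """Return driver.page_source once it contains `marker`.

    Raises TimeoutError if the marker has not appeared after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = driver.page_source      # re-read: the property reflects the page as it is now
        if marker in html:
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{marker!r} not found in page source after {timeout}s")
        time.sleep(poll_interval)      # give the page's scripts time to run


# Hypothetical usage after driver.get(url):
#     html = wait_for_in_source(driver, 'id="results"', timeout=15)
```

Selenium's own WebDriverWait class offers the same kind of polling with ready-made conditions and is usually the better choice in real tests. The loop above is mainly meant to show why a second read of page_source can differ from the first.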

Quick Reference

Remember these tips when using driver.page_source:

  • Always navigate to the page first with driver.get(url).
  • Wait for page elements to load if needed before getting source.
  • page_source returns a string of the full HTML.
  • Use it for debugging, saving HTML, or verifying page content.
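
As one way to verify page content, the returned string can be parsed with Python's built-in html.parser module. The following is a rough sketch rather than a recommended test pattern: extract_title and _TitleParser are hypothetical helper names, and only the first title element is considered.

```python
from html.parser import HTMLParser


class _TitleParser(HTMLParser):
    """Collects the text of the first <title> element it encounters."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._parts = []
        self.title = None              # set once the first </title> is seen

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._parts).strip()

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)


def extract_title(html):
    """Return the page title found in an HTML string, or None if there is none."""
    parser = _TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


# Hypothetical usage after driver.get('https://example.com'):
#     assert extract_title(driver.page_source) == "Example Domain"
```

For a title check alone, driver.title is simpler. Parsing page_source becomes useful when the check involves markup that has no dedicated WebDriver property.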

Key Takeaways

Use driver.page_source to get the full HTML content of the current page in Selenium.
Always navigate to the page and wait for it to load before accessing page_source.
page_source returns a string containing the entire HTML, useful for debugging or validation.
Dynamic content loaded after page load may not appear immediately in page_source.
Ensure you use the correct WebDriver instance when calling page_source.