
How to Validate a URL in Python: Simple and Effective Methods

To validate a URL in Python, you can parse it with the standard library's urllib.parse module and check its components, or match it against a regular expression with the re module. The third-party validators library (installed with pip install validators) also provides a one-call check.

Syntax

Here are common ways to validate a URL in Python:

  • Using urllib.parse: Parse the URL and check if scheme and netloc parts exist.
  • Using re module: Match the URL string against a regular expression pattern.
  • Using validators library: Call validators.url(url), which returns True if the URL is valid and a falsy failure object otherwise.
python
from urllib.parse import urlparse
import re
import validators  # third-party: pip install validators

# Using urllib.parse
parsed_url = urlparse('https://example.com')
if parsed_url.scheme and parsed_url.netloc:
    print('Valid URL')
else:
    print('Invalid URL')

# Using regex (triple-quoted raw string so the pattern can contain both ' and ")
pattern = re.compile(r"""^(https?|ftp)://[\w.-]+(?:\.[\w.-]+)+[/\w\-._~:?#\[\]@!$&'"()*+,;=]+$""")
url = 'https://example.com'
if pattern.match(url):
    print('Valid URL')
else:
    print('Invalid URL')

# Using validators library
if validators.url('https://example.com'):
    print('Valid URL')
else:
    print('Invalid URL')
Output
Valid URL
Valid URL
Valid URL
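
To see why checking scheme and netloc works, it can help to look at what urlparse actually extracts. The sketch below is only an illustration; the helper name describe is hypothetical and not part of any library.

```python
from urllib.parse import urlparse

def describe(url):
    """Hypothetical helper: return the (scheme, netloc, path) that urlparse extracts."""
    parts = urlparse(url)
    return parts.scheme, parts.netloc, parts.path

# A well-formed URL, a URL with a single slash, and a bare domain
for candidate in ('https://example.com/docs', 'http:/example.com', 'example.com'):
    print(candidate, '->', describe(candidate))
```

Only the first input yields both a scheme and a netloc. With a single slash the host ends up in the path, and a bare domain is treated entirely as a path.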

Example

This example shows how to validate URLs using urllib.parse and the validators library. It checks multiple URLs and prints if each is valid or not.

python
from urllib.parse import urlparse
import validators

def is_valid_url(url):
    parsed = urlparse(url)
    if not (parsed.scheme and parsed.netloc):
        return False
    # validators.url returns True or a falsy failure object, so coerce to bool
    return bool(validators.url(url))

urls = [
    'https://www.google.com',
    'ftp://files.server.com',
    'http:/invalid-url',
    'justtext',
    'https://example.com/path?query=1'
]

for url in urls:
    print(f'{url} ->', 'Valid' if is_valid_url(url) else 'Invalid')
Output
https://www.google.com -> Valid
ftp://files.server.com -> Valid
http:/invalid-url -> Invalid
justtext -> Invalid
https://example.com/path?query=1 -> Valid
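
If installing a third-party package is not an option, a somewhat stricter check can be sketched with the standard library alone. This is a rough approximation under stated assumptions, not a full validator: the name is_plausible_url and the ALLOWED_SCHEMES set are hypothetical choices for illustration.

```python
from urllib.parse import urlparse

# Assumption: the application only accepts these schemes; adjust as needed
ALLOWED_SCHEMES = {'http', 'https', 'ftp'}

def is_plausible_url(url):
    """Hypothetical stdlib-only check: allowed scheme, a host name, and a usable port."""
    try:
        parsed = urlparse(url)
        _ = parsed.port  # accessing .port raises ValueError for a bad port
    except ValueError:
        return False
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    # hostname is None when the URL has no host part
    return bool(parsed.hostname)

for candidate in ('https://example.com/path?query=1',
                  'http:/invalid-url',
                  'https://example.com:99999'):
    print(candidate, '->', 'Valid' if is_plausible_url(candidate) else 'Invalid')
```

The scheme allowlist is a design choice: urlparse accepts any syntactically valid scheme, so restricting it keeps inputs such as mailto: addresses from passing as web URLs.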

Common Pitfalls

Common mistakes when validating URLs include:

  • Only checking if the string starts with http or https without verifying the full structure.
  • Using overly simple regex that misses valid URLs or accepts invalid ones.
  • Not handling URLs without schemes or with uncommon schemes.
  • Ignoring the need to check both scheme and network location parts.

Prefer a well-tested parser or validation library over hand-written checks.

python
from urllib.parse import urlparse

# Wrong way: only checking the prefix
url = 'http://'
if url.startswith(('http://', 'https://')):
    print('Valid URL')  # Wrongly prints 'Valid URL': there is no host after the scheme
else:
    print('Invalid URL')

# Right way: parse and check both scheme and host
parsed = urlparse(url)
if parsed.scheme in ('http', 'https') and parsed.netloc:
    print('Valid URL')
else:
    print('Invalid URL')  # Correctly prints 'Invalid URL'
Output
Valid URL
Invalid URL
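
The pitfall about URLs without schemes can be handled by normalizing the input before validating it. The following is a minimal sketch; with_default_scheme and has_scheme_and_host are hypothetical helper names, and the '://' test is a deliberately naive assumption that will not suit every input (for example mailto: addresses).

```python
from urllib.parse import urlparse

def with_default_scheme(url, default='https'):
    """Hypothetical helper: prepend a scheme when the input does not contain '://'."""
    if '://' not in url:
        url = f'{default}://{url}'
    return url

def has_scheme_and_host(url):
    """Basic check: both a scheme and a network location must be present."""
    parsed = urlparse(url)
    return bool(parsed.scheme and parsed.netloc)

raw = 'example.com/page'
print(has_scheme_and_host(raw))                       # bare domain, no scheme
print(has_scheme_and_host(with_default_scheme(raw)))  # after adding a default scheme
```

Whether to add a default scheme or reject such input outright depends on where the URLs come from; user-typed addresses often omit the scheme, while machine-generated ones usually should not.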

Quick Reference

Tips for URL validation in Python:

  • Use urllib.parse.urlparse() to break down the URL and check essential parts.
  • Use the validators library for simple and reliable validation.
  • Be cautious with regex; prefer tested patterns or libraries.
  • Always check both scheme (like http) and network location (domain).
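
To illustrate the caution about regex, the sketch below compares a deliberately simple pattern with a parse-based check. The pattern and the function names regex_check and parse_check are made up for this comparison and are not recommended for real use.

```python
import re
from urllib.parse import urlparse

# Deliberately naive pattern: a scheme followed by any non-space characters
naive = re.compile(r'^https?://\S+$')

def regex_check(url):
    """Accepts anything that looks like http(s):// plus non-space text."""
    return naive.match(url) is not None

def parse_check(url):
    """Requires an http(s) scheme and a non-empty network location."""
    parsed = urlparse(url)
    return parsed.scheme in ('http', 'https') and bool(parsed.netloc)

for candidate in ('https://example.com', 'http://?query'):
    print(candidate, '-> regex:', regex_check(candidate), '| parse:', parse_check(candidate))
```

The two checks agree on the first input, but the naive regex also accepts 'http://?query', which has no host at all; the parse-based check rejects it.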

Key Takeaways

  • Use urllib.parse.urlparse to check URL components like scheme and netloc for basic validation.
  • The validators library offers a simple function to confirm if a URL is valid.
  • Avoid relying only on string startswith checks or simple regex for URL validation.
  • Always verify both the scheme and network location parts of a URL.
  • Testing with multiple URL examples helps ensure your validation works correctly.
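
One way to test with multiple examples is a small table of inputs paired with the results you expect. The sketch below reuses the URLs from the example above with the basic scheme-and-netloc check; run_cases and has_scheme_and_host are hypothetical names, and the expected values are assumptions about what this particular check should return.

```python
from urllib.parse import urlparse

def has_scheme_and_host(url):
    """Basic check: both a scheme and a network location must be present."""
    parsed = urlparse(url)
    return bool(parsed.scheme and parsed.netloc)

# Each case pairs an input with the result we expect from the check
CASES = [
    ('https://www.google.com', True),
    ('ftp://files.server.com', True),
    ('http:/invalid-url', False),
    ('justtext', False),
    ('https://example.com/path?query=1', True),
]

def run_cases(check, cases):
    """Return the inputs whose result differs from the expectation."""
    return [url for url, expected in cases if check(url) != expected]

failures = run_cases(has_scheme_and_host, CASES)
print('All cases passed' if not failures else f'Unexpected results: {failures}')
```

Because run_cases takes the check as a parameter, the same table can be reused to compare a different validator against the same expectations.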