
How to Parse a URL in Python: Simple Guide with Examples

You can parse a URL in Python using the urllib.parse module, specifically the urlparse() function. It breaks down a URL into components like scheme, netloc, path, params, query, and fragment for easy access.

Syntax

The urlparse() function from the urllib.parse module takes a URL string and returns a ParseResult object with these parts:

  • scheme: The protocol (e.g., http, https).
  • netloc: The network location (domain and port).
  • path: The path to the resource.
  • params: Parameters for the last path element (the part after ;).
  • query: The query string after ?.
  • fragment: The part after #.
python
from urllib.parse import urlparse

result = urlparse('https://example.com:8080/path/to/page?name=alice&age=30#section1')

print(result)
Output
ParseResult(scheme='https', netloc='example.com:8080', path='/path/to/page', params='', query='name=alice&age=30', fragment='section1')
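
Since netloc holds the domain and port together, it can help to read them separately. The following is a minimal sketch using the hostname and port attributes of the returned ParseResult, with the same example URL as above; the variable names host and port are only illustrative.

```python
from urllib.parse import urlparse

# Same example URL as in the syntax section
result = urlparse('https://example.com:8080/path/to/page?name=alice&age=30#section1')

# netloc keeps host and port together; these attributes split them
host = result.hostname  # host name in lowercase, without the port
port = result.port      # integer, or None when the URL has no explicit port

print('Host:', host)
print('Port:', port)
```

If the URL has no explicit port, port is None, so check for that before using it as a number.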

Example

This example shows how to parse a URL and access each part separately.

python
from urllib.parse import urlparse

url = 'https://example.com:8080/path/to/page?name=alice&age=30#section1'
parsed_url = urlparse(url)

print('Scheme:', parsed_url.scheme)
print('Network location:', parsed_url.netloc)
print('Path:', parsed_url.path)
print('Parameters:', parsed_url.params)
print('Query:', parsed_url.query)
print('Fragment:', parsed_url.fragment)
Output
Scheme: https
Network location: example.com:8080
Path: /path/to/page
Parameters:
Query: name=alice&age=30
Fragment: section1
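
The Parameters line is empty in the output because the example URL has no ; section. As a rough illustration, the sketch below uses a hypothetical URL (not from the example above) whose last path element carries a ;type=a parameter, to show where params comes from.

```python
from urllib.parse import urlparse

# Hypothetical URL: the last path element has a ";type=a" parameter
parsed = urlparse('https://example.com/path/file;type=a?name=alice')

print('Path:', parsed.path)          # path without the parameter
print('Parameters:', parsed.params)  # text after ";" in the last path element
print('Query:', parsed.query)
```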

Common Pitfalls

One common mistake is calling urlparse() without first importing it from urllib.parse, which raises a NameError. Another is expecting the query string to be automatically split into key-value pairs; urlparse() only returns the raw query string.

To get query parameters as a dictionary, use parse_qs() from urllib.parse. Each key maps to a list of values, because a key can appear more than once in a query string.

python
from urllib.parse import urlparse, parse_qs

url = 'https://example.com/path?name=alice&age=30'
parsed = urlparse(url)

# Wrong: expecting query to be a dict
print(parsed.query)  # Outputs raw string

# Right: parse query string into dict
query_params = parse_qs(parsed.query)
print(query_params)
Output
name=alice&age=30
{'name': ['alice'], 'age': ['30']}
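
Because parse_qs() returns lists, reading a single value takes one extra step. The sketch below assumes a hypothetical URL with a repeated tag key and no page key; the default of '1' for page is an arbitrary choice for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL with a repeated key ("tag") and no "page" key
url = 'https://example.com/search?name=alice&tag=python&tag=urls'
query_params = parse_qs(urlparse(url).query)

# Repeated keys are collected into one list
tags = query_params['tag']

# Take the first item when a single value is expected
name = query_params['name'][0]

# Use .get() with a default list for keys that may be missing
page = query_params.get('page', ['1'])[0]

print(tags)
print(name)
print(page)
```

Note that all values are strings, so convert them (for example with int()) if you need numbers.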

Quick Reference

Here is a quick summary of useful functions for URL parsing in Python:

| Function | Description |
|---|---|
| urlparse(url) | Parse URL into components |
| urlunparse(parts) | Combine components back into URL string |
| parse_qs(query) | Parse query string into dictionary of lists |
| urljoin(base, url) | Combine base URL with relative URL |
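
The functions in the table can be combined. The following is a small sketch, not the only way to do it: it changes the query of a parsed URL and rebuilds it with urlunparse(), then resolves a relative path with urljoin(). It also uses urlencode() from urllib.parse and the _replace() method of the ParseResult named tuple, neither of which is covered above; the replacement values bob and 31 are made up for the example.

```python
from urllib.parse import urlparse, urlunparse, urljoin, urlencode

parsed = urlparse('https://example.com/path/to/page?name=alice&age=30#section1')

# ParseResult is a named tuple, so _replace() returns a modified copy
new_query = urlencode({'name': 'bob', 'age': 31})
rebuilt = urlunparse(parsed._replace(query=new_query))

# Resolve a relative reference against a base URL
joined = urljoin('https://example.com/path/to/page', 'other')

print(rebuilt)
print(joined)
```

urljoin() replaces the last path segment of the base when the base does not end with a slash, which is why page is swapped for other here.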

Key Takeaways

Use urllib.parse.urlparse() to split a URL into parts.
Access URL parts like scheme, netloc, path, query, and fragment from the ParseResult.
Use urllib.parse.parse_qs() to convert query strings into dictionaries of value lists.
Remember to import functions from urllib.parse before using them.
Combine URL parts back with urlunparse() if needed.