0
0
PythonHow-ToBeginner · 3 min read

How to Parse XML in Python: Simple Guide with Examples

To parse XML in Python, use the built-in xml.etree.ElementTree module which provides functions to read and navigate XML data. You can load XML from a string or file, then access elements and attributes easily with methods like find() and iter().
📐

Syntax

The main steps to parse XML using xml.etree.ElementTree are:

  • import xml.etree.ElementTree as ET: Import the module.
  • tree = ET.parse('file.xml'): Load XML from a file.
  • root = tree.getroot(): Get the root element of the XML.
  • element.find('tag'): Find a child element by tag.
  • element.iter('tag'): Iterate over all elements with a tag.
python
import xml.etree.ElementTree as ET

tree = ET.parse('file.xml')
root = tree.getroot()

for child in root:
    print(child.tag, child.attrib)

item = root.find('item')
print(item.text if item is not None else 'No item found')
💻

Example

This example shows how to parse a simple XML string, access elements, and print their text and attributes.

python
import xml.etree.ElementTree as ET

xml_data = '''
<store>
  <item name="apple" price="0.5" />
  <item name="banana" price="0.3" />
  <item name="cherry" price="0.2" />
</store>
'''

root = ET.fromstring(xml_data)

for item in root.iter('item'):
    name = item.attrib.get('name')
    price = item.attrib.get('price')
    print(f"Item: {name}, Price: {price}")
Output
Item: apple, Price: 0.5 Item: banana, Price: 0.3 Item: cherry, Price: 0.2
⚠️

Common Pitfalls

Common mistakes when parsing XML in Python include:

  • Trying to parse malformed XML which causes errors.
  • Using find() when multiple elements exist; it returns only the first match.
  • Not handling missing elements or attributes, which can cause NoneType errors.
  • Confusing parse() (for files) with fromstring() (for strings).

Always check if elements exist before accessing their text or attributes.

python
import xml.etree.ElementTree as ET

xml_data = '<root><item>Value</item></root>'
root = ET.fromstring(xml_data)

# Wrong: assumes 'price' attribute exists
# price = root.find('item').attrib['price']  # KeyError if missing

# Right: safely get attribute with default
price = root.find('item').attrib.get('price', 'N/A')
print(price)
Output
N/A
📊

Quick Reference

Function/MethodDescription
ET.parse(filename)Parse XML file and return ElementTree object
ET.fromstring(string)Parse XML from string and return root element
tree.getroot()Get root element from ElementTree
element.find('tag')Find first child element with tag
element.findall('tag')Find all child elements with tag
element.iter('tag')Iterate over all elements with tag in subtree
element.attribDictionary of element's attributes
element.textText content inside element

Key Takeaways

Use xml.etree.ElementTree for easy XML parsing in Python.
Load XML from files with ET.parse() or from strings with ET.fromstring().
Access elements safely using find(), iter(), and check for None before use.
Handle missing attributes with dict.get() to avoid errors.
Common errors come from malformed XML or wrong method usage.