Quick start
===========
Installation
------------
Install from PyPI::
pip install pyuppsala
Or with `uv `_::
uv add pyuppsala
Requirements: Python 3.10 or later. No C compiler needed -- the package ships
pre-built wheels compiled from Rust.
Parse an XML document
---------------------
.. code-block:: python
from pyuppsala import Document
doc = Document("hello")
el = doc.document_element
print(el.tag.local_name) # "root"
print(el.text_content) # "hello"
You can also parse from bytes (UTF-8 and UTF-16 are auto-detected):
.. code-block:: python
from pyuppsala import parse_bytes
doc = parse_bytes(b"ok")
Quick text access with element_text
------------------------------------
For simple elements like ``value``, use :attr:`~Node.element_text`
instead of :attr:`~Node.text_content` -- it returns the text of the first
Text/CDATA child without recursing:
.. code-block:: python
from pyuppsala import Document
doc = Document("Alice30")
root = doc.document_element
for child in root:
print(f"{child.tag.local_name} = {child.element_text}")
# name = Alice
# age = 30
Source tracking
---------------
Every parsed node remembers its position in the original input. You can
retrieve the original source text and byte ranges:
.. code-block:: python
from pyuppsala import Document
xml = '- hello
- world
'
doc = Document(xml)
# The full original input
print(doc.input_text == xml) # True
# Source text of a specific node
item = doc.document_element.children[0]
print(item.source) # '- hello
'
# Byte range for slicing
start, end = item.source_range
print(xml[start:end]) # '- hello
'
Query with XPath
----------------
.. code-block:: python
from pyuppsala import Document, XPathEvaluator
doc = Document("""\
The Great Gatsby
A Brief History of Time
""")
doc.prepare_xpath()
xpath = XPathEvaluator()
# Select nodes
books = xpath.select(doc, "//book")
print(len(books)) # 2
# Evaluate to a string
title = xpath.evaluate(doc, "string(//book[@category='fiction']/title)")
print(title) # "The Great Gatsby"
# Evaluate to a number
count = xpath.evaluate(doc, "count(//book)")
print(count) # 2.0
# Evaluate to a boolean
has_fiction = xpath.evaluate(doc, "boolean(//book[@category='fiction'])")
print(has_fiction) # True
Namespace-aware XPath requires registering prefixes:
.. code-block:: python
doc = Document('')
doc.prepare_xpath()
xpath = XPathEvaluator()
xpath.add_namespace("ns", "urn:test")
nodes = xpath.select(doc, "/root/ns:item")
Find child elements by namespace
---------------------------------
For direct child lookups by namespace URI and local name, use
:meth:`~Node.first_child_element_by_name_ns` and
:meth:`~Node.child_elements_by_name_ns`:
.. code-block:: python
from pyuppsala import Document
xml = """\
first
skip
second
"""
doc = Document(xml)
root = doc.document_element
# Get the first matching child
first = root.first_child_element_by_name_ns("urn:example", "item")
print(first.element_text) # "first"
# Get all matching children
items = root.child_elements_by_name_ns("urn:example", "item")
print(len(items)) # 2
Check element names with matches_name_ns
------------------------------------------
.. code-block:: python
from pyuppsala import Document
xml = 'ok'
doc = Document(xml)
root = doc.document_element
if root.matches_name_ns("urn:oasis:names:tc:SAML:2.0:assertion", "Assertion"):
print("This is a SAML Assertion")
Validate with XSD
-----------------
.. code-block:: python
from pyuppsala import XsdValidator
schema = """\
"""
validator = XsdValidator(schema)
# Quick boolean check
print(validator.is_valid_str("Hello")) # True
# Detailed error list
errors = validator.validate_str("")
for err in errors:
print(err) # prints line:column: message
Build XML without a DOM
-----------------------
.. code-block:: python
from pyuppsala import XmlWriter
w = XmlWriter()
w.write_declaration()
w.start_element("catalog", [("xmlns", "urn:example")])
w.start_element("item", [("id", "1")])
w.text("Widget")
w.end_element("item")
w.end_element("catalog")
print(w.to_string())
Mutate the DOM
--------------
.. code-block:: python
from pyuppsala import Document
doc = Document("")
root = doc.document_element
# Create and attach new nodes
b = doc.create_element("b")
doc.append_child(root, b)
text = doc.create_text("hello")
doc.append_child(b, text)
# Detach and reattach
doc.detach(b)
doc.insert_before(root, b, root.children[0])
print(doc.to_xml())
QName matching
--------------
.. code-block:: python
from pyuppsala import QName
q = QName("Envelope", namespace_uri="http://schemas.xmlsoap.org/soap/envelope/", prefix="soap")
# Match by local name and namespace
print(q.matches("Envelope", namespace_uri="http://schemas.xmlsoap.org/soap/envelope/")) # True
print(q.matches("Envelope")) # False -- namespace doesn't match None
print(q.matches("Body", namespace_uri="http://schemas.xmlsoap.org/soap/envelope/")) # False