Everything about XML

Last modified: March 1, 2023

What is XML?

XML (eXtensible Markup Language) is a markup language that is used to store and exchange data between different systems. It is a standard format for representing data, and is often used in web services, mobile applications, and other software systems.

XML documents can be created and edited using a variety of tools, including text editors and specialized XML editors. XML data can also be parsed and processed using a variety of programming languages, including Java, C++, and Python.

One of the key benefits of XML is its extensibility. XML allows users to define their own elements and attributes, making it a flexible format for representing a wide range of data. Additionally, XML is designed to be both human-readable and machine-readable, which makes it easy to work with for both humans and computers.

Here is an example of an XML document that contains information about a book:

<?xml version="1.0" encoding="UTF-8"?>
<book>
  <title>The Catcher in the Rye</title>
  <author>J.D. Salinger</author>
  <publisher>Little, Brown and Company</publisher>
  <year>1951</year>
  <price currency="USD">19.99</price>
</book>

In this example, the XML document contains a single "book" element, which has child elements that provide information about the book's title, author, publisher, year of publication, and price. The "price" element also includes an attribute called "currency" that specifies the currency in which the price is denominated.

How to use XML in python?

Python provides a built-in module called xml.etree.ElementTree for working with XML data. Here's an example of how to use this module to parse an XML file and extract data from it:

import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('books.xml')

# Get the root element of the XML file
root = tree.getroot()

# Loop through each "book" element and extract the data
for book in root.findall('book'):
    # Get the title of the book
    title = book.find('title').text

    # Get the author of the book
    author = book.find('author').text

    # Get the year of publication of the book
    year = book.find('year').text

    # Print the data for this book
    print(f'Title: {title}')
    print(f'Author: {author}')
    print(f'Year: {year}')

In this example, the code first parses an XML file called books.xml using the ET.parse() method. The root element of the XML file is then obtained using the getroot() method. The code then loops through each "book" element in the XML file using the findall() method, and extracts the data for each book using the find() method. Finally, the data is printed to the console.

Note that the find() method returns the first child element with the specified tag, while the findall() method returns a list of all child elements with the specified tag. You can also use other methods like findtext() and findall() with XPath expressions to extract data from more complex XML files. Additionally, the xml.etree.ElementTree module also provides functionality for creating new XML documents and modifying existing ones.

Serialize Python data to an XML string

Python's built-in xml.etree.ElementTree module can also be used to create and serialize XML data. Here's an example of how to create an XML document from scratch and write it to a file:

import xml.etree.ElementTree as ET

# Create the root element of the XML document
root = ET.Element('books')

# Create a "book" element and add it as a child of the root element
book1 = ET.SubElement(root, 'book')
book1.set('id', '1')
title1 = ET.SubElement(book1, 'title')
title1.text = 'The Catcher in the Rye'
author1 = ET.SubElement(book1, 'author')
author1.text = 'J.D. Salinger'
year1 = ET.SubElement(book1, 'year')
year1.text = '1951'

# Create another "book" element and add it as a child of the root element
book2 = ET.SubElement(root, 'book')
book2.set('id', '2')
title2 = ET.SubElement(book2, 'title')
title2.text = 'To Kill a Mockingbird'
author2 = ET.SubElement(book2, 'author')
author2.text = 'Harper Lee'
year2 = ET.SubElement(book2, 'year')
year2.text = '1960'

# Create an XML tree from the root element
tree = ET.ElementTree(root)

# Write the XML tree to a file
tree.write('books.xml', encoding='utf-8', xml_declaration=True)

Online XML tools