How to read an XML file using Java

Reading XML files is not a trivial task. It requires the reader to have intimate knowledge of the structure of the file: the element names, the attribute names, the data types of the attributes, the order of the elements, and whether the elements are simple or complex (that is, flat or containing nested elements).

One solution, as shown by Jon Skeet’s comment, is to use the Java Document API. This interface has all the methods you will need to get data out of an XML file. In my opinion, however, it still leaves the reader with the task of knowing the element and attribute names.
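
To make the DOM route concrete, here is a minimal sketch that uses only the parser built into the JDK. The class name DomPropertyReader and the inline sample XML are my own placeholders, not something from the question; the element and attribute names (Property, Name) follow the schema shown further down. Note that those names are hard-coded strings, which is exactly the burden I am talking about.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
public class DomPropertyReader {

    // Returns one "Name=value" string per <Property> element, in document order.
    public static List<String> readProperties(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            List<String> result = new ArrayList<>();
            // The caller has to know the element and attribute names up front.
            NodeList props = doc.getElementsByTagName("Property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                result.add(prop.getAttribute("Name") + "=" + prop.getTextContent().trim());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample data in the shape of the OP's file.
        String xml = "<Tree><child>"
                + "<Property Name=\"username\">abc</Property>"
                + "<Property Name=\"value\">123456</Property>"
                + "</child></Tree>";
        for (String line : readProperties(xml)) {
            System.out.println(line);
        }
    }
}
```

With the sample above, this prints username=abc followed by value=123456.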

If an XML schema (XSD) or Document Type Definition (DTD) for a given XML is available or can be easily constructed, I prefer to use one of the many libraries to parse XML contents; to name a few: StAX, JDOM, dom4j, JAXB. Because I have used it extensively, I prefer JAXB. There are some limitations to JAXB, but those are out of scope for this discussion. One thing worth mentioning is that JAXB is bundled with Java distributions from Java 6 to 10. It was removed in Java 11, so outside of those versions you must download the JAXB distribution yourself.

One of the primary reasons I use JAXB is that I can annotate POJOs to match the structure of existing XML without needing to build a schema. Of course, this is not always simple to do. It is almost always easier to compile your JAXB classes from a schema. Because this produces custom Java classes for your XML documents, you can access elements and attributes through their getter methods, rather than putting the burden on the reader to know the element names.

I used the OP's XML file to generate a schema using XML Copy Editor. The resulting schema looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="Tree">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="child" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="child">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="Property" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="Property">
    <xs:complexType mixed="true">
      <xs:attribute name="Name" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
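
Before generating classes, it can be worth checking that the schema really accepts your XML. The sketch below does that with the javax.xml.validation API that ships with the JDK. TreeSchemaCheck and isValid are names I made up for illustration, and the schema is inlined as a string only to keep the example self-contained; in practice you would probably load the .xsd file instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.SAXException;

// Hypothetical helper class, for illustration only.
public class TreeSchemaCheck {

    // The schema shown above, inlined so the sketch needs no external file.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\">"
            + "<xs:element name=\"Tree\"><xs:complexType><xs:sequence>"
            + "<xs:element ref=\"child\" maxOccurs=\"unbounded\"/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "<xs:element name=\"child\"><xs:complexType><xs:sequence>"
            + "<xs:element ref=\"Property\" maxOccurs=\"unbounded\"/>"
            + "</xs:sequence></xs:complexType></xs:element>"
            + "<xs:element name=\"Property\"><xs:complexType mixed=\"true\">"
            + "<xs:attribute name=\"Name\" type=\"xs:string\" use=\"required\"/>"
            + "</xs:complexType></xs:element>"
            + "</xs:schema>";

    // Returns true if the XML is well-formed and conforms to the schema.
    public static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            // A failure here means the schema itself is broken, not the document.
            throw new IllegalStateException("Schema could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or violates the schema
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample documents: the first has the required Name attribute, the second does not.
        System.out.println(isValid("<Tree><child><Property Name=\"id\">1</Property></child></Tree>"));
        System.out.println(isValid("<Tree><child><Property>1</Property></child></Tree>"));
    }
}
```

The first document should be accepted and the second rejected, because the schema marks the Name attribute as required.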

Once you have the schema, you can use it to compile the JAXB classes using the XJC compiler that comes with Java. Here is an example of how to compile the JAXB classes: https://docs.oracle.com/javase/tutorial/jaxb/intro/examples.html

To download the JAXB compiler, go to https://javaee.github.io/jaxb-v2/ and click “Download standalone distribution”. You can place the contents of the ZIP file anywhere on your computer. Then, simply set JAXB_HOME in your environment variables and you are set. This might seem like a lot of work, but up to this point these are one-time activities. The upside is that once your environment is set up, it will literally take you seconds to compile all your classes, even if you first need to generate the schema from your XML.

Running the compiler against the schema generated three classes: Tree.java, Child.java, and Property.java.

Tree.java

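import java.util.ArrayList;
import java.util.List;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlAccessorType(XmlAccessType.FIELD)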
@XmlType(name = "", propOrder = {
    "child"
})
@XmlRootElement(name = "Tree")
public class Tree {

    @XmlElement(required = true)
    protected List<Child> child;

    public List<Child> getChild() {
        if (child == null) {
            child = new ArrayList<Child>();
        }
        return this.child;
    }
}

Child.java

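import java.util.ArrayList;
import java.util.List;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlAccessorType(XmlAccessType.FIELD)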
@XmlType(name = "", propOrder = {
    "property"
})
@XmlRootElement(name = "child")
public class Child {

    @XmlElement(name = "Property", required = true)
    protected List<Property> property;

    public List<Property> getProperty() {
        if (property == null) {
            property = new ArrayList<Property>();
        }
        return this.property;
    }
}

Property.java

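import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;
import javax.xml.bind.annotation.XmlValue;

@XmlAccessorType(XmlAccessType.FIELD)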
@XmlType(name = "", propOrder = {
    "content"
})
@XmlRootElement(name = "Property")
public class Property {

    @XmlValue
    protected String content;
    @XmlAttribute(name = "Name", required = true)
    protected String name;

    public String getContent() {
        return content;
    }

    public void setContent(String value) {
        this.content = value;
    }

    public String getName() {
        return name;
    }

    public void setName(String value) {
        this.name = value;
    }
}

How to use these classes

The reading process (unmarshalling) converts the XML file into objects of these generated types. It uses the JAXBContext class to create an Unmarshaller and then calls its unmarshal method to convert the XML file into objects:

// The argument is the class of the root element
JAXBContext context = JAXBContext.newInstance(Tree.class);
// Reads the XML and returns it as a Java object
Tree xmlDoc = (Tree) context.createUnmarshaller().unmarshal(new FileReader("abc.xml"));

To write, you use the Java classes to store the data and create the structure. In this case, you need to create the required Property objects, the Child container for the Property elements, and the root node, which is the Tree node. You can add elements one at a time or create a list of them and add them all at once. Once the root node object is populated, simply pass it to the marshaller:

JAXBContext context = JAXBContext.newInstance(Tree.class);
Marshaller mar = context.createMarshaller();
mar.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE); // formatting the xml file
mar.marshal(tree, new File("abc.xml")); // saves the "Tree" object as "abc.xml"

All together

import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.List;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;

public class JAXBDemo {
    public static void main(String[] args) {
        try {
            // write
            Tree tree = new Tree();
            Property prop0 = new Property();
            prop0.setName("id");
            prop0.setContent("");
            
            Property prop1 = new Property();
            prop1.setName("username");
            prop1.setContent("abc");
            
            Property prop2 = new Property();
            prop2.setName("phoneType");
            prop2.setContent("phone1");

            Property prop3 = new Property();
            prop3.setName("value");
            prop3.setContent("123456");

            List<Property> props1 = List.of(prop0, prop1, prop2, prop3);

            Property prop4 = new Property();
            prop4.setName("id");
            prop4.setContent("");
            
            Property prop5 = new Property();
            prop5.setName("username");
            prop5.setContent("def");
            
            Property prop6 = new Property();
            prop6.setName("phoneType");
            prop6.setContent("phone2");

            Property prop7 = new Property();
            prop7.setName("value");
            prop7.setContent("6789012");

            List<Property> props2 = List.of(prop4, prop5, prop6, prop7);
            
            Child child1 = new Child();
            Child child2 = new Child();
            
            child1.getProperty().addAll(props1);
            child2.getProperty().addAll(props2);
            
            tree.getChild().add(child1);
            tree.getChild().add(child2);

            JAXBContext context = JAXBContext.newInstance(Tree.class);
            Marshaller mar = context.createMarshaller();
            mar.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
            mar.marshal(tree, new File("abc.xml"));

            // read
            Tree xmlDoc = (Tree) context.createUnmarshaller().unmarshal(new FileReader("abc.xml"));
            List<Child> children = xmlDoc.getChild();
            int i = 1;
            for (Child child : children) {
                // Each iteration handles one <child> element, so label it as such
                System.out.println("Child " + i++ + ":");
                List<Property> props = child.getProperty();
                for (Property prop : props) {
                    System.out.println("Name: " + prop.getName() + "; Content: " + prop.getContent());
                }
            }
        } catch (JAXBException | FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

Last notes:

To get this to work, I had to make some “fixes” to the distribution. The first fix was to edit xjc.bat as described in this issue: https://github.com/eclipse-ee4j/jaxb-ri/issues/1321. Scroll to the bottom to see the fix I applied.

Then, I needed to update my “jaxb-runtime” dependency to version 2.3.3 in order for the project to work with “jaxb-api” version 2.3.1.