Spring Batch Tutorial: Writing Information to an XML File

The previous part of my Spring Batch tutorial described how you can write information to a CSV file. This time you will learn to write the output data of your Spring Batch job to an XML file. After you have read this blog post, you:

  • Can identify the dependencies which are required when you want to write the output data of your batch job to an XML file.
  • Can get the required dependencies with Maven and Gradle.
  • Understand how you can configure an ItemWriter bean which writes the output data of your batch job to an XML file.

Let's start by taking a quick look at your batch job.

Introduction to Your Batch Job

The example batch job of this blog post processes the student information of an online course. The input data of this batch job is read from a data source and transformed into StudentDTO objects. The StudentDTO class contains the information of a single student, and its source code looks as follows:

public class StudentDTO {
 
    private String emailAddress;
    private String name;
    private String purchasedPackage;
 
    public StudentDTO() {}
 
    public String getEmailAddress() {
        return emailAddress;
    }
 
    public String getName() {
        return name;
    }
 
    public String getPurchasedPackage() {
        return purchasedPackage;
    }
 
    public void setEmailAddress(String emailAddress) {
        this.emailAddress = emailAddress;
    }
 
    public void setName(String name) {
        this.name = name;
    }
 
    public void setPurchasedPackage(String purchasedPackage) {
        this.purchasedPackage = purchasedPackage;
    }
}

During this blog post you will write the output data of your batch job to an XML file. To be more specific, this XML file must contain a student list that provides the following information about each student:

  • The name of the student.
  • The email address of the student.
  • The name of the purchased package.

After you have written the output data to an XML file, the content of the output file must look as follows:

<?xml version="1.0" encoding="UTF-8"?>
<students>
    <student>
        <emailAddress>tony.tester@gmail.com</emailAddress>
        <name>Tony Tester</name>
        <purchasedPackage>master</purchasedPackage>
    </student>
    <student>
        <emailAddress>nick.newbie@gmail.com</emailAddress>
        <name>Nick Newbie</name>
        <purchasedPackage>starter</purchasedPackage>
    </student>
    <student>
        <emailAddress>ian.intermediate@gmail.com</emailAddress>
        <name>Ian Intermediate</name>
        <purchasedPackage>intermediate</purchasedPackage>
    </student>
</students>

Next, you will find out how you can get the required dependencies with Maven and Gradle.

Getting the Required Dependencies

Before you can write the output data of your batch job to an XML file, you have to get the following dependencies:

  • The spring-oxm dependency provides the high-level API which helps you to serialize objects to XML documents and deserialize XML documents to objects.
  • The xstream dependency is a library which can serialize objects to XML documents and deserialize XML documents to objects. It's fast and has a low memory footprint, and that's why it's useful when you are working with batch jobs that process large amounts of data.

If you are using the dependency management of Spring Boot with Maven, you can get these dependencies by adding the following dependency declarations to the dependencies section of your POM file:

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-oxm</artifactId>
</dependency>
<dependency>
    <groupId>com.thoughtworks.xstream</groupId>
    <artifactId>xstream</artifactId>
    <version>1.4.15</version>
</dependency>

If you are using the dependency management of Spring Boot with Gradle, you can get these dependencies by adding the following dependency declarations to your build.gradle file:

dependencies {
    implementation(
            'org.springframework:spring-oxm',
            'com.thoughtworks.xstream:xstream:1.4.15'
    )
}

Next, you will learn to write the output data of your batch job to an XML file.

Writing Information to an XML File

When you want to process the output data of your batch job, you have to configure an ItemWriter bean. Because you have to write the output data to an XML file, you have to configure this bean by following these steps:

First, you have to create the configuration class that contains the beans which describe the workflow of your Spring Batch job. The source code of your configuration class looks as follows:

import org.springframework.context.annotation.Configuration;
 
@Configuration
public class SpringBatchExampleJobConfig {
}

Second, you have to create the method that configures your ItemWriter bean. Ensure that the created method takes an Environment object as a method parameter and returns an ItemWriter<StudentDTO> object.

After you have added this method to your configuration class, its source code looks as follows:

import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;

@Configuration
public class SpringBatchExampleJobConfig {

    @Bean
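    public ItemWriter<StudentDTO> itemWriter(Environment environment) {
        // Placeholder which keeps the class compilable until the method is implemented.
        throw new UnsupportedOperationException("Not implemented yet");
    }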
}

Third, you have to implement the itemWriter() method by following these steps:

  1. Create a new StaxEventItemWriterBuilder<StudentDTO> object. This object creates StaxEventItemWriter<StudentDTO> objects which can write the output data of your batch job to an XML file.
  2. Configure the name of the ItemWriter.
  3. Configure the file path of the created XML file. You can read this information from a properties file by using the Environment object given as a method parameter.
  4. Ensure that the created ItemWriter creates the XML document by using the XStream library. Remember to configure an alias which maps the student XML element to the StudentDTO class so that each student element contains the information of a single student.
  5. Configure the name of the root element (students).
  6. Create a new StaxEventItemWriter<StudentDTO> object and return the created object.
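
Step 3 reads the file path from a properties file. As a rough illustration, the following sketch uses only the JDK's java.util.Properties class to mimic the fail-fast behavior of Environment.getRequiredProperty(), which throws an IllegalStateException when the key cannot be resolved. The class name ExportPathLookupSketch, its helper methods, and the /tmp/students.xml path are assumptions made for this example. Only the property key batch.job.export.file.path is the one which this blog post uses.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class: it is not a part of Spring or of the example application.
class ExportPathLookupSketch {

    // Parses the content of a properties file that is given as a string.
    static Properties parse(String content) {
        Properties properties = new Properties();
        try {
            properties.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return properties;
    }

    // Mimics a required property lookup: a missing key raises an
    // IllegalStateException instead of returning null.
    static String requiredProperty(Properties properties, String key) {
        String value = properties.getProperty(key);
        if (value == null) {
            throw new IllegalStateException("Required key '" + key + "' not found");
        }
        return value;
    }

    public static void main(String[] args) {
        // Assumed content of the properties file. The path is an example value.
        Properties properties = parse("batch.job.export.file.path=/tmp/students.xml\n");
        System.out.println(requiredProperty(properties, "batch.job.export.file.path"));
    }
}
```

If the key is missing, the lookup fails immediately. This surfaces a configuration error when the bean is created rather than when the batch job tries to open the output file.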

After you have implemented the itemWriter() method, the source code of your configuration class looks as follows:

import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.xml.builder.StaxEventItemWriterBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;
import org.springframework.oxm.xstream.XStreamMarshaller;

import java.util.Collections;

@Configuration
public class SpringBatchExampleJobConfig {

    @Bean
    public ItemWriter<StudentDTO> itemWriter(Environment environment) {
        String exportFilePath = environment.getRequiredProperty(
                "batch.job.export.file.path"
        );
        Resource exportFileResource = new FileSystemResource(exportFilePath);

        XStreamMarshaller studentMarshaller = new XStreamMarshaller();
        studentMarshaller.setAliases(Collections.singletonMap(
                "student",
                StudentDTO.class
        ));

        return new StaxEventItemWriterBuilder<StudentDTO>()
                .name("studentWriter")
                .resource(exportFileResource)
                .marshaller(studentMarshaller)
                .rootTagName("students")
                .build();
    }
}
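
If you want a feel for the document structure which this configuration produces, the following sketch writes a similar student list by using the StAX API of the JDK directly. It is only an approximation: the real StaxEventItemWriter works with StAX events and delegates the serialization of each item to the configured Marshaller, whereas this sketch writes the elements by hand with an XMLStreamWriter. The StaxStudentListSketch class and its nested Student class (a stand-in for StudentDTO) are assumptions made for this example.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: it is not a part of Spring Batch or of the example application.
class StaxStudentListSketch {

    // Minimal stand-in for StudentDTO which keeps the sketch self-contained.
    static final class Student {
        final String emailAddress;
        final String name;
        final String purchasedPackage;

        Student(String emailAddress, String name, String purchasedPackage) {
            this.emailAddress = emailAddress;
            this.name = name;
            this.purchasedPackage = purchasedPackage;
        }
    }

    static String toXml(List<Student> students) {
        StringWriter output = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(output);
            xml.writeStartDocument("UTF-8", "1.0");
            // Plays the role of rootTagName("students").
            xml.writeStartElement("students");
            for (Student student : students) {
                // Plays the role of the "student" alias of the marshaller.
                xml.writeStartElement("student");
                writeTextElement(xml, "emailAddress", student.emailAddress);
                writeTextElement(xml, "name", student.name);
                writeTextElement(xml, "purchasedPackage", student.purchasedPackage);
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write the student list", e);
        }
        return output.toString();
    }

    // Writes an element which contains only text.
    private static void writeTextElement(XMLStreamWriter xml, String name, String text)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text);
        xml.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(toXml(Arrays.asList(
                new Student("tony.tester@gmail.com", "Tony Tester", "master"),
                new Student("nick.newbie@gmail.com", "Nick Newbie", "starter")
        )));
    }
}
```

Note that this sketch does not indent its output, and it builds the whole document in memory. A batch job writes its items to the output file one chunk at a time, which is why a streaming API such as StAX suits this use case.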

You can now get the required dependencies and configure an ItemWriter bean which writes the output data of your batch job to an XML file. Let's summarize what you learned from this blog post.

Summary

This lesson has taught you four things:

  • The spring-oxm dependency provides the high-level API which helps you to serialize objects to XML documents and deserialize XML documents to objects.
  • The xstream dependency is a library which can serialize objects to XML documents and deserialize XML documents to objects. It’s fast and has a low memory footprint, and that’s why it’s useful when you are working with batch jobs that process large amounts of data.
  • If you have to write the output data of your batch job to an XML file, you can use the StaxEventItemWriter<T> class.
  • The StaxEventItemWriter<T> class serializes objects to XML by using a Marshaller.

The next part of my Spring Batch tutorial describes how you can write the output data of your batch job to a relational database by using JDBC.

P.S. You can get the example application of this blog post from GitHub.
