Spring Batch Tutorial: Reading Information From a REST API

Spring Batch has good support for reading data from different data sources such as files (CSV or XML) or databases. However, it doesn’t have built-in support for reading input data from a REST API. If you want to use a REST API as the data source of your Spring Batch job, you have to implement a custom ItemReader which reads the input data from the REST API.

This blog post describes how you can implement your custom ItemReader. After you have read this blog post, you:

  • Understand how you can implement an ItemReader which reads the input data of your batch job by using the RestTemplate class.
  • Know how you can configure the ItemReader bean which provides the input data for your batch job.

Let's begin.

This blog post assumes that:

Introduction to the Example Application

During this blog post you will implement an ItemReader which reads the input data of your Spring Batch job from a REST API endpoint that processes GET requests sent to the path: '/api/student/'. This API endpoint returns the information of all students who are enrolled in an online course. To be more specific, your API endpoint returns the following JSON document:

[
    {
        "emailAddress": "tony.tester@gmail.com",
        "name": "Tony Tester",
        "purchasedPackage": "master"
    },
    {
        "emailAddress": "nick.newbie@gmail.com",
        "name": "Nick Newbie",
        "purchasedPackage": "starter"
    },
    {
        "emailAddress": "ian.intermediate@gmail.com",
        "name": "Ian Intermediate",
        "purchasedPackage": "intermediate"
    }
]

You have to transform the returned JSON document into StudentDTO objects which are processed by your batch job. The StudentDTO class contains the information of a single student, and its source code looks as follows:

public class StudentDTO {

    private String emailAddress;
    private String name;
    private String purchasedPackage;

    public StudentDTO() {}

    public String getEmailAddress() {
        return emailAddress;
    }

    public String getName() {
        return name;
    }

    public String getPurchasedPackage() {
        return purchasedPackage;
    }

    public void setEmailAddress(String emailAddress) {
        this.emailAddress = emailAddress;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setPurchasedPackage(String purchasedPackage) {
        this.purchasedPackage = purchasedPackage;
    }
}

Next, you will implement a custom ItemReader which reads the input data of your batch job from the described API endpoint.

Implementing Your Custom ItemReader

You can implement your custom ItemReader by following these steps:

First, you have to create a new class (RESTStudentReader) and implement the ItemReader interface. When you implement the ItemReader interface, you must set the type of the returned object to StudentDTO.

After you have created your ItemReader class, its source code looks as follows:

import org.springframework.batch.item.ItemReader;

class RESTStudentReader implements ItemReader<StudentDTO> {

}

Second, you have to add the following private fields to the RESTStudentReader class:

  • The final apiUrl field contains the URL of the invoked REST API.
  • The final RestTemplate field contains a reference to the RestTemplate object which you use when you read the student information.
  • The nextStudentIndex field contains the index of the next StudentDTO object.
  • The studentData field contains the found StudentDTO objects.

After you have added these fields to the RESTStudentReader class, its source code looks as follows:

import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;

import java.util.List;

class RESTStudentReader implements ItemReader<StudentDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextStudentIndex;
    private List<StudentDTO> studentData;
}

Third, you have to add a constructor to the RESTStudentReader class and implement it by following these steps:

  1. Ensure that the constructor takes the URL of the invoked REST API and a RestTemplate object as constructor arguments.
  2. Implement the constructor by storing its constructor arguments in the fields of the created object. Set the value of the nextStudentIndex field to 0.

After you have implemented the constructor, the source code of the RESTStudentReader class looks as follows:

import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;

import java.util.List;

class RESTStudentReader implements ItemReader<StudentDTO> {
    
    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextStudentIndex;
    private List<StudentDTO> studentData;

    RESTStudentReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextStudentIndex = 0;
    }
}

Fourth, you have to add a public read() method to the RESTStudentReader class and specify that the method returns a StudentDTO object. Also, you must ensure that this method can throw an Exception. After you have added this method to the RESTStudentReader class, you have to implement it by following these rules:

  • If the student information hasn't been read, read the student information by invoking the REST API.
  • If the next student is found, return the found StudentDTO object and increase the value of the nextStudentIndex field (the index of the next student) by 1.
  • If the next student isn't found, return null. Ensure that your ItemReader reads the input data from the REST API when its read() method is invoked for the next time (set the value of the nextStudentIndex field to 0, and set the value of the studentData field to null).

After you have implemented the RESTStudentReader class, its source code looks as follows:

import org.springframework.batch.item.ItemReader;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

import java.util.Arrays;
import java.util.List;

class RESTStudentReader implements ItemReader<StudentDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextStudentIndex;
    private List<StudentDTO> studentData;

    RESTStudentReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextStudentIndex = 0;
    }

    @Override
    public StudentDTO read() throws Exception {
        if (studentDataIsNotInitialized()) {
            studentData = fetchStudentDataFromAPI();
        }

        StudentDTO nextStudent = null;

        if (nextStudentIndex < studentData.size()) {
            nextStudent = studentData.get(nextStudentIndex);
            nextStudentIndex++;
        }
        else {
            nextStudentIndex = 0;
            studentData = null;
        }

        return nextStudent;
    }

    private boolean studentDataIsNotInitialized() {
        return this.studentData == null;
    }

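    private List<StudentDTO> fetchStudentDataFromAPI() {
        ResponseEntity<StudentDTO[]> response = restTemplate.getForEntity(apiUrl,
                StudentDTO[].class
        );
        StudentDTO[] studentData = response.getBody();
        // getBody() returns null when the response has no body; treat that as "no students".
        if (studentData == null) {
            return java.util.Collections.emptyList();
        }
        return Arrays.asList(studentData);
    }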
}
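
The read cycle described above can be illustrated with a minimal sketch that runs on a plain JDK. It is not the RESTStudentReader itself: the ListBackedReader class, its Supplier-based data source, and the drain() helper are hypothetical stand-ins (assumptions made so that the example needs neither Spring nor a running API). The sketch shows that read() returns one item per call, returns null when the input is exhausted, and fetches the data again when it is invoked after that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class ReadCycleSketch {

    // Hypothetical stand-in for RESTStudentReader: the same index and reset
    // logic, but the data comes from a Supplier instead of a RestTemplate call.
    static class ListBackedReader {

        private final Supplier<List<String>> source;
        private int nextIndex = 0;
        private List<String> data;
        private int fetchCount = 0;

        ListBackedReader(Supplier<List<String>> source) {
            this.source = source;
        }

        String read() {
            if (data == null) {
                // Plays the role of fetchStudentDataFromAPI().
                data = source.get();
                fetchCount++;
            }

            String next = null;
            if (nextIndex < data.size()) {
                next = data.get(nextIndex);
                nextIndex++;
            } else {
                // End of input: reset so that the next read() fetches again.
                nextIndex = 0;
                data = null;
            }
            return next;
        }

        int getFetchCount() {
            return fetchCount;
        }
    }

    // Mimics the caller of an ItemReader: invoke read() until it returns null.
    static List<String> drain(ListBackedReader reader) {
        List<String> items = new ArrayList<>();
        String item;
        while ((item = reader.read()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        ListBackedReader reader = new ListBackedReader(
                () -> List.of("Tony Tester", "Nick Newbie", "Ian Intermediate"));

        System.out.println(drain(reader));          // first pass over the data
        System.out.println(reader.getFetchCount()); // 1
        System.out.println(drain(reader));          // second pass triggers a new fetch
        System.out.println(reader.getFetchCount()); // 2
    }
}
```

Because the reader resets its state after it returns null, the same bean can serve a later run of the job, at the cost of one extra call to the data source per run.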

Before you can use your new ItemReader, you have to configure the RestTemplate bean. Let's move on and find out how you can configure this bean.

Configuring the RestTemplate Bean

You can configure the RestTemplate bean by following these steps:

  1. Add a public restTemplate() method to your application context configuration class. Ensure that the restTemplate() method returns a RestTemplate object and annotate it with the @Bean annotation.
  2. Implement the restTemplate() method by returning a new RestTemplate object.

If you use Spring Framework, the source code of your application context configuration class looks as follows:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;
 
@Configuration
public class SpringBatchExampleContext {
 
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

If you use Spring Boot, you can also add the restTemplate() method to your application class which is annotated with the @SpringBootApplication annotation. The source code of this class looks as follows:

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling
public class SpringBatchExampleApplication {

    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchExampleApplication.class, args);
    }
}

You might have to add some additional dependencies to the classpath before you can configure the RestTemplate bean. These dependencies are described in the following:

  • If you are using Spring Framework, you have to add the spring-web dependency to the classpath (the spring-webmvc dependency also works because it depends on spring-web).
  • If you are using Spring Boot, you have to add the spring-boot-starter-web dependency to the classpath.

After you have configured the RestTemplate bean, you can finally configure your ItemReader bean.

Configuring the ItemReader Bean

You can configure the ItemReader bean by following these steps:

First, you have to create a new configuration class. After you have created this class, its source code looks as follows:

import org.springframework.context.annotation.Configuration;

@Configuration
public class SpringBatchExampleJobConfig {

}

Second, you have to create a new method that configures your ItemReader bean. This method returns an ItemReader<StudentDTO> object, and it takes an Environment object and a RestTemplate object as method parameters.

After you have added this method to your configuration class, its source code looks as follows:

import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.web.client.RestTemplate;

@Configuration
public class SpringBatchExampleJobConfig {

    @Bean
    public ItemReader<StudentDTO> itemReader(Environment environment,
                                             RestTemplate restTemplate) {

    }
}

Third, you have to implement the itemReader() method by returning a new RESTStudentReader object. When you create a new RESTStudentReader object, you have to pass the following objects as constructor arguments:

  • The URL of the invoked REST API. You can read this information from a properties file by using the Environment object given as a method parameter.
  • The RestTemplate object which is used to query the student information from the invoked REST API.

After you have implemented the itemReader() method, the source code of your configuration class looks as follows:

import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.web.client.RestTemplate;

@Configuration
public class SpringBatchExampleJobConfig {

    @Bean
    public ItemReader<StudentDTO> itemReader(Environment environment,
                                             RestTemplate restTemplate) {
        return new RESTStudentReader(environment.getRequiredProperty("rest.api.url"),
                restTemplate
        );
    }
}

You can now write a custom ItemReader which reads the input data of your batch job from a REST API. Let’s summarize what you learned from this blog post.

Summary

This blog post has taught you two things:

  • Spring Batch doesn’t have an ItemReader that can read information from a REST API.
  • If you want to read the input data of your batch job from a REST API, you can read this information by using the RestTemplate class.

The next part of this tutorial describes how you can read the input data of your batch job from an Excel spreadsheet.

P.S. You can get the example application of this blog post from GitHub.

29 comments
  • cici Aug 8, 2016 @ 21:41

    This was a great tutorial on how to use a REST API as the source of my data for Spring Batch. I am struggling with the best way to unit test it though. Do you have any suggestions or additions for this example?

    Thanks!

    • Petri Aug 8, 2016 @ 21:49

      Hi,

      You could move the logic that reads the input data to a separate component (I will call this component RestInputReader) and inject that component to the Spring Batch reader. This gives you the possibility to replace the RestInputReader with a stub when you are writing unit tests for the Spring Batch reader.

      I wouldn't write unit tests for the RestInputReader because it doesn't make much sense to replace RestTemplate with a test double because RestTemplate does all the work. I would test it by writing integration tests. If the REST API is an external API, I would replace it with a simple stub.

      If you have any additional questions, don't hesitate to ask them.

    • Santosh Oct 16, 2019 @ 16:52

      Can you let me know after implementing this code, how do we execute this ? Are you implementing a tasklet ?

      • Santosh Oct 16, 2019 @ 17:05

        Also, why are we naming the method `launchXmlFileToDatabaseJob()` in the Launcher ? wasn't this supposed to be data coming from REST API ?

        • Petri Oct 22, 2019 @ 18:32

          Good catch. I guess that I simply forgot to change the name of the method after I copied it from another launcher class.

  • Senthil Sep 30, 2016 @ 10:09

    I have gone through Spring Batch Tutorial: Reading Information From a REST API post. It is very useful. Can please let me know how to run this project?

    • Petri Sep 30, 2016 @ 12:27

      Hi,

      Sure. Just clone this repository, select the example you want to run (Spring or Spring Boot), and run the command described in the README.

      • devendra Oct 8, 2016 @ 13:08

        Hi Petri,

        How to read XML file using custom itemreader,itemprocessor write into database using itemwriter.

        Pls can u share code for above one

        Waiting for your response
        Advanced thank

        Dandu
        8553562402

        • Petri Oct 11, 2016 @ 20:08

          Hi,

          Unfortunately I cannot cover this topic in a single answer. However, you can find my Spring Batch tutorial and a few other great resources from my Spring Batch resource page. Also, this blog post explains how you can read data from XML file and write it to a database.

  • Florent Dec 25, 2016 @ 9:12

    Hi Petri,
    Thank you for this great post. I have approximately the same use case than you. But my problem is than my rest service is using a database containing more than 150.000 entries. So I am afraid than my memory is not big enough. How can I handle that problem?

    • Petri Dec 25, 2016 @ 15:02

      Hi Florent,

      I agree that it's not a good idea to load the content of your database into memory. I would probably implement an API that supports pagination and write an ItemReader that reads the data one page at a time.

      If you don't know how you can do it, don't hesitate to ask additional questions.
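
The page-at-a-time idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the example application: the PagedReader class and the slice() helper are hypothetical, the pageFetcher function stands in for a REST call that takes a page number, and an empty page is assumed to signal the end of the data. Unlike the RESTStudentReader of this blog post, this sketch does not reset itself after the input is exhausted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class PagedReaderSketch {

    // Hypothetical reader that keeps only one page in memory at a time.
    static class PagedReader<T> {

        private final IntFunction<List<T>> pageFetcher;
        private List<T> currentPage;
        private int nextPage = 0;
        private int nextIndex = 0;
        private boolean exhausted = false;

        PagedReader(IntFunction<List<T>> pageFetcher) {
            this.pageFetcher = pageFetcher;
        }

        T read() {
            if (exhausted) {
                return null;
            }
            if (currentPage == null || nextIndex >= currentPage.size()) {
                // The current page is used up (or nothing was fetched yet).
                currentPage = pageFetcher.apply(nextPage++);
                nextIndex = 0;
                if (currentPage == null || currentPage.isEmpty()) {
                    // Assumption: an empty page means that there is no more data.
                    exhausted = true;
                    currentPage = null;
                    return null;
                }
            }
            return currentPage.get(nextIndex++);
        }
    }

    // Plays the role of the paginated API by slicing an in-memory list.
    static List<Integer> slice(List<Integer> all, int page, int pageSize) {
        int from = Math.min(page * pageSize, all.size());
        int to = Math.min(from + pageSize, all.size());
        return all.subList(from, to);
    }

    public static void main(String[] args) {
        List<Integer> all = List.of(1, 2, 3, 4, 5);
        PagedReader<Integer> reader = new PagedReader<>(page -> slice(all, page, 2));

        List<Integer> read = new ArrayList<>();
        Integer item;
        while ((item = reader.read()) != null) {
            read.add(item);
        }
        System.out.println(read); // [1, 2, 3, 4, 5]
    }
}
```

In a real reader the pageFetcher would invoke the REST API with the page number (and probably a page size) as request parameters, so the memory footprint stays at one page regardless of how many entries the API exposes.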

      • amar Jun 12, 2020 @ 15:59

        can you give me link or code or blog , where some example for reading data with API + Pagination

        • Petri Jun 26, 2020 @ 10:23

          Unfortunately, I don't know if such a blog post exists (I didn't find anything interesting from Google). I have added this to my to-do list, and maybe I will write this blog post in the future.

  • Claudio Mar 31, 2017 @ 18:46

    Thanks Petri, it helped me a lot your example.

    • Petri Apr 3, 2017 @ 22:31

      You are welcome. I am happy to hear that this blog post was useful to you.

  • Sherin May 1, 2017 @ 17:43

    Petri
    The ItemReader you have is stateful. Do you think it is thread-safe?

    ~Sherin

    • Sherin May 1, 2017 @ 17:49

      By the way it is a great example

      ~Sherin K Syriac

      • Petri May 1, 2017 @ 23:46

        Thank you for your kind words. I really appreciate them. About your question:

        Yes, my ItemReader is stateful, and Spring Batch assumes that this is the case. The Javadoc of the ItemReader interface states that:

        Implementations are expected to be stateful and will be called multiple times for each batch, with each call to read() returning a different value and finally returning null when all input data is exhausted.

        Implementations need not be thread-safe and clients of a ItemReader need to be aware that this is the case.

  • Raju Jun 21, 2018 @ 21:37

    This is nice explanation ,keep it up

    • Petri Jun 24, 2018 @ 21:09

      Thank you for your kind words. I really appreciate them.

  • Chaitanya Jul 23, 2018 @ 9:53

    Hi Petri,
    This is an awesome explanation. Between, I have a requirement to pull data from external REST API which returns data in .csv files.
    I have a few such files ex: user.csv, user_purchases.csv etc. Requirement is, to map data in files based on user id and push accordingly to the database.
    Please suggest an approach to handle this problem.

    ~Chaitanya

  • Efrén Jan 28, 2019 @ 13:59

    Hi Petri,
    thanks for the article.
    I have a question: is a bad practice to consume a web service (rest or soap) for each item in batch processing with the goal to send data to a new application ? for this case, is a better idea to use a JMS queue ?

    • Petri Jan 28, 2019 @ 18:51

      Hi Efrén,

      If you need a "real time" integration, it's a good idea to use a JMS queue or you can simply implement a REST API that is invoked by the service which sends data to the target system. On the other hand, if it's OK that the data is transferred at a specific time (02:00 at night, once per hour, and so on), it's a good idea to use Spring Batch (or you can write the batch job yourself).

  • tsrini30 Aug 11, 2021 @ 16:46

    Hi. Very Good One.
    But I have one clarification how to save the json data in the database using JPA repository.
    Can you help me on this.

  • Prashant Singh Aug 17, 2022 @ 20:13

    Very clear and useful information on customer ItemReader with Rest api call .
    Thanks a lot .

  • Prashant Aug 18, 2022 @ 12:56

    But how to handle the case if Rest api is returning 100,000 or more Student data .
    Any suggestion for this ?
    Thanks

  • Prashant Aug 18, 2022 @ 14:50

    How can we handle large number of data while reading through Rest api ?

  • Abhaya Oct 23, 2022 @ 13:57

    Hi Petri,

    I have a scenario like suppose i have 2 table A and B
    List<A> listA="select * from A"
    for(A a: listA){
    List<B> listB ="select * from B where col1=a.x;
    for(B b : listB){
    if(some condition)
    //update Table B
    }
    //Update Table A
    }

    For above scenario how i can do by using spring Batch.Please help me
