Spring Batch has good support for reading data from different data sources such as files (CSV or XML) or databases. However, it doesn’t have built-in support for reading input data from a REST API. If you want to use a REST API as the data source of your Spring Batch job, you have to implement a custom ItemReader which reads the input data from the REST API.
This blog post describes how you can implement your custom ItemReader. After you have read this blog post, you:
- Understand how you can implement an ItemReader which reads the input data of your batch job by using the RestTemplate class.
- Know how you can configure the ItemReader bean which provides the input data for your batch job.
Let's begin.
This blog post assumes that:
- You are familiar with Spring Batch
- You can get the required dependencies with Maven or Gradle
- You can create a custom ItemReader
Introduction to the Example Application
During this blog post you will implement an ItemReader which reads the input data of your Spring Batch job from a REST API endpoint that processes GET requests sent to the path: '/api/student/'. This API endpoint returns the information of all students who are enrolled in an online course. To be more specific, your API endpoint returns the following JSON document:
[
    {
        "emailAddress": "tony.tester@gmail.com",
        "name": "Tony Tester",
        "purchasedPackage": "master"
    },
    {
        "emailAddress": "nick.newbie@gmail.com",
        "name": "Nick Newbie",
        "purchasedPackage": "starter"
    },
    {
        "emailAddress": "ian.intermediate@gmail.com",
        "name": "Ian Intermediate",
        "purchasedPackage": "intermediate"
    }
]
You have to transform the returned JSON document into StudentDTO objects which are processed by your batch job. The StudentDTO class contains the information of a single student, and its source code looks as follows:
public class StudentDTO {
    private String emailAddress;
    private String name;
    private String purchasedPackage;
    public StudentDTO() {}
    public String getEmailAddress() {
        return emailAddress;
    }
    public String getName() {
        return name;
    }
    public String getPurchasedPackage() {
        return purchasedPackage;
    }
    public void setEmailAddress(String emailAddress) {
        this.emailAddress = emailAddress;
    }
    public void setName(String name) {
        this.name = name;
    }
    public void setPurchasedPackage(String purchasedPackage) {
        this.purchasedPackage = purchasedPackage;
    }
}
Next, you will implement a custom ItemReader which reads the input data of your batch job from the described API endpoint.
Implementing Your Custom ItemReader
You can implement your custom ItemReader by following these steps:
First, you have to create a new class (RESTStudentReader) and implement the ItemReader interface. When you implement the ItemReader interface, you must set the type of the returned object to StudentDTO.
After you have created your ItemReader class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
class RESTStudentReader implements ItemReader<StudentDTO> {
}
Second, you have to add the following private fields to the RESTStudentReader class:
- The final apiUrl field contains the URL of the invoked REST API.
- The final restTemplate field contains a reference to the RestTemplate object which you use when you read the student information.
- The nextStudentIndex field contains the index of the next StudentDTO object.
- The studentData field contains the found StudentDTO objects.
After you have added these fields to the RESTStudentReader class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;
import java.util.List;
class RESTStudentReader implements ItemReader<StudentDTO> {
    private final String apiUrl;
    private final RestTemplate restTemplate;
    private int nextStudentIndex;
    private List<StudentDTO> studentData;
}
Third, you have to add a constructor to the RESTStudentReader class and implement it by following these steps:
- Ensure that the constructor takes the URL of the invoked REST API and a RestTemplate object as constructor arguments.
- Implement the constructor by storing its constructor arguments in the fields of the created object. Set the value of the nextStudentIndex field to 0.
After you have implemented the constructor, the source code of the RESTStudentReader class looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;
import java.util.List;
class RESTStudentReader implements ItemReader<StudentDTO> {
    private final String apiUrl;
    private final RestTemplate restTemplate;
    private int nextStudentIndex;
    private List<StudentDTO> studentData;
    RESTStudentReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextStudentIndex = 0;
    }
}
Fourth, you have to add a public read() method to the RESTStudentReader class and specify that the method returns a StudentDTO object. Also, you must ensure that this method can throw an Exception. After you have added this method to the RESTStudentReader class, you have to implement it by following these rules:
- If the student information hasn't been read, read the student information by invoking the REST API.
- If the next student is found, return the found StudentDTO object and increase the value of the nextStudentIndex field (the index of the next student) by 1.
- If the next student isn't found, return null. Ensure that your ItemReader reads the input data from the REST API when its read() method is invoked for the next time (set the value of the nextStudentIndex field to 0, and set the value of the studentData field to null).
After you have implemented the RESTStudentReader class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
class RESTStudentReader implements ItemReader<StudentDTO> {
    private final String apiUrl;
    private final RestTemplate restTemplate;
    private int nextStudentIndex;
    private List<StudentDTO> studentData;
    RESTStudentReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextStudentIndex = 0;
    }
    @Override
    public StudentDTO read() throws Exception {
        // Fetch the student information lazily when it hasn't been read yet.
        if (studentDataIsNotInitialized()) {
            studentData = fetchStudentDataFromAPI();
        }
        StudentDTO nextStudent = null;
        if (nextStudentIndex < studentData.size()) {
            nextStudent = studentData.get(nextStudentIndex);
            nextStudentIndex++;
        }
        else {
            // All students have been returned: reset the state so that the
            // next invocation reads the input data from the REST API again.
            nextStudentIndex = 0;
            studentData = null;
        }
        return nextStudent;
    }
    private boolean studentDataIsNotInitialized() {
        return this.studentData == null;
    }
    private List<StudentDTO> fetchStudentDataFromAPI() {
        ResponseEntity<StudentDTO[]> response = restTemplate.getForEntity(apiUrl,
                StudentDTO[].class
        );
        StudentDTO[] students = response.getBody();
        // Arrays.asList() throws a NullPointerException if the response has no body.
        return students == null ? Collections.<StudentDTO>emptyList() : Arrays.asList(students);
    }
}
Before you can use your new ItemReader, you have to configure the RestTemplate bean. Let's move on and find out how you can configure this bean.
Configuring the RestTemplate Bean
You can configure the RestTemplate bean by following these steps:
- Add a public restTemplate() method to your application context configuration class. Ensure that the restTemplate() method returns a RestTemplate object and annotate it with the @Bean annotation.
- Implement the restTemplate() method by returning a new RestTemplate object.
If you use Spring Framework, the source code of your application context configuration class looks as follows:
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;
@Configuration
public class SpringBatchExampleContext {
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}
If you use Spring Boot, you can also add the restTemplate() method to your application class which is annotated with the @SpringBootApplication annotation. The source code of this class looks as follows:
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.web.client.RestTemplate;
@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling
public class SpringBatchExampleApplication {
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
    public static void main(String[] args) {
        SpringApplication.run(SpringBatchExampleApplication.class, args);
    }
}
Note that you might have to add additional dependencies to the classpath before you can configure the RestTemplate bean. These dependencies are described in the following:
- If you are using Spring Framework, you have to add the spring-webmvc dependency to the classpath.
- If you are using Spring Boot, you have to add the spring-boot-starter-web dependency to the classpath.
After you have configured the RestTemplate bean, you can finally configure your ItemReader bean.
Configuring the ItemReader Bean
You can configure the ItemReader bean by following these steps:
First, you have to create a new configuration class. After you have created this class, its source code looks as follows:
import org.springframework.context.annotation.Configuration;
@Configuration
public class SpringBatchExampleJobConfig {
}
Second, you have to create a new method that configures your ItemReader bean. This method returns an ItemReader<StudentDTO> object, and it takes an Environment object and a RestTemplate object as method parameters.
After you have added this method to your configuration class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.web.client.RestTemplate;
@Configuration
public class SpringBatchExampleJobConfig {
    @Bean
    public ItemReader<StudentDTO> itemReader(Environment environment,
                                             RestTemplate restTemplate) {
    }
}
Third, you have to implement the itemReader() method by returning a new RESTStudentReader object. When you create a new RESTStudentReader object, you have to pass the following objects as constructor arguments:
- The URL of the invoked REST API. You can read this information from a properties file by using the Environment object given as a method parameter.
- The RestTemplate object which is used to query the student information from the invoked REST API.
After you have implemented the itemReader() method, the source code of your configuration class looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.web.client.RestTemplate;
@Configuration
public class SpringBatchExampleJobConfig {
    @Bean
    public ItemReader<StudentDTO> itemReader(Environment environment,
                                             RestTemplate restTemplate) {
        return new RESTStudentReader(environment.getRequiredProperty("rest.api.url"),
                restTemplate
        );
    }
}
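The itemReader() method reads the URL of the REST API from the rest.api.url property, so this property has to be available to the Environment. A minimal sketch of a properties file entry, assuming that the API runs on a hypothetical local host and port (only the /api/student/ path comes from this blog post):

```properties
# The host and port are placeholders; replace them with the address of your own API.
rest.api.url=http://localhost:8080/api/student/
```

If you use Spring Boot, a file such as application.properties on the classpath is a common place for this entry. If you use plain Spring Framework, you typically have to register the properties file yourself, for example with the @PropertySource annotation.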
You can now write a custom ItemReader which reads the input data of your batch job from a REST API. Let’s summarize what you learned from this blog post.
Summary
This blog post has taught you two things:
- Spring Batch doesn’t have an ItemReader that can read information from a REST API.
- If you want to read the input data of your batch job from a REST API, you can read this information by using the RestTemplate class.
The next part of this tutorial describes how you can read the input data of your batch job from an Excel spreadsheet.
P.S. You can get the example application of this blog post from GitHub.
This was a great tutorial on how to use a REST API as the source of my data for Spring Batch. I am struggling with the best way to unit test it though. Do you have any suggestions or additions for this example?
Thanks!
Hi,
You could move the logic that reads the input data to a separate component (I will call this component RestInputReader) and inject that component to the Spring Batch reader. This gives you the possibility to replace the RestInputReader with a stub when you are writing unit tests for the Spring Batch reader.
I wouldn't write unit tests for the RestInputReader because it doesn't make much sense to replace RestTemplate with a test double because RestTemplate does all the work. I would test it by writing integration tests. If the REST API is an external API, I would replace it with a simple stub.
If you have any additional questions, don't hesitate to ask them.
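The suggestion above could be sketched in plain Java roughly as follows. This is only an illustration, not code from the example application: the RestInputReader name comes from the reply, while its readAll() method and the DelegatingReader class are hypothetical. To keep the sketch free of Spring dependencies, DelegatingReader doesn't implement the ItemReader interface; in a real batch job it would.

```java
import java.util.List;

// Hypothetical component that hides the HTTP call. The production
// implementation would use RestTemplate; a test can pass a stub instead.
interface RestInputReader<T> {
    List<T> readAll();
}

// Same traversal logic as RESTStudentReader, but the data source is injected.
// In a real batch job this class would also implement ItemReader<T>.
class DelegatingReader<T> {
    private final RestInputReader<T> inputReader;
    private int nextIndex = 0;
    private List<T> data;

    DelegatingReader(RestInputReader<T> inputReader) {
        this.inputReader = inputReader;
    }

    public T read() {
        // Load the input data lazily on the first invocation.
        if (data == null) {
            data = inputReader.readAll();
        }
        if (nextIndex < data.size()) {
            return data.get(nextIndex++);
        }
        // Input exhausted: reset the state so that the next invocation
        // asks the injected component for fresh data.
        nextIndex = 0;
        data = null;
        return null;
    }
}
```

Because RestInputReader has a single method, a unit test can replace it with a lambda that returns a fixed list and then check that read() returns the items in order, followed by null.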
Can you let me know how we execute this after implementing the code? Are you implementing a tasklet?
Also, why is the method named `launchXmlFileToDatabaseJob()` in the launcher? Wasn't this supposed to be data coming from a REST API?
Good catch. I guess that I simply forgot to change the name of the method after I copied it from another launcher class.
I have gone through the Spring Batch Tutorial: Reading Information From a REST API post. It is very useful. Can you please let me know how to run this project?
Hi,
Sure. Just clone this repository, select the example you want to run (Spring or Spring Boot), and run the command described in the README.
Hi Petri,
How can I read an XML file using a custom ItemReader and ItemProcessor, and write it into a database using an ItemWriter?
Could you please share the code for this?
Waiting for your response.
Thanks in advance,
Dandu
Hi,
Unfortunately I cannot cover this topic in a single answer. However, you can find my Spring Batch tutorial and a few other great resources from my Spring Batch resource page. Also, this blog post explains how you can read data from XML file and write it to a database.
Hi Petri,
Thank you for this great post. I have approximately the same use case as you. But my problem is that my REST service is using a database containing more than 150.000 entries. So I am afraid that my memory is not big enough. How can I handle that problem?
Hi Florent,
I agree that it's not a good idea to load the content of your database into memory. I would probably implement an API that supports pagination and write an ItemReader that reads the data one page at a time.
If you don't know how you can do it, don't hesitate to ask additional questions.
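A page-at-a-time reader could look roughly like the following sketch. It is not part of the example application, and it makes several assumptions: the PageFetcher interface and the PagingReader class are hypothetical names, pages are numbered from 0, and the API signals the end of the data by returning an empty page. The HTTP call itself is left behind the PageFetcher interface, and the class doesn't implement the ItemReader interface so that the sketch has no Spring dependencies.

```java
import java.util.List;

// Hypothetical abstraction over a paginated endpoint: returns the items of
// the requested page (0-based), or an empty list when no pages are left.
interface PageFetcher<T> {
    List<T> fetchPage(int pageNumber);
}

// Keeps only one page in memory at a time.
// In a real batch job this class would also implement ItemReader<T>.
class PagingReader<T> {
    private final PageFetcher<T> fetcher;
    private List<T> currentPage;
    private int nextPageNumber = 0;
    private int nextIndexInPage = 0;
    private boolean exhausted = false;

    PagingReader(PageFetcher<T> fetcher) {
        this.fetcher = fetcher;
    }

    public T read() {
        if (exhausted) {
            return null;
        }
        // Fetch the next page when there is no current page or it is used up.
        if (currentPage == null || nextIndexInPage >= currentPage.size()) {
            currentPage = fetcher.fetchPage(nextPageNumber++);
            nextIndexInPage = 0;
            if (currentPage == null || currentPage.isEmpty()) {
                // No more pages: signal the end of the input data.
                exhausted = true;
                return null;
            }
        }
        return currentPage.get(nextIndexInPage++);
    }
}
```

Unlike RESTStudentReader, this sketch doesn't reset itself after it has returned null, so a new PagingReader instance is needed for every job execution. Whether that is the right choice depends on how the reader bean is scoped.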
Can you give me a link, code, or a blog post with an example of reading data from an API with pagination?
Unfortunately, I don't know if such a blog post exists (I didn't find anything interesting from Google). I have added this to my to-do list, and maybe I will write this blog post in the future.
Thanks Petri, your example helped me a lot.
You are welcome. I am happy to hear that this blog post was useful to you.
Petri
The ItemReader you have is stateful. Do you think it is thread-safe?
~Sherin
By the way it is a great example
~Sherin K Syriac
Thank you for your kind words. I really appreciate them. About your question:
Yes, my ItemReader is stateful, and Spring Batch assumes that this is the case. The Javadoc of the ItemReader interface states that:
This is a nice explanation, keep it up.
Thank you for your kind words. I really appreciate them.
Hi Petri,
This is an awesome explanation. By the way, I have a requirement to pull data from an external REST API which returns data in .csv files.
I have a few such files, e.g. user.csv, user_purchases.csv, etc. The requirement is to map the data in the files based on user id and push it accordingly to the database.
Please suggest an approach to handle this problem.
~Chaitanya
Hi Petri,
thanks for the article.
I have a question: is it a bad practice to consume a web service (REST or SOAP) for each item in batch processing with the goal to send data to a new application? In this case, is it a better idea to use a JMS queue?
Hi Efrén,
If you need a "real time" integration, it's a good idea to use a JMS queue or you can simply implement a REST API that is invoked by the service which sends data to the target system. On the other hand, if it's OK that the data is transferred at a specific time (02:00 at night, once per hour, and so on), it's a good idea to use Spring Batch (or you can write the batch job yourself).
Hi. Very Good One.
But I need one clarification: how can I save the JSON data in the database using a JPA repository?
Can you help me with this?
Very clear and useful information on a custom ItemReader with a REST API call.
Thanks a lot.
But how can I handle the case where the REST API returns the data of 100,000 or more students?
Any suggestions for this?
Thanks
How can we handle a large amount of data while reading through a REST API?
Hi Petri,
I have a scenario like this. Suppose I have two tables, A and B:
List<A> listA = "select * from A";
for (A a : listA) {
    List<B> listB = "select * from B where col1 = a.x";
    for (B b : listB) {
        if (some condition) {
            // update table B
        }
    }
    // update table A
}
How can I implement the above scenario by using Spring Batch? Please help me.