Spring Batch has good support for reading data from different data sources such as files (CSV or XML) or databases. However, it doesn't have built-in support for reading input data from a REST API. If you want to use a REST API as the data source of your Spring Batch job, you have to implement a custom ItemReader which reads the input data from the REST API.
This blog post describes how you can implement your custom ItemReader. After you have read this blog post, you:
- Understand how you can implement an ItemReader which reads the input data of your batch job by using the RestTemplate class.
- Know how you can configure the ItemReader bean which provides the input data for your batch job.
Let's begin.
This blog post assumes that:
- You are familiar with Spring Batch
- You can get the required dependencies with Maven or Gradle
- You can create a custom ItemReader
Introduction to the Example Application
During this blog post you will implement an ItemReader which reads the input data of your Spring Batch job from a REST API endpoint that processes GET requests sent to the path: '/api/student/'. This API endpoint returns the information of all students who are enrolled in an online course. To be more specific, your API endpoint returns the following JSON document:
[
    {
        "emailAddress": "tony.tester@gmail.com",
        "name": "Tony Tester",
        "purchasedPackage": "master"
    },
    {
        "emailAddress": "nick.newbie@gmail.com",
        "name": "Nick Newbie",
        "purchasedPackage": "starter"
    },
    {
        "emailAddress": "ian.intermediate@gmail.com",
        "name": "Ian Intermediate",
        "purchasedPackage": "intermediate"
    }
]
You have to transform the returned JSON document into StudentDTO objects which are processed by your batch job. The StudentDTO class contains the information of a single student, and its source code looks as follows:
public class StudentDTO {

    private String emailAddress;
    private String name;
    private String purchasedPackage;

    public StudentDTO() {}

    public String getEmailAddress() {
        return emailAddress;
    }

    public String getName() {
        return name;
    }

    public String getPurchasedPackage() {
        return purchasedPackage;
    }

    public void setEmailAddress(String emailAddress) {
        this.emailAddress = emailAddress;
    }

    public void setName(String name) {
        this.name = name;
    }

    public void setPurchasedPackage(String purchasedPackage) {
        this.purchasedPackage = purchasedPackage;
    }
}
Next, you will implement a custom ItemReader which reads the input data of your batch job from the described API endpoint.
Implementing Your Custom ItemReader
You can implement your custom ItemReader by following these steps:
First, you have to create a new class (RESTStudentReader) and implement the ItemReader interface. When you implement the ItemReader interface, you must set the type of the returned object to StudentDTO.
After you have created your ItemReader class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;

class RESTStudentReader implements ItemReader<StudentDTO> {

}
Second, you have to add the following private fields to the RESTStudentReader class:
- The final apiUrl field contains the URL of the invoked REST API.
- The final restTemplate field contains a reference to the RestTemplate object which you use when you read the student information.
- The nextStudentIndex field contains the index of the next StudentDTO object.
- The studentData field contains the found StudentDTO objects.
After you have added these fields to the RESTStudentReader class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;

import java.util.List;

class RESTStudentReader implements ItemReader<StudentDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextStudentIndex;
    private List<StudentDTO> studentData;
}
Third, you have to add a constructor to the RESTStudentReader class and implement it by following these steps:
- Ensure that the constructor takes the URL of the invoked REST API and a RestTemplate object as constructor arguments.
- Implement the constructor by storing its constructor arguments in the fields of the created object. Set the value of the nextStudentIndex field to 0.
After you have implemented the constructor, the source code of the RESTStudentReader class looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.web.client.RestTemplate;

import java.util.List;

class RESTStudentReader implements ItemReader<StudentDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextStudentIndex;
    private List<StudentDTO> studentData;

    RESTStudentReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextStudentIndex = 0;
    }
}
Fourth, you have to add a public read() method to the RESTStudentReader class and specify that the method returns a StudentDTO object. Also, you must ensure that this method can throw an Exception. After you have added this method to the RESTStudentReader class, you have to implement it by following these rules:
- If the student information hasn't been read, read the student information by invoking the REST API.
- If the next student is found, return the found StudentDTO object and increase the value of the nextStudentIndex field (the index of the next student) by 1.
- If the next student isn't found, return null. Ensure that your ItemReader reads the input data from the REST API when its read() method is invoked the next time (set the value of the nextStudentIndex field to 0, and set the value of the studentData field to null).
After you have implemented the RESTStudentReader class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.http.ResponseEntity;
import org.springframework.web.client.RestTemplate;

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RESTStudentReader implements ItemReader<StudentDTO> {

    private final String apiUrl;
    private final RestTemplate restTemplate;

    private int nextStudentIndex;
    private List<StudentDTO> studentData;

    RESTStudentReader(String apiUrl, RestTemplate restTemplate) {
        this.apiUrl = apiUrl;
        this.restTemplate = restTemplate;
        nextStudentIndex = 0;
    }

    @Override
    public StudentDTO read() throws Exception {
        // Fetch the student data lazily when read() is invoked for the first time.
        if (studentDataIsNotInitialized()) {
            studentData = fetchStudentDataFromAPI();
        }

        StudentDTO nextStudent = null;

        if (nextStudentIndex < studentData.size()) {
            nextStudent = studentData.get(nextStudentIndex);
            nextStudentIndex++;
        }
        else {
            // All students have been returned: reset the state so that
            // the next invocation queries the REST API again.
            nextStudentIndex = 0;
            studentData = null;
        }

        return nextStudent;
    }

    private boolean studentDataIsNotInitialized() {
        return this.studentData == null;
    }

    private List<StudentDTO> fetchStudentDataFromAPI() {
        ResponseEntity<StudentDTO[]> response = restTemplate.getForEntity(
                apiUrl,
                StudentDTO[].class
        );
        // getBody() can return null when the response has no body.
        StudentDTO[] body = response.getBody();
        return body == null ? Collections.emptyList() : Arrays.asList(body);
    }
}
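The rules above boil down to a small state machine: fetch lazily, hand out one item per call, and return null (and reset) when the data runs out. The following is a minimal sketch of that contract which runs without Spring. The SimpleReader interface, the ListBackedReader class, and the ReadLoopDemo.drain() helper are hypothetical stand-ins invented for this illustration (they aren't part of Spring Batch), and the REST call is replaced by a plain Supplier:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for Spring Batch's ItemReader so that the sketch runs without Spring.
interface SimpleReader<T> {
    T read();
}

// Follows the same rules as RESTStudentReader; the REST call is replaced by a Supplier.
class ListBackedReader<T> implements SimpleReader<T> {

    private final Supplier<List<T>> source;
    private int nextIndex = 0;
    private List<T> data;

    ListBackedReader(Supplier<List<T>> source) {
        this.source = source;
    }

    @Override
    public T read() {
        if (data == null) {
            data = source.get(); // fetched lazily on the first call
        }
        if (nextIndex < data.size()) {
            return data.get(nextIndex++);
        }
        // Input exhausted: reset so that the next call fetches the data again.
        nextIndex = 0;
        data = null;
        return null;
    }
}

class ReadLoopDemo {

    // Calls read() until it returns null, roughly like a chunk-oriented step does.
    static <T> List<T> drain(SimpleReader<T> reader) {
        List<T> items = new ArrayList<>();
        T item;
        while ((item = reader.read()) != null) {
            items.add(item);
        }
        return items;
    }
}
```

Draining the reader twice invokes the supplier twice, which mirrors how RESTStudentReader queries the REST API again on the next job run after it has returned null.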
Before you can use your new ItemReader, you have to configure the RestTemplate bean. Let's move on and find out how you can configure this bean.
Configuring the RestTemplate Bean
You can configure the RestTemplate bean by following these steps:
- Add a public restTemplate() method to your application context configuration class. Ensure that the restTemplate() method returns a RestTemplate object and annotate it with the @Bean annotation.
- Implement the restTemplate() method by returning a new RestTemplate object.
If you use Spring Framework, the source code of your application context configuration class looks as follows:
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class SpringBatchExampleContext {

    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}
If you use Spring Boot, you can also add the restTemplate() method to your application class which is annotated with the @SpringBootApplication annotation. The source code of this class looks as follows:
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.web.client.RestTemplate;

@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling
public class SpringBatchExampleApplication {

    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }

    public static void main(String[] args) {
        SpringApplication.run(SpringBatchExampleApplication.class, args);
    }
}
You also have to get the dependencies required by the RestTemplate bean. These dependencies are described in the following:
- If you are using Spring Framework, you have to add the spring-webmvc dependency to the classpath.
- If you are using Spring Boot, you have to add the spring-boot-starter-web dependency to the classpath.
After you have configured the RestTemplate bean, you can finally configure your ItemReader bean.
Configuring the ItemReader Bean
You can configure the ItemReader bean by following these steps:
First, you have to create a new configuration class. After you have created this class, its source code looks as follows:
import org.springframework.context.annotation.Configuration;

@Configuration
public class SpringBatchExampleJobConfig {

}
Second, you have to create a new method that configures your ItemReader bean. This method returns an ItemReader<StudentDTO> object, and it takes an Environment object and a RestTemplate object as method parameters.
After you have added this method to your configuration class, its source code looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.web.client.RestTemplate;

@Configuration
public class SpringBatchExampleJobConfig {

    @Bean
    public ItemReader<StudentDTO> itemReader(Environment environment,
                                             RestTemplate restTemplate) {
        // The method body is added in the next step;
        // the class doesn't compile until then.
    }
}
Third, you have to implement the itemReader() method by returning a new RESTStudentReader object. When you create a new RESTStudentReader object, you have to pass the following objects as constructor arguments:
- The URL of the invoked REST API. You can read this information from a properties file by using the Environment object given as a method parameter.
- The RestTemplate object which is used to query the student information from the invoked REST API.
After you have implemented the itemReader() method, the source code of your configuration class looks as follows:
import org.springframework.batch.item.ItemReader;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.env.Environment;
import org.springframework.web.client.RestTemplate;

@Configuration
public class SpringBatchExampleJobConfig {

    @Bean
    public ItemReader<StudentDTO> itemReader(Environment environment,
                                             RestTemplate restTemplate) {
        return new RESTStudentReader(
                environment.getRequiredProperty("rest.api.url"),
                restTemplate
        );
    }
}
You can now write a custom ItemReader which reads the input data of your batch job from a REST API. Let's summarize what you learned from this blog post.
Summary
This blog post has taught you two things:
- Spring Batch doesn't have an ItemReader that can read information from a REST API.
- If you want to read the input data of your batch job from a REST API, you can read this information by using the RestTemplate class.
The next part of this tutorial describes how you can read the input data of your batch job from an Excel spreadsheet.
P.S. You can get the example application of this blog post from GitHub.
This was a great tutorial on how to use a REST API as the source of my data for Spring Batch. I am struggling with the best way to unit test it though. Do you have any suggestions or additions for this example?
Thanks!
Hi,
You could move the logic that reads the input data to a separate component (I will call this component RestInputReader) and inject that component into the Spring Batch reader. This gives you the possibility to replace the RestInputReader with a stub when you are writing unit tests for the Spring Batch reader.
I wouldn't write unit tests for the RestInputReader because it doesn't make much sense to replace RestTemplate with a test double because RestTemplate does all the work. I would test it by writing integration tests. If the REST API is an external API, I would replace it with a simple stub.
If you have any additional questions, don't hesitate to ask them.
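The suggestion above could be sketched roughly as follows. The RestInputReader name comes from the answer, but its readAll() method, the DelegatingReader class, and the generic type parameter are assumptions made for this illustration, and the sketch leaves out the Spring Batch ItemReader interface so that it runs with the JDK alone:

```java
import java.util.List;

// Hypothetical component that hides the REST call. The production implementation
// would use RestTemplate; tests can replace it with a stub.
interface RestInputReader<T> {
    List<T> readAll();
}

// The batch reader only keeps track of its position; fetching is delegated
// to the injected RestInputReader.
class DelegatingReader<T> {

    private final RestInputReader<T> input;
    private List<T> data;
    private int nextIndex;

    DelegatingReader(RestInputReader<T> input) {
        this.input = input;
    }

    public T read() {
        if (data == null) {
            data = input.readAll();
        }
        if (nextIndex < data.size()) {
            return data.get(nextIndex++);
        }
        // Input exhausted: reset the state and signal the end of the input.
        data = null;
        nextIndex = 0;
        return null;
    }
}
```

Because RestInputReader has a single method, a unit test can pass a lambda such as `() -> List.of(first, second)` as the stub and then assert that read() returns the two items in order, followed by null.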
Can you let me know how we execute this after implementing the code? Are you implementing a tasklet?
Also, why are we naming the method `launchXmlFileToDatabaseJob()` in the launcher? Wasn't this supposed to be data coming from a REST API?
Good catch. I guess that I simply forgot to change the name of the method after I copied it from another launcher class.
I have gone through the Spring Batch Tutorial: Reading Information From a REST API post. It is very useful. Can you please let me know how to run this project?
Hi,
Sure. Just clone this repository, select the example you want to run (Spring or Spring Boot), and run the command described in the README.
Hi Petri,
How can I read an XML file using a custom ItemReader and ItemProcessor, and write it into a database using an ItemWriter?
Please, can you share code for this?
Waiting for your response.
Thanks in advance,
Dandu
Hi,
Unfortunately, I cannot cover this topic in a single answer. However, you can find my Spring Batch tutorial and a few other great resources on my Spring Batch resource page. Also, this blog post explains how you can read data from an XML file and write it to a database.
Hi Petri,
Thank you for this great post. I have approximately the same use case as you. But my problem is that my REST service is using a database containing more than 150,000 entries, so I am afraid that my memory is not big enough. How can I handle that problem?
Hi Florent,
I agree that it's not a good idea to load the content of your database into memory. I would probably implement an API that supports pagination and write an ItemReader that reads the data one page at a time.
If you don't know how you can do it, don't hesitate to ask additional questions.
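A reader that pulls one page at a time might look something like the sketch below. This is not code from the example application: the PageFetcher interface and the PagingReader class are hypothetical, and the sketch assumes that the API returns an empty page once the requested page number is past the last page (a real implementation might call something like GET /api/student/?page=N, which is also an assumption). It leaves out the Spring Batch ItemReader interface so that it runs with the JDK alone:

```java
import java.util.List;

// Hypothetical page source. A real implementation would call the paginated REST API.
interface PageFetcher<T> {
    // Returns the items of the given zero-based page, or an empty list past the last page.
    List<T> fetchPage(int page);
}

// Keeps only one page in memory and fetches the next one when the current page is used up.
class PagingReader<T> {

    private final PageFetcher<T> fetcher;
    private List<T> currentPage = List.of();
    private int indexInPage = 0;
    private int nextPage = 0;
    private boolean exhausted = false;

    PagingReader(PageFetcher<T> fetcher) {
        this.fetcher = fetcher;
    }

    public T read() {
        if (exhausted) {
            return null;
        }
        if (indexInPage >= currentPage.size()) {
            // The current page is consumed: fetch the next one.
            currentPage = fetcher.fetchPage(nextPage++);
            indexInPage = 0;
            if (currentPage.isEmpty()) {
                exhausted = true; // an empty page marks the end of the input
                return null;
            }
        }
        return currentPage.get(indexInPage++);
    }
}
```

Unlike RESTStudentReader, this sketch stays exhausted after it has returned null; if the same instance must be reused by the next job run, the page counter and the exhausted flag would have to be reset.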
Can you give me a link, code, or a blog post that has an example of reading data from an API with pagination?
Unfortunately, I don't know if such a blog post exists (I didn't find anything interesting from Google). I have added this to my to-do list, and maybe I will write this blog post in the future.
Thanks Petri, your example helped me a lot.
You are welcome. I am happy to hear that this blog post was useful to you.
Petri
The ItemReader you have is stateful. Do you think it is thread-safe?
~Sherin
By the way it is a great example
~Sherin K Syriac
Thank you for your kind words. I really appreciate them. About your question:
Yes, my ItemReader is stateful, and Spring Batch assumes that this is the case. The Javadoc of the ItemReader interface states that:
This is a nice explanation, keep it up.
Thank you for your kind words. I really appreciate them.
Hi Petri,
This is an awesome explanation. By the way, I have a requirement to pull data from an external REST API which returns data in .csv files.
I have a few such files, e.g. user.csv, user_purchases.csv, etc. The requirement is to map the data in the files based on the user id and push it to the database accordingly.
Please suggest an approach to handle this problem.
~Chaitanya
Hi Petri,
thanks for the article.
I have a question: is it a bad practice to consume a web service (REST or SOAP) for each item in batch processing with the goal of sending data to a new application? In this case, is it a better idea to use a JMS queue?
Hi Efrén,
If you need a "real time" integration, it's a good idea to use a JMS queue or you can simply implement a REST API that is invoked by the service which sends data to the target system. On the other hand, if it's OK that the data is transferred at a specific time (02:00 at night, once per hour, and so on), it's a good idea to use Spring Batch (or you can write the batch job yourself).
Hi. Very good one.
But I need one clarification: how can I save the JSON data in the database by using a JPA repository?
Can you help me with this?
Very clear and useful information on a custom ItemReader with a REST API call.
Thanks a lot.
But how can I handle the case where the REST API returns 100,000 or more students?
Any suggestions for this?
Thanks
How can we handle a large amount of data while reading through a REST API?
Hi Petri,
I have a scenario like this: suppose I have two tables, A and B.
List<A> listA = "select * from A";
for (A a : listA) {
    List<B> listB = "select * from B where col1 = a.x";
    for (B b : listB) {
        if (some condition) {
            // update table B
        }
    }
    // update table A
}
How can I implement the above scenario by using Spring Batch? Please help me.