Spring Batch is a framework that helps you write robust batch jobs, which are an essential part of most enterprise applications.
The website of the Spring Batch project describes its mission and features as follows:
Spring Batch provides reusable functions that are essential in processing large volumes of records, including logging/tracing, transaction management, job processing statistics, job restart, skip, and resource management. It also provides more advanced technical services and features that will enable extremely high-volume and high performance batch jobs through optimization and partitioning techniques.
That sounds pretty impressive. However, I have heard that it is a bit hard to get started with Spring Batch, and that is why I decided to create this resource page. This page provides links to online resources that have helped me to write batch jobs with Spring Batch.
I hope that these resources are as useful to you as they have been to me.
Getting Started
The following tutorials help you to get started with Spring Batch:
- Spring Batch Tutorial: Introduction defines the term batch job, explains why you should use Spring Batch, and identifies the basic building blocks of a Spring Batch job.
- Spring Batch Tutorial: Getting the Required Dependencies With Maven describes how you can get Spring Batch dependencies with Maven.
- Spring Batch Tutorial: Getting the Required Dependencies With Gradle describes how you can get Spring Batch dependencies with Gradle.
Configuration
If you want to learn how to configure Spring Batch by using Java configuration, you should take a look at Tobias Flohre’s tutorial that was published on blog.codecentric.de. It was published in 2013, but it is still one of the best tutorials out there. It consists of the following blog posts:
- Spring Batch 2.2 – JavaConfig Part 1: A comparison to XML compares Java configuration with XML configuration and explains why you should use Java configuration instead of XML configuration.
- Spring Batch 2.2 – JavaConfig Part 2: JobParameters, ExecutionContext and StepScope describes how you can use job parameters and an execution context in your Spring Batch jobs. More importantly, it also explains why this is a good idea.
- Spring Batch 2.2 – JavaConfig Part 3: Profiles and environments helps you to create environment-specific configurations for your Spring Batch jobs. This is useful if you have to run your batch jobs in different environments (your own development environment, test environment, and production environment).
- Spring Batch 2.2 – JavaConfig Part 4: Job inheritance explains how you can implement job inheritance by using Java configuration. This is a useful feature if you need to define common functionality that is used by more than one Spring Batch job.
- Spring Batch 2.2 – JavaConfig Part 5: Modular configurations describes how you can avoid problems caused by “common” bean names (such as reader or writer) and multiple beans with the same type (such as multiple ItemReader and/or multiple ItemWriter beans).
- Spring Batch 2.2 – JavaConfig Part 6: Partitioning and Multi-threaded Step explains how you can scale your Spring Batch jobs by partitioning your data and using multi-threaded steps.
Reading Input Data
When you start writing a batch job, the first thing that you have to do is to provide input data for it. These tutorials describe how you can read input data from different data sources:
- Spring Batch Tutorial: Reading Information From a File describes how you can read input data from CSV and XML files.
- Spring Batch Tutorial: Reading Information From a Database describes how you can read input data from a database by using database cursors and pagination.
- Spring Batch Tutorial: Creating a Custom ItemReader describes how you can create a custom ItemReader.
- Spring Batch Tutorial: Reading Information From a REST API describes how you can create a custom ItemReader that reads the input data of your batch job by using the RestTemplate class.
- Spring Batch Tutorial: Reading Information From an Excel File describes how you can read the input data of your Spring Batch job from an Excel spreadsheet.
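If you wonder what a custom reader boils down to, the following plain Java sketch might help. It is not taken from any of the tutorials above and it doesn't use Spring Batch itself: SketchItemReader is a hypothetical stand-in that mirrors the core contract of Spring Batch's ItemReader (return the next item, or null when there is no more input), and InMemoryItemReader and ReaderSketch are illustrative names that I made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Spring Batch's ItemReader contract: return the
// next item, or null when the input is exhausted. The real interface also
// declares "throws Exception"; that is left out here to keep the sketch short.
interface SketchItemReader<T> {
    T read();
}

// A custom reader that serves its items from an in-memory list.
class InMemoryItemReader<T> implements SketchItemReader<T> {
    private final Iterator<T> items;

    InMemoryItemReader(List<T> source) {
        // Copy the list so that later changes made by the caller
        // don't affect the reader.
        this.items = new ArrayList<>(source).iterator();
    }

    @Override
    public T read() {
        return items.hasNext() ? items.next() : null;
    }
}

class ReaderSketch {
    // Drains a reader into a list and stops at the first null, which is
    // roughly how a step knows that there is no more input to process.
    static <T> List<T> readAll(SketchItemReader<T> reader) {
        List<T> result = new ArrayList<>();
        T item;
        while ((item = reader.read()) != null) {
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        SketchItemReader<String> reader =
                new InMemoryItemReader<>(Arrays.asList("Tony", "Nick", "Ian"));
        System.out.println(readAll(reader));
    }
}
```

A real custom reader would implement Spring Batch's own ItemReader interface instead of this stand-in and would typically fetch its data from a file, a database, or an API rather than from a list.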
Writing Output Data
A Spring Batch job isn’t very useful if it doesn’t write its output data somewhere. These tutorials help you write the output data of your Spring Batch job to a file or a database:
- Spring Batch Tutorial: Writing Information to a File describes how you can write the output data of your Spring Batch job to CSV and XML files.
- Spring Batch Tutorial: Writing Information to a Database With JDBC describes how you can write the output data of your Spring Batch job to a relational database by using JDBC.
Tutorials
These tutorials describe how you can create Spring Batch jobs:
- Batch YARN Application describes how you can execute a Spring Batch job on Hadoop YARN.
- Restartable Batch YARN Application describes how you can execute a Spring Batch job on Hadoop YARN and restart it if an error occurs during the job.
- Creating a Batch Service describes how you can create a simple batch job that reads information from a CSV file and writes it to a database by using JDBC.
- Spring Batch Tutorial – The Ultimate Guide is a very long blog post that identifies the central components of Spring Batch and describes how you can use them. It also provides several examples that help you to understand the concepts described in that blog post.