Spring Batch Tutorial: Getting the Required Dependencies With Maven

The first part of my Spring Batch tutorial explained why you should use Spring Batch instead of writing your own batch jobs and identified the basic building blocks of a Spring Batch job. Before you can move on and see Spring Batch in action, you have to get the required dependencies.

After you have read this blog post, you:

  • Can identify the modules provided by Spring Batch.
  • Can list the required dependencies when you are using Spring Framework or Spring Boot.
  • Know how you can get the required dependencies with Maven.

Let's start by taking a look at the modules provided by Spring Batch.

This blog post assumes that:

Introduction to the Modules of Spring Batch

Spring Batch provides the following modules:

  • The spring-batch-infrastructure module contains the common readers and writers, and provides services for application developers and the core module.
  • The spring-batch-core module contains the classes which are required to launch and control Spring Batch jobs.
  • The spring-batch-test module provides support for writing automated tests for Spring Batch jobs.
  • The spring-batch-integration module helps you to integrate Spring Batch with Spring Integration.
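
To make the module list a bit more concrete, here is a rough sketch of how the two optional modules could be declared in a POM file. It assumes that the dependency versions are managed elsewhere (for example, by a parent POM); if they aren't, you have to add a version element to each declaration. Using the test scope for spring-batch-test is my assumption, based on the fact that this module only supports automated tests:

<!-- Spring Batch Test (assumed to be needed only by automated tests) -->
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-test</artifactId>
    <scope>test</scope>
</dependency>

<!-- Spring Batch Integration (only if you use Spring Integration) -->
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-integration</artifactId>
</dependency>

You most likely need these declarations only if you write automated tests for your jobs or integrate Spring Batch with Spring Integration.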

Next, you will find out how you can get the required dependencies when you are writing a "legacy" Spring application.

Getting the Dependencies of a "Legacy" Spring Application

When you are writing Spring Batch jobs by using Spring Framework, you have to get the following dependencies:

  • A JDBC driver. Because Spring Batch writes the job and step execution metadata (aka logs) to a database, you need a JDBC driver which allows your application to communicate with that database. You should use the H2 in-memory database because it makes your example application easy to run. Also, because you need this dependency only when your application is run, you should use the runtime scope when you declare it.
  • Liquibase. You should integrate Liquibase with Spring Framework and ensure that it creates the database tables which contain the job and step execution metadata when the Spring container is started.
  • The datasource provides database connections to your application. You should use the HikariCP datasource because it's a fast and lightweight connection pool.
  • Spring Batch Core contains the classes that are required to launch and control Spring Batch jobs. Also, it brings in the spring-batch-infrastructure module as a transitive dependency.

Spring applications require other dependencies as well. I didn't include those dependencies in the previous list simply because every application is different and requires different dependencies.

When you are writing a "legacy" Spring application, you can get the required dependencies by using one of these two options:

  1. You can manage the dependency versions by using the dependency management of Spring Boot.
  2. You can use the traditional way and manage the dependency versions manually.

Let's find out how you can get the required dependencies when you use the dependency management of Spring Boot.

Using the Dependency Management of Spring Boot

You can get the required dependencies by following these steps:

First, you have to configure the parent POM of your Maven project. Spring Boot has a starter called spring-boot-starter-parent which provides dependency and plugin management for Maven projects. When you want to use the dependency management of Spring Boot, you have to set this starter as the parent of your Maven project. You can do this by adding the following XML to your POM file:

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.3.1.RELEASE</version>
</parent>

Second, after you have configured the parent POM of your Maven project, you don't have to worry about dependency versions because they are inherited from the parent POM. This means that you can get the required dependencies by adding the following XML to the dependencies section of your POM file:

<!-- Database -->
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>

<!-- Liquibase -->
<dependency>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-core</artifactId>
</dependency>
        
<!-- DataSource -->
<dependency>
    <groupId>com.zaxxer</groupId>
    <artifactId>HikariCP</artifactId>
</dependency>

<!-- Spring Batch -->
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
</dependency>
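
If your project already has another parent POM and you cannot replace it, one possible alternative is to import the dependency management of Spring Boot as a BOM. The following is only a sketch and it isn't the approach used by the example application. It assumes that importing the spring-boot-dependencies POM is enough for your build. Keep in mind that you get only the dependency management this way, not the plugin management provided by spring-boot-starter-parent:

<!-- Goes directly under the project element of your POM file -->
<dependencyManagement>
    <dependencies>
        <!-- Imports the dependency versions managed by Spring Boot -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>2.3.1.RELEASE</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

After this, the dependency declarations shown above should work without version elements, just like they do when you use the parent POM.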

Next, you will find out how you can manage your dependency versions manually.

Using the Traditional Way

If you want to use the traditional way, you have to manage your dependency versions "manually". In other words, you must specify the versions of all dependencies. You can do this by adding the following dependency declarations to the dependencies section of your pom.xml file:

<!-- Database -->
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>1.4.200</version>
    <scope>runtime</scope>
</dependency>

<!-- Liquibase -->
<dependency>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-core</artifactId>
    <version>3.8.9</version>
</dependency>

<!-- DataSource -->
<dependency>
    <groupId>com.zaxxer</groupId>
    <artifactId>HikariCP</artifactId>
    <version>3.4.5</version>
</dependency>

<!-- Spring Batch -->
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
    <version>4.2.4.RELEASE</version>
</dependency>
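
When you manage the versions manually, it's often convenient to collect the version numbers in one place. The following sketch shows one way to do this with Maven properties. The property names (h2.version and spring.batch.version) are hypothetical names which I made up for this example, and only two of the four dependencies are shown:

<!-- Goes directly under the project element of your POM file -->
<properties>
    <!-- Hypothetical property names; you can use any names you like -->
    <h2.version>1.4.200</h2.version>
    <spring.batch.version>4.2.4.RELEASE</spring.batch.version>
</properties>

<!-- Goes to the dependencies section of your POM file -->
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>${h2.version}</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
    <version>${spring.batch.version}</version>
</dependency>

This doesn't change the resolved dependencies. It only means that you can update a version number by changing one property value.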

You can now get the required dependencies when you are working with a "legacy" Spring application. Let's move on and find out how you can get the required dependencies when you are using Spring Boot.

Getting the Dependencies of a Spring Boot Application

You can get the required dependencies by following these steps:

First, you have to configure the parent POM of your Maven project. When you are writing a Spring Boot application, you must set the spring-boot-starter-parent as the parent of your Maven project. You can do this by adding the following XML to your POM file:

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.3.1.RELEASE</version>
</parent>

Second, you have to configure the required dependencies. When you configure these dependencies, you can ignore the dependency versions because they are inherited from the parent POM. Before you can use Spring Batch in a Spring Boot application, you must get the following dependencies:

  • A JDBC driver. Because Spring Batch writes the job and step execution metadata (aka logs) to a database, you need a JDBC driver which allows your application to communicate with that database. You should use the H2 in-memory database because it makes your example application easy to run. Also, because you need this dependency only when your application is run, you should use the runtime scope when you declare it.
  • Liquibase. You should integrate Liquibase with Spring Boot and ensure that it creates the database tables which contain the job and step execution metadata when the Spring container is started. Because you need this dependency only when your application is run, you have to use the runtime scope when you declare this dependency.
  • The spring-boot-starter-batch dependency provides the dependencies which are required by Spring Batch.
  • The spring-boot-starter-jdbc dependency is a starter that provides the dependencies which allow you to use JDBC and the HikariCP datasource in your Spring Boot application.

Spring Boot applications require other dependencies as well. I didn't include those dependencies in the previous list simply because every application is different and requires different dependencies.

You can get these dependencies by adding the following dependency declarations to the dependencies section of your pom.xml file:

<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-core</artifactId>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
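
If you ever need a different version than the one that is inherited from the parent POM, spring-boot-starter-parent allows you to override a managed version by setting a version property in your own POM file. The following is a minimal sketch which assumes that the property which controls the H2 version is called h2.version in your Spring Boot version (you can verify this from the spring-boot-dependencies POM). The version number 1.4.199 is used only as an illustration:

<!-- Goes directly under the project element of your POM file -->
<properties>
    <!-- Assumed property name; overrides the H2 version managed by Spring Boot -->
    <h2.version>1.4.199</h2.version>
</properties>

Most of the time you shouldn't need this because the managed versions are the ones that are known to work with each other.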

You can now get the required dependencies when you are using Spring Boot. Let's summarize what you learned from this blog post.

Summary

This blog post has taught you five things:

  • Because Spring Batch writes the job and step execution metadata to a database, you need a JDBC driver which allows your application to communicate with the used database.
  • Because you want to communicate with a relational database, you need a datasource which allows you to create database connections.
  • Because Spring Batch writes the job and step execution metadata to a database, you need a way to create the required database tables before Spring Batch tries to insert data into these tables. You can use Liquibase (or Flyway) for this purpose.
  • If you are working with a "legacy" Spring application, you have to declare the spring-batch-core dependency in your POM file.
  • If you are using Spring Boot, you have to declare the spring-boot-starter-batch dependency in your POM file.

The next part of this tutorial describes how you can get the required dependencies with Gradle.

P.S. You can get the example applications of this blog post from GitHub.

Comments
  • SGB Feb 3, 2016 @ 7:27

    Thanks Petri for another insightful blog post.

    Looking forward to the rest of this new series - I remember trying to use Spring batch a couple of years back and found it very complicated. As I had a deadline ended up abandoning it. So am hoping to learn it finally though this blog series.

    I do find it odd that you are using JPA (Hibernate) for a batch process - I always preferred using JDBC directly for batch process. But am sure this is an opportunity for me to broaden my horizon and learn more :)

    I enjoyed your series on Spring Data which inspired me to buy your book.
    Am excited to see this where new series leads to.

    Best wishes,
    SGB

    • Petri Feb 3, 2016 @ 9:34

      Hi,

      Thank you for your kind words. I really appreciate them.

      "I remember trying to use Spring batch a couple of years back and found it very complicated. As I had a deadline ended up abandoning it. So am hoping to learn it finally though this blog series."

      I have heard similar comments from my colleagues as well, and that is why I decided to write this tutorial.

      "I do find it odd that you are using JPA (Hibernate) for a batch process – I always preferred using JDBC directly for batch process. But am sure this is an opportunity for me to broaden my horizon and learn more :)"

      This is a good point. I have to confess that typically I use JDBC as well (for batch processes), but I decided to cover both JDBC and JPA because there are a few situations when it makes sense to use JPA. In other words, sometimes the benefits of using JPA are so high that it makes sense to take a performance hit.

  • Ramesh Babu Y Feb 26, 2016 @ 18:11

    Hi

    The following dependency in your article , is wrong because till now only 1.4.191 , is latest version for h2 data base ,but you mentioned the version as , 1.4.90 , so can you please check

    <dependency>
        <groupId>com.h2database</groupId>
        <artifactId>h2</artifactId>
        <version>1.4.90</version>
    </dependency>

    • Petri Feb 26, 2016 @ 19:04

      Hi,

      If you specify your dependencies manually, you should naturally use the newest versions. However, if you use Spring IO Platform or Spring Boot, your best option is to not specify dependency versions and use the versions that are guaranteed to work (provided by parent or starter POMs).

      • Ramesh Babu Y Feb 26, 2016 @ 19:55

        Hi

        I agree the automatically downloading dependencies using sping IO , am just telling you that you mentioned wrong version number for the dependency , as 1.4.90 , am just telling you it is 1.4.191 , please observe the numbers and correct in your article

        • Petri Feb 26, 2016 @ 20:33

          Hi,

          This post uses the dependency versions that are provided by the Spring IO Platform 2.0.1.RELEASE because this way I can guarantee that they are working with each other. Because the Spring IO Platform 2.0.1.RELEASE uses H2 1.4.190, this article will use it as well.

          • SGB Feb 27, 2016 @ 1:31

            lol. Hilarious!
            I think Ramesh is saying that there is a typo in the section "Tradition Way".
            For the h2 dependency, in your pom file, you have listed version 1.4.90 instead of 1.4.190.

            90 vs 190.

          • Petri Feb 27, 2016 @ 9:27

            Oops. I fixed it. Thank you for pointing this out.

  • Sharmila Mar 22, 2016 @ 22:48

    Hi Petri,
    I tried running your spring boot batch code (excelFileToDatabaseJob) and seeing the below exception
    Caused by: java.lang.IllegalArgumentException: Sheet index (1) is out of range (0..0)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.validateSheetIndex(XSSFWorkbook.java:1158) ~[poi-ooxml-3.11.jar:3.11]
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.getSheetAt(XSSFWorkbook.java:934) ~[poi-ooxml-3.11.jar:3.11]
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.getSheetAt(XSSFWorkbook.java:106) ~[poi-ooxml-3.11.jar:3.11]
    at org.springframework.batch.item.excel.poi.PoiItemReader.getSheet(PoiItemReader.java:47) ~[spring-batch-excel-0.5.0-SNAPSHOT.jar:na]

    I created a new project that reads an excel and processes it using spring-batch-excel code and seeing the same exception as well. Have you seen this happening?

    • Petri Mar 22, 2016 @ 23:32

      Hi Sharmila,

      I haven't seen that exception. However, it looks like the spreadsheet has an empty row somewhere, and this is why the exception is thrown.

      Did you modify the spreadsheet or replace it with some other spreadsheet? Also, if you opened it with Excel, Excel might have modified it for you :| The reason why I ask this is that I cannot reproduce this with my version. :(

  • Ramesha Dec 2, 2017 @ 15:12

    Dear Sir,

    Thank you very much for the code. it saved me lot of time. I am executing from web application and passing path dynamically. If any one still looking for dynamic path, send me the mail. I will post the code here.

    With Regards
    Ramesh

    • Petri Dec 6, 2017 @ 16:37

      Thank you for your kind words. I really appreciate them. Also, I will keep your offer in mind and ask for additional information if someone needs it.
