When we write tests for our data access code, should we test every method of its public API?
It sounds natural at first. After all, if we don't test everything, how can we know that our code works as expected?
That question gives us an important clue:
Our code.
We should write tests only for our own code.
What Is Our Own Code?
It is sometimes hard to identify the code that we should test, because our data access code is tightly integrated with the library or framework that we use to save information to the data storage or to read information from it.
For example, if we want to create a Spring Data JPA repository which provides CRUD operations for Todo objects, we should create an interface which extends the CrudRepository interface. The source code of the TodoRepository interface looks as follows:

    import org.springframework.data.repository.CrudRepository;

    public interface TodoRepository extends CrudRepository<Todo, Long> {
    }

Even though we haven't added any methods to our repository interface, the CrudRepository interface declares many methods which are available to the classes that use our repository interface.
These methods are not our code because they are implemented and maintained by the Spring Data team. We only use them.
On the other hand, if we add a custom query method to our repository, the situation changes. Let's assume that we have to find all todo entries whose title is equal to the given search term. After we have added this query method to our repository interface, its source code looks as follows:

    import java.util.List;

    import org.springframework.data.jpa.repository.Query;
    import org.springframework.data.repository.CrudRepository;
    import org.springframework.data.repository.query.Param;

    public interface TodoRepository extends CrudRepository<Todo, Long> {

        // Returns the todo entries whose title is equal to the given search term.
        @Query("SELECT t FROM Todo t WHERE t.title = :searchTerm")
        List<Todo> search(@Param("searchTerm") String searchTerm);
    }

It would be easy to claim that this method is our own code and that is why we should test it. However, the truth is a bit more complex. Even though the JPQL query was written by us, Spring Data JPA provides the code which passes that query forward to the JPA provider.
And still, I think that this query method is our own code because the most essential part of it was written by us.
If we want to identify our own data access code, we have to locate the essential part of each method. If this part was written by us, we should treat that method as our own code.
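A rough first pass at this split can be sketched with plain reflection: the methods declared directly on our repository interface are candidates for "our own code", and the rest are inherited. The sketch below does not use Spring Data at all. CrudLike, TodoRepositorySketch, and the nested Todo class are hypothetical stand-ins invented for this illustration, and the declared/inherited split is only a heuristic, because we still have to check that the essential part of each declared method was written by us.

```java
import java.lang.reflect.Method;
import java.util.List;
import java.util.Optional;
import java.util.SortedSet;
import java.util.TreeSet;

final class OwnCodeSketch {

    // Hypothetical stand-in for a framework-provided base interface.
    // This is NOT Spring Data's CrudRepository; it only mimics the idea.
    interface CrudLike<T, ID> {
        T save(T entity);
        Optional<T> findById(ID id);
        long count();
    }

    // Hypothetical entity that exists only for this sketch.
    static final class Todo {
    }

    // Hypothetical repository: inherits three methods and declares one of its own.
    interface TodoRepositorySketch extends CrudLike<Todo, Long> {
        List<Todo> search(String searchTerm);
    }

    // Names of the methods declared directly on the given type (candidates for our own code).
    static SortedSet<String> declaredMethodNames(Class<?> type) {
        SortedSet<String> names = new TreeSet<>();
        for (Method method : type.getDeclaredMethods()) {
            names.add(method.getName());
        }
        return names;
    }

    // Names of the public methods that the given type only inherits (someone else's code).
    static SortedSet<String> inheritedMethodNames(Class<?> type) {
        SortedSet<String> names = new TreeSet<>();
        for (Method method : type.getMethods()) {
            names.add(method.getName());
        }
        names.removeAll(declaredMethodNames(type));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Declared by us: " + declaredMethodNames(TodoRepositorySketch.class));
        System.out.println("Inherited: " + inheritedMethodNames(TodoRepositorySketch.class));
    }
}
```

In this sketch only search ends up in the "declared by us" group, which matches the conclusion we reached above for the query method.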
This is all pretty obvious, and the more interesting question is:
Should We Test It?
Our repository interface provides two kinds of methods to the classes which use it:
- It provides methods that are declared by the CrudRepository interface.
- It provides a query method that was written by us.
Should we write integration tests for the TodoRepository interface and test all of these methods?
No. We should not do this because:
- The methods declared by the CrudRepository interface are not our own code. This code is written and maintained by the Spring Data team, and they have ensured that it works. If we don't trust that their code works, we should not use it.
- Our application probably has many repository interfaces which extend the CrudRepository interface. If we decide to write tests for the methods declared by the CrudRepository interface, we have to write these tests for all repositories. If we choose this path, we will spend a lot of time writing tests for someone else’s code, and frankly, it is not worth it.
- Our own code might be so simple that writing tests for our repository makes no sense.
In other words, we should concentrate on finding an answer to this question:
Should we write integration tests for our repository methods (methods which were written by us), or should we just write end-to-end tests?
The answer to this question depends on the complexity of our repository method. I am aware that complexity is a pretty vague word, and that is why we need some kind of guideline that helps us find the best way to test our repository methods.
One way to make this decision is to think about the amount of work that is required to test every possible scenario. This makes sense because:
- It takes less work to write integration tests for a single repository method than to write the same tests for the feature that uses the repository method.
- We have to write end-to-end tests anyway.
Thus, it makes sense to minimize our investment (time) and maximize our profits (test coverage).
This is of course easier said than done because every situation is unique, and it is impossible to figure out rules which would be valid in every situation.
We can get started by finding the answers to the following questions:
- Is the feature which uses our repository method simple or complex? We can get some idea about this by asking more questions:
  - Does the feature only return information that is fetched from a data storage or does it modify it?
  - How many dependencies does the feature have?
- How many tests do we have to write for our feature if we want to test all possible scenarios?
- How many tests do we have to write for our repository method if we want to test all possible scenarios?
After we have found the answers to these questions, we can maximize our return on investment by following these rules:
- If we can test all possible scenarios by writing only a few end-to-end tests, we shouldn’t waste our time writing integration tests for our repository method. We should write end-to-end tests which ensure that the feature is working as expected.
- If we need to write more than a few tests for our repository method, we should write integration tests for our repository method, and write only a few end-to-end tests (smoke tests).
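These two rules can be written down as a small decision function. The following is only a sketch under stated assumptions: the class, the enum, and the parameter names are hypothetical, and "a few" is expressed as a threshold that we have to pick ourselves because the rules do not define it. The rules also do not cover the case where the feature needs many end-to-end tests but the repository method needs only a few tests, so the sketch reports that case as a judgment call.

```java
final class TestStrategySketch {

    enum Strategy {
        // Rule 1: a few end-to-end tests cover every scenario.
        END_TO_END_ONLY,
        // Rule 2: integration tests for the repository method plus a few smoke tests.
        REPOSITORY_INTEGRATION_TESTS_PLUS_SMOKE_TESTS,
        // Neither rule applies; we have to decide case by case.
        JUDGMENT_CALL
    }

    // fewThreshold is our own (assumed) definition of "a few" tests.
    static Strategy choose(int endToEndTestsNeeded, int repositoryTestsNeeded, int fewThreshold) {
        if (endToEndTestsNeeded < 0 || repositoryTestsNeeded < 0 || fewThreshold < 0) {
            throw new IllegalArgumentException("Test counts and the threshold must not be negative");
        }
        // Rule 1 is checked first: if a few end-to-end tests are enough, we stop here.
        if (endToEndTestsNeeded <= fewThreshold) {
            return Strategy.END_TO_END_ONLY;
        }
        // Rule 2: the repository method itself needs more than a few tests.
        if (repositoryTestsNeeded > fewThreshold) {
            return Strategy.REPOSITORY_INTEGRATION_TESTS_PLUS_SMOKE_TESTS;
        }
        return Strategy.JUDGMENT_CALL;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: "a few" means at most 5 tests.
        System.out.println(choose(3, 2, 5));
        System.out.println(choose(20, 8, 5));
        System.out.println(choose(20, 3, 5));
    }
}
```

The threshold is deliberately a parameter: every situation is unique, so the value that makes sense for one project might not make sense for another.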
Summary
This blog post has taught us two things:
- We should not waste our time writing tests for a data access framework (or library) written by someone else. If we don’t trust that framework (or library), we should not use it.
- Sometimes we should not write integration tests for our data access code either. If the tested code is simple enough (we can cover all situations by writing a few end-to-end tests), we should test it by writing end-to-end tests.
Another good post, thanks for sharing.
I've put some time into thinking about whether tests should be written to cover my Spring Data JPA repositories recently, and I ended up with the following:
I'd like to make sure I understand the second assertion in the post's conclusion - do you mean that tests for simple custom queries should be assumed as inherent in end-to-end feature tests?
Thank you for your kind words. I really appreciate them.
I have spent some time thinking about writing tests which verify that the database schema is equal to my JPA mapping. Recently I realized that I don't necessarily have to write any tests if I use this approach:
hibernate.hbm2ddl.auto property to 'validate').
The benefit of this approach is that I don't have to write tests which verify that my migration scripts create a working database. The downside is that since I haven't used it in a real life project yet, I have no idea how much slower my integration tests are because of it. Do you have any ideas/comments about this?
I think that this depends on the feature. If the feature simply returns information that was fetched from a database, this might be good enough. If the feature is not that simple, it might still make sense to write tests for the repository method (even if it is very simple) because this is easier to do and might help you to troubleshoot possible problems.
Do you think that I should update the blog post so that this is more clear?
It will be great if you do, I was just about to ask you this same question!
Thank you so much for the feedback. I will update this blog post later today.
It seems to me that however we define our database schema and JPA mappings, there's some duplication when we do them separately, but that it's unavoidable if we want the powerful schema migration features of Liquibase or Flyway. I hadn't thought about using Hibernate's schema validation to achieve the same goal as the tests, so that's a valuable insight that I may put into practice. I haven't seen schema migrations executed as part of integration tests implemented for a large-scale project either, but I think a small performance penalty during testing is worth the gain.
Your explanation about when to write tests for custom repository methods makes a lot of sense, and I agree that the article would benefit from its addition.