Running Out of Database Connections
Contents
“Unable to acquire JDBC Connection” is staring at you from your production logs. Your connection pool is running out of database connections. What do you do?
Setup
Imagine you’ve worked for a year on a modern microservices application with complex external integrations, the client has spent weeks testing it, and you’ve fixed all the reported bugs. You get the official signoff and the public release goes smoothly, the first users are online, and everything is working.
You leave for the release party, and two beers in, the bug strikes. Your database connections are timing out, and your connection pool is full, even though it should be able to handle 10 times more users. The clock is ticking and it’s on you to fix it.
The Problem
Sounds a bit too specific? That’s because it actually happened to me three years ago, and in this blog post, I’ll walk you through the steps I took to understand the problem and fix it.
To illustrate this problem, I created an example repository with Spring Boot, Spring Data, and a Postgres database. Although the debugging techniques I want to show you are general, the problem is specific to this tech stack.
If you want a challenge, go ahead and check out the UserControllerIntegrationTest#multipleRequests()
test. It fails, but it’s not clear why. Can you fix it without reading the rest of the post?
The solutions are on the branches solution/quick_hack
and solution/proper_fix
.
Get Some Facts
To tackle this problem, the first thing to do is collect information.
Priming
Let’s begin by searching online for the exact error, minus parts that are specific to our setup (e.g. a filepath, a username, a timeout value). If we find a solution right away, that’s fine, but we’re usually just looking at a symptom that could be caused by many problems. The goal here is to quickly get an overview of the candidate problems that could cause this symptom so that we’re ready for them.
I’d look at the following for “Unable to acquire JDBC Connection”:
Unable to acquire JDBC Connection — Stack Overflow
Skip the question to see if it looks similar, see that the answer is talking
about @Transactional
and transaction boundaries
JDBCConnectionException Unable to acquire JDBC Connection with spring boot — Stack Overflow
Increase the size of the connection pool, or make sure you close your connectionsSpringBoot:GenericJDBCException: Unable to acquire JDBC Connection — Stack Overflow
Seems like a different problem; we’re using a connection pool[Solved] JDBCConnectionException: Unable to acquire JDBC Connection with Hibernate and MySQL
Something to do with timeouts; seems to be a different issue because the inner exception is differentFrom the above, we get the following questions we want to verify:
- What is the size of our connection pool? Is it big enough?
- Are we using
@Transactional
wrong? - Are we not closing a DB connection?
If you already have experience in this particular area, you can skip this step, but I wouldn’t spend more than 5-10 minutes on this anyway. Don’t go into the details of each problem/solution. Instead, spend just enough time to get a feel for which area our problem could be located in.
Reproduce Locally
The second thing I want to do to understand the issue is try to reproduce it locally. This makes everything easier by making the next steps quicker. If, after one reasonable attempt, you can’t reproduce it locally, you have to weigh spending more time on it against the time savings of debugging locally, but at least you’ll have some indication that the problem might be specific to the production environment (number of concurrent users, different configuration), so that first attempt is always worth it.
For the rest of the post, I’ll assume you could reproduce it locally. These techniques can also be used in production, but it’s always more complicated and riskier.
Code Snooping
Sometimes I’ll spend some time searching for obviously wrong code in the application just to identify areas that might be involved in the issue. Here, something that could cause the problem is opening a DB connection but not closing it.
With @Transactional
, this isn’t possible, but maybe someone thought to open a connection manually? So, search the codebase for these spots and have a quick look: Is a call to close()
missing? For us, unfortunately, this wasn’t the issue.
We can now work with Logs or with a Profiler: Either section can independently bring us to the solution, and they of course can be used together to increase our chance of success.
Logs
Logs are great. The default logging level for a Spring Boot application is INFO
, so let’s go and increase that to DEBUG
to see what we’re working with.
Add this to your application.yml
. You can follow along on the steps/logging
branch; each change is a commit:
logging:
level:
org.springframework: DEBUG
ch.aaap.engineering.example: DEBUG
Running the test with this gives us a lot of output to go through. Let’s focus on the application-level logs — that is, logs from the 'ch.aaap.engineering.example'
package of the application. If you grep for the abbreviated c.a.e.e
(I use Grep Console in IntelliJ), you’ll see that all 10 threads are looking for the user (Looking for user [root]
), but they diverge after that.
Successful threads have Found user User[id=3, name='root']
in their logs, while the failed ones don’t have any more application level logs. So, the issue should be between the two logging statements of looking and finding the user.
Let’s look at one thread at a time. Pick a thread that found a user and grep for its thread name, [pool-thread-x]
. Then pick one that has the error and compare the logs.
Thread that completed OK:
2021-08-13 11:08:20.992 INFO 26284 --- [pool-1-thread-6] c.a.e.e.c.UserControllerIntegrationTest : Running task 5
2021-08-13 11:08:21.015 DEBUG 26284 --- [pool-1-thread-6] o.s.t.web.servlet.TestDispatcherServlet : GET "/users/root/document", parameters={}
2021-08-13 11:08:21.020 DEBUG 26284 --- [pool-1-thread-6] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped to ch.aaap.engineering.example.connectionpooldebug.UserController#findOne(String)
2021-08-13 11:08:21.022 DEBUG 26284 --- [pool-1-thread-6] o.j.s.OpenEntityManagerInViewInterceptor : Opening JPA EntityManager in OpenEntityManagerInViewInterceptor
2021-08-13 11:08:21.050 DEBUG 26284 --- [pool-1-thread-6] c.a.e.e.c.UserController : Looking for user [root]
2021-08-13 11:08:21.050 DEBUG 26284 --- [pool-1-thread-6] o.s.orm.jpa.JpaTransactionManager : Found thread-bound EntityManager [SessionImpl(811578595<open>)] for JPA transaction
2021-08-13 11:08:21.050 DEBUG 26284 --- [pool-1-thread-6] o.s.orm.jpa.JpaTransactionManager : Creating new transaction with name [ch.aaap.engineering.example.connectionpooldebug.UserService.findOne]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly
2021-08-13 11:08:21.054 DEBUG 26284 --- [pool-1-thread-6] o.s.jdbc.datasource.DataSourceUtils : Setting JDBC Connection [HikariProxyConnection@535891644 wrapping org.postgresql.jdbc.PgConnection@5a82a13a] read-only
2021-08-13 11:08:21.054 DEBUG 26284 --- [pool-1-thread-6] o.s.orm.jpa.JpaTransactionManager : Exposing JPA transaction as JDBC [org.springframework.orm.jpa.vendor.HibernateJpaDialect$HibernateConnectionHandle@7420a525]
2021-08-13 11:08:21.087 DEBUG 26284 --- [pool-1-thread-6] o.s.orm.jpa.JpaTransactionManager : Initiating transaction commit
2021-08-13 11:08:21.087 DEBUG 26284 --- [pool-1-thread-6] o.s.orm.jpa.JpaTransactionManager : Committing JPA transaction on EntityManager [SessionImpl(811578595<open>)]
2021-08-13 11:08:21.089 DEBUG 26284 --- [pool-1-thread-6] o.s.jdbc.datasource.DataSourceUtils : Resetting read-only flag of JDBC Connection [HikariProxyConnection@535891644 wrapping org.postgresql.jdbc.PgConnection@5a82a13a]
2021-08-13 11:08:21.089 DEBUG 26284 --- [pool-1-thread-6] o.s.orm.jpa.JpaTransactionManager : Not closing pre-bound JPA EntityManager after transaction
2021-08-13 11:08:21.089 DEBUG 26284 --- [pool-1-thread-6] c.a.e.e.c.UserController : Found user User[id=3, name='root']
2021-08-13 11:08:26.093 DEBUG 26284 --- [pool-1-thread-6] c.a.e.e.c.UserController : Rendered document [Some document]
2021-08-13 11:08:26.139 DEBUG 26284 --- [pool-1-thread-6] o.s.w.s.m.m.a.HttpEntityMethodProcessor : Using 'application/json', given [*/*] and supported [application/json, application/*+json, application/json, application/*+json]
2021-08-13 11:08:26.140 DEBUG 26284 --- [pool-1-thread-6] o.s.w.s.m.m.a.HttpEntityMethodProcessor : Writing [User[id=3, name='root']]
2021-08-13 11:08:26.150 DEBUG 26284 --- [pool-1-thread-6] o.j.s.OpenEntityManagerInViewInterceptor : Closing JPA EntityManager in OpenEntityManagerInViewInterceptor
2021-08-13 11:08:26.150 DEBUG 26284 --- [pool-1-thread-6] o.s.t.web.servlet.TestDispatcherServlet : Completed 200 OK
Thread that failed:
2021-08-13 11:08:20.991 INFO 26284 --- [pool-1-thread-4] c.a.e.e.c.UserControllerIntegrationTest : Running task 3
2021-08-13 11:08:21.015 DEBUG 26284 --- [pool-1-thread-4] o.s.t.web.servlet.TestDispatcherServlet : GET "/users/root/document", parameters={}
2021-08-13 11:08:21.020 DEBUG 26284 --- [pool-1-thread-4] s.w.s.m.m.a.RequestMappingHandlerMapping : Mapped to ch.aaap.engineering.example.connectionpooldebug.UserController#findOne(String)
2021-08-13 11:08:21.022 DEBUG 26284 --- [pool-1-thread-4] o.j.s.OpenEntityManagerInViewInterceptor : Opening JPA EntityManager in OpenEntityManagerInViewInterceptor
2021-08-13 11:08:21.050 DEBUG 26284 --- [pool-1-thread-4] c.a.e.e.c.UserController : Looking for user [root]
2021-08-13 11:08:21.050 DEBUG 26284 --- [pool-1-thread-4] o.s.orm.jpa.JpaTransactionManager : Found thread-bound EntityManager [SessionImpl(1642400880<open>)] for JPA transaction
2021-08-13 11:08:21.051 DEBUG 26284 --- [pool-1-thread-4] o.s.orm.jpa.JpaTransactionManager : Creating new transaction with name [ch.aaap.engineering.example.connectionpooldebug.UserService.findOne]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly
2021-08-13 11:08:23.055 DEBUG 26284 --- [pool-1-thread-4] com.zaxxer.hikari.pool.HikariPool : HikariPool-1 - Timeout failure stats (total=5, active=5, idle=0, waiting=0)
2021-08-13 11:08:23.057 WARN 26284 --- [pool-1-thread-4] o.h.engine.jdbc.spi.SqlExceptionHelper : SQL Error: 0, SQLState: null
2021-08-13 11:08:23.057 ERROR 26284 --- [pool-1-thread-4] o.h.engine.jdbc.spi.SqlExceptionHelper : HikariPool-1 - Connection is not available, request timed out after 2005ms.
2021-08-13 11:08:23.060 DEBUG 26284 --- [pool-1-thread-4] o.j.s.OpenEntityManagerInViewInterceptor : Closing JPA EntityManager in OpenEntityManagerInViewInterceptor
2021-08-13 11:08:23.061 DEBUG 26284 --- [pool-1-thread-4] o.s.t.web.servlet.TestDispatcherServlet : Failed to complete request: org.springframework.transaction.CannotCreateTransactionException: Could not open JPA EntityManager for transaction; nested exception is org.hibernate.exception.JDBCConnectionException: Unable to acquire JDBC Connection
The failing thread’s last application output before failing is:
Creating new transaction with name [ch.aaap.engineering.example.connectionpooldebug.UserService.findOne]:
PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly
So, the first thing missing is:
Setting JDBC Connection [HikariProxyConnection@1730990643 wrapping org.postgresql.jdbc.PgConnection@2d0cabbe] read-only
It looks like this thread isn’t getting a connection from the pool, so let’s look at the connection pool logs. Hikari is the default one. After a quick Google search we can add this:
logging:
level:
com.zaxxer.hikari: DEBUG
With this enabled, we find this line in a failing thread, just after the last application log:
Timeout failure stats (total=5, active=5, idle=0, waiting=0)
These are the pool statistics at the time of the connection timeout. Five connections are active. Are these the connections from the successful threads? Why are they still open?
If we look at the logs of a successful thread, when is the connection being opened and closed? We find these:
o.j.s.OpenEntityManagerInViewInterceptor : Opening JPA EntityManager in OpenEntityManagerInViewInterceptor
o.j.s.OpenEntityManagerInViewInterceptor : Closing JPA EntityManager in OpenEntityManagerInViewInterceptor
The closing line is pretty late. In fact, it happens after rendering the document, which is way after the end of @Transactional
on the findOne
method. We’d expect the transaction to close then, and we find another interesting line:
o.s.orm.jpa.JpaTransactionManager : Not closing pre-bound JPA EntityManager after transaction
So the transaction ends, but the connection isn’t being closed because it was “prebound.” Something is opening a connection before our @Transactional
method. What is it?
If we look at the class logging the opening and closing of our OpenEntityManagerInViewInterceptor
connection and read the docs, it refers to the “Open EntityManager in View” pattern. That’s our culprit.
Keep reading to see how we would have done it with a Profiler, or skip to the Solution to read about how to fix the issue.
Profiler
I’m using YourKit, but any profiler that can track DB connections will work. With the IntelliJ integration, we only have to start the test with the profiler and save the results to a snapshot we can then analyze. Usually, a profiler is used for performance problems, but it can also help us find bugs by giving us a good idea of the behavior of our program, so anything out of the norm will pop out.
In this case, from the error message, we know it has something to do with DB connections, so let’s see what we find. In YourKit, go to the Events tab, select Events by Table in the middle, and select JPA/Hibernate on the left side. This will show a timeline of all JPA-related events. Let’s group them by thread and have a look at them. In the top left, above the table, you can select the grouping options.
What we see are the 10 threads that we expect, plus the main thread. Five have only two events (CreateEntityManage
and close
), while another five have five events (createEntityManager
, begin
, getSingleResult
, getSingleResult
, commit
). What’s strange here is that the threads with five events don’t have a close
at the end. So what gives?
The last event in that connection is, for me, at 21s (all times are since the start of the sampling), but the close
in the other thread is at 23s. So what is this connection doing for 2 seconds?
If we look at the code, we expect the @Transaction
to be over and the DB connection closed after the call to findOne()
, which should only take 100-200ms.
Suspiciously enough, the renderDocument
call takes about two seconds. This points to the Transaction
still being open during renderDocument
.
Solution
Let’s explore the solution to this problem.
Defaults, Man!
Why is our transaction open when we call .renderDocument
? findOne()
is @Transactional
, but the transaction should close after, right? RIGHT?
After a good amount of research, you’ll find the lovely spring.jpa.open-in-view
setting, which defaults to true
. What it does is the equivalent of having @Transactional
on every Controller
method in your project. That’s the reason why our transaction is still open during .renderDocument
, and now everything makes sense: The first five lucky threads get to renderDocument
and spend the two seconds there with their DB connection open. The other threads are less lucky and time out while trying to get a connection from the pool.
To be fair, there’s a warning printed in the logs1 about this setting. Did you notice it already? We obviously didn’t, or you wouldn’t be reading this post. :)
Nonetheless, in my opinion, this is a bad default.2 I know it’s a backward compatibility choice to have it like this, but I think forcing people to opt in would’ve been a better solution.
The Quick Hack
OK. So now we know the issue. What to do about it? You’d think adding spring.jpa.open-in-view=false
would solve it, right? The problem is that our application had come to rely on Open Session In View (OSIV) in our controllers, and it wouldn’t have been very quick to fix all the LazyInitializationException
s that were popping out here and there.
So instead, what we can do is close the transaction manually before we render the document.
Inject the EntityManager
, and then add:
log.debug("Found user {}", user.get());
((Session) entityManager.getDelegate()).close();
String document = documentService.renderDocument(user.get());
I’m not sure this is the proper way to close the session, but it worked for us. It allowed us the fix the problem in this particular spot and leave the rest of the code unchanged so we’d have time for a proper fix.
The Proper Fix
Obviously, the proper fix is to disable open-session-in-view
with spring.jpa.open-in-view=false
, and then fix the codebase to use @Transactional
if needed. A refactoring pattern that helped us was to at first move all the code in a controller to another class using @Transactional
, and then slowly go through each class to update the services used until only the service layer was @Transactional
and the view could work with DTOs without a database connection active.
Conclusions
Having open-session-in-view
enabled is a bad default. The warning is good, but Spring prints so much output it can go unnoticed. On a new Spring Boot project, go with spring.jpa.open-in-view=false
by default, and only change it if you’re really sure about what you’re doing.
Understand where your transactions are, and don’t keep a database connection open for longer than you need it.
The discussion around it was quite long. ↩︎