Fix deadlock and couple more problems in DefaultConnectionPool
#699
This commit fixes two problems in the `OpenConcurrencyLimiter`:
- deadlock in `DefaultConnectionPool`;
- potential removal from `OpenConcurrencyLimiter.desiredConnectionSlots` when it is empty.

Deadlock

a) Before introducing `OpenConcurrencyLimiter`, the "AsyncGetter" thread was used only to do blocking `ConcurrentPool.get`.
b) After, I started to additionally use "AsyncGetter" to do blocking `OpenConcurrencyLimiter.waitUntilOpenPermitAvailable`.

As a result, we may have a thread that gets the last connection (`maxSize` is reached) from `ConcurrentPool` and submits `waitUntilOpenPermitAvailable` to "AsyncGetter". Concurrently with this happening, "AsyncGetter" tries to get from `ConcurrentPool` and is blocked because there are no more connections available. In such an execution, `waitUntilOpenPermitAvailable` cannot be completed by "AsyncGetter" because "AsyncGetter" is blocked doing a different task, which itself cannot be completed.

A solution is to do `ConcurrentPool.get` and `waitUntilOpenPermitAvailable` in different threads. This way these two different kinds of tasks will not block each other by waiting in the same queue to be done by a single thread.

Potential removal from `OpenConcurrencyLimiter.desiredConnectionSlots` when it is empty

If `acquirePermitOrGetAvailableOpenedConnection` is called with `true` as `tryGetAvailable`, and `getPooledConnectionImmediately` throws an exception, then `expressDesireToGetAvailableConnection` is not called but `giveUpOnTryingToGetAvailableConnection` is still called, which is incorrect.

JAVA-3928
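The deadlock described above can be reproduced in miniature. The following is only a sketch under stated assumptions, not the driver's code: the class `AsyncGetterDeadlockSketch`, the method `openTaskCompletes`, and the use of a `Semaphore` with zero permits as a stand-in for an exhausted `ConcurrentPool` are all hypothetical. It shows a blocking "get" task and an "open" task sharing one single-threaded executor, and then the same two tasks on separate executors.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; none of these names exist in the driver.
final class AsyncGetterDeadlockSketch {
    /** Returns true if the "open" task manages to run within one second. */
    static boolean openTaskCompletes(final boolean separateThreads) {
        // Zero permits: a stand-in for a pool whose maxSize has been reached.
        final Semaphore pool = new Semaphore(0);
        final ExecutorService getter = Executors.newSingleThreadExecutor();
        final ExecutorService opener = separateThreads ? Executors.newSingleThreadExecutor() : getter;
        try {
            final CountDownLatch getStarted = new CountDownLatch(1);
            // Task A: a blocking get that can only finish once something is returned to the pool.
            getter.submit(() -> {
                getStarted.countDown();
                pool.acquire();
                return null;
            });
            getStarted.await();
            // Task B: the work that would eventually return something to the pool.
            // On a shared single thread it is queued behind task A and never runs.
            final Future<?> open = opener.submit(() -> pool.release());
            try {
                open.get(1, TimeUnit.SECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Interrupts the blocked task so that the worker threads can terminate.
            getter.shutdownNow();
            opener.shutdownNow();
        }
    }

    public static void main(final String[] args) {
        System.out.println("one shared thread: " + openTaskCompletes(false));
        System.out.println("separate threads:  " + openTaskCompletes(true));
    }
}
```

With a shared thread, task B waits in the queue behind task A, and task A waits for what only task B can provide, so neither finishes. With separate threads, the two kinds of tasks no longer wait in the same queue.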
private static Stream<Arguments> concurrentUsageArguments() {
    return Stream.of(
        Arguments.of(0, 1, 8, true, false, 0.02f),
        Arguments.of(0, 1, 8, false, true, 0.02f),
Of all 5 parameter sets, only this one reproduced the deadlock.
  /**
-  * Package-private methods are thread-safe,
+  * Package-access methods are thread-safe,
"package-private" is not a term the JLS uses, while "package-access" is.
  } finally {
      try {
-         if (tryGetAvailable && availableConnection == null) {//the desired connection slot has not yet been removed
+         if (expressedDesireToGetAvailableConnection && availableConnection == null) {
This fixes the second problem I discovered while implementing the pausable pool. `getPooledConnectionImmediately` may throw an exception there, which in turn may result in this code calling `giveUpOnTryingToGetAvailableConnection` despite `expressDesireToGetAvailableConnection` not having been called.
This is an example of why one should, if possible, not rely on methods not throwing exceptions.
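The pattern behind this fix can be sketched as follows. This is a hypothetical illustration, not the driver's code: `DesireFlagSketch`, its `acquire` method, and the `guardWithFlag` switch are made-up names, and an `ArrayDeque` stands in for `desiredConnectionSlots`. The point is that the cleanup in `finally` is guarded by a flag recording what actually happened, rather than by the input parameter.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration; none of these names exist in the driver.
final class DesireFlagSketch {
    private final Deque<Object> desiredSlots = new ArrayDeque<>();

    int desiredSlotCount() {
        return desiredSlots.size();
    }

    /**
     * guardWithFlag == false mimics the old condition (based on the parameter),
     * guardWithFlag == true mimics the new one (based on what was actually done).
     */
    void acquire(final boolean tryGetAvailable, final boolean immediateGetThrows, final boolean guardWithFlag) {
        boolean expressedDesire = false;
        Object availableConnection = null;
        try {
            if (tryGetAvailable) {
                if (immediateGetThrows) {
                    // Stand-in for the immediate get failing before any slot is added.
                    throw new IllegalStateException("immediate get failed");
                }
                desiredSlots.addLast(new Object()); // stand-in for expressing the desire
                expressedDesire = true;
            }
            // Waiting for a permit or an available connection is elided; nothing is obtained here.
        } finally {
            final boolean cleanUp = guardWithFlag ? expressedDesire : tryGetAvailable;
            if (cleanUp && availableConnection == null) {
                // Stand-in for giving up; throws NoSuchElementException if the deque is empty.
                desiredSlots.removeLast();
            }
        }
    }
}
```

With the parameter-based guard, a failure in the immediate get triggers a removal from an empty deque, and the resulting `NoSuchElementException` also hides the original exception. With the flag-based guard, the original exception propagates and nothing is removed.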
JAVA-3927
   * spuriously or because of receiving a signal.
   */
- private long awaitNanos(final long timeoutNanos) {
+ private long awaitNanos(final Condition condition, final long timeoutNanos) throws MongoInterruptedException {
This change makes the behavior of this method consistent with `Timeout` and makes it explicit at the call site which condition is being waited for.
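As a rough sketch of the shape of such a helper, assuming nothing about the driver's actual body: `AwaitNanosSketch` and `SketchInterruptedException` below are hypothetical names (the latter is a stand-in for an unchecked exception such as `MongoInterruptedException`), and restoring the interrupt status is a choice made for this sketch only. Taking the `Condition` as a parameter lets the caller state which condition it waits on.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; none of these names exist in the driver.
final class AwaitNanosSketch {
    /** Unchecked stand-in for an interruption exception. */
    static final class SketchInterruptedException extends RuntimeException {
        SketchInterruptedException(final InterruptedException cause) {
            super(cause);
        }
    }

    /**
     * Waits on the given condition; the caller must hold the associated lock.
     * Returns the value reported by Condition.awaitNanos.
     */
    static long awaitNanos(final Condition condition, final long timeoutNanos) {
        try {
            return condition.awaitNanos(timeoutNanos);
        } catch (InterruptedException e) {
            // Restoring the interrupt status is an assumption of this sketch.
            Thread.currentThread().interrupt();
            throw new SketchInterruptedException(e);
        }
    }

    public static void main(final String[] args) {
        final ReentrantLock lock = new ReentrantLock();
        final Condition permitAvailable = lock.newCondition();
        lock.lock();
        try {
            // The call site names the condition it waits for.
            awaitNanos(permitAvailable, 1_000_000L);
        } finally {
            lock.unlock();
        }
    }
}
```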
On the one hand, `ConcurrentPool.ensureMinSize` tries not to require the caller to do what the method can do itself, e.g., it releases the connection to the pool itself. On the other hand, always releasing in `ensureMinSize` may lead to releasing a permit for the same connection twice, thus not respecting `maxSize`. It is not easy (it requires an additional knob and logic) to prevent the caller (`DefaultConnectionPool`) from releasing a connection when initialization fails. It therefore seems better to mandate that the caller releases a connection if initialization fails.

JAVA-3927
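Why a double release breaks `maxSize` can be shown with a plain `java.util.concurrent.Semaphore`. This is a simplified sketch, assuming permits are counted by a semaphore; `DoubleReleaseSketch` and its methods are hypothetical and do not mirror the actual `ConcurrentPool` code.

```java
import java.util.concurrent.Semaphore;

// Hypothetical illustration; none of these names exist in the driver.
final class DoubleReleaseSketch {
    /** Returns the number of available permits after a failed initialization. */
    static int permitsAfterFailedInitialization(final int maxSize, final boolean poolAlsoReleases) {
        final Semaphore permits = new Semaphore(maxSize);
        permits.acquireUninterruptibly(); // permit taken for the new item
        try {
            initialize();
        } catch (IllegalStateException e) {
            if (poolAlsoReleases) {
                permits.release(); // the pool releases the permit itself
            }
            permits.release(); // the caller releases, as the contract requires
        }
        return permits.availablePermits();
    }

    private static void initialize() {
        throw new IllegalStateException("initialization failed");
    }

    public static void main(final String[] args) {
        System.out.println("both release:        " + permitsAfterFailedInitialization(1, true));
        System.out.println("only caller releases: " + permitsAfterFailedInitialization(1, false));
    }
}
```

A `Semaphore` does not track which permits were handed out, so an extra `release` simply raises the count above the initial value. With `maxSize` of 1, releasing on both sides leaves 2 permits available, while releasing only in the caller leaves 1.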
One question but LGTM
driver-core/src/main/com/mongodb/internal/connection/ConcurrentPool.java
driver-core/src/main/com/mongodb/internal/connection/DefaultConnectionPool.java
LGTM!
I discovered 3 problems in `DefaultConnectionPool` and `ConcurrentPool` introduced by me in #685.

1. Deadlock in `DefaultConnectionPool`.
2. Potential removal from `OpenConcurrencyLimiter.desiredConnectionSlots` despite not putting anything there previously. See this commit message for the details.
3. `ConcurrentPool.ensureMinSize` explicitly requires a caller to release resources if `initialize` throws an exception, but then releases the permit for `newItem` itself, which leads to a double release. This was by far the hardest problem to investigate and find the cause of. See this commit message for more details.

Commits in this PR can be reviewed one by one.
JAVA-3927