Skip to content

Conversation

@egonelbre
Copy link
Contributor

No description provided.

@egonelbre egonelbre requested review from a team as code owners November 21, 2025 13:45
@product-auto-label product-auto-label bot added the api: spanner Issues related to the Spanner API. label Nov 21, 2025
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @egonelbre, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a subtle bug in the Spanner client's session management where the count of in-use sessions could become negative due to an erroneous double decrement. The change ensures the session counter is accurately maintained, preventing potential issues with session pool health and resource management. New comprehensive tests have been added to validate the fix and prevent future regressions.

Highlights

  • Fix double decrement in session counting: The recycle method in spanner/session.go was modified to prevent a double decrement of the numInUse counter. Previously, if a session was destroyed due to expiration and the pool was full, s.destroy would decrement the counter, and then s.pool.decNumInUseLocked would decrement it again. The fix introduces an early return after s.destroy in this specific scenario.
  • Added new test cases for session recycling: Two new test cases, TestSessionRecycleAfterPoolClose_NoDoubleDecrement and TestSessionRecycle_AlreadyInvalidSession, were added to spanner/session_test.go. These tests specifically target scenarios where a session might be recycled when its pool is invalid or when the session itself has already been destroyed, ensuring the numInUse counter remains accurate and does not go negative.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a potential race condition in session management that could lead to a double decrement of the in-use session counter. The fix in spanner/session.go is straightforward and correct, preventing the second decrement by returning early. The addition of two targeted tests in spanner/session_test.go is excellent, as they cover the specific scenarios where this bug could manifest. The changes are well-contained and effectively resolve the issue. I have one minor suggestion to improve consistency in the new test code.

@egonelbre egonelbre force-pushed the fix-double-dec-session branch from 0c379ad to 9e8f2ea Compare November 21, 2025 13:52
@rahul2393
Copy link
Contributor

@egonelbre Thanks for opening this PR, we are planning to remove all session pool related code in next release since by default client lib uses multiplexed sessions, are you still using regular session in your application? if yes, can you please tell us any specific reason for using it?

@egonelbre
Copy link
Contributor Author

@rahul2393 sure.

So we've delayed upgrading spanner library for some time because they ended up stalling out our CI pipeline and were unsure why. But, we don't have a nice small reproducer. Yesterday and today I finally had time to bisect the issue to 0491ba6 commit. After updating to latest spanner library and "GOOGLE_CLOUD_SPANNER_MULTIPLEXED_SESSIONS=false" the CI pipeline does seem to work again.

The high level scenario is following, the CI pipeline sets up 7 spanner emulators (to avoid emulator internal locks), each tests sets up and tears down a new database/tables. Tests run in a highly parallel fashion (I would guess about 8000 tests, each setting up 2-3 databases per test and 2-6 connection pools per test). After some time the spanner emulator stops accepting connections. This suggests that something isn't being properly cleaned up. I'm guessing that either connections or sessions are left open, which eventually exhaust the spanner emulator. Or that some goroutines are not being shut down. Or that some retry mechanism is causing problems. But, it could also be a spanner emulator bug with regards to multiplexed sessions that break it down.

I'm still figuring out what the next best step is in debugging multiplexed sessions. Probably will either try to add some monitoring against spanner emulators. Any suggestions for debugging would be very welcome.

This exact fix came about when I ran with that specific environment flag disabled, and had one test failure with the specific message. Directed gemini to diagnose the issue for me. And, if the single sessions are being removed I'm fine closing the PR.

@egonelbre
Copy link
Contributor Author

egonelbre commented Nov 21, 2025

It looks like one of the issues is #13396; still testing whether it resolves the full problem.

EDIT: seems that it's not sufficient, but it does avoid goroutine buildup.

@rahul2393 rahul2393 added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 21, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Nov 21, 2025
@egonelbre
Copy link
Contributor Author

egonelbre commented Nov 24, 2025

@rahul2393 as far as I can able to debug. It seems it's partially related to #10378 issue; and partially that the spanner emulator falters around having ~10000 sessions open (GoogleCloudPlatform/cloud-spanner-emulator#159). I'm not yet sure why multiplexed connections make it worse.

@rahul2393
Copy link
Contributor

@olavloite / @sakthivelmanii need help in one more approval to merge this one, please help

@rahul2393 rahul2393 merged commit e036421 into googleapis:main Nov 25, 2025
9 of 10 checks passed
@egonelbre egonelbre deleted the fix-double-dec-session branch November 25, 2025 15:50
krishnamd-jkp pushed a commit to krishnamd-jkp/google-cloud-go that referenced this pull request Dec 8, 2025
rahul2393 added a commit that referenced this pull request Dec 15, 2025
PR created by the Librarian CLI to initialize a release. Merging this PR
will auto trigger a release.

Librarian Version: v1.0.0
Language Image:
us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/librarian-go@sha256:718167d5c23ed389b41f617b3a00ac839bdd938a6bd2d48ae0c2f1fa51ab1c3d
<details><summary>spanner: 1.87.0</summary>

##
[1.87.0](spanner/v1.86.1...spanner/v1.87.0)
(2025-12-10)

### Features

* Add Send and Ack mutations for Queues (PiperOrigin-RevId: 832425466)
([185951b](185951b3))

* Exposing AutoscalingConfig in InstancePartition (PiperOrigin-RevId:
825184314)
([185951b](185951b3))

* Add Spanner location API (PiperOrigin-RevId: 833474957)
([185951b](185951b3))

* Add QueryAdvisorResult for query plan (PiperOrigin-RevId: 832425466)
([185951b](185951b3))

* improve the SQL formatting when printing out SQL (#13267)
([af0806f](af0806f4))

* Add grpc.xds.resource_type label to xDS client metrics (#13358)
([b9196cf](b9196cf6))

* support subquery in View Join (#13266)
([d19f797](d19f797b))

### Bug Fixes

* add env var to allow disabling directpath bound token (#13265)
([029bc79](029bc795))

* fix createMultiplexedSession goroutine leak (#13396)
([1805e89](1805e895))

* decoding spanner rows using SelectAll should map values in correct
annotations (#13301)
([315f65b](315f65b5))

* error instead of panic for iterator after tx end (#13366)
([a27c19a](a27c19ae))

* transaction_tag should be set on BeginTransactionRequest (#13463)
([a429aea](a429aea4))

* Configure keepAlive time for gRPC TCP connections (#13216)
([ca8f64e](ca8f64e0))

* avoid double decrement in session counting (#13395)
([e036421](e0364214))

### Documentation

* minor update for Spanner Location API (PiperOrigin-RevId: 834841888)
([185951b](185951b3))

* Update description for the BatchCreateSessionsRequest and Session
(PiperOrigin-RevId: 832425466)
([185951b](185951b3))

* Update description for the IsolationLevel (PiperOrigin-RevId:
832425466)
([185951b](185951b3))

</details>
cpriti-os pushed a commit to cpriti-os/google-cloud-go that referenced this pull request Dec 16, 2025
…3464)

PR created by the Librarian CLI to initialize a release. Merging this PR
will auto trigger a release.

Librarian Version: v1.0.0
Language Image:
us-central1-docker.pkg.dev/cloud-sdk-librarian-prod/images-prod/librarian-go@sha256:718167d5c23ed389b41f617b3a00ac839bdd938a6bd2d48ae0c2f1fa51ab1c3d
<details><summary>spanner: 1.87.0</summary>

##
[1.87.0](googleapis/google-cloud-go@spanner/v1.86.1...spanner/v1.87.0)
(2025-12-10)

### Features

* Add Send and Ack mutations for Queues (PiperOrigin-RevId: 832425466)
([185951b](googleapis@185951b3))

* Exposing AutoscalingConfig in InstancePartition (PiperOrigin-RevId:
825184314)
([185951b](googleapis@185951b3))

* Add Spanner location API (PiperOrigin-RevId: 833474957)
([185951b](googleapis@185951b3))

* Add QueryAdvisorResult for query plan (PiperOrigin-RevId: 832425466)
([185951b](googleapis@185951b3))

* improve the SQL formatting when printing out SQL (googleapis#13267)
([af0806f](googleapis@af0806f4))

* Add grpc.xds.resource_type label to xDS client metrics (googleapis#13358)
([b9196cf](googleapis@b9196cf6))

* support subquery in View Join (googleapis#13266)
([d19f797](googleapis@d19f797b))

### Bug Fixes

* add env var to allow disabling directpath bound token (googleapis#13265)
([029bc79](googleapis@029bc795))

* fix createMultiplexedSession goroutine leak (googleapis#13396)
([1805e89](googleapis@1805e895))

* decoding spanner rows using SelectAll should map values in correct
annotations (googleapis#13301)
([315f65b](googleapis@315f65b5))

* error instead of panic for iterator after tx end (googleapis#13366)
([a27c19a](googleapis@a27c19ae))

* transaction_tag should be set on BeginTransactionRequest (googleapis#13463)
([a429aea](googleapis@a429aea4))

* Configure keepAlive time for gRPC TCP connections (googleapis#13216)
([ca8f64e](googleapis@ca8f64e0))

* avoid double decrement in session counting (googleapis#13395)
([e036421](googleapis@e0364214))

### Documentation

* minor update for Spanner Location API (PiperOrigin-RevId: 834841888)
([185951b](googleapis@185951b3))

* Update description for the BatchCreateSessionsRequest and Session
(PiperOrigin-RevId: 832425466)
([185951b](googleapis@185951b3))

* Update description for the IsolationLevel (PiperOrigin-RevId:
832425466)
([185951b](googleapis@185951b3))

</details>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: spanner Issues related to the Spanner API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants