-
-
Notifications
You must be signed in to change notification settings - Fork 185
feat: shared pgx-pool #4393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: shared pgx-pool #4393
Conversation
✅ Deploy Preview for go-feature-flag-doc-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4393 +/- ##
==========================================
+ Coverage 85.61% 85.70% +0.08%
==========================================
Files 147 148 +1
Lines 6131 6155 +24
==========================================
+ Hits 5249 5275 +26
+ Misses 655 654 -1
+ Partials 227 226 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a shared connection pool for PostgreSQL retrievers, which is a significant improvement to prevent exhausting database connections. The approach of using a singleton with reference counting is sound. However, the current implementation has a critical flaw where it assumes all retrievers use the same database URI, causing them to share a single connection pool incorrectly if they are configured with different URIs. My review includes suggestions to make the pool manager URI-aware, ensuring a separate pool is managed for each unique database connection string. I've also suggested a small improvement to the Shutdown logic for better robustness. Additionally, I've noted that the checklist items for implementing reconnecting logic and adding unit tests are unchecked. It would be beneficial to add tests for the new pooling logic, especially covering multiple URIs and database disconnection scenarios, to ensure its reliability.
| package postgresqlretriever | ||
|
|
||
| import ( | ||
| "context" | ||
| "sync" | ||
|
|
||
| "github.com/jackc/pgx/v5/pgxpool" | ||
| ) | ||
|
|
||
| var ( | ||
| pool *pgxpool.Pool | ||
| mu sync.Mutex | ||
| refCount int | ||
| ) | ||
|
|
||
| func GetPool(ctx context.Context, uri string) (*pgxpool.Pool, error) { | ||
| mu.Lock() | ||
| defer mu.Unlock() | ||
|
|
||
| if pool == nil { | ||
| p, err := pgxpool.New(ctx, uri) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| if err := p.Ping(ctx); err != nil { | ||
| p.Close() | ||
| return nil, err | ||
| } | ||
|
|
||
| pool = p | ||
| } | ||
|
|
||
| refCount++ | ||
| return pool, nil | ||
| } | ||
|
|
||
| func ReleasePool() { | ||
| mu.Lock() | ||
| defer mu.Unlock() | ||
|
|
||
| refCount-- | ||
| if refCount <= 0 { | ||
| if pool != nil { | ||
| pool.Close() | ||
| pool = nil | ||
| } | ||
| refCount = 0 | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current singleton implementation for the connection pool doesn't support multiple database URIs. The global pool variable will be initialized once with the URI from the first retriever, and all subsequent retrievers, regardless of their configured URI, will receive this same pool. This can lead to retrievers connecting to the wrong database.
To fix this, the pool management should be URI-aware. I suggest using maps to store a connection pool and a reference count for each unique URI. This ensures that each database has its own dedicated pool, which is shared only by retrievers connecting to that same database.
package postgresqlretriever
import (
"context"
"sync"
"github.com/jackc/pgx/v5/pgxpool"
)
var (
pools = make(map[string]*pgxpool.Pool)
refCounts = make(map[string]int)
mu sync.Mutex
)
// GetPool retrieves a shared connection pool for the given URI.
// It creates a new pool if one doesn't exist for the URI, otherwise, it returns the existing one.
// It uses reference counting to manage the lifecycle of the pool.
func GetPool(ctx context.Context, uri string) (*pgxpool.Pool, error) {
mu.Lock()
defer mu.Unlock()
// If a pool for this URI already exists, increment its ref count and return it.
if pool, ok := pools[uri]; ok {
refCounts[uri]++
return pool, nil
}
// Otherwise, create a new pool for this URI.
p, err := pgxpool.New(ctx, uri)
if err != nil {
return nil, err
}
if err := p.Ping(ctx); err != nil {
p.Close()
return nil, err
}
pools[uri] = p
refCounts[uri] = 1
return p, nil
}
// ReleasePool decrements the reference count for a pool.
// If the reference count drops to zero, the pool is closed and removed.
func ReleasePool(uri string) {
mu.Lock()
defer mu.Unlock()
// Only proceed if the URI is being tracked.
if _, ok := refCounts[uri]; !ok {
return
}
refCounts[uri]--
if refCounts[uri] <= 0 {
if pool, ok := pools[uri]; ok {
pool.Close()
delete(pools, uri)
}
delete(refCounts, uri)
}
}| func (r *Retriever) Shutdown(ctx context.Context) error { | ||
| if r.conn == nil { | ||
| return nil | ||
| } | ||
| return r.conn.Close(ctx) | ||
| ReleasePool() | ||
| return nil | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To support the URI-aware pool manager, the Shutdown function should pass the retriever's URI to ReleasePool. This ensures that the reference count for the correct connection pool is decremented.
I've also added a nil check for r.pool to make the shutdown process more robust, preventing a call to ReleasePool if the retriever's initialization failed before the pool was assigned.
| func (r *Retriever) Shutdown(ctx context.Context) error { | |
| if r.conn == nil { | |
| return nil | |
| } | |
| return r.conn.Close(ctx) | |
| ReleasePool() | |
| return nil | |
| } | |
| func (r *Retriever) Shutdown(ctx context.Context) error { | |
| if r.pool != nil { | |
| ReleasePool(r.URI) | |
| } | |
| return nil | |
| } |
|
@gogbog thanks for starting this, Gemini said exactly what I was about to say (probably in a better way). I also was asking myself if we want to offer a way to opt-out from using the pool and use direct connection? |
|
@thomaspoignant I totally missed that use-case. We can still make the solution support multiple pools depending on how many databases the user needs to connect to - for example: For the opting-out from using the pool and using a direct connection instead, we can definitely add support for it, but what would be the benefits? From what I know, the pool - even if you only ever use a single connection - is not a bad choice because it still manages reconnections, backoff, lifecycle, and error recovery for free, whereas a direct connection gives you none of that. |
|
Your proposal with a For the optout, forget it, you are right it will always be better to have a pool. |
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors the PostgreSQL retriever to utilize a shared connection pool (pgxpool.Pool) instead of individual pgx.Conn connections. A new file, postgres.go, was introduced to manage a global map of connection pools with reference counting, allowing multiple retrievers to share pools based on their URI. The retriever.go file was updated to integrate this new pooling mechanism for initialization, shutdown, and query execution. Additionally, the code was modernized by replacing interface{} with any for query arguments and map types. Review comments noted the use of global variables for connection pool management as a common design choice, suggested adding a context.Context to the ReleasePool function for API consistency, and praised the modernization to any types.
| var ( | ||
| mu sync.Mutex | ||
| poolMap = make(map[string]*poolEntry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of global variables (mu and poolMap) to implement the singleton pattern for the connection pool is a common approach for shared resources. However, global state can sometimes make testing more complex and introduce implicit dependencies. While acceptable for this specific use case, it's a design choice that should be considered carefully in larger architectures.
|



Description
This PR addresses an issue with how database connections are managed when using multiple Postgres retrievers.
Currently, each retriever creates its own dedicated connection, which leads to hundreds of open connections over time.
It resolves the problem by introducing a shared pgxpool implemented using a singleton pattern. All retrievers now reuse the same connection pool instead of creating new individual connections.
How to test:
Breaking changes:
None expected. Retrievers now use a pooled connection, but the change is internal and transparent to users.
Closes issue(s)
Resolve #4392
Checklist
README.mdand/website/docs)