Community Article

Connection Pooling, PgBouncer, and the Prisma Trap

What a connection pool actually does, why your Postgres falls over at 200 connections, where PgBouncer sits, and the prepared-statement bug that bites every Prisma team that adds it the wrong way.

Connection Pooling, PgBouncer, and the Prisma Trap

What a connection pool actually does, why your Postgres falls over at 200 connections, where PgBouncer sits, and the prepared-statement bug that bites every Prisma team that adds it the wrong way.

database
performance
backend
scalability
reliability
ananyanakamura

By @ananyanakamura

January 8, 2026

·

Updated May 20, 2026

322 views

2

4.3 (13)

Every team I have worked with hits the same wall around the time their app starts to take real traffic. Postgres connections climb past 100, then 200, p99 latency starts to creep, and someone opens a ticket that says "the database is slow". The database is not slow. The database is out of connections, or close to it, and the cost of opening new ones is eating your tail latency. The fix is connection pooling. The way you add it determines whether the fix is a one-day change or a six-month adventure with prepared-statement bugs in production.

This article is the one I wish I had read before I shipped my first PgBouncer setup. What a connection actually costs, where the pool should sit, the difference between session and transaction pooling, and why every team that introduces PgBouncer in front of an ORM (especially Prisma) hits the same prepared-statement issue and has to figure out the same fix.

What a Postgres connection actually costs

When your app opens a new Postgres connection, the server forks a backend process. That process holds memory (work_mem allocations, plan cache, prepared statements), file descriptors, and a TCP socket. Postgres is process-per-connection, not thread-per-connection. The default max_connections is 100. Crossing that limit means new connections fail outright.

Even within the limit, idle connections are not free. A typical idle backend holds 5-10MB of RAM. With 200 idle connections, that is 1-2GB of RAM doing nothing. With 1000, you are out of memory. The Postgres team's official guidance is roughly "do not have more connections than (number of CPU cores) * 2 to 4", because each backend competes for shared resources (the shared buffer pool, lock manager, autovacuum). A 4-core box with 50+ active connections is thrashing.

Most app servers want one connection per request, scaled to the request concurrency. A Node service with 100 concurrent requests wants 100 connections. Run 10 instances and you want 1000. 1000 connections to a Postgres tuned for 100 is the wall.

What a connection pool actually does

A connection pool sits between your app and Postgres, holding a fixed set of long-lived connections to the database and handing them out to clients on demand. The math changes:

Without pool:
  10 app instances * 100 concurrent requests = 1000 Postgres connections

With a transaction-mode PgBouncer in front:
  10 app instances * 100 concurrent requests = 1000 PgBouncer connections (cheap, no fork)
  PgBouncer to Postgres: configured with default_pool_size = 25
  -> 25 actual Postgres connections, shared across all clients

PgBouncer holds one connection per concurrent transaction, not per concurrent client. If your transactions take 5ms on average, 25 backend connections can serve 5000 transactions per second. The frontend can have many more clients than the backend has connections, which is the entire point.

Session, transaction, and statement pooling

PgBouncer offers three pooling modes; the difference between them is what determines correctness for your application.

Session pooling assigns a backend connection to a client for the entire duration of the client's session. Behaves identically to a direct connection. Safe for everything, but reduces the multiplexing benefit: 1000 long-lived clients still need 1000 backend connections.

Transaction pooling assigns a backend connection to a client only for the duration of a transaction. After COMMIT or ROLLBACK, the connection returns to the pool. This is the mode that gives you the throughput improvement, because most transactions are short and a single backend connection can serve many clients in a second. The cost is that any feature that depends on connection-level state breaks: SET LOCAL (well, SET LOCAL is fine because it scopes to the transaction; plain SET is not), session variables, advisory locks taken across transactions, prepared statements (the trap I will get to), and listening on LISTEN channels.

Statement pooling assigns a backend connection per statement. Even less state survives. Almost no real application can run at statement pooling because multi-statement transactions are no longer transactions. I have never recommended this in production.

Transaction pooling is what almost everyone runs at, and the rest of this article assumes it.

The Prisma trap (and other ORMs that prepare)

Most ORMs, including Prisma, use prepared statements under the hood. A prepared statement is parsed, planned, and cached on the server side, identified by a name. When you execute it later, you reference the name plus the parameters. This saves the planner work and protects against SQL injection cleanly.

The trap: prepared statements are stored on the backend connection, not the client. When PgBouncer is in transaction-pooling mode, the next transaction might run on a different backend connection that has never seen the prepared statement. The client sends EXECUTE stmt_42, the new backend says "I do not have stmt_42", and the query errors out.

Prisma in particular runs into this because it caches prepared statement names per connection (reasonable assumption with a direct connection), and PgBouncer breaks the assumption. The error you see in production is something like:

Error: prepared statement "s0" does not exist

And it happens intermittently, which is the worst kind. Some requests work, some fail, and the failure rate scales with how aggressively your traffic is multiplexing across backends.

There are three real fixes; pick one.

Fix one: switch off prepared statements. Add ?pgbouncer=true to your Prisma connection URL (or the equivalent in your ORM). This tells the client to never prepare statements, just execute them as plain queries. The cost is a small per-query overhead from re-parsing on the server. For most workloads this is in the microsecond range and not worth optimizing.

DATABASE_URL="postgresql://user:pass@pgbouncer:6432/db?pgbouncer=true&connection_limit=10"

Fix two: bypass the pooler for the prepared-statement-using clients. Run two connection strings: a pooled one for the bulk of your traffic, and a direct one for code paths that really need prepared statements (or migrations). This is what Prisma's documented pattern looks like: DATABASE_URL for runtime queries (pooled), DIRECT_URL for migrations (direct).

Fix three: use a pooler that understands prepared statements. Newer poolers like PgCat and Supavisor in transaction-mode-with-prepared-statement-support detect the PARSE extended-protocol message and rewrite the statement name to one shared across the pool. PgBouncer 1.21 added experimental prepared-statement support too. This eliminates the constraint at the cost of running a less-battle-tested pooler. I default to fix one for new projects unless I have a specific reason.

Where the pooler should sit

Three placements I have used; the trade-off is latency vs how much pool you can build.

Co-located with the database. PgBouncer runs on the same machine as Postgres (or right next to it in the same VPC). Each app instance opens connections to PgBouncer over the network. Pros: one pool for the whole fleet, easy to see all activity in one place. Cons: every query pays the app-to-pooler network round trip on top of pooler-to-Postgres.

Co-located with the application. A PgBouncer sidecar per app instance. The app talks to localhost; PgBouncer talks to Postgres. Pros: app-to-pooler latency is essentially zero. Cons: each instance has its own pool, so the total backend connections is (instances * pool_size), and you have to plan capacity that way. Also, configuration drift: 100 sidecars to keep in sync.

Two-tier: sidecar + central. Each app instance has a PgBouncer sidecar (low latency for in-app queries) connecting to a central PgBouncer (which then connects to Postgres). The central pool sets the actual Postgres-side connection budget. This is what I have ended up with for systems large enough that the per-instance latency win mattered.

Two-tier pooler topology
  app instance 1 -> sidecar PgBouncer (transaction mode, 100 client conns)
                    -> central PgBouncer (transaction mode, pool_size=25)
                       -> Postgres primary
  app instance 2 -> sidecar PgBouncer
                    -> central PgBouncer
                       -> Postgres primary
  app instance N -> sidecar PgBouncer
                    -> central PgBouncer
                       -> Postgres primary

Sizing the pool

The wrong default I see most often is pool_size=100 on PgBouncer. Postgres tuned for 100 real connections only sees its own 100 if PgBouncer's pool is sized that way. The right answer is much smaller. The formula I use:

pool_size = (number of Postgres CPU cores * 2) + (effective spindles)
          = (8 cores * 2) + 0 (SSD)
          = 16

Most modern boxes are SSD-backed, so the spindle term is roughly zero. For an 8-core Postgres, 16 backend connections is plenty. Each backend can serve hundreds of transactions per second; the concurrency you actually need is bounded by CPU.

Counterintuitively, lowering pool_size often improves throughput, because Postgres backends stop fighting over shared buffers and lock manager state. I have shipped one change (pool_size from 100 to 20) that cut p99 latency in half without changing anything else.

A pre-launch checklist for adding a pooler

Before you put PgBouncer in front of a system that has been running on direct connections for months, run through this list.

  1. Audit your code for connection-level state. SET (not SET LOCAL), session-scoped advisory locks, LISTEN, pg_temp schema usage. Any of those break under transaction pooling.
  2. If you use Prisma or another ORM with prepared statements, decide which fix you are applying (disable preparation, dual URL, or a pooler that supports it). Test the fix locally with PgBouncer running in transaction mode.
  3. Migration tools usually need a direct connection, not the pooled one. Migrations issue DDL, CREATE INDEX CONCURRENTLY, etc., that do not behave correctly through transaction pooling.
  4. Add a queue depth metric on the pooler. PgBouncer's cl_waiting (clients waiting for a backend) is the canary; when it spikes, your pool is too small or your queries are too slow.
  5. Roll the change out one app instance at a time. The failure mode ("prepared statement does not exist") will appear immediately if your fix is wrong; you want to catch it on instance one, not 100.

What I would deploy today

For a new system on Postgres, my default is:

  • PgBouncer in transaction-pooling mode, sidecar + central topology if the fleet is large.
  • pool_size calculated from CPU cores, not the gut feel of the engineer who set it up.
  • Prisma (or other prepared-statement-using ORM) with ?pgbouncer=true to disable preparation, unless the workload genuinely measures slower for it. Most do not.
  • Separate DIRECT_URL for migrations and any code path that needs connection-level state.
  • Metrics on cl_waiting, pool_size, and per-pool wait time, alerting at 100ms p99 client wait.

The framing that helps me explain it to teams

A connection pool is not a performance optimization. It is a multiplexing layer that decouples the number of concurrent clients from the number of backend connections, so you can scale your app horizontally without scaling your database connection budget. The Prisma trap and its cousins are not bugs in Prisma or PgBouncer; they are the visible cost of state being held on the wrong side of the multiplexer. Once you frame it that way, the fixes (disable that state, or use a pooler that knows about it) are obvious. The mistake teams make is treating the pool as a transparent caching layer when it is actually a stateful protocol bridge with sharp edges. Knowing the edges is the difference between shipping it once and chasing intermittent errors for two weeks.

Back to Articles