Neon Latency Benchmarks
(Summary table: P50 and P99 latencies per step and in total; values are rendered dynamically on the live page.)
Connections
Connecting to Postgres requires a TCP handshake and SSL negotiation. It takes ~0ms in this benchmark, so connect + query on a warm instance takes ~0ms.
Queries
When a database is active and a connection is established, a SELECT query fetching a row by primary key takes ~0ms in this benchmark.
Cold Starts
Neon can autosuspend when idle and cold start when needed. In this benchmark, a cold start adds ~0ms of latency, resulting in the entire query taking ~0ms.
Latency by Database Variant
How do latencies compare across different sizes and configurations of database?
Detailed Stats by Database Variant
Latencies for specific variations of Neon databases.
Neon is serverless Postgres: Standard PostgreSQL in a cloud platform that separates storage and compute, unlocking features like branching, autoscaling, and scale to zero.
Connections
Before querying a Postgres database, a connection must be established.
import pg from "pg";
const { Client } = pg;
const client = new Client("...");
await client.connect();
const res = await client.query("SELECT * FROM my_table WHERE id = 1");
console.log(res.rows);
await client.end();
To create a connection (via client.connect() in the Node.js example above), several steps and network roundtrips must happen:
- DNS lookup
- TCP connection
- TLS Negotiation
- Postgres Auth and Connection
Each of these steps involves at least one roundtrip between client and host, which makes Client<>Database proximity the primary factor in the time required to establish a connection.
This benchmark runs the client in the same AWS region as the Neon Project in order to reduce connection latency.
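As a rough mental model, connection setup cost scales with the number of roundtrips multiplied by the network RTT. The roundtrip counts below are illustrative assumptions (TLS 1.3 and pipelined auth need fewer trips), not measured values:

```javascript
// Rough model: connection setup cost ≈ roundtrips × RTT.
// Roundtrip counts are illustrative assumptions, not measured values.
const ROUNDTRIPS = {
  dnsLookup: 1,
  tcpHandshake: 1,
  tlsNegotiation: 2, // TLS 1.2; TLS 1.3 typically needs only 1
  postgresAuth: 2, // startup message + auth exchange
};

function estimateConnectMs(rttMs) {
  const trips = Object.values(ROUNDTRIPS).reduce((a, b) => a + b, 0);
  return trips * rttMs;
}

// Same-region RTT (~1 ms) vs cross-continent RTT (~80 ms):
console.log(estimateConnectMs(1)); // 6
console.log(estimateConnectMs(80)); // 480
```

The gap between 6ms and 480ms is why running the client next to the database dominates every other connection optimization.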
Connections via Serverless Driver
Neon has a serverless driver that extends the standard node-postgres driver (pg) to work over HTTP and WebSockets instead of TCP.
Establishing a connection via the serverless driver is often faster because of optimizations that reduce the roundtrips required.
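For reference, a minimal query over the serverless driver's HTTP interface looks roughly like this (a sketch assuming the @neondatabase/serverless package, a DATABASE_URL environment variable, and a hypothetical table name):

```javascript
import { neon } from "@neondatabase/serverless";

// Each call issues a single HTTP request; no long-lived TCP/TLS
// session to the database is held open by the client.
const sql = neon(process.env.DATABASE_URL);

const rows = await sql`SELECT * FROM my_table WHERE id = ${1}`;
console.log(rows);
```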
Data from this benchmark shows the following difference in connection times between the two drivers:
Queries
Once a connection between client and host is established, queries are a straightforward request/response process.
import pg from "pg";
const { Client } = pg;
const client = new Client();
await client.connect();
const res = await client.query("SELECT * FROM my_table WHERE id = 1");
console.log(res.rows[0]);
await client.end();
Latencies depend on two factors:
- Client<>Database proximity - The network latency between client and database.
- Query Complexity - A complex SQL query requires more work from the database before responding.
Cold Starts
A cold start in Neon begins when a database project with a suspended compute endpoint receives a connection. Neon starts the database compute, processes the query, and serves the response. The compute stays active as long as there are active connections.
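The lifecycle described above can be sketched as a small state machine. This is a toy simulation with made-up timings, not Neon's implementation:

```javascript
// Toy model of a compute endpoint that scales to zero.
// States: "suspended" -> (cold start) -> "active" -> (idle timeout) -> "suspended"
class ComputeEndpoint {
  constructor({ coldStartMs = 500, idleTimeoutMs = 300_000 } = {}) {
    this.state = "suspended";
    this.coldStartMs = coldStartMs;
    this.idleTimeoutMs = idleTimeoutMs;
    this.lastActivity = 0;
  }

  // Returns the extra latency this connection pays, in ms.
  connect(nowMs) {
    const penalty = this.state === "suspended" ? this.coldStartMs : 0;
    this.state = "active";
    this.lastActivity = nowMs;
    return penalty;
  }

  // Called periodically; suspends the endpoint after the idle timeout.
  tick(nowMs) {
    if (this.state === "active" && nowMs - this.lastActivity >= this.idleTimeoutMs) {
      this.state = "suspended";
    }
  }
}

const ep = new ComputeEndpoint({ coldStartMs: 500, idleTimeoutMs: 300_000 });
console.log(ep.connect(0)); // 500 — first connection pays the cold start
console.log(ep.connect(1_000)); // 0 — endpoint is already active
ep.tick(400_000); // ~399s idle > 300s timeout -> suspend
console.log(ep.connect(400_001)); // 500 — cold start again
```

Only the first connection after a suspension pays the cold start penalty; every connection while the endpoint stays active sees normal connect latency.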
Scale to zero
When a Neon compute endpoint hasn't received any connections for a specified amount of time, it can autosuspend. This is useful for:
- Resource Management - Turning off unused databases is automatic.
- Cost-Efficiency - Never pay for compute that's not serving queries.
But scale to zero is only useful if compute can start up quickly again when it's needed. That's why cold start times are so important.
Applications of scale to zero
Look at the cold start times documented above and decide: in what scenarios is the occasional additional latency of a cold start acceptable?
The answer depends on the specifics of your project. Here are some example scenarios where scale to zero may be useful:
- Non-Production Databases - Development, preview, staging, test databases.
- Internal Apps - If the userbase for your app is a limited number of employees, the database is likely idle more often than active.
- Database-per-user Architectures - If you have a separate database for each user instead of a single shared database, the activity level of any one database may be low enough that scale to zero results in significant cost reduction.
- Small Projects - For small projects, configuring the production database to scale to zero can make it more cost-efficient without major impact to UX.
Benchmark methodology
To gather real-world cold start times, the benchmark uses a Neon project with a separate database branch for each of the variants above. A Node.js script executes the following steps for every branch every 30 minutes:
- Check that the database branch is autosuspended
- Connect to the branch, forcing it to start. The typical connect time is subtracted from this duration to isolate the Cold Start time. *See Limitations
- Run ten queries on the active database. The results are recorded as the Query time.
- Connect and disconnect ten times. The results are recorded as the Connect time.
- Log all timings to the main database.
- Suspend the database branch.
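The control flow of one run can be sketched as follows. The mock client stands in for a real Postgres driver so the flow is runnable as-is; the actual benchmark uses real connections, and the helper shapes here are illustrative:

```javascript
// Sketch of one benchmark run against a single branch.
// `makeClient` returns any object with async connect/query/end.
async function benchmarkRun(makeClient, { iterations = 10 } = {}) {
  // 1. Cold start + connect (the branch starts out suspended)
  let client = makeClient();
  const coldStart = Date.now();
  await client.connect();
  const coldConnectMs = Date.now() - coldStart;

  // 2. Ten queries on the active database -> Query time
  const queryTimes = [];
  for (let i = 0; i < iterations; i++) {
    const t = Date.now();
    await client.query("SELECT 1");
    queryTimes.push(Date.now() - t);
  }
  await client.end();

  // 3. Ten connect/disconnect cycles on the active database -> Connect time
  const connectTimes = [];
  for (let i = 0; i < iterations; i++) {
    client = makeClient();
    const t = Date.now();
    await client.connect();
    connectTimes.push(Date.now() - t);
    await client.end();
  }

  return { coldConnectMs, queryTimes, connectTimes };
}

// Mock driver: instant operations, just to exercise the flow.
const makeMockClient = () => ({
  connect: async () => {},
  query: async () => ({ rows: [] }),
  end: async () => {},
});

const result = await benchmarkRun(makeMockClient);
console.log(result.queryTimes.length); // 10
```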
Calculating Results
The cold start duration for every benchmark run is saved in a table.
To calculate the results, the recorded timings are fetched and the simple-statistics library is used to compute the P50, P99, and standard deviation.
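simple-statistics is a third-party library; the same percentiles can be computed by hand. Below is a nearest-rank sketch (simple-statistics interpolates between samples, so its results can differ slightly on small datasets; the sample values are made up):

```javascript
// Nearest-rank percentile: sort ascending, then index into the sorted array.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, idx)];
}

// Hypothetical cold start samples in milliseconds.
const coldStartMs = [420, 510, 465, 900, 480, 475, 430, 495, 505, 460];
console.log(percentile(coldStartMs, 50)); // 475
console.log(percentile(coldStartMs, 99)); // 900
```

P50 describes the typical experience, while P99 surfaces the occasional slow outlier, which is why both are reported.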
The code where the results are calculated can be found here.
Limitations, Improvements
Zeroing in on Cold Starts:
Because cold starts only happen when a connection is initiated, there is no good way to measure "just the cold start." In this benchmark, we try to zero in on the cold start by taking the difference between the time required to connect to an inactive instance and the time required to connect to an active instance. After running the test for weeks, we sometimes see a combination of a very fast cold start (200ms) and a very slow connection to a hot instance (300ms), which makes the reported cold start time negative.
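A worked example of how the subtraction goes negative (numbers match the scenario above):

```javascript
// Cold start time is estimated as (connect to suspended instance) minus
// (typical connect to active instance). Both are noisy measurements,
// so the difference can dip below zero.
const suspendedConnectMs = 200; // unusually fast cold start
const activeConnectMs = 300; // unusually slow warm connect
const estimatedColdStartMs = suspendedConnectMs - activeConnectMs;
console.log(estimatedColdStartMs); // -100
```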
There might be a better way to zero in on cold starts. We've stuck with this way of measuring (as opposed to using Neon API timings or audit log data) to create a benchmark that is most reflective of real-world user experience.
Comparing latency across platforms:
This benchmark uses an EC2 instance in the same region as the database. It would be useful to compare latency when connecting to Neon from other platforms, other regions, and so on.
Benchmark Code
All code for the benchmark and display of results is available on GitHub. Here is a snippet showing how the timing of the cold start is measured:
Cold Start Timing:
// Cold Start + Connect (where the database starts out suspended)
const coldTimeStart = Date.now(); // <-- Start timer
await benchClient.connect(); // <-- Connect
let coldConnectMs = Date.now() - coldTimeStart; // <-- Stop timer
Connect Timing:
const hotConnectTimes = [];
for (let i = 0; i < 10; i++) {
const benchClient = new DRIVERS[driver](connection_details);
const start = Date.now(); // <-- Start timer
await benchClient.connect(); // <-- Connect
hotConnectTimes.push(Date.now() - start); // <-- Stop timer
await benchClient.query(benchmarkQuery); // <-- Query
await benchClient.end();
}
Query Timing:
await benchClient.connect(); // <-- Connect
...
const hotQueryTimes = [];
for (let i = 0; i < 10; i++) {
const start = Date.now(); // <-- Start timer
await benchClient.query(benchmarkQuery); // <-- Query
hotQueryTimes.push(Date.now() - start); // <-- Stop timer
}
Benchmark Specifications
- Database Compute Size:
- Defaults to 0.25 CU (0.25 vCPU, 1GB RAM), with one test using 2 CU
- Database Region:
- AWS US-East-2 (Ohio)
- Benchmark Client Region:
- AWS US-East-2 (Ohio)
- Postgres Connection Type:
- Defaults to direct, with one test using Pooled
- Postgres Version:
- PostgreSQL 16
- Postgres Driver:
- Defaults to node-postgres, with one test using Neon Serverless Driver
Try It Yourself
Sign up for a free Neon account, clone the repo for this benchmark and website (Neon Cold Start Repo), and you can run and modify the benchmarks yourself. Follow the developer docs in the repo to get started.
Keep in mind that benchmarks run locally will include the roundtrip latency from your device to the AWS datacenter. The cold start numbers on this page are measured from a Lambda deployed in the same AWS region as the Neon database.