PostgreSQL Performance Tuning & Parameter Optimization Guide

In our work as seasoned database specialists, we often approach Postgres performance tuning by asking how best to tune a range of configuration parameters for a specific workload and hardware setup. But it’s not always obvious why you should start there, where exactly to begin, or even how to define good database performance.

Having a strong grasp of these configuration parameters is crucial because they directly affect how Postgres uses system resources, manages data, and executes queries. Knowing how to fine-tune them helps you optimize memory usage, disk I/O, query execution plans, and concurrency so you can meet the challenges and expectations your application needs to serve: being reliable, easily scalable, and resource (and thus cost) efficient.

As database practitioners who get excited about crafting solutions to complex engineering problems, we’ll also share lessons from tuning Postgres performance to scale applications like Nextdoor, Clubhouse, and Coinbase that serve millions of users.

Why Postgres performance tuning matters

While it’s hard to define what good Postgres performance looks like, since it depends on your specific business goals, it’s easy to know what failure looks like. Each of these four examples makes clear why regular Postgres performance tuning is so important.

Postgres outages

During an outage, users experience frustration, inconvenience, and a loss of trust in the reliability of your service. For businesses, outages can result in significant financial losses due to interrupted transactions, missed opportunities, and reputational damage. Even worse, outages are rarely self-contained events; instead they ripple outwards, causing disruptions downstream. 

For example, Luke Demi talked about how insufficient database scaling due to Heroku’s limitations meant that Clubhouse ran into frequent micro outages, which prompted his recommendation to transition to a more scalable and reliable database. It’s no surprise that this is often the moment when companies seek to modernize their database performance.

Scaling bottlenecks

When companies scale, their databases see greater workloads and user concurrency. Inefficient databases will not be able to handle the increased throughput and will become less responsive or even cause outages. This throttles the business’ growth since the lagging database cannot keep pace and becomes a limiting factor. 

Higher costs

Inefficient Postgres databases use a lot of CPU and memory. Oftentimes teams throw more hardware at the problem just to keep things running in the short term, but this is a bad medium- or long-term strategy as the costs will keep climbing.

Poor end user experience

Slow load times, unresponsive pages, or error messages. What do these three have in common? An inefficient Postgres database. When your database is struggling to keep up with the demands of your application, it can lead to all sorts of performance issues that directly impact the user experience. This sluggishness can make your app feel clunky and unresponsive.

Warning signs that signal poor Postgres performance

Having dealt with a diverse range of Postgres database performance issues, we have a few go-to “red flags” that signal when a database needs some attention. 

One of the most common red flags is long-running queries. When we notice certain queries taking significantly longer than usual to execute, it's a clear indication that something isn't quite right with our database performance.

High CPU utilization is another warning sign. If the CPU is constantly maxed out, it's clear that the database server is struggling to keep up with the workload.

Finally, there's lock contention, which occurs when multiple transactions try to modify the same data concurrently. This is a bit harder to isolate because it surfaces through several signs: you might notice a spike in query execution times or deadlock events (transactions unable to proceed).
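If you want a quick way to surface the first and third of these red flags, the pg_stat_activity view shows long-running queries along with whether a session is currently waiting on a lock. A minimal sketch (the one-minute threshold is arbitrary, so adjust it to your workload):

```sql
-- Sessions that have been running a query for over a minute,
-- plus whether each one is currently waiting on a lock.
SELECT pid,
       now() - query_start AS runtime,
       wait_event_type,
       wait_event,
       left(query, 80) AS query_snippet
FROM pg_stat_activity
WHERE state = 'active'
  AND now() - query_start > interval '1 minute'
ORDER BY runtime DESC;
```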

8 key parameters for Postgres optimization 

If you’ve encountered any of these warning signs, it’s time to think about what you can do to proactively address and improve database performance before an outage happens. We’ll share some parameters, ranked by how we would prioritize a fix, that you can try out and experiment with to see improvements almost immediately.

Indexes

Indexes are critical for optimizing query performance because they make data retrieval faster. They’re a very simple mechanism for reducing the time Postgres takes to locate and retrieve the data you need, and the usual first step is to create indexes on the most frequently queried columns. We often start here because it’s an easy way to scale up and buy breathing room for exploring more complex solutions that sustain scalability in the long run.

Note, however, that indexes are not a free lunch: they impose overhead on write operations. The more indexes you create, the slower write operations such as inserts, updates, and deletes can become. Why? Every time data is inserted, updated, or deleted in a table with indexes, the corresponding indexes must be updated to reflect the changes. Hence, it’s important to consider the trade-offs and prioritize which columns to index (e.g. those frequently used in queries, such as columns in WHERE clauses or JOIN conditions).
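As a small sketch, suppose your application constantly filters an orders table by customer; the table and column names here are hypothetical:

```sql
-- Hypothetical table: queries frequently filter on customer_id, so index it.
-- CONCURRENTLY avoids blocking writes while the index is built.
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);

-- A composite index helps when two columns are filtered together,
-- but remember: every extra index slows down inserts, updates, and deletes.
CREATE INDEX CONCURRENTLY idx_orders_customer_created
    ON orders (customer_id, created_at);
```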

Query optimization

It sounds pretty obvious that queries should be optimized for better Postgres performance, but how do you actually go about achieving that?

First, you’ll need to home in on which queries, or specific parts of queries, are causing performance bottlenecks. Telltale signs include queries with sequential scans, nested loops, or excessive sorting and aggregation.
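The usual way to surface those signs is EXPLAIN. A hedged example against the same hypothetical orders table:

```sql
-- ANALYZE actually runs the query; BUFFERS shows how much data was read.
EXPLAIN (ANALYZE, BUFFERS)
SELECT customer_id, sum(total)
FROM orders
WHERE created_at >= '2024-01-01'
GROUP BY customer_id;

-- In the output, watch for "Seq Scan" on large tables, sorts that spill to
-- disk ("Sort Method: external merge"), and nested loops with high row counts.
```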

From there, you can start by creating or modifying indexes (that’s why we covered them first!). More extensive refactoring is a judgment call that comes with experience. You can use WHERE clauses to filter data more effectively and reduce the amount of data your queries process. Check that you don’t have query patterns that trigger full-table scans or rely on excessive data filtering.

For an example of how a slow query can take down your site, see our interview below:

Avoid repeating calculations or aggregations within queries by using common table expressions (CTEs) or subqueries to precompute and reuse results where possible. Simplify complex expressions and break down queries into smaller, more manageable components.
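For instance, a CTE can compute an aggregate once and let the outer query reuse it, instead of repeating the same subquery (table names are again hypothetical):

```sql
-- Compute per-customer totals once, then reuse them in the outer query.
WITH customer_totals AS (
    SELECT customer_id, sum(total) AS lifetime_total
    FROM orders
    GROUP BY customer_id
)
SELECT c.name, t.lifetime_total
FROM customers c
JOIN customer_totals t ON t.customer_id = c.customer_id
WHERE t.lifetime_total > 1000;
```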

Query optimization is a big topic and this is just a starting point; companies that serve demand at large scale, like Clubhouse, often need to optimize their queries continuously to manage resource consumption.

Schema design

If you think back to when you first designed your database schema, you probably went with Third Normal Form (3NF) since it’s a pretty common approach. By organizing data into multiple related tables, you can minimize redundancy. But the tradeoff is that this leads to complex queries requiring joins across these tables, and in more complex systems, this will almost definitely impact Postgres performance. 

So what can you do? Denormalize! In other words, identify whether certain frequently accessed data or queries justify combining tables or even duplicating data, because the performance boost would outweigh the increased storage overhead.
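A hedged illustration: if the application joins orders to customers on every page just to show a customer's name, you might duplicate that name onto the orders table and accept the cost of keeping it in sync:

```sql
-- Duplicate the frequently read customer name onto orders to avoid a hot join.
ALTER TABLE orders ADD COLUMN customer_name text;

UPDATE orders o
SET    customer_name = c.name
FROM   customers c
WHERE  c.customer_id = o.customer_id;

-- Reads get simpler and faster, but every change to customers.name now has to
-- be propagated to orders (for example via a trigger or application code).
```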

Connection pooling

In our experience, connection pooling is not an intuitive concept to grasp because it introduces some abstraction. But let’s first define what it is: connection pooling maintains a pool of reusable connections between your application and the database. Instead of opening a new connection every time your application needs to interact with Postgres, it can reuse existing connections from the pool, reducing the overhead of establishing and tearing down connections. This results in faster response times and improved overall performance because your application spends less time waiting for database connections to be established.

This goes against how we normally think about database connections, where they’re created and torn down on demand. But it’s worth investing time in understanding how pooling works under the hood, because it gets complicated fast: you can configure and fine-tune parameters such as connection timeouts, pool size, and idle connection eviction.
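The pool itself typically lives outside Postgres, in PgBouncer or in your application framework, but you can use Postgres to sanity-check your pool sizing. A rough sketch:

```sql
-- Connections currently open, how many are idle, and the server's hard limit.
SELECT count(*)                               AS total_connections,
       count(*) FILTER (WHERE state = 'idle') AS idle_connections,
       current_setting('max_connections')     AS max_connections
FROM pg_stat_activity;
```

If most connections sit idle, the pool is probably oversized; if you keep brushing up against max_connections, it's undersized or connections are leaking.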

Horizontal scaling

Horizontal scaling involves distributing database workload across multiple servers so it can better handle higher demand and improve fault tolerance. There are a few parameters you can tune to achieve this, but we’ll focus on two: replication and high availability configurations. 

Replication is exactly what it sounds like: creating one or more standby replicas of the primary database instance. These replicas maintain an up-to-date copy of the primary database's data by continuously replicating changes from the primary server. Because read-heavy workloads can be offloaded to the replicas, this distributes the overall database workload. 

Also, perhaps most critically, having a standby replica means you have a backup in case your main Postgres database fails, minimizing downtime. This leads us to high availability, which you can reach through streaming replication (streaming data changes from your primary database to your replicas in near real time), synchronous replication (waiting for a replica to confirm each commit before acknowledging it), and other methods.
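Setting up streaming replication involves server configuration and a base backup, which is more than a snippet can show, but two small pieces you would typically run on the primary look like this (the slot name is made up):

```sql
-- A physical replication slot makes the primary retain WAL until the
-- standby has consumed it.
SELECT pg_create_physical_replication_slot('standby_1');

-- Once a standby is attached, check how far behind it is (in bytes).
SELECT client_addr,
       state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```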

Vertical scaling

Vertical scaling works very differently. When you vertically scale a database, you’re trying to make it more powerful by getting a “bigger box” for your server: adding more CPU cores, increasing memory capacity, or expanding storage space. This is an easy way to meet increasing demands on your database without engaging in a more complex configuration overhaul. 

An example of where you can run into these scaling issues can be seen in this video about Clubhouse:

But there’s a catch: there’s only so much you can add before you max out the capacity of your Postgres server. And this is a more costly option, since the largest servers are very expensive.  

Postgres database configuration parameters

Now we’re getting into the heart of database performance tuning: looking at how Postgres itself is configured and which parameters you can fine-tune to best utilize system resources given a specific workload and hardware setup. There are dozens of parameters you can try; we’ll start with three.

`shared_buffers`: determines the amount of memory allocated for caching data. Increasing `shared_buffers` can improve performance by allowing more data to be cached in memory, reducing the need for frequent disk I/O operations.

`work_mem`: controls the amount of memory used for each operation within a query, such as sorting or hashing. Adjusting `work_mem` can optimize query execution by allocating sufficient memory for intermediate results without overwhelming system resources.

`effective_cache_size`: estimates the amount of system memory available for caching data. Setting `effective_cache_size` accurately ensures that Postgres makes efficient use of available memory for caching frequently accessed data.

There are no magic numbers you can set here; experimentation is key. Try different settings and watch how it impacts (or doesn’t impact) performance metrics such as response times, throughput, and resource utilization.
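One convenient way to experiment is ALTER SYSTEM, which writes settings to postgresql.auto.conf. The values below are illustrative starting points, not recommendations; they depend entirely on your hardware and workload:

```sql
-- Illustrative values only; size these against your actual RAM and workload.
ALTER SYSTEM SET shared_buffers = '4GB';         -- takes effect after a restart
ALTER SYSTEM SET work_mem = '64MB';              -- per sort/hash operation, not per query
ALTER SYSTEM SET effective_cache_size = '12GB';  -- a planner hint, not an allocation

-- A reload picks up work_mem and effective_cache_size; shared_buffers
-- still requires a server restart.
SELECT pg_reload_conf();
```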

Table partitioning

With table partitioning, you can break down larger tables into smaller, more manageable pieces. You can decide how and where to partition based on ranges of values or hash values derived from the data. Because Postgres only needs to work with a smaller set of data at a time, you can make data retrieval or manipulation faster.

It might be easier to understand with an example. Imagine a large sales table that stores years of transactions. By partitioning that table based on the transaction date, Postgres can store each year's data in a separate partition. When querying the sales data for a specific year, Postgres only needs to access and process the partition containing that year's data, rather than scanning the entire table.
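A sketch of that sales table, partitioned by year (the table and column names are hypothetical):

```sql
CREATE TABLE sales (
    sale_id   bigserial,
    sale_date date    NOT NULL,
    amount    numeric NOT NULL
) PARTITION BY RANGE (sale_date);

-- One partition per year; queries that filter on sale_date only touch the
-- partitions whose ranges match (partition pruning).
CREATE TABLE sales_2023 PARTITION OF sales
    FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
CREATE TABLE sales_2024 PARTITION OF sales
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
```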

Not all tables benefit from partitioning; what you want is to partition tables that have a lot of data and contain historical data that is frequently queried. Think range-based or time-series data queries. 

Addressing deeper Postgres performance challenges 

If you’ve tried any or all of these techniques and found that they still don’t solve the problem of Postgres outages or slow performance, you’re not alone. This is common in Postgres scaling efforts: the initial optimizations can help alleviate immediate bottlenecks by providing additional capacity and/or improving resource utilization, but the gains will diminish over time. 

If so, it might be time to try the more comprehensive (and more complex) solutions needed to sustain scalability over the long term, especially if demand for your application is growing fast.

Maintaining Postgres performance optimization over time

In our next article, we’ll zoom in on advanced Postgres query optimization techniques that we’ve refined over time, and share our favorite monitoring and performance tuning tools. 

Postgres performance tuning is a continuous process that evolves alongside your application’s workload, data volume, and hardware infrastructure. It’s not a one-off optimization strategy, but an ongoing investment in learning more about the intricacies of database management so you can maintain optimal performance even as the requirements and challenges change (and they will). 

Our team of experts brings decades of experience with modernizing Postgres and other databases. We’ve scaled applications that served millions of users on Nextdoor, Clubhouse, and Coinbase. If you could use help, feel free to contact us.
