
Published at Mar 6, 2024

CPU and Memory

PostgreSQL Performance Tuning: How to Size Your Database


Authors: Ben Packer and Carlota Soto

You launched your product, you got some users, and now... Things are getting slower. Queries that used to run in 100 ms are now taking one second, and your team (which has spent the last six months writing application code) faces a whole new set of questions around PostgreSQL performance tuning and optimization.

Scaling PostgreSQL can be challenging, but you don’t need to panic. There are plenty of ways to scale a PostgreSQL database. You may need to:

  • Add more CPU power or memory

  • Adopt a partitioning strategy

  • Tune PostgreSQL settings like shared_buffers or max_parallel_workers

  • Optimize your indexes

  • Update your database schema

First, congrats! These are great problems to have: they mean you’re scaling your PostgreSQL database.

Second, don't panic—and definitely don't start frantically Googling for a different database to use. PostgreSQL is the most loved database out there and the best foundation for your application, but if you’re scaling, it may need a little extra love.

In this article, we will cover CPU and memory sizing for your database. As is often the case with databases, there are no foolproof guidelines we can guarantee will work without direct experimentation and observation: there are very few true invariants and always caveats. Fortunately, though, there are some principles you can systematically follow, and we’ll explore them in this series.

Check out other articles in this series that cover partitioning strategy, PostgreSQL parameters, index optimization, and schema design.

It may be tempting to fix every performance issue by throwing more resources (a.k.a. money) at the problem. Don’t do this: it is an unsustainable solution that will send your bill through the roof in the future. So, let’s start by breaking down the relationship between your available database resources (CPU and memory) and performance.

CPU Sizing for Your PostgreSQL Database

To properly size your database CPU resources, you need to estimate requirements and monitor your system once it is in place. Let’s start by looking at a solid estimate of how CPU resources affect database throughput.

Estimating CPU power for PostgreSQL

There are three main components of the (usually virtual) machine that runs your database: memory (RAM), CPU, and disk I/O. This post will focus on CPU and memory.

Your CPU is the core work unit in your database. Operations such as data transfer between the database and client, index scanning, data joining, and the evaluation of WHERE clauses all rely on the CPU. Generally, given the absence of memory or disk constraints, PostgreSQL's read throughput scales in direct proportion to the number of available cores.

Your CPU also handles concurrency in your database. PostgreSQL can run more than one query per core. Still, in general, you should use the following formula as a rough estimate for queries per second:

💡 Queries per Second (QPS) ≈ (1 / Average Query Runtime in Seconds) × Number of Cores

This estimate relates your average query time, queries per second, and the number of available CPU cores. The formula assumes each core can handle one query at a time and that other factors, like memory or disk access, are not bottlenecks. You can use it to estimate target CPU capacities or available throughput.

For example:

  • If you anticipate that your system needs to handle 320 queries per second and know the average query takes 50 ms, you'd need roughly 16 cores (given the linear scaling assumption).

  • Conversely, if you have a fixed number of cores available, like eight, and know your average query time, you can estimate the load your system can handle efficiently.

Remember, this is a back-of-the-napkin estimate. Real-world performance will vary due to many other factors. In most cases, you can expect a higher QPS than what this formula provides.
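
If you want to sanity-check this arithmetic where you already work, it even runs in psql. A trivial sketch using the 320 QPS / 50 ms figures from the first example:

-- cores ≈ QPS × average runtime in seconds: 320 × 0.050 = 16
SELECT ceil(320 * 0.050) AS estimated_cores;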

Once you have enough CPU power to handle your throughput, you need to monitor the system to ensure it works properly.

Monitoring CPU usage

If you’ve sized your CPU properly and yet your database is slow, here are some signs that you might need to increase the number of cores on your machine:

  • Observed CPU usage remains consistently high. You should always keep an eye on current and historical CPU utilization and correlate it with application activity. If you’re using Timescale, you can access this information via the Metrics panel in the console. Remember: increasing your CPU is not always the best fix for poor performance. Your CPU usage may be maxed out because you’re using resources inefficiently, for example, due to misconfigured partitioning or missing indexes.

  • Queries slow down when there is higher activity, even without more data. If most of your queries are fast during low-volume periods, but many slow down when you reach certain queries per second, that may be a sign that you need more cores.

  • You are using analytical queries heavily (a.k.a. aggregates). All queries use CPU, but big joins, aggregates, and sequential scans use more of it (they may use multiple cores) and for longer. If you have queries that compute aggregates over large amounts of data, make sure you have enough CPU left to still serve your application’s requests. Timescale’s continuous aggregates also solve this problem: they reduce your CPU requirements considerably while keeping your analytical queries very fast. Consider using them if you’re a heavy user of aggregates (a sketch follows this list).
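
For illustration, here is roughly what a continuous aggregate looks like. The readings hypertable and its columns are hypothetical, not from this article:

-- Precompute daily averages once, so analytical queries stop
-- rescanning raw data (assumes readings is a hypertable)
CREATE MATERIALIZED VIEW readings_daily
WITH (timescaledb.continuous) AS
SELECT time_bucket('1 day', time) AS bucket,
       device_id,
       avg(value) AS avg_value
FROM readings
GROUP BY bucket, device_id;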

To catch these issues, you must monitor system performance and adjust resources accordingly. Here are a few tips on CPU monitoring:

  • Use pg_stat_activity to monitor wait_event: pg_stat_activity is a PostgreSQL view that provides information about the current activities of processes. The wait_event column specifically tells you what a process is waiting for. If the value in this column is null, it generally indicates that the process isn’t waiting on any I/O or lock operations but is actively processing and utilizing CPU. In simpler terms, a null value suggests that the query or operation is CPU-bound, meaning that the CPU is the primary resource being utilized by that particular process. (A quick query for this follows this list.)

  • Use EXPLAIN ANALYZE to monitor parallelism: The EXPLAIN ANALYZE command in PostgreSQL shows a detailed execution plan of a query, along with the actual runtime statistics. This includes information about which parts of the query are executed in parallel, how many workers are used, and the efficiency of the parallel processing. This information can help you make the most of your multi-core setup.

  • Use pgwatch2 to monitor the whole system: pgwatch2 is a popular open-source monitoring tool designed specifically for PostgreSQL databases. One of its standout features is its detailed breakdown of CPU usage for different components and operations within your PostgreSQL instance. By using pgwatch2, you can gain insights into which parts of your workload are consuming the most CPU and where optimizations can potentially be made.
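
As a rough check for the first tip, a query along these lines counts the active backends that aren’t waiting on anything (and are therefore likely on-CPU). A minimal sketch, not a substitute for a real monitoring setup:

-- Active backends with no wait event are likely CPU-bound
SELECT count(*) FILTER (WHERE wait_event IS NULL) AS likely_on_cpu,
       count(*) AS active_backends
FROM pg_stat_activity
WHERE state = 'active';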

Now that we’ve looked at sizing your CPU, let's look at sizing memory for your database.

Memory Sizing for PostgreSQL

A PostgreSQL database utilizes memory in two primary ways, each governed by distinct parameters:

  • Caching data and indexes. PostgreSQL maintains a cache of both raw data and indexes in memory. The size of this cache is determined by the shared_buffers parameter. Data is stored in "pages," which are essentially fixed-size chunks of data, and this applies to both data files and indexes. Whether a given page is present in memory determines whether an access is served from cache or has to go to disk. While there are ample resources to delve into the intricacies of cache eviction algorithms and page structures, the key thing to remember is that PostgreSQL evicts the least recently used pages (via a clock-sweep approximation of LRU).

  • Working memory for intermediate data structures. When processing queries, PostgreSQL often needs to create temporary data structures to aid in delivering results, as part of join operations, sorting, or generating aggregates. The amount of memory that can be allocated for these structures is managed by the work_mem parameter. For instance, consider the various join algorithms employed by PostgreSQL: many require memory for operations like sorting or creating a hash map. Similarly, operations like GROUP BY and DISTINCT, or queries using ORDER BY that don’t use an index for sorting, may also create temporary data structures that occupy memory. If a sort or hash needs more memory than your work_mem setting allows, it will spill to disk, which is much slower. (You can check both settings with the commands after this list.)
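
Both parameters are easy to inspect. The built-in defaults (typically 128MB for shared_buffers and 4MB for work_mem) are often too low for production workloads:

-- Check the current values of the two key memory parameters
SHOW shared_buffers;
SHOW work_mem;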

These two parameters (shared_buffers and work_mem) are especially important. We look at them in detail later in this series, as they’re key to understanding memory-related performance issues in PostgreSQL:

  • The cache maintained in shared_buffers is crucial for speeding up data access. A low cache hit ratio indicates that the database frequently has to read from disk rather than from the faster memory cache. Increasing shared_buffers might help, but ensure it’s balanced with the OS cache. (A query to measure your hit ratio follows this list.)

  • If complex operations (like sorts or joins) frequently spill to disk because they exceed the configured work_mem, performance will suffer. Increasing work_mem may help, but be cautious here, too: setting it too high could eventually consume all your memory, especially with many concurrent connections, since each operation in each connection can claim its own allocation.
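
One simple way to measure that cache hit ratio is from the pg_stat_database statistics view. This is a cluster-wide, since-last-stats-reset estimate; as a rough rule of thumb, OLTP-style workloads usually want to see it above roughly 99 percent:

-- Percentage of block requests served from shared_buffers
-- (blks_read includes blocks served by the OS cache, so this overstates true disk I/O)
SELECT round(sum(blks_hit) * 100.0 /
             NULLIF(sum(blks_hit) + sum(blks_read), 0), 2) AS cache_hit_pct
FROM pg_stat_database;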

It’s worth noting that there are two layers of caching: the PostgreSQL buffer cache (shared_buffers), which the database manages directly and which is the primary level where data and index pages are held, and the operating system’s page cache, an auxiliary layer. PostgreSQL isn’t directly aware of what’s cached there, but it significantly accelerates file-reading operations by holding onto files recently read from disk.

Generally, the more memory your running database has, the less it has to read from (and sometimes even write to) disk, which is a good thing. But extra memory is expensive. How can you tell when you need to increase the amount of RAM on your machine?

  • Queries have unpredictable performance. If a query is slow when run once and then fast on the second go, it could be a sign that the data it needs cannot stay in the cache consistently. However, there are other potential causes for this type of speed pattern (such as high query planning time, which is cached), so it's worth directly observing this difference by comparing shared read vs. shared hit in your query plan.

  • You have many big indexes. If all of your queries can be satisfied by index scans, and all of your indexes fit in memory, you will likely get good enough performance even if the database has to go to disk to fetch the full returned rows. Many users know that more indexes on a table mean more work for inserts, updates, and deletes, but it’s also true that to properly take advantage of additional indexes, you may need to increase the memory on your machine so they can stay in the cache.

  • Your sorts are spilling to disk. As we’ve mentioned before, if you see an External Sort or External Merge node while debugging slow queries, this means that the memory required for the operation exceeds the available work_mem. As a result, PostgreSQL had to write its intermediate data structure to disk and then clean up at the end.

  • You see a lot of DataFileRead wait events in pg_stat_activity. Anytime a query reads from disk, a wait event is triggered. Directly observing these in pg_stat_activity, or comparing their counts over time, can be a clue that you need more memory. Frequent wait events are also a sign that sequential scans are happening, so before spending more money, make sure to rule out the possibility that you’re simply missing an index (or have written the query in a way that makes using the index impossible). A query for spotting these events follows this list.
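
To spot the DataFileRead events from the last tip, a snapshot query like this works. It only shows what’s happening right now, so in practice you’d poll it or feed it into a monitoring tool:

-- Backends currently waiting on reads from data files
SELECT pid, wait_event_type, wait_event, left(query, 60) AS query_snippet
FROM pg_stat_activity
WHERE wait_event = 'DataFileRead';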

Lastly, here are a few extra tips on memory monitoring.

Measure the size of your indexes

Indexes, particularly large ones, can consume a significant portion of memory when loaded into the cache. Knowing the size of your indexes helps you understand how much memory they might occupy and decide which indexes are most crucial. To do so, use the pg_relation_size() function along with the index name:

SELECT pg_size_pretty(pg_relation_size('index_name_here'));
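
Relatedly, pg_indexes_size('table_name_here') returns the combined size of all indexes on a table, which is handy for judging whether a table’s full index set can plausibly stay in memory.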

Monitor index usage

An unused index not only consumes disk space but can also lead to unnecessary overhead during write operations. However, it's worth noting that only the blocks of an index used by queries will take up space in memory when loaded into cache.

You can regularly monitor which indexes are being accessed and which aren’t via the pg_stat_user_indexes view, which contains statistics about user-defined indexes. Specifically, the idx_scan column indicates how many times an index has been scanned. A value at or near zero over extended periods suggests the index is unused and may be a candidate for removal.
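
Putting the size and usage checks together, a query along these lines surfaces candidates for removal. The ordering (largest never-scanned indexes first) is our choice, not prescribed anywhere:

-- Indexes never scanned since the stats were last reset
SELECT schemaname, relname AS table_name, indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;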

Beyond index sizing and usage, here are two more tips on monitoring memory usage in your database:

  • Use EXPLAIN (ANALYZE, VERBOSE, BUFFERS) to tell if something was fetched from disk: This combination of flags provides a detailed view of memory interactions, with insights into memory usage, buffer hits, and disk reads. In the output, you will see shared hit and shared read values. A shared hit means the required data was found in the cache, while a shared read means it had to be fetched from disk. A higher ratio of hits to reads usually indicates better memory optimization. (An example follows this list.)

  • Use pg_stat_activity to check memory: Querying the pg_stat_activity view can show which queries are waiting on memory-related events or are consuming significant resources. This view is especially useful for seeing whether certain queries are consistently causing issues.
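
As an example of the EXPLAIN tip, here is the shape of such a call. The readings table and the filter are hypothetical:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT * FROM readings WHERE device_id = 42;
-- In the output, look for lines like "Buffers: shared hit=... read=...":
-- hits were served from shared_buffers; reads came from disk or the OS cache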

Keep Reading

In this article, we have seen how to size your CPU and memory to keep your PostgreSQL database in top condition. Take a look at our other articles in this series covering partitioning strategy, PostgreSQL parameters, index optimization, and schema design.
