path: root/doc/src/sgml/monitoring.sgml
Age    Commit message    Author
2025-04-30doc: Mention cost-based delays for total_[auto]{vacuum,analyze}_timeMichael Paquier
30a6ed0ce4b has added four attributes to pg_stat_all_tables to track the cumulative time spent in [auto]vacuum and [auto]analyze. It was not mentioned that the vacuum cost-based delays are included in these numbers, which could be confusing now that the delays are included in the vacuum progress view (bb8dff9995f2). This commit adds an extra note about this matter. Reported-by: Magnus Hagander <[email protected]> Author: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/CABUevEz9v1ZNToPyD98JnWDGZgG=SmPZKkSNzU9hXQ-nGTQF0g@mail.gmail.com
2025-04-30doc: Add missing reference to track_cost_delay_timing.Nathan Bossart
Oversight in commit bb8dff9995.
2025-04-21Doc: fix incorrect punctuationDavid Rowley
Author: Noboru Saito <[email protected]> Discussion: https://postgr.es/m/CAAM3qnJtv5YbjpwDfVOYN2gZ9zGSLFM1UGJgptSXmwfifOZJFQ@mail.gmail.com Backpatch-through: 17
2025-04-04Add nbtree skip scan optimization.Peter Geoghegan
Teach nbtree multi-column index scans to opportunistically skip over irrelevant sections of the index given a query with no "=" conditions on one or more prefix index columns. When nbtree is passed input scan keys derived from a predicate "WHERE b = 5", new nbtree preprocessing steps output "WHERE a = ANY(<every possible 'a' value>) AND b = 5" scan keys. That is, preprocessing generates a "skip array" (and an output scan key) for the omitted prefix column "a", which makes it safe to mark the scan key on "b" as required to continue the scan. The scan is therefore able to repeatedly reposition itself by applying both the "a" and "b" keys. A skip array has "elements" that are generated procedurally and on demand, but otherwise works just like a regular ScalarArrayOp array. Preprocessing can freely add a skip array before or after any input ScalarArrayOp arrays. Index scans with a skip array decide when and where to reposition the scan using the same approach as any other scan with array keys. This design builds on the design for array advancement and primitive scan scheduling added to Postgres 17 by commit 5bf748b8. Testing has shown that skip scans of an index with a low cardinality skipped prefix column can be multiple orders of magnitude faster than an equivalent full index scan (or sequential scan). In general, the cardinality of the scan's skipped column(s) limits the number of leaf pages that can be skipped over. The core B-Tree operator classes on most discrete types generate their array elements with the help of their own custom skip support routine. This infrastructure gives nbtree a way to generate the next required array element by incrementing (or decrementing) the current array value. It can reduce the number of index descents in cases where the next possible indexable value frequently turns out to be the next value stored in the index. Opclasses that lack a skip support routine fall back on having nbtree "increment" (or "decrement") a skip array's current element by setting the NEXT (or PRIOR) scan key flag, without directly changing the scan key's sk_argument. These sentinel values behave just like any other value from an array -- though they can never locate equal index tuples (they can only locate the next group of index tuples containing the next set of non-sentinel values that the scan's arrays need to advance to). A skip array's range is constrained by "contradictory" inequality keys. For example, a skip array on "x" will only generate the values 1 and 2 given a qual such as "WHERE x BETWEEN 1 AND 2 AND y = 66". Such a skip array qual usually has near-identical performance characteristics to a comparable SAOP qual "WHERE x = ANY('{1, 2}') AND y = 66". However, improved performance isn't guaranteed. Much depends on physical index characteristics. B-Tree preprocessing is optimistic about skipping working out: it applies static, generic rules when determining where to generate skip arrays, which assumes that the runtime overhead of maintaining skip arrays will pay for itself -- or lead to only a modest performance loss. As things stand, these assumptions are much too optimistic: skip array maintenance will lead to unacceptable regressions with unsympathetic queries (queries whose scan can't skip over many irrelevant leaf pages). An upcoming commit will address the problems in this area by enhancing _bt_readpage's approach to saving cycles on scan key evaluation, making it work in a way that directly considers the needs of = array keys (particularly = skip array keys). 
Author: Peter Geoghegan <[email protected]> Reviewed-By: Masahiro Ikeda <[email protected]> Reviewed-By: Heikki Linnakangas <[email protected]> Reviewed-By: Matthias van de Meent <[email protected]> Reviewed-By: Tomas Vondra <[email protected]> Reviewed-By: Aleksander Alekseev <[email protected]> Reviewed-By: Alena Rybakina <[email protected]> Discussion: https://postgr.es/m/CAH2-Wzmn1YsLzOGgjAQZdn1STSG_y8qP__vggTaPAYXJP+G4bw@mail.gmail.com
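For illustration only, a minimal sketch of the query shape this optimization targets; the table, index, and data are hypothetical, and whether skipping wins still depends on the cardinality of the skipped prefix column:

    -- Hypothetical composite index whose leading column ("region") is omitted
    -- from the query's quals: the shape of scan that skip arrays target.
    CREATE TABLE orders (region int, order_id int, total numeric);
    CREATE INDEX orders_region_order_id_idx ON orders (region, order_id);

    -- With few distinct "region" values, preprocessing can treat this as
    -- "region = ANY(<every possible region>) AND order_id = 42" and skip
    -- over irrelevant sections of the index instead of scanning it in full.
    EXPLAIN SELECT * FROM orders WHERE order_id = 42;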
2025-03-30docs: Reframe track_io_timing related docs as wait timeAndres Freund
With AIO it does not make sense anymore to track the time for each individual IO, as multiple IOs can be in-flight at the same time. Instead we now track the time spent *waiting* for IOs. This should be reflected in the docs. While, so far, we only do a subset of reads, and no other operations, via AIO, describing the GUC and view columns as measuring IO waits is accurate for synchronous and asynchronous IO. Reviewed-by: Noah Misch <[email protected]> Discussion: https://postgr.es/m/5dzyoduxlvfg55oqtjyjehez5uoq6hnwgzor4kkybkfdgkj7ag@rbi4gsmzaczk
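As a sketch of the reframed semantics, assuming the standard pg_stat_io column names (reads, read_time, writes, write_time):

    -- With track_io_timing enabled, the *_time columns report time spent
    -- waiting for I/O, not a sum of per-I/O durations.
    SELECT backend_type, object, context, reads, read_time, writes, write_time
    FROM pg_stat_io
    WHERE reads > 0 OR writes > 0;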
2025-03-26doc: Mention possible ephemeral discrepancies in pg_stat_activityMichael Paquier
Ephemeral inconsistencies across multiple attributes of pg_stat_activity can exist as the system is designed to be efficient with a low overhead. This question is raised by users from time to time based on the data read in the view, so let's add a note in the docs about this possibility. Author: Alex Friedman <[email protected]> Reviewed-by: Sami Imseih <[email protected]> Discussion: https://postgr.es/m/[email protected]
2025-03-24Detect and Log multiple_unique_conflicts type conflict.Amit Kapila
Introduce a new conflict type, multiple_unique_conflicts, to handle cases where an incoming row during logical replication violates multiple UNIQUE constraints. Previously, the apply worker detected and reported only the first encountered key conflict (insert_exists/update_exists), causing repeated failures as each constraint violation needed to be handled one by one, making the process slow and error-prone. With this patch, the apply worker checks all unique constraints upfront once the first key conflict is detected and reports multiple_unique_conflicts if multiple violations exist. This allows users to resolve all conflicts at once by deleting all conflicting tuples rather than dealing with them individually or skipping the transaction. In the future, this will also allow us to specify different resolution handlers for such a conflict type. Add the stats for this conflict type in pg_stat_subscription_stats. Author: Nisha Moond <[email protected]> Author: Zhijie Hou <[email protected]> Reviewed-by: Amit Kapila <[email protected]> Reviewed-by: Peter Smith <[email protected]> Reviewed-by: Dilip Kumar <[email protected]> Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
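A small illustrative query; the commit names the conflict type but not the stats column, so the last column below is an assumption following the existing confl_* naming convention:

    -- confl_multiple_unique_conflicts is a hypothetical column name patterned
    -- on the other conflict counters in pg_stat_subscription_stats.
    SELECT subname, confl_insert_exists, confl_update_exists,
           confl_multiple_unique_conflicts
    FROM pg_stat_subscription_stats;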
2025-03-11Show index search count in EXPLAIN ANALYZE, take 2.Peter Geoghegan
Expose the count of index searches/index descents in EXPLAIN ANALYZE's output for index scan/index-only scan/bitmap index scan nodes. This information is particularly useful with scans that use ScalarArrayOp quals, where the number of index searches can be unpredictable due to implementation details that interact with physical index characteristics (at least with nbtree SAOP scans, since Postgres 17 commit 5bf748b8). The information shown also provides useful context when EXPLAIN ANALYZE runs a plan with an index scan node that successfully applied the skip scan optimization (set to be added to nbtree by an upcoming patch). The instrumentation works by teaching all index AMs to increment a new nsearches counter whenever a new index search begins. The counter is incremented at exactly the same point that index AMs already increment the pg_stat_*_indexes.idx_scan counter (we're counting the same event, but at the scan level rather than the relation level). Parallel queries have workers copy their local counter struct into shared memory when an index scan node ends -- even when it isn't a parallel aware scan node. An earlier version of this patch that only worked with parallel aware scans became commit 5ead85fb (though that was quickly reverted by commit d00107cd following "debug_parallel_query=regress" buildfarm failures). Our approach doesn't match the approach used when tracking other index scan related costs (e.g., "Rows Removed by Filter:"). It is comparable to the approach used in similar cases involving costs that are only readily accessible inside an access method, not from the executor proper (e.g., "Heap Blocks:" output for a Bitmap Heap Scan, which was recently enhanced to show per-worker costs by commit 5a1e6df3, using essentially the same scheme as the one used here). It is necessary for index AMs to have direct responsibility for maintaining the new counter, since the counter might need to be incremented multiple times per amgettuple call (or per amgetbitmap call). But it is also necessary for the executor proper to manage the shared memory now used to transfer each worker's counter struct to the leader. Author: Peter Geoghegan <[email protected]> Reviewed-By: Robert Haas <[email protected]> Reviewed-By: Tomas Vondra <[email protected]> Reviewed-By: Masahiro Ikeda <[email protected]> Reviewed-By: Matthias van de Meent <[email protected]> Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com
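A sketch of how the new instrumentation surfaces, using a hypothetical table; the exact output label quoted in the comment is illustrative:

    CREATE TABLE events (id int PRIMARY KEY, payload text);
    INSERT INTO events SELECT g, 'x' FROM generate_series(1, 100000) g;

    -- A ScalarArrayOp qual like this may need one index descent per array
    -- element, or fewer; the per-node counter (e.g. "Index Searches: 3")
    -- makes the actual number visible.
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT * FROM events WHERE id IN (1, 50000, 99999);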
2025-03-11Add WAL data to backend statisticsMichael Paquier
This commit adds per-backend WAL statistics, providing the same information as pg_stat_wal, except that it is now possible to know how much WAL activity is happening in each backend rather than an overall aggregate of all the activity. Like pg_stat_wal, the implementation relies on pgWalUsage, tracking the difference of activity between two reports to pgstats. This data can be retrieved with a new system function called pg_stat_get_backend_wal(), which returns one tuple based on the PID provided as input. Like pg_stat_get_backend_io(), this is useful when joined with pg_stat_activity to get a live picture of the WAL generated for each running backend, showing how the activity is [un]balanced. pgstat_flush_backend() gains a new flag value, able to control the flush of the WAL stats. This commit relies mostly on the infrastructure provided by 9aea73fc61d4, which introduced backend statistics. Bump catalog version. A bump of PGSTAT_FILE_FORMAT_ID is not required, as backend stats do not persist on disk. Author: Bertrand Drouvot <[email protected]> Reviewed-by: Michael Paquier <[email protected]> Reviewed-by: Nazir Bilal Yavuz <[email protected]> Reviewed-by: Xuneng Zhou <[email protected]> Discussion: https://postgr.es/m/Z3zqc4o09dM/[email protected]
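An illustrative join with pg_stat_activity, as the message suggests; the output column names are assumed to mirror pg_stat_wal (wal_records, wal_fpi, wal_bytes):

    SELECT a.pid, a.backend_type, w.wal_records, w.wal_fpi, w.wal_bytes
    FROM pg_stat_activity AS a,
         LATERAL pg_stat_get_backend_wal(a.pid) AS w
    WHERE a.backend_type = 'client backend';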
2025-03-05Revert "Show index search count in EXPLAIN ANALYZE."Peter Geoghegan
This reverts commit 5ead85fbc81162ab1594f656b036a22e814f96b3. This commit shows test failures with debug_parallel_query=regress. The underlying issue needs to be debugged, so revert for now.
2025-03-05Show index search count in EXPLAIN ANALYZE.Peter Geoghegan
Expose the count of index searches/index descents in EXPLAIN ANALYZE's output for index scan nodes. This information is particularly useful with scans that use ScalarArrayOp quals, where the number of index scans isn't predictable in advance (at least not with optimizations like the one added to nbtree by Postgres 17 commit 5bf748b8). It will also be useful when EXPLAIN ANALYZE shows details of an nbtree index scan that uses skip scan optimizations set to be introduced by an upcoming patch. The instrumentation works by teaching index AMs to increment a new nsearches counter whenever a new index search begins. The counter is incremented at exactly the same point that index AMs must already increment the index's pg_stat_*_indexes.idx_scan counter (we're counting the same event, but at the scan level rather than the relation level). The new counter is stored in the scan descriptor (IndexScanDescData), which explain.c reaches by going through the scan node's PlanState. This approach doesn't match the approach used when tracking other index scan specific costs (e.g., "Rows Removed by Filter:"). It is similar to the approach used in other cases where we must track costs that are only readily accessible inside an access method, and not from the executor (e.g., "Heap Blocks:" output for a Bitmap Heap Scan). It is inherently necessary to maintain a counter that can be incremented multiple times during a single amgettuple call (or amgetbitmap call), and directly exposing PlanState.instrument to index access methods seems unappealing. Author: Peter Geoghegan <[email protected]> Reviewed-By: Tomas Vondra <[email protected]> Reviewed-By: Robert Haas <[email protected]> Reviewed-By: Masahiro Ikeda <[email protected]> Reviewed-By: Matthias van de Meent <[email protected]> Discussion: https://postgr.es/m/CAH2-Wz=PKR6rB7qbx+Vnd7eqeB5VTcrW=iJvAsTsKbdG+kW_UA@mail.gmail.com Discussion: https://postgr.es/m/CAH2-WzkRqvaqR2CTNqTZP0z6FuL4-3ED6eQB0yx38XBNj1v-4Q@mail.gmail.com
2025-03-04Split pgstat_bestart() into three different routinesMichael Paquier
pgstat_bestart(), used post-authentication to set up a backend entry in the PgBackendStatus array so that its data becomes visible in pg_stat_activity and related catalogs, has its logic divided into three routines with this commit, called in order at different steps of the backend initialization:
* pgstat_bestart_initial() sets up the backend entry with a minimal amount of information, reporting it with a new BackendState called STATE_STARTING while waiting for backend initialization and client authentication to complete. The main benefit that this offers is observability, making it possible to monitor the backend activity during authentication. This step happens earlier than in the logic prior to this commit. pgstat_beinit() happens earlier as well, before authentication.
* pgstat_bestart_security() reports the SSL/GSS status of the connection, once authentication completes. Auxiliary processes, for example, do not need to call this step, hence it is optional. This step is called after performing authentication, same as previously.
* pgstat_bestart_final() reports the user and database IDs, takes the entry out of STATE_STARTING, and reports its application_name. This is called as the last step of the three, once authentication completes.
An injection point is added, with a test checking that the "starting" phase of a backend entry is visible in pg_stat_activity. Some follow-up patches are planned to take advantage of this refactoring with more information provided in backend entries during authentication (LDAP hanging was a problem for the author, initially). Author: Jacob Champion <[email protected]> Reviewed-by: Michael Paquier <[email protected]> Reviewed-by: Andres Freund <[email protected]> Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
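For illustration, one way to observe the new phase from another session; this is a sketch assuming the new BackendState is displayed as 'starting' in pg_stat_activity.state:

    -- Backends still going through authentication should now be visible,
    -- instead of appearing only once pgstat_bestart_final() has run.
    SELECT pid, state, wait_event_type, wait_event, backend_start
    FROM pg_stat_activity
    WHERE state = 'starting';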
2025-02-26Re-add GUC track_wal_io_timingMichael Paquier
This commit is a rework of 2421e9a51d20, about which Andres Freund has raised some concerns: it is valuable to have both track_io_timing and track_wal_io_timing in some cases, as the WAL write and fsync paths can be a major bottleneck for some workloads. Hence, it can be relevant to not calculate the WAL timings in environments where pg_test_timing performs poorly, while capturing some IO data under track_io_timing for the non-WAL IO paths. The opposite can also be true: it should be possible to disable the non-WAL timings and enable the WAL timings (the previous GUC setups allowed this possibility). track_wal_io_timing is added back in this commit, controlling whether WAL timings should be calculated in pg_stat_io for the read, fsync and write paths, as done previously with pg_stat_wal. pg_stat_wal previously tracked only the sync and write parts (now removed); read stats are new data tracked in pg_stat_io, and all three are aggregated if track_wal_io_timing is enabled. The read part matters during recovery or if an XLogReader is used. Extra note: finer control over which types of timings are calculated in pg_stat_io could be provided by a GUC that lists pairs of (IOObject, IOOp). Reported-by: Andres Freund <[email protected]> Author: Bertrand Drouvot <[email protected]> Co-authored-by: Michael Paquier <[email protected]> Discussion: https://postgr.es/m/3opf2wh2oljco6ldyqf7ukabw3jijnnhno6fjb4mlu6civ5h24@fcwmhsgmlmzu
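A sketch of one configuration this makes possible again, keeping non-WAL I/O wait timings while skipping the WAL ones (or the reverse) on systems where pg_test_timing shows high overhead:

    -- Keep block I/O wait timings, skip the WAL ones (or flip both settings).
    ALTER SYSTEM SET track_io_timing = on;
    ALTER SYSTEM SET track_wal_io_timing = off;
    SELECT pg_reload_conf();

    -- WAL timing columns stay at zero while the counters keep advancing.
    SELECT context, writes, write_time, fsyncs, fsync_time
    FROM pg_stat_io
    WHERE object = 'wal';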
2025-02-24Remove read/sync fields from pg_stat_wal and GUC track_wal_io_timingMichael Paquier
The four following attributes are removed from pg_stat_wal:
* wal_write
* wal_sync
* wal_write_time
* wal_sync_time
a051e71e28a1 has added an equivalent of this information in pg_stat_io with more granularity as this now spreads across the backend types, IO context and IO objects. So, keeping the same information in pg_stat_wal has little benefit. Another benefit of this commit is the removal of PendingWalStats, simplifying an upcoming patch to add per-backend WAL statistics, which already support IO statistics and which have access to the write/sync stats data of WAL. The GUC track_wal_io_timing, that was used to enable or disable the aggregation of the write and sync timings for WAL, is also removed. pgstat_prepare_io_time() is simplified. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats. Author: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-21doc: clarify default checksum behavior in non-master branchesBruce Momjian
Also simplify and correct data checksum wording in master now that it is the default. PG 13 did not have the awkward wording. Reported-by: Felix <[email protected]> Reviewed-by: Laurenz Albe Discussion: https://postgr.es/m/[email protected] Backpatch-through: 14
2025-02-20doc: Add details about object "wal" in pg_stat_ioMichael Paquier
This commit adds a short description of what kind of activity is tracked in pg_stat_io for the object "wal", with a link pointing to the section "WAL configuration" that has a lot of details on the matter. This should perhaps have been added in a051e71e28a1, but things are what they are. Author: Bertrand Drouvot <[email protected]> Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
2025-02-11Add cost-based vacuum delay time to progress views.Nathan Bossart
This commit adds the amount of time spent sleeping due to cost-based delay to the pg_stat_progress_vacuum and pg_stat_progress_analyze system views. A new configuration parameter named track_cost_delay_timing, which is off by default, controls whether this information is gathered. For vacuum, the reported value includes the sleep time of any associated parallel workers. However, parallel workers only report their sleep time once per second to avoid overloading the leader process. Bumps catversion. Author: Bertrand Drouvot <[email protected]> Co-authored-by: Nathan Bossart <[email protected]> Reviewed-by: Sami Imseih <[email protected]> Reviewed-by: Robert Haas <[email protected]> Reviewed-by: Masahiko Sawada <[email protected]> Reviewed-by: Masahiro Ikeda <[email protected]> Reviewed-by: Dilip Kumar <[email protected]> Reviewed-by: Sergei Kornilov <[email protected]> Discussion: https://postgr.es/m/ZmaXmWDL829fzAVX%40ip-10-97-1-34.eu-west-3.compute.internal
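An illustrative check while a vacuum is running; the commit does not name the progress-view column here, so delay_time below is an assumption:

    ALTER SYSTEM SET track_cost_delay_timing = on;
    SELECT pg_reload_conf();

    -- Column name assumed; the reported sleep time includes parallel workers,
    -- which only report once per second.
    SELECT pid, relid::regclass AS relation, phase, delay_time
    FROM pg_stat_progress_vacuum;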
2025-02-04Add data for WAL in pg_stat_io and backend statisticsMichael Paquier
This commit adds WAL IO stats to both pg_stat_io view and per-backend IO statistics (pg_stat_get_backend_io()). This change is possible since f92c854cf406, as WAL IO is not counted in blocks in some code paths where its stats data is measured (like WAL read in xlogreader.c). IOContext gains IOCONTEXT_INIT and IOObject IOOBJECT_WAL, with the following combinations allowed: - IOOBJECT_WAL/IOCONTEXT_NORMAL is used to track I/O operations done on already-created WAL segments. - IOOBJECT_WAL/IOCONTEXT_INIT is used for tracking I/O operations done when initializing WAL segments. The core changes are done in pg_stat_io.c, backend statistics inherit them. Backend statistics and pg_stat_io are now available for the WAL writer, the WAL receiver and the WAL summarizer processes. I/O timing data is controlled by the GUC track_io_timing, like the existing data of pg_stat_io for consistency. The timings related to IOOBJECT_WAL show up if the GUC is enabled (disabled by default). Bump pgstats file version, due to the additions in IOObject and IOContext, impacting the amount of data written for the fixed-numbered IO stats kind in the pgstats file. Author: Nazir Bilal Yavuz Reviewed-by: Bertrand Drouvot, Nitin Jadhav, Amit Kapila, Michael Paquier, Melanie Plageman, Bharath Rupireddy Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2025-01-28Track per-relation cumulative time spent in [auto]vacuum and [auto]analyzeMichael Paquier
This commit adds four fields to the statistics of relations, aggregating the amount of time spent for each operation on a relation:
- total_vacuum_time, for manual vacuum.
- total_autovacuum_time, for vacuum done by the autovacuum daemon.
- total_analyze_time, for manual analyze.
- total_autoanalyze_time, for analyze done by the autovacuum daemon.
This gives users the option to derive the average time spent for these operations with the help of the related "count" fields. Bump catalog version (for the catalog changes) and PGSTAT_FILE_FORMAT_ID (for the additions in PgStat_StatTabEntry). Author: Sami Imseih Reviewed-by: Bertrand Drouvot, Michael Paquier Discussion: https://postgr.es/m/CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
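A sketch of the derivation the message mentions, pairing the new totals with the existing count fields (the totals are assumed to be reported in milliseconds):

    SELECT relname,
           total_autovacuum_time  / NULLIF(autovacuum_count, 0)  AS avg_autovacuum_ms,
           total_autoanalyze_time / NULLIF(autoanalyze_count, 0) AS avg_autoanalyze_ms
    FROM pg_stat_all_tables
    ORDER BY total_autovacuum_time DESC
    LIMIT 10;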
2025-01-14Make pg_stat_io count IOs as bytes instead of blocks for some operationsMichael Paquier
Currently in the pg_stat_io view, IOs are counted as blocks of size BLCKSZ. There are two limitations with this design:
* The actual number of I/O requests sent to the kernel is lower because I/O requests may be merged before being sent. Additionally, it gives the impression that all I/Os are done in block size, which obscures the benefits of merging I/O requests.
* Some patches are in the works to extend pg_stat_io to track operations that may not be linked to the block size. For example, WAL read IOs are done in variable bytes and it is not possible to correctly show these IOs in the pg_stat_io view, and we want to keep all this data in a single system view rather than spread it across multiple relations to ease monitoring.
WaitReadBuffers() can now be tracked as a single read operation worth N blocks. Same for ExtendBufferedRelShared() and ExtendBufferedRelLocal() for extensions. Three columns are added to pg_stat_io for reads, writes and extensions for the byte calculations. op_bytes, which was always hardcoded to BLCKSZ, is removed. IO backend statistics are updated to reflect these changes. Bump catalog version. Author: Nazir Bilal Yavuz Reviewed-by: Bertrand Drouvot, Melanie Plageman Discussion: https://postgr.es/m/CAN55FZ0oqxBaaHAEsj=xFqkzE3n5P=3RA1V_igXwL-RV7QRzyw@mail.gmail.com
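For illustration, average transfer size per operation; the new byte columns are not named in the message, so read_bytes and write_bytes below are assumptions:

    SELECT backend_type, object, context,
           reads,  read_bytes  / NULLIF(reads, 0)  AS avg_read_bytes,
           writes, write_bytes / NULLIF(writes, 0) AS avg_write_bytes
    FROM pg_stat_io
    WHERE reads > 0 OR writes > 0;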
2024-12-19Add backend-level statistics to pgstatsMichael Paquier
This adds a new variable-numbered statistics kind in pgstats, where the object ID key of the stats entries is based on the proc number of the backends. This acts as an upper bound for the number of stats entries that can exist at once. The entries are created when a backend starts after authentication succeeds, and are removed when the backend exits, making the stats entry exist for as long as its backend is up and running. These are not written to the pgstats file at shutdown (note that write_to_file is disabled, as a safety measure). Currently, these stats include only information about the I/O generated by a backend, using the same layer as pg_stat_io, except that it is now possible to know how much activity is happening in each backend rather than an overall aggregate of all the activity. A function called pg_stat_get_backend_io() is added to access this data based on the PID of a backend. The existing structure could be expanded in the future to add more information about other statistics related to backends, depending on requirements or ideas. Auxiliary processes are not included in this set of statistics. These are less interesting to have than normal backends as they have dedicated entries in pg_stat_io, and stats kinds of their own. This commit also includes pg_stat_reset_backend_stats(), a function able to reset all the stats associated with a single backend. Bump catalog version and PGSTAT_FILE_FORMAT_ID. Author: Bertrand Drouvot Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, Michael Paquier, Nazir Bilal Yavuz Discussion: https://postgr.es/m/ZtXR+CtkEVVE/[email protected]
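An illustrative use of the two new functions; the shape of pg_stat_get_backend_io()'s result set is assumed here to mirror pg_stat_io:

    -- Live per-backend I/O, joined with pg_stat_activity for context.
    SELECT a.pid, a.backend_type, io.object, io.context, io.reads, io.writes
    FROM pg_stat_activity AS a,
         LATERAL pg_stat_get_backend_io(a.pid) AS io
    WHERE a.backend_type = 'client backend';

    -- Reset the counters of the current backend only.
    SELECT pg_stat_reset_backend_stats(pg_backend_pid());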
2024-11-11Add two attributes to pg_stat_database for parallel workers activityMichael Paquier
Two attributes are added to pg_stat_database:
* parallel_workers_to_launch, counting the total number of parallel workers that were planned to be launched.
* parallel_workers_launched, counting the total number of parallel workers actually launched.
The ratio of both fields can provide hints that there are not enough slots available when launching parallel workers, also useful when pg_stat_statements is not deployed on an instance (i.e. cf54a2c00254). This commit relies on de3a2ea3b264, that has added two fields to EState, that get incremented when executing Gather or GatherMerge nodes. A test is added in select_parallel, where parallel workers are spawned. Bump catalog version. Author: Benoit Lobréau Discussion: https://postgr.es/m/[email protected]
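A sketch of the ratio the message alludes to, as a quick signal that parallel worker slots are running out:

    SELECT datname,
           parallel_workers_launched,
           parallel_workers_to_launch,
           round(100.0 * parallel_workers_launched
                       / NULLIF(parallel_workers_to_launch, 0), 1) AS launched_pct
    FROM pg_stat_database
    WHERE parallel_workers_to_launch > 0;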
2024-10-02Fix inconsistent reporting of checkpointer stats.Fujii Masao
Previously, the pg_stat_checkpointer view and the checkpoint completion log message could show different numbers for buffers written during checkpoints. The view only counted shared buffers, while the log message included both shared and SLRU buffers, causing inconsistencies. This commit resolves the issue by updating both the view and the log message to separately report shared and SLRU buffers written during checkpoints. A new slru_written column is added to the pg_stat_checkpointer view to track SLRU buffers, while the existing buffers_written column now tracks only shared buffers. This change helps users distinguish between the two types of buffers in the pg_stat_checkpointer view and the checkpoint completion log message, respectively. Bump catalog version. Author: Nitin Jadhav Reviewed-by: Bharath Rupireddy, Michael Paquier, Kyotaro Horiguchi, Robert Haas Reviewed-by: Andres Freund, vignesh C, Fujii Masao Discussion: https://postgr.es/m/CAMm1aWb18EpT0whJrjG+-nyhNouXET6ZUw0pNYYAe+NezpvsAA@mail.gmail.com
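For illustration, the split now visible in the view (column names as described in this and the surrounding commits):

    SELECT num_timed, num_requested, num_done,
           buffers_written,   -- shared buffers only
           slru_written       -- SLRU buffers, new in this commit
    FROM pg_stat_checkpointer;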
2024-09-30docs: Enhance the pg_stat_checkpointer view documentation.Fujii Masao
This commit updates the documentation for the pg_stat_checkpointer view to clarify what kind of checkpoints or restartpoints each counter tracks. This makes it easier to understand the meaning of each counter. Previously, the num_requested description included "backend," which could be misleading since requests come from other sources as well. This commit also removes "backend" from the description of num_requested, to avoid confusion. Author: Fujii Masao Reviewed-by: Anton A. Melnikov Discussion: https://postgr.es/m/[email protected]
2024-09-30Add num_done counter to the pg_stat_checkpointer view.Fujii Masao
Checkpoints can be skipped when the server is idle. The existing num_timed and num_requested counters in pg_stat_checkpointer track both completed and skipped checkpoints, but there was no way to count only the completed ones. This commit introduces the num_done counter, which tracks only completed checkpoints, making it easier to see how many were actually performed. Bump catalog version. Author: Anton A. Melnikov Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/[email protected]
2024-09-24Add ONLY support for VACUUM and ANALYZEDavid Rowley
Since autovacuum does not trigger an ANALYZE for partitioned tables, users must perform these manually. However, performing a manual ANALYZE on a partitioned table would always result in recursively analyzing each partition, which could be undesirable as autovacuum takes care of that. For partitioned tables that contain a large number of partitions, having to analyze each partition could take an unreasonably long time, especially so for tables with a large number of columns. Here we allow the ONLY keyword to prefix the name of the table, letting users have ANALYZE skip processing partitions. This option can also be used with VACUUM, but there is no work to do if VACUUM ONLY is used on a partitioned table. This commit also changes the behavior of VACUUM and ANALYZE for inheritance parents. Previously inheritance child tables would not be processed when operating on the parent. Now, by default we *do* operate on the child tables. ONLY can be used to obtain the old behavior. The release notes should note this as an incompatibility. The default behavior has not changed for partitioned tables as these always recursively processed the partitions. Author: Michael Harris <[email protected]> Discussion: https://postgr.es/m/CADofcAWATx_haD=QkSxHbnTsAe6+e0Aw8Eh4H8cXyogGvn_kOg@mail.gmail.com Discussion: https://postgr.es/m/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com Reviewed-by: Jelte Fennema-Nio <[email protected]> Reviewed-by: Melih Mutlu <[email protected]> Reviewed-by: Atsushi Torikoshi <[email protected]> Reviewed-by: jian he <[email protected]> Reviewed-by: David Rowley <[email protected]>
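A brief sketch of the new syntax on a hypothetical partitioned table named measurement:

    -- Gather statistics for the partitioned table itself without recursing
    -- into its partitions (autovacuum already analyzes each partition).
    ANALYZE ONLY measurement;

    -- Recursive behavior, as before:
    ANALYZE measurement;

    -- ONLY is accepted by VACUUM too, although there is nothing to do for a
    -- partitioned table, which stores no data of its own.
    VACUUM ONLY measurement;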
2024-09-18docs: Improve the description of num_timed column in pg_stat_checkpointer.Fujii Masao
The previous documentation stated that num_timed reflects the number of scheduled checkpoints performed. However, checkpoints may be skipped if the server has been idle, and num_timed counts both skipped and completed checkpoints. This commit clarifies the description to make it clear that the counter includes both skipped and completed checkpoints. Back-patch to v17 where pg_stat_checkpointer was added. Author: Fujii Masao Reviewed-by: Alexander Korotkov Discussion: https://postgr.es/m/[email protected]
2024-09-04Collect statistics about conflicts in logical replication.Amit Kapila
This commit adds columns in view pg_stat_subscription_stats to show the number of times a particular conflict type has occurred during the application of logical replication changes. The following columns are added:
confl_insert_exists: Number of times a row insertion violated a NOT DEFERRABLE unique constraint.
confl_update_origin_differs: Number of times an update was performed on a row that was previously modified by another origin.
confl_update_exists: Number of times that the updated value of a row violates a NOT DEFERRABLE unique constraint.
confl_update_missing: Number of times that the tuple to be updated is missing.
confl_delete_origin_differs: Number of times a delete was performed on a row that was previously modified by another origin.
confl_delete_missing: Number of times that the tuple to be deleted is missing.
The update_origin_differs and delete_origin_differs conflicts can be detected only when track_commit_timestamp is enabled. Author: Hou Zhijie Reviewed-by: Shveta Malik, Peter Smith, Amit Kapila Discussion: https://postgr.es/m/OS0PR01MB57160A07BD575773045FC214948F2@OS0PR01MB5716.jpnprd01.prod.outlook.com
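An illustrative query over the new counters (the *_origin_differs ones require track_commit_timestamp to be enabled):

    SELECT subname,
           confl_insert_exists,
           confl_update_origin_differs,
           confl_update_exists,
           confl_update_missing,
           confl_delete_origin_differs,
           confl_delete_missing
    FROM pg_stat_subscription_stats;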
2024-07-10Revamp documentation for predefined roles.Nathan Bossart
Presently, the page for predefined roles contains a table with brief descriptions of what each role allows. Below the table, there is a separate section with more detailed information about some of the roles. As the set of predefined roles has grown over the years, this page has (IMHO) become less readable. This commit attempts to improve the predefined roles documentation by abandoning the table in favor of listing each role with its own complete description, similar to how we document GUCs. Besides merging the information that was split between the table and the section below it, this commit also alphabetizes the roles. The alphabetization is imperfect because some of the roles are grouped (e.g., pg_read_all_data and pg_write_all_data), and we order such groups by the first role mentioned, but that seemed like a better choice than breaking the groups apart. Finally, this commit makes some stylistic adjustments to the text. Reviewed-by: David G. Johnston, Robert Haas Discussion: https://postgr.es/m/ZmtM-4-eRtq8DRf6%40nathan
2024-07-10doc: Update track_io_timing documentation to mention pg_stat_io.Fujii Masao
The I/O timing information collected when track_io_timing is enabled is now documented to appear in the pg_stat_io view, which was previously not mentioned. This commit also enhances the description of track_io_timing to clarify that it monitors not only block read and write but also block extend and fsync operations. Additionally, the description of track_wal_io_timing has been improved to mention both WAL write and WAL fsync monitoring. Backpatch to v16 where pg_stat_io was added. Author: Hajime Matsunaga Reviewed-by: Melanie Plageman, Nazir Bilal Yavuz, Fujii Masao Discussion: https://postgr.es/m/TYWPR01MB10742EE4A6F34C33061429D38A4D52@TYWPR01MB10742.jpnprd01.prod.outlook.com
2024-06-28Add wait event type "InjectionPoint", a custom type like "Extension".Noah Misch
Both injection points and customization of type "Extension" are new in v17, so this just changes a detail of an unreleased feature. Reported by Robert Haas. Reviewed by Michael Paquier. Discussion: https://postgr.es/m/CA+TgmobfMU5pdXP36D5iAwxV5WKE_vuDLtp_1QyH+H5jMMt21g@mail.gmail.com
2024-06-14Reintroduce dead tuple counter in pg_stat_progress_vacuum.Masahiko Sawada
Commit 667e65aac3 changed both num_dead_tuples and max_dead_tuples columns to dead_tuple_bytes and max_dead_tuple_bytes columns, respectively. But as per discussion, the number of dead tuples collected still provides meaningful insights for users. This commit reintroduces the column for the count of dead tuples, renamed as num_dead_item_ids. It avoids confusion with the number of dead tuples removed by VACUUM, which includes dead heap-only tuples but excludes any pre-existing LP_DEAD items left behind by opportunistic pruning. Bump catalog version. Reviewed-by: Peter Geoghegan, Álvaro Herrera, Andrey Borodin Discussion: https://postgr.es/m/CAD21AoBL5sJE9TRWPyv%2Bw7k5Ee5QAJqDJEDJBUdAaCzGWAdvZw%40mail.gmail.com
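For illustration, the byte-based columns from 667e65aac3 and the reintroduced count, side by side:

    SELECT pid, relid::regclass AS relation, phase,
           num_dead_item_ids, dead_tuple_bytes, max_dead_tuple_bytes
    FROM pg_stat_progress_vacuum;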
2024-04-06Enhance nbtree ScalarArrayOp execution.Peter Geoghegan
Commit 9e8da0f7 taught nbtree to handle ScalarArrayOpExpr quals natively. This works by pushing down the full context (the array keys) to the nbtree index AM, enabling it to execute multiple primitive index scans that the planner treats as one continuous index scan/index path. This earlier enhancement enabled nbtree ScalarArrayOp index-only scans. It also allowed scans with ScalarArrayOp quals to return ordered results (with some notable restrictions, described further down). Take this general approach a lot further: teach nbtree SAOP index scans to decide how to execute ScalarArrayOp scans (when and where to start the next primitive index scan) based on physical index characteristics. This can be far more efficient. All SAOP scans will now reliably avoid duplicative leaf page accesses (just like any other nbtree index scan). SAOP scans whose array keys are naturally clustered together now require far fewer index descents, since we'll reliably avoid starting a new primitive scan just to get to a later offset from the same leaf page. The scan's arrays now advance using binary searches for the array element that best matches the next tuple's attribute value. Required scan key arrays (i.e. arrays from scan keys that can terminate the scan) ratchet forward in lockstep with the index scan. Non-required arrays (i.e. arrays from scan keys that can only exclude non-matching tuples) "advance" without the process ever rolling over to a higher-order array. Naturally, only required SAOP scan keys trigger skipping over leaf pages (non-required arrays cannot safely end or start primitive index scans). Consequently, even index scans of a composite index with a high-order inequality scan key (which we'll mark required) and a low-order SAOP scan key (which we won't mark required) now avoid repeating leaf page accesses -- that benefit isn't limited to simpler equality-only cases. In general, all nbtree index scans now output tuples as if they were one continuous index scan -- even scans that mix a high-order inequality with lower-order SAOP equalities reliably output tuples in index order. This allows us to remove a couple of special cases that were applied when building index paths with SAOP clauses during planning. Bugfix commit 807a40c5 taught the planner to avoid generating unsafe path keys: path keys on a multicolumn index path, with a SAOP clause on any attribute beyond the first/most significant attribute. These cases are now all safe, so we go back to generating path keys without regard for the presence of SAOP clauses (just like with any other clause type). Affected queries can now exploit scan output order in all the usual ways (e.g., certain "ORDER BY ... LIMIT n" queries can now terminate early). Also undo changes from follow-up bugfix commit a4523c5a, which taught the planner to produce alternative index paths, with path keys, but without low-order SAOP index quals (filter quals were used instead). We'll no longer generate these alternative paths, since they can no longer offer any meaningful advantages over standard index qual paths. Affected queries thereby avoid all of the disadvantages that come from using filter quals within index scan nodes. They can avoid extra heap page accesses from using filter quals to exclude non-matching tuples (index quals will never have that problem). They can also skip over irrelevant sections of the index in more cases (though only when nbtree determines that starting another primitive scan actually makes sense). 
There is a theoretical risk that removing restrictions on SAOP index paths from the planner will break compatibility with amcanorder-based index AMs maintained as extensions. Such an index AM could have the same limitations around ordered SAOP scans as nbtree had up until now. Adding a pro forma incompatibility item about the issue to the Postgres 17 release notes seems like a good idea. Author: Peter Geoghegan <[email protected]> Author: Matthias van de Meent <[email protected]> Reviewed-By: Heikki Linnakangas <[email protected]> Reviewed-By: Matthias van de Meent <[email protected]> Reviewed-By: Tomas Vondra <[email protected]> Discussion: https://postgr.es/m/CAH2-Wz=ksvN_sjcnD1+Bt-WtifRA5ok48aDYnq3pkKhxgMQpcw@mail.gmail.com
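A rough sketch of a query shape that can now rely on the scan's output order; the table, index, and data are hypothetical:

    CREATE TABLE t (a int, b int);
    CREATE INDEX t_a_b_idx ON t (a, b);

    -- A low-order SAOP qual no longer forces a filter qual or a sort: the
    -- scan can return tuples in index order, so the LIMIT may stop early.
    EXPLAIN (COSTS OFF)
    SELECT * FROM t
    WHERE a > 10 AND b = ANY ('{1, 2, 3}')
    ORDER BY a, b
    LIMIT 5;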
2024-04-03docs: Demote "Monitoring Disk Usage" from chapter to section.Robert Haas
This chapter is very short, and the immediately preceding chapter is called "Monitoring Database Activity". So, instead of having a separate chapter for this, make it the last section of the preceding chapter instead. Discussion: http://postgr.es/m/CA+Tgmob7_uoYuS2=rVwpVXaRwP-UXz+++saYTC-BCZ42QzSNKQ@mail.gmail.com
2024-04-02Use TidStore for dead tuple TIDs storage during lazy vacuum.Masahiko Sawada
Previously, we used a simple array for storing dead tuple IDs during lazy vacuum, which had a number of problems:
* The array used a single allocation and so was limited to 1GB.
* The allocation was pessimistically sized according to table size.
* Lookup with binary search was slow because of poor CPU cache and branch prediction behavior.
This commit replaces that array with the TID store from commit 30e144287a. Since the backing radix tree makes small allocations as needed, the 1GB limit is now gone. Further, the total memory used is now often smaller by an order of magnitude or more, depending on the distribution of blocks and offsets. These two features should make multiple rounds of heap scanning and index cleanup an extremely rare event. TID lookup during index cleanup is also several times faster, even more so when index order is correlated with heap tuple order. Since there is no longer a predictable relationship between the number of dead tuples vacuumed and the space taken up by their TIDs, the number of tuples no longer provides any meaningful insights for users, nor is the maximum number predictable. For that reason this commit also changes to byte-based progress reporting, with the relevant columns of pg_stat_progress_vacuum renamed accordingly to max_dead_tuple_bytes and dead_tuple_bytes. For parallel vacuum, both the TID store and supplemental information specific to vacuum are shared among the parallel vacuum workers. As with the previous array, we don't take any locks on TidStore during parallel vacuum since writes are still only done by the leader process. Bump catalog version. Reviewed-by: John Naylor, (in an earlier version) Dilip Kumar Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
2024-03-22Revert "Add notBefore and notAfter to SSL cert info display"Daniel Gustafsson
This reverts commit 6acb0a628eccab8764e0306582c2b7e2a1441b9b since LibreSSL didn't support ASN1_TIME_diff until OpenBSD 7.1, leaving the older OpenBSD animals in the buildfarm complaining. Per plover in the buildfarm. Discussion: https://postgr.es/m/[email protected]
2024-03-22Add notBefore and notAfter to SSL cert info displayDaniel Gustafsson
This adds the X509 attributes notBefore and notAfter to sslinfo as well as pg_stat_ssl to allow verifying and identifying the validity period of the current client certificate. OpenSSL has APIs for extracting notAfter and notBefore, but they are only supported in recent versions so we have to calculate the dates by hand in order to make this work for the older versions of OpenSSL that we still support. Original patch by Cary Huang with additional hacking by Jacob and myself. Author: Cary Huang <[email protected]> Co-author: Jacob Champion <[email protected]> Co-author: Daniel Gustafsson <[email protected]> Discussion: https://postgr.es/m/[email protected]
2024-03-14Improve documentation for pg_stat_checkpointer fieldsAlexander Korotkov
pg_stat_checkpointer contains statistics for checkpoints and restartpoints. Before 12915a58eec9, the documentation talked only about checkpoints, implying that a restartpoint is merely a variation of a checkpoint. 12915a58eec9 introduced new separate statistics fields for restartpoints. This commit explicitly documents the fields that are relevant for both checkpoints and restartpoints. Reported-by: Magnus Hagander Discussion: https://postgr.es/m/CABUevExav5-SR0x%2BG9kBUMV0G8XsvSUfuyyqmYBBJi6VHns6sw%40mail.gmail.com Reviewed-by: Anton A. Melnikov
2024-03-03Replace BackendIds with 0-based ProcNumbersHeikki Linnakangas
Now that BackendId was just another index into the proc array, it was redundant with the 0-based proc numbers used in other places. Replace all usage of backend IDs with proc numbers. The only place where the term "backend id" remains is in a few pgstat functions that expose backend IDs at the SQL level. Those IDs are now in fact 0-based ProcNumbers too, but the documentation still calls them "backend ids". That term still seems appropriate to describe what the numbers are, so I let it be. One user-visible effect is that pg_temp_0 is now a valid temp schema name, for backend with ProcNumber 0. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/[email protected]
2024-02-28Improve performance of subsystems on top of SLRUAlvaro Herrera
More precisely, what we do here is make the SLRU cache sizes configurable with new GUCs, so that sites with high concurrency and big ranges of transactions in flight (resp. multixacts/subtransactions) can benefit from bigger caches. In order for this to work with good performance, two additional changes are made:
1. The cache is divided in "banks" (to borrow terminology from CPU caches), and algorithms such as eviction buffer search only affect one specific bank. This forestalls the problem that linear searching for a specific buffer across the whole cache takes too long: we only have to search the specific bank, whose size is small. This work is authored by Andrey Borodin.
2. Change the locking regime for the SLRU banks, so that each bank uses a separate LWLock. This allows for increased scalability. This work is authored by Dilip Kumar. (A part of this was previously committed as d172b717c6f4.)
Special care is taken so that the algorithms that can potentially traverse more than one bank release one bank's lock before acquiring the next. This should happen rarely, but particularly clog.c's group commit feature needed code adjustment to cope with this. I (Álvaro) also added lots of comments to make sure the design is sound. The new GUCs match the names introduced by bcdfa5f2e2f2 in the pg_stat_slru view. The default values for these parameters are similar to the previous sizes of each SLRU. commit_ts, clog and subtrans accept value 0, which means to adjust by dividing shared_buffers by 512 (so 2MB for every 1GB of shared_buffers), with a cap of 8MB. (A new slru.c function SimpleLruAutotuneBuffers() was added to support this.) The cap was previously 1MB for clog, so for sites with more than 512MB of shared memory the total memory used increases, which is likely a good tradeoff. However, other SLRUs (notably multixact ones) retain smaller sizes and don't support a configured value of 0. These values based on shared_buffers may need to be revisited, but that's an easy change. There was some resistance to adding these new GUCs: it would be better to adjust to memory pressure automatically somehow, for example by stealing memory from shared_buffers (where the caches can grow and shrink naturally). However, doing that seems to be a much larger project and one which has made virtually no progress in several years, and because this is such a pain point for so many users, here we take the pragmatic approach. Author: Andrey Borodin <[email protected]> Author: Dilip Kumar <[email protected]> Reviewed-by: Amul Sul, Gilles Darold, Anastasia Lubennikova, Ivan Lazarev, Robert Haas, Thomas Munro, Tomas Vondra, Yura Sokolov, Васильев Дмитрий (Dmitry Vasiliev). Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/CAFiTN-vzDvNz=ExGXz6gdyjtzGixKSqs0mKHMmaQ8sOSEFZ33A@mail.gmail.com
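The GUC names are not listed in the message; the ones below assume they follow the renamed pg_stat_slru entries (e.g. transaction_buffers for the clog SLRU), as the commit states:

    -- GUC names assumed to match the renamed pg_stat_slru entries.
    ALTER SYSTEM SET transaction_buffers = 0;      -- 0 = shared_buffers/512, capped at 8MB
    ALTER SYSTEM SET subtransaction_buffers = 0;
    -- These size shared memory, so a server restart is expected to be needed.

    -- Hit ratios per SLRU, to judge whether bigger caches would help:
    SELECT name, blks_hit, blks_read
    FROM pg_stat_slru;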
2024-02-28Rename SLRU elements in view pg_stat_slruAlvaro Herrera
The new names are intended to match those in an upcoming patch that adds a few GUCs to configure the SLRU buffer sizes. Backwards compatibility concern: this changes the accepted names for function pg_stat_reset_slru(). Since this function recognizes "any other string" as a request to reset the entry for "other", this means that calling it with the old names would silently reset "other" instead of doing nothing or throwing an error. Reviewed-by: Andrey M. Borodin <[email protected]> Discussion: https://postgr.es/m/[email protected]
2024-02-06doc: Spell I/O consistentlyMichael Paquier
The pg_stat_io and pg_stat_copy_progress view docs spelled "I/O" as "IO" or even "io" in some places when not referring to literal names or string values. Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/[email protected]
2024-01-25Add progress reporting of skipped tuples during COPY FROM.Masahiko Sawada
9e2d870119 enabled the COPY command to skip malformed data; however, there was no visibility into how many tuples were actually skipped during the COPY FROM. This commit adds a new "tuples_skipped" column to pg_stat_progress_copy view to report the number of tuples that were skipped because they contain malformed data. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Masahiko Sawada Discussion: https://postgr.es/m/d12fd8c99adcae2744212cb23feff6ed%40oss.nttdata.com
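An illustrative pairing with the ON_ERROR option from 9e2d870119; the table and file path are hypothetical:

    -- In one session, load data while ignoring malformed rows:
    COPY measurements FROM '/tmp/measurements.csv' WITH (FORMAT csv, ON_ERROR ignore);

    -- In another session, watch the skipped-row count grow:
    SELECT relid::regclass AS relation,
           tuples_processed, tuples_excluded, tuples_skipped
    FROM pg_stat_progress_copy;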
2024-01-18More documentation updates for incremental backup.Robert Haas
Add new terms to glossary. Add a reference to walsummarizer to monitoring.sgml. Matthias van de Meent and Robert Haas Discussion: http://postgr.es/m/CAEze2WjhdVCqEe_qqEok3NA6DwUdOGSBjAxzmYdAqiaaH1uRcg@mail.gmail.com
2023-12-24Enhance checkpointer restartpoint statisticsAlexander Korotkov
This commit introduces enhancements to the pg_stat_checkpointer view by adding three new columns: restartpoints_timed, restartpoints_req, and restartpoints_done. These additions aim to improve the visibility and monitoring of restartpoint processes on replicas. Previously, it was challenging to differentiate between successful and failed restartpoint requests. This limitation arises because restartpoints on replicas are dependent on checkpoint records from the primary, and cannot occur more frequently than these checkpoints. The new columns allow for clear distinction and tracking of restartpoint requests, their triggers, and successful completions. This enhancement aids database administrators and developers in better understanding and diagnosing issues related to restartpoint behavior, particularly in scenarios where restartpoint requests may fail. System catalog is changed. Catversion is bumped. Discussion: https://postgr.es/m/99b2ccd1-a77a-962a-0837-191cdf56c2b9%40inbox.ru Author: Anton A. Melnikov Reviewed-by: Kyotaro Horiguchi, Alexander Korotkov
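For illustration, the three new counters as seen on a standby:

    SELECT restartpoints_timed, restartpoints_req, restartpoints_done
    FROM pg_stat_checkpointer;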
2023-12-18Provide vectored variants of smgrread() and smgrwrite().Thomas Munro
smgrreadv() and smgrwritev() and their md.c implementations call FileReadV() and FileWriteV(). A range of disk blocks beginning at 'blocknum' and extending for 'nblocks' can be scattered to or gathered from multiple buffers with a single system call. The traditional smgrread() and smgrwrite() functions are implemented in terms of the new functions. Later commits will introduce calls with nblocks > 1, but the following behavioral changes can be seen already:
* After a short transfer we'll now retry until we eventually read 0 bytes (= EOF) or get ENOSPC, EDQUOT, EFBIG etc, where previously we would infer the reason. Retrying is consistent with xlog.c's treatment of large WAL writes, and arguably also xlog.c and fd.c's treatment of EINTR. Arbitrary short returns for larger transfers have been observed on several OSes, and might in theory also happen for transient reasons with our own pg_p*v() fallback code.
* After unexpected EOF or -1, the error thrown now talks about a range even for the single block case, eg "blocks 42..42".
Reviewed-by: Heikki Linnakangas <[email protected]> Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com
2023-11-16Add target "slru" to pg_stat_reset_shared()Michael Paquier
Currently, pg_stat_reset_shared() cannot reset the counters in the view pg_stat_slru even if it is a type of shared stats. This patch adds support for a new value in pg_stat_reset_shared(), called "slru", able to do that. Note that pg_stat_reset_shared(NULL) also resets SLRU counters. There may be a point in removing pg_stat_reset_slru() that was introduced in 28cac71bd368 (v13~) as the new option overlaps with this function, but we would lose the ability to reset individual SLRU counters. This is left for future reconsideration. Author: Atsushi Torikoshi Discussion: https://postgr.es/m/[email protected]
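A short illustration of the new target alongside the existing behavior described here:

    SELECT pg_stat_reset_shared('slru');  -- reset all counters shown in pg_stat_slru
    SELECT pg_stat_reset_shared(NULL);    -- also resets the SLRU counters, among other shared stats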
2023-11-15doc: Improve description of targets for pg_stat_reset_shared()Michael Paquier
This commit changes the documentation so that the supported targets are documented with an itemized list, making it easier to understand which view a given target affects. Author: Atsushi Torikoshi Discussion: https://postgr.es/m/[email protected]
2023-11-14Add support for pg_stat_reset_slru without argumentMichael Paquier
pg_stat_reset_slru currently requires an input argument, either:
- NULL to reset the SLRU counters of everything.
- A specific value to reset a single SLRU cache.
This commit adds support for a new pattern: pg_stat_reset_slru without any argument works the same way as pg_stat_reset_slru(NULL), relying on a DEFAULT in the function definition to handle this case. This makes the function more consistent with 23c8c0c8f472. Bump catalog version. Author: Bharath Rupireddy Reviewed-by: Atsushi Torikoshi Discussion: https://postgr.es/m/CALj2ACW1VizYg01EeH_cA-7qA+4NzWVAoZ5Lw9_XYO1RRHAZbA@mail.gmail.com
2023-11-12Add ability to reset all shared stats types in pg_stat_reset_shared()Michael Paquier
Currently, pg_stat_reset_shared() can use an argument to specify the target of statistics to reset, doing nothing for NULL as it is strict. This patch adds to pg_stat_reset_shared() the possibility to reset all the stats types already handled in this function rather than do nothing if the argument value given is NULL or if nothing is specified (proisstrict is switched to false). Like previously, SLRUs are not included in what gets reset. The idea to use NULL or no argument to control if all the shared stats already covered by this function should be reset has been proposed by Andres Freund. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Kyotaro Horiguchi, Michael Paquier, Bharath Rupireddy, Matthias van de Meent Discussion: https://postgr.es/m/[email protected]
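A short sketch of the relaxed call forms this commit allows:

    SELECT pg_stat_reset_shared();      -- reset every stats type handled by the function (SLRUs excluded)
    SELECT pg_stat_reset_shared(NULL);  -- same thing, now that the function is no longer strict
    SELECT pg_stat_reset_shared('io');  -- a single target still works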