summaryrefslogtreecommitdiff
path: root/src/backend/utils/activity/pgstat_relation.c
AgeCommit message (Collapse)Author
2 daysIntegrate FullTransactionIds deeper into two-phase codeMichael Paquier
This refactoring is a follow-up of the work done in 5a1dfde8334b, that has switched 2PC file names to use FullTransactionIds when written on disk. This will help with the integration of a follow-up solution related to the handling of two-phase files during recovery, to address older defects while reading these from disk after a crash. This change is useful in itself as it reduces the need to build the file names from epoch numbers and TransactionIds, because we can use directly FullTransactionIds from which the 2PC file names are guessed. So this avoids a lot of back-and-forth between the FullTransactionIds retrieved from the file names and how these are passed around in the internal 2PC logic. Note that the core of the change is the use of a FullTransactionId instead of a TransactionId in GlobalTransactionData, that tracks 2PC file information in shared memory. The change in TwoPhaseCallback makes this commit unfit for stable branches. Noah has contributed a good chunk of this patch. I have spent some time on it as well while working on the issues with two-phase state files and recovery. Author: Noah Misch <[email protected]> Co-Authored-by: Michael Paquier <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2025-05-03Fix typos in comments.Etsuro Fujita
Also adjust the phrasing in the comments. Author: Etsuro Fujita <[email protected]> Author: Heikki Linnakangas <[email protected]> Reviewed-by: Tender Wang <[email protected]> Reviewed-by: Gurjeet Singh <[email protected]> Reviewed-by: Michael Paquier <[email protected]> Discussion: https://postgr.es/m/CAPmGK17%3DPHSDZ%2B0G6jcj12buyyE1bQQc3sbp1Wxri7tODT-SDw%40mail.gmail.com Backpatch-through: 15
2025-04-10Add code comment explaining ins_since_vacuum and aborted insertsDavid Rowley
Sami complained that there's a discrepancy between n_mod_since_analyze and n_ins_since_vacuum, as the former only accounts for committed changes and the latter tracks committed and aborted inserts. Nobody seemed overly concerned that this would cause any concerning issues. The repercussions, from what I can tell, are limited to causing an autovacuum to trigger for inserts sooner than it otherwise might. For typical ratios of commits to aborts, it's unlikely to ever be noticed. Fixing things to make it so n_ins_since_vacuum only displays committed inserts would require an additional field in PgStat_TableCounts, which does not quite seem worthwhile at this stage. This commit just adds a comment with some details to mention that we know about it, which will hopefully prevent repeat discussions. Reported-by: Sami Imseih <[email protected]> Author: David Rowley <[email protected]> Reviewed-by: Sami Imseih <[email protected]> Discussion: https://postgr.es/m/CAApHDvpgV3a-R2EGmPOh0L-x3pHbZpM3y4dySWfy+UqUazwDQA@mail.gmail.com
2025-01-28Track per-relation cumulative time spent in [auto]vacuum and [auto]analyzeMichael Paquier
This commit adds four fields to the statistics of relations, aggregating the amount of time spent for each operation on a relation: - total_vacuum_time, for manual vacuum. - total_autovacuum_time, for vacuum done by the autovacuum daemon. - total_analyze_time, for manual analyze. - total_autoanalyze_time, for analyze done by the autovacuum daemon. This gives users the option to derive the average time spent for these operations with the help of the related "count" fields. Bump catalog version (for the catalog changes) and PGSTAT_FILE_FORMAT_ID (for the additions in PgStat_StatTabEntry). Author: Sami Imseih Reviewed-by: Bertrand Drouvot, Michael Paquier Discussion: https://postgr.es/m/CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
2025-01-21Rework handling of pending data for backend statisticsMichael Paquier
9aea73fc61d4 has added support for backend statistics, relying on PgStat_EntryRef->pending for its data pending for flush. This design lacks in flexibility, because the pending list does some memory allocation, making it unsuitable if incrementing counters in critical sections. Pending data of backend statistics is reworked so the implementation does not depend on PgStat_EntryRef->pending anymore, relying on a static area of memory to store the counters that are flushed when stats are reported to the pgstats dshash. An advantage of this approach is to allow the pending data to be manipulated in critical sections; some patches are under discussion and require that. The pending data is tracked by PendingBackendStats, local to pgstat_backend.c. Two routines are introduced to allow IO statistics to update the backend-side counters. have_static_pending_cb and flush_static_cb are used for the flush, instead of flush_pending_cb. Author: Bertrand Drouvot, Michael Paquier Discussion: https://postgr.es/m/66efowskppsns35v5u2m7k4sdnl7yoz5bo64tdjwq7r5lhplrz@y7dme5xwh2r5
2025-01-10Refactor some code related to backend statisticsMichael Paquier
This commit changes the way pending backend statistics are tracked by moving them into a new structure called PgStat_BackendPending, removing PgStat_BackendPendingIO. PgStat_BackendPending currently only includes PgStat_PendingIO for the pending I/O stats. pgstat_flush_backend() is extended with a "flags" argument to control which parts of the stats of a backend should be flushed. With this refactoring, it becomes easier to plug into backend statistics more data. A patch to add information related to WAL in this stats kind is under discussion. Author: Bertrand Drouvot Discussion: https://postgr.es/m/Z3zqc4o09dM/[email protected]
2025-01-01Update copyright for 2025Bruce Momjian
Backpatch-through: 13
2024-12-19Add backend-level statistics to pgstatsMichael Paquier
This adds a new variable-numbered statistics kind in pgstats, where the object ID key of the stats entries is based on the proc number of the backends. This acts as an upper-bound for the number of stats entries that can exist at once. The entries are created when a backend starts after authentication succeeds, and are removed when the backend exits, making the stats entry exist for as long as their backend is up and running. These are not written to the pgstats file at shutdown (note that write_to_file is disabled, as a safety measure). Currently, these stats include only information about the I/O generated by a backend, using the same layer as pg_stat_io, except that it is now possible to know how much activity is happening in each backend rather than an overall aggregate of all the activity. A function called pg_stat_get_backend_io() is added to access this data depending on the PID of a backend. The existing structure could be expanded in the future to add more information about other statistics related to backends, depending on requirements or ideas. Auxiliary processes are not included in this set of statistics. These are less interesting to have than normal backends as they have dedicated entries in pg_stat_io, and stats kinds of their own. This commit includes also pg_stat_reset_backend_stats(), function able to reset all the stats associated to a single backend. Bump catalog version and PGSTAT_FILE_FORMAT_ID. Author: Bertrand Drouvot Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, Michael Paquier, Nazir Bilal Yavuz Discussion: https://postgr.es/m/ZtXR+CtkEVVE/[email protected]
2024-11-01Add pg_memory_is_all_zeros() in memutils.hMichael Paquier
This new function tests if a memory region starting at a given location for a defined length is made only of zeroes. This unifies in a single path the all-zero checks that were happening in a couple of places of the backend code: - For pgstats entries of relation, checkpointer and bgwriter, where some "all_zeroes" variables were previously used with memcpy(). - For all-zero buffer pages in PageIsVerifiedExtended(). This new function uses the same forward scan as the check for all-zero buffer pages, applying it to the three pgstats paths mentioned above. Author: Bertrand Drouvot Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Peter Smith Discussion: https://postgr.es/m/[email protected]
2024-10-27Remove unused #include's from backend .c filesPeter Eisentraut
as determined by IWYU These are mostly issues that are new since commit dbbca2cf299. Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-03-13Make the order of the header file includes consistentPeter Eisentraut
Similar to commit 7e735035f20. Author: Richard Guo <[email protected]> Reviewed-by: Bharath Rupireddy <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
2024-03-04Remove unused #include's from backend .c filesPeter Eisentraut
as determined by include-what-you-use (IWYU) While IWYU also suggests to *add* a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
2024-03-04Use MyBackendType in more places to check what process this isHeikki Linnakangas
Remove IsBackgroundWorker, IsAutoVacuumLauncherProcess(), IsAutoVacuumWorkerProcess(), and IsLogicalSlotSyncWorker() in favor of new Am*Process() macros that use MyBackendType. For consistency with the existing Am*Process() macros. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/[email protected]
2024-01-04Update copyright for 2024Bruce Momjian
Reported-by: Michael Paquier Discussion: https://postgr.es/m/[email protected] Backpatch-through: 12
2023-10-29Refactor some code related to transaction-level statistics for relationsMichael Paquier
This commit refactors find_tabstat_entry() so as transaction counters for inserted, updated and deleted tuples are included in the result returned. If a shared entry is found for a relation, its result is now a copy of the PgStat_TableStatus entry retrieved from shared memory. This idea has been proposed by Andres Freund. While on it, the following SQL functions, used in system views, are refactored with macros, in the same spirit as 83a1a1b56645, reducing the amount of code: - pg_stat_get_xact_tuples_deleted() - pg_stat_get_xact_tuples_inserted() - pg_stat_get_xact_tuples_updated() There is now only one caller of find_tabstat_entry() in the tree. Author: Bertrand Drouvot Discussion: https://postgr.es/m/[email protected]
2023-03-23Rename fields in pgstat structures for functions and relationsMichael Paquier
This commit renames the members of a few pgstat structures related to functions and relations, by respectively removing their prefix "f_" and "t_". The statistics for functions and relations and handled in their own file, and pgstatfuncs.c associates each field in a structure variable named based on the object type handled, so no information is lost with this rename. This will help with some of the refactoring aimed for pgstatfuncs.c, as this makes more consistent the field names with the SQL functions retrieving them. Author: Bertrand Drouvot Reviewed-by: Michael Paquier, Melanie Plageman Discussion: https://postgr.es/m/[email protected]
2023-03-23Count updates that move row to a new page.Peter Geoghegan
Add pgstat counter to track row updates that result in the successor version going to a new heap page, leaving behind an original version whose t_ctid points to the new version. The current count is shown by the n_tup_newpage_upd column of each of the pg_stat_*_tables views. The new n_tup_newpage_upd column complements the existing n_tup_hot_upd and n_tup_upd columns. Tables that have high n_tup_newpage_upd values (relative to n_tup_upd) are good candidates for tuning heap fillfactor. Corey Huinker, with small tweaks by me. Author: Corey Huinker <[email protected]> Reviewed-By: Peter Geoghegan <[email protected]> Reviewed-By: Andres Freund <[email protected]> Discussion: https://postgr.es/m/CADkLM=ded21M9iZ36hHm-vj2rE2d=zcKpUQMds__Xm2pxLfHKA@mail.gmail.com
2023-02-09pgstat: Infrastructure for more detailed IO statisticsAndres Freund
This commit adds the infrastructure for more detailed IO statistics. The calls to actually count IOs, a system view to access the new statistics, documentation and tests will be added in subsequent commits, to make review easier. While we already had some IO statistics, e.g. in pg_stat_bgwriter and pg_stat_database, they did not provide sufficient detail to understand what the main sources of IO are, or whether configuration changes could avoid IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers written out by a backend, but as that includes extending relations (always done by backends) and writes triggered by the use of buffer access strategies, it cannot easily be used to tune background writer or checkpointer. Similarly, pg_stat_database.blks_read cannot easily be used to tune shared_buffers / compute a cache hit ratio, as the use of buffer access strategies will often prevent a large fraction of the read blocks to end up in shared_buffers. The new IO statistics count IO operations (evict, extend, fsync, read, reuse, and write), and are aggregated for each combination of backend type (backend, autovacuum worker, bgwriter, etc), target object of the IO (relations, temp relations) and context of the IO (normal, vacuum, bulkread, bulkwrite). What is tracked in this series of patches, is sufficient to perform the aforementioned analyses. Further details, e.g. tracking the number of buffer hits, would make that even easier, but was left out for now, to keep the scope of the already large patchset manageable. Bumps PGSTAT_FILE_FORMAT_ID. Author: Melanie Plageman <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Justin Pryzby <[email protected]> Reviewed-by: Kyotaro Horiguchi <[email protected]> Discussion: https://postgr.es/m/[email protected]
2023-01-13Manual cleanup and pgindent of pgstat and bufmgr related codeAndres Freund
This is in preparation for commiting a larger patch series in the area. Discussion: https://postgr.es/m/CAAKRu_bHwGEbzNxxy+MQDkrsgog6aO6iUvajJ4d6PD98gFU7+w@mail.gmail.com
2023-01-02Update copyright for 2023Bruce Momjian
Backpatch-through: 11
2022-12-07Generate pg_stat_get*() functions for databases using macrosMichael Paquier
The same code pattern is repeated 21 times for int64 counters (0 for missing entry) and 5 times for doubles (0 for missing entry) on database entries. This code is switched to use macros for the basic code instead, shaving a few hundred lines of originally-duplicated code patterns. The function names remain the same, but some fields of PgStat_StatDBEntry have to be renamed to cope with the new style. This is in the same spirit as 83a1a1b. Author: Michael Paquier Reviewed-by: Nathan Bossart, Bertrand Drouvot Discussion: https://postgr.es/m/[email protected]
2022-12-06Generate pg_stat_get*() functions for tables using macrosMichael Paquier
The same code pattern is repeated 17 times for int64 counters (0 for missing entry) and 5 times for timestamps (NULL for missing entry) on table entries. This code is switched to use a macro for the basic code instead, shaving a few hundred lines of originally-duplicated code. The function names remain the same, but some fields of PgStat_StatTabEntry have to be renamed to cope with the new style. Author: Bertrand Drouvot Reviewed-by: Nathan Bossart Discussion: https:/postgr.es/m/20221204173207.GA2669116@nathanxps13
2022-11-20pgstat: replace double lookup with IsSharedRelation()Andres Freund
As the list of shared relations is fixed, we can just dispatch based IsSharedRelation(), instead of first trying to look up stats for a non-shared rel and falling back to shared stats. Author: "Drouvot, Bertrand" <[email protected]> Reviewed-by: Bharath Rupireddy <[email protected]> Reviewed-by: Andres Freund <[email protected] Discussion: https://postgr.es/m/[email protected]
2022-10-14pgstat: Track time of the last scan of a relationAndres Freund
It can be useful to know when a relation has last been used, e.g., when evaluating whether an index is still required. It was already possible to infer the time of the last usage by tracking, e.g., pg_stat_all_indexes.idx_scan over time. But far from everybody does so. To make it easier to detect the last time a relation has been scanned, track that time in each relation's pgstat entry. To minimize overhead a) the timestamp is updated only when the backend pending stats entry is flushed to shared stats b) the last transaction's stop timestamp is used as the timestamp. Bumps catalog and stats format versions. Author: Dave Page <[email protected]> Reviewed-by: Andres Freund <[email protected]> Reviewed-by: Bruce Momjian <[email protected]> Reviewed-by: Vik Fearing <[email protected]> Discussion: https://postgr.es/m/CA+OCxozrVHNFVEPkweUHMZje+t1tfY816d9MZYc6eZwOOusOaQ@mail.gmail.com
2022-04-07pgstat: fix small bug in pgstat_drop_relation().Andres Freund
Just after committing 5891c7a8ed8, a test running with debug_discard_caches=1 failed locally... pgstat_drop_relation() neither checked pgstat_should_count_relation() nor called pgstat_prep_relation_pending(). With debug_discard_caches=1 rel->pgstat_info wasn't set up, leading pg_stat_get_xact_tuples_inserted() spuriously still returning > 0 while in the transaction dropping the table.
2022-04-07pgstat: store statistics in shared memory.Andres Freund
Previously the statistics collector received statistics updates via UDP and shared statistics data by writing them out to temporary files regularly. These files can reach tens of megabytes and are written out up to twice a second. This has repeatedly prevented us from adding additional useful statistics. Now statistics are stored in shared memory. Statistics for variable-numbered objects are stored in a dshash hashtable (backed by dynamic shared memory). Fixed-numbered stats are stored in plain shared memory. The header for pgstat.c contains an overview of the architecture. The stats collector is not needed anymore, remove it. By utilizing the transactional statistics drop infrastructure introduced in a prior commit statistics entries cannot "leak" anymore. Previously leaked statistics were dropped by pgstat_vacuum_stat(), called from [auto-]vacuum. On systems with many small relations pgstat_vacuum_stat() could be quite expensive. Now that replicas drop statistics entries for dropped objects, it is not necessary anymore to reset stats when starting from a cleanly shut down replica. Subsequent commits will perform some further code cleanup, adapt docs and add tests. Bumps PGSTAT_FILE_FORMAT_ID. Author: Kyotaro Horiguchi <[email protected]> Author: Andres Freund <[email protected]> Author: Melanie Plageman <[email protected]> Reviewed-By: Andres Freund <[email protected]> Reviewed-By: Thomas Munro <[email protected]> Reviewed-By: Justin Pryzby <[email protected]> Reviewed-By: "David G. Johnston" <[email protected]> Reviewed-By: Tomas Vondra <[email protected]> (in a much earlier version) Reviewed-By: Arthur Zakirov <[email protected]> (in a much earlier version) Reviewed-By: Antonin Houska <[email protected]> (in a much earlier version) Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2022-04-07pgstat: normalize function naming.Andres Freund
Most of pgstat uses pgstat_<verb>_<subject>() or just <verb>_<subject>(). But not all (some introduced fairly recently by me). Rename ones that aren't intentionally following a different scheme (e.g. AtEOXact_*).
2022-04-07pgstat: scaffolding for transactional stats creation / drop.Andres Freund
One problematic part of the current statistics collector design is that there is no reliable way of getting rid of statistics entries. Because of that pgstat_vacuum_stat() (called by [auto-]vacuum) matches all stats for the current database with the catalog contents and tries to drop now-superfluous entries. That's quite expensive. What's worse, it doesn't work on physical replicas, despite physical replicas collection statistics entries. This commit introduces infrastructure to create / drop statistics entries transactionally, together with the underlying catalog objects (functions, relations, subscriptions). pgstat_xact.c maintains a list of stats entries created / dropped transactionally in the current transaction. To ensure the removal of statistics entries is durable dropped statistics entries are included in commit / abort (and prepare) records, which also ensures that stats entries are dropped on standbys. Statistics entries created separately from creating the underlying catalog object (e.g. when stats were previously lost due to an immediate restart) are *not* WAL logged. However that can only happen outside of the transaction creating the catalog object, so it does not lead to "leaked" statistics entries. For this to work, functions creating / dropping functions / relations / subscriptions need to call into pgstat. For subscriptions this was already done when dropping subscriptions, via pgstat_report_subscription_drop() (now renamed to pgstat_drop_subscription()). This commit does not actually drop stats yet, it just provides the infrastructure. It is however a largely independent piece of infrastructure, so committing it separately makes sense. Bumps XLOG_PAGE_MAGIC. Author: Andres Freund <[email protected]> Reviewed-By: Thomas Munro <[email protected]> Reviewed-By: Kyotaro Horiguchi <[email protected]> Discussion: https://postgr.es/m/[email protected]
2022-04-06pgstat: add pgstat_copy_relation_stats().Andres Freund
Until now index_concurrently_swap() directly modified pgstat internal datastructures. That will break with the introduction of shared memory statistics and seems off architecturally. This is done separately from the - quite large - shared memory statistics patch to make review easier. Author: Andres Freund <[email protected]> Author: Kyotaro Horiguchi <[email protected]> Reviewed-By: Kyotaro Horiguchi <[email protected]> Discussion: https://postgr.es/m/[email protected]
2022-04-06pgstat: stats collector references in comments.Andres Freund
Soon the stats collector will be no more, with statistics instead getting stored in shared memory. There are a lot of references to the stats collector in comments. This commit replaces most of these references with "cumulative statistics system", with the remaining ones getting replaced as part of subsequent commits. This is done separately from the - quite large - shared memory statistics patch to make review easier. Author: Andres Freund <[email protected]> Reviewed-By: Justin Pryzby <[email protected]> Reviewed-By: Thomas Munro <[email protected]> Reviewed-By: Kyotaro Horiguchi <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2022-04-06pgstat: move pgstat_report_autovac() to pgstat_database.c.Andres Freund
I got the location wrong in 13619598f10. The name did make it sound like it belonged in pgstat_relation.c...
2022-04-04pgstat: consistent function comment formatting.Andres Freund
There was a wild mishmash of function comment formatting in pgstat, making it hard to know what to use for any new function and hard to extend existing comments (particularly due to randomly different forms of indentation). Author: Andres Freund <[email protected]> Reviewed-By: Thomas Munro <[email protected]> Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2022-03-21pgstat: split different types of stats into separate files.Andres Freund
pgstat.c is very long, and it's hard to find an order that makes sense and is likely to be maintained over time. Splitting the different pieces into separate files makes that a lot easier. With a few exceptions, this commit just moves code around. Those exceptions are: - adding file headers for new files - removing 'static' from functions - adapting pgstat_assert_is_up() to work across TUs - minor comment adjustments git diff --color-moved=dimmed-zebra is very helpful separating code movement from code changes. The next commit in this series will reorder pgstat.[ch] contents to be a bit more coherent. Earlier revisions of this patch had "global" statistics (archiver, bgwriter, checkpointer, replication slots, SLRU, WAL) in one file, because each seemed small enough. However later commits will increase their size and their aggregate size is not insubstantial. It also just seems easier to split each type of statistic into its own file. Author: Andres Freund <[email protected]> Discussion: https://postgr.es/m/[email protected]