path: root/src/backend/replication
Age | Commit message | Author
6 hours | Standardize LSN formatting by zero padding | Álvaro Herrera

This commit standardizes the output format for LSNs to ensure consistent
representation across various tools and messages. Previously, LSNs were
inconsistently printed as `%X/%X` in some contexts, while others used
zero-padding. This often led to confusion when comparing LSN values.

To address this, the LSN format is now uniformly set to `%X/%08X`, ensuring
the lower 32-bit part is always zero-padded to eight hexadecimal digits.

Author: Japin Li <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Álvaro Herrera <[email protected]>
Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
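A minimal self-contained sketch of the formatting change; LSN_FORMAT_ARGS
mirrors the macro in PostgreSQL's xlogdefs.h, redefined here so the example
compiles on its own:

    #include <stdio.h>
    #include <stdint.h>

    typedef uint64_t XLogRecPtr;
    /* splits a 64-bit LSN into its high and low 32-bit halves */
    #define LSN_FORMAT_ARGS(lsn) ((uint32_t) ((lsn) >> 32)), ((uint32_t) (lsn))

    int main(void)
    {
        XLogRecPtr lsn = UINT64_C(0x100000028);

        printf("%X/%X\n", LSN_FORMAT_ARGS(lsn));    /* old style: 1/28 */
        printf("%X/%08X\n", LSN_FORMAT_ARGS(lsn));  /* new style: 1/00000028 */
        return 0;
    }

The zero-padded form compares visually: 1/00000028 is obviously smaller than
1/10000000, whereas 1/28 next to 1/10000000 invites misreading.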
6 days | Make more use of binaryheap_empty() and binaryheap_size(). | Nathan Bossart

A few places were accessing bh_size directly instead of via these handy
macros.

Author: Aleksander Alekseev <[email protected]>
Discussion: https://postgr.es/m/CAJ7c6TPQMVL%2B028T4zuw9ZqL5Du9JavOLhBQLkJeK0RznYx_6w%40mail.gmail.com
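A tiny self-contained sketch of the idiom; the struct and macros mimic
src/include/lib/binaryheap.h just enough to compile on their own:

    #include <stdio.h>

    typedef struct binaryheap { int bh_size; } binaryheap;
    #define binaryheap_empty(h) ((h)->bh_size == 0)
    #define binaryheap_size(h)  ((h)->bh_size)

    int main(void)
    {
        binaryheap heap = { .bh_size = 0 };

        /* before: if (heap.bh_size == 0) ... */
        if (binaryheap_empty(&heap))
            printf("heap holds %d elements\n", binaryheap_size(&heap));
        return 0;
    }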
9 days | Message style improvements | Peter Eisentraut
10 days | Fix CheckPointReplicationSlots() with max_replication_slots == 0 | Alexander Korotkov

ca307d5cec90 made CheckPointReplicationSlots() unconditionally call
ReplicationSlotsComputeRequiredLSN(), which causes an assertion failure when
max_replication_slots equals 0. This commit makes
CheckPointReplicationSlots() call ReplicationSlotsComputeRequiredLSN() only
when at least one slot gets its last_saved_restart_lsn updated. That avoids
the assertion failure and also saves some cycles when no slot has its
last_saved_restart_lsn updated.

Based on ideas from Dilip Kumar <[email protected]> and Hayato Kuroda
<[email protected]>.

Reported-by: Zhijie Hou <[email protected]>
Discussion: https://postgr.es/m/OS0PR01MB5716BB506AF934376FF3A8BB947BA%40OS0PR01MB5716.jpnprd01.prod.outlook.com
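A self-contained sketch of the fix's control flow; the struct and the "save"
step are heavily simplified stand-ins for slot.c, not the real code:

    #include <stdbool.h>
    #include <stdio.h>

    typedef unsigned long long XLogRecPtr;

    typedef struct ReplicationSlot
    {
        XLogRecPtr restart_lsn;
        XLogRecPtr last_saved_restart_lsn;
    } ReplicationSlot;

    static void ReplicationSlotsComputeRequiredLSN(void)
    {
        printf("recomputing oldest required LSN\n");
    }

    static void CheckPointReplicationSlots_sketch(ReplicationSlot *slots, int nslots)
    {
        bool updated = false;

        for (int i = 0; i < nslots; i++)
        {
            /* "saving" the slot advances last_saved_restart_lsn */
            if (slots[i].last_saved_restart_lsn != slots[i].restart_lsn)
            {
                slots[i].last_saved_restart_lsn = slots[i].restart_lsn;
                updated = true;
            }
        }

        /*
         * With max_replication_slots == 0 the loop body never runs, so the
         * call below (which would otherwise trip an assertion) is skipped.
         */
        if (updated)
            ReplicationSlotsComputeRequiredLSN();
    }

    int main(void)
    {
        CheckPointReplicationSlots_sketch(NULL, 0);   /* prints nothing */
        return 0;
    }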
13 days | Prevent excessive delays before launching new logrep workers. | Tom Lane

The logical replication launcher process would sometimes sleep for as much as
3 minutes before noticing that it is supposed to launch a new worker. This
could happen if (1) WaitForReplicationWorkerAttach absorbed a process latch
wakeup that was meant to cause ApplyLauncherMain to do work, or (2)
logicalrep_worker_launch reported failure, either because of resource limits
or because the new worker terminated immediately. In case (2), the expected
behavior is that we retry the launch after wal_retrieve_retry_interval, but
that didn't reliably happen.

It's not clear how often such conditions would occur in the field, but in our
subscription test suite they are somewhat common, especially in tests that
exercise cases that cause quick worker failure. That causes the tests to take
substantially longer than they ought to do on typical setups.

To fix (1), make WaitForReplicationWorkerAttach re-set the latch before
returning if it cleared it while looping. To fix (2), ensure that we reduce
wait_time to no more than wal_retrieve_retry_interval when
logicalrep_worker_launch reports failure. In passing, fix a couple of
perhaps-hypothetical race conditions, e.g. examining worker->in_use without a
lock.

Backpatch to v16. Problem (2) didn't exist before commit 5a3a95385 because
the previous code always set wait_time to wal_retrieve_retry_interval when
launching a worker, regardless of success or failure of the launch. That
behavior also greatly mitigated problem (1), so I'm not excited about
adapting the remainder of the patch to the substantially-different code in
older branches.

Author: Tom Lane <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Ashutosh Bapat <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16
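A self-contained sketch of the two fixes; the latch is reduced to a boolean
and the wait loop to its bare essentials, so this only illustrates the shape
of the change:

    #include <stdbool.h>

    #define Min(a, b) ((a) < (b) ? (a) : (b))

    static bool latch_is_set;                       /* stand-in for MyLatch */
    static void SetLatch(void)   { latch_is_set = true; }
    static void ResetLatch(void) { latch_is_set = false; }

    static const long wal_retrieve_retry_interval = 5000;  /* ms */

    /*
     * Fix (1): if we consumed a latch wakeup while waiting for the worker to
     * attach, re-set the latch before returning so the launcher's main loop
     * still notices the wakeup it was owed.
     */
    static void WaitForReplicationWorkerAttach_sketch(void)
    {
        bool dropped_latch = false;

        while (latch_is_set)        /* simplified wait loop */
        {
            ResetLatch();
            dropped_latch = true;
        }
        if (dropped_latch)
            SetLatch();
    }

    /*
     * Fix (2): after a failed launch, cap the sleep so the retry happens
     * after wal_retrieve_retry_interval at the latest.
     */
    static long next_wait_time(bool launch_failed, long wait_time)
    {
        return launch_failed ? Min(wait_time, wal_retrieve_retry_interval)
                             : wait_time;
    }

    int main(void)
    {
        SetLatch();
        WaitForReplicationWorkerAttach_sketch();
        return (latch_is_set && next_wait_time(true, 180000) == 5000) ? 0 : 1;
    }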
14 days | Fix missing comment update in 1462aad2e4. | Amit Kapila

Remove the part of the comment that says we don't allow toggling the
two_phase option, as that has been supported since commit 1462aad2e4.

Author: Hayato Kuroda <[email protected]>
Author: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB1496656725F3951AEE8749EBDF579A@OSCPR01MB14966.jpnprd01.prod.outlook.com
14 days | Remove excess assert from InvalidatePossiblyObsoleteSlot() | Alexander Korotkov

ca307d5cec90 introduced keeping WAL segments by the slot's last saved restart
LSN. It also added an assertion that the slot's restart LSN never goes
backward. However, situations where the restart LSN goes backward have been
spotted by buildfarm animals and investigated in the thread.

When pg_receivewal starts replication, it sets the last replayed LSN to the
beginning of the segment, which is older than what
ReplicationSlotReserveWal() set for the slot. A similar situation can happen
with pg_basebackup. Also, when a standby reconnects to the primary, it sends
the last replayed LSN, which might be older than the last confirmed flush
LSN. In both these situations, a concurrent checkpoint may trigger the
assertion.

Based on ideas from Vitaly Davydov <[email protected]>, Hayato Kuroda
(Fujitsu) <[email protected]>, Vignesh C <[email protected]>,
Amit Kapila <[email protected]>.

Reported-by: Vignesh C <[email protected]>
Reported-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CALDaNm3s-jpQTe1MshsvQ8GO%3DTLj233JCdkQ7uZ6pwqRVpxAdw%40mail.gmail.com
Reviewed-by: Vignesh C <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
2025-06-19 | Improve log messages and docs for slot synchronization. | Amit Kapila

Improve the clarity of LOG messages when synchronization of a failover
logical slot fails, making the reasons more explicit for easier debugging.

Update the documentation to outline scenarios where slot synchronization can
fail, especially during the initial sync, and emphasize that
pg_sync_replication_slots() is primarily intended for testing and debugging
purposes.

We also discussed improving the functionality of pg_sync_replication_slots()
so that it can be used reliably, but we would take up that work for the next
version after some more discussion and review.

Reported-by: Suraj Kharage <[email protected]>
Author: shveta malik <[email protected]>
Reviewed-by: Zhijie Hou <[email protected]>
Reviewed-by: Peter Smith <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/CAF1DzPWTcg+m+x+oVVB=y4q9=PYYsL_mujVp7uJr-_oUtWNGbA@mail.gmail.com
2025-06-17 | Fix re-distributing previously distributed invalidation messages during logical decoding. | Masahiko Sawada

Commit 4909b38af0 introduced logic to distribute invalidation messages from
catalog-modifying transactions to all concurrent in-progress transactions.
However, since each transaction distributes not only its original
invalidation messages but also previously distributed messages to other
transactions, this leads to an exponential increase in the allocation request
size for invalidation messages, ultimately causing memory allocation failure.

This commit fixes the issue by tracking distributed invalidation messages
separately per decoded transaction and not redistributing these messages to
other in-progress transactions. The maximum size of distributed invalidation
messages that one transaction can store is limited to
MAX_DISTR_INVAL_MSG_PER_TXN (8MB). Once the size of the distributed
invalidation messages exceeds this threshold, we invalidate all caches in
locations where distributed invalidation messages need to be executed.

Back-patch to all supported versions to which the fix in commit 4909b38af0
was applied.

Note that this commit adds two new fields to ReorderBufferTXN to store the
distributed invalidation messages. This change breaks ABI compatibility in
back branches, affecting third-party extensions that depend on the size of
the ReorderBufferTXN struct, though this scenario seems unlikely.
Additionally, it adds a new flag to the txn_flags field of ReorderBufferTXN
to indicate distributed invalidation message overflow. This should not affect
existing implementations, as it is unlikely that third-party extensions use
unused bits in the txn_flags field.

Bug: #18938 #18942
Author: vignesh C <[email protected]>
Reported-by: Duncan Sands <[email protected]>
Reported-by: John Hutchins <[email protected]>
Reported-by: Laurence Parry <[email protected]>
Reported-by: Max Madden <[email protected]>
Reported-by: Braulio Fdo Gonzalez <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Hayato Kuroda <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/CAD1FGCT2sYrP_70RTuo56QTizyc+J3wJdtn2gtO3VttQFpdMZg@mail.gmail.com
Discussion: https://postgr.es/m/CANO2=B=2BT1hSYCE=nuuTnVTnjidMg0+-FfnRnqM6kd23qoygg@mail.gmail.com
Backpatch-through: 13
2025-06-14 | Add TAP tests to check replication slot advance during the checkpoint | Alexander Korotkov

The new tests verify that logical and physical replication slots are still
valid after an immediate restart on checkpoint completion when the slot was
advanced during the checkpoint.

This commit introduces two new injection points to make these tests possible:

* checkpoint-before-old-wal-removal - triggered in the checkpointer process
  just before old WAL segments cleanup;
* logical-replication-slot-advance-segment - triggered in
  LogicalConfirmReceivedLocation() when restart_lsn was changed enough to
  point to the next WAL segment.

Discussion: https://postgr.es/m/flat/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov <[email protected]>
Author: Tomas Vondra <[email protected]>
Reviewed-by: Alexander Korotkov <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 17
2025-06-14 | Keep WAL segments by slot's last saved restart LSN | Alexander Korotkov

The patch fixes the issue with the unexpected removal of old WAL segments
after checkpoint, followed by an immediate restart. The issue occurs when a
slot is advanced after the start of the checkpoint and before old WAL
segments are removed at the end of the checkpoint.

The patch introduces a new in-memory state for slots:
last_saved_restart_lsn, which is used to calculate the oldest LSN for
removing WAL segments. This state is updated every time with the current
restart_lsn at the moment when the slot is saved to disk.

This fix changes the shared memory layout. It's applied to HEAD only because
we don't have to preserve ABI compatibility during the beta stage. Another
fix that doesn't affect the ABI is committed to back branches.

Discussion: https://postgr.es/m/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov <[email protected]>
Author: Alexander Korotkov <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
2025-06-06 | Use NULL instead of 0 for pointer arguments. | Nathan Bossart

Commit 5fe08c006c fixed this for calls to dshash_create(). This commit fixes
calls to dshash_attach() and dsa_create_in_place().

Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/aECi_gSD9JnVWQ8T%40nathan
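A tiny self-contained illustration of the style issue; the function here is a
hypothetical stand-in, since the real dshash/dsa signatures take several more
arguments:

    #include <stddef.h>

    /* hypothetical stand-in for a function taking a pointer argument */
    static void attach(void *private_data) { (void) private_data; }

    int main(void)
    {
        attach(0);     /* legal C, but obscures that the argument is a pointer */
        attach(NULL);  /* preferred: the intent is explicit */
        return 0;
    }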
2025-06-02 | Use replay LSN as target for cascading logical WAL senders | Michael Paquier

A cascading WAL sender doing logical decoding (that is, one doing its work on
a standby) has been using as its flush LSN the value returned by
GetStandbyFlushRecPtr() (the last position safely flushed to disk). This is
incorrect, as such processes are only able to decode changes up to the LSN
that has been replayed by the startup process.

This commit changes cascading logical WAL senders to use the replay LSN, as
returned by GetXLogReplayRecPtr(). This distinction is important particularly
during shutdown, when WAL senders need to send any remaining available data
to their clients, switching WAL senders to a caught-up state. Using the
latest flush LSN rather than the replay LSN could cause the WAL senders to be
stuck in an infinite loop preventing them from shutting down, as the startup
process does not run when WAL senders attempt to catch up, so they could keep
waiting for work that would never happen.

Backpatch down to v16, where logical decoding on standbys has been
introduced.

Author: Alexey Makhmutov <[email protected]>
Reviewed-by: Ajin Cherian <[email protected]>
Reviewed-by: Bertrand Drouvot <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16
2025-05-30 | Ensure we have a snapshot when updating various system catalogs. | Nathan Bossart

A few places that access system catalogs don't set up an active snapshot
before potentially accessing their TOAST tables. To fix, push an active
snapshot just before each section of code that might require accessing one of
these TOAST tables, and pop it shortly afterwards. While at it, this commit
adds some rather strict assertions in an attempt to prevent such issues in
the future.

Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST table in
order to fix the same problem for that catalog. On the back-branches, those
bugs are left in place. We cannot easily remove a catalog's TOAST table on
released major versions, and only replication origins with extremely long
names are affected. Given the low severity of the issue, fixing older
versions doesn't seem worth the trouble of significantly modifying the patch.

Also, on v13 and v14, the aforementioned strict assertions have been omitted
because commit 2776922201, which added HaveRegisteredOrActiveSnapshot(), was
not back-patched. While we could probably back-patch it now, I've opted
against it because it seems unlikely that new TOAST snapshot issues will be
introduced in the oldest supported versions.

Reported-by: Alexander Lakhin <[email protected]>
Reviewed-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
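A sketch of the pattern the commit applies; PushActiveSnapshot(),
GetTransactionSnapshot() and PopActiveSnapshot() are the real snapmgr.c
calls, but the fragment is illustrative and not compilable outside the
backend:

    /*
     * Hold an active snapshot across catalog access that might detoast
     * out-of-line values; TOAST fetches need a snapshot to be valid.
     */
    PushActiveSnapshot(GetTransactionSnapshot());

    /* ... scan or update the catalog here ... */

    PopActiveSnapshot();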
2025-05-19 | Don't retreat slot's confirmed_flush LSN. | Amit Kapila

Prevent moving the confirmed_flush LSN backwards, as this could lead to data
duplication issues caused by replicating already replicated changes.

This can happen when a client acknowledges an LSN it doesn't have to do
anything for, and thus didn't store persistently. After a restart, the client
can send the prior LSN that it stored persistently as an acknowledgement, but
we need to ignore such an LSN to avoid retreating the confirmed_flush LSN.

Diagnosed-by: Zhijie Hou <[email protected]>
Author: shveta malik <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Tested-by: Nisha Moond <[email protected]>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAJpy0uDZ29P=BYB1JDWMCh-6wXaNqMwG1u1mB4=10Ly0x7HhwQ@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57164AB5716AF2E477D53F6F9489A@OS0PR01MB5716.jpnprd01.prod.outlook.com
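A self-contained sketch of the guard; the struct is a simplified stand-in for
the slot's persistent data, not the real ReplicationSlotPersistentData:

    #include <stdio.h>

    typedef unsigned long long XLogRecPtr;

    typedef struct { XLogRecPtr confirmed_flush; } SlotData;

    /* only advance confirmed_flush, never retreat it */
    static void confirm_received(SlotData *slot, XLogRecPtr lsn)
    {
        if (lsn <= slot->confirmed_flush)
            return;               /* stale or duplicate ack: ignore */
        slot->confirmed_flush = lsn;
    }

    int main(void)
    {
        SlotData s = { .confirmed_flush = 0x30000000ULL };

        confirm_received(&s, 0x20000000ULL);   /* older ack is ignored */
        printf("%llX\n", s.confirmed_flush);   /* prints 30000000 */
        return 0;
    }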
2025-05-07 | Remove pg_replication_origin's TOAST table. | Nathan Bossart

A few places that access this catalog don't set up an active snapshot before
potentially accessing its TOAST table. However, roname (the replication
origin name) is the only varlena column, so this is only a problem if the
name requires out-of-line storage. This commit removes the catalog's TOAST
table to avoid needing to set up a snapshot. It also places a limit on
replication origin names so that attempts to set long names will fail with a
more user-friendly error. The chosen limit of 512 bytes should be sufficient
to avoid "row is too big" errors independent of BLCKSZ, but it should also be
lenient enough for all reasonable use-cases.

Bumps catversion.

Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Euler Taveira <[email protected]>
Reviewed-by: Nisha Moond <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
2025-04-29 | Fix assertion failure during decoding from synced slots. | Amit Kapila

The slot synchronization skips updating the confirmed_flush LSN of the local
slot if the local slot has a newer catalog_xmin or restart_lsn, but still
allows updating the two_phase and two_phase_at fields of the slot. This opens
a window for prepared transactions between the old confirmed_flush LSN and
two_phase_at to unexpectedly get decoded and sent to the downstream after
promotion. Then, while decoding the commit prepared record, the assertion
that expects the prepare to not have been sent to the downstream will fail.

The fix is to skip updating the other slot fields when we skip updating the
confirmed_flush LSN of the slot.

We didn't backpatch this commit as two_phase_at was not synced in back
branches, which means prepared transactions won't be unexpectedly sent to the
downstream.

We discovered this problem while analyzing a BF failure reported in the
discussion link. Reliably reproducing this issue without a debugger is
difficult. Given its rarity, adding a specific injection point to test it
doesn't seem worthwhile, so we won't be adding a dedicated test case.

Author: Zhijie Hou <[email protected]>
Reviewed-by: shveta malik <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/OS0PR01MB5716B44052000EB91EFAE60E94BC2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-28 | Fix xmin advancement during fast_forward decoding. | Amit Kapila

During logical decoding, we advance the catalog_xmin of the logical slot too
early in fast_forward mode, resulting in required catalog data being removed
by vacuum. This mode is normally used to advance the slot without processing
the changes, but we still can't let the slot's xmin advance to an incorrect
value.

Commit f49a80c481 fixed a similar issue where the logical slot's catalog_xmin
was getting advanced prematurely during non-fast-forward mode. During
xl_running_xacts processing, instead of directly advancing the slot's xmin to
the oldest running xid in the record, it allowed the xmin to be held back for
snapshots that can be used for not-yet-replayed transactions, as those might
consider older txns as running too. However, it missed the fact that the same
problem can happen during fast_forward mode decoding, as we won't build a
base snapshot in that mode, and a future call to get_changes from the same
slot can miss seeing the required catalog changes, leading to incorrect
results.

This commit allows building the base snapshot even in fast_forward mode to
prevent the early advancement of xmin.

Reported-by: Amit Kapila <[email protected]>
Author: Zhijie Hou <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: shveta malik <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAA4eK1LqWncUOqKijiafe+Ypt1gQAQRjctKLMY953J79xDBgAg@mail.gmail.com
Discussion: https://postgr.es/m/OS0PR01MB57163087F86621D44D9A72BF94BB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-04-23 | Fix an oversight in 3f28b2fcac. | Amit Kapila

Commit 3f28b2fcac tried to ensure that the replication origin shouldn't be
advanced in case of an ERROR in the apply worker, so that it can request the
same data again after restart. However, it is possible that an ERROR was
caught and handled by a (say PL/pgSQL) function, and the apply worker
continues to apply further changes, in which case we shouldn't reset the
replication origin.

Ensure to reset the origin only when the apply worker exits after an ERROR.

Commit 3f28b2fcac added a new function, geterrlevel, which we remove in HEAD
as part of this commit but keep in back-branches to avoid breaking any
applications. A separate case can be made to have such a function even for
HEAD.

Reported-by: Shawn McCoy <[email protected]>
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CALsgZNCGARa2mcYNVTSj9uoPcJo-tPuWUGECReKpNgTpo31_Pw@mail.gmail.com
2025-04-21 | Use the same cmd_context throughout a walsender's lifetime. | Tom Lane

exec_replication_command created a cmd_context to work in and then deleted it
on exit. This is pretty dangerous because some replication commands
start/finish transactions. In the wake of commit 1afe31f03, that could lead
to re-selecting a CurrentMemoryContext that's already been deleted, leading
to hilarity such as a memory context that is its own parent.

To fix, let's make the cmd_context persist across exec_replication_command
calls; instead of deleting it, we'll just reset it each time. In this way it
retains the same identity and there's no problem if transaction abort
restores it as the working context. It probably even saves a few microseconds
to do this.

This fix also ensures that exec_replication_command returns to the caller
(PostgresMain) with the same context active that had been when it was called
(probably MessageContext). The previous coding could get that wrong too.

Reported-by: Anthonin Bonnefoy <[email protected]>
Author: Anthonin Bonnefoy <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/CAO6_XqoJA7-_G6t7Uqe5nWF3nj+QBGn4F6Ptp=rUGDr0zo+KvA@mail.gmail.com
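A sketch of the create-once-then-reset idiom, using the real memory-context
API (AllocSetContextCreate, MemoryContextReset, MemoryContextSwitchTo); the
static variable and surrounding control flow are simplified, not lifted from
walsender.c:

    static MemoryContext cmd_context = NULL;

    /*
     * Create the context once, on first use; afterwards just reset it. The
     * context keeps the same identity across commands, so transaction abort
     * can safely restore it as the working context.
     */
    if (cmd_context == NULL)
        cmd_context = AllocSetContextCreate(TopMemoryContext,
                                            "Replication command context",
                                            ALLOCSET_DEFAULT_SIZES);
    else
        MemoryContextReset(cmd_context);

    old_context = MemoryContextSwitchTo(cmd_context);
    /* ... parse and execute the replication command ... */
    MemoryContextSwitchTo(old_context);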
2025-04-11 | Fix race with synchronous_standby_names at startup | Michael Paquier

synchronous_standby_names cannot be reloaded safely by backends, and the
checkpointer is in charge of updating a state in shared memory if the GUC is
enabled in WalSndCtl, to let the backends know if they should wait or not for
a given LSN. This provides a strict control on the timing of the waiting
queues if the GUC is enabled or disabled, then reloaded. The checkpointer is
also in charge of waking up the backends that could be waiting for a LSN when
the GUC is disabled.

This logic had a race condition at startup, where it would be possible for
backends to not wait for a LSN even if synchronous_standby_names is enabled.
This would cause visibility issues with transactions that should have been
waited for but were not. The problem lasts until the checkpointer does its
initial update of the shared memory state when it loads
synchronous_standby_names.

In order to take care of this problem, the shared memory state in WalSndCtl
is extended to detect if it has been initialized by the checkpointer, and not
only check if synchronous_standby_names is defined. In WalSndCtlData,
sync_standbys_defined is renamed to sync_standbys_status, a bits8 able to
know about two states:
- If the shared memory state has been initialized. This flag is set by the
  checkpointer at startup once, and never removed.
- If synchronous_standby_names is known as defined in the shared memory
  state. This is the same as the previous sync_standbys_defined in WalSndCtl.
This method gives a way for backends to decide what they should do until the
shared memory area is initialized, and they now ultimately fall back to a
check on the GUC value in this case, which is the best thing that can be
done.

Fortunately, SyncRepUpdateSyncStandbysDefined() is called immediately by the
checkpointer when this process starts, so the window is very narrow. It is
possible to enlarge the problematic window by making the checkpointer wait at
the beginning of SyncRepUpdateSyncStandbysDefined() with a hardcoded sleep,
for example, and doing so has shown that a 2PC visibility test is indeed
failing. On machines slow enough, this bug would cause spurious failures.

In 17~, we have looked at the possibility of adding an injection point to
have a reproducible test, but as the problematic window happens at early
startup, we would need to invent a way to make an injection point optionally
persistent across restarts when attached, something that would be fine for
this case as it would involve the checkpointer. This issue is quite old, and
can be reproduced on all the stable branches.

Author: Melnikov Maksim <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 13
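A self-contained sketch of the two-state flag; the flag names here are
assumptions modeled on the description above, not necessarily the exact
identifiers used in the tree:

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint8_t bits8;

    #define SYNC_STANDBY_INIT    0x01  /* checkpointer initialized the state */
    #define SYNC_STANDBY_DEFINED 0x02  /* synchronous_standby_names defined */

    /*
     * Decide whether a backend should wait for a given LSN. Until the
     * checkpointer has initialized the shared state, fall back to the GUC
     * value instead of assuming "not defined".
     */
    static bool should_wait(bits8 sync_standbys_status, bool guc_defined)
    {
        if ((sync_standbys_status & SYNC_STANDBY_INIT) == 0)
            return guc_defined;
        return (sync_standbys_status & SYNC_STANDBY_DEFINED) != 0;
    }

    int main(void)
    {
        /* startup window: state not initialized yet, GUC says to wait */
        return should_wait(0, true) ? 0 : 1;
    }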
2025-04-10 | Improve various new-to-v18 appendStringInfo calls | David Rowley

Similar to 8461424fd, here we adjust a few new locations which were not using
the most suitable appendStringInfo* function for the intended purpose.

Author: David Rowley <[email protected]>
Discussion: https://postgr.es/m/CAApHDvqJnNjueb=Eoj8K+8n0g7nj_AcPWSiCj5RNV4fDejAfqA@mail.gmail.com
2025-04-10 | Fix data loss in logical replication. | Amit Kapila

Data loss can happen when DDLs like ALTER PUBLICATION ... ADD TABLE ... or
ALTER TYPE ..., which don't take a strong lock on the table, happen
concurrently with DMLs on the tables involved in the DDL. This happens
because logical decoding doesn't distribute invalidations to concurrent
transactions, and those transactions use stale cache data to decode the
changes. The problem becomes bigger because we keep using the stale cache
even after those in-progress transactions are finished and skip the changes
required to be sent to the client.

This commit fixes the issue by distributing invalidation messages from
catalog-modifying transactions to all concurrent in-progress transactions.
This allows the necessary rebuild of the catalog cache when decoding new
changes after concurrent DDL.

We observed performance regression primarily during frequent execution of
*publication DDL* statements that modify the published tables. The regression
is minor or nearly nonexistent for DDLs that do not affect the published
tables or occur infrequently, making this a worthwhile cost to resolve a
longstanding data loss issue.

An alternative approach considered was to take a strong lock on each affected
table during publication modification. However, this would only address
issues related to publication DDLs (but not ALTER TYPE ...) and would require
locking every relation in the database for publications created as FOR ALL
TABLES, which is impractical.

The bug exists in all supported branches, but we are backpatching only down
to 14. The fix for 13 requires somewhat bigger changes than this fix, so the
fix for that branch is still under discussion.

Reported-by: hubert depesz lubaczewski <[email protected]>
Reported-by: Tomas Vondra <[email protected]>
Author: Shlok Kyal <[email protected]>
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Zhijie Hou <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Tested-by: Benoit Lobréau <[email protected]>
Backpatch-through: 14
Discussion: https://postgr.es/m/[email protected]
Discussion: https://postgr.es/m/CAD21AoAenVqiMjpN-PvGHL1N9DWnHSq673bfgr6phmBUzx=kLQ@mail.gmail.com
2025-04-08 | Fix uninitialized index information access during apply. | Amit Kapila

The issue happens when building conflict information during apply of INSERT
or UPDATE operations that violate unique constraints on leaf partitions.

The problem was introduced in commit 9ff68679b5, which removed the redundant
calls to ExecOpenIndices/ExecCloseIndices. The previous code was relying on
the redundant ExecOpenIndices call in apply_handle_tuple_routing() to build
the index information required for unique key conflict detection.

The fix is to delay building the index information until a conflict is
detected, instead of relying on ExecOpenIndices to do the same. The
additional benefit of this approach is that it avoids building index
information when there is no conflict.

Author: Hou Zhijie <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/TYAPR01MB57244ADA33DDA57119B9D26494A62@TYAPR01MB5724.jpnprd01.prod.outlook.com
2025-04-07 | Flush the IO statistics of active WAL senders more frequently | Michael Paquier

WAL senders do not flush their statistics until they exit, limiting the
monitoring possible for live processes. This is penalizing when WAL senders
are running for a long time, like in streaming or logical replication setups,
because it is not possible to know the amount of IO they generate while
running.

This commit makes WAL senders more aggressive with their statistics flush,
using an interval of 1 second, with the flush timing calculated based on the
existing GetCurrentTimestamp() done before the sleeps that wait for some
activity. Note that the sleep done for logical and physical WAL senders
happens in two different code paths, so the stats flushes need to happen in
these two places.

One test is added for the physical WAL sender case, and one for the logical
WAL sender case. This can be done in a stable fashion by relying on the WAL
generated by the TAP tests in combination with a stats reset while a server
is running, but only on HEAD as WAL data has been added to pg_stat_io in
a051e71e28a1.

This issue exists since a9c70b46dbe and the introduction of pg_stat_io, so
backpatch down to v16.

Author: Bertrand Drouvot <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Xuneng Zhou <[email protected]>
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16
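A self-contained sketch of a timestamp-driven periodic flush in the spirit of
the change; the names and the clock helper are simplified stand-ins, not the
backend's real pgstat calls:

    #include <stdint.h>
    #include <time.h>

    typedef int64_t TimestampTz;
    #define USECS_PER_SEC 1000000

    static TimestampTz GetCurrentTimestamp_sketch(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return (TimestampTz) ts.tv_sec * USECS_PER_SEC + ts.tv_nsec / 1000;
    }

    /* Flush WAL sender IO statistics at most once per second. */
    static void maybe_flush_stats(TimestampTz now, TimestampTz *last_flush)
    {
        if (now - *last_flush >= 1 * USECS_PER_SEC)
        {
            /* pgstat flush of the IO counters would go here in the backend */
            *last_flush = now;
        }
    }

    int main(void)
    {
        TimestampTz last = 0;

        /* called just before each sleep in the WAL sender loops */
        maybe_flush_stats(GetCurrentTimestamp_sketch(), &last);
        return 0;
    }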
2025-04-04 | Use standard die() signal handler in walreceiver | Heikki Linnakangas

This gets rid of the bespoke ProcessWalRcvInterrupts() function, which lets
walreceiver terminate at any CHECK_FOR_INTERRUPTS() call. And it's less code
anyway.

We can now use the standard libpqsrv_connect_params() libpq wrapper from
libpq-be-fe-helpers.h, removing more code. We attempted to do that earlier
already in commit 728f86fec6, but that was reverted because it didn't call
ProcessWalRcvInterrupts() and therefore didn't react to shutdown requests.
Now that ProcessWalRcvInterrupts() is gone, it works. As stated in that
commit, this also leads to libpqwalreceiver reserving file descriptors for
libpq connections, which is nice.

Author: Andres Freund <[email protected]> (the earlier commit)
Author: Kyotaro Horiguchi <[email protected]>
Reviewed-by: Fujii Masao <[email protected]>
Reviewed-by: Yura Sokolov <[email protected]>
2025-04-03 | Restrict copying of invalidated replication slots. | Masahiko Sawada

Previously, invalidated logical and physical replication slots could be
copied using the pg_copy_logical_replication_slot and
pg_copy_physical_replication_slot functions. Replication slots that were
invalidated for reasons other than WAL removal retained their restart_lsn.
This meant that a new slot copied from an invalidated slot could have a
restart_lsn pointing to a WAL segment that might have already been removed.

This commit restricts the copying of invalidated replication slots.

Backpatch to v16, where slots could retain their restart_lsn when invalidated
for reasons other than WAL removal. For v15 and earlier, this check is not
required since slots can only be invalidated due to WAL removal, and existing
checks already handle this issue.

Author: Shlok Kyal <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Reviewed-by: Zhijie Hou <[email protected]>
Reviewed-by: Peter Smith <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/CANhcyEU65aH0VYnLiu%3DOhNNxhnhNhwcXBeT-jvRe1OiJTo_Ayg%40mail.gmail.com
Backpatch-through: 16
2025-04-03 | Fix slot synchronization for two_phase enabled slots. | Amit Kapila

The issue is that transactions prepared before two-phase decoding is enabled
can fail to replicate to the subscriber after being committed on a promoted
standby following a failover. This is because the two_phase_at field of a
slot, which tracks the LSN from which two-phase decoding starts, is not
synchronized to standby servers. Without two_phase_at, the logical decoding
might incorrectly identify prepared transactions as already replicated to the
subscriber after promotion of the standby server, causing them to be skipped.

To address the issue on HEAD, the two_phase_at field of the slot is now
exposed by the pg_replication_slots view, which allows slot synchronization
to copy this value to the corresponding synced slot on the standby server.

This bug is likely to occur if the user toggles the two_phase option to true
after initial slot creation. Given that altering the two_phase option of a
replication slot is not allowed in PostgreSQL 17, this bug is less likely to
occur there. We can't change the view/function definition in back branches,
so we can't push the same fix, but we are brainstorming an appropriate
solution for PG17.

Author: Zhijie Hou <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Discussion: https://postgr.es/m/TYAPR01MB5724CC7C288535BBCEEE65DA94A72@TYAPR01MB5724.jpnprd01.prod.outlook.com
2025-03-29 | Use PRI?64 instead of "ll?" in format strings (continued). | Peter Eisentraut

Continuation of work started in commit 15a79c73, after initial trial.

Author: Thomas Munro <[email protected]>
Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org
2025-03-27 | Fix guc_malloc calls for consistency and OOM checks | Daniel Gustafsson

check_createrole_self_grant and check_synchronized_standby_slots were
allocating memory at the LOG elevel without checking whether the allocation
succeeded, which would have led to a segfault on allocation failure.

On top of that, a number of callsites were using the ERROR level, relying on
erroring out rather than returning false to let the GUC machinery handle it
gracefully. Other callsites used WARNING instead of LOG. While neither of
those is strictly wrong, this changes all check_ functions to do it
consistently with LOG.

init_custom_variable gets a promoted elevel to FATAL to keep the guc_malloc
error handling in line with the rest of the error handling in that function,
which already calls FATAL. If we encounter an OOM in this callsite there is
no graceful handling to be had; better to error out hard.

Backpatch the fix to check_createrole_self_grant down to v16 and the fix to
check_synchronized_standby_slots down to v17, where they were introduced.

Author: Daniel Gustafsson <[email protected]>
Reported-by: Nikita <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Bug: #18845
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 16
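A sketch of the corrected pattern in a string-GUC check hook; guc_malloc
returns NULL on failure when the elevel is below ERROR, but the hook body
here is illustrative only, not taken from the patched functions:

    static bool
    check_some_guc(char **newval, void **extra, GucSource source)
    {
        char *copy;

        /* LOG elevel: guc_malloc returns NULL on OOM instead of erroring */
        copy = guc_malloc(LOG, strlen(*newval) + 1);
        if (copy == NULL)
            return false;       /* let the GUC machinery handle the failure */

        strcpy(copy, *newval);
        /* ... validate the value, then hand results back via *extra ... */
        guc_free(copy);
        return true;
    }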
2025-03-26 | Use PG_MODULE_MAGIC_EXT in our installable shared libraries. | Tom Lane

It seems potentially useful to label our shared libraries with version
information, now that a facility exists for retrieving that. This patch
labels them with the PG_VERSION string. There was some discussion about using
semantic versioning conventions, but that doesn't seem terribly helpful for
modules with no SQL-level presence; and for those that do have SQL objects,
we typically expect them to support multiple revisions of the SQL
definitions, so it'd still not be very helpful.

I did not label any of src/test/modules/. It seems unnecessary since we don't
install those, and besides there ought to be someplace that still provides
test coverage for the original PG_MODULE_MAGIC macro.

Author: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/[email protected]
2025-03-25 | Fix an oversight in 3abe9dc188. | Amit Kapila

Forgot to update the comment atop one of the functions.

Author: Hayato Kuroda <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB1496623BE1125B44614494E7AF5A72@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-24 | Detect and Log multiple_unique_conflicts type conflict. | Amit Kapila

Introduce a new conflict type, multiple_unique_conflicts, to handle cases
where an incoming row during logical replication violates multiple UNIQUE
constraints. Previously, the apply worker detected and reported only the
first encountered key conflict (insert_exists/update_exists), causing
repeated failures as each constraint violation needed to be handled one by
one, making the process slow and error-prone.

With this patch, the apply worker checks all unique constraints upfront once
the first key conflict is detected and reports multiple_unique_conflicts if
multiple violations exist. This allows users to resolve all conflicts at once
by deleting all conflicting tuples rather than dealing with them individually
or skipping the transaction.

In the future, this will also allow us to specify different resolution
handlers for such a conflict type.

Add the stats for this conflict type in pg_stat_subscription_stats.

Author: Nisha Moond <[email protected]>
Author: Zhijie Hou <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Peter Smith <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Discussion: https://postgr.es/m/CABdArM7FW-_dnthGkg2s0fy1HhUB8C3ELA0gZX1kkbs1ZZoV3Q@mail.gmail.com
2025-03-21 | Add GUC option to control maximum active replication origins. | Masahiko Sawada

This commit introduces a new GUC option max_active_replication_origins to
control the maximum number of active replication origins. Previously, this
was controlled by 'max_replication_slots'. Having a separate GUC option
provides better flexibility for setting up subscribers, as they may not
require replication slots (for cascading replication) but always require
replication origins.

Author: Euler Taveira <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: vignesh C <[email protected]>
Discussion: https://postgr.es/m/[email protected]
2025-03-17 | aio: Basic subsystem initialization | Andres Freund

This commit just does the minimal wiring up of the AIO subsystem, added in
the next commit, to the rest of the system. The next commit contains more
details about motivation and architecture.

This commit is kept separate to make it easier to review, separating the
changes across the tree from the implementation of the new subsystem.

We discussed squashing this commit with the main commit before merging AIO,
but there has been a mild preference for keeping it separate.

Reviewed-by: Heikki Linnakangas <[email protected]>
Reviewed-by: Noah Misch <[email protected]>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-14 | Fix ALTER SUBSCRIPTION ... SET PUBLICATION ... command. | Amit Kapila

The problem is that ALTER SUBSCRIPTION ... SET PUBLICATION ... leads to a
restart of the apply worker, and after the restart, the apply worker will use
the existing slot and replication origin corresponding to the subscription.
Now, it is possible that before the restart the origin has not been updated,
and the WAL start location points to a position before the publication
specified by SET PUBLICATION was created, which can lead to an error like:
"ERROR: publication "pub1" does not exist". Once this error occurs, the apply
worker will never be able to proceed and will always return the same error.

We decided to skip loading the publication if the publication does not exist.
The publication is loaded later, and the relation entry is updated, once the
publication gets created.

We decided not to backpatch this, as this is a behaviour change and we don't
see field reports. This problem has been found by intermittent buildfarm
failures.

Author: vignesh C <[email protected]>
Reviewed-by: Dilip Kumar <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Masahiko Sawada <[email protected]>
Discussion: https://postgr.es/m/flat/CALDaNm0-n8FGAorM%2BbTxkzn%2BAOUyx5%3DL_XmnvOP6T24%2B-NcBKg%40mail.gmail.com
Discussion: https://postgr.es/m/CAA4eK1+T-ETXeRM4DHWzGxBpKafLCp__5bPA_QZfFQp7-0wj4Q@mail.gmail.com
2025-03-13 | pg_noreturn to replace pg_attribute_noreturn() | Peter Eisentraut

We want to support a "noreturn" decoration on more compilers besides just
GCC-compatible ones, but for that we need to move the decoration in front of
the function declaration instead of either behind it or wherever, which is
the current style afforded by GCC-style attributes. Also rename the macro to
"pg_noreturn" to be similar to the C11 standard "noreturn".

pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as before),
as well as MSVC (using __declspec). (When PostgreSQL requires C11, the latter
two variants can be dropped.)

Now, all supported compilers effectively support pg_noreturn, so the extra
code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.

This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of

    #define pg_attribute_noreturn() __attribute__((noreturn))

would cause an error.

Note that the C standard does not support a noreturn attribute on function
pointer types. So we have to drop these here. There are only two instances at
this time, so it's not a big loss. In one case, we can make up for it by
adding the pg_noreturn to a wrapper function and adding a pg_unreachable();
in the other case, the latter was already done before.

Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
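A sketch of how such a macro can be layered across those compilers; the
actual definition in c.h may differ in detail:

    #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
    #define pg_noreturn _Noreturn
    #elif defined(__GNUC__)
    #define pg_noreturn __attribute__((noreturn))
    #elif defined(_MSC_VER)
    #define pg_noreturn __declspec(noreturn)
    #else
    #define pg_noreturn
    #endif

    /* The decoration now goes in front of the declaration, which is the
     * one position all three spellings accept: */
    pg_noreturn extern void proc_exit(int code);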
2025-03-13 | Avoid invalidating all RelationSyncCache entries on publication rename. | Amit Kapila

On publication rename, we need to invalidate only the RelationSyncCache
entries corresponding to relations that are part of the publication being
renamed.

As part of this patch, we introduce a new invalidation message to invalidate
the cache maintained by the logical decoding output plugin. We can't use the
existing relcache invalidation for this purpose, as that would unnecessarily
cause relcache invalidations in other backends.

This will improve performance by building fewer relation cache entries during
logical replication.

Author: Hayato Kuroda <[email protected]>
Author: Shlok Kyal <[email protected]>
Reviewed-by: Hou Zhijie <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-12 | Rename alloc/free functions in reorderbuffer.c | Heikki Linnakangas

There used to be bespoke pools for these structs to reduce the palloc/pfree
overhead, but that was ripped out a long time ago and replaced with the
generic, cheaper generational memory allocator (commit a4ccc1cef5). The
Get/Return terminology made sense with the pools, as you "got" an object from
the pool and "returned" it later, but now it just looks weird. Rename to
Alloc/Free.

Reviewed-by: Tom Lane <[email protected]>
Discussion: https://www.postgresql.org/message-id/[email protected]
2025-03-11 | pg_logicalinspect: Fix possible crash when passing a directory path. | Masahiko Sawada

Previously, pg_logicalinspect functions were too trusting of their input and
blindly passed it to SnapBuildRestoreSnapshot(). If the input pointed to a
directory, the server could hit a PANIC error while attempting to
fsync_fname() with isdir=false on a directory. This commit adds validation
checks for input filenames and passes the LSN extracted from the filename to
SnapBuildRestoreSnapshot() instead of the filename itself. It also adds
regression tests for various input patterns and permission checks.

Bug: #18828
Reported-by: Robins Tharakan <[email protected]>
Co-authored-by: Bertrand Drouvot <[email protected]>
Co-authored-by: Masahiko Sawada <[email protected]>
Discussion: https://postgr.es/m/[email protected]
2025-03-09 | Fix incorrect assertion in libpqwalreceiver | Heikki Linnakangas

Was supposed to check the length of the array, but was checking its size in
bytes.

Author: Jacob Brazeal <[email protected]>
Discussion: https://www.postgresql.org/message-id/CA%2BCOZaA_9afJxj9ZuO73U5P7WXP%2BZM9NGnZvTDCmBFz0FGP%[email protected]
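A tiny self-contained illustration of this bug class; lengthof mirrors the
helper macro in PostgreSQL's c.h:

    #include <assert.h>

    #define lengthof(array) (sizeof(array) / sizeof((array)[0]))

    int main(void)
    {
        long options[4];
        int  n = 6;                      /* out of range */

        /* buggy: sizeof(options) is 32 bytes here, so n = 6 slips through */
        assert(n < sizeof(options));

        /* correct: compares against the element count (4), so this fires */
        /* assert(n < lengthof(options)); */
        return 0;
    }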
2025-03-06 | Avoid invalidating all RelationSyncCache entries on publication change. | Amit Kapila

On a change of publication via ALTER PUBLICATION ... SET/ADD/DROP commands,
we were invalidating all the relations present in the relation sync cache
maintained by pgoutput. We need to invalidate only the relation entries that
are changed as part of the publication DDL. We have ensured that the
publication DDL execution generates the invalidations required to invalidate
the impacted relation sync entries in RelationSyncCache.

This improves performance by avoiding building the cache entries for the
cases where a publication has many tables but only one of them is dropped.

Author: Shlok Kyal <[email protected]>
Author: Hayato Kuroda <[email protected]>
Reviewed-by: Hou Zhijie <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/OSCPR01MB14966C09AA201EFFA706576A7F5C92@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-03-06 | Add more monitoring data for WAL writes in the WAL receiver | Michael Paquier

This commit adds two improvements related to the monitoring of WAL writes for
the WAL receiver.

First, write counts and timings are now counted in pg_stat_io for the WAL
receiver. These had been discarded from pg_stat_wal in ff99918c625a due to
performance concerns, related to the fact that we still relied on an on-disk
file for the stats back then, even with track_wal_io_timing to avoid the
overhead of the timestamp calculations. This implementation is simpler than
the original proposal, as it is possible to rely on the APIs of pgstat_io.c
to do the job. Like the fsync and read data, track_wal_io_timing needs to be
enabled to track the timings.

Second, a wait event is added around the pg_pwrite() call in charge of the
writes, using the existing WAIT_EVENT_WAL_WRITE. This is useful as the WAL
receiver data is tracked in pg_stat_activity.

Reviewed-by: Bertrand Drouvot <[email protected]>
Discussion: https://postgr.es/m/[email protected]
2025-03-05 | Rename some signal and interrupt handling functions for consistency | Heikki Linnakangas

The usual pattern for handling a signal is that the signal handler sets a
flag and calls SetLatch(MyLatch), and CHECK_FOR_INTERRUPTS() or other code
that is part of a wait loop calls another function to deal with it. The
naming of the functions involved was a bit inconsistent, however.
CHECK_FOR_INTERRUPTS() calls ProcessInterrupts() to do the heavy-lifting, but
the analogous functions in aux processes were called
HandleMainLoopInterrupts(), HandleStartupProcInterrupts(), etc. Similarly,
most subroutines of ProcessInterrupts() were called Process*(), but some were
called Handle*().

To make things less confusing, rename all the functions that are part of the
overall signal/interrupt handling system but are not executed in a signal
handler to e.g. ProcessSomething(), rather than HandleSomething(). The
"Process" prefix is now consistently used in the non-signal-handler
functions, and the "Handle" prefix in functions that are part of signal
handlers, except for some completely unrelated functions that clearly have
nothing to do with signal or interrupt handling.

Reviewed-by: Nathan Bossart <[email protected]>
Discussion: https://www.postgresql.org/message-id/[email protected]
2025-03-05 | Fix some gaps in pg_stat_io with WAL receiver and WAL summarizer | Michael Paquier

The WAL receiver and WAL summarizer processes each gain a call to
pgstat_report_wal(), to make sure that they report their WAL statistics to
pgstats, gathering data for pg_stat_io.

In the WAL receiver, the stats reports are timed with the status updates sent
to the primary, which depend on wal_receiver_status_interval and
wal_receiver_timeout. This is a conservative choice, but perhaps we could be
more aggressive with the frequency of the stats reports. An interesting
historical fact is that the WAL receiver does writes and syncs of WAL, but it
has never reported its statistics to pgstats in pg_stat_wal.

In the WAL summarizer, the stats reports are done each time the process waits
for WAL.

While on it, pg_stat_io is adjusted so that these two processes do not report
any rows when IOObject is not WAL, making the view easier to use with fewer
rows.

Two tests are added in TAP, checking statistics for the WAL summarizer and
the WAL receiver. Status updates in the WAL receiver are currently possible
in the recovery test 001_stream_rep.pl.

Reviewed-by: Bertrand Drouvot <[email protected]>
Discussion: https://postgr.es/m/[email protected]
2025-02-27 | Fix the race condition in ReplicationSlotAcquire(). | Amit Kapila

After commit f41d8468dd, a process could acquire and use a replication slot
that had just been invalidated, leading to failures while accessing WAL. To
ensure that we don't accidentally start using invalid slots, we must perform
the invalidation check after acquiring the slot or under the spinlock where
we associate the slot with a particular process. We choose the earlier method
to keep the code simple.

Reported-by: Hou Zhijie <[email protected]>
Author: Nisha Moond <[email protected]>
Reviewed-by: Hou Zhijie <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Discussion: https://postgr.es/m/CABdArM7J-LbGoMPGUPiFiLOyB_TZ5+YaZb=HMES0mQqzVTn8Gg@mail.gmail.com
2025-02-25 | Change relpath() et al to return path by value | Andres Freund

For AIO, and also some other recent patches, we need the ability to call
relpath() in a critical section. Until now that was not feasible, as it
allocated memory.

The fact that relpath() allocated memory also made it awkward to use in log
messages, because we had to take care to free the memory afterwards. Which we
e.g. didn't do when zeroing out an invalid buffer.

We discussed other solutions, e.g. filling a pre-allocated buffer that's
passed to relpath(), but they all came with plenty of downsides or were
larger projects. The easiest fix seems to be to make relpath() return the
path by value.

To be able to return the path by value, we need to determine the maximum
length of a relation path. This patch adds a long #define that computes the
exact maximum, which is verified to be correct in a regression test.

As this changes the signature of relpath(), extensions using it will need to
adapt their code. We discussed leaving a backward-compat shim in place, but
decided it's not worth it given that the use of relpath() doesn't seem
widespread.

Discussion: https://postgr.es/m/xeri5mla4b5syjd5a25nok5iez2kr3bm26j2qn4u7okzof2bmf@kwdh2vf7npra
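A self-contained sketch of the return-by-value idiom; the struct and constant
names are modeled on the commit description and not guaranteed to match the
tree exactly:

    #include <stdio.h>

    #define REL_PATH_STR_MAXLEN 63   /* assumed bound for this sketch */

    typedef struct RelPathStr
    {
        char str[REL_PATH_STR_MAXLEN + 1];
    } RelPathStr;

    /* No allocation: the caller receives the path in a fixed-size struct,
     * which is safe in a critical section and needs no explicit free. */
    static RelPathStr relpath_sketch(unsigned dbOid, unsigned relNumber)
    {
        RelPathStr rp;

        snprintf(rp.str, sizeof(rp.str), "base/%u/%u", dbOid, relNumber);
        return rp;
    }

    int main(void)
    {
        /* usable directly in a message, no free() afterwards */
        printf("%s\n", relpath_sketch(5, 16384).str);
        return 0;
    }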
2025-02-25 | Doc: Fix pg_copy_logical_replication_slot description. | Amit Kapila

This commit documents that the failover option is not copied when using the
pg_copy_logical_replication_slot function. In passing, we modify the comments
in the function, clarifying the reason for this behavior.

Reported-by: <[email protected]>
Author: Hou Zhijie <[email protected]>
Reviewed-by: Amit Kapila <[email protected]>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/[email protected]
2025-02-24 | Fix assertion when decoding XLOG_PARAMETER_CHANGE on promoted primary. | Masahiko Sawada

When a standby replays an XLOG_PARAMETER_CHANGE record that lowers wal_level
below logical, we invalidate all logical slots in hot standby mode. However,
if this record was replayed while not in hot standby mode, logical slots
could remain valid even after promotion, potentially causing an assertion
failure during WAL record decoding.

To fix this issue, this commit adds a check for hot_standby status when
restoring a logical replication slot on standbys. This check ensures that
logical slots are invalidated when they become incompatible due to
insufficient wal_level during recovery.

Backpatch to v16, where logical decoding on standby was introduced.

Reviewed-by: Amit Kapila <[email protected]>
Reviewed-by: Bertrand Drouvot <[email protected]>
Discussion: https://postgr.es/m/CAD21AoABoFwGY_Rh2aeE6tEq3HkJxf0c6UeOXn4VV9v6BAQPSw%40mail.gmail.com
Backpatch-through: 16
2025-02-23 | SnapBuildRestoreContents() void * argument for binary data | Peter Eisentraut

Change the internal snapbuild API function to take void * for binary data
instead of char *. This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
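A sketch of the kind of signature change involved; the prototype shapes are
assumed for illustration, not copied from snapbuild.c:

    /* before: every non-char buffer needed a cast at the call site */
    static void SnapBuildRestoreContents(int fd, char *dst, Size size,
                                         const char *path);
    SnapBuildRestoreContents(fd, (char *) &ondisk, sizeof(ondisk), path);

    /* after: void * accepts any object pointer implicitly, no cast */
    static void SnapBuildRestoreContents(int fd, void *dst, Size size,
                                         const char *path);
    SnapBuildRestoreContents(fd, &ondisk, sizeof(ondisk), path);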