From 7d0b91a28421a7928fb45e10b31e7f1f81842ba7 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 3 Dec 2024 09:27:05 +1300 Subject: [PATCH] RelationTruncate() must set DELAY_CHKPT_START. Previously, it set only DELAY_CHKPT_COMPLETE. That was important, because it meant that if the XLOG_SMGR_TRUNCATE record preceded a XLOG_CHECKPOINT_ONLINE record in the WAL, then the truncation would also happen on disk before the XLOG_CHECKPOINT_ONLINE record was written. However, it didn't guarantee that the sync request for the truncation was processed before the XLOG_CHECKPOINT_ONLINE record was written. By setting DELAY_CHKPT_START, we guarantee that if an XLOG_SMGR_TRUNCATE record is written to WAL before the redo pointer of a concurrent checkpoint, the sync request queued by that operation must be processed by that checkpoint, rather than being left for the following one. This is a refinement of commit 412ad7a5563. Back-patch to all supported releases, like that commit. Author: Robert Haas Reported-by: Thomas Munro Discussion: https://postgr.es/m/CA%2BhUKG%2B-2rjGZC2kwqr2NMLBcEBp4uf59QT1advbWYF_uc%2B0Aw%40mail.gmail.com --- src/backend/catalog/storage.c | 36 +++++++++++++++++++++++++---------- 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c index 8f74a3efdeb..b8baa3b822d 100644 --- a/src/backend/catalog/storage.c +++ b/src/backend/catalog/storage.c @@ -326,20 +326,35 @@ RelationTruncate(Relation rel, BlockNumber nblocks) RelationPreTruncate(rel); /* - * Make sure that a concurrent checkpoint can't complete while truncation - * is in progress. + * The code which follows can interact with concurrent checkpoints in two + * separate ways. * - * The truncation operation might drop buffers that the checkpoint - * otherwise would have flushed. If it does, then it's essential that - * the files actually get truncated on disk before the checkpoint record - * is written. Otherwise, if reply begins from that checkpoint, the + * First, the truncation operation might drop buffers that the checkpoint + * otherwise would have flushed. If it does, then it's essential that the + * files actually get truncated on disk before the checkpoint record is + * written. Otherwise, if reply begins from that checkpoint, the * to-be-truncated blocks might still exist on disk but have older - * contents than expected, which can cause replay to fail. It's OK for - * the blocks to not exist on disk at all, but not for them to have the - * wrong contents. + * contents than expected, which can cause replay to fail. It's OK for the + * blocks to not exist on disk at all, but not for them to have the wrong + * contents. For this reason, we need to set DELAY_CHKPT_COMPLETE while + * this code executes. + * + * Second, the call to smgrtruncate() below will in turn call + * RegisterSyncRequest(). We need the sync request created by that call to + * be processed before the checkpoint completes. CheckPointGuts() will + * call ProcessSyncRequests(), but if we register our sync request after + * that happens, then the WAL record for the truncation could end up + * preceding the checkpoint record, while the actual sync doesn't happen + * until the next checkpoint. To prevent that, we need to set + * DELAY_CHKPT_START here. That way, if the XLOG_SMGR_TRUNCATE precedes + * the redo pointer of a concurrent checkpoint, we're guaranteed that the + * corresponding sync request will be processed before the checkpoint + * completes. */ + Assert(!MyProc->delayChkpt); + MyProc->delayChkpt = true; /* DELAY_CHKPT_START */ Assert(!MyProc->delayChkptEnd); - MyProc->delayChkptEnd = true; + MyProc->delayChkptEnd = true; /* DELAY_CHKPT_COMPLETE */ /* * We WAL-log the truncation before actually truncating, which means @@ -387,6 +402,7 @@ RelationTruncate(Relation rel, BlockNumber nblocks) smgrtruncate(RelationGetSmgr(rel), forks, nforks, blocks); /* We've done all the critical work, so checkpoints are OK now. */ + MyProc->delayChkpt = false; MyProc->delayChkptEnd = false; /* -- 2.39.5