Restore the portal-level snapshot after procedure COMMIT/ROLLBACK.

COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight. Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not. We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.) This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic. replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there. This also back-patches commit 2ecfeda3e into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong). Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/[email protected]
author: Tom Lane 2021-05-21 18:03:53 +0000
committer: Tom Lane 2021-05-21 18:03:59 +0000
commit: 84f5c2908dad81e8622b0406beea580e40bb03ac (patch)
tree: 8f97b9d94f9c599dfb003409983613d200c83b83 /src/backend/utils/mmgr
parent: 124966c1a35b950210e12048e64533963960febd (diff)
1 files changed, 56 insertions, 0 deletions
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 66e31818156..5c30e141f52 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -502,6 +502,9 @@ PortalDrop(Portal portal, bool isTopCommit)
 		portal->cleanup = NULL;
 	}
 
+	/* There shouldn't be an active snapshot anymore, except after error */
+	Assert(portal->portalSnapshot == NULL || !isTopCommit);
+
 	/*
 	 * Remove portal from hash table.  Because we do this here, we will not
 	 * come back to try to remove the portal again if there's any error in the
@@ -709,6 +712,8 @@ PreCommit_Portals(bool isPrepare)
 				portal->holdSnapshot = NULL;
 			}
 			portal->resowner = NULL;
+			/* Clear portalSnapshot too, for cleanliness */
+			portal->portalSnapshot = NULL;
 			continue;
 		}
 
@@ -1278,3 +1283,54 @@ HoldPinnedPortals(void)
 		}
 	}
 }
+
+/*
+ * Drop the outer active snapshots for all portals, so that no snapshots
+ * remain active.
+ *
+ * Like HoldPinnedPortals, this must be called when initiating a COMMIT or
+ * ROLLBACK inside a procedure.  This has to be separate from that since it
+ * should not be run until we're done with steps that are likely to fail.
+ *
+ * It's tempting to fold this into PreCommit_Portals, but to do so, we'd
+ * need to clean up snapshot management in VACUUM and perhaps other places.
+ */
+void
+ForgetPortalSnapshots(void)
+{
+	HASH_SEQ_STATUS status;
+	PortalHashEnt *hentry;
+	int			numPortalSnaps = 0;
+	int			numActiveSnaps = 0;
+
+	/* First, scan PortalHashTable and clear portalSnapshot fields */
+	hash_seq_init(&status, PortalHashTable);
+
+	while ((hentry = (PortalHashEnt *) hash_seq_search(&status)) != NULL)
+	{
+		Portal		portal = hentry->portal;
+
+		if (portal->portalSnapshot != NULL)
+		{
+			portal->portalSnapshot = NULL;
+			numPortalSnaps++;
+		}
+		/* portal->holdSnapshot will be cleaned up in PreCommit_Portals */
+	}
+
+	/*
+	 * Now, pop all the active snapshots, which should be just those that were
+	 * portal snapshots.  Ideally we'd drive this directly off the portal
+	 * scan, but there's no good way to visit the portals in the correct
+	 * order.  So just cross-check after the fact.
+	 */
+	while (ActiveSnapshotSet())
+	{
+		PopActiveSnapshot();
+		numActiveSnaps++;
+	}
+
+	if (numPortalSnaps != numActiveSnaps)
+		elog(ERROR, "portal snapshots (%d) did not account for all active snapshots (%d)",
+			 numPortalSnaps, numActiveSnaps);
+}
author	Tom Lane	2021-05-21 18:03:53 +0000
committer	Tom Lane	2021-05-21 18:03:59 +0000
commit	84f5c2908dad81e8622b0406beea580e40bb03ac (patch)
tree	8f97b9d94f9c599dfb003409983613d200c83b83 /src/backend/utils/mmgr
parent	124966c1a35b950210e12048e64533963960febd (diff)