Quick Links

Re: Assertion being hit during WAL replay

Lists:	pgsql-hackers

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Assertion being hit during WAL replay
Date:	2023-04-11 18:48:44
Message-ID:	[email protected]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Lists:	pgsql-hackers

I have seen this failure a couple of times recently while
testing code that caused crashes and restarts:

#2 0x00000000009987e3 in ExceptionalCondition (
conditionName=conditionName(at)entry=0xb31bc8 "mode == RBM_NORMAL || mode == RBM_ZERO_ON_ERROR || mode == RBM_ZERO_AND_LOCK",
fileName=fileName(at)entry=0xb31c15 "bufmgr.c",
lineNumber=lineNumber(at)entry=892) at assert.c:66
#3 0x0000000000842d73 in ExtendBufferedRelTo (eb=...,
fork=fork(at)entry=MAIN_FORKNUM, strategy=strategy(at)entry=0x0,
flags=flags(at)entry=3, extend_to=extend_to(at)entry=1,
mode=mode(at)entry=RBM_ZERO_AND_CLEANUP_LOCK) at bufmgr.c:891
#4 0x00000000005cc398 in XLogReadBufferExtended (rlocator=...,
forknum=MAIN_FORKNUM, blkno=0, mode=mode(at)entry=RBM_ZERO_AND_CLEANUP_LOCK,
recent_buffer=<optimized out>) at xlogutils.c:527
#5 0x00000000005cc697 in XLogReadBufferForRedoExtended (
record=record(at)entry=0x1183b98, block_id=block_id(at)entry=0 '\000',
mode=mode(at)entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock(at)entry=true,
buf=buf(at)entry=0x7ffd98e3ea94) at xlogutils.c:391
#6 0x000000000055df59 in heap_xlog_prune (record=0x1183b98) at heapam.c:8779
#7 heap2_redo (record=0x1183b98) at heapam.c:10015
#8 0x00000000005ca430 in ApplyWalRecord (replayTLI=<synthetic pointer>,
record=0x7f8f7afbcb60, xlogreader=<optimized out>)
at ../../../../src/include/access/xlog_internal.h:379

It's not clear to me whether this Assert is wrong, or
XLogReadBufferForRedoExtended shouldn't be using
RBM_ZERO_AND_CLEANUP_LOCK, or the Assert is correctly protecting an
unimplemented case in ExtendBufferedRelTo that we now need to implement.

In any case, I'm pretty sure Andres broke it in 26158b852, because
I hadn't seen it before this weekend.

regards, tom lane

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Assertion being hit during WAL replay
Date:	2023-04-11 19:56:24
Message-ID:	[email protected]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-04-11 14:48:44 -0400, Tom Lane wrote:
> I have seen this failure a couple of times recently while
> testing code that caused crashes and restarts:

Do you have a quick repro recipe?

> #2 0x00000000009987e3 in ExceptionalCondition (
> conditionName=conditionName(at)entry=0xb31bc8 "mode == RBM_NORMAL || mode == RBM_ZERO_ON_ERROR || mode == RBM_ZERO_AND_LOCK",
> fileName=fileName(at)entry=0xb31c15 "bufmgr.c",
> lineNumber=lineNumber(at)entry=892) at assert.c:66
> #3 0x0000000000842d73 in ExtendBufferedRelTo (eb=...,
> fork=fork(at)entry=MAIN_FORKNUM, strategy=strategy(at)entry=0x0,
> flags=flags(at)entry=3, extend_to=extend_to(at)entry=1,
> mode=mode(at)entry=RBM_ZERO_AND_CLEANUP_LOCK) at bufmgr.c:891
> #4 0x00000000005cc398 in XLogReadBufferExtended (rlocator=...,
> forknum=MAIN_FORKNUM, blkno=0, mode=mode(at)entry=RBM_ZERO_AND_CLEANUP_LOCK,
> recent_buffer=<optimized out>) at xlogutils.c:527
> #5 0x00000000005cc697 in XLogReadBufferForRedoExtended (
> record=record(at)entry=0x1183b98, block_id=block_id(at)entry=0 '\000',
> mode=mode(at)entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock(at)entry=true,
> buf=buf(at)entry=0x7ffd98e3ea94) at xlogutils.c:391
> #6 0x000000000055df59 in heap_xlog_prune (record=0x1183b98) at heapam.c:8779
> #7 heap2_redo (record=0x1183b98) at heapam.c:10015
> #8 0x00000000005ca430 in ApplyWalRecord (replayTLI=<synthetic pointer>,
> record=0x7f8f7afbcb60, xlogreader=<optimized out>)
> at ../../../../src/include/access/xlog_internal.h:379
>
> It's not clear to me whether this Assert is wrong, or
> XLogReadBufferForRedoExtended shouldn't be using
> RBM_ZERO_AND_CLEANUP_LOCK, or the Assert is correctly protecting an
> unimplemented case in ExtendBufferedRelTo that we now need to implement.

Hm. It's not implemented because I didn't quite see how it'd make sense to
pass RBM_ZERO_AND_CLEANUP_LOCK when extending the relation, but given how
relation extension is done "implicitly" during recovery, that's too narrow a
view. It's trivial to add.

I wonder if we should eventually redefine the RBM* things into a bitmask.

> In any case, I'm pretty sure Andres broke it in 26158b852, because
> I hadn't seen it before this weekend.

Yea, that's clearly the fault of 26158b852.

Greetings,

Andres Freund

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Assertion being hit during WAL replay
Date:	2023-04-11 20:54:53
Message-ID:	[email protected]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2023-04-11 14:48:44 -0400, Tom Lane wrote:
>> I have seen this failure a couple of times recently while
>> testing code that caused crashes and restarts:

> Do you have a quick repro recipe?

Here's something related to what I hit that time:

diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 052263aea6..d43a7c7bcb 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2188,6 +2188,7 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
void
SS_attach_initplans(PlannerInfo *root, Plan *plan)
{
+ Assert(root->init_plans == NIL);
plan->initPlan = root->init_plans;
}

You won't get through initdb with this, but if you install this change
into a successfully init'd database and then "make installcheck-parallel",
it will crash and then fail to recover, at least a lot of the time.

regards, tom lane

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Assertion being hit during WAL replay
Date:	2023-04-11 22:03:02
Message-ID:	[email protected]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-04-11 16:54:53 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2023-04-11 14:48:44 -0400, Tom Lane wrote:
> >> I have seen this failure a couple of times recently while
> >> testing code that caused crashes and restarts:
>
> > Do you have a quick repro recipe?
>
> Here's something related to what I hit that time:
>
> diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
> index 052263aea6..d43a7c7bcb 100644
> --- a/src/backend/optimizer/plan/subselect.c
> +++ b/src/backend/optimizer/plan/subselect.c
> @@ -2188,6 +2188,7 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
> void
> SS_attach_initplans(PlannerInfo *root, Plan *plan)
> {
> + Assert(root->init_plans == NIL);
> plan->initPlan = root->init_plans;
> }
>
> You won't get through initdb with this, but if you install this change
> into a successfully init'd database and then "make installcheck-parallel",
> it will crash and then fail to recover, at least a lot of the time.

Ah, that allowed me to reproduce. Thanks.

Took me a bit to understand how we actually get into this situation. A PRUNE
record for relation+block that doesn't exist during recovery. That doesn't
commonly happen outside of PITR or such, because we obviously need a block
with content to generate the PRUNE. The way it does happen here, is that the
relation is vacuumed and then truncated. Then we crash. Thus we end up with a
PRUNE record for a block that doesn't exist on disk.

Which is also why the test is quite timing sensitive.

Seems like it'd be good to have a test that covers this scenario. There's
plenty code around it that doesn't currently get exercised.

None of the existing tests seem like a great fit. I guess it could be added to
013_crash_restart, but that really focuses on something else.

So I guess I'll write a 036_notsureyet.pl...

Greetings,

Andres Freund

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Assertion being hit during WAL replay
Date:	2023-04-14 18:40:15
Message-ID:	[email protected]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Lists:	pgsql-hackers

Hi,

On 2023-04-11 15:03:02 -0700, Andres Freund wrote:
> On 2023-04-11 16:54:53 -0400, Tom Lane wrote:
> > Here's something related to what I hit that time:
> >
> > diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
> > index 052263aea6..d43a7c7bcb 100644
> > --- a/src/backend/optimizer/plan/subselect.c
> > +++ b/src/backend/optimizer/plan/subselect.c
> > @@ -2188,6 +2188,7 @@ SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel)
> > void
> > SS_attach_initplans(PlannerInfo *root, Plan *plan)
> > {
> > + Assert(root->init_plans == NIL);
> > plan->initPlan = root->init_plans;
> > }
> >
> > You won't get through initdb with this, but if you install this change
> > into a successfully init'd database and then "make installcheck-parallel",
> > it will crash and then fail to recover, at least a lot of the time.
>
> Ah, that allowed me to reproduce. Thanks.
>
>
> Took me a bit to understand how we actually get into this situation. A PRUNE
> record for relation+block that doesn't exist during recovery. That doesn't
> commonly happen outside of PITR or such, because we obviously need a block
> with content to generate the PRUNE. The way it does happen here, is that the
> relation is vacuumed and then truncated. Then we crash. Thus we end up with a
> PRUNE record for a block that doesn't exist on disk.
>
> Which is also why the test is quite timing sensitive.
>
> Seems like it'd be good to have a test that covers this scenario. There's
> plenty code around it that doesn't currently get exercised.
>
> None of the existing tests seem like a great fit. I guess it could be added to
> 013_crash_restart, but that really focuses on something else.
>
> So I guess I'll write a 036_notsureyet.pl...

See also the separate report by Alexander Lakhin at
https://postgr.es/m/[email protected]

I pushed the fix + test now.

Greetings,

Andres Freund

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject:	Re: Assertion being hit during WAL replay
Date:	2023-04-14 18:42:14
Message-ID:	[email protected]
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Lists:	pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> I pushed the fix + test now.

Cool, thanks.

regards, tom lane