Segfault logical replication PG 10.4

From: Mai Peng <maily(dot)peng(at)webedia-group(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Cc: maxence(at)bothorel(dot)net
Subject: Segfault logical replication PG 10.4
Date: 2018-07-11 22:56:34
Message-ID: [email protected]
Lists: pgsql-hackers

We discovered that our pg_wal partition was full a few days after setting up our first logical publication on a PG 10.4 instance.
After that, we could not synchronise our slave with the master: it triggered a segfault on the slave. We had to manually drop the subscription on the slave and the slot on the master.
Then, to find the cause of this bug, we stopped the connection between master and slave; after 30 minutes, the slave segfaulted and could not synchronise.
Why can the slave not synchronise after dropping the slot, without recreating the subscription from scratch?
How should we manage the replication, knowing that we use cloud VMs and suffer from network latency?

Here are the details of the configuration and the error logs:
Conf on master:
max_replication_slots = 10
max_sync_workers_per_subscription = 2
wal_receiver_timeout = 60s
wal_keep_segments = 1000
wal_receiver_status_interval = 10s
wal_retrieve_retry_interval = 5s
max_logical_replication_workers = 4
Conf on slave:
Same as the master, except wal_keep_segments = 0

Error log on slave:
LOG: logical replication apply worker for subscription "XXXX" has started
DEBUG: connecting to publisher using connection string "postgresql://USER(at)IP"
LOG: worker process: logical replication worker for subscription 132253 (PID 25359) was terminated by signal 11: Segmentation fault
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: all server processes terminated; reinitializing
DEBUG: unregistering background worker "logical replication worker for subscription 132253"
LOG: database system was interrupted; last known up at 2018-07-11 21:50:56 UTC
DEBUG: checkpoint record is at 0/7DBFEF10
DEBUG: redo record is at 0/7DBFEF10; shutdown TRUE
DEBUG: next transaction ID: 0:93714; next OID: 140237
DEBUG: next MultiXactId: 1; next MultiXactOffset: 0
DEBUG: oldest unfrozen transaction ID: 548, in database 1
DEBUG: oldest MultiXactId: 1, in database 1
DEBUG: commit timestamp Xid oldest/newest: 0/0
DEBUG: transaction ID wrap limit is 2147484195, limited by database with OID 1
DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 1
DEBUG: starting up replication slots
LOG: recovered replication state of node 2 to 0/0
LOG: recovered replication state of node 3 to 0/0
LOG: recovered replication state of node 4 to 0/0
LOG: recovered replication state of node 5 to 56A5/29ACA918
LOG: database system was not properly shut down; automatic recovery in progress

THANK YOU


From: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
To: Mai Peng <maily(dot)peng(at)webedia-group(dot)com>, pgsql-hackers(at)postgresql(dot)org
Cc: maxence(at)bothorel(dot)net
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-14 11:15:46
Message-ID: [email protected]
Lists: pgsql-hackers

On 12.07.18 00:56, Mai Peng wrote:
> We discovered that our pg_wal partition was full a few days after
> setting up our first logical publication on a PG 10.4 instance.
> After that, we could not synchronise our slave with the master: it
> triggered a segfault on the slave. We had to manually drop the
> subscription on the slave and the slot on the master.
> Then, to find the cause of this bug, we stopped the connection between
> master and slave; after 30 minutes, the slave segfaulted and could not
> synchronise.
> Why can the slave not synchronise after dropping the slot, without
> recreating the subscription from scratch?

Can you get a backtrace from the core dump produced by the segfault?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Mai Peng <maily(dot)peng(at)webedia-group(dot)com>
To: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, maxence(at)bothorel(dot)net
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-18 14:35:10
Message-ID: [email protected]
Lists: pgsql-hackers

Hello,
Here is the backtrace:

```
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: bgworker: logical replication worker for subscription'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 GetActiveSnapshot () at ./build/../src/backend/utils/time/snapmgr.c:843
843 ./build/../src/backend/utils/time/snapmgr.c: No such file or directory.
(gdb) up
#1 0x000056239fc83ecd in postquel_start (fcache=0x5623a1c83b78, es=0x5623a1c91a70)
at ./build/../src/backend/executor/functions.c:809
809 ./build/../src/backend/executor/functions.c: No such file or directory.
(gdb) up
#2 fmgr_sql (fcinfo=0x5623a1c7af98) at ./build/../src/backend/executor/functions.c:1150
1150 in ./build/../src/backend/executor/functions.c
(gdb) up
#3 0x000056239fc75b4f in ExecInterpExpr (state=0x5623a1c7aeb8, econtext=0x5623a1c7b898, isnull=<optimized out>)
at ./build/../src/backend/executor/execExprInterp.c:672
672 ./build/../src/backend/executor/execExprInterp.c: No such file or directory.
(gdb) up
#4 0x000056239fc72a96 in ExecEvalExprSwitchContext (isNull=0x7ffdc20d1487 "", econtext=0x5623a1c7b898,
state=<optimized out>) at ./build/../src/include/executor/executor.h:308
308 ./build/../src/include/executor/executor.h: No such file or directory.
(gdb) up
#5 ExecCheck (state=<optimized out>, econtext=econtext(at)entry=0x5623a1c7b898)
at ./build/../src/backend/executor/execExpr.c:556
556 ./build/../src/backend/executor/execExpr.c: No such file or directory.
(gdb) up
#6 0x000056239fdefec0 in domain_check_input (value=value(at)entry=94711038065448, isnull=0 '\000',
my_extra=my_extra(at)entry=0x5623a1c79bd0) at ./build/../src/backend/utils/adt/domains.c:180
180 ./build/../src/backend/utils/adt/domains.c: No such file or directory.
(gdb) up
#7 0x000056239fdf0100 in domain_in (fcinfo=0x7ffdc20d1530) at ./build/../src/backend/utils/adt/domains.c:255
255 in ./build/../src/backend/utils/adt/domains.c
(gdb) up
#8 0x000056239feccd85 in InputFunctionCall (flinfo=flinfo(at)entry=0x7ffdc20d1900,
str=str(at)entry=0x5623a1bf4698 "{test.Type.blabla}", typioparam=typioparam(at)entry=28498, typmod=typmod(at)entry=-1) at ./build/../src/backend/utils/fmgr/fmgr.c:1640
1640 ./build/../src/backend/utils/fmgr/fmgr.c: No such file or directory.
(gdb) up
#9 0x000056239fecd01d in OidInputFunctionCall (functionId=<optimized out>,
str=0x5623a1bf4698 "{test.Type.blabla}",
typioparam=28498, typmod=-1) at ./build/../src/backend/utils/fmgr/fmgr.c:1743
1743 in ./build/../src/backend/utils/fmgr/fmgr.c
(gdb) up
#10 0x000056239fd6429f in slot_store_cstrings (slot=slot(at)entry=0x5623a1c664e8, rel=rel(at)entry=0x5623a1c60bf0,
values=values(at)entry=0x7ffdc20d1a70) at ./build/../src/backend/replication/logical/worker.c:354
354 ./build/../src/backend/replication/logical/worker.c: No such file or directory.
(gdb) up
#11 0x000056239fd64cc2 in apply_handle_insert (s=s(at)entry=0x7ffdc20d5580)
at ./build/../src/backend/replication/logical/worker.c:612
612 in ./build/../src/backend/replication/logical/worker.c
(gdb) up
#12 0x000056239fd65e5b in apply_dispatch (s=0x7ffdc20d5580) at ./build/../src/backend/replication/logical/worker.c:912
912 in ./build/../src/backend/replication/logical/worker.c
(gdb) up
#13 LogicalRepApplyLoop (last_received=97628639619392) at ./build/../src/backend/replication/logical/worker.c:1119
1119 in ./build/../src/backend/replication/logical/worker.c
(gdb) up
#14 ApplyWorkerMain (main_arg=<optimized out>) at ./build/../src/backend/replication/logical/worker.c:1671
1671 in ./build/../src/backend/replication/logical/worker.c
(gdb) up
#15 0x000056239fd3540c in StartBackgroundWorker () at ./build/../src/backend/postmaster/bgworker.c:835
835 ./build/../src/backend/postmaster/bgworker.c: No such file or directory.
(gdb) up
#16 0x000056239fd42175 in do_start_bgworker (rw=0x5623a1b71870) at ./build/../src/backend/postmaster/postmaster.c:5728
5728 ./build/../src/backend/postmaster/postmaster.c: No such file or directory.
(gdb) up
#17 maybe_start_bgworkers () at ./build/../src/backend/postmaster/postmaster.c:5941
5941 in ./build/../src/backend/postmaster/postmaster.c
(gdb) up
#18 0x000056239fd42d35 in sigusr1_handler (postgres_signal_arg=<optimized out>)
at ./build/../src/backend/postmaster/postmaster.c:5121
5121 in ./build/../src/backend/postmaster/postmaster.c
(gdb) up
#19 <signal handler called>
(gdb) up
#20 0x00007f533261a3a3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
84 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) up
#21 0x000056239fae8c71 in ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1719
1719 ./build/../src/backend/postmaster/postmaster.c: No such file or directory.
(gdb) up
#22 0x000056239fd4407a in PostmasterMain (argc=3, argv=0x5623a1b49ef0)
at ./build/../src/backend/postmaster/postmaster.c:1363
1363 in ./build/../src/backend/postmaster/postmaster.c
(gdb) up
#23 0x000056239faeac74 in main (argc=3, argv=0x5623a1b49ef0) at ./build/../src/backend/main/main.c:228
228 ./build/../src/backend/main/main.c: No such file or directory.
```

BR

> On 14 Jul 2018, at 13:15, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>
> On 12.07.18 00:56, Mai Peng wrote:
>> We discovered that our pg_wal partition was full a few days after
>> setting up our first logical publication on a PG 10.4 instance.
>> After that, we could not synchronise our slave with the master: it
>> triggered a segfault on the slave. We had to manually drop the
>> subscription on the slave and the slot on the master.
>> Then, to find the cause of this bug, we stopped the connection between
>> master and slave; after 30 minutes, the slave segfaulted and could not
>> synchronise.
>> Why can the slave not synchronise after dropping the slot, without
>> recreating the subscription from scratch?
>
> Can you get a backtrace from the core dump produced by the segfault?
>
> --
> Peter Eisentraut http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mai Peng <maily(dot)peng(at)webedia-group(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, maxence(at)bothorel(dot)net
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-18 15:03:52
Message-ID: [email protected]
Lists: pgsql-hackers

Mai Peng <maily(dot)peng(at)webedia-group(dot)com> writes:
> Here is the backtrace:

Hmm .. so this can be summarized as "logical replication workers should
provide an ActiveSnapshot in case the user functions they call want one".
Makes me wonder how much other transactional infrastructure is needed
but not present.
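
For reference, the crash path in the backtrace is a domain whose CHECK constraint calls a SQL-language function: domain_in evaluates the constraint, which reaches fmgr_sql and then GetActiveSnapshot. A minimal sketch of a type that exercises this path (hypothetical names, not the reporter's schema):

    CREATE FUNCTION is_positive(int) RETURNS boolean AS
      'SELECT $1 > 0' LANGUAGE sql IMMUTABLE STRICT;
    CREATE DOMAIN posint AS int CHECK (is_positive(VALUE));
    -- a subscribed table with a posint column makes the apply worker
    -- run the SQL function from its tuple-input path on each replicated row
    CREATE TABLE t (id posint);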

regards, tom lane


From: Mai Peng <maily(dot)peng(at)webedia-group(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, maxence(at)bothorel(dot)net
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-19 10:20:18
Message-ID: [email protected]
Lists: pgsql-hackers

Hello,
Some new input:
On the slave, all domains (with checks) have been replaced by a simple type. No crash on the slave since this workaround.
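
The bypass was of this form (a sketch with hypothetical names; the real column is a domain over ltree[]):

    -- replace the domain column by its base type, dropping the CHECK
    ALTER TABLE some_schema.some_table ALTER COLUMN tag TYPE ltree[];
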
Is there something to fix in the ActiveSnapshot code?
BR

> On 18 Jul 2018, at 17:03, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Mai Peng <maily(dot)peng(at)webedia-group(dot)com> writes:
>> Here is the backtrace:
>
> Hmm .. so this can be summarized as "logical replication workers should
> provide an ActiveSnapshot in case the user functions they call want one".
> Makes me wonder how much other transactional infrastructure is needed
> but not present.
>
> regards, tom lane


From: Quan TRAN <qtran(at)itscaro(dot)me>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-19 12:12:40
Message-ID: [email protected]
Lists: pgsql-hackers

Hello,

In worker.c:

    - in apply_handle_insert, slot_store_cstrings is called before PushActiveSnapshot
    - in apply_handle_update & apply_handle_delete, slot_store_cstrings is called after PushActiveSnapshot

    /* Process and store remote tuple in the slot */
    oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
    slot_store_cstrings(remoteslot, rel, newtup.values);
    slot_fill_defaults(rel, estate, remoteslot);
    MemoryContextSwitchTo(oldctx);

    PushActiveSnapshot(GetTransactionSnapshot());
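
For comparison, the update path in the same file pushes the snapshot before storing the tuple (approximate, recalled from the PG 10 sources):

    PushActiveSnapshot(GetTransactionSnapshot());
    ExecOpenIndices(estate->es_result_relation_info, false);

    /* Build the search tuple. */
    oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
    slot_store_cstrings(remoteslot, rel,
                        has_oldtup ? oldtup.values : newtup.values);
    MemoryContextSwitchTo(oldctx);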

Should this be the cause?

Thank you.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Quan TRAN <qtran(at)itscaro(dot)me>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-20 19:08:30
Message-ID: [email protected]
Lists: pgsql-hackers

On 2018-Jul-19, Quan TRAN wrote:

> Hello,
>
> In worker.c:
>
>     - in apply_handle_insert, slot_store_cstrings is called before PushActiveSnapshot
>     - in apply_handle_update & apply_handle_delete, slot_store_cstrings is called after PushActiveSnapshot
>
>     /* Process and store remote tuple in the slot */
>     oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
>     slot_store_cstrings(remoteslot, rel, newtup.values);
>     slot_fill_defaults(rel, estate, remoteslot);
>     MemoryContextSwitchTo(oldctx);
>
>     PushActiveSnapshot(GetTransactionSnapshot());
>
> Should this be the cause?

Yeah, moving the PushActiveSnapshot call to just before
slot_store_cstrings sounds right.
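
A sketch of the reordering, against the PG 10 worker.c snippet quoted above (untested illustration; the surrounding lines are assumed, not taken from the committed patch):

    /* Input functions may need an active snapshot, so push one first */
    PushActiveSnapshot(GetTransactionSnapshot());

    /* Process and store remote tuple in the slot */
    oldctx = MemoryContextSwitchTo(GetPerTupleMemoryContext(estate));
    slot_store_cstrings(remoteslot, rel, newtup.values);
    slot_fill_defaults(rel, estate, remoteslot);
    MemoryContextSwitchTo(oldctx);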

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Minh-Quan Tran <account(at)itscaro(dot)me>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-20 19:31:44
Message-ID: CAKH=C-_aBSOnx4ibDGCxKMYTUnTPMVX93ohPW8oL=0abJ-W5eA@mail.gmail.com
Lists: pgsql-hackers

Something like this?

Thank you.

Attachment: logical-replication-worker-insert.patch (application/octet-stream, 1.0 KB)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Minh-Quan Tran <account(at)itscaro(dot)me>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-27 21:13:48
Message-ID: [email protected]
Lists: pgsql-hackers

On 2018-Jul-20, Minh-Quan Tran wrote:

> Something like this?

Can you provide a reproducer for this problem?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Minh-Quan Tran <account(at)itscaro(dot)me>
To: Minh-Quan Tran <account(at)itscaro(dot)me>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-28 09:19:56
Message-ID: CAKH=C-8xAWh_-amB8Rtm-17B+SYa3_x441Noh+OeLgJnOz4mrQ@mail.gmail.com
Lists: pgsql-hackers

Hello,

Here are the steps to reproduce this problem.

Thank you.

Pastebin : https://pastebin.com/Wc80YegY

--

CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS ltree;
CREATE EXTENSION IF NOT EXISTS intarray;
CREATE EXTENSION IF NOT EXISTS hstore;
CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE EXTENSION IF NOT EXISTS pgcrypto;

CREATE SCHEMA IF NOT EXISTS itscaro;

CREATE DOMAIN public.tag3level AS ltree
constraint tag3level_check CHECK (public.nlevel(VALUE) = 3);

CREATE OR REPLACE FUNCTION public.is_valid_array_tag3level(ltree[])
RETURNS boolean AS
$BODY$

SELECT NOT EXISTS (
SELECT 1
FROM unnest ($1) tag
WHERE tag operator(public.<>) tag::public.tag3level
)

$BODY$
LANGUAGE sql IMMUTABLE STRICT;

CREATE DOMAIN tag3level_array AS ltree[]
CHECK ((public.is_valid_array_tag3level(VALUE)));

CREATE TABLE IF NOT EXISTS itscaro.test
(
id int,
tag public.tag3level_array DEFAULT '{}'
);

insert into itscaro.test values (2, '{{test1.test2.test3}');
-- NOTE: initial data copy is done correctly

-- Pub/Sub

-- Master
CREATE PUBLICATION itscaro_test
FOR TABLE itscaro.test;

-- Slave
-- On the slave at this moment we have all the domains, functions,
-- extensions and the testing table
CREATE SUBSCRIPTION itscaro_test
CONNECTION 'postgresql://postgres@<FILL IN MASTER IP>/postgres'
PUBLICATION itscaro_test;

-- Inserting new data after subscription is registered

insert into itscaro.test values (2, '{tefghjfgqst.qdqsdsq.qdqsdgfjhfghqs}'),
(3, '{teqghjghjgfst.qdqsdsq.qdqsdqssqdqsdqsd}'),
(4, '{tjgfjhghjgfjheqst.qdqqsdqsdjghjdsdsq.qdqsdqsqsdqsdqs}'),
(5, '{teqsqsdqsdqsdqsdqs.qdqsdsq.qdqsqsddazfdgfdhgfjhgfjhgdqs}'),
(6, '{teqdqqsdqsdqsdqsdqsst.qdqsdsq.qdqsdqs}'),
(7, '{teqst.qdqsdsq.qdqsdqsdqsdqsdqsdqsqs}'),
(8, '{teqsdqsdqssazdazdqdqsdqsdsqdqsdqst.qdqsdsq.qdqsdsdsqdqsdqsqs}'),
(9, '{teqqsdqsdqsdqsdqsdqsdqsdqsdqsst.qdqsdsq.qdqsdqsdqsdqsdqsqs}');

-- After executing the insert batch on master, failure occurs on slave
--
-- 2018-07-28 10:46:05.230 CEST [3132] LOG: database system is ready to accept connections
-- 2018-07-28 10:46:05.240 CEST [3705] LOG: logical replication apply worker for subscription "itscaro_test" has started
-- 2018-07-28 10:46:05.269 CEST [3132] LOG: worker process: logical replication worker for subscription 275990 (PID 3705) was terminated by signal 11: Segmentation fault
-- 2018-07-28 10:46:05.269 CEST [3132] LOG: terminating any other active server processes
-- 2018-07-28 10:46:05.274 CEST [3702] WARNING: terminating connection because of crash of another server process


From: Quan Tran <account(at)itscaro(dot)me>
To: Minh-Quan Tran <account(at)itscaro(dot)me>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-28 09:31:05
Message-ID: CAKH=C-8pemkMKLQCNif3eZ9V7d8KBgKKFPGFEb+=1XLhfeYyUg@mail.gmail.com
Lists: pgsql-hackers

Sorry, there is an error in the previous message:

insert into itscaro.test values (2, '{{test1.test2.test3}');

should be corrected with:

insert into itscaro.test values (2, '{test1.test2.test3}');



From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Quan Tran <account(at)itscaro(dot)me>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-28 17:52:12
Message-ID: [email protected]
Lists: pgsql-hackers

Aha, I see, thanks. Here's a complete fix with included testcase. In
an unpatched assert-enabled build, this crashes thus:

TRAP: FailedAssertion("!(ActiveSnapshotSet())", File: "/pgsql/source/REL_10_STABLE/src/backend/tcop/postgres.c", Line: 788)

Will push on Monday.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment: 0001-Fix-crash-in-logical-rep-subscriber.patch (text/plain, 3.9 KB)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Quan Tran <account(at)itscaro(dot)me>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Mai Peng <maily(dot)peng(at)webedia-group(dot)com>
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-30 20:33:21
Message-ID: [email protected]
Lists: pgsql-hackers

On 2018-Jul-28, Alvaro Herrera wrote:

> Aha, I see, thanks. Here's a complete fix with included testcase. In
> an unpatched assert-enabled build, this crashes thus:
>
> TRAP: FailedAssertion("!(ActiveSnapshotSet())", File: "/pgsql/source/REL_10_STABLE/src/backend/tcop/postgres.c", Line: 788)
>
> Will push on Monday.

Pushed after changing the constraint in the test case to be less silly.
Thanks for the report and diagnosis.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Mai Peng <maily(dot)peng(at)webedia-group(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Quan Tran <account(at)itscaro(dot)me>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Segfault logical replication PG 10.4
Date: 2018-07-30 21:05:16
Message-ID: [email protected]
Lists: pgsql-hackers

Thank you Alvaro :)

> On 30 Jul 2018, at 22:33, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
>
> On 2018-Jul-28, Alvaro Herrera wrote:
>
>> Aha, I see, thanks. Here's a complete fix with included testcase. In
>> an unpatched assert-enabled build, this crashes thus:
>>
>> TRAP: FailedAssertion("!(ActiveSnapshotSet())", File: "/pgsql/source/REL_10_STABLE/src/backend/tcop/postgres.c", Line: 788)
>>
>> Will push on Monday.
>
> Pushed after changing the constraint in the test case to be less silly.
> Thanks for the report and diagnosis.
>
> --
> Álvaro Herrera https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services