PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings

Lists: pgsql-hackers
From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-09-30 14:44:45
Message-ID: AM5PR83MB017870DE81FC84D5E21E9D1EF7AA9@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

The new connection made by PQcancel does not use the tcp_user_timeout, connect_timeout or any of the keepalive settings that are provided in the connection string. This means that a call to PQcancel can block for a much longer time than intended if there are network issues. This can be especially impactful, because PQcancel is a blocking function an there is no non blocking version of it.

I attached a proposed patch to use the tcp_user_timeout from the connection string when connecting to Postgres in PQcancel. This resolves the issue for me, since this will make connecting timeout after a configurable time. So the other options are not strictly needed. It might still be nice for completeness to support them too though. I didn't do this yet, because I first wanted some feedback and also because implementing connect_timeout would require using non blocking TCP to connect and then use select to have a timeout.

Attachment Content-Type Size
0001-Use-tcp_user_timeout-in-PQcancel.patch application/octet-stream 3.2 KB

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-09-30 14:52:22
Message-ID: CALNJ-vSbOyVm9nk3y1UwA3e+M8atwJtTg_3ugRfC5W+TyAbMZQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Sep 30, 2021 at 7:45 AM Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
wrote:

> The new connection made by PQcancel does not use the tcp_user_timeout,
> connect_timeout or any of the keepalive settings that are provided in the
> connection string. This means that a call to PQcancel can block for a much
> longer time than intended if there are network issues. This can be
> especially impactful, because PQcancel is a blocking function an there is
> no non blocking version of it.
>
> I attached a proposed patch to use the tcp_user_timeout from the
> connection string when connecting to Postgres in PQcancel. This resolves
> the issue for me, since this will make connecting timeout after a
> configurable time. So the other options are not strictly needed. It might
> still be nice for completeness to support them too though. I didn't do this
> yet, because I first wanted some feedback and also because implementing
> connect_timeout would require using non blocking TCP to connect and then
> use select to have a timeout.

Hi,

int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */

The other field names are quite short. How about naming the field
tcp_timeout ?

Cheers


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-10-06 19:56:37
Message-ID: AM5PR83MB01781C8E94AEC8073B646872F7B09@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

We actually ran into an issue caused by this in production, where a PQcancel connection was open on the client for a 2+ days because the server had restarted at the wrong moment in the cancel handshake. The client was now indefinitely waiting for the server to send an EOF back, and because keepalives were not enabled on this socket it was never closed.

I attached an updated patch which also uses the keepalive settings in PQ. The connect_timeout is a bit harder to get it to work. As far as I can tell it would require something like this. https://stackoverflow.com/a/2597774/2570866

> The other field names are quite short. How about naming the field tcp_timeout ?

I kept the same names as in the pg_conn struct for consistency sake.


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-10-06 19:58:32
Message-ID: AM5PR83MB0178586E6B079DF8E7F0D855F7B09@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Ugh forgot to attach the patch. Here it is.
________________________________
From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Sent: Wednesday, October 6, 2021 21:56
To: Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings

We actually ran into an issue caused by this in production, where a PQcancel connection was open on the client for a 2+ days because the server had restarted at the wrong moment in the cancel handshake. The client was now indefinitely waiting for the server to send an EOF back, and because keepalives were not enabled on this socket it was never closed.

I attached an updated patch which also uses the keepalive settings in PQ. The connect_timeout is a bit harder to get it to work. As far as I can tell it would require something like this. https://stackoverflow.com/a/2597774/2570866

> The other field names are quite short. How about naming the field tcp_timeout ?

I kept the same names as in the pg_conn struct for consistency sake.

Attachment Content-Type Size
0001-Use-tcp_user_timeout-and-keepalives-in-PQcancel.patch application/octet-stream 8.2 KB

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-10-07 00:46:47
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 2021/10/07 4:58, Jelte Fennema wrote:
> Ugh forgot to attach the patch. Here it is.

Thanks for working on this patch!

@@ -4546,10 +4684,21 @@ PQrequestCancel(PGconn *conn)

return false;
}
-
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,

Since PQrequestCancel() is marked as deprecated, I don't think that
we need to add the feature into it.

+ if (cancel->pgtcp_user_timeout >= 0) {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ (char *) &cancel->pgtcp_user_timeout,
+ sizeof(cancel->pgtcp_user_timeout)) < 0) {
+ goto cancel_errReturn;
+ }
+ }

libpq has already setKeepalivesXXX() functions to do the almost same thing.
Isn't it better to modify and reuse them instead of adding the almost
same code, to simplify the code?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-11-10 21:06:46
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> writes:
> Since PQrequestCancel() is marked as deprecated, I don't think that
> we need to add the feature into it.

Not absolutely necessary, agreed, but it looks like it's pretty
easy to make that happen, so why not? I'd suggest dropping the
separate implementation and turning PQrequestCancel() into a
thin wrapper around PQgetCancel, PQcancel, PQfreeCancel.

> libpq has already setKeepalivesXXX() functions to do the almost same thing.
> Isn't it better to modify and reuse them instead of adding the almost
> same code, to simplify the code?

I find this patch fairly scary, because it's apparently been coded
with little regard for the expectation that PQcancel can be called
within a signal handler.

I see that setsockopt(2) is specified to be async signal safe by
POSIX, so at least in principle it is okay to add those calls to
PQcancel. But you've got to be really careful what else you do in
there. You can NOT use appendPQExpBuffer. You can NOT access the
PGconn (I don't think the Windows part of this even compiles ...
nope, it doesn't, per the cfbot). I'm not sure that WSAIoctl
is safe to use in a signal handler, so on the whole I think
I'd drop the Windows-specific chunk altogether. But in any case,
I'm very strongly against calling out to other libpq code from here,
because then the signal-safety restrictions start applying to that
other code too, and that's a recipe for trouble in future.

The patch could use a little attention to conforming to PG coding
conventions (comment style, brace style, C99 declarations are all
wrong --- pgindent would fix much of that, but maybe not in a way
you like). The lack of comments about why it's doing what it's doing
needs to be rectified, too. Why are these specific options important
and not any others?

regards, tom lane


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-11-10 22:38:34
Message-ID: HE1PR83MB0186173FC957C2554D0A31B6F7939@HE1PR83MB0186.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

> I'd suggest dropping the separate implementation and turning
> PQrequestCancel() into a thin wrapper around PQgetCancel,
> PQcancel, PQfreeCancel.

Fine by me. I didn't want to change behavior of a deprecated
function.

> I find this patch fairly scary, because it's apparently been coded
> with little regard for the expectation that PQcancel can be called
> within a signal handler.
> You can NOT use appendPQExpBuffer. You can NOT access the
> PGconn (I don't think the Windows part of this even compiles ...
> nope, it doesn't, per the cfbot).

I guess I was tired or at least inattentive when I added the windows part at the end
of my coding session. It really was the Linux part that I cared about. For that part
I definitely took care to make the code signal safe. Which is also why I did not call out to
any of the existing functions, like setKeepalivesXXX(). I don't think I'm the right person
to write the windows code for this (I have zero C windows experience). So, if it's not
required for this patch to be accepted I'll happily remove it.

> The patch could use a little attention to conforming to PG coding
> conventions (comment style, brace style, C99 declarations are all
> wrong --- pgindent would fix much of that, but maybe not in a way
> you like).

Sure, I'll run pgindent for my next version of the patch.

> The lack of comments about why it's doing what it's doing
> needs to be rectified, too. Why are these specific options important
> and not any others?

I'll make sure to add comments before the final version of this patch. This
patch was more meant as a draft to gauge if this was even the correct way of fixing
this problem.

To be honest I think it would make sense to add a new PQcancel function that is not
required to be signal safe and reuses regular connection setup code. This would make sure
options like this are supported automatically in the future. Another advantage is that it would
allow for sending cancel messages in a non-blocking way. So, you would be able to easily
send multiple cancels in a concurrent way. It looks to me like PQcancel is mostly designed
the way it is to keep it easy for psql to send cancelations. I think many other uses of PQcancel
don't require it to be signal safe at all (at least for Citus its usage signal safety is not required).


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2021-12-28 15:49:00
Message-ID: AM5PR83MB0178DCFFC1BE77C4C390DA89F7439@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I was able to spend some time on this again. I attached two patches to this email:

The first patch is a cleaned up version of my previous patch. I think I addressed
all feedback on the previous version in that patch (e.g. removed windows code,
fixed formatting).

The second patch is a new one, it implements honouring of the connect_timeout
connection option in PQcancel. This patch requires the first patch to also be applied,
but since it seemed fairly separate and the code is not trivial I didn't want the first
patch to be blocked on this.

Finally, I would love it if once these fixes are merged the would also be backpatched to
previous versions of libpq. Does that seem possible? As far as I can tell it would be fine,
since it doesn't really change any of the public APIs. The only change is that the pg_cancel
struct now has a few additional fields. But since that struct is defined in libpq-int.h, so that
struct should not be used by users of libpq directly, right?.

Attachment Content-Type Size
0001-Use-timeout-socket-options-for-cancel-connections.patch application/octet-stream 8.4 KB
0002-Honor-connect_timeout-when-connecting-with-PQcancel.patch application/octet-stream 6.4 KB

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-05 18:42:25
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

On 2021-12-28 15:49:00 +0000, Jelte Fennema wrote:
> The first patch is a cleaned up version of my previous patch. I think I addressed
> all feedback on the previous version in that patch (e.g. removed windows code,
> fixed formatting).

To me it seems a bit problematic to introduce a divergence between windows /
everything else here. Isn't that just going to lead to other complaints just
like this thread, where somebody discovered the hard way that there's platform
dependent behaviour here?

> The second patch is a new one, it implements honouring of the connect_timeout
> connection option in PQcancel. This patch requires the first patch to also be applied,
> but since it seemed fairly separate and the code is not trivial I didn't want the first
> patch to be blocked on this.
>
> Finally, I would love it if once these fixes are merged the would also be backpatched to
> previous versions of libpq. Does that seem possible? As far as I can tell it would be fine,
> since it doesn't really change any of the public APIs. The only change is that the pg_cancel
> struct now has a few additional fields. But since that struct is defined in libpq-int.h, so that
> struct should not be used by users of libpq directly, right?.

I'm not really convinced this is a good patch to backpatch. There does seem to
be some potential for subtle breakage - code in signal handlers is notoriously
finnicky, it's a rarely exercised code path, etc. It's also not fixing
something that previously worked.

> + * NOTE: These socket options are currently not set for Windows. The
> + * reason is that signal safety in this function is very important, and it
> + * was not clear to if the functions required to set the socket options on
> + * Windows were signal-safe.
> + */
> +#ifndef WIN32
> + if (!IS_AF_UNIX(cancel->raddr.addr.ss_family))
> + {
> +#ifdef TCP_USER_TIMEOUT
> + if (cancel->pgtcp_user_timeout >= 0)
> + {
> + if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
> + (char *) &cancel->pgtcp_user_timeout,
> + sizeof(cancel->pgtcp_user_timeout)) < 0)
> + {
> + strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_USER_TIMEOUT) failed: ", errbufsize);
> + goto cancel_errReturn;
> + }
> + }
> +#endif
> +
> + if (cancel->keepalives != 0)
> + {
> + int on = 1;
> +
> + if (setsockopt(tmpsock,
> + SOL_SOCKET, SO_KEEPALIVE,
> + (char *) &on, sizeof(on)) < 0)
> + {
> + strlcpy(errbuf, "PQcancel() -- setsockopt(SO_KEEPALIVE) failed: ", errbufsize);
> + goto cancel_errReturn;
> + }
> + }

This is very repetitive - how about introducing a helper function for this?

> @@ -4467,8 +4601,8 @@ retry3:
>
> crp.packetlen = pg_hton32((uint32) sizeof(crp));
> crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE);
> - crp.cp.backendPID = pg_hton32(be_pid);
> - crp.cp.cancelAuthCode = pg_hton32(be_key);
> + crp.cp.backendPID = pg_hton32(cancel->be_pid);
> + crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);

Others might differ, but I'd separate changing the type passed to
internal_cancel() into its own commit.

Greetings,

Andres Freund


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-05 18:54:49
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Andres Freund <andres(at)anarazel(dot)de> writes:
> On 2021-12-28 15:49:00 +0000, Jelte Fennema wrote:
>> Finally, I would love it if once these fixes are merged the would also be backpatched to
>> previous versions of libpq.

> I'm not really convinced this is a good patch to backpatch. There does seem to
> be some potential for subtle breakage - code in signal handlers is notoriously
> finnicky, it's a rarely exercised code path, etc. It's also not fixing
> something that previously worked.

IMO, this is a new feature not a bug fix, and as such it's not backpatch
material ... especially if we don't have 100.00% confidence in it.

regards, tom lane


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-06 15:58:28
Message-ID: AM5PR83MB01782D9A7D607B0419B3CB47F74C9@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Attached are 3 patches that address the feedback from Andres about code duplication
and splitting up commits. I completely removed internal_cancel now, since it only had
one caller at this point.

> IMO, this is a new feature not a bug fix

IMO this is definitely a bugfix. Nowhere in the libpq docs it stated that the connection
options in question do not apply to connections that are opened for cancellations. So
as a user I definitely expect that any connections that libpq opens would use these options.
Which is why I definitely consider it a bug that they are currently not honoured for cancel
requests.

However, even though I think it's a bugfix, I can understand the being hesitant to
backport this. IMHO in that case at least the docs should be updated to explain this
discrepancy. I attached a patch to do so against the docs on the REL_14_STABLE branch.

> To me it seems a bit problematic to introduce a divergence between windows /
> everything else here. Isn't that just going to lead to other complaints just
> like this thread, where somebody discovered the hard way that there's platform
> dependent behaviour here?

Of course, fixing this also for windows would be much better. There's two problems:
1. I cannot find any clear documentation on which functions are signal safe in Windows
and which are not. The only reference I can find is this: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/signal?view=msvc-170
However, this says that you should not use any function that generates a system call.
PQcancel is obviously already violating that when calling "connect", so this is not very helpful.
2. My Windows C experience is non existent, so I don't think I would be the right person to write this code.

IMO blocking this bugfix, because it does not fix it for Windows, would be an example of perfect
becoming the enemy of good. One thing I could do is add a note to the docs that these options
are not supported on Windows for cancellation requests (similar to my proposed doc change
for PG14 and below).

Attachment Content-Type Size
0001-Refactor-to-remove-internal_cancel-helper.patch application/octet-stream 4.9 KB
0002-Use-timeout-socket-options-for-cancel-connections.patch application/octet-stream 6.5 KB
0003-Honor-connect_timeout-when-connecting-with-PQcancel.patch application/octet-stream 6.4 KB
0001-Doc-explain-that-connection-options-don-t-apply-to-c.patch application/octet-stream 4.1 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-11 23:27:21
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com> writes:
> Attached are 3 patches that address the feedback from Andres about code duplication
> and splitting up commits. I completely removed internal_cancel now, since it only had
> one caller at this point.

Here's some cleaned-up versions of 0001 and 0002. I have not bothered
with 0003 because I for one will not be convinced to commit it. The
risk-vs-reward ratio looks far too high on that, and it's not obvious
why 0002 doesn't already solve whatever problem there is.

A couple of notes:

0001 introduces malloc/free into PQrequestCancel, which promotes it
from being "probably unsafe in a signal handler" to "definitely unsafe
in a signal handler". Before, maybe you could have gotten away with
it if you were sure the PGconn object was idle, but now it's no-go for
sure. I don't have a big problem with that, given that it's been
deprecated for decades, but I made the warning text in libpq.sgml a
little stronger.

As for 0002, I don't see a really good reason why we shouldn't try
to do it on Windows too. If connect() will work, then it seems
likely that setsockopt() and WSAIOCtl() will too. Moreover, note
that at least in our own programs, PQcancel doesn't *really* run in a
signal handler on Windows: see commentary in src/fe_utils/cancel.c.
(The fact that we now have test coverage for PQcancel makes me a lot
more willing to try this than I might otherwise be. Will be
interested to see the cfbot's results.)

Also, I was somewhat surprised while working on this to realize
that PQconnectPoll doesn't call setTCPUserTimeout if keepalives
are disabled per useKeepalives(). I made PQcancel act the same,
but I wonder if that was intentional or a bug. I'd personally
have thought that those settings were independent.

regards, tom lane

Attachment Content-Type Size
v2-0001-Refactor-to-remove-internal_cancel-helper.patch text/x-diff 5.4 KB
v2-0002-Use-timeout-socket-options-for-cancel-connections.patch text/x-diff 7.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-12 00:39:28
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

I wrote:
> (The fact that we now have test coverage for PQcancel makes me a lot
> more willing to try this than I might otherwise be. Will be
> interested to see the cfbot's results.)

On closer look, I'm not sure that psql/t/020_cancel.pl is actually doing
anything on Windows; the cfbot's test transcript says it ran without
skipping, but looking at the test itself, it seems like it would skip on
Windows.

Meanwhile, the warnings build noticed that we need to #ifdef
out the optional_setsockopt function in some cases. Revised
0002 attached.

regards, tom lane

Attachment Content-Type Size
v2-0001-Refactor-to-remove-internal_cancel-helper.patch text/x-diff 5.4 KB
v3-0002-Use-timeout-socket-options-for-cancel-connections.patch text/x-diff 7.6 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-12 04:15:29
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

... btw, speaking of signal-safe functions: I am dismayed to
notice that strerror (and strerror_r) are *not* in POSIX's
list of async-signal-safe functions. This is really quite
unsurprising, considering that they are chartered to return
locale-dependent strings. Unless the data has already been
collected in the current process, that'd imply reading something
from the locale definition files, allocating memory to hold it,
etc.

So I'm now thinking this bit in PQcancel is completely unsafe:

strncat(errbuf, SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)),
maxlen);

Seems we have to give up on providing any details beyond the
name of the function that failed.

regards, tom lane


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-12 16:11:26
Message-ID: AM5PR83MB017876B0409F0F6773A2174CF7529@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Thanks for all the cleanup and adding of windows support. To me it now looks good to merge.

Meanwhile I've created another patch that adds, a non-blocking version of PQcancel to libpq.
Which doesn't have this problem by design, because it simply reuses the normal code for
connection establishement. And it also includes tests for PQcancel itself.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-17 20:39:40
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com> writes:
> Thanks for all the cleanup and adding of windows support. To me it now looks good to merge.

I was about to commit this when I started to wonder if it actually does
anything useful. In particular, I read in the Linux tcp(7) man page

TCP_USER_TIMEOUT (since Linux 2.6.37)
...
This option can be set during any state of a TCP connection, but
is effective only during the synchronized states of a connection
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, and
LAST-ACK).

ISTM that the case we care about is where the server fails to respond
to the TCP connection request. If it does so, it seems pretty unlikely
that it wouldn't then eat the small amount of data we're going to send.

While the keepalive options aren't so explicitly documented, I'd bet that
they too don't have any effect until the connection is known established.

So I'm unconvinced that setting these values really has much effect
to make PQcancel more robust (or more accurately, make it fail faster).
I would like to know what scenarios you tested to convince you that
this is worth doing.

regards, tom lane


From: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-18 00:35:36
Message-ID: AM5PR83MB01780E7649EC5802643666A5F7589@AM5PR83MB0178.EURPRD83.prod.outlook.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

It seems the man page of TCP_USER_TIMEOUT does not align with
reality then. When I use it on my local machine it is effectively used
as a connection timeout too. The second command times out after
two seconds:

sudo iptables -A INPUT -p tcp --destination-port 5432 -j DROP
psql 'host=localhost tcp_user_timeout=2000'

The keepalive settings only apply once you get to the recv however. And yes,
it is pretty unlikely for the connection to break right when it is waiting for data.
But it has happened for us. And when it happens it is really bad, because
the process will be blocked forever. Since it is a blocking call.

After investigation when this happened it seemed to be a combination of a few
things making this happen:
1. The way citus uses cancelation requests: A Citus query on the coordinator creates
multiple connections to a worker and with 2PC for distributed transactions. If one
connection receives an error it sends a cancel request for all others.
2. When a machine is under heavy CPU or memory pressure things don't work
well:
i. errors can occur pretty frequently, causing lots of cancels to be sent by Citus.
ii. postmaster can be slow in handling new cancelation requests.
iii. Our failover system can think the node is down, because health checks are
failing.
3. Our failover system effectively cuts the power and the network of the primary
when it triggers a fail over to the secondary

This all together can result in a cancel request being interrupted right at that
wrong moment. And when it happens a distributed query on the Citus
coordinator, becomes blocked forever. We've had queries stuck in this state
for multiple days. The only way to get out of it at that point is either by restarting
postgres or manually closing the blocked socket (either with ss or gdb).

Jelte


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
Date: 2022-01-18 18:32:55
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Jelte Fennema <Jelte(dot)Fennema(at)microsoft(dot)com> writes:
> It seems the man page of TCP_USER_TIMEOUT does not align with
> reality then. When I use it on my local machine it is effectively used
> as a connection timeout too.

Huh, I should have thought to try that. I confirm this behavior
on RHEL8 (kernel 4.18.0). Not the first mistake I've seen in
Linux man pages :-(.

Doesn't seem to help on macOS, but AFAICT that platform doesn't
have TCP_USER_TIMEOUT, so no surprise there.

Anyway, that removes my objection, so I'll proceed with committing.
Thanks for working on this patch!

regards, tom lane