Re: [PATCH] Add error handling to byteaout.

Lists: pgsql-hackers
From: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: [PATCH] Add error handling to byteaout.
Date: 2015-06-02 14:43:24
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Hi,

when dealing with bytea values that are below the 1GB limit, but too
large to be escaped, the error messages emitted are not very helpful for
users. Especially if they appear in an unrelated context such as during
pg_dump. I've attached a patch that adds ereport()ing that would have
prevented some confusion.

Example:

,----[ master ]
| ase=# select mkbytea(29);
| ERROR: invalid memory alloc request size 1073741827
| ase=# set bytea_output to escape;
| ase=# select mkbytea(29);
| ERROR: invalid memory alloc request size 18446744071562067969
`----

The scary one is due to an integer overflow the attached patch also
fixes. I don't see any security implications though as it's only the
sign bit that is affected.

,----[ with patch applied ]
| ase=# set bytea_output to 'escape';
| ase=# select mkbytea(29);
| ERROR: escaped bytea value would be too big
| DETAIL: Value would require 2147483649 bytes.
| HINT: Use a different bytea_output setting or binary methods such as COPY BINARY.
`----

regards,
Andreas

create function mkbytea(power int, part bytea default '\x00')
returns bytea as
$$select case when power>0 then mkbytea(power-1,part||part) else part end;$$
language sql;

From f62a101a690fc9251c4c2de9c87323cedd0e9a54 Mon Sep 17 00:00:00 2001
From: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
Date: Mon, 1 Jun 2015 16:17:21 +0200
Subject: [PATCH] Add error handling to byteaout.

Emit a comprehensible error message when escaping fails due to
MaxAllocSize instead of waiting for palloc to fail. Also use size_t
for size computations to prevent integer overflow in
BYTEA_OUTPUT_ESCAPE branch.
---
src/backend/utils/adt/varlena.c | 28 +++++++++++++++++++++++++---
1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 779729d..ec2594e 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -347,8 +347,21 @@ byteaout(PG_FUNCTION_ARGS)

if (bytea_output == BYTEA_OUTPUT_HEX)
{
+ Size len;
/* Print hex format */
- rp = result = palloc(VARSIZE_ANY_EXHDR(vlena) * 2 + 2 + 1);
+ len = (Size)VARSIZE_ANY_EXHDR(vlena) * 2 + 2 + 1;
+
+ if (!AllocSizeIsValid(len)) {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("escaped bytea value would be too big"),
+ errdetail("Value would require %zu bytes.",
+ len),
+ errhint("Use a different bytea_output setting or"
+ " binary methods such as COPY BINARY.")));
+ }
+
+ rp = result = palloc(len);
*rp++ = '\\';
*rp++ = 'x';
rp += hex_encode(VARDATA_ANY(vlena), VARSIZE_ANY_EXHDR(vlena), rp);
@@ -357,8 +370,8 @@ byteaout(PG_FUNCTION_ARGS)
{
/* Print traditional escaped format */
char *vp;
- int len;
- int i;
+ Size len;
+ Size i;

len = 1; /* empty string has 1 char */
vp = VARDATA_ANY(vlena);
@@ -371,6 +384,15 @@ byteaout(PG_FUNCTION_ARGS)
else
len++;
}
+ if (!AllocSizeIsValid(len)) {
+ ereport(ERROR,
+ (errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("escaped bytea value would be too big"),
+ errdetail("Value would require %zu bytes.",
+ len),
+ errhint("Use a different bytea_output setting or"
+ " binary methods such as COPY BINARY.")));
+ }
rp = result = (char *) palloc(len);
vp = VARDATA_ANY(vlena);
for (i = VARSIZE_ANY_EXHDR(vlena); i != 0; i--, vp++)
--
2.1.4


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-02 15:15:27
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de> writes:
> The scary one is due to an integer overflow the attached patch also
> fixes.

s/int/Size/ doesn't fix anything on 32-bit machines.

regards, tom lane


From: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-02 16:47:54
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de> writes:
>> The scary one is due to an integer overflow the attached patch also
>> fixes.
>
> s/int/Size/ doesn't fix anything on 32-bit machines.

Well, it changes the signedness of the computation on 32-bit, and in
combination with the fact that "len" is always smaller than 2^32, but
may exceed 2^31-1, the change avoids the dependency on the undefined
behavior of signed integer overflows in C on 32-bit as well. But I
admit that this might be a rather academic point...

Anyway, my motivation for the patch was the improved error reporting.
Is the drive-by type change a problem here?

regards,
Andreas


From: Piotr Stefaniak <postgres(at)piotr-stefaniak(dot)me>
To: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-03 16:23:24
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On 06/02/2015 06:47 PM, Andreas Seltenreich wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
>> Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de> writes:
>>> The scary one is due to an integer overflow the attached patch also
>>> fixes.
>>
>> s/int/Size/ doesn't fix anything on 32-bit machines.
>
> Well, it changes the signedness of the computation on 32-bit, and in
> combination with the fact that "len" is always smaller than 2^32, but
> may exceed 2^31-1, the change avoids the dependency on the undefined
> behavior of signed integer overflows in C on 32-bit as well. But I
> admit that this might be a rather academic point...
>

Postgres requires twos-complement representation, so that the assumption
that signed integer types wrap around on overflow can be safely made.


From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-03 16:32:38
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Andreas Seltenreich wrote:
> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de> writes:
> >> The scary one is due to an integer overflow the attached patch also
> >> fixes.
> >
> > s/int/Size/ doesn't fix anything on 32-bit machines.
>
> Well, it changes the signedness of the computation on 32-bit, and in
> combination with the fact that "len" is always smaller than 2^32, but
> may exceed 2^31-1, the change avoids the dependency on the undefined
> behavior of signed integer overflows in C on 32-bit as well.

Why not just use an unsigned 64 bit variable? Also, perhaps
palloc_huge() avoids the whole problem in the first place ... though it
might only move the issue around, if you cannot ship the longer-than-1GB
resulting escaped value. (Of course, if you try to allocate 2 GB in a
32 bit machine, you're going to be having quite some fun ...)

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


From: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-03 19:41:51
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Alvaro Herrera writes:

> Why not just use an unsigned 64 bit variable? Also, perhaps
> palloc_huge() avoids the whole problem in the first place ...

Ja, that crossed my mind too, but the current limit is already far
beyond anything that is usually configured for per-backend memory use,
so I dismissed it.

> though it might only move the issue around, if you cannot ship the
> longer-than-1GB resulting escaped value.

For example, when client and server encodings do not match:

,----[ mbutils.c ]
| result = palloc(len * MAX_CONVERSION_GROWTH + 1);
`----

This results in the fun fact that the maximum size for bytea values that
are guaranteed to be pg_dumpable regardless of encoding/escaping
settings is lower than 64MB.

One thing that would also mitigate the problem is supporting a more
efficient output format. For example, there's already means for
base64-encoding in the backend:

self=# select c, length(encode(mkbytea(28),c)) from (values ('hex'),('base64')) as v(c);
c | length
--------+-----------
hex | 536870912
base64 | 362623337
(2 rows)

Maybe it is reasonable to make it available as another option for use in
bytea_output?

regards,
Andreas


From: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>
To: Piotr Stefaniak <postgres(at)piotr-stefaniak(dot)me>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-03 19:42:21
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

Piotr Stefaniak writes:
>>> s/int/Size/ doesn't fix anything on 32-bit machines.
>
> Postgres requires twos-complement representation, so that the
> assumption that signed integer types wrap around on overflow can be
> safely made.

Thanks for the clarification!


From: Andres Freund <andres(at)anarazel(dot)de>
To: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>, Piotr Stefaniak <postgres(at)piotr-stefaniak(dot)me>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-03 19:47:33
Message-ID: [email protected]
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On June 3, 2015 9:42:21 PM GMT+02:00, Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de> wrote:
>Piotr Stefaniak writes:
>>>> s/int/Size/ doesn't fix anything on 32-bit machines.
>>
>> Postgres requires twos-complement representation, so that the
>> assumption that signed integer types wrap around on overflow can be
>> safely made.

Uh, no. The C standard defines signed overflow as undefined, an compilers assume it doesn't happen. The optimize based on y observation.

---
Please excuse brevity and formatting - I am writing this on my mobile phone.


From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] Add error handling to byteaout.
Date: 2015-06-04 00:34:57
Message-ID: CAB7nPqRySWTbXXf2HTyxPM2bEevuXmiGUetkBMSer25qGj2jWg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Lists: pgsql-hackers

On Thu, Jun 4, 2015 at 1:32 AM, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:

> Andreas Seltenreich wrote:
> > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> >
> > > Andreas Seltenreich <andreas(dot)seltenreich(at)credativ(dot)de> writes:
> > >> The scary one is due to an integer overflow the attached patch also
> > >> fixes.
> > >
> > > s/int/Size/ doesn't fix anything on 32-bit machines.
> >
> > Well, it changes the signedness of the computation on 32-bit, and in
> > combination with the fact that "len" is always smaller than 2^32, but
> > may exceed 2^31-1, the change avoids the dependency on the undefined
> > behavior of signed integer overflows in C on 32-bit as well.
>
> Why not just use an unsigned 64 bit variable? Also, perhaps
> palloc_huge() avoids the whole problem in the first place ... though it
> might only move the issue around, if you cannot ship the longer-than-1GB
> resulting escaped value. (Of course, if you try to allocate 2 GB in a
> 32 bit machine, you're going to be having quite some fun ...)
>

Pure nitpicking: there is no palloc_huge, only repalloc_huge. Though we
could have one.
--
Michael