Lists: | pgsql-bugs |
---|
From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | exclusion(at)gmail(dot)com |
Subject: | BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-28 11:00:01 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 17912
Logged by: Alexander Lakhin
Email address: exclusion(at)gmail(dot)com
PostgreSQL version: 15.2
Operating system: Ubuntu 22.04
Description:
When the following query executed:
CREATE EXTENSION plpython3u;
CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
return [[], "a"]
$$ LANGUAGE plpython3u;
SELECT test();
valgrind detects an incorrect memory access:
==00:00:00:05.073 1489859== Invalid write of size 1
==00:00:00:05.073 1489859== at 0x4878C38: PLyObject_ToScalar
(plpy_typeio.c:1083)
==00:00:00:05.073 1489859== by 0x4877267: PLySequence_ToArray_recurse
(plpy_typeio.c:1282)
==00:00:00:05.073 1489859== by 0x48776AF: PLySequence_ToArray
(plpy_typeio.c:1227)
==00:00:00:05.073 1489859== by 0x4877E9C: PLy_output_convert
(plpy_typeio.c:122)
==00:00:00:05.073 1489859== by 0x487101E: PLy_exec_function
(plpy_exec.c:235)
==00:00:00:05.073 1489859== by 0x487201B: plpython3_call_handler
(plpy_main.c:247)
==00:00:00:05.073 1489859== by 0x401A95: ExecInterpExpr
(execExprInterp.c:727)
==00:00:00:05.073 1489859== by 0x3FE2A6: ExecInterpExprStillValid
(execExprInterp.c:1826)
==00:00:00:05.073 1489859== by 0x440563: ExecEvalExprSwitchContext
(executor.h:341)
==00:00:00:05.073 1489859== by 0x440563: ExecProject (executor.h:375)
==00:00:00:05.073 1489859== by 0x440563: ExecResult (nodeResult.c:136)
==00:00:00:05.073 1489859== by 0x40EBA2: ExecProcNodeFirst
(execProcnode.c:464)
==00:00:00:05.073 1489859== by 0x407196: ExecProcNode (executor.h:259)
==00:00:00:05.073 1489859== by 0x407196: ExecutePlan (execMain.c:1636)
==00:00:00:05.073 1489859== by 0x407376: standard_ExecutorRun
(execMain.c:363)
==00:00:00:05.073 1489859== Address 0x112e9340 is 320 bytes inside a block
of size 8,192 alloc'd
==00:00:00:05.073 1489859== at 0x4848899: malloc
(vg_replace_malloc.c:381)
==00:00:00:05.073 1489859== by 0x73ACFA: AllocSetContextCreateInternal
(aset.c:469)
==00:00:00:05.073 1489859== by 0x415DFF: CreateExprContextInternal
(execUtils.c:259)
==00:00:00:05.073 1489859== by 0x41623E: CreateExprContext
(execUtils.c:309)
==00:00:00:05.073 1489859== by 0x41648A: ExecAssignExprContext
(execUtils.c:488)
==00:00:00:05.073 1489859== by 0x44075F: ExecInitResult
(nodeResult.c:205)
==00:00:00:05.073 1489859== by 0x40ED32: ExecInitNode
(execProcnode.c:167)
==00:00:00:05.073 1489859== by 0x407AA9: InitPlan (execMain.c:938)
==00:00:00:05.073 1489859== by 0x407C85: standard_ExecutorStart
(execMain.c:265)
==00:00:00:05.073 1489859== by 0x407DDD: ExecutorStart (execMain.c:144)
==00:00:00:05.073 1489859== by 0x5C6723: PortalStart (pquery.c:517)
==00:00:00:05.073 1489859== by 0x5C32DF: exec_simple_query
(postgres.c:1211)
Without valgrind, but with asserts enabled, I get:
WARNING: problem in alloc set ExprContext: detected write past chunk end in
block 0x562777dfbeb0, chunk 0x562777dfbed8
WARNING: problem in alloc set ExprContext: req size > alloc size for chunk
0x562777dfbef0 in block 0x562777dfbeb0
test
--------
{[],a}
(1 row)
When the function returns '["a", []]', I see no anomalies.
As I can see, for the first case we get len = 0 in PLySequence_ToArray();
elems, nulls palloc'ed with zero elements, but PLyObject_ToScalar() tries
to
write a value into nulls[0]...
Reproduced on REL_11_STABLE..master.
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | exclusion(at)gmail(dot)com |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-28 15:14:06 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> CREATE EXTENSION plpython3u;
> CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
> return [[], "a"]
> $$ LANGUAGE plpython3u;
> SELECT test();
> As I can see, for the first case we get len = 0 in PLySequence_ToArray();
> elems, nulls palloc'ed with zero elements, but PLyObject_ToScalar() tries
> to write a value into nulls[0]...
Yeah. The calculation of the array size is being done in the wrong
place, so that we may update len to zero before breaking out of the
loop. But really it's poor coding for this function to be doing
its own calculation of the array size at all, rather than consulting
the authoritative ArrayGetNItems(). I think we need something like
the attached.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
fix-PLySequence_ToArray.patch | text/x-diff | 1.9 KB |
From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-28 18:00:00 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Hello Tom,
28.04.2023 18:14, Tom Lane wrote:
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
>> CREATE EXTENSION plpython3u;
>> CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
>> return [[], "a"]
>> $$ LANGUAGE plpython3u;
>> SELECT test();
>> As I can see, for the first case we get len = 0 in PLySequence_ToArray();
>> elems, nulls palloc'ed with zero elements, but PLyObject_ToScalar() tries
>> to write a value into nulls[0]...
> Yeah. The calculation of the array size is being done in the wrong
> place, so that we may update len to zero before breaking out of the
> loop. But really it's poor coding for this function to be doing
> its own calculation of the array size at all, rather than consulting
> the authoritative ArrayGetNItems(). I think we need something like
> the attached.
Thank you! I've read the message of the commit you just pushed,
and I confirm that there are another oddities in that area. For example:
CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
return [1, [2]]
$$ LANGUAGE plpython3u;
SELECT test();
---------
{1,[2]}
But:
CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
return [[1], 2]
$$ LANGUAGE plpython3u;
SELECT test();
ERROR: wrong length of inner sequence: has length -1, but 1 was expected
DETAIL: To construct a multidimensional array, the inner sequences must all have the same length.
CONTEXT: while creating return value
PL/Python function "test"
Or:
CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
return [["1"], "abc"]
$$ LANGUAGE plpython3u;
SELECT test();
ERROR: wrong length of inner sequence: has length 3, but 1 was expected
DETAIL: To construct a multidimensional array, the inner sequences must all have the same length.
CONTEXT: while creating return value
PL/Python function "test"
Though may be it's okay, considering python's concepts of
strings/arrays/sequences, or at least deserves a separate discussion.
Best regards,
Alexander
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-28 18:18:36 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> Thank you! I've read the message of the commit you just pushed,
> and I confirm that there are another oddities in that area. For example:
> CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
> return [[1], 2]
> $$ LANGUAGE plpython3u;
> SELECT test();
> ERROR: wrong length of inner sequence: has length -1, but 1 was expected
> DETAIL: To construct a multidimensional array, the inner sequences must all have the same length.
Yeah. AFAICT, the idea of the existing code is to descend through
the first item at each nest level till we hit a non-list, then take
the lengths of those lists as the array dimensions, and then complain
if we find any later items that don't fit those dimensions. That leads
to some symmetry problems, in that the error you get for inconsistent
sequence lengths depends on the order in which the items are presented.
Also, it seems like there is something odd here:
regression=# CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
regression$# return ["abc"]
regression$# $$ LANGUAGE plpython3u;
CREATE FUNCTION
regression=# select test();
test
-------
{abc}
(1 row)
regression=# CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
return [[1,2,3], "abc"]
$$ LANGUAGE plpython3u;
CREATE FUNCTION
regression=# select test();
test
-------------------
{{1,2,3},{a,b,c}}
(1 row)
I think it's weird that "abc" is taken as a scalar in the first
case and a list in the second case. Even worse,
regression=# CREATE OR REPLACE FUNCTION test() RETURNS text[] AS $$
return ["abc", [1,2,3]]
$$ LANGUAGE plpython3u;
CREATE FUNCTION
regression=# select test();
test
-------------------
{abc,"[1, 2, 3]"}
(1 row)
This might be something that used to work more sanely with
Python 2 and got broken in Python 3 for the same reasons
discussed in bug #17908. I've not poked at it more though.
I don't think I have a Python 2 installation to test with
anymore :-(
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-28 19:17:09 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
I wrote:
> Yeah. AFAICT, the idea of the existing code is to descend through
> the first item at each nest level till we hit a non-list, then take
> the lengths of those lists as the array dimensions, and then complain
> if we find any later items that don't fit those dimensions. That leads
> to some symmetry problems, in that the error you get for inconsistent
> sequence lengths depends on the order in which the items are presented.
The real problem here is that we don't check that the list nesting
depth is the same throughout the array: if lists are more deeply
nested in later elements, we'll treat those sub-lists as scalars,
leading to inconsistent results. Conversely, a less-deeply-nested
list structure in a later element might still work, if it can be
treated as a sequence. I think the second and third examples
I gave should both throw errors.
I also notice that the error messages in this area speak of "sequences",
but it is more correct to call them "lists", because Python draws a
distinction. (All lists are sequences, but not vice versa, eg a
string is a sequence but not a list.)
So I'm thinking about the attached. I do not propose this for
back-patching, because it could break applications that work today.
But it seems like good tightening-up for HEAD, or maybe we should
wait for v17 at this point?
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
tighten-multidim-array-shape-checks.patch | text/x-diff | 6.3 KB |
From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-29 13:00:00 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
28.04.2023 22:17, Tom Lane wrote:
> The real problem here is that we don't check that the list nesting
> depth is the same throughout the array: if lists are more deeply
> nested in later elements, we'll treat those sub-lists as scalars,
> leading to inconsistent results. Conversely, a less-deeply-nested
> list structure in a later element might still work, if it can be
> treated as a sequence. I think the second and third examples
> I gave should both throw errors.
>
> I also notice that the error messages in this area speak of "sequences",
> but it is more correct to call them "lists", because Python draws a
> distinction. (All lists are sequences, but not vice versa, eg a
> string is a sequence but not a list.)
>
> So I'm thinking about the attached.
Thanks for fixing this!
Now python's handling of arrays is much nicer and is aligned with the plperl's behaviour:
CREATE FUNCTION test_pl() RETURNS text[] AS $$ return [1, [2, 3]]; $$ LANGUAGE plperl;
SELECT * FROM test_pl();
ERROR: multidimensional arrays must have array expressions with matching dimensions
CONTEXT: PL/Perl function "test_pl"
I observed another light-hearted case (without the patch, of course):
CREATE FUNCTION test_py() RETURNS text[] AS $$ return [1, [2, 3]]; $$ LANGUAGE plpython3u;
SELECT * FROM test_py();
{1,"[2, 3]"}
So the patch looks more like a bug fix.
Though I still see some discrepancy between plperl and plpython:
CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$ return [[],1]; $$ LANGUAGE plperl;
SELECT * FROM test_pl();
ERROR: multidimensional arrays must have array expressions with matching dimensions
CONTEXT: PL/Perl function "test_pl"
vs
CREATE OR REPLACE FUNCTION test_py() RETURNS text[] AS $$ return [[],1]; $$ LANGUAGE plpython3u;
SELECT * FROM test_py();
{[],1}
It seems that [] was recognized as "[]" here.
While playing with plperl, I found that it handles arrays not perfectly too:
CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$return [1, []];$$ LANGUAGE plperl;
SELECT * FROM test_pl();
triggers:
==00:00:08:45.537 2325687== Conditional jump or move depends on uninitialised value(s)
==00:00:08:45.537 2325687== at 0x61FEDC: construct_md_array (arrayfuncs.c:3500)
==00:00:08:45.537 2325687== by 0x625917: makeMdArrayResult (arrayfuncs.c:5428)
==00:00:08:45.537 2325687== by 0x486DEF3: plperl_array_to_datum (plperl.c:1278)
==00:00:08:45.537 2325687== by 0x486D8E8: plperl_sv_to_datum (plperl.c:1347)
==00:00:08:45.537 2325687== by 0x4872FA6: plperl_func_handler (plperl.c:2483)
==00:00:08:45.537 2325687== by 0x4873CE5: plperl_call_handler (plperl.c:1858)
==00:00:08:45.537 2325687== by 0x41406F: ExecMakeTableFunctionResult (execSRF.c:235)
==00:00:08:45.538 2325687== by 0x426F1C: FunctionNext (nodeFunctionscan.c:95)
==00:00:08:45.538 2325687== by 0x414AA7: ExecScanFetch (execScan.c:133)
==00:00:08:45.538 2325687== by 0x414B42: ExecScan (execScan.c:182)
==00:00:08:45.538 2325687== by 0x426E2E: ExecFunctionScan (nodeFunctionscan.c:270)
==00:00:08:45.538 2325687== by 0x4117F1: ExecProcNodeFirst (execProcnode.c:464)
==00:00:08:45.538 2325687==
Here, nelems = 2 (from ArrayGetNItems(ndims, dims)), but
array_to_datum_internal() generated only one datum.
And yet another case:
CREATE OR REPLACE FUNCTION test_py() RETURNS text[] AS $$return [[1],[[]]];$$ LANGUAGE plpython3u;
regression=# SELECT * FROM test_py();
{{1},{[]}}
vs
CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$return [[1],[[]]];$$ LANGUAGE plperl;
SELECT * FROM test_pl();
{}
> I do not propose this for
> back-patching, because it could break applications that work today.
> But it seems like good tightening-up for HEAD, or maybe we should
> wait for v17 at this point?
I suppose that waiting for v17 is preferable if the patch is considered as
bringing a new feature (it's not the case) or could require extra time to
stabilize. But I'm afraid that anomalies, that would require additional
fixes for the change stabilization, could be related to the existing
code, and thus that extra time will be invested in improving v16- too.
Best regards,
Alexander
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-29 17:16:30 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> Thanks for fixing this!
> Now python's handling of arrays is much nicer and is aligned with the plperl's behaviour:
Well, mostly. As you noticed, it's a bit weird about zero-length
sub-lists, treating those as scalars. I had been intending to keep
the existing behavior there, but now that I see that plperl does it
the other way (that is, you end up with an overall empty output array)
I think we ought to make plpython do likewise.
> While playing with plperl, I found that it handles arrays not perfectly too:
> CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$return [1, []];$$ LANGUAGE plperl;
> SELECT * FROM test_pl();
Ugh. I pushed a fix for that.
On the whole, I like plperl's implementation of this better, that is
it scans the data structure just once and uses an ArrayBuildState to
accumulate the datums. So now I'm thinking about throwing out
the code in PLySequence_ToArray[_recurse] altogether and borrowing
plperl's logic. I think that would make it easier to deal with
zero-length sublists correctly. Haven't written the patch yet though.
> I suppose that waiting for v17 is preferable if the patch is considered as
> bringing a new feature (it's not the case) or could require extra time to
> stabilize. But I'm afraid that anomalies, that would require additional
> fixes for the change stabilization, could be related to the existing
> code, and thus that extra time will be invested in improving v16- too.
I'm a little uncomfortable with changing the semantics of non-failing
cases in the back branches.
regards, tom lane
From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-30 04:00:00 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
29.04.2023 20:16, Tom Lane wrote:
>> I suppose that waiting for v17 is preferable if the patch is considered as
>> bringing a new feature (it's not the case) or could require extra time to
>> stabilize. But I'm afraid that anomalies, that would require additional
>> fixes for the change stabilization, could be related to the existing
>> code, and thus that extra time will be invested in improving v16- too.
> I'm a little uncomfortable with changing the semantics of non-failing
> cases in the back branches.
I agree that we shouldn't introduce a new behavior in the released branches.
For that moment I was thinking about the choice between v16 and v17
for that patch to be committed to.
But as you see the better solution now, the patch will be different and more
extensive, I suppose, so I'd vote for postponing it for v17.
Best regards,
Alexander
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-04-30 16:24:38 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> 29.04.2023 20:16, Tom Lane wrote:
>> I'm a little uncomfortable with changing the semantics of non-failing
>> cases in the back branches.
> I agree that we shouldn't introduce a new behavior in the released branches.
> For that moment I was thinking about the choice between v16 and v17
> for that patch to be committed to.
> But as you see the better solution now, the patch will be different and more
> extensive, I suppose, so I'd vote for postponing it for v17.
Here's a version that adopts plperl's logic, causing it to treat
empty sub-lists as being zero-length dimensions. Most of the new
test cases are borrowed from plperl, too, and most of them act
differently before and after the code change. So I'm pretty
hesitant to put this into stable branches. OTOH, maybe it's not
too late for v16?
I noticed one inarguable bug here, too: PLySequence_ToArray_recurse
leaks Python object refcounts after errors, because it has no
PG_TRY to ensure that Py_XDECREF() gets done. I'm not sure if
it's worth doing something about that in the back branches, given
the lack of complaints.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
tighten-multidim-array-shape-checks-2.patch | text/x-diff | 14.9 KB |
From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-05-01 12:00:00 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
30.04.2023 19:24, Tom Lane wrote:
> Here's a version that adopts plperl's logic, causing it to treat
> empty sub-lists as being zero-length dimensions. Most of the new
> test cases are borrowed from plperl, too, and most of them act
> differently before and after the code change. So I'm pretty
> hesitant to put this into stable branches. OTOH, maybe it's not
> too late for v16?
Thanks for the patch!
I've tested the new implementation and found no issues with it — only
rectangular structures are accepted now. The code is straightforward and
very similar to plperl's, so I would not expect that it might bring new
anomalies, which couldn't be seen before.
Thus I don't think that adding it to current master (and possible follow-up
fixing) can take a significant amount of time out of v16+ schedule only.
> I noticed one inarguable bug here, too: PLySequence_ToArray_recurse
> leaks Python object refcounts after errors, because it has no
> PG_TRY to ensure that Py_XDECREF() gets done. I'm not sure if
> it's worth doing something about that in the back branches, given
> the lack of complaints.
I continue watching the array handling bugs dancing Sirtaki too. Now it's
another asymmetry:
select '{{1},{{2}}}'::int[];
{{{1}},{{2}}}
but:
select '{{{1}},{2}}'::int[];
{}
Reproduced on REL_11_STABLE..master.
Best regards,
Alexander
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-05-04 13:27:49 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> 30.04.2023 19:24, Tom Lane wrote:
>> Here's a version that adopts plperl's logic, causing it to treat
>> empty sub-lists as being zero-length dimensions. Most of the new
>> test cases are borrowed from plperl, too, and most of them act
>> differently before and after the code change. So I'm pretty
>> hesitant to put this into stable branches. OTOH, maybe it's not
>> too late for v16?
> Thanks for the patch!
> I've tested the new implementation and found no issues with it — only
> rectangular structures are accepted now. The code is straightforward and
> very similar to plperl's, so I would not expect that it might bring new
> anomalies, which couldn't be seen before.
> Thus I don't think that adding it to current master (and possible follow-up
> fixing) can take a significant amount of time out of v16+ schedule only.
After thinking about this some more, I'm inclined to go ahead and
apply this patch and indeed back-patch it. As things stood before
commits 81eaaf65e et al, it was completely unsafe to use an empty
first-level sub-list in a plpython multi-dimensional result value at all.
The only valid use of such a thing would be like "return [[], []]",
that is, all the sub-lists have to be zero-length to satisfy
rectangularity. So that would trigger the bug you originally reported
wherein a zero-length first sublist computes a wrong datum array
length leading to memory clobber. Commit 81eaaf65e did the minimum
possible change to stop the memory clobber, but it left us in a
situation where the actual result behavior isn't very sane. We
should not ship that behavior; we should make it do something sane,
namely return a zero-dimensional array for cases like this.
The argument against that is that zero-length sublists below the
first level managed not to crash, nor to fail, in some weird
cases like this one:
regression=# CREATE OR REPLACE FUNCTION test_type_conversion_md_array_out() RETURNS text[] AS $$
regression$# return [[1], [[]]]
regression$# $$ LANGUAGE plpython3u;
CREATE FUNCTION
regression=# select test_type_conversion_md_array_out();
test_type_conversion_md_array_out
-----------------------------------
{{1},{[]}}
(1 row)
which the patch would turn into an error case. But it's pretty hard
to believe that anyone is depending on corner cases like that one
and yet managing not to trip over the crash hazard. Moreover,
throwing an error for this is consistent with the change we made
in plperl at f47004add et al. So I'm thinking let's apply this
patch and then just release-note all three patchsets as "tighten
checks for rectangularity of multi-dimensional arrays in plperl
and plpython".
regards, tom lane
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array |
Date: | 2023-05-04 15:26:58 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-bugs |
Alexander Lakhin <exclusion(at)gmail(dot)com> writes:
> I continue watching the array handling bugs dancing Sirtaki too. Now it's
> another asymmetry:
> select '{{1},{{2}}}'::int[];
> {{{1}},{{2}}}
> but:
> select '{{{1}},{2}}'::int[];
> {}
For the sake of the archives --- this issue in the core code
is being tracked at
I think all the reported issues in plperl and plpython are resolved now.
regards, tom lane