Increase value of OUTER_VAR

Lists: pgsql-hackers
From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Increase value of OUTER_VAR
Date: 2021-03-03 08:29:12
Message-ID: [email protected]

Hi,

Playing with a large number of partitions, I hit the limit of 65000
range table entries in a query plan:

    if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("too many range table entries")));

Postgres works well with so many partitions.
The constants INNER_VAR, OUTER_VAR, INDEX_VAR are used as values of the
variable 'var->varno' of integer type. As far as I can see, they were
introduced with commit 1054097464 authored by Marc G. Fournier, in 1996.
The value 65000 was related to the size of the int type at that time.

Maybe we should change these values to INT_MAX? (See the attached patch.)
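
A minimal sketch of where the limit comes from, assuming the definitions
found in primnodes.h at the time (the three values and the ">=" form of
the macro are assumptions, not taken from this message).
too_many_range_table_entries() is a hypothetical stand-in for the planner
check quoted above:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed pre-patch values of the special varnos. */
#define INNER_VAR 65000   /* reference to inner subplan */
#define OUTER_VAR 65001   /* reference to outer subplan */
#define INDEX_VAR 65002   /* reference to index column */

/* Assumed pre-patch test: anything at or above INNER_VAR is "special". */
#define IS_SPECIAL_VARNO(varno) ((varno) >= INNER_VAR)

/*
 * Hypothetical helper: returns true when a flattened range table of the
 * given length can no longer be numbered without colliding with the
 * special varnos.
 */
bool
too_many_range_table_entries(int rtable_length)
{
    assert(rtable_length >= 0);
    return IS_SPECIAL_VARNO(rtable_length);
}
```

Under these assumptions a range table of 64999 entries is accepted and
one of 65000 entries is rejected, whatever the real capacity of an int.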

--
regards,
Andrey Lepikhov
Postgres Professional

Attachment Content-Type Size
0001-Set-values-of-special-varnos-to-the-upper-bound-of-t.patch text/plain 1.1 KB

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-03 08:52:00
Message-ID: CAApHDvo20D-KkxS4NQ4mOLusffZYMVMF5=budkLhEMXAUVE0+Q@mail.gmail.com

On Wed, 3 Mar 2021 at 21:29, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
>
> Playing with a large number of partitions, I hit the limit of 65000
> range table entries in a query plan:
>
>     if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
>         ereport(ERROR,
>                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
>                  errmsg("too many range table entries")));
>
> Postgres works well with so many partitions.
> The constants INNER_VAR, OUTER_VAR, INDEX_VAR are used as values of the
> variable 'var->varno' of integer type. As far as I can see, they were
> introduced with commit 1054097464 authored by Marc G. Fournier, in 1996.
> The value 65000 was related to the size of the int type at that time.
>
> Maybe we should change these values to INT_MAX? (See the attached patch.)

I don't really see any reason not to increase these a bit, but I'd
rather we kept them at some realistic maximum rather than all-out went
to INT_MAX.

I imagine a gap was left between 65535 and 65000 to allow space for
more special varno in the future. We did get INDEX_VAR since then, so
it seems like it was probably a good idea to leave a gap.

The problem I see with going close to INT_MAX is that the ERROR you
mention is unlikely to work correctly since a list_length() will never
get close to having INT_MAX elements before palloc() would exceed
MaxAllocSize for the elements array.
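
To put rough numbers on this, here is a sketch assuming MaxAllocSize is
0x3fffffff (1 GB - 1) and that each list cell occupies 8 bytes on a
64-bit build. Both figures are assumptions, and max_list_cells() is a
hypothetical helper:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed value of MaxAllocSize: the largest single palloc() request. */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/*
 * Hypothetical helper: upper bound on the number of cells that fit in
 * one allocation, given the size of a cell in bytes.
 */
size_t
max_list_cells(size_t cell_size)
{
    assert(cell_size > 0);
    return MAX_ALLOC_SIZE / cell_size;
}
```

With 8-byte cells this gives 134217727 cells, well below INT_MAX
(2147483647 for a 32-bit int), so the allocation limit would be reached
long before a check against a value near INT_MAX could fire.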

Something like 1 million seems like a more realistic limit to me.
That might still be on the high side, but it'll likely mean we'd not
need to revisit this for quite a while.

David


From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-03 08:56:40
Message-ID: CA+HiwqHoq+m4wJc_cbD3NeP43wzLZSVQ-5i9r4xHF1Wv8EtijQ@mail.gmail.com

On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Wed, 3 Mar 2021 at 21:29, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> wrote:
> >
> > Playing with a large number of partitions, I hit the limit of 65000
> > range table entries in a query plan:
> >
> >     if (IS_SPECIAL_VARNO(list_length(glob->finalrtable)))
> >         ereport(ERROR,
> >                 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> >                  errmsg("too many range table entries")));
> >
> > Postgres works well with so many partitions.
> > The constants INNER_VAR, OUTER_VAR, INDEX_VAR are used as values of the
> > variable 'var->varno' of integer type. As far as I can see, they were
> > introduced with commit 1054097464 authored by Marc G. Fournier, in 1996.
> > The value 65000 was related to the size of the int type at that time.
> >
> > Maybe we should change these values to INT_MAX? (See the attached patch.)
>
> I don't really see any reason not to increase these a bit, but I'd
> rather we kept them at some realistic maximum rather than all-out went
> to INT_MAX.
>
> I imagine a gap was left between 65535 and 65000 to allow space for
> more special varno in the future. We did get INDEX_VAR since then, so
> it seems like it was probably a good idea to leave a gap.
>
> The problem I see with going close to INT_MAX is that the ERROR you
> mention is unlikely to work correctly since a list_length() will never
> get close to having INT_MAX elements before palloc() would exceed
> MaxAllocSize for the elements array.
>
> Something like 1 million seems like a more realistic limit to me.
> That might still be on the high side, but it'll likely mean we'd not
> need to revisit this for quite a while.

+1

Also, I got reminded of this discussion from not so long ago:

https://www.postgresql.org/message-id/flat/16302-e45634e2c0e34e97%40postgresql.org

--
Amit Langote
EDB: http://www.enterprisedb.com


From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-03 09:52:10
Message-ID: CAOBaU_aYr2Z8Qc=Ps6=BnjYofgJ_BKkq+mg33AKwMUeyvZBrVw@mail.gmail.com

On Wed, Mar 3, 2021 at 4:57 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> > Something like 1 million seems like a more realistic limit to me.
> > That might still be on the high side, but it'll likely mean we'd not
> > need to revisit this for quite a while.
>
> +1
>
> Also, I got reminded of this discussion from not so long ago:
>
> https://www.postgresql.org/message-id/flat/16302-e45634e2c0e34e97%40postgresql.org

+1


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-03 15:06:03
Message-ID: [email protected]

Amit Langote <amitlangote09(at)gmail(dot)com> writes:
> Also, I got reminded of this discussion from not so long ago:

> https://www.postgresql.org/message-id/flat/16302-e45634e2c0e34e97%40postgresql.org

Yeah. Nobody seems to have pursued Peter's idea of changing the magic
values to small negative ones, but that seems like a nicer idea than
arguing over what large positive value is large enough.

(Having said that, I remain pretty dubious that we're anywhere near
getting any real-world use out of such a change.)

regards, tom lane


From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-04 07:43:56
Message-ID: [email protected]

On 3/3/21 12:52, Julien Rouhaud wrote:
> On Wed, Mar 3, 2021 at 4:57 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>>
>> On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>>> Something like 1 million seems like a more realistic limit to me.
>>> That might still be on the high side, but it'll likely mean we'd not
>>> need to revisit this for quite a while.
>>
>> +1
>>
>> Also, I got reminded of this discussion from not so long ago:
>>
>> https://www.postgresql.org/message-id/flat/16302-e45634e2c0e34e97%40postgresql.org
Thank you
>
> +1
>
Ok. I changed the value to 1 million and explained this decision in the
comment.
This issue is caused by two cases:
1. Range partitioning on a timestamp column.
2. Hash partitioning.
Users use range distribution by timestamp because they want to insert
new data quickly and analyze the entire set of data.
Also, in some discussions, I see Oracle users discussing issues with
more than 1e5 partitions.

--
regards,
Andrey Lepikhov
Postgres Professional

Attachment Content-Type Size
0001-Increase-values-of-special-varnos-to-1-million.patch text/plain 1.3 KB

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-04 13:59:09
Message-ID: [email protected]

On 3/4/21 8:43 AM, Andrey Lepikhov wrote:
> On 3/3/21 12:52, Julien Rouhaud wrote:
>> On Wed, Mar 3, 2021 at 4:57 PM Amit Langote <amitlangote09(at)gmail(dot)com>
>> wrote:
>>>
>>> On Wed, Mar 3, 2021 at 5:52 PM David Rowley <dgrowleyml(at)gmail(dot)com>
>>> wrote:
>>>> Something like 1 million seems like a more realistic limit to me.
>>>> That might still be on the high side, but it'll likely mean we'd not
>>>> need to revisit this for quite a while.
>>>
>>> +1
>>>
>>> Also, I got reminded of this discussion from not so long ago:
>>>
>>> https://www.postgresql.org/message-id/flat/16302-e45634e2c0e34e97%40postgresql.org
>>>
> Thank you
>>
>> +1
>>
> Ok. I changed the value to 1 million and explained this decision in the
> comment.

IMO just bumping up the constants from ~65k to 1M is a net loss, for
most users. We add this to bitmapsets, which means we're using ~8kB with
the current values, but this jumps to 128kB with this higher value. This
also means bms_next_member etc. have to walk much more memory, which is
bound to have some performance impact for everyone.
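
The arithmetic behind these figures can be sketched as follows, assuming
a bitmapset stores one bit per possible member in 64-bit words and
ignoring the small header. bitmap_bytes_for_member() is a hypothetical
helper, not a PostgreSQL function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 64   /* assumed: 64-bit bitmap words */

/*
 * Hypothetical helper: bytes of word storage needed for a bitmap whose
 * highest member is 'member'.
 */
size_t
bitmap_bytes_for_member(int member)
{
    assert(member >= 0);

    /* Words 0 .. member / 64 must all exist, hence the "+ 1". */
    size_t nwords = (size_t) member / BITS_PER_WORD + 1;

    return nwords * sizeof(uint64_t);
}
```

This gives 8128 bytes for member 65000 and 125008 bytes for member
1000000, roughly the ~8kB and ~128kB quoted above.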

Switching to small negative values is a much better idea, but it's going
to be more invasive - we'll have to offset the values in the bitmapsets,
or we'll have to invent a new bitmapset variant that can store negative
values directly (e.g. by keeping two separate bitmaps internally, one
for negative and one for positive values). But that complicates other
stuff too (e.g. bms_next_member now returns -1 to signal "end").
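
The offsetting variant could look roughly like the sketch below. It
assumes three special varnos numbered -1 to -3. The constant and both
function names are hypothetical and illustrate the idea only:

```c
#include <assert.h>

/* Hypothetical: number of negative special varnos (-1 .. -3). */
#define NUM_SPECIAL_VARNOS 3

/* Map a varno, possibly negative, to a non-negative bitmap member. */
int
varno_to_member(int varno)
{
    assert(varno >= -NUM_SPECIAL_VARNOS);
    return varno + NUM_SPECIAL_VARNOS;
}

/* Inverse mapping: recover the varno from a bitmap member. */
int
member_to_varno(int member)
{
    assert(member >= 0);
    return member - NUM_SPECIAL_VARNOS;
}
```

Every caller that adds to or reads from such a set would have to apply
the mapping, which is where the invasiveness comes from.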

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-04 15:16:49
Message-ID: [email protected]

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> IMO just bumping up the constants from ~65k to 1M is a net loss, for
> most users. We add this to bitmapsets, which means we're using ~8kB with
> the current values, but this jumps to 128kB with this higher value. This
> also means bms_next_member etc. have to walk much more memory, which is
> bound to have some performance impact for everyone.

Hmm, do we really have any places that include OUTER_VAR etc in
bitmapsets? They shouldn't appear in relid sets, for sure.
I agree though that if they did, this would have bad performance
consequences.

I still think the negative-special-values approach is better.
If there are any places that that would break, we'd find out about
it in short order, rather than having a silent performance lossage.

regards, tom lane


From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-04 15:34:51
Message-ID: [email protected]

On 3/4/21 4:16 PM, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>> IMO just bumping up the constants from ~65k to 1M is a net loss, for
>> most users. We add this to bitmapsets, which means we're using ~8kB with
>> the current values, but this jumps to 128kB with this higher value. This
>> also means bms_next_member etc. have to walk much more memory, which is
>> bound to have some performance impact for everyone.
>
> Hmm, do we really have any places that include OUTER_VAR etc in
> bitmapsets? They shouldn't appear in relid sets, for sure.
> I agree though that if they did, this would have bad performance
> consequences.
>

Hmmm, I don't know. I mostly assumed that if I do pull_varnos() it would
include those values. But maybe that's not supposed to happen.

> I still think the negative-special-values approach is better.
> If there are any places that that would break, we'd find out about
> it in short order, rather than having a silent performance lossage.
>

OK

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-04 18:11:19
Message-ID: [email protected]

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> On 3/4/21 4:16 PM, Tom Lane wrote:
>> Hmm, do we really have any places that include OUTER_VAR etc in
>> bitmapsets? They shouldn't appear in relid sets, for sure.
>> I agree though that if they did, this would have bad performance
>> consequences.

> Hmmm, I don't know. I mostly assumed that if I do pull_varnos() it would
> include those values. But maybe that's not supposed to happen.

But (IIRC) those varnos are never used till setrefs.c fixes up the plan
to replace normal Vars with references to lower plan nodes' outputs.
I'm not sure why anyone would be doing pull_varnos() after that;
it would not give very meaningful results.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-04 19:01:13
Message-ID: [email protected]

Just as a proof of concept, I tried the attached, and it passes
check-world. So if there's anyplace trying to stuff OUTER_VAR and
friends into bitmapsets, it's pretty far off the beaten track.

The main loose ends that'd have to be settled seem to be:

(1) What data type do we want Var.varno to be declared as? In the
previous thread, Robert opined that plain "int" isn't a good choice,
but I'm not sure I agree. There's enough "int" for rangetable indexes
all over the place that it'd be a fool's errand to try to make it
uniformly something different.

(2) Does that datatype change need to propagate anywhere besides
what I touched here? I did not make any effort to search for
other places.

regards, tom lane

Attachment Content-Type Size
remove-64k-rangetable-limit-wip.patch text/x-diff 2.8 KB

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-06 08:43:45
Message-ID: [email protected]

On 04.03.21 20:01, Tom Lane wrote:
> Just as a proof of concept, I tried the attached, and it passes
> check-world. So if there's anyplace trying to stuff OUTER_VAR and
> friends into bitmapsets, it's pretty far off the beaten track.
>
> The main loose ends that'd have to be settled seem to be:
>
> (1) What data type do we want Var.varno to be declared as? In the
> previous thread, Robert opined that plain "int" isn't a good choice,
> but I'm not sure I agree. There's enough "int" for rangetable indexes
> all over the place that it'd be a fool's errand to try to make it
> uniformly something different.

int seems fine.

> (2) Does that datatype change need to propagate anywhere besides
> what I touched here? I did not make any effort to search for
> other places.

I think

Var.varnosyn
CurrentOfExpr.cvarno

should also have their type changed.


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-03-06 14:59:15
Message-ID: [email protected]

Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
> On 04.03.21 20:01, Tom Lane wrote:
>> (2) Does that datatype change need to propagate anywhere besides
>> what I touched here? I did not make any effort to search for
>> other places.

> I think

> Var.varnosyn
> CurrentOfExpr.cvarno

> should also have their type changed.

Agreed as to CurrentOfExpr.cvarno. But I think the entire point of
varnosyn is that it saves the original rangetable reference and
*doesn't* get overwritten with OUTER_VAR etc. So that one is a
different animal, and I'm inclined to leave it as Index.

regards, tom lane


From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-04-07 13:35:56
Message-ID: [email protected]

On 06.03.21 15:59, Tom Lane wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
>> On 04.03.21 20:01, Tom Lane wrote:
>>> (2) Does that datatype change need to propagate anywhere besides
>>> what I touched here? I did not make any effort to search for
>>> other places.
>
>> I think
>
>> Var.varnosyn
>> CurrentOfExpr.cvarno
>
>> should also have their type changed.
>
> Agreed as to CurrentOfExpr.cvarno. But I think the entire point of
> varnosyn is that it saves the original rangetable reference and
> *doesn't* get overwritten with OUTER_VAR etc. So that one is a
> different animal, and I'm inclined to leave it as Index.

Can we move forward with this?

I suppose there was still some uncertainty about whether all the places
that need changing have been identified, but do we have a better idea
how to find them?


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-04-07 13:40:37
Message-ID: [email protected]

Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
> Can we move forward with this?

> I suppose there was still some uncertainty about whether all the places
> that need changing have been identified, but do we have a better idea
> how to find them?

We could just push the change and see what happens. But I was thinking
more in terms of doing that early in the v15 cycle. I remain skeptical
that we need a near-term fix.

regards, tom lane


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-04-08 03:13:32
Message-ID: [email protected]

I wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
>> Can we move forward with this?

> We could just push the change and see what happens. But I was thinking
> more in terms of doing that early in the v15 cycle. I remain skeptical
> that we need a near-term fix.

To make sure we don't forget, I added an entry to the next CF for this.

regards, tom lane


From: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-04-08 05:24:22
Message-ID: [email protected]

On 4/8/21 8:13 AM, Tom Lane wrote:
> I wrote:
>> Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
>>> Can we move forward with this?
>
>> We could just push the change and see what happens. But I was thinking
>> more in terms of doing that early in the v15 cycle. I remain skeptical
>> that we need a near-term fix.
>
> To make sure we don't forget, I added an entry to the next CF for this.
Thanks for your efforts.

I tried to dive deeper: replace ROWID_VAR with -4 and explicitly change
types of varnos in the description of functions that can only work with
special varnos.
The use cases of OUTER_VAR look simple (I guess). The use cases of
INNER_VAR are more complex because of map_variable_attnos(). We need to
analyze how a negative value of INNER_VAR can affect this function.

INDEX_VAR causes a potential problem:
in ExecInitForeignScan() and ExecInitCustomScan() we do
    tlistvarno = INDEX_VAR;

Here tlistvarno has an unsigned type.

ROWID_VAR caused two problems in the check-world tests:
set_pathtarget_cost_width():
    if (var->varno < root->simple_rel_array_size)
    {
        RelOptInfo *rel = root->simple_rel_array[var->varno];
        ...

and

replace_nestloop_params_mutator():
    if (!bms_is_member(var->varno, root->curOuterRels))

I skipped these problems to look for other weak points, but check-world
didn't find any others.
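
The first of these problems can be reproduced in isolation. The sketch
below uses plain ints instead of the planner structures. Both function
names are hypothetical, and the "safe" variant is one possible guard,
not necessarily the one used in the patch:

```c
#include <assert.h>
#include <stdbool.h>

#define ROWID_VAR (-4)   /* value tried in the experiment above */

/* Bounds check as written today: only the upper end is guarded. */
bool
varno_in_array_unsafe(int varno, int array_size)
{
    return varno < array_size;
}

/* Bounds check that also rejects negative (special) varnos. */
bool
varno_in_array_safe(int varno, int array_size)
{
    assert(array_size >= 0);
    return varno >= 0 && varno < array_size;
}
```

With a negative ROWID_VAR the unsafe check passes, so the following
array lookup would read before the start of the array; the safe check
rejects it.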

--
regards,
Andrey Lepikhov
Postgres Professional

Attachment Content-Type Size
0001-remove-64k-rangetable-limit-wip.patch text/x-patch 8.2 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-07-02 18:23:40
Message-ID: [email protected]

Here's a more fleshed-out version of this patch. I ran around and
fixed all the places where INNER_VAR etc. were being assigned directly to
a variable or parameter of type Index, and also grepped for 'Index.*varno'
to find suspicious declarations. (I didn't change every last instance
of the latter though; just places that could possibly be looking at
post-setrefs.c Vars.)

I concluded that we don't really need to change the type of
CurrentOfExpr.cvarno, because that's never set to a special value.

The main thing I remain concerned about is whether there are more
places like set_pathtarget_cost_width(), where we could be making
an inequality comparison on "varno" that would now be wrong.
I tried to catch this by enabling -Wsign-compare and -Wsign-conversion,
but that produced so many thousands of uninteresting warnings that
I soon gave up. I'm not sure there's any good way to catch remaining
places like that except to commit the patch and wait for trouble
reports.

So I'm inclined to propose pushing this and seeing what happens.

regards, tom lane

Attachment Content-Type Size
remove-64k-rangetable-limit-1.patch text/x-diff 16.0 KB

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-07-04 13:51:42
Message-ID: CAApHDvo4PO+4SosbVSH6XwFwrRZxMfMEB9G7kczOMSJpL0UJJg@mail.gmail.com

On Sat, 3 Jul 2021 at 06:23, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> So I'm inclined to propose pushing this and seeing what happens.

Is this really sane?

As much as I would like to see the 65k limit removed, I just have
reservations about fixing it in this way. Even if we get all the
cases fixed in core, there's likely a whole bunch of extensions
that'll have bugs as a result of this for many years to come.

"git grep \sIndex\s -- *.[ch] | wc -l" is showing me 77 matches in the
Citus code. That's not the only extension that uses the planner hook.

I'm really just not sure it's worth all the dev hours fixing the
fallout. To me, it seems much safer to just bump 65k up to 1m. It'll
be a while before anyone complains about that.

It's also not that great to see the number of locations that you
needed to add run-time checks for negative varnos. That's not going to
come for free.

David


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-07-04 15:37:29
Message-ID: [email protected]

David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> Is this really sane?

> As much as I would like to see the 65k limit removed, I just have
> reservations about fixing it in this way. Even if we get all the
> cases fixed in core, there's likely a whole bunch of extensions
> that'll have bugs as a result of this for many years to come.

Maybe. I'm not that concerned about planner hacking: almost all of
the planner is only concerned with pre-setrefs.c representations and
will never see these values. Still, the fact that we had to inject
a couple of explicit IS_SPECIAL_VARNO tests is a bit worrisome.
(I'm more surprised really that noplace in the executor needed it.)
FWIW, experience with those places says that such bugs will be
exposed immediately; it's not like they'd lurk undetected "for years".

You might argue that the int-vs-Index declaration changes are
something that would be much harder to get right, but in reality
those are almost entirely cosmetic. We could make them completely
so by changing the macro to

#define IS_SPECIAL_VARNO(varno) ((int) (varno) < 0)

so that it'd still do the right thing when applied to a variable
declared as Index. (In the light of morning, I'm not sure why
I didn't do that already.) But we've always been extremely
cavalier about whether RT indexes should be declared as int or
Index, so I felt that standardizing on the former was actually
a good side-effect of the patch.
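
A small sketch of why the cast makes the macro safe for Index variables.
It assumes Index is a typedef for unsigned int and that the special
varnos are -1, -2 and -3 (assumed values, not stated in this message).
index_holds_special_varno() is a hypothetical helper:

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Index;   /* assumed definition of Index */

/* Assumed negative values for the special varnos. */
#define INNER_VAR (-1)
#define OUTER_VAR (-2)
#define INDEX_VAR (-3)

/* The macro as proposed above, with the explicit cast to int. */
#define IS_SPECIAL_VARNO(varno) ((int) (varno) < 0)

/*
 * Hypothetical helper: classify a varno that has been stored in an
 * unsigned Index variable. Converting the large unsigned value back to
 * int is implementation-defined in C, but yields the original negative
 * number on the usual two's-complement platforms.
 */
bool
index_holds_special_varno(Index varno)
{
    assert(sizeof(Index) == sizeof(int));
    return IS_SPECIAL_VARNO(varno);
}
```

Without the cast, "varno < 0" on an Index would always be false, since
an unsigned value is never negative.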

Anyway, to address your point more directly: as I recall, the main
objection to just increasing the values of these constants was the
fear that it'd bloat bitmapsets containing these values. Now on
the one hand, this patch has proven that noplace in the core code
does that today. On the other hand, there's no certainty that
someone might not try to do that tomorrow (if we don't fix it as
per this patch); or extensions might be doing so.

> I'm really just not sure it's worth all the dev hours fixing the
> fallout. To me, it seems much safer to just bump 65k up to 1m. It'll
> be a while before anyone complains about that.

TBH, if we're to approach it that way, I'd be inclined to go for
broke and raise the values to ~2B. Then (a) we'll be shut of the
problem pretty much permanently, and (b) if someone does try to
make a bitmapset containing these values, hopefully they'll see
performance bad enough to expose the issue immediately.

> It's also not that great to see the number of locations that you
> needed to add run-time checks for negative varnos. That's not going to
> come for free.

Since the test is just "< 0", I pretty much disbelieve that argument.
There are only two such places in the patch, and neither of them
is *that* performance-sensitive.

Anyway, the raise-the-values solution does have the advantage of
being a four-liner, so I can live with it if that's the consensus.
But I do think this way is cleaner in the long run, and I doubt
the argument that it'll create any hard-to-detect bugs.

regards, tom lane


From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-07-05 07:51:03
Message-ID: [email protected]

On 2/7/21 21:23, Tom Lane wrote:
> So I'm inclined to propose pushing this and seeing what happens.

+1
But why is the Index type still used for indexing range table entries?
For example:
- we pass an int resultRelation value to create_modifytable_path() as its
Index nominalRelation argument.
- exec_rt_fetch(Index) calls list_nth(int).
- generate_subquery_vars() accepts an 'Index varno' value

It looks sloppy. Do you plan to change this in the next commits?

--
regards,
Andrey Lepikhov
Postgres Professional


From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-10 14:44:25
Message-ID: CAJ7c6TP-e-fqzb907HZ2uODZnWUmYuqQk_HDqvuJb+zo99-42A@mail.gmail.com

Hi hackers,

> > So I'm inclined to propose pushing this and seeing what happens.
>
> +1

+1. The proposed changes will be beneficial in the long term. They
will affect existing extensions. However, the scale of the problem
seems to be exaggerated.

I can confirm that the patch passes installcheck-world. After some
searching through the code, I was unable to identify any places where
the logic will break. Although this may only prove my inattention, the
easiest way to make further progress seems to be to apply the patch.

--
Best regards,
Aleksander Alekseev


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-11 17:37:47
Message-ID: [email protected]

Aleksander Alekseev <aleksander(at)timescale(dot)com> writes:
> +1. The proposed changes will be beneficial in the long term. They
> will affect existing extensions. However, the scale of the problem
> seems to be exaggerated.

Yeah, after thinking more about this I agree we should just do it.
I do not say that David's concerns about effects on extensions are
without merit, but I do think he's overblown it a bit. Most of
the patch is s/Index/int/ for various variables, and as I mentioned
before, that's basically cosmetic; there's no strong reason why
extensions have to follow suit. (In the attached v2, I modified
IS_SPECIAL_VARNO() as discussed, so it will do the right thing
even if the input is declared as Index.) There may be a few
places where extensions need to add explicit IS_SPECIAL_VARNO()
calls, but not many, and I doubt they'll be hard to find.

The alternative of increasing the values of OUTER_VAR et al
is not without risk to extensions either, so on the whole
I don't think this patch is any more problematic than many
other things we commit with little debate.

In any case, since it's still very early in the v15 cycle,
there is plenty of time for people to find problems. If I'm
wrong and there are serious consequences, we can always revert
this and do it the other way.

(v2 below is a rebase up to HEAD; no actual code changes except
for adjusting the definition of IS_SPECIAL_VARNO.)

regards, tom lane

Attachment Content-Type Size
remove-64k-rangetable-limit-2.patch text/x-diff 16.0 KB

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-11 17:42:06
Message-ID: [email protected]

Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru> writes:
> But why is the Index type still used for indexing range table entries?
> For example:
> - we pass an int resultRelation value to create_modifytable_path() as its
> Index nominalRelation argument.
> - exec_rt_fetch(Index) calls list_nth(int).
> - generate_subquery_vars() accepts an 'Index varno' value

As I mentioned, the patch only intends to touch code that's possibly
used with post-setrefs Vars. In the parser and most of the planner,
there's little need to do anything because only positive varno values
will appear. So touching that code would just make the patch more
invasive without accomplishing much.

If we'd had any strong convention about whether RT indexes should be
int or Index, I might be worried about maintaining consistency.
But it's always been a horrid mishmash of both ways. Cleaning that
up completely is a task I don't care to undertake right now.

regards, tom lane


From: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-14 06:43:03
Message-ID: [email protected]

On 9/11/21 10:37 PM, Tom Lane wrote:
> Aleksander Alekseev <aleksander(at)timescale(dot)com> writes:
> (v2 below is a rebase up to HEAD; no actual code changes except
> for adjusting the definition of IS_SPECIAL_VARNO.)
I have looked at this code. No problems found.
Also, as a test, I used two tables with 1E5 partitions each. I tried a
plain SELECT, a JOIN, and a join with a plain table. No errors found, only
performance issues. But that is a subject for separate research.

--
regards,
Andrey Lepikhov
Postgres Professional


From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-14 11:37:26
Message-ID: CAJ7c6TPt6-sWUbqHLjPc98Vxn5Gq_r_rN-j0Pjcj_nb9X-ddzQ@mail.gmail.com

Hi Andrey,

> only performance issues

That's interesting. Any chance you could share the hardware
description, the configuration file, and steps to reproduce with us?

--
Best regards,
Aleksander Alekseev


From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-14 14:01:08
Message-ID: [email protected]

"Andrey V. Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru> writes:
> Also, as a test, I used two tables with 1E5 partitions each. I tried a
> plain SELECT, a JOIN, and a join with a plain table. No errors found, only
> performance issues. But that is a subject for separate research.

Yeah, there's no expectation that the performance would be any
good yet ;-)

regards, tom lane


From: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-15 06:41:38
Message-ID: [email protected]

On 14/9/21 16:37, Aleksander Alekseev wrote:
> Hi Andrey,
>
>> only performance issues
>
> That's interesting. Any chance you could share the hardware
> description, the configuration file, and steps to reproduce with us?
>
I didn't measure the execution time exactly, because it is a join of two
empty tables. As far as I can see, this join used most of the 48GB of RAM
and spent all day in planning on a typical 6-core AMD machine.
I guess this is caused by sequential traversal of the partition list in
some places in the optimizer.
If it makes practical sense, I could investigate the reasons for such poor
performance.

--
regards,
Andrey Lepikhov
Postgres Professional


From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>
Subject: Re: Increase value of OUTER_VAR
Date: 2021-09-15 08:01:43
Message-ID: CAJ7c6TOaeR=aUDaA1c0YWuctJGaTQzP-2a0AaO5ZVbbYFwQfDA@mail.gmail.com

Hi Andrey,

> >> only performance issues
> >
> > That's interesting. Any chance you could share the hardware
> > description, the configuration file, and steps to reproduce with us?
> >
> I didn't measure the execution time exactly, because it is a join of two
> empty tables. As far as I can see, this join used most of the 48GB of RAM
> and spent all day in planning on a typical 6-core AMD machine.
> I guess this is caused by sequential traversal of the partition list in
> some places in the optimizer.
> If it makes practical sense, I could investigate the reasons for such poor
> performance.

Let's put it this way: any information about bottlenecks that affect real
users with real queries is of interest. Artificially created queries that
are unlikely ever to be executed by anyone are not.

--
Best regards,
Aleksander Alekseev