Lists: | pgsql-hackers |
---|
From: | Antonin Houska <ah(at)cybertec(dot)at> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Suspicious call of initial_cost_hashjoin() |
Date: | 2017-12-22 09:45:50 |
Message-ID: | 12441.1513935950@localhost |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
try_partial_hashjoin_path() passes constant true to for the parallel_hash
argument of initial_cost_hashjoin(). Shouldn't it instead pass the
parallel_hash argument that it receives?
This is related to commit 1804284042e659e7d16904e7bbb0ad546394b6a3.
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at
From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Suspicious call of initial_cost_hashjoin() |
Date: | 2017-12-22 11:13:37 |
Message-ID: | CAEepm=3sB=BXFGxwhUKVQED=aF61S95NS05ek+be1c6P5NFDCQ@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> try_partial_hashjoin_path() passes constant true to for the parallel_hash
> argument of initial_cost_hashjoin(). Shouldn't it instead pass the
> parallel_hash argument that it receives?
Thanks. Yeah. When initial_cost_hashjoin() calls
get_parallel_divisor() on a non-partial inner path I think it would
return 1.0, so no damage was done there, but when
ExecChooseHashTableSize() receives try_combined_work_mem == true it
might underestimate the number of batches required for a partial hash
join without parallel hash, because it would incorrectly assume that a
single batch join could use the combined work_mem budget. This was
quite well hidden because ExecHashTableCreate() calls
ExecChooseHashTableSize() again (rather than reusing the results from
planning time), so the bad nbatch estimate doesn't show up anywhere.
--
Thomas Munro
http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
fix.patch | application/octet-stream | 1.3 KB |
From: | David Steele <david(at)pgmasters(dot)net> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Re: Suspicious call of initial_cost_hashjoin() |
Date: | 2018-03-01 19:45:22 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi Antonin,
On 12/22/17 6:13 AM, Thomas Munro wrote:
> On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>> try_partial_hashjoin_path() passes constant true to for the parallel_hash
>> argument of initial_cost_hashjoin(). Shouldn't it instead pass the
>> parallel_hash argument that it receives?
>
> Thanks. Yeah. When initial_cost_hashjoin() calls
> get_parallel_divisor() on a non-partial inner path I think it would
> return 1.0, so no damage was done there, but when
> ExecChooseHashTableSize() receives try_combined_work_mem == true it
> might underestimate the number of batches required for a partial hash
> join without parallel hash, because it would incorrectly assume that a
> single batch join could use the combined work_mem budget. This was
> quite well hidden because ExecHashTableCreate() calls
> ExecChooseHashTableSize() again (rather than reusing the results from
> planning time), so the bad nbatch estimate doesn't show up anywhere.
Does this look right to you? If so, can you sign up as a reviewer and
mark it Ready for Committer?
Thanks,
--
-David
david(at)pgmasters(dot)net
From: | Antonin Houska <ah(at)cybertec(dot)at> |
---|---|
To: | David Steele <david(at)pgmasters(dot)net> |
Cc: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Suspicious call of initial_cost_hashjoin() |
Date: | 2018-03-02 08:06:57 |
Message-ID: | 15738.1519978017@localhost |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
David Steele <david(at)pgmasters(dot)net> wrote:
> On 12/22/17 6:13 AM, Thomas Munro wrote:
> > On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> >> try_partial_hashjoin_path() passes constant true to for the parallel_hash
> >> argument of initial_cost_hashjoin(). Shouldn't it instead pass the
> >> parallel_hash argument that it receives?
> >
> > Thanks. Yeah. When initial_cost_hashjoin() calls
> > get_parallel_divisor() on a non-partial inner path I think it would
> > return 1.0, so no damage was done there, but when
> > ExecChooseHashTableSize() receives try_combined_work_mem == true it
> > might underestimate the number of batches required for a partial hash
> > join without parallel hash, because it would incorrectly assume that a
> > single batch join could use the combined work_mem budget. This was
> > quite well hidden because ExecHashTableCreate() calls
> > ExecChooseHashTableSize() again (rather than reusing the results from
> > planning time), so the bad nbatch estimate doesn't show up anywhere.
> Does this look right to you?
Yes, this is what I meant. The patch applies cleanly and the code compiles
well.
> If so, can you sign up as a reviewer and mark it Ready for Committer?
Done.
Actually I think it'd be nice if the "parallel_hash" argument was mentioned in
the header comment of initial_cost_hashjoin() function, but not sure this is
worth returning the patch to the author.
--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com
From: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
---|---|
To: | Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | David Steele <david(at)pgmasters(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Suspicious call of initial_cost_hashjoin() |
Date: | 2018-03-02 10:01:33 |
Message-ID: | CAEepm=23tCnUwMzYDpYJQUOQZfgB3nGRVnAif=vLeFYTg-XLVA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, Mar 2, 2018 at 9:06 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> David Steele <david(at)pgmasters(dot)net> wrote:
>> Does this look right to you?
>
> Yes, this is what I meant. The patch applies cleanly and the code compiles
> well.
>
>> If so, can you sign up as a reviewer and mark it Ready for Committer?
>
> Done.
Thanks.
> Actually I think it'd be nice if the "parallel_hash" argument was mentioned in
> the header comment of initial_cost_hashjoin() function, but not sure this is
> worth returning the patch to the author.
Done.
--
Thomas Munro
http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
0001-Fix-costing-of-parallel-hash-joins-v2.patch | application/octet-stream | 2.1 KB |
From: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Antonin Houska <ah(at)cybertec(dot)at> |
Cc: | David Steele <david(at)pgmasters(dot)net>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Suspicious call of initial_cost_hashjoin() |
Date: | 2018-03-07 02:56:38 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 3/2/18 05:01, Thomas Munro wrote:
> On Fri, Mar 2, 2018 at 9:06 PM, Antonin Houska <ah(at)cybertec(dot)at> wrote:
>> David Steele <david(at)pgmasters(dot)net> wrote:
>>> Does this look right to you?
>>
>> Yes, this is what I meant. The patch applies cleanly and the code compiles
>> well.
>>
>>> If so, can you sign up as a reviewer and mark it Ready for Committer?
>>
>> Done.
>
> Thanks.
>
>> Actually I think it'd be nice if the "parallel_hash" argument was mentioned in
>> the header comment of initial_cost_hashjoin() function, but not sure this is
>> worth returning the patch to the author.
>
> Done.
committed
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services