Lists: | pgsql-hackers |
---|
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Accounting for metapages in genericcostestimate() |
Date: | 2025-04-28 17:19:12 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
Per the discussion at [1], genericcostestimate() produces estimates
that are noticeably off for small indexes, because it fails to
discount the index metapage while computing numIndexPages.
Here's a first-draft attempt at improving that.
The basic issue is that the calculation of numIndexPages is (as the
comment says) meant to consider only leaf index pages, but we were
simply using the total index size (index->pages) in the formula.
Subtracting the metapage produces visibly saner results when the
index is only a couple pages in total.
I thought for a bit about trying to also discount index upper pages,
but decided it's not worth it, at least for now. Given reasonable
index fanout, upper pages should amount to at most a percent or two
of the index, so accounting for them would only move the estimates by
a percent or two. Moreover, it's hard to make a non-squishy estimate
of how many upper pages there are. But we do know whether there's a
metapage or not, and failing to account for it produces 100% relative
error if the index has only one data-bearing page. So that seems
worth dealing with.
Some notes:
* Adding a field to GenericCosts breaks ABI for external callers
of genericcostestimate(), but not API, if they followed the
recommendation to zero the whole struct. If numNonLeafPages is
left at zero then the results don't change. We wouldn't consider
back-patching a change like this anyway, so the ABI break is not
a problem.
* There are other uses of index->pages in selfuncs.c. I looked
through them and didn't feel motivated to change any, but perhaps
someone else will have a different opinion.
* Unsurprisingly, this change causes several visible changes in the
core regression tests for index selection with small indexes. In each
of them it seemed that the point of the test case was to test the plan
as-given. So I hacked things up to keep the plans the same, either by
disabling an alternative plan choice or by increasing the size of the
table.
This is v19 material, so I'll park it in the next CF.
regards, tom lane
#text/x-diff; name="v1-discount-metapage-in-genericcostestimate.patch" [v1-discount-metapage-in-genericcostestimate.patch] /home/tgl/pgsql/v1-discount-metapage-in-genericcostestimate.patch
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Accounting for metapages in genericcostestimate() |
Date: | 2025-04-28 17:20:34 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
... sigh, this time with the patch actually attached.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
v1-discount-metapage-in-genericcostestimate.patch | text/x-diff | 9.5 KB |
From: | Álvaro Herrera <alvherre(at)kurilemu(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Accounting for metapages in genericcostestimate() |
Date: | 2025-04-30 10:47:51 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2025-Apr-28, Tom Lane wrote:
> @@ -135,6 +141,7 @@ typedef struct
> double numIndexTuples; /* number of leaf tuples visited */
> double spc_random_page_cost; /* relevant random_page_cost value */
> double num_sa_scans; /* # indexscans from ScalarArrayOpExprs */
> + BlockNumber numNonLeafPages; /* # of index pages that are not leafs */
> } GenericCosts;
The idea you described seems quite reasonable, though I didn't review
the patch in detail.
I find the use of "leafs" as plural for "leaf" a bit strange ...
We already have uses of that word, but I wonder if they don't mostly
or exclusively come from non-native English speakers.
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Álvaro Herrera <alvherre(at)kurilemu(dot)de> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Accounting for metapages in genericcostestimate() |
Date: | 2025-04-30 13:46:07 |
Message-ID: | [email protected] |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Lists: | pgsql-hackers |
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre(at)kurilemu(dot)de> writes:
> On 2025-Apr-28, Tom Lane wrote:
>> + BlockNumber numNonLeafPages; /* # of index pages that are not leafs */
> I find the use of "leafs" as plural for "leaf" a bit strange ...
> We already have uses of that word, but I wonder if they don't mostly
> or exclusively come from non-native English speakers.
Yeah, "leaves" would be correct, but I wondered whether that'd confuse
non-native speakers more. Happy to change it though.
regards, tom lane