Fix ndistinct estimates with system attributes
author Tomas Vondra <[email protected]>
Fri, 26 Mar 2021 21:34:53 +0000 (22:34 +0100)
committer Tomas Vondra <[email protected]>
Fri, 26 Mar 2021 21:46:15 +0000 (22:46 +0100)

When estimating the number of groups using extended statistics, the code
was discarding information about system attributes. This led to the
strange situation that

    SELECT 1 FROM t GROUP BY ctid;

could produce a higher estimate (equal to pg_class.reltuples) than

    SELECT 1 FROM t GROUP BY a, b, ctid;

with extended statistics on (a,b). Fixed by retaining information about
the system attribute.
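
The change in filtering can be illustrated with a minimal standalone sketch.
It is not the PostgreSQL code: is_user_attr(), is_matched(), filter_old() and
filter_new() are hypothetical stand-ins for AttrNumberIsForUserDefinedAttr(),
bms_is_member() and the loop in estimate_multivariate_ndistinct(), and plain
arrays replace the varinfos list and the matched bitmapset. It assumes only
that system attributes such as ctid have negative attribute numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for AttrNumberIsForUserDefinedAttr(). */
static bool is_user_attr(int attnum)
{
    return attnum > 0;
}

/* Hypothetical stand-in for bms_is_member(); matched[] is indexed by attnum. */
static bool is_matched(int attnum, const bool *matched, size_t nmatched)
{
    return attnum >= 0 && (size_t) attnum < nmatched && matched[attnum];
}

/* Old behavior: system attributes are dropped along with matched ones. */
static size_t filter_old(const int *attnums, size_t n,
                         const bool *matched, size_t nmatched, int *out)
{
    size_t kept = 0;

    assert(attnums != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (!is_user_attr(attnums[i]))
            continue;           /* system attribute silently discarded */
        if (!is_matched(attnums[i], matched, nmatched))
            out[kept++] = attnums[i];
    }
    return kept;
}

/* New behavior: only user attributes covered by the statistic are dropped. */
static size_t filter_new(const int *attnums, size_t n,
                         const bool *matched, size_t nmatched, int *out)
{
    size_t kept = 0;

    assert(attnums != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (is_user_attr(attnums[i]) &&
            is_matched(attnums[i], matched, nmatched))
            continue;           /* already accounted for by the statistic */
        out[kept++] = attnums[i];
    }
    return kept;
}
```

With attribute numbers {1, 2, -1} standing for (a, b, ctid) and a statistic
matching attributes 1 and 2, filter_old() keeps nothing, so ctid would no
longer contribute to the group estimate, while filter_new() keeps -1 for the
remaining estimation steps.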

Backpatch all the way to 10, where extended statistics were introduced.

Author: Tomas Vondra
Backpatch-through: 10

src/backend/utils/adt/selfuncs.c

index e128dccb04b497eb56e5dc1a988b92cd8eb6c10a..c2f5f948b06bcfd667eb13656adb29ddccdac94d 100644
@@ -3851,11 +3851,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 
            attnum = ((Var *) varinfo->var)->varattno;
 
-           if (!AttrNumberIsForUserDefinedAttr(attnum))
+           if (AttrNumberIsForUserDefinedAttr(attnum) &&
+               bms_is_member(attnum, matched))
                continue;
 
-           if (!bms_is_member(attnum, matched))
-               newlist = lappend(newlist, varinfo);
+           newlist = lappend(newlist, varinfo);
        }
 
        *varinfos = newlist;