Re: [Postgres-xc-developers] aggfinalfn differs from plain postgres for some aggregates

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

2011/4/13 Ashutosh Bapat <ash...@en...>

>
>
> On Tue, Apr 12, 2011 at 6:42 PM, Andrei Martsinchyk <
> and...@gm...> wrote:
>
>>
>>
>> 2011/4/12 Ashutosh Bapat <ash...@en...>
>>
>>>
>>>
>>> On Tue, Apr 12, 2011 at 5:37 PM, Andrei Martsinchyk <
>>> and...@gm...> wrote:
>>>
>>>>
>>>> XC aggregate is implemented by three functions: first function is
>>>> invoked on data node once per row and added argument to internal accumulated
>>>> value, which is sent to coordinator after group is processed, second
>>>> function is invoked on coordinator once per received pre-aggregated row and
>>>> combine pre-aggregated values together, third function is invoked on
>>>> coordinator per group and convert accumulated value to aggregation result.
>>>> Regarding sum() and count(), they have to perform typecast on final
>>>> step.
>>>> In Postgres sum(int4)int8, accumulating function is defined like
>>>> sum_agg(int8, int4):int8; in XC this function performs pre-aggregation.
>>>> Combining function is sum_agg(numeric, int8):numeric, Postgres does not have
>>>> sum_agg(int8, int8):int8. So XC has to convert numeric to int8 to return
>>>> value of declared type, while in Postgres accumulated value can be returned
>>>> without conversion.
>>>>
>>>
>>> The plain PG pg_aggregate entries for sum look like
>>> postgres=# select * from pg_aggregate where aggfnoid in (select oid from
>>> pg_proc where proname = 'sum');
>>>     aggfnoid    | aggtransfn  | aggcollectfn |   aggfinalfn    |
>>> aggsortop | aggtranstype | aggcollecttype | agginitval | agginitcollect
>>>
>>> ----------------+-------------+--------------+-----------------+-----------+--------------+----------------+------------+----------------
>>>  pg_catalog.sum | int8_sum    | numeric_add  | -               |
>>> 0 |         1700 |           1700 |            |
>>>  pg_catalog.sum | int4_sum    | int8_sum     | pg_catalog.int8 |
>>> 0 |           20 |           1700 |            |
>>>  pg_catalog.sum | int2_sum    | int8_sum     | pg_catalog.int8 |
>>> 0 |           20 |           1700 |            |
>>>
>>>
>>> And the PGXC entries look like
>>> testdb=# select * from pg_aggregate where aggfnoid in (select oid from
>>> pg_proc where proname = 'sum');
>>>     aggfnoid    | aggtransfn  | aggfinalfn | aggsortop | aggtranstype |
>>> agginitval
>>>
>>> ----------------+-------------+------------+-----------+--------------+------------
>>>  pg_catalog.sum | int8_sum    | -          |         0 |         1700 |
>>>  pg_catalog.sum | int4_sum    | -          |         0 |           20 |
>>>  pg_catalog.sum | int2_sum    | -          |         0 |           20 |
>>>
>>> In PG, the sum of integers all result in int8, whereas in PGXC they
>>> result into numeric and casted back to int8. May be we should use a new
>>> function int8_sum(int8, int8):int8 instead of int8_sum(numeric,
>>> int8):numeric. That way we don't need any final function for sum, just like
>>> PG.
>>>
>>>
>> Yes, it is possible, I added new functions for some aggregates. But it
>> works with existing functions already.
>> If something is broken and aggregates do not work as expected the
>> workaround will help with sum() and count(), but other aggregates where
>> final function is required won't work.
>> The root cause should be fixed.
>>
>
> I think, the return type of collection function and transition function
> should be same so, that the final function would work in both the cases 1.
> final function applied directly over the transition function result - cases
> when aggregates can not be pushed to data nodes
>
> 2. final function is applied over the collection function result - cases
> when aggregates are pushed to the data nodes. In first case the aggregates
> and group by work the same way as plain PG (applying aggregates row by row)
> at coordinator. In second case the aggregates and group by work the XC way
> (applying aggregates over the aggregation results of datanodes) at
> coordinators. In both the case datanodes won't apply final function but
> return the final results of transition functions.
>
> In all other aggregates except where we see difference between PG and XC,
> this method will work since the final function is same in both cases (so
> collection function and transition function should have same result types).
> Only the differing cases viz. array_agg, string_aggs, count and sum will
> need changes in collection function.
>
>

The transition type and collection type are different for some built-in
aggregate functions and may be different for user-defined aggregates. If you
require them the same you can skip calling collection function if there is
result from only one node. I think this improvement too small to limit
flexibility. In particular, if aggregation occurs on coordinator only, it is
not too much overhead to invoke collection function before final function -
collection function would be trivial in that case, just copy the transition
value.

>
>>
>>> This will help us set the finalfn_oid in ExecInitAgg() and we will have
>>> group by running (albeit slow). This has another impact. The plain
>>> aggregates (without any group by) with JOINs are not working currently. For
>>> example, query postgres=# select avg(emp.val * dept.val) from emp, dept;
>>> returns 0 (1 row with value 0) even if there is some non-zero data in those
>>> tables. This is because we do not set finalfn_oid in ExecInitAgg() and the
>>> tree for above query looks like
>>> AggState(NestedLoop(RemoteQuery (select val from emp), RemoteQuery(select
>>> val from dept))). Thus the aggregate is not pushed down to the data node.
>>> While finalising the aggregate result, it does not find finalfnoid and thus
>>> returns false results. If we can set finalfnoid as done in the attached
>>> patch, we will get the group by running albeit suboptimally.
>>>
>>>
>> Aggregates are used to work. Probably something got broken during merge
>> with Postgres 9.0.3.
>> I have not looked into the latest code, so it is hard to guess what is
>> wrong. I will try to find time to take a look.
>>
>>
>>> Any thoughts?
>>>
>>>  I guess in your code one of aggregation steps is missing.
>>>> Hope this helps.
>>>>
>>>> 2011/4/12 Ashutosh Bapat <ash...@en...>
>>>>
>>>>> Hi,
>>>>> I took outputs of query "select aggfnoid, aggfinalfn from pg_aggregate
>>>>> where aggfinalfn != 0;" against plain postgres and PGXC. It showed following
>>>>> difference
>>>>> [ashutosh@anand PG_HEAD]diff /tmp/pgxc_aggfinalfn.out
>>>>> /tmp/pg_aggfinalfn.out
>>>>> 10,13d9
>>>>> <  pg_catalog.sum         | pg_catalog.int8
>>>>> <  pg_catalog.sum         | pg_catalog.int8
>>>>> <  pg_catalog.count       | pg_catalog.int8
>>>>> <  pg_catalog.count       | pg_catalog.int8
>>>>> 50d45
>>>>> <  regr_count             | pg_catalog.int8
>>>>> 62c57,59
>>>>> < (59 rows)
>>>>> ---
>>>>> >  array_agg              | array_agg_finalfn
>>>>> >  string_agg             | string_agg_finalfn
>>>>> > (56 rows)
>>>>>
>>>>>
>>>>> XC has final functions set for aggregates sum, count whereas plain
>>>>> postgres has those. Plain postgres has the final functions for array_agg and
>>>>> string_agg but XC does not have those. Why is this difference?
>>>>>
>>>>> As of now, in XC, for GROUP BY queries, the coordinators receives plain
>>>>> data from data nodes, stripped of any aggregates or GROUP BY clause. I was
>>>>> trying to use PG mechanism to calculate the aggregates (so as to enable
>>>>> group by clauses quickly). It worked for AVG, but for sum it ended up
>>>>> calling numeric_int8() because of above entries which hit segfault since the
>>>>> data passed to it is not numeric. In that case, it's important to know
>>>>> whether those differences are important. NOTE: This won't be the final
>>>>> version of GROUP BY support. I am trying to design it such a way that we can
>>>>> push GROUP BY down to datanodes.
>>>>>
>>>>> The changes are added by commit 8326f619.
>>>>>
>>>>> --
>>>>> Best Wishes,
>>>>> Ashutosh Bapat
>>>>> EntepriseDB Corporation
>>>>> The Enterprise Postgres Company
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Forrester Wave Report - Recovery time is now measured in hours and
>>>>> minutes
>>>>> not days. Key insights are discussed in the 2010 Forrester Wave Report
>>>>> as
>>>>> part of an in-depth evaluation of disaster recovery service providers.
>>>>> Forrester found the best-in-class provider in terms of services and
>>>>> vision.
>>>>> Read this report now!  http://p.sf.net/sfu/ibm-webcastpromo
>>>>> _______________________________________________
>>>>> Postgres-xc-developers mailing list
>>>>> Pos...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>> Andrei Martsinchyk                   mailto:
>>>> and...@gm...
>>>>
>>>
>>>
>>>
>>> --
>>> Best Wishes,
>>> Ashutosh Bapat
>>> EntepriseDB Corporation
>>> The Enterprise Postgres Company
>>>
>>>
>>
>>
>> --
>> Best regards,
>> Andrei Martsinchyk                   mailto:and...@gm...
>>
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>
>

-- 
Best regards,
Andrei Martsinchyk                   mailto:and...@gm...