From: Andrei M. <and...@gm...> - 2011-04-13 07:23:40
|
2011/4/13 Ashutosh Bapat <ash...@en...> > > > On Tue, Apr 12, 2011 at 6:42 PM, Andrei Martsinchyk < > and...@gm...> wrote: > >> >> >> 2011/4/12 Ashutosh Bapat <ash...@en...> >> >>> >>> >>> On Tue, Apr 12, 2011 at 5:37 PM, Andrei Martsinchyk < >>> and...@gm...> wrote: >>> >>>> >>>> XC aggregate is implemented by three functions: first function is >>>> invoked on data node once per row and added argument to internal accumulated >>>> value, which is sent to coordinator after group is processed, second >>>> function is invoked on coordinator once per received pre-aggregated row and >>>> combine pre-aggregated values together, third function is invoked on >>>> coordinator per group and convert accumulated value to aggregation result. >>>> Regarding sum() and count(), they have to perform typecast on final >>>> step. >>>> In Postgres sum(int4)int8, accumulating function is defined like >>>> sum_agg(int8, int4):int8; in XC this function performs pre-aggregation. >>>> Combining function is sum_agg(numeric, int8):numeric, Postgres does not have >>>> sum_agg(int8, int8):int8. So XC has to convert numeric to int8 to return >>>> value of declared type, while in Postgres accumulated value can be returned >>>> without conversion. >>>> >>> >>> The plain PG pg_aggregate entries for sum look like >>> postgres=# select * from pg_aggregate where aggfnoid in (select oid from >>> pg_proc where proname = 'sum'); >>> aggfnoid | aggtransfn | aggcollectfn | aggfinalfn | >>> aggsortop | aggtranstype | aggcollecttype | agginitval | agginitcollect >>> >>> ----------------+-------------+--------------+-----------------+-----------+--------------+----------------+------------+---------------- >>> pg_catalog.sum | int8_sum | numeric_add | - | >>> 0 | 1700 | 1700 | | >>> pg_catalog.sum | int4_sum | int8_sum | pg_catalog.int8 | >>> 0 | 20 | 1700 | | >>> pg_catalog.sum | int2_sum | int8_sum | pg_catalog.int8 | >>> 0 | 20 | 1700 | | >>> >>> >>> And the PGXC entries look like >>> testdb=# select * from pg_aggregate where aggfnoid in (select oid from >>> pg_proc where proname = 'sum'); >>> aggfnoid | aggtransfn | aggfinalfn | aggsortop | aggtranstype | >>> agginitval >>> >>> ----------------+-------------+------------+-----------+--------------+------------ >>> pg_catalog.sum | int8_sum | - | 0 | 1700 | >>> pg_catalog.sum | int4_sum | - | 0 | 20 | >>> pg_catalog.sum | int2_sum | - | 0 | 20 | >>> >>> In PG, the sum of integers all result in int8, whereas in PGXC they >>> result into numeric and casted back to int8. May be we should use a new >>> function int8_sum(int8, int8):int8 instead of int8_sum(numeric, >>> int8):numeric. That way we don't need any final function for sum, just like >>> PG. >>> >>> >> Yes, it is possible, I added new functions for some aggregates. But it >> works with existing functions already. >> If something is broken and aggregates do not work as expected the >> workaround will help with sum() and count(), but other aggregates where >> final function is required won't work. >> The root cause should be fixed. >> > > I think, the return type of collection function and transition function > should be same so, that the final function would work in both the cases 1. > final function applied directly over the transition function result - cases > when aggregates can not be pushed to data nodes > > 2. final function is applied over the collection function result - cases > when aggregates are pushed to the data nodes. In first case the aggregates > and group by work the same way as plain PG (applying aggregates row by row) > at coordinator. In second case the aggregates and group by work the XC way > (applying aggregates over the aggregation results of datanodes) at > coordinators. In both the case datanodes won't apply final function but > return the final results of transition functions. > > In all other aggregates except where we see difference between PG and XC, > this method will work since the final function is same in both cases (so > collection function and transition function should have same result types). > Only the differing cases viz. array_agg, string_aggs, count and sum will > need changes in collection function. > > The transition type and collection type are different for some built-in aggregate functions and may be different for user-defined aggregates. If you require them the same you can skip calling collection function if there is result from only one node. I think this improvement too small to limit flexibility. In particular, if aggregation occurs on coordinator only, it is not too much overhead to invoke collection function before final function - collection function would be trivial in that case, just copy the transition value. > >> >>> This will help us set the finalfn_oid in ExecInitAgg() and we will have >>> group by running (albeit slow). This has another impact. The plain >>> aggregates (without any group by) with JOINs are not working currently. For >>> example, query postgres=# select avg(emp.val * dept.val) from emp, dept; >>> returns 0 (1 row with value 0) even if there is some non-zero data in those >>> tables. This is because we do not set finalfn_oid in ExecInitAgg() and the >>> tree for above query looks like >>> AggState(NestedLoop(RemoteQuery (select val from emp), RemoteQuery(select >>> val from dept))). Thus the aggregate is not pushed down to the data node. >>> While finalising the aggregate result, it does not find finalfnoid and thus >>> returns false results. If we can set finalfnoid as done in the attached >>> patch, we will get the group by running albeit suboptimally. >>> >>> >> Aggregates are used to work. Probably something got broken during merge >> with Postgres 9.0.3. >> I have not looked into the latest code, so it is hard to guess what is >> wrong. I will try to find time to take a look. >> >> >>> Any thoughts? >>> >>> I guess in your code one of aggregation steps is missing. >>>> Hope this helps. >>>> >>>> 2011/4/12 Ashutosh Bapat <ash...@en...> >>>> >>>>> Hi, >>>>> I took outputs of query "select aggfnoid, aggfinalfn from pg_aggregate >>>>> where aggfinalfn != 0;" against plain postgres and PGXC. It showed following >>>>> difference >>>>> [ashutosh@anand PG_HEAD]diff /tmp/pgxc_aggfinalfn.out >>>>> /tmp/pg_aggfinalfn.out >>>>> 10,13d9 >>>>> < pg_catalog.sum | pg_catalog.int8 >>>>> < pg_catalog.sum | pg_catalog.int8 >>>>> < pg_catalog.count | pg_catalog.int8 >>>>> < pg_catalog.count | pg_catalog.int8 >>>>> 50d45 >>>>> < regr_count | pg_catalog.int8 >>>>> 62c57,59 >>>>> < (59 rows) >>>>> --- >>>>> > array_agg | array_agg_finalfn >>>>> > string_agg | string_agg_finalfn >>>>> > (56 rows) >>>>> >>>>> >>>>> XC has final functions set for aggregates sum, count whereas plain >>>>> postgres has those. Plain postgres has the final functions for array_agg and >>>>> string_agg but XC does not have those. Why is this difference? >>>>> >>>>> As of now, in XC, for GROUP BY queries, the coordinators receives plain >>>>> data from data nodes, stripped of any aggregates or GROUP BY clause. I was >>>>> trying to use PG mechanism to calculate the aggregates (so as to enable >>>>> group by clauses quickly). It worked for AVG, but for sum it ended up >>>>> calling numeric_int8() because of above entries which hit segfault since the >>>>> data passed to it is not numeric. In that case, it's important to know >>>>> whether those differences are important. NOTE: This won't be the final >>>>> version of GROUP BY support. I am trying to design it such a way that we can >>>>> push GROUP BY down to datanodes. >>>>> >>>>> The changes are added by commit 8326f619. >>>>> >>>>> -- >>>>> Best Wishes, >>>>> Ashutosh Bapat >>>>> EntepriseDB Corporation >>>>> The Enterprise Postgres Company >>>>> >>>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Forrester Wave Report - Recovery time is now measured in hours and >>>>> minutes >>>>> not days. Key insights are discussed in the 2010 Forrester Wave Report >>>>> as >>>>> part of an in-depth evaluation of disaster recovery service providers. >>>>> Forrester found the best-in-class provider in terms of services and >>>>> vision. >>>>> Read this report now! http://p.sf.net/sfu/ibm-webcastpromo >>>>> _______________________________________________ >>>>> Postgres-xc-developers mailing list >>>>> Pos...@li... >>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>> >>>>> >>>> >>>> >>>> -- >>>> Best regards, >>>> Andrei Martsinchyk mailto: >>>> and...@gm... >>>> >>> >>> >>> >>> -- >>> Best Wishes, >>> Ashutosh Bapat >>> EntepriseDB Corporation >>> The Enterprise Postgres Company >>> >>> >> >> >> -- >> Best regards, >> Andrei Martsinchyk mailto:and...@gm... >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Enterprise Postgres Company > > -- Best regards, Andrei Martsinchyk mailto:and...@gm... |