postgres-xc-developers Mailing List for Postgres-XC

Brought to you by: ahsanhadi, amitdkhan, ashutoshbapat, gabbasb, and 3 others

postgres-xc-developers — Postgres-XC hackers and developers

[Postgres-xc-developers] isolation level: Support for serializable done !?

From: Michael P. <mic...@gm...> - 2011-04-12 23:55:19

Hi all,

I just noticed something.
Since I added support for session parameters, the serializable isolation
level appears to be supported in XC.

Example for read committed:
Session 1 (User 1) is connected to Coordinator 1.
Session 2 (User 2) is connected to Coordinator 2.
User 1:
begin;

User 2:
create table aa (a int);
insert into aa values (1),(2);

User 1:
select count(*) from aa;
 => result = 2

User 2:
insert into aa values (3),(4);

User 1:
select count(*) from aa;
 => result = 4
commit;
So far everything is normal: the default is read committed, which means that
each statement in a transaction sees the results of transactions that
committed before that statement started.

Example for serializable:
User 1:
begin;
set transaction isolation level serializable;

User 2:
create table aa (a int);
insert into aa values (1),(2);

User 1:
select count(*) from aa;
 => result = 2

User 2:
insert into aa values (3),(4);

User 1:
select count(*) from aa;
 => result = 2
commit;
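Also correct: with serializable, an open transaction cannot see the results
of transactions that committed after its snapshot was taken. The snapshot is
taken at the first query, not at begin, which is why the first two rows are
visible.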
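
To make the difference between the two examples concrete, here is a minimal Python sketch of the snapshot rule only. It is not XC code: the Table and Transaction classes and the logical clock are hypothetical, and the only assumption modelled is that read committed takes a new snapshot per statement while serializable keeps the one taken at the first statement.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Rows tagged with the logical time at which their insert committed."""
    rows: list = field(default_factory=list)  # list of (commit_time, value)
    clock: int = 0

    def insert_committed(self, values):
        # Each call models one autocommitted INSERT from another session.
        self.clock += 1
        self.rows.extend((self.clock, v) for v in values)

    def count_as_of(self, snapshot_time):
        return sum(1 for t, _ in self.rows if t <= snapshot_time)


class Transaction:
    def __init__(self, table, serializable=False):
        self.table = table
        self.serializable = serializable
        self.snapshot = None  # not taken at begin, only at the first statement

    def select_count(self):
        # Read committed: refresh the snapshot for every statement.
        # Serializable: take it once and reuse it until commit.
        if self.snapshot is None or not self.serializable:
            self.snapshot = self.table.clock
        return self.table.count_as_of(self.snapshot)


def run(serializable):
    aa = Table()
    user1 = Transaction(aa, serializable)  # User 1: begin;
    aa.insert_committed([1, 2])            # User 2: first insert
    first = user1.select_count()           # User 1: select count(*)
    aa.insert_committed([3, 4])            # User 2: second insert
    second = user1.select_count()          # User 1: select count(*)
    return first, second


print(run(serializable=False))  # (2, 4)
print(run(serializable=True))   # (2, 2)
```

The sketch reproduces the counts from both sessions above, but it says nothing about whether the coordinators hand out consistent snapshots, which is the open question.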

Am I missing something, gentlemen?
Could there be some snapshot feed issue?
-- 
Thanks,

Michael Paquier
http://michael.otacoo.com

Re: [Postgres-xc-developers] aggfinalfn differs from plain postgres for some aggregates

From: Andrei M. <and...@gm...> - 2011-04-12 13:12:36

2011/4/12 Ashutosh Bapat <ash...@en...>

> In PG, the sums over int2 and int4 result in int8, whereas in PGXC they are
> accumulated as numeric and cast back to int8. Maybe we should use a new
> function int8_sum(int8, int8):int8 instead of int8_sum(numeric, int8):numeric.
> That way we don't need any final function for sum, just like PG.
>
>
Yes, it is possible; I added new functions for some aggregates. But it
already works with the existing functions.
If something is broken and aggregates do not work as expected, the workaround
will help with sum() and count(), but other aggregates that require a final
function still won't work.
The root cause should be fixed.


> This will help us set finalfn_oid in ExecInitAgg(), and we will have
> GROUP BY running (albeit slowly). This has another impact: plain
> aggregates (without any GROUP BY) with JOINs are not working currently. For
> example, the query
> postgres=# select avg(emp.val * dept.val) from emp, dept;
> returns 0 (1 row with value 0) even if there is some non-zero data in those
> tables. This is because we do not set finalfn_oid in ExecInitAgg(), and the
> tree for the above query looks like
> AggState(NestedLoop(RemoteQuery(select val from emp), RemoteQuery(select
> val from dept))). Thus the aggregate is not pushed down to the data nodes.
> While finalising the aggregate result, the executor does not find
> finalfn_oid and thus returns wrong results. If we can set finalfn_oid as
> done in the attached patch, we will get GROUP BY running, albeit
> suboptimally.
>
>
Aggregates used to work. Probably something got broken during the merge with
Postgres 9.0.3.
I have not looked into the latest code, so it is hard to guess what is
wrong. I will try to find time to take a look.


> Any thoughts?
>

-- 
Best regards,
Andrei Martsinchyk                   mailto:and...@gm...

Re: [Postgres-xc-developers] aggfinalfn differs from plain postgres for some aggregates

From: Ashutosh B. <ash...@en...> - 2011-04-12 12:48:26

Attachments: xc_group_by.patch

On Tue, Apr 12, 2011 at 5:37 PM, Andrei Martsinchyk <
and...@gm...> wrote:

>
> An XC aggregate is implemented by three functions: the first function is
> invoked on the data node once per row and adds its argument to the internal
> accumulated value, which is sent to the coordinator after the group is
> processed; the second function is invoked on the coordinator once per
> received pre-aggregated row and combines the pre-aggregated values; the
> third function is invoked on the coordinator once per group and converts the
> accumulated value to the aggregation result.
> Regarding sum() and count(), they have to perform a typecast in the final
> step. In Postgres, for sum(int4) returning int8, the accumulating function
> is defined like sum_agg(int8, int4):int8; in XC this function performs the
> pre-aggregation. The combining function is sum_agg(numeric, int8):numeric,
> because Postgres does not have sum_agg(int8, int8):int8. So XC has to convert
> numeric to int8 to return a value of the declared type, while in Postgres
> the accumulated value can be returned without conversion.
>

The PGXC pg_aggregate entries for sum (with the XC-specific aggcollectfn,
aggcollecttype and agginitcollect columns) look like
postgres=# select * from pg_aggregate where aggfnoid in (select oid from
pg_proc where proname = 'sum');
    aggfnoid    | aggtransfn  | aggcollectfn |   aggfinalfn    | aggsortop | aggtranstype | aggcollecttype | agginitval | agginitcollect
----------------+-------------+--------------+-----------------+-----------+--------------+----------------+------------+----------------
 pg_catalog.sum | int8_sum    | numeric_add  | -               |         0 |         1700 |           1700 |            |
 pg_catalog.sum | int4_sum    | int8_sum     | pg_catalog.int8 |         0 |           20 |           1700 |            |
 pg_catalog.sum | int2_sum    | int8_sum     | pg_catalog.int8 |         0 |           20 |           1700 |            |


And the plain PG entries look like
testdb=# select * from pg_aggregate where aggfnoid in (select oid from
pg_proc where proname = 'sum');
    aggfnoid    | aggtransfn  | aggfinalfn | aggsortop | aggtranstype | agginitval
----------------+-------------+------------+-----------+--------------+------------
 pg_catalog.sum | int8_sum    | -          |         0 |         1700 |
 pg_catalog.sum | int4_sum    | -          |         0 |           20 |
 pg_catalog.sum | int2_sum    | -          |         0 |           20 |

In PG, the sums over int2 and int4 result in int8, whereas in PGXC they are
accumulated as numeric and cast back to int8. Maybe we should use a new
function int8_sum(int8, int8):int8 instead of int8_sum(numeric, int8):numeric.
That way we don't need any final function for sum, just like PG.

This will help us set finalfn_oid in ExecInitAgg(), and we will have
GROUP BY running (albeit slowly). This has another impact: plain
aggregates (without any GROUP BY) with JOINs are not working currently. For
example, the query
postgres=# select avg(emp.val * dept.val) from emp, dept;
returns 0 (1 row with value 0) even if there is some non-zero data in those
tables. This is because we do not set finalfn_oid in ExecInitAgg(), and the
tree for the above query looks like
AggState(NestedLoop(RemoteQuery(select val from emp), RemoteQuery(select
val from dept))). Thus the aggregate is not pushed down to the data nodes.
While finalising the aggregate result, the executor does not find
finalfn_oid and thus returns wrong results. If we can set finalfn_oid as
done in the attached patch, we will get GROUP BY running, albeit
suboptimally.
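
As a rough illustration of why the final function matters in that plan shape, here is a small Python sketch. It is not the executor code: remote_query, avg_trans, avg_final and agg_over_join are hypothetical stand-ins, and the transition state is assumed to be a simple (total, count) pair. It does not reproduce the exact 0 that XC returns; it only shows that the raw transition state is not the average.

```python
from itertools import product


def remote_query(rows):
    """Hypothetical stand-in for a RemoteQuery node: returns the rows as-is."""
    return list(rows)


def avg_trans(state, value):
    """Transition step: fold one input value into the (total, count) state."""
    total, count = state
    return (total + value, count + 1)


def avg_final(state):
    """Final step: turn the (total, count) state into the average."""
    total, count = state
    return total / count if count else None


def agg_over_join(emp, dept, finalfn=None):
    """Aggregate on the coordinator over a nested loop of two remote scans."""
    state = (0, 0)
    for e, d in product(remote_query(emp), remote_query(dept)):
        state = avg_trans(state, e * d)
    # If no final function is set, the raw transition state is returned.
    return finalfn(state) if finalfn is not None else state


emp_vals, dept_vals = [1, 2], [3, 4]
with_final = agg_over_join(emp_vals, dept_vals, avg_final)
without_final = agg_over_join(emp_vals, dept_vals)
print(with_final)     # 5.25
print(without_final)  # (21, 4)
```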

Any thoughts?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company

Re: [Postgres-xc-developers] aggfinalfn differs from plain postgres for some aggregates

From: Andrei M. <and...@gm...> - 2011-04-12 12:07:19

Hi Ashutosh,

In XC, aggregates were changed to perform pre-aggregation on the data nodes;
that was done to reduce the amount of data sent over the network.
If aggregation happened on the coordinator, each data node would return its
entire result set; instead it returns one row (or one row per group, if
GROUP BY exists).
Naturally, the definitions of aggregate functions in XC were changed.
A Postgres aggregate is implemented by two functions: the first function is
invoked once per row and adds its argument to the internal accumulated value;
the second function is invoked once per group and converts the accumulated
value to the aggregation result.
An XC aggregate is implemented by three functions: the first function is
invoked on the data node once per row and adds its argument to the internal
accumulated value, which is sent to the coordinator after the group is
processed; the second function is invoked on the coordinator once per
received pre-aggregated row and combines the pre-aggregated values; the
third function is invoked on the coordinator once per group and converts the
accumulated value to the aggregation result.
Regarding sum() and count(), they have to perform a typecast in the final
step. In Postgres, for sum(int4) returning int8, the accumulating function is
defined like sum_agg(int8, int4):int8; in XC this function performs the
pre-aggregation. The combining function is sum_agg(numeric, int8):numeric,
because Postgres does not have sum_agg(int8, int8):int8. So XC has to convert
numeric to int8 to return a value of the declared type, while in Postgres the
accumulated value can be returned without conversion.
I guess one of the aggregation steps is missing in your code.
Hope this helps.
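
The three steps can be sketched in Python as follows. This is only an illustration under assumptions, not the XC implementation: all function names are hypothetical, Decimal stands in for numeric, and avg is assumed to carry a (total, count) state.

```python
from decimal import Decimal
from functools import reduce


def sum_trans(acc, value):
    """Data node, once per row: add the row value to the local accumulator."""
    return acc + value


def sum_collect(acc, partial):
    """Coordinator, once per pre-aggregated row: combine partials as numeric."""
    return acc + Decimal(partial)


def sum_final(acc):
    """Coordinator, once per group: cast the numeric accumulator to an integer."""
    return int(acc)


def avg_trans(state, value):
    total, count = state
    return (total + value, count + 1)


def avg_collect(state, partial):
    return (state[0] + partial[0], state[1] + partial[1])


def avg_final(state):
    total, count = state
    return total / count if count else None


def aggregate(node_rows, trans, collect, final, trans_init, collect_init):
    # Step 1: each data node reduces its own rows to a single partial value.
    partials = [reduce(trans, rows, trans_init) for rows in node_rows]
    # Step 2: the coordinator combines one partial per data node.
    combined = reduce(collect, partials, collect_init)
    # Step 3: the coordinator converts the combined state to the result.
    return final(combined)


# Three data nodes, each holding part of the table.
nodes = [[1, 2, 3], [4, 5], [6]]
xc_sum = aggregate(nodes, sum_trans, sum_collect, sum_final, 0, Decimal(0))
xc_avg = aggregate(nodes, avg_trans, avg_collect, avg_final, (0, 0), (0, 0))
print(xc_sum, xc_avg)  # 21 3.5
```

Only three partial values travel to the coordinator here instead of six rows, which is the point of pre-aggregation; dropping step 3 for sum would leave a Decimal where an integer is declared.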

-- 
Best regards,
Andrei Martsinchyk                   mailto:and...@gm...

[Postgres-xc-developers] aggfinalfn differs from plain postgres for some aggregates

From: Ashutosh B. <ash...@en...> - 2011-04-12 10:54:20

Hi,
I ran the query "select aggfnoid, aggfinalfn from pg_aggregate where
aggfinalfn != 0;" against plain postgres and PGXC. The outputs differ as
follows:
[ashutosh@anand PG_HEAD]diff /tmp/pgxc_aggfinalfn.out /tmp/pg_aggfinalfn.out

10,13d9
<  pg_catalog.sum         | pg_catalog.int8
<  pg_catalog.sum         | pg_catalog.int8
<  pg_catalog.count       | pg_catalog.int8
<  pg_catalog.count       | pg_catalog.int8
50d45
<  regr_count             | pg_catalog.int8
62c57,59
< (59 rows)
---
>  array_agg              | array_agg_finalfn
>  string_agg             | string_agg_finalfn
> (56 rows)


XC has final functions set for the aggregates sum and count, whereas plain
postgres does not have those. Plain postgres has final functions for
array_agg and string_agg, but XC does not have those. Why is there this
difference?

As of now, in XC, for GROUP BY queries, the coordinator receives plain data
from the data nodes, stripped of any aggregates or GROUP BY clause. I was
trying to use the PG mechanism to calculate the aggregates (so as to enable
GROUP BY clauses quickly). It worked for AVG, but for sum it ended up calling
numeric_int8() because of the above entries, which hit a segfault since the
data passed to it is not numeric. So it's important to know whether those
differences matter. NOTE: This won't be the final version of GROUP BY
support. I am trying to design it in such a way that we can push GROUP BY
down to the datanodes.

The changes were added by commit 8326f619.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company

1 message has been excluded from this view by a project administrator.
