From: Ashutosh B. <ash...@en...> - 2013-03-18 10:12:02
|
Ok, I think it's better to leave distributed as distributed and handle each separately.

On Mon, Mar 18, 2013 at 2:02 PM, Amit Khandekar <ami...@en...> wrote:

> On 8 March 2013 14:00, Ashutosh Bapat <ash...@en...> wrote:
>
>> Hi Amit,
>> Please find my replies inlined,
>>
>>> I think the logic of shippability of outer joins is flawless. Didn't
>>> find any holes. Patch comments below:
>>>
>>> -------
>>>
>>> In case of distributed equi-join case, why is
>>> IsExecNodesColumnDistributed() used instead of
>>> IsExecNodesDistributedByValue()? We want to always rule out the round
>>> robin case, no? I can see that pgxc_find_dist_equijoin_qual() will
>>> always fail for round robin tables because they won't have any distrib
>>> columns, but still, just curious ...
>>
>> It keeps open the possibility that we will be able to ship equi-join if
>> we can somehow infer that the rows from both the sides of join,
>> participating in the result of join are collocated.
>>
>>> -------
>>>
>>>  * PGXC_TODO: What do we do when baselocatortype is
>>>  * LOCATOR_TYPE_DISTRIBUTED? It could be anything HASH distributed or
>>>  * MODULO distributed. In that case, having equi-join doesn't work
>>>  * really, because same value from different relation will go to
>>>  * different node.
>>>
>>> The above comment says that it does not work if one of the tables is
>>> distributed by hash and other table is distributed by modulo. But the
>>> code is actually checking the baselocatortype also, so I guess it
>>> works correctly after all? I did not get what is the TODO here. Or
>>> does it mean this?:
>>> For (t1_hash join t2_hash on ...) tj1 join (t1_mod join t2_mod on ...)
>>> tj2 on tj1.col1 = tj2.col4
>>> the merged nodes for tj1 will have LOCATOR_TYPE_DISTRIBUTED, and the
>>> merged nodes for tj2 will also be LOCATOR_TYPE_DISTRIBUTED, and so tj1
>>> join tj2 would be wrongly marked shippable even though they should not
>>> be shippable because of the mix of hash and modulo?
>>
>> That's correct. This should be taken care of by my second patch up for
>> review. I think with that patch, we won't need LOCATOR_TYPE_DISTRIBUTED.
>> While reviewing that patch, can you please also review if this is true.
>>
>>> -------
>>>
>>> Is pgxc_is_expr_shippable(equi_join_expr) necessary? Won't this qual
>>> be examined in is_query_shippable() walker?
>>
>> This code will get executed in standard_planner() as well, so it's
>> possible that some of the join quals will be shippable and some are not.
>> While this is fine for an inner join, we want to make sure that a qual
>> which implies collocation of rows is shippable. This check is more from
>> future extension perspective than anything else.
>
> Ok. Understood all the comments above.
>
>>> --------
>>>
>>> If both tables reside on a single datanode, every join case should be
>>> shippable, which doesn't seem to be happening:
>>> postgres=# create table tab2 (id2 int, v varchar) distribute by replication to node (datanode_1);
>>> postgres=# create table tab1 (id1 int, v varchar) to node (datanode_1);
>>> postgres=# explain select * from (tab1 full outer join tab2 on id1 = id2);
>>>                                            QUERY PLAN
>>> -------------------------------------------------------------------------------------------------
>>>  Hash Full Join  (cost=0.12..0.26 rows=10 width=72)
>>>    Hash Cond: (tab1.id1 = tab2.id2)
>>>    ->  Data Node Scan on tab1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=36)
>>>          Node/s: datanode_1
>>>    ->  Hash  (cost=0.00..0.00 rows=1000 width=36)
>>>          ->  Data Node Scan on tab2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=36)
>>>                Node/s: datanode_1
>>>
>>> Probably you need to take the following statement out of the
>>> distributed case and apply it as a general rule:
>>>     /* If there is only single node, try merging the nodes */
>>>     if (list_length(inner_en->nodeList) == 1 &&
>>>         list_length(outer_en->nodeList) == 1)
>>>         merge_nodes = true;
>>
>> I am thinking about this and actually thought that we should mark a
>> single node ExecNodes as REPLICATED, so that it doesn't need any special
>> handling. What do you think?
>
> I am concerned about loss of information that the underlying table is
> actually distributed. Also, there is a function
> IsReturningDMLOnReplicatedTable() which is using this information, although
> I am not sure how much it's making use of that information. I leave it to
> you to decide which option to choose. I personally feel it's always good to
> be explicit while checking for this condition.
>
>>> > _______________________________________________
>>> > Postgres-xc-developers mailing list
>>> > Pos...@li...
>>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>> EnterpriseDB Corporation
>> The Enterprise Postgres Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
|
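The two options weighed in this message (relabel a single-node ExecNodes as REPLICATED, or keep it distributed and test for the single-node case explicitly) can be illustrated with a small sketch of the explicit variant. Everything below is hypothetical: `SketchLocator`, `SketchExecNodes` and `sketch_single_node_join_ok()` are simplified stand-ins invented for this illustration, not the Postgres-XC structures or the code in the patch under review. Comparing the node ids is also an assumption of the sketch; the snippet quoted in the thread only tests the list lengths before trying to merge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for illustration only;
 * these are not the real Postgres-XC ExecNodes or locator definitions. */
typedef enum {
    SKETCH_REPLICATED,
    SKETCH_HASH,
    SKETCH_MODULO,
    SKETCH_ROUND_ROBIN
} SketchLocator;

typedef struct {
    SketchLocator locator;  /* how the underlying relation is distributed */
    const int    *nodes;    /* ids of the datanodes holding its rows */
    size_t        nnodes;   /* number of entries in nodes */
} SketchExecNodes;

/*
 * Explicit single-node rule: if each side lives on exactly one datanode and
 * it is the same datanode, every row that can take part in the join is
 * already on that node, whatever the locator type or join type.
 * The locator field is only read, never rewritten, so the information that
 * a relation is distributed is preserved for later checks.
 */
bool
sketch_single_node_join_ok(const SketchExecNodes *outer,
                           const SketchExecNodes *inner)
{
    assert(outer != NULL && inner != NULL);

    return outer->nnodes == 1 &&
           inner->nnodes == 1 &&
           outer->nodes[0] == inner->nodes[0];
}
```

With a distributed `tab1` and a replicated `tab2` both placed on the same single node, as in the example from the review, this check would return true without touching either locator value.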
From: Amit K. <ami...@en...> - 2013-03-18 10:03:27
|
On 8 March 2013 14:00, Ashutosh Bapat <ash...@en...> wrote:

> Hi Amit,
> Please find my replies inlined,
>
>> I think the logic of shippability of outer joins is flawless. Didn't
>> find any holes. Patch comments below:
>>
>> -------
>>
>> In case of distributed equi-join case, why is
>> IsExecNodesColumnDistributed() used instead of
>> IsExecNodesDistributedByValue()? We want to always rule out the round
>> robin case, no? I can see that pgxc_find_dist_equijoin_qual() will
>> always fail for round robin tables because they won't have any distrib
>> columns, but still, just curious ...
>
> It keeps open the possibility that we will be able to ship equi-join if we
> can somehow infer that the rows from both the sides of join, participating
> in the result of join are collocated.
>
>> -------
>>
>>  * PGXC_TODO: What do we do when baselocatortype is
>>  * LOCATOR_TYPE_DISTRIBUTED? It could be anything HASH distributed or
>>  * MODULO distributed. In that case, having equi-join doesn't work
>>  * really, because same value from different relation will go to
>>  * different node.
>>
>> The above comment says that it does not work if one of the tables is
>> distributed by hash and other table is distributed by modulo. But the
>> code is actually checking the baselocatortype also, so I guess it
>> works correctly after all? I did not get what is the TODO here. Or
>> does it mean this?:
>> For (t1_hash join t2_hash on ...) tj1 join (t1_mod join t2_mod on ...)
>> tj2 on tj1.col1 = tj2.col4
>> the merged nodes for tj1 will have LOCATOR_TYPE_DISTRIBUTED, and the
>> merged nodes for tj2 will also be LOCATOR_TYPE_DISTRIBUTED, and so tj1
>> join tj2 would be wrongly marked shippable even though they should not
>> be shippable because of the mix of hash and modulo?
>
> That's correct. This should be taken care of by my second patch up for
> review. I think with that patch, we won't need LOCATOR_TYPE_DISTRIBUTED.
> While reviewing that patch, can you please also review if this is true.
>
>> -------
>>
>> Is pgxc_is_expr_shippable(equi_join_expr) necessary? Won't this qual
>> be examined in is_query_shippable() walker?
>
> This code will get executed in standard_planner() as well, so it's
> possible that some of the join quals will be shippable and some are not.
> While this is fine for an inner join, we want to make sure that a qual which
> implies collocation of rows is shippable. This check is more from future
> extension perspective than anything else.

Ok. Understood all the comments above.

>> --------
>>
>> If both tables reside on a single datanode, every join case should be
>> shippable, which doesn't seem to be happening:
>> postgres=# create table tab2 (id2 int, v varchar) distribute by replication to node (datanode_1);
>> postgres=# create table tab1 (id1 int, v varchar) to node (datanode_1);
>> postgres=# explain select * from (tab1 full outer join tab2 on id1 = id2);
>>                                            QUERY PLAN
>> -------------------------------------------------------------------------------------------------
>>  Hash Full Join  (cost=0.12..0.26 rows=10 width=72)
>>    Hash Cond: (tab1.id1 = tab2.id2)
>>    ->  Data Node Scan on tab1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=36)
>>          Node/s: datanode_1
>>    ->  Hash  (cost=0.00..0.00 rows=1000 width=36)
>>          ->  Data Node Scan on tab2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=36)
>>                Node/s: datanode_1
>>
>> Probably you need to take the following statement out of the
>> distributed case and apply it as a general rule:
>>     /* If there is only single node, try merging the nodes */
>>     if (list_length(inner_en->nodeList) == 1 &&
>>         list_length(outer_en->nodeList) == 1)
>>         merge_nodes = true;
>
> I am thinking about this and actually thought that we should mark a single
> node ExecNodes as REPLICATED, so that it doesn't need any special handling.
> What do you think?

I am concerned about loss of information that the underlying table is actually distributed. Also, there is a function IsReturningDMLOnReplicatedTable() which is using this information, although I am not sure how much it's making use of that information. I leave it to you to decide which option to choose. I personally feel it's always good to be explicit while checking for this condition.

> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Enterprise Postgres Company
|
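The PGXC_TODO discussed above turns on why a generic "distributed" label is too coarse for equi-join shippability: a hash-distributed relation and a modulo-distributed relation can send the same join value to different nodes. The following is a minimal sketch of that point, under loud assumptions: `DemoLocator`, `demo_hash()`, `demo_node_for_value()` and `demo_equijoin_collocated()` are hypothetical names, the hash is a toy multiplicative hash rather than the one Postgres-XC uses, and both relations are assumed to share the same node list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical locator kinds for illustration; not the Postgres-XC enum. */
typedef enum { DEMO_HASH, DEMO_MODULO, DEMO_ROUND_ROBIN } DemoLocator;

/* Toy multiplicative hash; a stand-in, not the hash Postgres-XC uses. */
static uint32_t
demo_hash(uint32_t value)
{
    return (value * 2654435761u) >> 16;
}

/* Index of the node a row with this distribution value would be sent to.
 * Only meaningful for by-value distributions (hash or modulo). */
uint32_t
demo_node_for_value(DemoLocator kind, uint32_t value, uint32_t nnodes)
{
    assert(nnodes > 0);
    assert(kind == DEMO_HASH || kind == DEMO_MODULO);

    return (kind == DEMO_HASH ? demo_hash(value) : value) % nnodes;
}

/*
 * An equi-join on the distribution columns keeps matching rows on the same
 * node only if both sides map values to nodes with the same rule.
 * Round robin has no distribution column, so it never qualifies.
 * Assumes both sides use the same node list (an assumption of this sketch).
 */
bool
demo_equijoin_collocated(DemoLocator outer, DemoLocator inner)
{
    if (outer == DEMO_ROUND_ROBIN || inner == DEMO_ROUND_ROBIN)
        return false;
    return outer == inner;
}
```

Collapsing `DEMO_HASH` and `DEMO_MODULO` into one "distributed" value would make `demo_equijoin_collocated()` unable to tell the two apart, which is the wrongly-shippable `tj1 join tj2` case described in the review comment.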