You can subscribe to this list here.
Messages by month:

Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec
---|---|---|---|---|---|---|---|---|---|---|---|---
2010 |  |  |  | 4 | 28 | 12 | 11 | 12 | 5 | 19 | 14 | 12
2011 | 18 | 30 | 115 | 89 | 50 | 44 | 22 | 13 | 11 | 30 | 28 | 39
2012 | 38 | 18 | 43 | 91 | 108 | 46 | 37 | 44 | 33 | 29 | 36 | 15
2013 | 35 | 611 | 5 | 55 | 30 | 28 | 458 | 34 | 9 | 39 | 22 | 32
2014 | 16 | 16 | 42 | 179 | 7 | 6 | 9 |  | 4 |  | 3 |
2015 |  |  |  | 2 | 4 |  |  |  |  |  |  |
From: Michael P. <mic...@us...> - 2010-10-28 02:43:46
Project "website". The branch, master has been updated
  via 950cd623ce6a04e4a7b1b10b9fca80dfabb10805 (commit)
 from a8d02a5bfcd0d25ed17425b7d69ed6e3b720a314 (commit)

- Log -----------------------------------------------------------------
commit 950cd623ce6a04e4a7b1b10b9fca80dfabb10805
Author: Michael P <mic...@us...>
Date:   Thu Oct 28 11:44:27 2010 +0900

    Addition of Configurator code

diff --git a/download.html b/download.html
index fb16798..9e1a297 100755
--- a/download.html
+++ b/download.html
@@ -149,6 +149,15 @@ Description of Postgres-XC cluster-wide configurator.
 </a>
 </li>
+<li>
+<code>pgxc_config_v0_9_3.tar.gz</code>: <br>
+Postgres-XC configurator. Written in Ruby and recommended to set up easily a Postgres-XC environment.
+⇒
+<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/pgxc_config_v0_9_3.tar.gz/download">
+(download)
+</a>
+</li>
+
 <!-- Architecture Document -->
 <li>
 <code>PG-XC_Architecture_v0_9.pdf</code>: <br>
-----------------------------------------------------------------------

Summary of changes:
 download.html |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

hooks/post-receive
--
website
From: Michael P. <mic...@us...> - 2010-10-28 02:05:13
Project "website". The branch, master has been updated
  via a8d02a5bfcd0d25ed17425b7d69ed6e3b720a314 (commit)
 from e7e3e50f49fc4e8aad9b69364b263bb0b8c4b18c (commit)

- Log -----------------------------------------------------------------
commit a8d02a5bfcd0d25ed17425b7d69ed6e3b720a314
Author: Michael P <mic...@us...>
Date:   Thu Oct 28 11:05:33 2010 +0900

    Addition of Configurator Manual as document for 0.9.3 release

diff --git a/download.html b/download.html
index 79dcd2f..fb16798 100755
--- a/download.html
+++ b/download.html
@@ -140,6 +140,15 @@ SQL restrictions available for Postgres-XC 0.9.3.
 </a>
 </li>
+<li>
+<code>PG-XC_Configurator_v0_9_3.pdf</code>: <br>
+Description of Postgres-XC cluster-wide configurator.
+⇒
+<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_Configurator_v0_9_3.pdf/download">
+(download)
+</a>
+</li>
+
 <!-- Architecture Document -->
 <li>
 <code>PG-XC_Architecture_v0_9.pdf</code>: <br>
-----------------------------------------------------------------------

Summary of changes:
 download.html |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

hooks/post-receive
--
website
From: Michael P. <mic...@us...> - 2010-10-28 01:46:25
Project "website". The branch, master has been updated via e7e3e50f49fc4e8aad9b69364b263bb0b8c4b18c (commit) via d365e89c8dcd4231fe15b634df6e0b6cb62f73d4 (commit) via eaef610bc5a85efb93ce617e0bf8e19ad70ffc9f (commit) via 3dbebc282d6facdde1589cbf97c4c580a52185dd (commit) from d56fffa616e25f0b5c35e1e517855113a7abbad3 (commit) - Log ----------------------------------------------------------------- commit e7e3e50f49fc4e8aad9b69364b263bb0b8c4b18c Author: Michael P <mic...@us...> Date: Thu Oct 28 10:46:24 2010 +0900 Reformulation in Roadmap diff --git a/roadmap.html b/roadmap.html index 7fb6ab4..50aa661 100755 --- a/roadmap.html +++ b/roadmap.html @@ -53,7 +53,7 @@ SQL Limitations </a> document for further details. There is no support yet for <code>SELECT</code> in <code>FROM</code> clause. </p> -<p>We will be expanding the coverage of supported SQL and High-Availability (HA) features as main guidelines in the coming months.</p> +<p>We will be expanding the coverage of supported SQL and as well High-Availability (HA) features in the coming months.</p> <!-- ==== Planned feature === --> <h3> Upcoming Releases and Features commit d365e89c8dcd4231fe15b634df6e0b6cb62f73d4 Author: Michael P <mic...@us...> Date: Thu Oct 28 10:45:31 2010 +0900 Code cleaning in Roadmap diff --git a/roadmap.html b/roadmap.html index 2ccad3b..7fb6ab4 100755 --- a/roadmap.html +++ b/roadmap.html @@ -32,10 +32,10 @@ At present, Postgres-XC provides major transaction management features similar to PostgreSQL, except for savepoints. </p> <p> -On the other hand, Postgres-XC needs to enhance support for general statements.<br> +On the other hand, Postgres-XC needs to enhance support for general statements.<br /> As of Version 0.9.3, Postgres-XC supports statements which can be executed -on a single data node, or on multiple nodes for single and multi step.<br> -This new version adds support for: +on a single data node, or on multiple nodes for single and multi step.<br /> +This new version adds support for:<br /> - Cursor Support<br /> - Basic cross-node operation<br /> - Global timestamp<br /> @@ -89,20 +89,20 @@ Version 1.0 (Late in December, 2010) </h4> <p class="inner"> -Physical backup/restore incl. PITR<br> -Cross-node oepration optimization<br> -More variety of statements such as <code>SELECT</code> in <code>INSERT</code><br> -Full support Prepared statements and cluster-wide recovery<br> +Physical backup/restore incl. 
PITR<br /> +Cross-node oepration optimization<br /> +More variety of statements such as <code>SELECT</code> in <code>INSERT</code><br /> +Full support Prepared statements and cluster-wide recovery<br /> HA Capability<br /> -General aggregate functions<br> -Savepoint<br> -Session Parameters<br> -Forward cursor with <code>ORDER BY</code><br> -Backward cursor<br> -Batch, statement pushdown<br> -Global constraints<br> -Tuple relocation (distrubute key update)<br> -Performance improvement <br> +General aggregate functions<br /> +Savepoint<br /> +Session Parameters<br /> +Forward cursor with <code>ORDER BY</code><br /> +Backward cursor<br /> +Batch, statement pushdown<br /> +Global constraints<br /> +Tuple relocation (distrubute key update)<br /> +Performance improvement <br /> Regression tests </p> commit eaef610bc5a85efb93ce617e0bf8e19ad70ffc9f Author: Michael P <mic...@us...> Date: Thu Oct 28 10:42:44 2010 +0900 Version release roadmap updated diff --git a/roadmap.html b/roadmap.html index 4f8802e..2ccad3b 100755 --- a/roadmap.html +++ b/roadmap.html @@ -24,35 +24,36 @@ Postgres-XC Roadmap <!-- ==== Current Limintation ==== --> <h3> -Current Limitation of Postgres-XC +Current Limitations of Postgres-XC </h3> <p> At present, Postgres-XC provides major transaction management features -similar to PostgreSQL, except for two phase commit (2PC) and savepoints. -(XC uses 2PC for internal use). +similar to PostgreSQL, except for savepoints. </p> <p> On the other hand, Postgres-XC needs to enhance support for general statements.<br> -As of Version 0.9.2, Postgres-XC supports statements which can be executed -on a single data node, or on multiple nodes but as a single step.<br> +As of Version 0.9.3, Postgres-XC supports statements which can be executed +on a single data node, or on multiple nodes for single and multi step.<br> This new version adds support for: -- views<br> -- extra DDLs<br> -- ORDER BY/DISTINCT<br> -- pg_dump, pg_restore<br> -- sequence full support with GTM<br> -- basic stored function support.<br> -- Cold synchronization of Coordinator's Catalog files<br> -However there are some limitations please refer to <a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_SQL_Limitations_v0_9_2.pdf/download" target="_blank"> +- Cursor Support<br /> +- Basic cross-node operation<br /> +- Global timestamp<br /> +- DDL synchronisation<br /> +- Cluster-wide installer<br /> +- Cluster-wide operation utilities<br /> +- Driver support (ECPG, JDBC, PHP, etc.)<br /> +- Extended Query Protocol (for JDBC)<br /> +- Support of external 2PC from application<br /> + +However there are some limitations please refer to <a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_SQL_Limitations_v0_9_3.pdf/download" target="_blank"> SQL Limitations </a> document for further details. </p> <p> There is no support yet for <code>SELECT</code> in <code>FROM</code> clause. -Support for <code>CURSOR</code> is a future issue too. 
</p> -<p>We will be expanding the coverage of supported SQL as an item of particular focus in the coming months.</p> +<p>We will be expanding the coverage of supported SQL and High-Availability (HA) features as main guidelines in the coming months.</p> <!-- ==== Planned feature === --> <h3> Upcoming Releases and Features @@ -63,7 +64,7 @@ Current plan of future releases and features are as follows: </p> <!-- ==== For version 0.9.3 ==== --> -<h4> +<!-- <h4> Version 0.9.3 (Late in September, 2010) </h4> @@ -80,7 +81,7 @@ Global timestamp<br> Driver support (ECPG, JDBC, PHP, etc.)<br> Forward Cursor (w/o <code>ORDER BY</code>)<br> subqueries<br> -</p> +</p> --> <!-- ==== For Version 1.0 ==== --> <h4> @@ -91,17 +92,15 @@ Version 1.0 (Late in December, 2010) Physical backup/restore incl. PITR<br> Cross-node oepration optimization<br> More variety of statements such as <code>SELECT</code> in <code>INSERT</code><br> -Prepared statements<br> -General aggregate functions<br> +Full support Prepared statements and cluster-wide recovery<br> +HA Capability<br /> +General aggregate functions<br> Savepoint<br> Session Parameters<br> -2PC from Apps<br> Forward cursor with <code>ORDER BY</code><br> Backward cursor<br> Batch, statement pushdown<br> -Caralog synchronize with DDLs<br> -Trigger<br> -GLobal constraints<br> +Global constraints<br> Tuple relocation (distrubute key update)<br> Performance improvement <br> Regression tests @@ -113,8 +112,9 @@ Beyond Version 1.0 </h4> <p class="inner"> -HA Capability<br> -GTM-Standby<br> +HA Capability<br /> +GTM-Standby<br /> +Trigger<br /> </p> </body> commit 3dbebc282d6facdde1589cbf97c4c580a52185dd Author: Michael P <mic...@us...> Date: Thu Oct 28 10:32:16 2010 +0900 Event page updated with 2010 and 2011 upcoming events diff --git a/events.html b/events.html index 1cb57b5..cc3a412 100755 --- a/events.html +++ b/events.html @@ -13,7 +13,12 @@ --> <h2 class="plain">Events</h2> <p class="plain"> -Upcoming events to be decided soon! +A lot of opportunities to meet the Core developpers!! +<ul> +<li><a href="http://2010.pgday.eu/" target="_blank">PGDay-EU</a> in November 2010</li> +<li>PG-East in March 2011</li> +<li>PG-Con 2010 in May 2011</li> +</ul> </p> <!-- Event title --> @@ -30,10 +35,10 @@ Description of this event. UPDATES --> <h2 class="plain">Updates</h2> -<!-- Postgres-XC 0.9.2 download --> +<!-- Postgres-XC 0.9.3 download --> <p class="plain"> -Postgres-XC 0.9.2 is now available!! Download -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/pgxc_v0_9_2.tar.gz/download" target="_blank"> +Postgres-XC 0.9.3 is now available!! Download +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/pgxc_v0_9_3.tar.gz/download" target="_blank"> here. </a> </p> ----------------------------------------------------------------------- Summary of changes: events.html | 13 +++++++--- roadmap.html | 74 +++++++++++++++++++++++++++++----------------------------- 2 files changed, 46 insertions(+), 41 deletions(-) hooks/post-receive -- website |
From: Michael P. <mic...@us...> - 2010-10-28 01:16:50
Project "website". The branch, master has been updated via d56fffa616e25f0b5c35e1e517855113a7abbad3 (commit) from 8c2cfec1e2cf6263a2a1bbea33d09643fb6a942a (commit) - Log ----------------------------------------------------------------- commit d56fffa616e25f0b5c35e1e517855113a7abbad3 Author: Michael P <mic...@us...> Date: Thu Oct 28 10:17:07 2010 +0900 Update download list according to 0.9.3 release documents diff --git a/download.html b/download.html index 9bb550d..79dcd2f 100755 --- a/download.html +++ b/download.html @@ -38,32 +38,32 @@ Please also note tarball files do not include Postgres-XC documents. <!-- Documents of version 0.9.2 --> <h4> -Version 0.9.2 +Version 0.9.3 </h4> <p> <ul> -<!-- tarball of 0.9.2, main download--> +<!-- tarball of 0.9.3, main download--> <li> -<code>pgxc_v0.9.2.tar.gz</code>: <br> +<code>pgxc_v0.9.3.tar.gz</code>: <br> Latest version of Postgres-XC available.<br> Please note that Postgres-XC documentation is not included in this file. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/pgxc_v0_9_2.tar.gz/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/pgxc_v0_9_3.tar.gz/download" target="_blank"> (download) </a> </li> <!-- tarball (diff) --> <li> -<code>PGXC_v0_9_2-PG_REL8_4_3.patch.gz</code>: <br> +<code>PGXC_v0_9_3-PG_REL8_4_3.patch.gz</code>: <br> The same material as above, but this file includes only the patch to apply to the PostgreSQL 8.4.3 release source code.<br> It is useful if you would like to see just a difference between PostgreSQL and Postgres-XC.<br> No Postgres-XC documentation is included in this file either. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PGXC_v0_9_2-PG_REL8_4_3.patch.gz/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PGXC_v0_9_3-PG_REL8_4_3.patch.gz/download" target="_blank"> (download) </a> </li> @@ -73,7 +73,7 @@ No Postgres-XC documentation is included in this file either. <code>COPYING</code>: <br> License description. Postgres-XC is distributed under LGPL version 2.1 ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/COPYING/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/COPYING/download" target="_blank"> (download) </a> </li> @@ -81,61 +81,61 @@ License description. Postgres-XC is distributed under LGPL version 2.1 <!-- Files --> <li> <code>FILES</code>: <br> -Description of files included in Postgres-XC 0.9.2 release. +Description of files included in Postgres-XC 0.9.3 release. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/FILES/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/FILES/download" target="_blank"> (download) </a> </li> <!-- Reference Manual --> <li> -<code>PG-XC_ReferenceManual_v0_9_2.pdf</code>: <br> +<code>PG-XC_ReferenceManual_v0_9_3.pdf</code>: <br> Reference of Postgres-XC extension. 
⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_ReferenceManual_v0_9_2.pdf/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_ReferenceManual_v0_9_3.pdf/download" target="_blank"> (download) </a> </li> <!-- pgbench Tutorial Manual --> <li> -<code>PG-XC_pgbench_Tutorial_v0_9_2.pdf</code>: <br> +<code>PG-XC_pgbench_Tutorial_v0_9_3.pdf</code>: <br> Step by step description how to build and configure pgbench to run with Postgres-XC. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_pgbench_Tutorial_v0_9_2.pdf/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_pgbench_Tutorial_v0_9_3.pdf/download" target="_blank"> (download) </a> </li> <!-- DBT-1 Tutorial Manual --> <li> -<code>PG-XC_DBT1_Tutorial_v0_9_2.pdf</code>: <br> +<code>PG-XC_DBT1_Tutorial_v0_9_3.pdf</code>: <br> Step by step description how to build and configure DBT-1 to run with Postgres-XC. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_DBT1_Tutorial_v0_9_2.pdf/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_DBT1_Tutorial_v0_9_3.pdf/download" target="_blank"> (download) </a> </li> <!-- Install Manual --> <li> -<code>PG-XC_InstallManual_v0_9_2.pdf</code>: <br> +<code>PG-XC_InstallManual_v0_9_3.pdf</code>: <br> Step by step description how to build, install and configure Postgres-XC. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_InstallManual_v0_9_2.pdf/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_InstallManual_v0_9_3.pdf/download" target="_blank"> (download) </a> </li> <!-- SQL limitation manual --> <li> -<code>PG-XC_SQL_Limitations_v0_9_2.pdf</code>: <br> -SQL restrictions available for Postgres-XC 0.9.2. +<code>PG-XC_SQL_Limitations_v0_9_3.pdf</code>: <br> +SQL restrictions available for Postgres-XC 0.9.3. ⇒ -<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_SQL_Limitations_v0_9_2.pdf/download" target="_blank"> +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.3/PG-XC_SQL_Limitations_v0_9_3.pdf/download" target="_blank"> (download) </a> </li> diff --git a/members.html b/members.html index c230c54..4db05ef 100755 --- a/members.html +++ b/members.html @@ -55,7 +55,7 @@ He is also GridSQL developer and is now developping aggregate functions and other cross-node operation. </p> -<h4>Michael Paquier</h4> +<h4><a href="http://michaelpq.users.sourceforge.net/">Michael Paquier</a></h4> <p class="inner"> Coordinator feature developer.<br> diff --git a/prev_vers/version0_9.html b/prev_vers/version0_9.html index 4ed4cf1..487592e 100644 --- a/prev_vers/version0_9.html +++ b/prev_vers/version0_9.html @@ -238,3 +238,121 @@ Description of the outline of Postgres-XC internals. 
</body> </html> + + +<!-- Documents of version 0.9.2 --> +<h4> +Version 0.9.2 +</h4> + +<p> +<ul> +<!-- tarball of 0.9.2, main download--> +<li> +<code>pgxc_v0.9.2.tar.gz</code>: <br> +Latest version of Postgres-XC available.<br> +Please note that Postgres-XC documentation is not included in this file. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/pgxc_v0_9_2.tar.gz/download" target="_blank"> +(download) +</a> +</li> + +<!-- tarball (diff) --> +<li> +<code>PGXC_v0_9_2-PG_REL8_4_3.patch.gz</code>: <br> +The same material as above, but this file includes only the patch to apply +to the PostgreSQL 8.4.3 release source code.<br> +It is useful if you would like to see just a difference between PostgreSQL +and Postgres-XC.<br> +No Postgres-XC documentation is included in this file either. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PGXC_v0_9_2-PG_REL8_4_3.patch.gz/download" target="_blank"> +(download) +</a> +</li> + +<!-- License --> +<li> +<code>COPYING</code>: <br> +License description. Postgres-XC is distributed under LGPL version 2.1 +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/COPYING/download" target="_blank"> +(download) +</a> +</li> + +<!-- Files --> +<li> +<code>FILES</code>: <br> +Description of files included in Postgres-XC 0.9.2 release. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/FILES/download" target="_blank"> +(download) +</a> +</li> + +<!-- Reference Manual --> +<li> +<code>PG-XC_ReferenceManual_v0_9_2.pdf</code>: <br> +Reference of Postgres-XC extension. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_ReferenceManual_v0_9_2.pdf/download" target="_blank"> +(download) +</a> +</li> + +<!-- pgbench Tutorial Manual --> +<li> +<code>PG-XC_pgbench_Tutorial_v0_9_2.pdf</code>: <br> +Step by step description how to build and configure pgbench to run with +Postgres-XC. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_pgbench_Tutorial_v0_9_2.pdf/download" target="_blank"> +(download) +</a> +</li> + +<!-- DBT-1 Tutorial Manual --> +<li> +<code>PG-XC_DBT1_Tutorial_v0_9_2.pdf</code>: <br> +Step by step description how to build and configure DBT-1 to run with +Postgres-XC. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_DBT1_Tutorial_v0_9_2.pdf/download" target="_blank"> +(download) +</a> +</li> + +<!-- Install Manual --> +<li> +<code>PG-XC_InstallManual_v0_9_2.pdf</code>: <br> +Step by step description how to build, install and configure Postgres-XC. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_InstallManual_v0_9_2.pdf/download" target="_blank"> +(download) +</a> +</li> + +<!-- SQL limitation manual --> +<li> +<code>PG-XC_SQL_Limitations_v0_9_2.pdf</code>: <br> +SQL restrictions available for Postgres-XC 0.9.2. 
+⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9.2/PG-XC_SQL_Limitations_v0_9_2.pdf/download" target="_blank"> +(download) +</a> +</li> + +<!-- Architecture Document --> +<li> +<code>PG-XC_Architecture_v0_9.pdf</code>: <br> +Description of the outline of Postgres-XC internals. +⇒ +<a href="https://sourceforge.net/projects/postgres-xc/files/Version_0.9/PG-XC_Architecture.pdf/download" target="_blank"> +(download) +</a> +</li> + +</ul> +</p> ----------------------------------------------------------------------- Summary of changes: download.html | 40 ++++++++-------- members.html | 2 +- prev_vers/version0_9.html | 118 +++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 139 insertions(+), 21 deletions(-) hooks/post-receive -- website |
From: Michael P. <mic...@us...> - 2010-10-28 00:30:45
Project "Postgres-XC". The branch, REL0_9_3_STABLE has been created
        at d7d492eaeca181add193b4705de58637f5ba7c58 (commit)

- Log -----------------------------------------------------------------
-----------------------------------------------------------------------

hooks/post-receive
--
Postgres-XC
From: Michael P. <mic...@us...> - 2010-10-28 00:28:38
Project "Postgres-XC". The annotated tag, v0.9.3 has been created
        at 4fa8496749b9d27ca702facbc78df9a45f6cda9c (tag)
   tagging d7d492eaeca181add193b4705de58637f5ba7c58 (commit)
  replaces v0.9.2
 tagged by Michael P
        on Thu Oct 28 09:26:23 2010 +0900

- Log -----------------------------------------------------------------
Postgres-XC version 0.9.3 tag

M S (2):
      Portal integration changes.
      Initial support for multi-step queries, including cross-node joins.

Mason S (2):
      Added more handling to deal with data node connection failures.
      There is a race condition that could lead to problems

Mason Sharp (13):
      In Postgres-XC, when extedngin the clog the status assertion
      Fix a visibility warning due to not taking into account
      Fixed a bug in GTM introduced with timestamp piggybacking with GXID.
      Fix a bug with AVG()
      Improved error handling.
      Address performance issues that were introduced in the last
      Initial support for cursors (DECLARE, FETCH).
      Handle stored functions in queries.
      Fix a bug with EXPLAIN and EXPLAIN VERBOSE.
      Fixed bug where extra materialization nodes were being created.
      Fix bug with pooler.
      SourceForge Bug ID: 3076224 checkpoint command causes seg fault
      When there is a data node crash, sometimes we were trying to read

Michael P (6):
      Correction of bugs in pgxc_ddl
      Support for Global timestamp in Postgres-XC.
      Implementation of 2PC from applications
      Added support for two new pieces of functionality.
      After a Commit of prepared transaction on GTM,
      Deletion of a DEBUG message in postmaster.c

-----------------------------------------------------------------------

hooks/post-receive
--
Postgres-XC
From: Pavan D. <pa...@us...> - 2010-10-27 10:48:12
Project "Postgres-XC". The branch, PGXC-sqlmed has been updated
  via eb50a76cb929fbe4a31d093b43e1589382c892a0 (commit)
 from 69bb66c62f71b9be918475ea65931adb3bbfba20 (commit)

- Log -----------------------------------------------------------------
commit eb50a76cb929fbe4a31d093b43e1589382c892a0
Author: Pavan Deolasee <pav...@gm...>
Date:   Wed Oct 27 16:09:28 2010 +0530

    Set remote relation stats (pages, rows etc) to a lower value so that
    NestLoop joins are preferred over other join types. This is necessary
    until we can handle other join types for remote join reduction

diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 957a515..b1c8bcb 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -102,6 +102,20 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 		case RTE_RELATION:
 			/* Table --- retrieve statistics from the system catalogs */
 			get_relation_info(root, rte->relid, rte->inh, rel);
+#ifdef PGXC
+			/*
+			 * This is a remote table... we have no idea how many pages/rows
+			 * we may get from a scan of this table. However, we should set the
+			 * costs in such a manner that cheapest paths should pick up the
+			 * ones involving these remote rels
+			 *
+			 * These allow for maximum query shipping to the remote
+			 * side later during the planning phase
+			 */
+			rel->pages = 1;
+			rel->tuples = 1;
+			rel->rows = 1;
+#endif
 			break;
 		case RTE_SUBQUERY:
 		case RTE_FUNCTION:
-----------------------------------------------------------------------

Summary of changes:
 src/backend/optimizer/util/relnode.c |   14 ++++++++++++++
 1 files changed, 14 insertions(+), 0 deletions(-)

hooks/post-receive
--
Postgres-XC
From: Michael P. <mic...@us...> - 2010-10-27 00:07:14
Project "Postgres-XC". The branch, master has been updated
  via d7d492eaeca181add193b4705de58637f5ba7c58 (commit)
 from fee989010d22b6ca6c47b72d2d9b0620e4ab42b8 (commit)

- Log -----------------------------------------------------------------
commit d7d492eaeca181add193b4705de58637f5ba7c58
Author: Michael P <mic...@us...>
Date:   Wed Oct 27 09:06:15 2010 +0900

    Deletion of a DEBUG message in postmaster.c

    When opening a child under postmaster, there was always a message
    written in log telling about the PID number.
    This was written for bug pruposes only.

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 56974b0..0add0e6 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -3181,10 +3181,6 @@ BackendStartup(Port *port)
 	pid = fork_process();
 	if (pid == 0)				/* child */
 	{
-		//// FOR DEBUG
-		printf("The session started: %d\n", getpid());
-		//sleep(60);
-		//// FOR DEBUG
 		free(bn);
 
 		/*
-----------------------------------------------------------------------

Summary of changes:
 src/backend/postmaster/postmaster.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

hooks/post-receive
--
Postgres-XC
From: mason_s <ma...@us...> - 2010-10-26 00:30:31
Project "Postgres-XC". The branch, master has been updated
  via fee989010d22b6ca6c47b72d2d9b0620e4ab42b8 (commit)
 from e11ab021fa203b8902c790e83e5bd78dbc4b2729 (commit)

- Log -----------------------------------------------------------------
commit fee989010d22b6ca6c47b72d2d9b0620e4ab42b8
Author: Mason Sharp <ma...@us...>
Date:   Mon Oct 25 20:29:24 2010 -0400

    When there is a data node crash, sometimes we were trying to read
    from a bad socket.

diff --git a/src/backend/pgxc/pool/pgxcnode.c b/src/backend/pgxc/pool/pgxcnode.c
index 5340a93..0d90273 100644
--- a/src/backend/pgxc/pool/pgxcnode.c
+++ b/src/backend/pgxc/pool/pgxcnode.c
@@ -272,10 +272,16 @@ pgxc_node_receive(const int conn_count,
 			continue;
 
 		/* prepare select params */
-		if (nfds < connections[i]->sock)
+		if (connections[i]->sock > 0)
+		{
+			FD_SET(connections[i]->sock, &readfds);
 			nfds = connections[i]->sock;
-
-		FD_SET(connections[i]->sock, &readfds);
+		}
+		else
+		{
+			/* flag as bad, it will be removed from the list */
+			connections[i]->state == DN_CONNECTION_STATE_ERROR_NOT_READY;
+		}
 	}
 
 	/*
-----------------------------------------------------------------------

Summary of changes:
 src/backend/pgxc/pool/pgxcnode.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

hooks/post-receive
--
Postgres-XC
From: Pavan D. <pa...@us...> - 2010-10-19 06:52:25
Project "Postgres-XC". The branch, PGXC-sqlmed has been updated
  via 69bb66c62f71b9be918475ea65931adb3bbfba20 (commit)
 from 2a313446f3e714ba36c9ccc5c5167309b7c89a95 (commit)

- Log -----------------------------------------------------------------
commit 69bb66c62f71b9be918475ea65931adb3bbfba20
Author: Pavan Deolasee <pav...@gm...>
Date:   Tue Oct 19 12:20:44 2010 +0530

    Set aliases properly for join reduction

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index c3cb3b7..a753e95 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -827,6 +827,8 @@ create_remotejoin_plan(PlannerInfo *root, JoinPath *best_path, Plan *parent, Pla
 			result->outer_alias = pstrdup(out_alias);
 			result->inner_reduce_level = inner->reduce_level;
 			result->outer_reduce_level = outer->reduce_level;
+			result->inner_relids = in_relids;
+			result->outer_relids = out_relids;
 
 			appendStringInfo(&fromlist, " %s (%s) %s",
 				pname, inner->sql_statement, quote_identifier(in_alias));
-----------------------------------------------------------------------

Summary of changes:
 src/backend/optimizer/plan/createplan.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

hooks/post-receive
--
Postgres-XC
From: mason_s <ma...@us...> - 2010-10-18 20:21:32
Project "Postgres-XC". The branch, PGXC-sqlmed has been updated via 2a313446f3e714ba36c9ccc5c5167309b7c89a95 (commit) from f275fa535e9673af0964ecc7ca93ab1b49df2317 (commit) - Log ----------------------------------------------------------------- commit 2a313446f3e714ba36c9ccc5c5167309b7c89a95 Author: Mason Sharp <ma...@us...> Date: Mon Oct 18 16:15:16 2010 -0400 Added IsJoinReducible to determine if the two plan nodes can be joined. See comments for this function for more details. Basically, we use examine_conditions_walker to check if it is safe to join the two. Partitioned-partitioned joins are safe to collapse, and partitioned-replicated are safe iff one of the nodes does not already contain such a collapsed node. diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index d9d5e4c..c3cb3b7 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -77,7 +77,7 @@ static WorkTableScan *create_worktablescan_plan(PlannerInfo *root, Path *best_pa #ifdef PGXC static RemoteQuery *create_remotequery_plan(PlannerInfo *root, Path *best_path, List *tlist, List *scan_clauses); -static Plan *create_remotejoin_plan(PlannerInfo *root, Path *best_path, +static Plan *create_remotejoin_plan(PlannerInfo *root, JoinPath *best_path, Plan *parent, Plan *outer_plan, Plan *inner_plan); static void create_remote_target_list(PlannerInfo *root, StringInfo targets, List *out_tlist, List *in_tlist, @@ -574,7 +574,7 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path) #ifdef PGXC /* check if this join can be reduced to an equiv. remote scan node */ - plan = create_remotejoin_plan(root, (Path *)best_path, plan, outer_plan, inner_plan); + plan = create_remotejoin_plan(root, best_path, plan, outer_plan, inner_plan); #endif return plan; @@ -627,7 +627,7 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path) * this code a lot much readable and easier. */ static Plan * -create_remotejoin_plan(PlannerInfo *root, Path *best_path, Plan *parent, Plan *outer_plan, Plan *inner_plan) +create_remotejoin_plan(PlannerInfo *root, JoinPath *best_path, Plan *parent, Plan *outer_plan, Plan *inner_plan) { NestLoop *nest_parent; @@ -662,19 +662,37 @@ create_remotejoin_plan(PlannerInfo *root, Path *best_path, Plan *parent, Plan *o IsA(inner_plan, Material) && IsA(((Material *) inner_plan)->plan.lefttree, RemoteQuery)) { + int i; + List *rtable_list = NIL; + bool partitioned_replicated_join = false; + Material *outer_mat = (Material *)outer_plan; Material *inner_mat = (Material *)inner_plan; RemoteQuery *outer = (RemoteQuery *)outer_mat->plan.lefttree; RemoteQuery *inner = (RemoteQuery *)inner_mat->plan.lefttree; + /* * Check if both these plans are from the same remote node. 
If yes, * replace this JOIN along with it's two children with one equivalent * remote node */ + /* + * Build up rtable for XC Walker + * (was not sure I could trust this, but it seems to work in various cases) + */ + for (i = 0; i < root->simple_rel_array_size; i++) + { + RangeTblEntry *rte = root->simple_rte_array[i]; + + /* Check for NULL first, sometimes it is NULL at position 0 */ + if (rte) + rtable_list = lappend(rtable_list, root->simple_rte_array[i]); + } + /* XXX Check if the join optimization is possible */ - if (true) + if (IsJoinReducible(inner, outer, rtable_list, best_path, &partitioned_replicated_join)) { RemoteQuery *result; Plan *result_plan; @@ -898,6 +916,8 @@ create_remotejoin_plan(PlannerInfo *root, Path *best_path, Plan *parent, Plan *o result->base_tlist = base_tlist; result->relname = "__FOREIGN_QUERY__"; + result->partitioned_replicated = partitioned_replicated_join; + /* * if there were any local scan clauses stick them up here. They * can come from the join node or from remote scan node themselves. diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index 29e4ee0..e678a14 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -87,7 +87,8 @@ typedef struct /* If two relations are joined based on special location information */ typedef enum PGXCJoinType { - JOIN_REPLICATED, + JOIN_REPLICATED_ONLY, + JOIN_REPLICATED_PARTITIONED, JOIN_COLOCATED_PARTITIONED, JOIN_OTHER } PGXCJoinType; @@ -144,6 +145,7 @@ static ExecNodes *get_plan_nodes(Query *query, bool isRead); static bool get_plan_nodes_walker(Node *query_node, XCWalkerContext *context); static bool examine_conditions_walker(Node *expr_node, XCWalkerContext *context); static int handle_limit_offset(RemoteQuery *query_step, Query *query, PlannedStmt *plan_stmt); +static void InitXCWalkerContext(XCWalkerContext *context); /* * True if both lists contain only one node and are the same @@ -693,15 +695,20 @@ examine_conditions_walker(Node *expr_node, XCWalkerContext *context) if (rel_loc_info1->locatorType == LOCATOR_TYPE_REPLICATED) { + /* add to replicated join conditions */ context->conditions->replicated_joins = - lappend(context->conditions->replicated_joins, opexpr); + lappend(context->conditions->replicated_joins, pgxc_join); if (colvar->varlevelsup != colvar2->varlevelsup) context->multilevel_join = true; - if (rel_loc_info2->locatorType != LOCATOR_TYPE_REPLICATED) + if (rel_loc_info2->locatorType == LOCATOR_TYPE_REPLICATED) + pgxc_join->join_type = JOIN_REPLICATED_ONLY; + else { + pgxc_join->join_type = JOIN_REPLICATED_PARTITIONED; + /* Note other relation, saves us work later. 
*/ context->conditions->base_rel_name = column_base2->relname; context->conditions->base_rel_loc_info = rel_loc_info2; @@ -717,23 +724,21 @@ examine_conditions_walker(Node *expr_node, XCWalkerContext *context) FreeRelationLocInfo(rel_loc_info2); } - /* note nature of join between the two relations */ - pgxc_join->join_type = JOIN_REPLICATED; return false; } else if (rel_loc_info2->locatorType == LOCATOR_TYPE_REPLICATED) { + /* note nature of join between the two relations */ + pgxc_join->join_type = JOIN_REPLICATED_PARTITIONED; + /* add to replicated join conditions */ context->conditions->replicated_joins = - lappend(context->conditions->replicated_joins, opexpr); + lappend(context->conditions->replicated_joins, pgxc_join); /* other relation not replicated, note it for later */ context->conditions->base_rel_name = column_base->relname; context->conditions->base_rel_loc_info = rel_loc_info1; - /* note nature of join between the two relations */ - pgxc_join->join_type = JOIN_REPLICATED; - if (rel_loc_info2) FreeRelationLocInfo(rel_loc_info2); @@ -1259,6 +1264,23 @@ get_plan_nodes_walker(Node *query_node, XCWalkerContext *context) return false; } +/* + * Set initial values for expression walker + */ +static void +InitXCWalkerContext(XCWalkerContext *context) +{ + context->isRead = true; + context->exec_nodes = NULL; + context->conditions = (Special_Conditions *) palloc0(sizeof(Special_Conditions)); + context->rtables = NIL; + context->multilevel_join = false; + context->varno = 0; + context->within_or = false; + context->within_not = false; + context->exec_on_coord = false; + context->join_list = NIL; +} /* * Top level entry point before walking query to determine plan nodes @@ -1271,18 +1293,9 @@ get_plan_nodes(Query *query, bool isRead) XCWalkerContext context; - context.query = query; + InitXCWalkerContext(&context); context.isRead = isRead; - context.exec_nodes = NULL; - context.conditions = (Special_Conditions *) palloc0(sizeof(Special_Conditions)); - context.rtables = NIL; context.rtables = lappend(context.rtables, query->rtable); - context.multilevel_join = false; - context.varno = 0; - context.within_or = false; - context.within_not = false; - context.exec_on_coord = false; - context.join_list = NIL; if (!get_plan_nodes_walker((Node *) query, &context)) result_nodes = context.exec_nodes; @@ -2315,3 +2328,148 @@ free_query_step(RemoteQuery *query_step) list_free_deep(query_step->simple_aggregates); pfree(query_step); } + + +/* + * See if we can reduce the passed in RemoteQuery nodes to a single step. + * + * We need to check when we can further collapse already collapsed nodes. + * We cannot always collapse- we do not want to allow a replicated table + * to be used twice. That is if we have + * + * partitioned_1 -- replicated -- partitioned_2 + * + * partitioned_1 and partitioned_2 cannot (usually) be safely joined only + * locally. + * We can do this by checking (may need tracking) what type it is, + * and looking at context->conditions->replicated_joins + * + * The following cases are possible, and whether or not it is ok + * to reduce. + * + * If the join between the two RemoteQuery nodes is replicated + * + * Node 1 Node 2 + * rep-part folded rep-part folded ok to reduce? + * 0 0 0 1 1 + * 0 0 1 1 1 + * 0 1 0 1 1 + * 0 1 1 1 1 + * 1 1 1 1 0 + * + * + * If the join between the two RemoteQuery nodes is replicated - partitioned + * + * Node 1 Node 2 + * rep-part folded rep-part folded ok to reduce? 
+ * 0 0 0 1 1 + * 0 0 1 1 0 + * 0 1 0 1 1 + * 0 1 1 1 0 + * 1 1 1 1 0 + * + * + * If the join between the two RemoteQuery nodes is partitioned - partitioned + * it is always reducibile safely, + * + * RemoteQuery *innernode - the inner node + * RemoteQuery *outernode - the outer node + * bool *partitioned_replicated - set to true if we have a partitioned-replicated + * join. We want to use replicated tables with non-replicated + * tables ony once. Only use this value if this function + * returns true. + */ +bool +IsJoinReducible(RemoteQuery *innernode, RemoteQuery *outernode, + List *rtable_list, JoinPath *join_path, bool *partitioned_replicated) +{ + XCWalkerContext context; + ListCell *cell; + bool maybe_reducible = false; + bool result = false; + + + *partitioned_replicated = false; + + InitXCWalkerContext(&context); + context.isRead = true; /* PGXCTODO - determine */ + context.rtables = NIL; + context.rtables = lappend(context.rtables, rtable_list); /* add to list of lists */ + + + + foreach(cell, join_path->joinrestrictinfo) + { + RestrictInfo *node = (RestrictInfo *) lfirst(cell); + + /* + * Check if we can fold these safely. + * + * If examine_conditions_walker() returns true, + * then it definitely is not collapsable. + * If it returns false, it may or may not be, we have to check + * context.conditions at the end. + * We keep trying, because another condition may fulfill the criteria. + */ + maybe_reducible = !examine_conditions_walker((Node *) node->clause, &context); + + if (!maybe_reducible) + break; + + } + + /* check to see if we found any partitioned or replicated joins */ + if (maybe_reducible && + (context.conditions->partitioned_parent_child + || context.conditions->replicated_joins)) + { + /* + * If we get here, we think that we can fold the + * RemoteQuery nodes into a single one. + */ + result = true; + + /* Check replicated-replicated and replicated-partitioned joins */ + if (context.conditions->replicated_joins) + { + ListCell *cell; + + /* if we already reduced with replicated tables already, we + * cannot here. + * PGXCTODO - handle more cases and use outer_relids and inner_relids + * For now we just give up. + */ + if ((innernode->remotejoin && innernode->partitioned_replicated) && + (outernode->remotejoin && outernode->partitioned_replicated)) + { + /* not reducible after all */ + return false; + } + + foreach(cell, context.conditions->replicated_joins) + { + PGXC_Join *pgxc_join = (PGXC_Join *) lfirst(cell); + + if (pgxc_join->join_type == JOIN_REPLICATED_PARTITIONED) + { + *partitioned_replicated = true; + + /* + * If either of these already have such a join, we do not + * want to add it a second time. 
+ */ + if ((innernode->remotejoin && innernode->partitioned_replicated) || + (outernode->remotejoin && outernode->partitioned_replicated)) + { + /* not reducible after all */ + return false; + } + } + } + } + } + + return result; +} + + diff --git a/src/include/pgxc/planner.h b/src/include/pgxc/planner.h index ef00f27..8aae356 100644 --- a/src/include/pgxc/planner.h +++ b/src/include/pgxc/planner.h @@ -89,6 +89,7 @@ typedef struct char *relname; bool remotejoin; /* True if this is a reduced remote join */ + bool partitioned_replicated; /* True if reduced and contains replicated-partitioned join */ int reduce_level; /* in case of reduced JOIN, it's level */ List *base_tlist; /* in case of isReduced, the base tlist */ char *outer_alias; @@ -177,4 +178,8 @@ extern PlannedStmt *pgxc_planner(Query *query, int cursorOptions, extern bool IsHashDistributable(Oid col_type); extern bool is_immutable_func(Oid funcid); + +extern bool IsJoinReducible(RemoteQuery *innernode, RemoteQuery *outernode, + List *rtable_list, JoinPath *join_path, bool *partitioned_replicated); + #endif /* PGXCPLANNER_H */ ----------------------------------------------------------------------- Summary of changes: src/backend/optimizer/plan/createplan.c | 28 ++++- src/backend/pgxc/plan/planner.c | 196 ++++++++++++++++++++++++++++--- src/include/pgxc/planner.h | 5 + 3 files changed, 206 insertions(+), 23 deletions(-) hooks/post-receive -- Postgres-XC |
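As a reading aid for the commit above, the following is a simplified, standalone paraphrase of the reducibility decision tables described in its commit message; the struct, enum, and function names are illustrative stand-ins, not the actual IsJoinReducible() code, which classifies the join by walking the restrict clauses first.

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for a collapsed RemoteQuery node */
typedef struct
{
    bool remotejoin;              /* node is already a collapsed join      */
    bool partitioned_replicated;  /* ...and that join used a replica table */
} remote_node_t;

typedef enum { JOIN_PART_PART, JOIN_REP_ONLY, JOIN_REP_PART } join_kind_t;

/* Simplified restatement of the rule: partitioned-partitioned joins always
 * collapse; replicated-only joins collapse unless both sides already folded
 * in a replicated-partitioned join; replicated-partitioned joins collapse
 * only if neither side has done so. */
static bool join_reducible(const remote_node_t *inner,
                           const remote_node_t *outer,
                           join_kind_t kind)
{
    bool inner_folded = inner->remotejoin && inner->partitioned_replicated;
    bool outer_folded = outer->remotejoin && outer->partitioned_replicated;

    switch (kind)
    {
        case JOIN_PART_PART:
            return true;
        case JOIN_REP_ONLY:
            return !(inner_folded && outer_folded);
        case JOIN_REP_PART:
            return !(inner_folded || outer_folded);
    }
    return false;
}

int main(void)
{
    remote_node_t plain  = {false, false};
    remote_node_t folded = {true, true};

    printf("rep-part join, fresh nodes : %d\n", join_reducible(&plain, &plain, JOIN_REP_PART));
    printf("rep-part join, folded side : %d\n", join_reducible(&plain, &folded, JOIN_REP_PART));
    return 0;
}
```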
From: mason_s <ma...@us...> - 2010-10-18 18:57:34
Project "Postgres-XC". The branch, master has been updated
  via e11ab021fa203b8902c790e83e5bd78dbc4b2729 (commit)
 from ca4fb6103add2b4560b8efe142f24d94ed03d56e (commit)

- Log -----------------------------------------------------------------
commit e11ab021fa203b8902c790e83e5bd78dbc4b2729
Author: Mason Sharp <ma...@us...>
Date:   Mon Oct 18 14:52:45 2010 -0400

    SourceForge Bug ID: 3076224 checkpoint command causes seg fault

    Prevent a manual checkpoint from crashing nodes.
    Note, this does not mean that there is a cluster-wide coordinated
    checkpoint; it just passes it down to the nodes.

    Written by Benny Mei Le

diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 053751c..eb704ce 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -21,6 +21,9 @@
 #include "executor/tstoreReceiver.h"
 #include "miscadmin.h"
 #include "pg_trace.h"
+#ifdef PGXC
+#include "pgxc/pgxc.h"
+#endif
 #include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tcop/utility.h"
@@ -1192,7 +1195,11 @@ PortalRunUtility(Portal portal, Node *utilityStmt, bool isTopLevel,
 		  IsA(utilityStmt, ListenStmt) ||
 		  IsA(utilityStmt, NotifyStmt) ||
 		  IsA(utilityStmt, UnlistenStmt) ||
+#ifdef PGXC
+		  (IsA(utilityStmt, CheckPointStmt) && IS_PGXC_DATANODE)))
+#else
 		  IsA(utilityStmt, CheckPointStmt)))
+#endif
 	{
 		PushActiveSnapshot(GetTransactionSnapshot());
 		active_snapshot_set = true;
-----------------------------------------------------------------------

Summary of changes:
 src/backend/tcop/pquery.c |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

hooks/post-receive
--
Postgres-XC
From: Pavan D. <pa...@us...> - 2010-10-18 06:25:02
Project "Postgres-XC". The branch, PGXC-sqlmed has been updated
  via f275fa535e9673af0964ecc7ca93ab1b49df2317 (commit)
 from 6af07721357944af801a384ed1eb54e363839403 (commit)

- Log -----------------------------------------------------------------
commit f275fa535e9673af0964ecc7ca93ab1b49df2317
Author: Pavan Deolasee <pav...@gm...>
Date:   Mon Oct 18 11:53:54 2010 +0530

    Fix a bug where rte/alias were not getting set up properly

diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index e8134d1..d9d5e4c 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1159,8 +1159,9 @@ create_remote_expr(PlannerInfo *root, Plan *parent, StringInfo expr,
 			Assert(cell != NULL);
 
 			rte->eref = lfirst(cell);
-			rte->alias = lfirst(lnext(cell));
 
+			cell = lnext(cell);
+			rte->alias = lfirst(cell);
 			cell = lnext(cell);
 		}
 		bms_free(tmprelids);
-----------------------------------------------------------------------

Summary of changes:
 src/backend/optimizer/plan/createplan.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

hooks/post-receive
--
Postgres-XC
From: Pavan D. <pa...@us...> - 2010-10-18 06:08:18
Project "Postgres-XC". The branch, PGXC-sqlmed has been created at 6af07721357944af801a384ed1eb54e363839403 (commit) - Log ----------------------------------------------------------------- commit 6af07721357944af801a384ed1eb54e363839403 Author: Pavan Deolasee <pav...@gm...> Date: Mon Oct 18 11:35:43 2010 +0530 Update some missing copy/out/read functions diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 1d3155b..d4ae006 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -1849,6 +1849,7 @@ _copyRangeTblEntry(RangeTblEntry *from) COPY_SCALAR_FIELD(rtekind); #ifdef PGXC + COPY_STRING_FIELD(relname); if (from->reltupdesc) newnode->reltupdesc = CreateTupleDescCopy(from->reltupdesc); #endif diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 9fcbe4c..85cfaca 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2028,6 +2028,9 @@ _outRangeTblEntry(StringInfo str, RangeTblEntry *node) WRITE_NODE_FIELD(alias); WRITE_NODE_FIELD(eref); WRITE_ENUM_FIELD(rtekind, RTEKind); +#ifdef PGXC + WRITE_STRING_FIELD(relname); +#endif switch (node->rtekind) { diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 72a2156..c928bf8 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -1118,6 +1118,9 @@ _readRangeTblEntry(void) READ_NODE_FIELD(alias); READ_NODE_FIELD(eref); READ_ENUM_FIELD(rtekind, RTEKind); +#ifdef PGXC + READ_STRING_FIELD(relname); +#endif switch (local_node->rtekind) { commit 7bcb490dc50eeb1ad1569d90cc5eb759b766aa91 Author: Pavan Deolasee <pav...@gm...> Date: Mon Oct 18 11:33:54 2010 +0530 Initial implementation of remote join reduction. We still don't have the logic to determine whether its safe to reduce two join trees or not diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c index aa92917..5099162 100644 --- a/src/backend/commands/explain.c +++ b/src/backend/commands/explain.c @@ -686,7 +686,11 @@ explain_outNode(StringInfo str, Assert(rte->rtekind == RTE_RELATION); /* We only show the rel name, not schema name */ +#ifdef PGXC + relname = rte->relname; +#else relname = get_rel_name(rte->relid); +#endif appendStringInfo(str, " on %s", quote_identifier(relname)); diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index 134b9e1..1d3155b 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -840,6 +840,17 @@ _copyRemoteQuery(RemoteQuery *from) COPY_SCALAR_FIELD(read_only); COPY_SCALAR_FIELD(force_autocommit); + COPY_STRING_FIELD(relname); + COPY_SCALAR_FIELD(remotejoin); + COPY_SCALAR_FIELD(reduce_level); + COPY_NODE_FIELD(base_tlist); + COPY_STRING_FIELD(outer_alias); + COPY_STRING_FIELD(inner_alias); + COPY_SCALAR_FIELD(outer_reduce_level); + COPY_SCALAR_FIELD(inner_reduce_level); + COPY_BITMAPSET_FIELD(outer_relids); + COPY_BITMAPSET_FIELD(inner_relids); + return newnode; } diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index be80f18..9fcbe4c 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -1502,6 +1502,9 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node) WRITE_BOOL_FIELD(hasHavingQual); WRITE_BOOL_FIELD(hasPseudoConstantQuals); WRITE_BOOL_FIELD(hasRecursion); +#ifdef PGXC + WRITE_INT_FIELD(rs_alias_index); +#endif WRITE_INT_FIELD(wt_param_id); } diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index fcbb8ca..8bb9057 100644 --- 
a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -109,6 +109,9 @@ bool enable_hashagg = true; bool enable_nestloop = true; bool enable_mergejoin = true; bool enable_hashjoin = true; +#ifdef PGXC +bool enable_remotejoin = true; +#endif typedef struct { diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 4f3a7c6..e8134d1 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -38,6 +38,7 @@ #include "utils/builtins.h" #include "utils/syscache.h" #include "catalog/pg_proc.h" +#include "executor/executor.h" #endif #include "utils/lsyscache.h" @@ -76,6 +77,14 @@ static WorkTableScan *create_worktablescan_plan(PlannerInfo *root, Path *best_pa #ifdef PGXC static RemoteQuery *create_remotequery_plan(PlannerInfo *root, Path *best_path, List *tlist, List *scan_clauses); +static Plan *create_remotejoin_plan(PlannerInfo *root, Path *best_path, + Plan *parent, Plan *outer_plan, Plan *inner_plan); +static void create_remote_target_list(PlannerInfo *root, + StringInfo targets, List *out_tlist, List *in_tlist, + char *out_alias, int out_index, + char *in_alias, int in_index); +static Alias *generate_remote_rte_alias(RangeTblEntry *rte, int varno, + char *aliasname, int reduce_level); #endif static NestLoop *create_nestloop_plan(PlannerInfo *root, NestPath *best_path, Plan *outer_plan, Plan *inner_plan); @@ -146,7 +155,12 @@ static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols, static Material *make_material(Plan *lefttree); #ifdef PGXC +static void findReferencedVars(List *parent_vars, Plan *plan, List **out_tlist, Relids *out_relids); extern bool is_foreign_qual(Node *clause); +static void create_remote_clause_expr(PlannerInfo *root, Plan *parent, StringInfo clauses, + List *qual, RemoteQuery *scan); +static void create_remote_expr(PlannerInfo *root, Plan *parent, StringInfo expr, + Node *node, RemoteQuery *scan); #endif /* @@ -228,9 +242,6 @@ create_scan_plan(PlannerInfo *root, Path *best_path) List *tlist; List *scan_clauses; Plan *plan; -#ifdef PGXC - Plan *matplan; -#endif /* * For table scans, rather than using the relation targetlist (which is @@ -561,9 +572,604 @@ create_join_plan(PlannerInfo *root, JoinPath *best_path) get_actual_clauses(get_loc_restrictinfo(best_path)))); #endif +#ifdef PGXC + /* check if this join can be reduced to an equiv. remote scan node */ + plan = create_remotejoin_plan(root, (Path *)best_path, plan, outer_plan, inner_plan); +#endif + return plan; } +#ifdef PGXC +/* + * create_remotejoin_plan + * check if the children plans involve remote entities from the same remote + * node. If so, this join can be reduced to an equivalent remote scan plan + * node + * + * RULES: + * + * * provide unique aliases to both inner and outer nodes to represent their + * corresponding subqueries + * + * * identify target entries from both inner and outer that appear in the join + * targetlist, only those need to be selected from these aliased subqueries + * + * * a join node has a joinqual list which represents the join condition. E.g. + * SELECT * from emp e LEFT JOIN emp2 d ON e.x = d.x + * Here the joinqual contains "e.x = d.x". If the joinqual itself has a local + * dependency, e.g "e.x = localfunc(d.x)", then this join cannot be reduced + * + * * other than the joinqual, the join node can contain additional quals. 
Even + * if they have any local dependencies, we can reduce the join and just + * append these quals into the reduced remote scan node. We DO do a pass to + * identify remote quals and ship those in the squery though + * + * * these quals (both joinqual and normal quals with no local dependencies) + * need to be converted into expressions referring to the aliases assigned to + * the nodes. These expressions will eventually become part of the squery of + * the reduced remote scan node + * + * * the children remote scan nodes themselves can have local dependencies in + * their quals (the remote ones are already part of the squery). We can still + * reduce the join and just append these quals into the reduced remote scan + * node + * + * * if we reached successfully so far, generate a new remote scan node with + * this new squery generated using the aliased references + * + * One important point to note here about targetlists is that this function + * does not set any DUMMY var references in the Var nodes appearing in it. It + * follows the standard mechanism as is followed by other nodes. Similar to the + * existing nodes, the references which point to DUMMY vars is done in + * set_remote_references() function in set_plan_references phase at the fag + * end. Avoiding such DUMMY references manipulations till the end also makes + * this code a lot much readable and easier. + */ +static Plan * +create_remotejoin_plan(PlannerInfo *root, Path *best_path, Plan *parent, Plan *outer_plan, Plan *inner_plan) +{ + NestLoop *nest_parent; + + if (!enable_remotejoin) + return parent; + + /* meh, what are these for :( */ + if (root->hasPseudoConstantQuals) + return parent; + + /* Works only for SELECT commands right now */ + if (root->parse->commandType != CMD_SELECT) + return parent; + + /* do not optimize CURSOR based select statements */ + if (root->parse->rowMarks != NIL) + return parent; + + /* + * optimize only simple NestLoop joins for now. Other joins like Merge and + * Hash can be reduced too. But they involve additional intermediate nodes + * and we need to understand them a bit more as yet + */ + if (!IsA(parent, NestLoop)) + return parent; + else + nest_parent = (NestLoop *)parent; + + /* check if both the nodes qualify for reduction */ + if (IsA(outer_plan, Material) && + IsA(((Material *) outer_plan)->plan.lefttree, RemoteQuery) && + IsA(inner_plan, Material) && + IsA(((Material *) inner_plan)->plan.lefttree, RemoteQuery)) + { + Material *outer_mat = (Material *)outer_plan; + Material *inner_mat = (Material *)inner_plan; + + RemoteQuery *outer = (RemoteQuery *)outer_mat->plan.lefttree; + RemoteQuery *inner = (RemoteQuery *)inner_mat->plan.lefttree; + /* + * Check if both these plans are from the same remote node. If yes, + * replace this JOIN along with it's two children with one equivalent + * remote node + */ + + /* XXX Check if the join optimization is possible */ + if (true) + { + RemoteQuery *result; + Plan *result_plan; + StringInfoData targets, clauses, scan_clauses, fromlist; + StringInfoData squery; + List *parent_vars, *out_tlist = NIL, *in_tlist = NIL, *base_tlist; + ListCell *l; + char in_alias[15], out_alias[15]; + Relids out_relids = NULL, in_relids = NULL; + bool use_where = false; + Index dummy_rtindex; + RangeTblEntry *dummy_rte; + List *local_scan_clauses = NIL, *remote_scan_clauses = NIL; + char *pname; + + + /* KISS! As long as distinct aliases are provided for all the objects in + * involved in query, remote server should not crib! 
*/ + sprintf(in_alias, "out_%d", root->rs_alias_index); + sprintf(out_alias, "in_%d", root->rs_alias_index); + + /* + * Walk the left, right trees and identify which vars appear in the + * parent targetlist, only those need to be selected. Note that + * depending on whether the parent targetlist is top-level or + * intermediate, the children vars may or may not be referenced + * multiple times in it. + */ + parent_vars = pull_var_clause((Node *)parent->targetlist, PVC_REJECT_PLACEHOLDERS); + + findReferencedVars(parent_vars, outer_plan, &out_tlist, &out_relids); + findReferencedVars(parent_vars, inner_plan, &in_tlist, &in_relids); + + /* + * If the JOIN ON clause has a local dependency then we cannot ship + * the join to the remote side at all, bail out immediately. + */ + if (!is_foreign_qual((Node *)nest_parent->join.joinqual)) + { + elog(DEBUG1, "cannot reduce: local dependencies in the joinqual"); + return parent; + } + + /* + * If the normal plan qual has local dependencies, the join can + * still be shipped. Try harder to ship remote clauses out of the + * entire list. These local quals will become part of the quals + * list of the reduced remote scan node down later. + */ + if (!is_foreign_qual((Node *)nest_parent->join.plan.qual)) + { + elog(DEBUG1, "local dependencies in the join plan qual"); + + /* + * trawl through each entry and come up with remote and local + * clauses... sigh + */ + foreach(l, nest_parent->join.plan.qual) + { + Node *clause = lfirst(l); + + /* + * if the currentof in the above call to + * clause_is_local_bound is set, somewhere in the list there + * is currentof clause, so keep that information intact and + * pass a dummy argument here. + */ + if (!is_foreign_qual((Node *)clause)) + local_scan_clauses = lappend(local_scan_clauses, clause); + else + remote_scan_clauses = lappend(remote_scan_clauses, clause); + } + } + else + { + /* + * there is no local bound clause, all the clauses are remote + * scan clauses + */ + remote_scan_clauses = nest_parent->join.plan.qual; + } + + /* generate the tlist for the new RemoteScan node using out_tlist, in_tlist */ + initStringInfo(&targets); + create_remote_target_list(root, &targets, out_tlist, in_tlist, + out_alias, outer->reduce_level, in_alias, inner->reduce_level); + + /* + * generate the fromlist now. The code has to appropriately mention + * the JOIN type in the string being generated. + */ + initStringInfo(&fromlist); + appendStringInfo(&fromlist, " (%s) %s ", + outer->sql_statement, quote_identifier(out_alias)); + + use_where = false; + switch (nest_parent->join.jointype) + { + case JOIN_INNER: + pname = ", "; + use_where = true; + break; + case JOIN_LEFT: + pname = "LEFT JOIN"; + break; + case JOIN_FULL: + pname = "FULL JOIN"; + break; + case JOIN_RIGHT: + pname = "RIGHT JOIN"; + break; + case JOIN_SEMI: + case JOIN_ANTI: + default: + return parent; + } + + /* + * splendid! we can actually replace this join hierarchy with a + * single RemoteScan node now. Start off by constructing the + * appropriate new tlist and tupdescriptor + */ + result = makeNode(RemoteQuery); + + /* + * Save various information about the inner and the outer plans. 
We + * may need this information later if more entries are added to it + * as part of the remote expression optimization + */ + result->remotejoin = true; + result->inner_alias = pstrdup(in_alias); + result->outer_alias = pstrdup(out_alias); + result->inner_reduce_level = inner->reduce_level; + result->outer_reduce_level = outer->reduce_level; + + appendStringInfo(&fromlist, " %s (%s) %s", + pname, inner->sql_statement, quote_identifier(in_alias)); + + /* generate join.joinqual remote clause string representation */ + initStringInfo(&clauses); + if (nest_parent->join.joinqual != NIL) + { + create_remote_clause_expr(root, parent, &clauses, + nest_parent->join.joinqual, result); + } + + /* generate join.plan.qual remote clause string representation */ + initStringInfo(&scan_clauses); + if (remote_scan_clauses != NIL) + { + create_remote_clause_expr(root, parent, &scan_clauses, + remote_scan_clauses, result); + } + + /* + * set the base tlist of the involved base relations, useful in + * set_plan_refs later. Additionally the tupledescs should be + * generated using this base_tlist and not the parent targetlist. + * This is because we want to take into account any additional + * column references from the scan clauses too + */ + base_tlist = add_to_flat_tlist(NIL, list_concat(out_tlist, in_tlist)); + + /* cook up the reltupdesc using this base_tlist */ + dummy_rte = makeNode(RangeTblEntry); + dummy_rte->reltupdesc = ExecTypeFromTL(base_tlist, false); + dummy_rte->rtekind = RTE_RELATION; + + /* use a dummy relname... */ + dummy_rte->relname = "__FOREIGN_QUERY__"; + dummy_rte->eref = makeAlias("__FOREIGN_QUERY__", NIL); + /* not sure if we need to set the below explicitly.. */ + dummy_rte->inh = false; + dummy_rte->inFromCl = false; + dummy_rte->requiredPerms = 0; + dummy_rte->checkAsUser = 0; + dummy_rte->selectedCols = NULL; + dummy_rte->modifiedCols = NULL; + + /* + * Append the dummy range table entry to the range table. + * Note that this modifies the master copy the caller passed us, otherwise + * e.g EXPLAIN VERBOSE will fail to find the rte the Vars built below refer + * to. + */ + root->parse->rtable = lappend(root->parse->rtable, dummy_rte); + dummy_rtindex = list_length(root->parse->rtable); + + result_plan = &result->scan.plan; + + /* the join targetlist becomes this node's tlist */ + result_plan->targetlist = parent->targetlist; + result_plan->lefttree = NULL; + result_plan->righttree = NULL; + result->scan.scanrelid = dummy_rtindex; + + /* generate the squery for this node */ + + /* NOTE: it's assumed that the remote_paramNums array is + * filled in the same order as we create the query here. + * + * TODO: we need some way to ensure that the remote_paramNums + * is filled in the same order as the order in which the clauses + * are added in the query below. + */ + initStringInfo(&squery); + appendStringInfo(&squery, "SELECT %s FROM %s", targets.data, fromlist.data); + + if (clauses.data[0] != '\0') + appendStringInfo(&squery, " %s %s", use_where? " WHERE " : " ON ", clauses.data); + + if (scan_clauses.data[0] != '\0') + appendStringInfo(&squery, " %s %s", use_where? " AND " : " WHERE ", scan_clauses.data); + + result->sql_statement = squery.data; + /* don't forget to increment the index for the next time around! */ + result->reduce_level = root->rs_alias_index++; + + + /* set_plan_refs needs this later */ + result->base_tlist = base_tlist; + result->relname = "__FOREIGN_QUERY__"; + + /* + * if there were any local scan clauses stick them up here. 
They + * can come from the join node or from remote scan node themselves. + * Because of the processing being done earlier in + * create_remotescan_plan, all of the clauses if present will be + * local ones and hence can be stuck without checking for + * remoteness again here into result_plan->qual + */ + result_plan->qual = list_concat(result_plan->qual, outer_plan->qual); + result_plan->qual = list_concat(result_plan->qual, inner_plan->qual); + result_plan->qual = list_concat(result_plan->qual, local_scan_clauses); + + /* we actually need not worry about costs since this is the final plan */ + result_plan->startup_cost = outer_plan->startup_cost; + result_plan->total_cost = outer_plan->total_cost; + result_plan->plan_rows = outer_plan->plan_rows; + result_plan->plan_width = outer_plan->plan_width; + + return (Plan *)make_material(result_plan); + } + } + + return parent; +} + +/* + * Generate aliases for columns of remote tables using the + * colname_varno_varattno_reduce_level nomenclature + */ +static Alias * +generate_remote_rte_alias(RangeTblEntry *rte, int varno, char *aliasname, int reduce_level) +{ + TupleDesc tupdesc; + int maxattrs; + int varattno; + List *colnames = NIL; + StringInfo attr = makeStringInfo(); + + if (rte->rtekind != RTE_RELATION) + elog(ERROR, "called in improper context"); + + if (reduce_level == 0) + return makeAlias(aliasname, NIL); + + tupdesc = rte->reltupdesc; + maxattrs = tupdesc->natts; + + for (varattno = 0; varattno < maxattrs; varattno++) + { + Form_pg_attribute att = tupdesc->attrs[varattno]; + Value *attrname; + + resetStringInfo(attr); + appendStringInfo(attr, "%s_%d_%d_%d", + NameStr(att->attname), varno, varattno + 1, reduce_level); + + attrname = makeString(pstrdup(attr->data)); + + colnames = lappend(colnames, attrname); + } + + return makeAlias(aliasname, colnames); +} + +/* create_remote_target_list + * generate a targetlist using out_alias and in_alias appropriately. It is + * possible that in case of multiple-hierarchy reduction, both sides can have + * columns with the same name. E.g. consider the following: + * + * select * from emp e join emp f on e.x = f.x, emp g; + * + * So if we just use new_alias.columnname it can + * very easily clash with other columnname from the same side of an already + * reduced join. To avoid this, we generate unique column aliases using the + * following convention: + * colname_varno_varattno_reduce_level_index + * + * Each RemoteScan node carries it's reduce_level index to indicate the + * convention that should be adopted while referring to it's columns. If the + * level is 0, then normal column names can be used because they will never + * clash at the join level + */ +static void +create_remote_target_list(PlannerInfo *root, StringInfo targets, List *out_tlist, List *in_tlist, + char *out_alias, int out_index, char *in_alias, int in_index) +{ + int i = 0; + ListCell *l; + StringInfo attrname = makeStringInfo(); + bool add_null_target = true; + + foreach(l, out_tlist) + { + Var *var = (Var *) lfirst(l); + RangeTblEntry *rte = planner_rt_fetch(var->varno, root); + char *attname; + + + if (i++ > 0) + appendStringInfo(targets, ", "); + + attname = get_rte_attribute_name(rte, var->varattno); + + if (out_index) + { + resetStringInfo(attrname); + /* varattno can be negative for sys attributes, hence the abs! 
*/ + appendStringInfo(attrname, "%s_%d_%d_%d", + attname, var->varno, abs(var->varattno), out_index); + appendStringInfo(targets, "%s.%s", + quote_identifier(out_alias), quote_identifier(attrname->data)); + } + else + appendStringInfo(targets, "%s.%s", + quote_identifier(out_alias), quote_identifier(attname)); + + /* generate the new alias now using root->rs_alias_index */ + resetStringInfo(attrname); + appendStringInfo(attrname, "%s_%d_%d_%d", + attname, var->varno, abs(var->varattno), root->rs_alias_index); + appendStringInfo(targets, " AS %s", quote_identifier(attrname->data)); + add_null_target = false; + } + + foreach(l, in_tlist) + { + Var *var = (Var *) lfirst(l); + RangeTblEntry *rte = planner_rt_fetch(var->varno, root); + char *attname; + + if (i++ > 0) + appendStringInfo(targets, ", "); + + attname = get_rte_attribute_name(rte, var->varattno); + + if (in_index) + { + resetStringInfo(attrname); + /* varattno can be negative for sys attributes, hence the abs! */ + appendStringInfo(attrname, "%s_%d_%d_%d", + attname, var->varno, abs(var->varattno), in_index); + appendStringInfo(targets, "%s.%s", + quote_identifier(in_alias), quote_identifier(attrname->data)); + } + else + appendStringInfo(targets, "%s.%s", + quote_identifier(in_alias), quote_identifier(attname)); + + /* generate the new alias now using root->rs_alias_index */ + resetStringInfo(attrname); + appendStringInfo(attrname, "%s_%d_%d_%d", + attname, var->varno, abs(var->varattno), root->rs_alias_index); + appendStringInfo(targets, " AS %s", quote_identifier(attrname->data)); + add_null_target = false; + } + + /* + * It's possible that in some cases, the targetlist might not refer to any + * vars from the joined relations, eg. + * select count(*) from t1, t2; select const from t1, t2; etc + * For such cases just add a NULL selection into this targetlist + */ + if (add_null_target) + appendStringInfo(targets, " NULL "); +} + +/* + * create_remote_clause_expr + * generate a string to represent the clause list expression using out_alias + * and in_alias references. This function does a cute hack by temporarily + * modifying the rte->eref entries of the involved relations to point to + * out_alias and in_alias appropriately. The deparse_expression call then + * generates a string using these erefs which is exactly what is desired here. + * + * Additionally it creates aliases for the column references based on the + * reduce_level values too. This handles the case when both sides have same + * named columns.. 
+ * + * Obviously this function restores the eref, alias values to their former selves + * appropriately too, after use + */ +static void +create_remote_clause_expr(PlannerInfo *root, Plan *parent, StringInfo clauses, + List *qual, RemoteQuery *scan) +{ + Node *node = (Node *) make_ands_explicit(qual); + + return create_remote_expr(root, parent, clauses, node, scan); +} + +static void +create_remote_expr(PlannerInfo *root, Plan *parent, StringInfo expr, + Node *node, RemoteQuery *scan) +{ + List *context; + List *leref = NIL; + ListCell *cell; + char *exprstr; + int rtindex; + Relids tmprelids, relids; + + relids = pull_varnos((Node *)node); + + tmprelids = bms_copy(relids); + + while ((rtindex = bms_first_member(tmprelids)) >= 0) + { + RangeTblEntry *rte = planner_rt_fetch(rtindex, root); + + /* + * This rtindex should be a member of either out_relids or + * in_relids and never both + */ + if (bms_is_member(rtindex, scan->outer_relids) && + bms_is_member(rtindex, scan->inner_relids)) + elog(ERROR, "improper relid references in the join clause list"); + + /* + * save the current rte->eref and rte->alias values and stick in a new + * one in the rte with the proper inner or outer alias + */ + leref = lappend(leref, rte->eref); + leref = lappend(leref, rte->alias); + + if (bms_is_member(rtindex, scan->outer_relids)) + { + rte->eref = makeAlias(scan->outer_alias, NIL); + + /* attach proper column aliases.. */ + rte->alias = generate_remote_rte_alias(rte, rtindex, + scan->outer_alias, scan->outer_reduce_level); + } + if (bms_is_member(rtindex, scan->inner_relids)) + { + rte->eref = makeAlias(scan->inner_alias, NIL); + + /* attach proper column aliases.. */ + rte->alias = generate_remote_rte_alias(rte, rtindex, + scan->inner_alias, scan->inner_reduce_level); + } + } + bms_free(tmprelids); + + /* Set up deparsing context */ + context = deparse_context_for_plan((Node *) parent, + NULL, + root->parse->rtable, + NULL); + + exprstr = deparse_expression(node, context, true, false); + + /* revert back the saved eref entries in the same order now! */ + cell = list_head(leref); + tmprelids = bms_copy(relids); + while ((rtindex = bms_first_member(tmprelids)) >= 0) + { + RangeTblEntry *rte = planner_rt_fetch(rtindex, root); + + Assert(cell != NULL); + + rte->eref = lfirst(cell); + rte->alias = lfirst(lnext(cell)); + + cell = lnext(cell); + } + bms_free(tmprelids); + + appendStringInfo(expr, " %s", exprstr); + return; +} +#endif + /* * create_append_plan * Create an Append plan for 'best_path' and (recursively) plans @@ -3980,3 +4586,56 @@ is_projection_capable_plan(Plan *plan) } return true; } + +#ifdef PGXC +/* + * findReferencedVars() + * + * Constructs a list of those Vars in targetlist which are found in + * parent_vars (in other words, the intersection of targetlist and + * parent_vars). Returns a new list in *out_tlist and a bitmap of + * those relids found in the result. + * + * Additionally do look at the qual references to other vars! They + * also need to be selected.. 
+ */ +static void +findReferencedVars(List *parent_vars, Plan *plan, List **out_tlist, Relids *out_relids) +{ + List *vars; + Relids relids = NULL; + List *tlist = NIL; + ListCell *l; + + /* Pull vars from both the targetlist and the clauses attached to this plan */ + vars = pull_var_clause((Node *)plan->targetlist, PVC_REJECT_PLACEHOLDERS); + + foreach(l, vars) + { + Var *var = lfirst(l); + + if (search_tlist_for_var(var, parent_vars)) + tlist = lappend(tlist, var); + + if (!bms_is_member(var->varno, relids)) + relids = bms_add_member(relids, var->varno); + } + + /* now consider the local quals */ + vars = pull_var_clause((Node *)plan->qual, PVC_REJECT_PLACEHOLDERS); + + foreach(l, vars) + { + Var *var = lfirst(l); + + if (search_tlist_for_var(var, tlist) == NULL) + tlist = lappend(tlist, var); + + if (!bms_is_member(var->varno, relids)) + relids = bms_add_member(relids, var->varno); + } + + *out_tlist = tlist; + *out_relids = relids; +} +#endif diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 2c95815..dc6ff35 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -301,6 +301,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse, root->eq_classes = NIL; root->append_rel_list = NIL; +#ifdef PGXC + root->rs_alias_index = 1; +#endif root->hasRecursion = hasRecursion; if (hasRecursion) root->wt_param_id = SS_assign_worktable_param(root); diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c index cab7fb4..950e388 100644 --- a/src/backend/optimizer/plan/setrefs.c +++ b/src/backend/optimizer/plan/setrefs.c @@ -1401,6 +1401,32 @@ search_indexed_tlist_for_non_var(Node *node, return NULL; /* no match */ } +#ifdef PGXC +/* + * search_tlist_for_var --- find a Var in the provided tlist. This does a + * basic scan through the list. So not very efficient... + * + * If no match, return NULL. + * + */ +Var * +search_tlist_for_var(Var *var, List *jtlist) +{ + Index varno = var->varno; + AttrNumber varattno = var->varattno; + ListCell *l; + + foreach(l, jtlist) + { + Var *listvar = (Var *) lfirst(l); + + if (listvar->varno == varno && listvar->varattno == varattno) + return var; + } + return NULL; /* no match */ +} +#endif + /* * search_indexed_tlist_for_sortgroupref --- find a sort/group expression * (which is assumed not to be just a Var) diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c index d63e504..229b16d 100644 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -925,6 +925,7 @@ addRangeTableEntry(ParseState *pstate, #ifdef PGXC rte->reltupdesc = CreateTupleDescCopyConstr(rel->rd_att); + rte->relname = RelationGetRelationName(rel); #endif /* @@ -991,6 +992,7 @@ addRangeTableEntryForRelation(ParseState *pstate, #ifdef PGXC rte->reltupdesc = CreateTupleDescCopyConstr(rel->rd_att); + rte->relname = RelationGetRelationName(rel); #endif /* diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index c493eb3..7fe08be 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -2388,20 +2388,6 @@ ExecInitRemoteQuery(RemoteQuery *node, EState *estate, int eflags) ExecInitScanTupleSlot(estate, &remotestate->ss); - /* - * Initialize scan relation. get the relation object id from the - * relid'th entry in the range table, open that relation and acquire - * appropriate lock on it. 
- * This is needed for deparseSQL - * We should remove these lines once we plan and deparse earlier. - */ - if (!node->is_single_step) - { - currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid); - remotestate->ss.ss_currentRelation = currentRelation; - ExecAssignScanType(&remotestate->ss, RelationGetDescr(currentRelation)); - } - remotestate->ss.ps.ps_TupFromTlist = false; /* diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 3e8077f..684396c 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -687,6 +687,16 @@ static struct config_bool ConfigureNamesBool[] = &enable_hashjoin, true, NULL, NULL }, +#ifdef PGXC + { + {"enable_remotejoin", PGC_USERSET, QUERY_TUNING_METHOD, + gettext_noop("Enables the planner's use of remote join plans."), + NULL + }, + &enable_remotejoin, + true, NULL, NULL + }, +#endif { {"geqo", PGC_USERSET, QUERY_TUNING_GEQO, gettext_noop("Enables genetic query optimization."), diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 9423175..4d90052 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -663,6 +663,7 @@ typedef struct RangeTblEntry */ #ifdef PGXC + char *relname; TupleDesc reltupdesc; #endif diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index 581ce0a..d537855 100644 --- a/src/include/nodes/relation.h +++ b/src/include/nodes/relation.h @@ -189,6 +189,11 @@ typedef struct PlannerInfo * pseudoconstant = true */ bool hasRecursion; /* true if planning a recursive WITH item */ +#ifdef PGXC + /* This field is used only when RemoteScan nodes are involved */ + int rs_alias_index; /* used to build the alias reference */ +#endif + /* These fields are used only when hasRecursion is true: */ int wt_param_id; /* PARAM_EXEC ID for the work table */ struct Plan *non_recursive_plan; /* plan for non-recursive term */ diff --git a/src/include/optimizer/cost.h b/src/include/optimizer/cost.h index 29aed38..876542a 100644 --- a/src/include/optimizer/cost.h +++ b/src/include/optimizer/cost.h @@ -59,6 +59,9 @@ extern bool enable_hashagg; extern bool enable_nestloop; extern bool enable_mergejoin; extern bool enable_hashjoin; +#ifdef PGXC +extern bool enable_remotejoin; +#endif extern int constraint_exclusion; extern double clamp_row_est(double nrows); diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h index 0dd2bcc..c1d191e 100644 --- a/src/include/optimizer/planmain.h +++ b/src/include/optimizer/planmain.h @@ -119,4 +119,7 @@ extern void extract_query_dependencies(List *queries, List **relationOids, List **invalItems); +#ifdef PGXC +extern Var *search_tlist_for_var(Var *var, List *jtlist); +#endif #endif /* PLANMAIN_H */ diff --git a/src/include/pgxc/planner.h b/src/include/pgxc/planner.h index 1e31fa3..ef00f27 100644 --- a/src/include/pgxc/planner.h +++ b/src/include/pgxc/planner.h @@ -23,6 +23,7 @@ #include "nodes/primnodes.h" #include "pgxc/locator.h" #include "tcop/dest.h" +#include "nodes/relation.h" typedef enum @@ -85,6 +86,17 @@ typedef struct bool read_only; /* do not use 2PC when committing read only steps */ bool force_autocommit; /* some commands like VACUUM require autocommit mode */ RemoteQueryExecType exec_type; + + char *relname; + bool remotejoin; /* True if this is a reduced remote join */ + int reduce_level; /* in case of reduced JOIN, it's level */ + List *base_tlist; /* in case of isReduced, the base tlist */ + char *outer_alias; + char *inner_alias; + int outer_reduce_level; + 
int inner_reduce_level; + Relids outer_relids; + Relids inner_relids; } RemoteQuery; commit aefc06e7bd90c657fb093a923f7b66177687561d Author: Pavan Deolasee <pav...@gm...> Date: Mon Oct 18 11:29:21 2010 +0530 First step to SQL-med integration. Moving query generation to planning stage diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c index c58e2a0..134b9e1 100644 --- a/src/backend/nodes/copyfuncs.c +++ b/src/backend/nodes/copyfuncs.c @@ -1836,6 +1836,12 @@ _copyRangeTblEntry(RangeTblEntry *from) RangeTblEntry *newnode = makeNode(RangeTblEntry); COPY_SCALAR_FIELD(rtekind); + +#ifdef PGXC + if (from->reltupdesc) + newnode->reltupdesc = CreateTupleDescCopy(from->reltupdesc); +#endif + COPY_SCALAR_FIELD(relid); COPY_NODE_FIELD(subquery); COPY_SCALAR_FIELD(jointype); diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c index 1c8691a..be80f18 100644 --- a/src/backend/nodes/outfuncs.c +++ b/src/backend/nodes/outfuncs.c @@ -2015,6 +2015,10 @@ _outSetOperationStmt(StringInfo str, SetOperationStmt *node) static void _outRangeTblEntry(StringInfo str, RangeTblEntry *node) { +#ifdef PGXC + int i; +#endif + WRITE_NODE_TYPE("RTE"); /* put alias + eref first to make dump more legible */ @@ -2025,6 +2029,22 @@ _outRangeTblEntry(StringInfo str, RangeTblEntry *node) switch (node->rtekind) { case RTE_RELATION: +#ifdef PGXC + /* write tuple descriptor */ + appendStringInfo(str, " :tupdesc_natts %d (", node->reltupdesc->natts); + + for (i = 0 ; i < node->reltupdesc->natts ; i++) + { + appendStringInfo(str, ":colname "); + _outToken(str, NameStr(node->reltupdesc->attrs[i]->attname)); + appendStringInfo(str, " :coltypid %u ", + node->reltupdesc->attrs[i]->atttypid); + appendStringInfo(str, ":coltypmod %d ", + node->reltupdesc->attrs[i]->atttypmod); + } + + appendStringInfo(str, ") "); + #endif case RTE_SPECIAL: WRITE_OID_FIELD(relid); break; diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c index 1f562d7..72a2156 100644 --- a/src/backend/nodes/readfuncs.c +++ b/src/backend/nodes/readfuncs.c @@ -31,7 +31,9 @@ #include "nodes/parsenodes.h" #include "nodes/readfuncs.h" - +#ifdef PGXC +#include "access/htup.h" +#endif /* * Macros to simplify reading of different kinds of fields. 
Use these @@ -1104,6 +1106,12 @@ _readFromExpr(void) static RangeTblEntry * _readRangeTblEntry(void) { +#ifdef PGXC + int natts, i; + char *colname; + Oid typid, typmod; +#endif + READ_LOCALS(RangeTblEntry); /* put alias + eref first to make dump more legible */ @@ -1114,6 +1122,52 @@ _readRangeTblEntry(void) switch (local_node->rtekind) { case RTE_RELATION: +#ifdef PGXC + /* read tuple descriptor */ + token = pg_strtok(&length); /* skip :tupdesc_natts */ + token = pg_strtok(&length); /* get field value */ + + natts = atoi(token); + + if (natts > 0 && natts <= MaxTupleAttributeNumber) + local_node->reltupdesc = CreateTemplateTupleDesc(natts, false); + else + elog(ERROR, "invalid node field to read"); + + token = pg_strtok(&length); /* skip '(' */ + + if (length == 1 && pg_strncasecmp(token, "(", length) == 0) + { + for (i = 0 ; i < natts ; i++) + { + token = pg_strtok(&length); /* skip :colname */ + token = pg_strtok(&length); /* get colname */ + colname = nullable_string(token, length); + + if (colname == NULL) + elog(ERROR, "invalid node field to read"); + + token = pg_strtok(&length); /* skip :coltypid */ + token = pg_strtok(&length); /* get typid */ + typid = atooid(token); + + token = pg_strtok(&length); /* skip :coltypmod */ + token = pg_strtok(&length); /* get typmod */ + typmod = atoi(token); + + TupleDescInitEntry(local_node->reltupdesc, + (i + 1), colname, typid, typmod, 0); + } + } + else + elog(ERROR, "invalid node field to read"); + + token = pg_strtok(&length); /* skip '(' */ + + if (!(length == 1 && pg_strncasecmp(token, ")", length) == 0)) + elog(ERROR, "invalid node field to read"); +#endif + case RTE_SPECIAL: READ_OID_FIELD(relid); break; diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 818ea1b..4f3a7c6 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -34,6 +34,10 @@ #include "parser/parsetree.h" #ifdef PGXC #include "pgxc/planner.h" +#include "access/sysattr.h" +#include "utils/builtins.h" +#include "utils/syscache.h" +#include "catalog/pg_proc.h" #endif #include "utils/lsyscache.h" @@ -141,6 +145,9 @@ static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols, double limit_tuples); static Material *make_material(Plan *lefttree); +#ifdef PGXC +extern bool is_foreign_qual(Node *clause); +#endif /* * create_plan @@ -445,9 +452,6 @@ disuse_physical_tlist(Plan *plan, Path *path) case T_ValuesScan: case T_CteScan: case T_WorkTableScan: -#ifdef PGXC - case T_RemoteQuery: -#endif plan->targetlist = build_relation_tlist(path->parent); break; default: @@ -1583,9 +1587,23 @@ create_remotequery_plan(PlannerInfo *root, Path *best_path, List *tlist, List *scan_clauses) { RemoteQuery *scan_plan; + bool prefix; Index scan_relid = best_path->parent->relid; RangeTblEntry *rte; - + char *wherestr = NULL; + Bitmapset *varattnos = NULL; + List *remote_scan_clauses = NIL; + List *local_scan_clauses = NIL; + Oid nspid; + char *nspname; + char *relname; + const char *nspname_q; + const char *relname_q; + const char *aliasname_q; + int i; + TupleDesc tupdesc; + bool first; + StringInfoData sql; Assert(scan_relid > 0); rte = planner_rt_fetch(scan_relid, root); @@ -1598,16 +1616,159 @@ create_remotequery_plan(PlannerInfo *root, Path *best_path, /* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */ scan_clauses = extract_actual_clauses(scan_clauses, false); + if (scan_clauses) + { + ListCell *l; + + foreach(l, (List *)scan_clauses) + { + Node *clause = lfirst(l); 
+ + if (is_foreign_qual(clause)) + remote_scan_clauses = lappend(remote_scan_clauses, clause); + else + local_scan_clauses = lappend(local_scan_clauses, clause); + } + } + + /* + * Incorporate any remote_scan_clauses into the WHERE clause that + * we intend to push to the remote server. + */ + if (remote_scan_clauses) + { + char *sep = ""; + ListCell *l; + StringInfoData buf; + List *deparse_context; + + initStringInfo(&buf); + + deparse_context = deparse_context_for_remotequery( + get_rel_name(rte->relid), rte->relid); + + /* + * remote_scan_clauses is a list of scan clauses (restrictions) that we + * can push to the remote server. We want to deparse each of those + * expressions (that is, each member of the List) and AND them together + * into a WHERE clause. + */ + + foreach(l, (List *)remote_scan_clauses) + { + Node *clause = lfirst(l); + + appendStringInfo(&buf, "%s", sep ); + appendStringInfo(&buf, "%s", deparse_expression(clause, deparse_context, false, false)); + sep = " AND "; + } + + wherestr = buf.data; + } + + /* + * Now walk through the target list and the scan clauses to get the + * interesting attributes. Only those attributes will be fetched from the + * remote side. + */ + varattnos = pull_varattnos_varno((Node *) best_path->parent->reltargetlist, best_path->parent->relid, + varattnos); + varattnos = pull_varattnos_varno((Node *) local_scan_clauses, + best_path->parent->relid, varattnos); + /* + * Scanning multiple relations in a RemoteQuery node is not supported. + */ + prefix = false; +#if 0 + prefix = list_length(estate->es_range_table) > 1; +#endif + + /* Get quoted names of schema, table and alias */ + nspid = get_rel_namespace(rte->relid); + nspname = get_namespace_name(nspid); + relname = get_rel_name(rte->relid); + nspname_q = quote_identifier(nspname); + relname_q = quote_identifier(relname); + aliasname_q = quote_identifier(rte->eref->aliasname); + + initStringInfo(&sql); + + /* deparse SELECT clause */ + appendStringInfo(&sql, "SELECT "); + + /* + * TODO: omit (deparse to "NULL") columns which are not used in the + * original SQL. + * + * We must parse nodes parents of this RemoteQuery node to determine unused + * columns because some columns may be used only in parent Sort/Agg/Limit + * nodes. + */ + tupdesc = best_path->parent->reltupdesc; + first = true; + for (i = 0; i < tupdesc->natts; i++) + { + /* skip dropped attributes */ + if (tupdesc->attrs[i]->attisdropped) + continue; + + if (!first) + appendStringInfoString(&sql, ", "); + + if (bms_is_member(i + 1 - FirstLowInvalidHeapAttributeNumber, varattnos)) + { + if (prefix) + appendStringInfo(&sql, "%s.%s", + aliasname_q, tupdesc->attrs[i]->attname.data); + else + appendStringInfo(&sql, "%s", tupdesc->attrs[i]->attname.data); + } + else + appendStringInfo(&sql, "%s", "NULL"); + first = false; + } + + /* if target list is composed only of system attributes, add dummy column */ + if (first) + appendStringInfo(&sql, "NULL"); + + /* deparse FROM clause */ + appendStringInfo(&sql, " FROM "); + /* + * XXX: should use GENERIC OPTIONS like 'foreign_relname' or something for + * the foreign table name instead of the local name ? 
+ */ + appendStringInfo(&sql, "%s.%s %s", nspname_q, relname_q, aliasname_q); + pfree(nspname); + pfree(relname); + if (nspname_q != nspname_q) + pfree((char *) nspname_q); + if (relname_q != relname_q) + pfree((char *) relname_q); + if (aliasname_q != rte->eref->aliasname) + pfree((char *) aliasname_q); + + if (wherestr) + { + appendStringInfo(&sql, " WHERE "); + appendStringInfo(&sql, "%s", wherestr); + pfree(wherestr); + } + + bms_free(varattnos); + scan_plan = make_remotequery(tlist, rte, - scan_clauses, + local_scan_clauses, scan_relid); + scan_plan->sql_statement = sql.data; + copy_path_costsize(&scan_plan->scan.plan, best_path); /* PGXCTODO - get better estimates */ scan_plan->scan.plan.plan_rows = 1000; - + return scan_plan; } #endif diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c index 1d93203..957a515 100644 --- a/src/backend/optimizer/util/relnode.c +++ b/src/backend/optimizer/util/relnode.c @@ -92,6 +92,10 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind) rel->index_outer_relids = NULL; rel->index_inner_paths = NIL; +#ifdef PGXC + rel->reltupdesc = rte->reltupdesc; +#endif + /* Check type of rtable entry */ switch (rte->rtekind) { diff --git a/src/backend/optimizer/util/var.c b/src/backend/optimizer/util/var.c index 1a6826f..a574278 100644 --- a/src/backend/optimizer/util/var.c +++ b/src/backend/optimizer/util/var.c @@ -34,6 +34,14 @@ typedef struct int sublevels_up; } pull_varnos_context; +#ifdef PGXC +typedef struct +{ + Index varno; + Bitmapset *varattnos; +} pull_varattnos_context; +#endif + typedef struct { int var_location; @@ -68,6 +76,10 @@ typedef struct static bool pull_varnos_walker(Node *node, pull_varnos_context *context); static bool pull_varattnos_walker(Node *node, Bitmapset **varattnos); +#ifdef PGXC +static bool pull_varattnos_varno_walker(Node *node, + pull_varattnos_context *context); +#endif static bool contain_var_clause_walker(Node *node, void *context); static bool contain_vars_of_level_walker(Node *node, int *sublevels_up); static bool locate_var_of_level_walker(Node *node, @@ -228,6 +240,54 @@ contain_var_clause(Node *node) return contain_var_clause_walker(node, NULL); } +#ifdef PGXC +/* + * pull_varattnos_varno + * Find all the distinct attribute numbers present in an expression tree, + * and add them to the initial contents of *varattnos. + * + * Attribute numbers are offset by FirstLowInvalidHeapAttributeNumber so that + * we can include system attributes (e.g., OID) in the bitmap representation. 
+ * + * This is same as pull_varattnos except for the fact that it gets attributes + * for the given varno + */ +Bitmapset * +pull_varattnos_varno(Node *node, Index varno, Bitmapset *varattnos) +{ + pull_varattnos_context context; + + context.varno = varno; + context.varattnos = varattnos; + + (void) pull_varattnos_varno_walker(node, &context); + + return context.varattnos; +} + +static bool +pull_varattnos_varno_walker(Node *node, pull_varattnos_context *context) +{ + if (node == NULL) + return false; + + Assert(context != NULL); + + if (IsA(node, Var)) + { + Var *var = (Var *) node; + + if (var->varno == context->varno) + context->varattnos = bms_add_member(context->varattnos, + var->varattno - FirstLowInvalidHeapAttributeNumber); + return false; + } + + return expression_tree_walker(node, pull_varattnos_varno_walker, + (void *) context); +} +#endif + static bool contain_var_clause_walker(Node *node, void *context) { diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c index 5a42451..d63e504 100644 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -923,6 +923,10 @@ addRangeTableEntry(ParseState *pstate, rel = parserOpenTable(pstate, relation, lockmode); rte->relid = RelationGetRelid(rel); +#ifdef PGXC + rte->reltupdesc = CreateTupleDescCopyConstr(rel->rd_att); +#endif + /* * Build the list of effective column names using user-supplied aliases * and/or actual column names. @@ -985,6 +989,10 @@ addRangeTableEntryForRelation(ParseState *pstate, rte->alias = alias; rte->relid = RelationGetRelid(rel); +#ifdef PGXC + rte->reltupdesc = CreateTupleDescCopyConstr(rel->rd_att); +#endif + /* * Build the list of effective column names using user-supplied aliases * and/or actual column names. diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 14dce33..c493eb3 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -2723,11 +2723,6 @@ ExecRemoteQuery(RemoteQueryState *node) errmsg("Could not begin transaction on data nodes."))); } - /* Get the SQL string */ - /* only do if not single step */ - if (!step->is_single_step) - step->sql_statement = deparseSql(node); - /* See if we have a primary node, execute on it first before the others */ if (primaryconnection) { diff --git a/src/backend/pgxc/pool/postgresql_fdw.c b/src/backend/pgxc/pool/postgresql_fdw.c index dabf5da..14c0ddb 100644 --- a/src/backend/pgxc/pool/postgresql_fdw.c +++ b/src/backend/pgxc/pool/postgresql_fdw.c @@ -45,7 +45,7 @@ /* deparse SQL from the request */ bool is_immutable_func(Oid funcid); -static bool is_foreign_qual(ExprState *state); +bool is_foreign_qual(Node *node); static bool foreign_qual_walker(Node *node, void *context); char *deparseSql(RemoteQueryState *scanstate); @@ -103,10 +103,10 @@ is_immutable_func(Oid funcid) * local server in the foreign server. 
* - scalar array operator (ANY/ALL) */ -static bool -is_foreign_qual(ExprState *state) +bool +is_foreign_qual(Node *node) { - return !foreign_qual_walker((Node *) state->expr, NULL); + return !foreign_qual_walker(node, NULL); } /* @@ -120,6 +120,9 @@ foreign_qual_walker(Node *node, void *context) switch (nodeTag(node)) { + case T_ExprState: + return foreign_qual_walker((Node *) ((ExprState *) node)->expr, NULL); + case T_Param: /* TODO: pass internal parameters to the foreign server */ if (((Param *) node)->paramkind != PARAM_EXTERN) @@ -286,7 +289,7 @@ elog(DEBUG2, "%s(%u) called", __FUNCTION__, __LINE__); { ExprState *state = lfirst(lc); - if (is_foreign_qual(state)) + if (is_foreign_qual((Node *) state)) { elog(DEBUG1, "foreign qual: %s", nodeToString(state->expr)); foreign_qual = lappend(foreign_qual, state); @@ -317,7 +320,7 @@ elog(DEBUG2, "%s(%u) called", __FUNCTION__, __LINE__); Node *node; node = (Node *) make_ands_explicit(foreign_expr); appendStringInfo(&sql, " WHERE "); - appendStringInfo(&sql, + appendStringInfo(&sql, "%s", deparse_expression(node, context, prefix, false)); /* * The contents of the list MUST NOT be free-ed because they are diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index c930701..130dff3 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -114,6 +114,8 @@ typedef struct List *subplans; /* List of subplans, in plan-tree case */ Plan *outer_plan; /* OUTER subplan, or NULL if none */ Plan *inner_plan; /* INNER subplan, or NULL if none */ + + bool remotequery; /* deparse context for remote query */ } deparse_namespace; @@ -1936,10 +1938,42 @@ deparse_context_for(const char *aliasname, Oid relid) dpns->ctes = NIL; dpns->subplans = NIL; dpns->outer_plan = dpns->inner_plan = NULL; +#ifdef PGXC + dpns->remotequery = false; +#endif + + /* Return a one-deep namespace stack */ + return list_make1(dpns); +} + +#ifdef PGXC +List * +deparse_context_for_remotequery(const char *aliasname, Oid relid) +{ + deparse_namespace *dpns; + RangeTblEntry *rte; + + dpns = (deparse_namespace *) palloc(sizeof(deparse_namespace)); + + /* Build a minimal RTE for the rel */ + rte = makeNode(RangeTblEntry); + rte->rtekind = RTE_RELATION; + rte->relid = relid; + rte->eref = makeAlias(aliasname, NIL); + rte->inh = false; + rte->inFromCl = true; + + /* Build one-element rtable */ + dpns->rtable = list_make1(rte); + dpns->ctes = NIL; + dpns->subplans = NIL; + dpns->outer_plan = dpns->inner_plan = NULL; + dpns->remotequery = true; /* Return a one-deep namespace stack */ return list_make1(dpns); } +#endif /* * deparse_context_for_plan - Build deparse context for a plan node @@ -1974,7 +2008,9 @@ deparse_context_for_plan(Node *plan, Node *outer_plan, dpns->rtable = rtable; dpns->ctes = NIL; dpns->subplans = subplans; - +#ifdef PGXC + dpns->remotequery = false; +#endif /* * Set up outer_plan and inner_plan from the Plan node (this includes * various special cases for particular Plan types). 
@@ -2138,7 +2174,9 @@ make_ruledef(StringInfo buf, HeapTuple ruletup, TupleDesc rulettc, dpns.ctes = query->cteList; dpns.subplans = NIL; dpns.outer_plan = dpns.inner_plan = NULL; - +#ifdef PGXC + dpns.remotequery = false; +#endif get_rule_expr(qual, &context, false); } @@ -2285,7 +2323,9 @@ get_query_def(Query *query, StringInfo buf, List *parentnamespace, dpns.ctes = query->cteList; dpns.subplans = NIL; dpns.outer_plan = dpns.inner_plan = NULL; - +#ifdef PGXC + dpns.remotequery = false; +#endif switch (query->commandType) { case CMD_SELECT: @@ -3379,6 +3419,14 @@ get_variable(Var *var, int levelsup, bool showstar, deparse_context *context) * likely that varno is OUTER or INNER, in which case we must dig down * into the subplans. */ +#ifdef PGXC + if (dpns->remotequery) + { + rte = rt_fetch(1, dpns->rtable); + attnum = var->varattno; + } + else +#endif if (var->varno >= 1 && var->varno <= list_length(dpns->rtable)) { rte = rt_fetch(var->varno, dpns->rtable); @@ -3705,6 +3753,9 @@ get_name_for_var_field(Var *var, int fieldno, mydpns.ctes = rte->subquery->cteList; mydpns.subplans = NIL; mydpns.outer_plan = mydpns.inner_plan = NULL; +#ifdef PGXC + mydpns.remotequery = false; +#endif context->namespaces = lcons(&mydpns, context->namespaces); @@ -3828,7 +3879,9 @@ get_name_for_var_field(Var *var, int fieldno, mydpns.ctes = ctequery->cteList; mydpns.subplans = NIL; mydpns.outer_plan = mydpns.inner_plan = NULL; - +#ifdef PGXC + mydpns.remotequery = false; +#endif new_nslist = list_copy_tail(context->namespaces, ctelevelsup); context->namespaces = lcons(&mydpns, new_nslist); diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h index 5fb2a2b..9423175 100644 --- a/src/include/nodes/parsenodes.h +++ b/src/include/nodes/parsenodes.h @@ -24,6 +24,9 @@ #include "nodes/bitmapset.h" #include "nodes/primnodes.h" #include "nodes/value.h" +#ifdef PGXC +#include "access/tupdesc.h" +#endif /* Possible sources of a Query */ typedef enum QuerySource @@ -659,6 +662,10 @@ typedef struct RangeTblEntry * code that is being actively worked on. FIXME someday. */ +#ifdef PGXC + TupleDesc reltupdesc; +#endif + /* * Fields valid for a plain relation RTE (else zero): */ diff --git a/src/include/nodes/relation.h b/src/include/nodes/relation.h index ea48889..581ce0a 100644 --- a/src/include/nodes/relation.h +++ b/src/include/nodes/relation.h @@ -377,6 +377,10 @@ typedef struct RelOptInfo * clauses */ List *index_inner_paths; /* InnerIndexscanInfo nodes */ +#ifdef PGXC + TupleDesc reltupdesc; +#endif + /* * Inner indexscans are not in the main pathlist because they are not * usable except in specific join contexts. 
We use the index_inner_paths diff --git a/src/include/optimizer/var.h b/src/include/optimizer/var.h index 08e885b..966e827 100644 --- a/src/include/optimizer/var.h +++ b/src/include/optimizer/var.h @@ -25,6 +25,9 @@ typedef enum extern Relids pull_varnos(Node *node); extern void pull_varattnos(Node *node, Bitmapset **varattnos); +#ifdef PGXC +extern Bitmapset * pull_varattnos_varno(Node *node, Index varno, Bitmapset *varattnos); +#endif extern bool contain_var_clause(Node *node); extern bool contain_vars_of_level(Node *node, int levelsup); extern int locate_var_of_level(Node *node, int levelsup); diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h index 50b9ab2..85384b5 100644 --- a/src/include/utils/builtins.h +++ b/src/include/utils/builtins.h @@ -595,7 +595,10 @@ extern Datum pg_get_function_identity_arguments(PG_FUNCTION_ARGS); extern Datum pg_get_function_result(PG_FUNCTION_ARGS); extern char *deparse_expression(Node *expr, List *dpcontext, bool forceprefix, bool showimplicit); +extern List *deparse_context_for_remotequery(const char *aliasname, Oid relid); +#ifdef PGXC extern List *deparse_context_for(const char *aliasname, Oid relid); +#endif extern List *deparse_context_for_plan(Node *plan, Node *outer_plan, List *rtable, List *subplans); extern const char *quote_identifier(const char *ident); ----------------------------------------------------------------------- hooks/post-receive -- Postgres-XC |
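As a rough illustration of the squery that create_remotejoin_plan assembles: for a query like SELECT e.x, d.x FROM emp e LEFT JOIN emp2 d ON e.x = d.x, with both children reduced RemoteQuery scans on the same remote node and rs_alias_index starting at 1, the reduced statement wraps each child's sql_statement in an aliased subquery and renames the selected columns using the colname_varno_varattno_reduce_level convention described in the comments above. The sketch below is a self-contained stand-in, not planner output; the subquery strings, aliases and column names are assumed for the example.

#include <stdio.h>

int
main(void)
{
    /* hypothetical children; the real strings come from outer->sql_statement / inner->sql_statement */
    const char *outer_sql = "SELECT x FROM emp e";
    const char *inner_sql = "SELECT x FROM emp2 d";
    char        squery[512];

    /*
     * Shape of the reduced remote statement: aliased subqueries in the FROM
     * list, the join clause rewritten against those aliases, and target
     * columns renamed as colname_varno_varattno_reduce_level.
     */
    snprintf(squery, sizeof(squery),
             "SELECT out_1.x AS x_1_1_1, in_1.x AS x_2_1_1 "
             "FROM (%s) out_1 LEFT JOIN (%s) in_1 ON out_1.x = in_1.x",
             outer_sql, inner_sql);

    puts(squery);
    return 0;
}

For an inner join the patch switches to a comma-separated FROM list with a WHERE clause instead of ON (the use_where flag), and local-only quals stay behind in the reduced node's qual list rather than being shipped.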
From: Michael P. <mic...@us...> - 2010-10-13 05:46:42
|
Project "Postgres-XC". The branch, master has been updated via ca4fb6103add2b4560b8efe142f24d94ed03d56e (commit) from 52af07a890baeb608b5ea59211eb4a080511e8c7 (commit) - Log ----------------------------------------------------------------- commit ca4fb6103add2b4560b8efe142f24d94ed03d56e Author: Michael P <mic...@us...> Date: Wed Oct 13 14:41:13 2010 +0900 After a Commit of prepared transaction on GTM, Connection from PGXC Node to GTM was always reinitialized even if process went correctly on GTM. Now if Commit Prepared at GTM runs without error, connection is not reinitialized. Bug found by Benny Mei Le diff --git a/src/gtm/client/gtm_client.c b/src/gtm/client/gtm_client.c index 984aee1..53ab3f3 100644 --- a/src/gtm/client/gtm_client.c +++ b/src/gtm/client/gtm_client.c @@ -215,6 +215,8 @@ commit_prepared_transaction(GTM_Conn *conn, GlobalTransactionId gxid, GlobalTran Assert(res->gr_resdata.grd_gxid == gxid); } + return res->gr_status; + send_failed: receive_failed: return -1; ----------------------------------------------------------------------- Summary of changes: src/gtm/client/gtm_client.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) hooks/post-receive -- Postgres-XC |
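The one-line change above makes commit_prepared_transaction return the GTM result status on the success path instead of always falling through to the send_failed/receive_failed labels, so a clean COMMIT PREPARED no longer looks like a connection failure to the caller. A small self-contained sketch of that control-flow pattern; the function and variable names here are stand-ins, not the real GTM client API:

#include <stdio.h>

/* stand-in for the send/receive round trip to GTM; returns 0 when the request went through */
static int
send_commit_prepared(int *gtm_status)
{
    *gtm_status = 0;            /* pretend GTM reported success */
    return 0;
}

static int
commit_prepared(int *gtm_status)
{
    if (send_commit_prepared(gtm_status) != 0)
        goto send_failed;

    return *gtm_status;         /* success: surface GTM's own status, keep the connection */

send_failed:
    return -1;                  /* only this path should make the caller reconnect */
}

int
main(void)
{
    int     status;

    if (commit_prepared(&status) < 0)
        printf("reinitializing GTM connection\n");
    else
        printf("commit prepared ok (status %d), connection kept\n", status);
    return 0;
}

This matches the caller-side convention in access/transam/gtm.c, where the comments note that the connection is reset only when something went wrong (for example a timeout).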
From: mason_s <ma...@us...> - 2010-10-13 03:03:38
|
Project "Postgres-XC". The branch, master has been updated via 52af07a890baeb608b5ea59211eb4a080511e8c7 (commit) from 88162bcb5cb3dabf8cef3717ad1837182fb5f5dc (commit) - Log ----------------------------------------------------------------- commit 52af07a890baeb608b5ea59211eb4a080511e8c7 Author: Mason Sharp <ma...@us...> Date: Tue Oct 12 23:00:12 2010 -0400 Fix bug with pooler. Make sure socket removed when signal to stop is trapped. diff --git a/src/backend/pgxc/pool/poolcomm.c b/src/backend/pgxc/pool/poolcomm.c index 7e4771c..853b385 100644 --- a/src/backend/pgxc/pool/poolcomm.c +++ b/src/backend/pgxc/pool/poolcomm.c @@ -5,7 +5,7 @@ * Communication functions between the pool manager and session * * - * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group + * Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group * Portions Copyright (c) 2010 Nippon Telegraph and Telephone Corporation * *------------------------------------------------------------------------- @@ -41,6 +41,8 @@ static int pool_discardbytes(PoolPort *port, size_t len); static char sock_path[MAXPGPATH]; +static void StreamDoUnlink(int code, Datum arg); + static int Lock_AF_UNIX(unsigned short port, const char *unixSocketName); #endif @@ -77,6 +79,9 @@ pool_listen(unsigned short port, const char *unixSocketName) if (listen(fd, 5) < 0) return -1; + /* Arrange to unlink the socket file at exit */ + on_proc_exit(StreamDoUnlink, 0); + return fd; #else /* TODO support for non-unix platform */ @@ -87,6 +92,19 @@ pool_listen(unsigned short port, const char *unixSocketName) #endif } +/* StreamDoUnlink() + * Shutdown routine for pooler connection + * If a Unix socket is used for communication, explicitly close it. + */ +#ifdef HAVE_UNIX_SOCKETS +static void +StreamDoUnlink(int code, Datum arg) +{ + Assert(sock_path[0]); + unlink(sock_path); +} +#endif /* HAVE_UNIX_SOCKETS */ + #ifdef HAVE_UNIX_SOCKETS static int Lock_AF_UNIX(unsigned short port, const char *unixSocketName) @@ -411,8 +429,8 @@ pool_flush(PoolPort *port) { last_reported_send_errno = errno; - /* - * Handle a seg fault that may later occur in proc array + /* + * Handle a seg fault that may later occur in proc array * when this fails when we are already shutting down * If shutting down already, do not call. */ ----------------------------------------------------------------------- Summary of changes: src/backend/pgxc/pool/poolcomm.c | 24 +++++++++++++++++++++--- 1 files changed, 21 insertions(+), 3 deletions(-) hooks/post-receive -- Postgres-XC |
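The fix registers StreamDoUnlink through on_proc_exit so the pooler's Unix socket file is unlinked even when the process is stopped by a signal. A compact stand-alone sketch of the same idea, using plain atexit() and a made-up socket path in place of the backend's on_proc_exit machinery:

#include <stdlib.h>
#include <unistd.h>

/* hypothetical path; the real sock_path is derived from the port and socket name passed to pool_listen() */
static const char sock_path[] = "/tmp/.s.PGPOOL.demo";

static void
stream_do_unlink(void)
{
    unlink(sock_path);          /* remove the socket file at process exit */
}

int
main(void)
{
    /* the backend calls on_proc_exit(StreamDoUnlink, 0); atexit() stands in for it here */
    if (atexit(stream_do_unlink) != 0)
        return 1;

    /* ... bind(), listen() and the pooler's accept loop would run here ... */
    return 0;
}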
From: Michael P. <mic...@us...> - 2010-10-13 01:48:50
|
Project "Postgres-XC". The branch, master has been updated via 88162bcb5cb3dabf8cef3717ad1837182fb5f5dc (commit) from ea13b66f4beaeb13db9741fb5a1347f976b9ebab (commit) - Log ----------------------------------------------------------------- commit 88162bcb5cb3dabf8cef3717ad1837182fb5f5dc Author: Michael P <mic...@us...> Date: Wed Oct 13 10:45:16 2010 +0900 Added support for two new pieces of functionality. 1) Support for DDL and utility command synchronisation among Coordinators. DDL is now synchronized amongst multiple coordinators. Previously, after DDL it was required to use an extra utility to resync the nodes and restart other Coordinators. This is no longer necessary. DDL support works also with common BEGIN, COMMIT and ROLLBACK instructions in the cluster. DDL may be initiated at any node. Each Coordinator can connect to any other one. Just as Coordinators use pools for connecting to Data Nodes, Coordinators now use pools for connecting to the other Coordinators. 2) Support for PREPARE TRANSACTION and COMMIT TRANSACTION, ROLLBACK PREPARED. When a transaction is prepared or committed, based on the SQL, it will only execute on the involved nodes, including DDL on Coordinators. GTM is used track which xid and nodes are involved in the transaction, identified by the user or application specified transaction identifier, when it is prepared. New GUCs -------- There are some new GUCs for handling Coordinator communication num_coordinators coordinator_hosts coordinator_ports coordinator_users coordinator_passwords In addition, a new GUC replaces coordinator_id: pgxc_node_id Open Issues ----------- Implicit two phase commit (client in autocommit mode, but distributed transaction required because of multiple nodes) does not first prepare on the originating coordinator before committing, if DDL is involved. We really should prepare here before committing on all nodes. We also need to add a bit of special handling for COMMIT PREPARED. If there is an error, and it got committed on some nodes, we still should force it to be committed on the originating coordinator, if involved, and still return an error of some sort that it was partially committed. (When the downed node recovers, in the future it will determine if any other node has committed the transaction, and if so, it, too, must commit.) It is a pretty rare case, but we should handle it. With this current configuration, DDL will fail if at least one Coordinator is down. In the future, we will make this more flexible. 
Written by Michael Paquier diff --git a/src/backend/access/transam/gtm.c b/src/backend/access/transam/gtm.c index 08ed2c9..64437e7 100644 --- a/src/backend/access/transam/gtm.c +++ b/src/backend/access/transam/gtm.c @@ -20,7 +20,7 @@ /* Configuration variables */ char *GtmHost = "localhost"; int GtmPort = 6666; -int GtmCoordinatorId = 1; +int PGXCNodeId = 1; extern bool FirstSnapshotSet; @@ -42,7 +42,7 @@ InitGTM() /* 256 bytes should be enough */ char conn_str[256]; - sprintf(conn_str, "host=%s port=%d coordinator_id=%d", GtmHost, GtmPort, GtmCoordinatorId); + sprintf(conn_str, "host=%s port=%d coordinator_id=%d", GtmHost, GtmPort, PGXCNodeId); conn = PQconnectGTM(conn_str); if (GTMPQstatus(conn) != CONNECTION_OK) @@ -187,7 +187,7 @@ RollbackTranGTM(GlobalTransactionId gxid) } int -BeingPreparedTranGTM(GlobalTransactionId gxid, +StartPreparedTranGTM(GlobalTransactionId gxid, char *gid, int datanodecnt, PGXC_NodeId datanodes[], @@ -200,7 +200,7 @@ BeingPreparedTranGTM(GlobalTransactionId gxid, return 0; CheckConnection(); - ret = being_prepared_transaction(conn, gxid, gid, datanodecnt, datanodes, coordcnt, coordinators); + ret = start_prepared_transaction(conn, gxid, gid, datanodecnt, datanodes, coordcnt, coordinators); /* * If something went wrong (timeout), try and reset GTM connection. diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c index d881078..97f6c76 100644 --- a/src/backend/access/transam/twophase.c +++ b/src/backend/access/transam/twophase.c @@ -892,8 +892,13 @@ StartPrepare(GlobalTransaction gxact) * * Calculates CRC and writes state file to WAL and in pg_twophase directory. */ +#ifdef PGXC +void +EndPrepare(GlobalTransaction gxact, bool write_2pc_file) +#else void EndPrepare(GlobalTransaction gxact) +#endif { TransactionId xid = gxact->proc.xid; TwoPhaseFileHeader *hdr; @@ -929,9 +934,10 @@ EndPrepare(GlobalTransaction gxact) * critical section, though, it doesn't matter since any failure causes * PANIC anyway. */ + #ifdef PGXC - /* Do not write 2PC state file on Coordinator side */ - if (IS_PGXC_DATANODE) + /* Write 2PC state file on Coordinator side if a DDL is involved in transaction */ + if (write_2pc_file) { #endif TwoPhaseFilePath(path, xid); @@ -1009,6 +1015,7 @@ EndPrepare(GlobalTransaction gxact) #ifdef PGXC } #endif + START_CRIT_SECTION(); MyProc->inCommit = true; @@ -1020,8 +1027,11 @@ EndPrepare(GlobalTransaction gxact) /* If we crash now, we have prepared: WAL replay will fix things */ #ifdef PGXC - /* Just write 2PC state file on Datanodes */ - if (IS_PGXC_DATANODE) + /* + * Just write 2PC state file on Datanodes + * or on Coordinators if DDL queries are involved. 
+ */ + if (write_2pc_file) { #endif @@ -1038,6 +1048,7 @@ EndPrepare(GlobalTransaction gxact) ereport(ERROR, (errcode_for_file_access(), errmsg("could not close two-phase state file: %m"))); + #ifdef PGXC } #endif @@ -1893,15 +1904,16 @@ RecordTransactionAbortPrepared(TransactionId xid, END_CRIT_SECTION(); } + #ifdef PGXC /* * Remove a gxact on a Coordinator, * this is used to be able to prepare a commit transaction on another coordinator than the one - * who prepared the transaction + * who prepared the transaction, for a transaction that does not include DDLs */ void RemoveGXactCoord(GlobalTransaction gxact) { - RemoveGXact(gxact); + RemoveGXact(gxact); } #endif diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c index 5176e85..03e6d90 100644 --- a/src/backend/access/transam/varsup.c +++ b/src/backend/access/transam/varsup.c @@ -22,7 +22,7 @@ #include "storage/pmsignal.h" #include "storage/proc.h" #include "utils/builtins.h" -#ifdef PGXC +#ifdef PGXC #include "pgxc/pgxc.h" #include "access/gtm.h" #endif @@ -99,25 +99,27 @@ GetNewTransactionId(bool isSubXact) return BootstrapTransactionId; } -#ifdef PGXC - if (IS_PGXC_COORDINATOR) +#ifdef PGXC + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { - /* Get XID from GTM before acquiring the lock. + /* + * Get XID from GTM before acquiring the lock. * The rest of the code will handle it if after obtaining XIDs, * the lock is acquired in a different order. * This will help with GTM connection issues- we will not * block all other processes. + * GXID can just be obtained from a remote Coordinator */ xid = (TransactionId) BeginTranGTM(timestamp); - *timestamp_received = true; + *timestamp_received = true; } - #endif LWLockAcquire(XidGenLock, LW_EXCLUSIVE); -#ifdef PGXC - if (IS_PGXC_COORDINATOR) +#ifdef PGXC + /* Only remote Coordinator can go a GXID */ + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { if (TransactionIdIsValid(xid)) { @@ -140,7 +142,8 @@ GetNewTransactionId(bool isSubXact) LWLockRelease(XidGenLock); return xid; } - } else if(IS_PGXC_DATANODE) + } + else if(IS_PGXC_DATANODE || IsConnFromCoord()) { if (IsAutoVacuumWorkerProcess()) { @@ -159,7 +162,8 @@ GetNewTransactionId(bool isSubXact) /* try and get gxid directly from GTM */ next_xid = (TransactionId) BeginTranGTM(NULL); } - } else if (GetForceXidFromGTM()) + } + else if (GetForceXidFromGTM()) { elog (DEBUG1, "Force get XID from GTM"); /* try and get gxid directly from GTM */ diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 458068c..f51672e 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -26,7 +26,7 @@ #include "access/gtm.h" /* PGXC_COORD */ #include "gtm/gtm_c.h" -#include "pgxc/datanode.h" +#include "pgxc/pgxcnode.h" /* PGXC_DATANODE */ #include "postmaster/autovacuum.h" #endif @@ -116,7 +116,10 @@ typedef enum TBlockState TBLOCK_ABORT_END, /* failed xact, ROLLBACK received */ TBLOCK_ABORT_PENDING, /* live xact, ROLLBACK received */ TBLOCK_PREPARE, /* live xact, PREPARE received */ - +#ifdef PGXC + TBLOCK_PREPARE_NO_2PC_FILE, /* PREPARE receive but skip 2PC file creation + * and Commit gxact */ +#endif /* subtransaction states */ TBLOCK_SUBBEGIN, /* starting a subtransaction */ TBLOCK_SUBINPROGRESS, /* live subtransaction */ @@ -334,7 +337,7 @@ static GlobalTransactionId GetGlobalTransactionId(TransactionState s) { GTM_Timestamp gtm_timestamp; - bool received_tp; + bool received_tp = false; /* * Here we receive timestamp at the same time as gxid. 
@@ -495,7 +498,7 @@ AssignTransactionId(TransactionState s) * the Xid as "running". See GetNewTransactionId. */ #ifdef PGXC /* PGXC_COORD */ - if (IS_PGXC_COORDINATOR) + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { s->transactionId = (TransactionId) GetGlobalTransactionId(s); elog(DEBUG1, "New transaction id assigned = %d, isSubXact = %s", @@ -1629,7 +1632,8 @@ StartTransaction(void) */ s->state = TRANS_START; #ifdef PGXC /* PGXC_COORD */ - if (IS_PGXC_COORDINATOR) + /* GXID is assigned already by a remote Coordinator */ + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) s->globalTransactionId = InvalidGlobalTransactionId; /* until assigned */ #endif s->transactionId = InvalidTransactionId; /* until assigned */ @@ -1797,7 +1801,7 @@ CommitTransaction(void) * There can be error on the data nodes. So go to data nodes before * changing transaction state and local clean up */ - DataNodeCommit(); + PGXCNodeCommit(); #endif /* Prevent cancel/die interrupt while cleaning up */ @@ -1818,14 +1822,15 @@ CommitTransaction(void) #ifdef PGXC /* - * Now we can let GTM know about transaction commit + * Now we can let GTM know about transaction commit. + * Only a Remote Coordinator is allowed to do that. */ - if (IS_PGXC_COORDINATOR) + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { CommitTranGTM(s->globalTransactionId); latestXid = s->globalTransactionId; } - else if (IS_PGXC_DATANODE) + else if (IS_PGXC_DATANODE || IsConnFromCoord()) { /* If we are autovacuum, commit on GTM */ if ((IsAutoVacuumWorkerProcess() || GetForceXidFromGTM()) @@ -1930,9 +1935,9 @@ CommitTransaction(void) s->maxChildXids = 0; #ifdef PGXC - if (IS_PGXC_COORDINATOR) + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) s->globalTransactionId = InvalidGlobalTransactionId; - else if (IS_PGXC_DATANODE) + else if (IS_PGXC_DATANODE || IsConnFromCoord()) SetNextTransactionId(InvalidTransactionId); #endif @@ -1951,8 +1956,17 @@ CommitTransaction(void) * * NB: if you change this routine, better look at CommitTransaction too! */ +#ifdef PGXC +/* + * Only a Postgres-XC Coordinator that received a PREPARE Command from + * an application can use this special prepare. + */ +static void +PrepareTransaction(bool write_2pc_file) +#else static void PrepareTransaction(void) +#endif { TransactionState s = CurrentTransactionState; TransactionId xid = GetCurrentTransactionId(); @@ -2084,7 +2098,7 @@ PrepareTransaction(void) * updates, because the transaction manager might get confused if we lose * a global transaction. */ - EndPrepare(gxact); + EndPrepare(gxact, write_2pc_file); /* * Now we clean up backend-internal state and release internal resources. @@ -2138,7 +2152,7 @@ PrepareTransaction(void) * We want to be able to commit a prepared transaction from another coordinator, * so clean up the gxact in shared memory also. 
*/ - if (IS_PGXC_COORDINATOR) + if (!write_2pc_file) { RemoveGXactCoord(gxact); } @@ -2183,7 +2197,7 @@ PrepareTransaction(void) s->maxChildXids = 0; #ifdef PGXC /* PGXC_DATANODE */ - if (IS_PGXC_DATANODE) + if (IS_PGXC_DATANODE || IsConnFromCoord()) SetNextTransactionId(InvalidTransactionId); #endif /* @@ -2273,16 +2287,18 @@ AbortTransaction(void) TRACE_POSTGRESQL_TRANSACTION_ABORT(MyProc->lxid); #ifdef PGXC - if (IS_PGXC_COORDINATOR) + /* This is done by remote Coordinator */ + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { - /* Make sure this is rolled back on the DataNodes, - * if so it will just return + /* + * Make sure this is rolled back on the DataNodes + * if so it will just return */ - DataNodeRollback(); + PGXCNodeRollback(); RollbackTranGTM(s->globalTransactionId); latestXid = s->globalTransactionId; } - else if (IS_PGXC_DATANODE) + else if (IS_PGXC_DATANODE || IsConnFromCoord()) { /* If we are autovacuum, commit on GTM */ if ((IsAutoVacuumWorkerProcess() || GetForceXidFromGTM()) @@ -2378,9 +2394,9 @@ CleanupTransaction(void) s->maxChildXids = 0; #ifdef PGXC /* PGXC_DATANODE */ - if (IS_PGXC_COORDINATOR) + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) s->globalTransactionId = InvalidGlobalTransactionId; - else if (IS_PGXC_DATANODE) + else if (IS_PGXC_DATANODE || IsConnFromCoord()) SetNextTransactionId(InvalidTransactionId); #endif @@ -2446,6 +2462,9 @@ StartTransactionCommand(void) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(ERROR, "StartTransactionCommand: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -2552,9 +2571,20 @@ CommitTransactionCommand(void) * return to the idle state. */ case TBLOCK_PREPARE: - PrepareTransaction(); + PrepareTransaction(true); + s->blockState = TBLOCK_DEFAULT; + break; + +#ifdef PGXC + /* + * We are complieting a PREPARE TRANSACTION for a pgxc transaction + * that involved DDLs on a Coordinator. + */ + case TBLOCK_PREPARE_NO_2PC_FILE: + PrepareTransaction(false); s->blockState = TBLOCK_DEFAULT; break; +#endif /* * We were just issued a SAVEPOINT inside a transaction block. @@ -2586,10 +2616,15 @@ CommitTransactionCommand(void) CommitTransaction(); s->blockState = TBLOCK_DEFAULT; } +#ifdef PGXC + else if (s->blockState == TBLOCK_PREPARE || + s->blockState == TBLOCK_PREPARE_NO_2PC_FILE) +#else else if (s->blockState == TBLOCK_PREPARE) +#endif { Assert(s->parent == NULL); - PrepareTransaction(); + PrepareTransaction(true); s->blockState = TBLOCK_DEFAULT; } else @@ -2789,6 +2824,9 @@ AbortCurrentTransaction(void) * the transaction). */ case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif AbortTransaction(); CleanupTransaction(); s->blockState = TBLOCK_DEFAULT; @@ -3140,6 +3178,9 @@ BeginTransactionBlock(void) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "BeginTransactionBlock: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3158,8 +3199,13 @@ BeginTransactionBlock(void) * We do it this way because it's not convenient to change memory context, * resource owner, etc while executing inside a Portal. 
*/ +#ifdef PGXC +bool +PrepareTransactionBlock(char *gid, bool write_2pc_file) +#else bool PrepareTransactionBlock(char *gid) +#endif { TransactionState s; bool result; @@ -3180,6 +3226,16 @@ PrepareTransactionBlock(char *gid) /* Save GID where PrepareTransaction can find it again */ prepareGID = MemoryContextStrdup(TopTransactionContext, gid); +#ifdef PGXC + /* + * For a Postgres-XC Coordinator, prepare is done for a transaction + * if and only if a DDL was involved in the transaction. + * If not, it is enough to prepare it on Datanodes involved only. + */ + if (!write_2pc_file) + s->blockState = TBLOCK_PREPARE_NO_2PC_FILE; + else +#endif s->blockState = TBLOCK_PREPARE; } else @@ -3308,6 +3364,9 @@ EndTransactionBlock(void) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "EndTransactionBlock: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3400,6 +3459,9 @@ UserAbortTransactionBlock(void) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "UserAbortTransactionBlock: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3447,6 +3509,9 @@ DefineSavepoint(char *name) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "DefineSavepoint: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3503,6 +3568,9 @@ ReleaseSavepoint(List *options) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "ReleaseSavepoint: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3601,6 +3669,9 @@ RollbackToSavepoint(List *options) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "RollbackToSavepoint: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3684,6 +3755,9 @@ BeginInternalSubTransaction(char *name) case TBLOCK_INPROGRESS: case TBLOCK_END: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif case TBLOCK_SUBINPROGRESS: /* Normal subtransaction start */ PushTransaction(); @@ -3776,6 +3850,9 @@ RollbackAndReleaseCurrentSubTransaction(void) case TBLOCK_SUBRESTART: case TBLOCK_SUBABORT_RESTART: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif elog(FATAL, "RollbackAndReleaseCurrentSubTransaction: unexpected state %s", BlockStateAsString(s->blockState)); break; @@ -3824,6 +3901,9 @@ AbortOutOfAnyTransaction(void) case TBLOCK_END: case TBLOCK_ABORT_PENDING: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif /* In a transaction, so clean up */ AbortTransaction(); CleanupTransaction(); @@ -3915,6 +3995,9 @@ TransactionBlockStatusCode(void) case TBLOCK_END: case TBLOCK_SUBEND: case TBLOCK_PREPARE: +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif return 'T'; /* in transaction */ case TBLOCK_ABORT: case TBLOCK_SUBABORT: @@ -4273,7 +4356,7 @@ PushTransaction(void) * failure. 
*/ #ifdef PGXC /* PGXC_COORD */ - if (IS_PGXC_COORDINATOR) + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) s->globalTransactionId = InvalidGlobalTransactionId; #endif s->transactionId = InvalidTransactionId; /* until assigned */ @@ -4410,6 +4493,9 @@ BlockStateAsString(TBlockState blockState) return "ABORT END"; case TBLOCK_ABORT_PENDING: return "ABORT PEND"; +#ifdef PGXC + case TBLOCK_PREPARE_NO_2PC_FILE: +#endif case TBLOCK_PREPARE: return "PREPARE"; case TBLOCK_SUBBEGIN: diff --git a/src/backend/catalog/dependency.c b/src/backend/catalog/dependency.c index af57e68..dbbca98 100644 --- a/src/backend/catalog/dependency.c +++ b/src/backend/catalog/dependency.c @@ -359,8 +359,11 @@ doRename(const ObjectAddress *object, const char *oldname, const char *newname) * If we are here, a schema is being renamed, a sequence depends on it. * as sequences' global name use the schema name, this sequence * has also to be renamed on GTM. + * An operation with GTM can just be done from a remote Coordinator. */ - if (relKind == RELKIND_SEQUENCE && IS_PGXC_COORDINATOR) + if (relKind == RELKIND_SEQUENCE + && IS_PGXC_COORDINATOR + && !IsConnFromCoord()) { Relation relseq = relation_open(object->objectId, AccessShareLock); char *seqname = GetGlobalSeqName(relseq, NULL, oldname); @@ -1136,8 +1139,11 @@ doDeletion(const ObjectAddress *object) } #ifdef PGXC - /* Drop the sequence on GTM */ - if (relKind == RELKIND_SEQUENCE && IS_PGXC_COORDINATOR) + /* + * Drop the sequence on GTM. + * Sequence is dropped on GTM by a remote Coordinator only. + */ + if (relKind == RELKIND_SEQUENCE && IS_PGXC_COORDINATOR && !IsConnFromCoord()) { /* * The sequence has already been removed from coordinator, diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c index 772a6f7..a1da3a0 100644 --- a/src/backend/commands/copy.c +++ b/src/backend/commands/copy.c @@ -180,7 +180,7 @@ typedef struct CopyStateData RelationLocInfo *rel_loc; /* the locator key */ int hash_idx; /* index of the hash column */ - DataNodeHandle **connections; /* Involved data node connections */ + PGXCNodeHandle **connections; /* Involved data node connections */ #endif } CopyStateData; diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c index 83ddbab..7f98b4d 100644 --- a/src/backend/commands/sequence.c +++ b/src/backend/commands/sequence.c @@ -350,7 +350,8 @@ DefineSequence(CreateSeqStmt *seq) heap_close(rel, NoLock); #ifdef PGXC /* PGXC_COORD */ - if (IS_PGXC_COORDINATOR) + /* Remote Coordinator is in charge of creating sequence in GTM */ + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { char *seqname = GetGlobalSeqName(rel, NULL, NULL); @@ -492,7 +493,8 @@ AlterSequenceInternal(Oid relid, List *options) relation_close(seqrel, NoLock); #ifdef PGXC - if (IS_PGXC_COORDINATOR) + /* Remote Coordinator is in charge of create sequence in GTM */ + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) { char *seqname = GetGlobalSeqName(seqrel, NULL, NULL); diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index c8b1456..d3506c8 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -2094,8 +2094,10 @@ RenameRelation(Oid myrelid, const char *newrelname, ObjectType reltype) /* Do the work */ RenameRelationInternal(myrelid, newrelname, namespaceId); #ifdef PGXC - if (IS_PGXC_COORDINATOR && - (reltype == OBJECT_SEQUENCE || relkind == RELKIND_SEQUENCE)) /* It is possible to rename a sequence with ALTER TABLE */ + /* Operation with GTM can only be done with a Remote Coordinator 
*/ + if (IS_PGXC_COORDINATOR + && !IsConnFromCoord() + && (reltype == OBJECT_SEQUENCE || relkind == RELKIND_SEQUENCE)) /* It is possible to rename a sequence with ALTER TABLE */ { char *seqname = GetGlobalSeqName(targetrelation, NULL, NULL); char *newseqname = GetGlobalSeqName(targetrelation, newrelname, NULL); diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c index 519ea4f..86db1eb 100644 --- a/src/backend/executor/execMain.c +++ b/src/backend/executor/execMain.c @@ -59,7 +59,9 @@ #include "utils/memutils.h" #include "utils/snapmgr.h" #include "utils/tqual.h" - +#ifdef PGXC +#include "pgxc/pgxc.h" +#endif /* Hooks for plugins to get control in ExecutorStart/Run/End() */ ExecutorStart_hook_type ExecutorStart_hook = NULL; diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c index 8dd924d..2c95815 100644 --- a/src/backend/optimizer/plan/planner.c +++ b/src/backend/optimizer/plan/planner.c @@ -124,7 +124,11 @@ planner(Query *parse, int cursorOptions, ParamListInfo boundParams) result = (*planner_hook) (parse, cursorOptions, boundParams); else #ifdef PGXC - if (IS_PGXC_COORDINATOR) + /* + * A coordinator receiving a query from another Coordinator + * is not allowed to go into PGXC planner. + */ + if (IS_PGXC_COORDINATOR && !IsConnFromCoord()) result = pgxc_planner(parse, cursorOptions, boundParams); else #endif diff --git a/src/backend/parser/parse_utilcmd.c b/src/backend/parser/parse_utilcmd.c index f47cc6a..cfa470f 100644 --- a/src/backend/parser/parse_utilcmd.c +++ b/src/backend/parser/parse_utilcmd.c @@ -277,6 +277,8 @@ transformCreateStmt(CreateStmt *stmt, const char *queryString) RemoteQuery *step = makeNode(RemoteQuery); step->combine_type = COMBINE_TYPE_SAME; step->sql_statement = queryString; + /* This query is a DDL, Launch it on both Datanodes and Coordinators. */ + step->exec_type = EXEC_ON_ALL_NODES; result = lappend(result, step); } #endif @@ -1970,6 +1972,8 @@ transformAlterTableStmt(AlterTableStmt *stmt, const char *queryString) RemoteQuery *step = makeNode(RemoteQuery); step->combine_type = COMBINE_TYPE_SAME; step->sql_statement = queryString; + /* This query is a DDl, it is launched on both Coordinators and Datanodes. */ + step->exec_type = EXEC_ON_ALL_NODES; result = lappend(result, step); } #endif diff --git a/src/backend/pgxc/locator/locator.c b/src/backend/pgxc/locator/locator.c index debbc77..098e254 100644 --- a/src/backend/pgxc/locator/locator.c +++ b/src/backend/pgxc/locator/locator.c @@ -24,6 +24,7 @@ #include "postgres.h" #include "access/skey.h" +#include "access/gtm.h" #include "access/relscan.h" #include "catalog/indexing.h" #include "catalog/pg_type.h" @@ -440,12 +441,12 @@ GetLocatorType(Oid relid) /* - * Return a list of all nodes. + * Return a list of all Datanodes. * We assume all tables use all nodes in the prototype, so just return a list * from first one. */ List * -GetAllNodes(void) +GetAllDataNodes(void) { int i; @@ -463,10 +464,38 @@ GetAllNodes(void) return nodeList; } +/* + * Return a list of all Coordinators + * This is used to send DDL to all nodes + * Do not put in the list the local Coordinator where this function is launched + */ +List * +GetAllCoordNodes(void) +{ + int i; + + /* + * PGXCTODO - add support for having nodes on a subset of nodes + * For now, assume on all nodes + */ + List *nodeList = NIL; + + for (i = 1; i < NumCoords + 1; i++) + { + /* + * Do not put in list the Coordinator we are on, + * it doesn't make sense to connect to the local coordinator. 
+ */ + if (i != PGXCNodeId) + nodeList = lappend_int(nodeList, i); + } + + return nodeList; +} + /* * Build locator information associated with the specified relation. - * */ void RelationBuildLocator(Relation rel) @@ -528,7 +557,7 @@ RelationBuildLocator(Relation rel) /** PGXCTODO - add support for having nodes on a subset of nodes * For now, assume on all nodes */ - relationLocInfo->nodeList = GetAllNodes(); + relationLocInfo->nodeList = GetAllDataNodes(); relationLocInfo->nodeCount = relationLocInfo->nodeList->length; /* diff --git a/src/backend/pgxc/pool/Makefile b/src/backend/pgxc/pool/Makefile index c7e950a..f8679eb 100644 --- a/src/backend/pgxc/pool/Makefile +++ b/src/backend/pgxc/pool/Makefile @@ -14,6 +14,6 @@ subdir = src/backend/pgxc/pool top_builddir = ../../../.. include $(top_builddir)/src/Makefile.global -OBJS = datanode.o execRemote.o poolmgr.o poolcomm.o postgresql_fdw.o +OBJS = pgxcnode.o execRemote.o poolmgr.o poolcomm.o postgresql_fdw.o include $(top_srcdir)/src/backend/common.mk diff --git a/src/backend/pgxc/pool/execRemote.c b/src/backend/pgxc/pool/execRemote.c index 16d2f6b..14dce33 100644 --- a/src/backend/pgxc/pool/execRemote.c +++ b/src/backend/pgxc/pool/execRemote.c @@ -30,6 +30,8 @@ #include "utils/memutils.h" #include "utils/tuplesort.h" #include "utils/snapmgr.h" +#include "pgxc/locator.h" +#include "pgxc/pgxc.h" #define END_QUERY_TIMEOUT 20 #define CLEAR_TIMEOUT 5 @@ -45,26 +47,30 @@ extern char *deparseSql(RemoteQueryState *scanstate); #define PRIMARY_NODE_WRITEAHEAD 1024 * 1024 static bool autocommit = true; -static DataNodeHandle **write_node_list = NULL; +static PGXCNodeHandle **write_node_list = NULL; static int write_node_count = 0; -static int data_node_begin(int conn_count, DataNodeHandle ** connections, +static int pgxc_node_begin(int conn_count, PGXCNodeHandle ** connections, GlobalTransactionId gxid); -static int data_node_commit(int conn_count, DataNodeHandle ** connections); -static int data_node_rollback(int conn_count, DataNodeHandle ** connections); -static int data_node_prepare(int conn_count, DataNodeHandle ** connections, - char *gid); -static int data_node_rollback_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, - int conn_count, DataNodeHandle ** connections, - char *gid); -static int data_node_commit_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, - int conn_count, DataNodeHandle ** connections, - char *gid); - -static void clear_write_node_list(); - -static int handle_response_clear(DataNodeHandle * conn); - +static int pgxc_node_commit(PGXCNodeAllHandles * pgxc_handles); +static int pgxc_node_rollback(PGXCNodeAllHandles * pgxc_handles); +static int pgxc_node_prepare(PGXCNodeAllHandles * pgxc_handles, char *gid); +static int pgxc_node_rollback_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, + PGXCNodeAllHandles * pgxc_handles, char *gid); +static int pgxc_node_commit_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, + PGXCNodeAllHandles * pgxc_handles, char *gid); +static PGXCNodeAllHandles * get_exec_connections(ExecNodes *exec_nodes, + RemoteQueryExecType exec_type); +static int pgxc_node_receive_and_validate(const int conn_count, + PGXCNodeHandle ** connections, + bool reset_combiner); +static void clear_write_node_list(void); + +static void pfree_pgxc_all_handles(PGXCNodeAllHandles *pgxc_handles); + +static int handle_response_clear(PGXCNodeHandle * conn); + +static PGXCNodeAllHandles *pgxc_get_all_transaction_nodes(void); #define 
MAX_STATEMENTS_PER_TRAN 10 @@ -922,14 +928,14 @@ FetchTuple(RemoteQueryState *combiner, TupleTableSlot *slot) * Handle responses from the Data node connections */ static int -data_node_receive_responses(const int conn_count, DataNodeHandle ** connections, +pgxc_node_receive_responses(const int conn_count, PGXCNodeHandle ** connections, struct timeval * timeout, RemoteQueryState *combiner) { int count = conn_count; - DataNodeHandle *to_receive[conn_count]; + PGXCNodeHandle *to_receive[conn_count]; /* make a copy of the pointers to the connections */ - memcpy(to_receive, connections, conn_count * sizeof(DataNodeHandle *)); + memcpy(to_receive, connections, conn_count * sizeof(PGXCNodeHandle *)); /* * Read results. @@ -941,7 +947,7 @@ data_node_receive_responses(const int conn_count, DataNodeHandle ** connections, { int i = 0; - if (data_node_receive(count, to_receive, timeout)) + if (pgxc_node_receive(count, to_receive, timeout)) return EOF; while (i < count) { @@ -986,7 +992,7 @@ data_node_receive_responses(const int conn_count, DataNodeHandle ** connections, * 2 - got copy response */ int -handle_response(DataNodeHandle * conn, RemoteQueryState *combiner) +handle_response(PGXCNodeHandle * conn, RemoteQueryState *combiner) { char *msg; int msg_len; @@ -1094,7 +1100,7 @@ handle_response(DataNodeHandle * conn, RemoteQueryState *combiner) * RESPONSE_COMPLETE - done with the connection, or done trying (error) */ static int -handle_response_clear(DataNodeHandle * conn) +handle_response_clear(PGXCNodeHandle * conn) { char *msg; int msg_len; @@ -1156,10 +1162,10 @@ handle_response_clear(DataNodeHandle * conn) /* - * Send BEGIN command to the Data nodes and receive responses + * Send BEGIN command to the Datanodes or Coordinators and receive responses */ static int -data_node_begin(int conn_count, DataNodeHandle ** connections, +pgxc_node_begin(int conn_count, PGXCNodeHandle ** connections, GlobalTransactionId gxid) { int i; @@ -1170,20 +1176,20 @@ data_node_begin(int conn_count, DataNodeHandle ** connections, /* Send BEGIN */ for (i = 0; i < conn_count; i++) { - if (GlobalTransactionIdIsValid(gxid) && data_node_send_gxid(connections[i], gxid)) + if (GlobalTransactionIdIsValid(gxid) && pgxc_node_send_gxid(connections[i], gxid)) return EOF; - if (GlobalTimestampIsValid(timestamp) && data_node_send_timestamp(connections[i], timestamp)) + if (GlobalTimestampIsValid(timestamp) && pgxc_node_send_timestamp(connections[i], timestamp)) return EOF; - if (data_node_send_query(connections[i], "BEGIN")) + if (pgxc_node_send_query(connections[i], "BEGIN")) return EOF; } combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) + if (pgxc_node_receive_responses(conn_count, connections, timeout, combiner)) return EOF; /* Verify status */ @@ -1197,17 +1203,17 @@ clear_write_node_list() /* we just malloc once and use counter */ if (write_node_list == NULL) { - write_node_list = (DataNodeHandle **) malloc(NumDataNodes * sizeof(DataNodeHandle *)); + write_node_list = (PGXCNodeHandle **) malloc(NumDataNodes * sizeof(PGXCNodeHandle *)); } write_node_count = 0; } /* - * Switch autocommmit mode off, so all subsequent statements will be in the same transaction + * Switch autocommit mode off, so all subsequent statements will be in the same transaction */ void -DataNodeBegin(void) +PGXCNodeBegin(void) { autocommit = false; clear_write_node_list(); @@ -1215,18 +1221,30 @@ DataNodeBegin(void) /* - * Prepare 
transaction on Datanodes involved in current transaction. + * Prepare transaction on Datanodes and Coordinators involved in current transaction. * GXID associated to current transaction has to be committed on GTM. */ -int -DataNodePrepare(char *gid) +bool +PGXCNodePrepare(char *gid) { int res = 0; int tran_count; - DataNodeHandle *connections[NumDataNodes]; + PGXCNodeAllHandles *pgxc_connections; + bool local_operation = false; + + pgxc_connections = pgxc_get_all_transaction_nodes(); - /* gather connections to prepare */ - tran_count = get_transaction_nodes(connections); + /* DDL involved in transaction, so make a local prepare too */ + if (pgxc_connections->co_conn_count != 0) + local_operation = true; + + /* + * If no connections have been gathered for Coordinators, + * it means that no DDL has been involved in this transaction. + * And so this transaction is not prepared on Coordinators. + * It is only on Datanodes that data is involved. + */ + tran_count = pgxc_connections->dn_conn_count + pgxc_connections->co_conn_count; /* * If we do not have open transactions we have nothing to prepare just @@ -1234,12 +1252,11 @@ DataNodePrepare(char *gid) */ if (tran_count == 0) { - elog(WARNING, "Nothing to PREPARE on Datanodes, gid is not used"); + elog(WARNING, "Nothing to PREPARE on Datanodes and Coordinators, gid is not used"); goto finish; } - /* TODO: data_node_prepare */ - res = data_node_prepare(tran_count, connections, gid); + res = pgxc_node_prepare(pgxc_connections, gid); finish: /* @@ -1249,12 +1266,16 @@ finish: * Release the connections for the moment. */ if (!autocommit) - stat_transaction(tran_count); + stat_transaction(pgxc_connections->dn_conn_count); if (!PersistentConnections) release_handles(false); autocommit = true; clear_write_node_list(); - return res; + + /* Clean up connections */ + pfree_pgxc_all_handles(pgxc_connections); + + return local_operation; } @@ -1262,47 +1283,64 @@ finish: * Prepare transaction on dedicated nodes with gid received from application */ static int -data_node_prepare(int conn_count, DataNodeHandle ** connections, char *gid) +pgxc_node_prepare(PGXCNodeAllHandles *pgxc_handles, char *gid) { - int i; + int real_co_conn_count; int result = 0; - struct timeval *timeout = NULL; + int co_conn_count = pgxc_handles->co_conn_count; + int dn_conn_count = pgxc_handles->dn_conn_count; char *buffer = (char *) palloc0(22 + strlen(gid) + 1); - RemoteQueryState *combiner = NULL; GlobalTransactionId gxid = InvalidGlobalTransactionId; PGXC_NodeId *datanodes = NULL; + PGXC_NodeId *coordinators = NULL; gxid = GetCurrentGlobalTransactionId(); /* * Now that the transaction has been prepared on the nodes, - * Initialize to make the business on GTM + * Initialize to make the business on GTM. + * We also had the Coordinator we are on in the prepared state. + */ + if (dn_conn_count != 0) + datanodes = collect_pgxcnode_numbers(dn_conn_count, + pgxc_handles->datanode_handles, REMOTE_CONN_DATANODE); + + /* + * Local Coordinator is saved in the list sent to GTM + * only when a DDL is involved in the transaction. + * So we don't need to complete the list of Coordinators sent to GTM + * when number of connections to Coordinator is zero (no DDL). */ - datanodes = collect_datanode_numbers(conn_count, connections); + if (co_conn_count != 0) + coordinators = collect_pgxcnode_numbers(co_conn_count, + pgxc_handles->coord_handles, REMOTE_CONN_COORD); /* - * Send a Prepare in Progress message to GTM. - * At the same time node list is saved on GTM. 
+ * Tell to GTM that the transaction is being prepared first. + * Don't forget to add in the list of Coordinators the coordinator we are on + * if a DDL is involved in the transaction. + * This one also is being prepared ! */ - result = BeingPreparedTranGTM(gxid, gid, conn_count, datanodes, 0, NULL); + if (co_conn_count == 0) + real_co_conn_count = co_conn_count; + else + real_co_conn_count = co_conn_count + 1; + + result = StartPreparedTranGTM(gxid, gid, dn_conn_count, + datanodes, real_co_conn_count, coordinators); if (result < 0) return EOF; sprintf(buffer, "PREPARE TRANSACTION '%s'", gid); - /* Send PREPARE */ - for (i = 0; i < conn_count; i++) - if (data_node_send_query(connections[i], buffer)) - return EOF; + /* Continue even after an error here, to consume the messages */ + result = pgxc_all_handles_send_query(pgxc_handles, buffer, true); - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); + /* Receive and Combine results from Datanodes and Coordinators */ + result |= pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, false); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, false); - /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) - return EOF; - - result = ValidateAndCloseCombiner(combiner) ? result : EOF; if (result) goto finish; @@ -1324,31 +1362,27 @@ finish: if (result) { GlobalTransactionId rollback_xid = InvalidGlobalTransactionId; - buffer = (char *) repalloc(buffer, 20 + strlen(gid) + 1); + result = 0; + buffer = (char *) repalloc(buffer, 20 + strlen(gid) + 1); sprintf(buffer, "ROLLBACK PREPARED '%s'", gid); - rollback_xid = BeginTranGTM(NULL); - for (i = 0; i < conn_count; i++) - { - if (data_node_send_gxid(connections[i], rollback_xid)) - { - add_error_message(connections[i], "Can not send request"); - return EOF; - } - if (data_node_send_query(connections[i], buffer)) - { - add_error_message(connections[i], "Can not send request"); - return EOF; - } - } + /* Consume any messages on the Datanodes and Coordinators first if necessary */ + PGXCNodeConsumeMessages(); - if (!combiner) - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); + rollback_xid = BeginTranGTM(NULL); - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) + /* + * Send xid and rollback prepared down to Datanodes and Coordinators + * Even if we get an error on one, we try and send to the others + */ + if (pgxc_all_handles_send_gxid(pgxc_handles, rollback_xid, false)) result = EOF; - result = ValidateAndCloseCombiner(combiner) ? result : EOF; + if (pgxc_all_handles_send_query(pgxc_handles, buffer, false)) + result = EOF; + + result = pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, false); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, false); /* * Don't forget to rollback also on GTM @@ -1364,26 +1398,30 @@ finish: /* - * Commit prepared transaction on Datanodes where it has been prepared. + * Commit prepared transaction on Datanodes and Coordinators (as necessary) + * where it has been prepared. * Connection to backends has been cut when transaction has been prepared, * So it is necessary to send the COMMIT PREPARE message to all the nodes. * We are not sure if the transaction prepared has involved all the datanodes * or not but send the message to all of them. * This avoid to have any additional interaction with GTM when making a 2PC transaction. 
*/ -void -DataNodeCommitPrepared(char *gid) +bool +PGXCNodeCommitPrepared(char *gid) { int res = 0; int res_gtm = 0; - DataNodeHandle **connections; - List *nodelist = NIL; + PGXCNodeAllHandles *pgxc_handles; + List *datanodelist = NIL; + List *coordlist = NIL; int i, tran_count; PGXC_NodeId *datanodes = NULL; PGXC_NodeId *coordinators = NULL; int coordcnt = 0; int datanodecnt = 0; GlobalTransactionId gxid, prepared_gxid; + /* This flag tracks if the transaction has to be committed locally */ + bool operation_local = false; res_gtm = GetGIDDataGTM(gid, &gxid, &prepared_gxid, &datanodecnt, &datanodes, &coordcnt, &coordinators); @@ -1394,17 +1432,33 @@ DataNodeCommitPrepared(char *gid) autocommit = false; - /* Build the list of nodes based on data received from GTM */ + /* + * Build the list of nodes based on data received from GTM. + * For Sequence DDL this list is NULL. + */ for (i = 0; i < datanodecnt; i++) + datanodelist = lappend_int(datanodelist,datanodes[i]); + + for (i = 0; i < coordcnt; i++) { - nodelist = lappend_int(nodelist,datanodes[i]); + /* Local Coordinator number found, has to commit locally also */ + if (coordinators[i] == PGXCNodeId) + operation_local = true; + else + coordlist = lappend_int(coordlist,coordinators[i]); } /* Get connections */ - connections = get_handles(nodelist); + if (coordcnt > 0 && datanodecnt == 0) + pgxc_handles = get_handles(datanodelist, coordlist, true); + else + pgxc_handles = get_handles(datanodelist, coordlist, false); - /* Commit here the prepared transaction to all Datanodes */ - res = data_node_commit_prepared(gxid, prepared_gxid, datanodecnt, connections, gid); + /* + * Commit here the prepared transaction to all Datanodes and Coordinators + * If necessary, local Coordinator Commit is performed after this DataNodeCommitPrepared. + */ + res = pgxc_node_commit_prepared(gxid, prepared_gxid, pgxc_handles, gid); finish: /* In autocommit mode statistics is collected in DataNodeExec */ @@ -1416,11 +1470,13 @@ finish: clear_write_node_list(); /* Free node list taken from GTM */ - if (datanodes) + if (datanodes && datanodecnt != 0) free(datanodes); - if (coordinators) + + if (coordinators && coordcnt != 0) free(coordinators); + pfree_pgxc_all_handles(pgxc_handles); if (res_gtm < 0) ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), @@ -1429,6 +1485,8 @@ finish: ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("Could not commit prepared transaction on data nodes"))); + + return operation_local; } /* @@ -1440,42 +1498,29 @@ finish: * This permits to avoid interactions with GTM. 
*/ static int -data_node_commit_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, int conn_count, DataNodeHandle ** connections, char *gid) +pgxc_node_commit_prepared(GlobalTransactionId gxid, + GlobalTransactionId prepared_gxid, + PGXCNodeAllHandles *pgxc_handles, + char *gid) { int result = 0; - int i; - RemoteQueryState *combiner = NULL; - struct timeval *timeout = NULL; + int co_conn_count = pgxc_handles->co_conn_count; + int dn_conn_count = pgxc_handles->dn_conn_count; char *buffer = (char *) palloc0(18 + strlen(gid) + 1); /* GXID has been piggybacked when gid data has been received from GTM */ sprintf(buffer, "COMMIT PREPARED '%s'", gid); /* Send gxid and COMMIT PREPARED message to all the Datanodes */ - for (i = 0; i < conn_count; i++) - { - if (data_node_send_gxid(connections[i], gxid)) - { - add_error_message(connections[i], "Can not send request"); - result = EOF; - goto finish; - } - if (data_node_send_query(connections[i], buffer)) - { - add_error_message(connections[i], "Can not send request"); - result = EOF; - goto finish; - } - } - - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); + if (pgxc_all_handles_send_gxid(pgxc_handles, gxid, true)) + goto finish; - /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) + /* Continue and receive responses even if there is an error */ + if (pgxc_all_handles_send_query(pgxc_handles, buffer, false)) result = EOF; - /* Validate and close combiner */ - result = ValidateAndCloseCombiner(combiner) ? result : EOF; + result = pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, false); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, false); finish: /* Both GXIDs used for PREPARE and COMMIT PREPARED are discarded from GTM snapshot here */ @@ -1486,21 +1531,25 @@ finish: /* * Rollback prepared transaction on Datanodes involved in the current transaction + * + * Return whether or not a local operation required. 
*/ -void -DataNodeRollbackPrepared(char *gid) +bool +PGXCNodeRollbackPrepared(char *gid) { int res = 0; int res_gtm = 0; - DataNodeHandle **connections; - List *nodelist = NIL; + PGXCNodeAllHandles *pgxc_handles; + List *datanodelist = NIL; + List *coordlist = NIL; int i, tran_count; - PGXC_NodeId *datanodes = NULL; PGXC_NodeId *coordinators = NULL; int coordcnt = 0; int datanodecnt = 0; GlobalTransactionId gxid, prepared_gxid; + /* This flag tracks if the transaction has to be rolled back locally */ + bool operation_local = false; res_gtm = GetGIDDataGTM(gid, &gxid, &prepared_gxid, &datanodecnt, &datanodes, &coordcnt, &coordinators); @@ -1513,15 +1562,25 @@ DataNodeRollbackPrepared(char *gid) /* Build the node list based on the result got from GTM */ for (i = 0; i < datanodecnt; i++) + datanodelist = lappend_int(datanodelist,datanodes[i]); + + for (i = 0; i < coordcnt; i++) { - nodelist = lappend_int(nodelist,datanodes[i]); + /* Local Coordinator number found, has to rollback locally also */ + if (coordinators[i] == PGXCNodeId) + operation_local = true; + else + coordlist = lappend_int(coordlist,coordinators[i]); } /* Get connections */ - connections = get_handles(nodelist); + if (coordcnt > 0 && datanodecnt == 0) + pgxc_handles = get_handles(datanodelist, coordlist, true); + else + pgxc_handles = get_handles(datanodelist, coordlist, false); - /* Here do the real rollback to Datanodes */ - res = data_node_rollback_prepared(gxid, prepared_gxid, datanodecnt, connections, gid); + /* Here do the real rollback to Datanodes and Coordinators */ + res = pgxc_node_rollback_prepared(gxid, prepared_gxid, pgxc_handles, gid); finish: /* In autocommit mode statistics is collected in DataNodeExec */ @@ -1530,7 +1589,16 @@ finish: if (!PersistentConnections) release_handles(true); autocommit = true; - clear_write_node_list(true); + clear_write_node_list(); + + /* Free node list taken from GTM */ + if (datanodes) + free(datanodes); + + if (coordinators) + free(coordinators); + + pfree_pgxc_all_handles(pgxc_handles); if (res_gtm < 0) ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), @@ -1539,6 +1607,8 @@ finish: ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), errmsg("Could not rollback prepared transaction on Datanodes"))); + + return operation_local; } @@ -1548,13 +1618,12 @@ finish: * At the end both prepared GXID and GXID are committed. 
*/ static int -data_node_rollback_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, - int conn_count, DataNodeHandle ** connections, char *gid) +pgxc_node_rollback_prepared(GlobalTransactionId gxid, GlobalTransactionId prepared_gxid, + PGXCNodeAllHandles *pgxc_handles, char *gid) { int result = 0; - int i; - RemoteQueryState *combiner = NULL; - struct timeval *timeout = NULL; + int dn_conn_count = pgxc_handles->dn_conn_count; + int co_conn_count = pgxc_handles->co_conn_count; char *buffer = (char *) palloc0(20 + strlen(gid) + 1); /* Datanodes have reset after prepared state, so get a new gxid */ @@ -1562,34 +1631,15 @@ data_node_rollback_prepared(GlobalTransactionId gxid, GlobalTransactionId prepar sprintf(buffer, "ROLLBACK PREPARED '%s'", gid); - /* Send gxid and COMMIT PREPARED message to all the Datanodes */ - for (i = 0; i < conn_count; i++) - { - if (data_node_send_gxid(connections[i], gxid)) - { - add_error_message(connections[i], "Can not send request"); - result = EOF; - goto finish; - } - - if (data_node_send_query(connections[i], buffer)) - { - add_error_message(connections[i], "Can not send request"); - result = EOF; - goto finish; - } - } - - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); - - /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) + /* Send gxid and ROLLBACK PREPARED message to all the Datanodes */ + if (pgxc_all_handles_send_gxid(pgxc_handles, gxid, false)) + result = EOF; + if (pgxc_all_handles_send_query(pgxc_handles, buffer, false)) result = EOF; - /* Validate and close combiner */ - result = ValidateAndCloseCombiner(combiner) ? result : EOF; + result = pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, false); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, false); -finish: /* Both GXIDs used for PREPARE and COMMIT PREPARED are discarded from GTM snapshot here */ CommitPreparedTranGTM(gxid, prepared_gxid); @@ -1601,14 +1651,15 @@ finish: * Commit current transaction on data nodes where it has been started */ void -DataNodeCommit(void) +PGXCNodeCommit(void) { int res = 0; int tran_count; - DataNodeHandle *connections[NumDataNodes]; + PGXCNodeAllHandles *pgxc_connections; - /* gather connections to commit */ - tran_count = get_transaction_nodes(connections); + pgxc_connections = pgxc_get_all_transaction_nodes(); + + tran_count = pgxc_connections->dn_conn_count + pgxc_connections->co_conn_count; /* * If we do not have open transactions we have nothing to commit, just @@ -1617,7 +1668,7 @@ DataNodeCommit(void) if (tran_count == 0) goto finish; - res = data_node_commit(tran_count, connections); + res = pgxc_node_commit(pgxc_connections); finish: /* In autocommit mode statistics is collected in DataNodeExec */ @@ -1627,6 +1678,9 @@ finish: release_handles(false); autocommit = true; clear_write_node_list(); + + /* Clear up connection */ + pfree_pgxc_all_handles(pgxc_connections); if (res != 0) ereport(ERROR, (errcode(ERRCODE_INTERNAL_ERROR), @@ -1639,15 +1693,13 @@ finish: * if more then on one node data have been modified during the transactioon. 
*/ static int -data_node_commit(int conn_count, DataNodeHandle ** connections) +pgxc_node_commit(PGXCNodeAllHandles *pgxc_handles) { - int i; - struct timeval *timeout = NULL; char buffer[256]; GlobalTransactionId gxid = InvalidGlobalTransactionId; int result = 0; - RemoteQueryState *combiner = NULL; - + int co_conn_count = pgxc_handles->co_conn_count; + int dn_conn_count = pgxc_handles->dn_conn_count; /* can set this to false to disable temporarily */ /* bool do2PC = conn_count > 1; */ @@ -1674,21 +1726,13 @@ data_node_commit(int conn_count, DataNodeHandle ** connections) gxid = GetCurrentGlobalTransactionId(); sprintf(buffer, "PREPARE TRANSACTION 'T%d'", gxid); - /* Send PREPARE */ - for (i = 0; i < conn_count; i++) - { - if (data_node_send_query(connections[i], buffer)) - return EOF; - } - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); - /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) + if (pgxc_all_handles_send_query(pgxc_handles, buffer, false)) result = EOF; - /* Reset combiner */ - if (!ValidateAndResetCombiner(combiner)) - result = EOF; + /* Receive and Combine results from Datanodes and Coordinators */ + result |= pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, true); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, true); } if (!do2PC) @@ -1696,7 +1740,11 @@ data_node_commit(int conn_count, DataNodeHandle ** connections) else { if (result) + { sprintf(buffer, "ROLLBACK PREPARED 'T%d'", gxid); + /* Consume any messages on the Datanodes and Coordinators first if necessary */ + PGXCNodeConsumeMessages(); + } else sprintf(buffer, "COMMIT PREPARED 'T%d'", gxid); @@ -1707,33 +1755,20 @@ data_node_commit(int conn_count, DataNodeHandle ** connections) */ two_phase_xid = BeginTranGTM(NULL); - for (i = 0; i < conn_count; i++) - { - if (data_node_send_gxid(connections[i], two_phase_xid)) - { - add_error_message(connections[i], "Can not send request"); - result = EOF; - goto finish; - } - } - } - - /* Send COMMIT */ - for (i = 0; i < conn_count; i++) - { - if (data_node_send_query(connections[i], buffer)) + if (pgxc_all_handles_send_gxid(pgxc_handles, two_phase_xid, true)) { result = EOF; goto finish; } } - if (!combiner) - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); - /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) + /* Send COMMIT to all handles */ + if (pgxc_all_handles_send_query(pgxc_handles, buffer, false)) result = EOF; - result = ValidateAndCloseCombiner(combiner) ? 
result : EOF; + + /* Receive and Combine results from Datanodes and Coordinators */ + result |= pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, false); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, false); finish: if (do2PC) @@ -1748,18 +1783,18 @@ finish: * This will happen */ int -DataNodeRollback(void) +PGXCNodeRollback(void) { int res = 0; int tran_count; - DataNodeHandle *connections[NumDataNodes]; + PGXCNodeAllHandles *pgxc_connections; + pgxc_connections = pgxc_get_all_transaction_nodes(); - /* Consume any messages on the data nodes first if necessary */ - DataNodeConsumeMessages(); + tran_count = pgxc_connections->dn_conn_count + pgxc_connections->co_conn_count; - /* gather connections to rollback */ - tran_count = get_transaction_nodes(connections); + /* Consume any messages on the Datanodes and Coordinators first if necessary */ + PGXCNodeConsumeMessages(); /* * If we do not have open transactions we have nothing to rollback just @@ -1768,7 +1803,7 @@ DataNodeRollback(void) if (tran_count == 0) goto finish; - res = data_node_rollback(tran_count, connections); + res = pgxc_node_rollback(pgxc_connections); finish: /* In autocommit mode statistics is collected in DataNodeExec */ @@ -1778,20 +1813,23 @@ finish: release_handles(true); autocommit = true; clear_write_node_list(); + + /* Clean up connections */ + pfree_pgxc_all_handles(pgxc_connections); return res; } /* - * Send ROLLBACK command down to the Data nodes and handle responses + * Send ROLLBACK command down to Datanodes and Coordinators and handle responses */ static int -data_node_rollback(int conn_count, DataNodeHandle ** connections) +pgxc_node_rollback(PGXCNodeAllHandles *pgxc_handles) { int i; - struct timeval *timeout = NULL; - RemoteQueryState *combiner; - + int result = 0; + int co_conn_count = pgxc_handles->co_conn_count; + int dn_conn_count = pgxc_handles->dn_conn_count; /* * Rollback is a special case, being issued because of an error. @@ -1799,20 +1837,21 @@ data_node_rollback(int conn_count, DataNodeHandle ** connections) * issuing our rollbacks so that we did not read the results of the * previous command. */ - for (i = 0; i < conn_count; i++) - clear_socket_data(connections[i]); + for (i = 0; i < dn_conn_count; i++) + clear_socket_data(pgxc_handles->datanode_handles[i]); - /* Send ROLLBACK - */ - for (i = 0; i < conn_count; i++) - data_node_send_query(connections[i], "ROLLBACK"); + for (i = 0; i < co_conn_count; i++) + clear_socket_data(pgxc_handles->coord_handles[i]); - combiner = CreateResponseCombiner(conn_count, COMBINE_TYPE_NONE); - /* Receive responses */ - if (data_node_receive_responses(conn_count, connections, timeout, combiner)) - return EOF; + /* Send ROLLBACK to all handles */ + if (pgxc_all_handles_send_query(pgxc_handles, "ROLLBACK", false)) + result = EOF; - /* Verify status */ - return ValidateAndCloseCombiner(combiner) ? 
0 : EOF; + /* Receive and Combine results from Datanodes and Coordinators */ + result |= pgxc_node_receive_and_validate(dn_conn_count, pgxc_handles->datanode_handles, false); + result |= pgxc_node_receive_and_validate(co_conn_count, pgxc_handles->coord_handles, false); + + return result; } @@ -1820,15 +1859,16 @@ data_node_rollback(int conn_count, DataNodeHandle ** connections) * Begin COPY command * The copy_connections array must have room for NumDataNodes items */ -DataNodeHandle** +PGXCNodeHandle** DataNodeCopyBegin(const char *query, List *nodelist, Snapshot snapshot, bool is_from) { int i, j; int conn_count = list_length(nodelist) == 0 ? NumDataNodes : list_length(nodelist); struct timeval *timeout = NULL; - DataNodeHandle **connections; - DataNodeHandle **copy_connections; - DataNodeHandle *newConnections[conn_count]; + PGXCNodeAllHandles *pgxc_handles; + PGXCNodeHandle **connections; + PGXCNodeHandle **copy_connections; + PGXCNodeHandle *newConnections[conn_count]; int new_count = 0; ListCell *nodeitem; bool need_tran; @@ -1840,7 +1880,9 @@ DataNodeCopyBegin(const char *query, List *nodelist, Snapshot snapshot, bool is_ return NULL; /* Get needed datanode connections */ - connections = get_handles(nodelist); + pgxc_handles = get_handles(nodelist, NULL, false); + connections = pgxc_handles->datanode_handles; + if (!connections) return NULL; @@ -1853,7 +1895,7 @@ DataNodeCopyBegin(const char *query, List *nodelist, Snapshot snapshot, bool is_ * So store connections in an array where index is node-1. * Unused items in the array should be NULL */ - copy_connections = (DataNodeHandle **) palloc0(NumDataNodes * sizeof(DataNodeHandle *)); + copy_connections = (PGXCNodeHandle **) palloc0(NumDataNodes * sizeof(PGXCNodeHandle *)); i = 0; foreach(nodeitem, nodelist) copy_connections[lfirst_int(nodeitem) - 1] = connections[i++]; @@ -1910,7 +1952,7 @@ DataNodeCopyBegin(const char *query, List *nodelist, Snapshot snapshot, bool is_ if (new_count > 0 && need_tran) { /* Start transaction on connections where it is not started */ - if (data_node_begin(new_count, newConnections, gxid)) + if (pgxc_node_begin(new_count, newConnections, gxid)) { pfree(connections); pfree(copy_connections); @@ -1922,18 +1964,18 @@ DataNodeCopyBegin(const char *query, List *nodelist, Snapshot snapshot, bool is_ for (i = 0; i < conn_count; i++) { /* If explicit transaction is needed gxid is already sent */ - if (!need_tran && data_node_send_gxid(connections[i], gxid)) + if (!need_tran && pgxc_node_send_gxid(connections[i], gxid)) { add_error_message(connections[i], "Can not send request"); pfree(connections); pfree(copy_connections); return NULL; } - if (conn_count == 1 && data_node_send_timestamp(connections[i], timestamp)) + if (conn_count == 1 && pgxc_node_send_timestamp(connections[i], tim... [truncated message content] |
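The patch above turns on two rules that are easy to lose in the size of the diff: only the Coordinator that the application is directly connected to (the IS_PGXC_COORDINATOR && !IsConnFromCoord() guard) talks to GTM, and a PREPARE on that Coordinator keeps a local two-phase state file only when the transaction ran DDL; otherwise preparing on the Datanodes is enough and the local gxact is removed. The short standalone C sketch below models just those two decisions. It is an illustration, not Postgres-XC source: NodeRole, talks_to_gtm() and keeps_local_2pc_file() are invented names, and the real code takes these decisions inside GetNewTransactionId(), CommitTransaction(), AbortTransaction() and PrepareTransaction().

#include <stdbool.h>
#include <stdio.h>

typedef enum { COORDINATOR, DATANODE } NodeRole;

/* A connection "from coord" means another Coordinator forwarded the work. */
static bool talks_to_gtm(NodeRole role, bool conn_from_coord)
{
    /* Mirrors the IS_PGXC_COORDINATOR && !IsConnFromCoord() guard: only the
     * Coordinator the application is connected to obtains GXIDs from GTM,
     * commits or rolls back there, and handles sequences on GTM. */
    return role == COORDINATOR && !conn_from_coord;
}

static bool keeps_local_2pc_file(bool xact_ran_ddl)
{
    /* Mirrors PrepareTransaction(bool write_2pc_file): the originating
     * Coordinator writes a local two-phase state file only when DDL was run,
     * because only then must the transaction be prepared on the Coordinators
     * as well as on the Datanodes. */
    return xact_ran_ddl;
}

int main(void)
{
    printf("coordinator, client conn, DML only : gtm=%d 2pc_file=%d\n",
           talks_to_gtm(COORDINATOR, false), keeps_local_2pc_file(false));
    printf("coordinator, client conn, with DDL : gtm=%d 2pc_file=%d\n",
           talks_to_gtm(COORDINATOR, false), keeps_local_2pc_file(true));
    printf("coordinator, conn from coordinator : gtm=%d\n",
           talks_to_gtm(COORDINATOR, true));
    printf("datanode                           : gtm=%d\n",
           talks_to_gtm(DATANODE, false));
    return 0;
}

Compiled with any C99 compiler, the program only prints which combinations touch GTM and which keep a local 2PC file; it carries none of the connection handling or error paths of the real patch.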
From: mason_s <ma...@us...> - 2010-10-04 20:54:09
|
Project "Postgres-XC". The branch, master has been updated via ea13b66f4beaeb13db9741fb5a1347f976b9ebab (commit) from d044db4cc1b8cf18f14cfaa6c65d39ec14905dfb (commit) - Log ----------------------------------------------------------------- commit ea13b66f4beaeb13db9741fb5a1347f976b9ebab Author: Mason Sharp <ma...@us...> Date: Mon Oct 4 16:53:07 2010 -0400 Fixed bug where extra materialization nodes were being created. By Pavan Deolasee diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c index 337f17b..818ea1b 100644 --- a/src/backend/optimizer/plan/createplan.c +++ b/src/backend/optimizer/plan/createplan.c @@ -321,15 +321,6 @@ create_scan_plan(PlannerInfo *root, Path *best_path) best_path, tlist, scan_clauses); - - /* - * Insert a materialization plan above this temporarily - * until we better handle multiple steps using the same connection. - */ - matplan = (Plan *) make_material(plan); - copy_plan_costsize(matplan, plan); - matplan->total_cost += cpu_tuple_cost * matplan->plan_rows; - plan = matplan; break; #endif default: diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index cbf7618..ca2e2a2 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -1325,9 +1325,15 @@ create_remotequery_path(PlannerInfo *root, RelOptInfo *rel) pathnode->parent = rel; pathnode->pathkeys = NIL; /* result is always unordered */ - // PGXCTODO - set cost properly + /* PGXCTODO - set cost properly */ cost_seqscan(pathnode, root, rel); + /* + * Insert a materialization plan above this temporarily + * until we better handle multiple steps using the same connection. + */ + pathnode = create_material_path(rel, pathnode); + return pathnode; } #endif ----------------------------------------------------------------------- Summary of changes: src/backend/optimizer/plan/createplan.c | 9 --------- src/backend/optimizer/util/pathnode.c | 8 +++++++- 2 files changed, 7 insertions(+), 10 deletions(-) hooks/post-receive -- Postgres-XC |
From: mason_s <ma...@us...> - 2010-10-04 20:52:39
|
Project "Postgres-XC". The branch, master has been updated via d044db4cc1b8cf18f14cfaa6c65d39ec14905dfb (commit) from e4978385ac1e81be3b95fe51656a0a166cfc22fb (commit) - Log ----------------------------------------------------------------- commit d044db4cc1b8cf18f14cfaa6c65d39ec14905dfb Author: Mason Sharp <ma...@us...> Date: Sat Oct 2 19:21:57 2010 +0900 Fix a bug with EXPLAIN and EXPLAIN VERBOSE. If it was a single-step statement, the output plan would incorrectly display a coordinator-based standard plan instead of the simple one. Bug and cause of problem discovered by Pavan Deolasee diff --git a/src/backend/pgxc/plan/planner.c b/src/backend/pgxc/plan/planner.c index a88179b..29e4ee0 100644 --- a/src/backend/pgxc/plan/planner.c +++ b/src/backend/pgxc/plan/planner.c @@ -2218,10 +2218,12 @@ pgxc_planner(Query *query, int cursorOptions, ParamListInfo boundParams) } /* - * If there already is an active portal, we may be doing planning within a function. - * Just use the standard plan + * If there already is an active portal, we may be doing planning + * within a function. Just use the standard plan, but check if + * it is part of an EXPLAIN statement so that we do not show that + * we plan multiple steps when it is a single-step operation. */ - if (ActivePortal) + if (ActivePortal && strcmp(ActivePortal->commandTag, "EXPLAIN")) return standard_planner(query, cursorOptions, boundParams); query_step->is_single_step = true; ----------------------------------------------------------------------- Summary of changes: src/backend/pgxc/plan/planner.c | 8 +++++--- 1 files changed, 5 insertions(+), 3 deletions(-) hooks/post-receive -- Postgres-XC |