From: Michael P. <mic...@gm...> - 2012-07-06 23:16:22
On 2012/07/07, at 0:03, Shankar Hariharan <har...@ya...> wrote:
> Michael, read your message on incorrect copy. I was wondering if it is possible to both
> replicate and distribute a table. Just wondering if it can be used to gather all
> distributed data in one spot to work around data loss.

You cannot yet distribute and replicate a table at the same time.

Also, there is no data loss in my bug; we simply do not select the correct node when running COPY in those particular circumstances.

> thanks,
> Shankar

[Quoted digest of Postgres-xc-developers Vol 25, Issue 10 trimmed; its messages appear individually below.]
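For reference, a minimal sketch of the two table layouts being discussed, issued through a coordinator with psql. The node names dn1 and dn3, the port, and the multi-node TO NODE list are assumptions (only dn2 appears in the bug report below), and the exact DISTRIBUTE BY syntax should be checked against the installed XC release:

    # Replicated table: every listed datanode holds a full copy of the rows.
    psql -p 5432 -d postgres -c "CREATE TABLE aa_repl (a int) DISTRIBUTE BY REPLICATION TO NODE dn1, dn2, dn3;"
    # Hash-distributed table: each row lives on exactly one datanode, chosen by hashing column a.
    psql -p 5432 -d postgres -c "CREATE TABLE aa_dist (a int) DISTRIBUTE BY HASH (a) TO NODE dn1, dn2, dn3;"

A single table cannot be both at once, as noted above, so gathering all rows in one place would mean maintaining a separate replicated copy rather than flagging the distributed table itself.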
From: Shankar H. <har...@ya...> - 2012-07-06 20:15:03
As planned, I ran some tests using pgbench on this setup:
Node 1 - Coord1, Datanode1, gtm-proxy1
Node 2 - Coord2, Datanode2, gtm-proxy2
Node 3 - Datanode3, gtm

I was connecting via Coord1 for these tests:
- scale factor of 30 used
- tests run using the following input parameters for pgbench:

  Clients  Threads  Duration (s)  Transactions
  1        1        100           6204
  2        2        100           9960
  4        4        100           12880
  6        6        100           16768
  8        8        100           19758
  10       10       100           21944
  12       12       100           20674

The run went well up to 8 clients. I started seeing errors from 10 clients onwards, and the 14-client run has been hanging around for over an hour now. The errors I have been seeing on the console are the following:

pgbench console:
Client 8 aborted in state 12: ERROR: GTM error, could not obtain snapshot
Client 0 aborted in state 13: ERROR: maximum number of prepared transactions reached
Client 7 aborted in state 13: ERROR: maximum number of prepared transactions reached
Client 11 aborted in state 13: ERROR: maximum number of prepared transactions reached
Client 9 aborted in state 13: ERROR: maximum number of prepared transactions reached

node console:
ERROR: GTM error, could not obtain snapshot
STATEMENT: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (253, 26, 1888413, -817, CURRENT_TIMESTAMP);
ERROR: maximum number of prepared transactions reached
HINT: Increase max_prepared_transactions (currently 10).
STATEMENT: PREPARE TRANSACTION 'T201428'
ERROR: maximum number of prepared transactions reached
STATEMENT: END;
ERROR: maximum number of prepared transactions reached
STATEMENT: END;
ERROR: maximum number of prepared transactions reached
STATEMENT: END;
ERROR: maximum number of prepared transactions reached
STATEMENT: END;
ERROR: GTM error, could not obtain snapshot
STATEMENT: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (140, 29, 2416403, -4192, CURRENT_TIMESTAMP);

I was also watching the processes on each node and see the following for the 14-client run:

Node 1:
postgres 25571 10511 0 04:41 ? 00:00:02 postgres: postgres postgres ::1(33481) TRUNCATE TABLE waiting
postgres 25620 11694 0 04:46 ? 00:00:00 postgres: postgres postgres pgbench-address(50388) TRUNCATE TABLE
Node 2:
postgres 10979 9631 0 Jul05 ? 00:00:42 postgres: postgres postgres coord1-address(57357) idle in transaction
Node 3:
postgres 20264 9911 0 08:35 ? 00:00:05 postgres: postgres postgres coord1-address(51406) TRUNCATE TABLE waiting

I was going to restart the processes on all nodes and start over, but did not want to lose this data as it could be useful information. Any explanation of the above issue is much appreciated. I will try the next run with a higher value set for max_prepared_transactions. Any recommendations for a good value on this front?

thanks,
Shankar

[Quoted earlier thread trimmed; the messages appear individually below.]
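For reference, a rough sketch of how such a run might be reproduced and the error addressed. The host name, port, and the chosen setting value are assumptions; the pgbench options mirror the scale factor, client/thread counts, and 100-second duration reported above:

    # Initialize pgbench tables at scale factor 30 through Coord1.
    pgbench -i -s 30 -h <coord1-address> -p 5432 postgres
    # One of the runs from the table above: 8 clients, 8 threads, 100 seconds.
    pgbench -c 8 -j 8 -T 100 -h <coord1-address> -p 5432 postgres

The "maximum number of prepared transactions reached" errors correspond to the HINT in the node log: max_prepared_transactions is 10 on the nodes. Raising it comfortably above the number of concurrent clients on every coordinator and datanode (for example, max_prepared_transactions = 100 in each postgresql.conf, followed by a restart) is one plausible starting point, though the thread does not settle on a recommended value.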
From: Nikhil S. <ni...@st...> - 2012-07-06 16:34:55
Hi Shankar,

Yeah, the GTM might be able to scale up to some level, but after that, having the proxies around on each node makes much more sense. It also helps reduce the direct CPU load on the GTM node. And the proxies shouldn't consume that much CPU by themselves either, unless you are trying a CPU-intensive benchmark; most benchmarks try to churn up IO.

Regards,
Nikhils

On Fri, Jul 6, 2012 at 9:22 AM, Shankar Hariharan <har...@ya...> wrote:
> Hi Ashutosh,
> I was trying to size the load on a server and was wondering if a GTM could be shared
> w/o much performance overhead between a small number of datanodes and coordinators.
> I will post my findings here.
> thanks,
> Shankar

[Remainder of quoted thread and list footer trimmed; the earlier messages appear below.]

--
StormDB - http://www.stormdb.com
The Database Cloud
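For reference, a sketch of running one gtm_proxy per node as suggested. The data directory paths, ports, and gtm_proxy.conf parameter names are assumptions made for illustration:

    # On node 1 (and likewise node 2), point the local proxy at the GTM on node 3.
    # Hypothetical /data/gtm_proxy1/gtm_proxy.conf entries:
    #   nodename = 'gtm_proxy1'
    #   port     = 6666
    #   gtm_host = '<node3-address>'
    #   gtm_port = 6668
    gtm_ctl start -Z gtm_proxy -D /data/gtm_proxy1
    # Local coordinators and datanodes then talk to localhost:6666 instead of the remote GTM.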
From: Shankar H. <har...@ya...> - 2012-07-06 15:04:40
Thanks Andrei. I will try this out. Appreciate the help.

________________________________
From: Andrei Martsinchyk <and...@gm...>
To: Shankar Hariharan <har...@ya...>
Cc: "pos...@li..." <pos...@li...>
Sent: Friday, July 6, 2012 9:18 AM
Subject: Re: [Postgres-xc-developers] Question on multiple coordinators

[Andrei's coordinator-cloning instructions quoted in full; see his message below.]
From: Shankar H. <har...@ya...> - 2012-07-06 15:03:43
Michael, read your message on incorrect copy. I was wondering if it is possible to both replicate and distribute a table. Just wondering if it can be used to gather all distributed data in one spot to work around data loss.

thanks,
Shankar

[Quoted digest of Postgres-xc-developers Vol 25, Issue 10 trimmed; its messages appear individually below.]
From: Andrei M. <and...@gm...> - 2012-07-06 14:18:11
Hi Shankar,

You can safely clone Coordinators by plain copying of the data directory and adjusting the configuration afterwards. That approach may be used to fix your problem or to add a coordinator to an existing cluster.

1. Stop all cluster components.
2. Copy the coordinator database to the new location.
3. Start GTM, GTM proxy if appropriate, and the Coordinator, specifying -D <new datadir location>. Make sure you are not running a coordinator on the master copy of the data directory; the two copies still share the same name, and GTM would not allow that.
4. Connect psql or another client to the coordinator and create a record in the pgxc_node table for the future "self" entry using the CREATE NODE command. You may need to adjust the connection info for the current self, which will still point to the original coordinator, since the view may differ from the new location; for example, you may want to replace host='localhost' with host='<IP address>'.
5. Adjust the configuration of the new coordinator: you must change pgxc_node_name so it is unique in the cluster, and if the new location is on the same box you may also need to change the port to listen on for client connections and the pooler port. The configuration should match the "self" entry you created in the previous step.
6. Restart the new coordinator and start the other cluster components.
7. Connect to the old coordinators and use the CREATE NODE command to make them aware of the new coordinator.
8. Enjoy.

2012/7/6 Shankar Hariharan <har...@ya...>:
> [Original "Question on multiple coordinators" message quoted; see below.]

--
Andrei Martsinchyk
StormDB - http://www.stormdb.com
The Database Cloud
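For reference, a condensed shell outline of the steps above. Every path, node name, address, and port is an assumption made for illustration, and the pg_ctl/gtm_ctl options should be checked against the installed XC version:

    # 1. Stop all cluster components (coordinators, datanodes, gtm-proxies, gtm).
    # 2. Copy the coordinator data directory to the new location.
    cp -a /data/coord1 /data/coord2
    # 3. Start GTM (and proxy if used), then the coordinator on the copied directory only;
    #    do not start the original coordinator yet, since both copies still share a name.
    gtm_ctl start -Z gtm -D /data/gtm
    pg_ctl start -Z coordinator -D /data/coord2
    # 4. Register the future "self" entry and fix the old self's connection info if needed.
    psql -p 5432 -d postgres -c "CREATE NODE coord2 WITH (TYPE = 'coordinator', HOST = '<node2-address>', PORT = 5433);"
    # 5. Edit /data/coord2/postgresql.conf: pgxc_node_name = 'coord2', plus a unique port
    #    and pooler_port if the copy lives on the same box.
    # 6. Restart the new coordinator and start the remaining cluster components.
    # 7. On each pre-existing coordinator, announce the newcomer and refresh the pooler:
    psql -h <node1-address> -p 5432 -c "CREATE NODE coord2 WITH (TYPE = 'coordinator', HOST = '<node2-address>', PORT = 5433);"
    psql -h <node1-address> -p 5432 -c "SELECT pgxc_pool_reload();"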
From: Shankar H. <har...@ya...> - 2012-07-06 13:24:29
Thanks Michael. Based on your experience, is it reasonable to have a GTM shared across multiple coordinators/datanodes? I do plan to run some tests to observe behavior, but would like to understand whether there are already some metrics around this.

thanks,
Shankar

________________________________
From: Michael Paquier <mic...@gm...>
To: Shankar Hariharan <har...@ya...>
Cc: "pos...@li..." <pos...@li...>
Sent: Friday, July 6, 2012 12:00 AM
Subject: Re: [Postgres-xc-developers] Question on gtm-proxy

[Michael's reply quoted in full; see his message below.]
From: Shankar H. <har...@ya...> - 2012-07-06 13:22:20
Hi Ashutosh,

I was trying to size the load on a server and was wondering if a GTM could be shared w/o much performance overhead between a small number of datanodes and coordinators. I will post my findings here.

thanks,
Shankar

[Quoted reply from Ashutosh Bapat and the earlier thread trimmed; the messages appear below.]
From: Michael P. <mic...@gm...> - 2012-07-06 09:07:56
Hi all,

While testing data redistribution I found this bug with COPY. It is reproducible with master, and very probably with 1.0 stable.

postgres=# create table aa (a int) distribute by replication to node dn2;
CREATE TABLE
postgres=# insert into aa values (generate_series(1,10));
INSERT 0 10
postgres=# copy aa to stdout; -- no output here
postgres=#

I'll investigate this problem on Monday. For the time being this bug is registered here:
https://sourceforge.net/tracker/?func=detail&aid=3540784&group_id=311227&atid=1310232

Regards,
--
Michael Paquier
http://michael.otacoo.com
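For reference, a sketch of how the symptom might be double-checked from a coordinator while the bug is open. The port is an assumption; since the table is replicated only to dn2, a plain SELECT is presumably still routed to the right node even though COPY is not:

    # SELECT through the coordinator returns the ten rows...
    psql -p 5432 -d postgres -c "SELECT count(*) FROM aa;"
    # ...while COPY through the same coordinator produces no output, which is the bug.
    psql -p 5432 -d postgres -c "COPY aa TO stdout;"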
From: Ashutosh B. <ash...@en...> - 2012-07-06 05:25:46
Hi Shankar,

Running gtm-proxy has been shown to improve performance, because it lessens the load on the GTM by serving requests locally. Why do you want the coordinators to connect directly to the GTM? Are you seeing any performance improvement from doing that?

On Fri, Jul 6, 2012 at 10:08 AM, Shankar Hariharan <har...@ya...> wrote:
> Follow up to the earlier email. In the setup described below, can I avoid using a
> gtm-proxy? That is, can I just simply point coordinators to the one gtm running on
> node 3?

[Remainder of quoted message and list footer trimmed.]

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
From: Michael P. <mic...@gm...> - 2012-07-06 05:01:18
On Fri, Jul 6, 2012 at 1:38 PM, Shankar Hariharan <har...@ya...> wrote:
> Follow up to the earlier email. In the setup described below, can I avoid using a
> gtm-proxy? That is, can I just simply point coordinators to the one gtm running on
> node 3?
> My initial plan was to just run the gtm on node 3; then I thought I could try a
> datanode without a local coordinator, which was why I put these two together on node 3.
> thanks,

GTM proxy is not a mandatory element in an XC cluster.
So yes, you can connect a Coordinator or a Datanode directly to a GTM.

--
Michael Paquier
http://michael.otacoo.com
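For reference, a sketch of the configuration change implied by skipping the proxies. The parameter names gtm_host/gtm_port, the paths, and the port value are assumptions about the XC 1.0 configuration format and should be verified against the documentation:

    # Hypothetical change on node 1's coordinator (same idea for the other coordinators
    # and datanodes): point straight at the GTM on node 3 instead of a local gtm-proxy.
    cat >> /data/coord1/postgresql.conf <<'EOF'
    gtm_host = '<node3-address>'
    gtm_port = 6668
    EOF
    pg_ctl restart -Z coordinator -D /data/coord1
    # With gtm-proxies, the same two parameters would instead point at the local proxy.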
From: Shankar H. <har...@ya...> - 2012-07-06 04:38:25
Follow up to the earlier email. In the setup described below, can I avoid using a gtm-proxy? That is, can I just simply point coordinators to the one gtm running on node 3?

My initial plan was to just run the gtm on node 3; then I thought I could try a datanode without a local coordinator, which was why I put these two together on node 3.

thanks,
Shankar

[Original "Question on multiple coordinators" message quoted; see below.]
From: Shankar H. <har...@ya...> - 2012-07-06 04:35:42
Hello,

I am trying out XC 1.0 in the following configuration:
Node 1 - Coord1, Datanode1, gtm-proxy1
Node 2 - Coord2, Datanode2, gtm-proxy2
Node 3 - Datanode3, gtm

I set up all the nodes but forgot to add Coord1 to Coord2 and vice versa. In addition I missed the pg_hba edit as well. So the first table T1 that I created for distribution from Coord1 was not "visible" from Coord2 but was on all the data nodes.
I tried to get Coord2 back into business in various ways, but the first table I created refused to show up on Coord2:
- edit pg_hba and add the node on both coord1 and coord2, then run select pgxc_pool_reload();
- restart coord1 and coord2
- drop node c2 from c1 and c1 from c2, add them back, and follow with select pgxc_pool_reload();

So I tried to create the same table T1 from Coord2 to observe the behavior, and it clearly did not like it: all the nodes it "wrote" to reported that the table already existed, which was good. At this point I could see that Coord2 and Coord1 were not talking properly, so I created a new table from Coord1 with replication. That table was visible from both.

The question is: should I expect to see the first table, call it T1, from Coord2 as well after a while?

thanks,
Shankar
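For reference, a sketch of registering the two coordinators with each other after the fact, along the lines of the recovery attempts listed above. Node names, addresses, and ports are assumptions:

    # On Coord1, make Coord2 known, then refresh the connection pooler:
    psql -h <node1-address> -p 5432 -c "CREATE NODE coord2 WITH (TYPE = 'coordinator', HOST = '<node2-address>', PORT = 5432);"
    psql -h <node1-address> -p 5432 -c "SELECT pgxc_pool_reload();"
    # ...and the mirror-image CREATE NODE coord1 / pgxc_pool_reload() on Coord2.

This presumably only makes the coordinators aware of each other going forward; a table created while Coord2 was out of the loop would still have no catalog entry on Coord2, which is what the coordinator-cloning procedure in Andrei Martsinchyk's reply above works around.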
From: Amit K. <ami...@en...> - 2012-07-04 05:39:05
|
On 4 July 2012 09:23, Koichi Suzuki <koi...@gm...> wrote: > I'd like to post one more thing on this. > > Yes, pg_ctl status and gtm_ctl status (I've corrected this and will > submit soon) check that postmaster and main thread is alive but they > don't show if each node responds correctly. For this purpose, we can > use psql -c 'select 1' only for coordinator. Yes, we can use the > same thing to datanode but we should restrict psql use directly > against the datanode generally. Also, we need gtm/gtm_proxy > counterpart. > May be we can use the PQping() call to test the connectivity. That's what pg_ctl start does. After starting up, it waits for a response from the server and only then returns. > So please let me implement pgxc_monitor command as in the previous > post which really interacts with each node and reports if they're > running. It may need some extension to gtm/gtm_proxy to add a > message just to respond ack. No other core extension will be > needed. > > Regards; > ---------- > Koichi Suzuki > > > 2012/7/4 Koichi Suzuki <koi...@gm...>: > > We don't have to do such a round robin. As Michael suggested, pg_ctl > > status works well even with datanodes. It doesn't issue any query > > but checks if the postmaster is running. I think it is sufficient. > > Only one restriction is pg_stl status determines zombie process as > > running. > > > > Regards; > > ---------- > > Koichi Suzuki > > > > > > 2012/7/4 Nikhil Sontakke <ni...@st...>: > >>> I also believe it's not a good idea to monitor a datanode through a > >>> coordinator using EXECUTE DIRECT because the latter may be failed > >>> while the whole cluster is in operation. > >>> > >> > >> Well, if there are multiple failures we ought to know about them > >> anyways. So if this particular coordinator fails the monitor tells us > >> about it first. We fix it and then move on to the datanode failure > >> detection. Since the datanodes have to be reachable via coordinators > >> and we have multiple coordinators around to load balance anyways, I > >> still think EXECUTE DIRECT via the coordinator node is a decent idea. > >> If we can round robin the calls via all the coordinators that would be > >> better too I think. > >> > >> Regards, > >> Nikhils > >> > >> > >>> Regards; > >>> ---------- > >>> Koichi Suzuki > >>> > >>> > >>> 2012/7/4 Koichi Suzuki <koi...@gm...>: > >>>> The background of xc_watchdog is to provide quicker means to detect > >>>> node fault. I understand that it is not compatible with what we're > >>>> doing in conventional PG applications, which are mostly based upon > >>>> psql -c 'select 1'. It takes at most 60sec to detect the error (TCP > >>>> timeout value). Some applications will be satisfied with this and > >>>> some may not. This is raised at the clustering summit in Ottawa and > >>>> the suggestion was to have this kind of means (watchdog). > >>>> > >>>> I don't know if PG people are interested in this now. Maybe we > >>>> should wait until such fault detection is more realistic issue. > >>>> Implementation is very straightforward. > >>>> > >>>> For datanode, I don't like to ask applications to connect to it > >>>> directly using psql because it is a kind of tricky use and it may mean > >>>> that we allow applications to connect to datanodes directly. So I > >>>> think we should encapsulate this with dedicated command like > >>>> xc_monitor. Xc_ping sounds good too but "ping" reminds me > >>>> consecutive monitoring. Current practice needs only one monitoring. > >>>> So I'd like xc_monitor (or node_monitor). 
> >>>> > >>>> Command like 'xc_monitor -Z nodetype -h host -p port' will not need > >>>> any modification to the core. Will be submitted soon as contrib > >>>> module. > >>>> > >>>> Regards; > >>>> ---------- > >>>> Koichi Suzuki > >>>> > >>>> > >>>> 2012/7/4 Nikhil Sontakke <ni...@st...>: > >>>>>> Are there people with a similar opinion to mine??? > >>>>>> > >>>>> > >>>>> +1 > >>>>> > >>>>> IMO too we should not be making any too invasive internal changes to > >>>>> support monitoring. What would be better would be to maybe allow > >>>>> commands which can be scripted and which can work against each of the > >>>>> components. > >>>>> > >>>>> For example, for the coordinator/datanode periodic "SELECT 1" > commands > >>>>> should be good enough. Even doing an EXECUTE DIRECT via a coordinator > >>>>> to the datanodes will help. > >>>>> > >>>>> For GTM/GTM_Standy/GTM_Proxy components we should introduce "gtm_ctl > >>>>> ping" kinds of commands which will basically connect to them and see > >>>>> that they are responding ok. > >>>>> > >>>>> Such interfaces make it really easy for monitoring solutions like > >>>>> nagios, zabbix etc. to monitor them. These tools have been used for a > >>>>> while now to monitor Postgres and it should be a natural logical > >>>>> evolution for users to see them being used for PG XC. > >>>>> > >>>>> Regards, > >>>>> Nikhils > >>>>> -- > >>>>> StormDB - http://www.stormdb.com > >>>>> The Database Cloud > >> > >> > >> > >> -- > >> StormDB - http://www.stormdb.com > >> The Database Cloud > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > |
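The checks being discussed in this thread can already be scripted with the existing tools; below is a minimal sketch, assuming placeholder host names, ports and data directories (coord1, /data/datanode1) and a node named datanode1 in the cluster. The EXECUTE DIRECT spelling follows the XC 1.0 syntax, and the datanode probe is deliberately routed through a coordinator so that nothing connects to the datanode directly.

    #!/bin/sh
    # Response-level check: the coordinator must actually answer a query.
    psql -h coord1 -p 5432 -d postgres -Atc 'SELECT 1' >/dev/null \
        || echo "coordinator coord1 is not responding"
    # Datanode reachability, probed through the coordinator rather than directly.
    psql -h coord1 -p 5432 -d postgres \
        -c "EXECUTE DIRECT ON (datanode1) 'SELECT 1'" >/dev/null \
        || echo "datanode1 is not reachable through coord1"
    # Process-level check: is the local datanode postmaster alive at all?
    # (No SQL is issued, so a zombie postmaster still counts as running.)
    pg_ctl status -D /data/datanode1 >/dev/null \
        || echo "datanode1 postmaster is not running"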
From: Koichi S. <koi...@gm...> - 2012-07-04 05:20:06
|
Thanks a lot. ---------- Koichi Suzuki 2012/7/4 Michael Paquier <mic...@gm...>: > > > On Wed, Jul 4, 2012 at 12:57 PM, Koichi Suzuki <koi...@gm...> > wrote: >> >> This is a fix for gtm_ctl (only moving one line below). Now, gtm_ctl >> status works correctly. It checks if the main thread is running. >> If running gtl_ctl exits with the exit code zero, if not, exits with >> the exit code one. >> >> This is very similar to pg_ctl status, although gtm_ctl needs -Z >> option to specify if the target is gtm or gtm_proxy. > > Had a look at that and committed a clean up on master, backpatched to 1.0 > stable: > http://github.com/postgres-xc/postgres-xc/commit/7c113772a540cc2282ecca031cea953f53f0fe88 > >> >> >> Regards; >> ---------- >> Koichi Suzuki >> >> >> 2012/7/4 Koichi Suzuki <koi...@gm...>: >> > I'd like to post one more thing on this. >> > >> > Yes, pg_ctl status and gtm_ctl status (I've corrected this and will >> > submit soon) check that postmaster and main thread is alive but they >> > don't show if each node responds correctly. For this purpose, we can >> > use psql -c 'select 1' only for coordinator. Yes, we can use the >> > same thing to datanode but we should restrict psql use directly >> > against the datanode generally. Also, we need gtm/gtm_proxy >> > counterpart. >> > >> > So please let me implement pgxc_monitor command as in the previous >> > post which really interacts with each node and reports if they're >> > running. It may need some extension to gtm/gtm_proxy to add a >> > message just to respond ack. No other core extension will be >> > needed. >> > >> > Regards; >> > ---------- >> > Koichi Suzuki >> > >> > >> > 2012/7/4 Koichi Suzuki <koi...@gm...>: >> >> We don't have to do such a round robin. As Michael suggested, pg_ctl >> >> status works well even with datanodes. It doesn't issue any query >> >> but checks if the postmaster is running. I think it is sufficient. >> >> Only one restriction is pg_stl status determines zombie process as >> >> running. >> >> >> >> Regards; >> >> ---------- >> >> Koichi Suzuki >> >> >> >> >> >> 2012/7/4 Nikhil Sontakke <ni...@st...>: >> >>>> I also believe it's not a good idea to monitor a datanode through >> >>>> a >> >>>> coordinator using EXECUTE DIRECT because the latter may be failed >> >>>> while the whole cluster is in operation. >> >>>> >> >>> >> >>> Well, if there are multiple failures we ought to know about them >> >>> anyways. So if this particular coordinator fails the monitor tells us >> >>> about it first. We fix it and then move on to the datanode failure >> >>> detection. Since the datanodes have to be reachable via coordinators >> >>> and we have multiple coordinators around to load balance anyways, I >> >>> still think EXECUTE DIRECT via the coordinator node is a decent idea. >> >>> If we can round robin the calls via all the coordinators that would be >> >>> better too I think. >> >>> >> >>> Regards, >> >>> Nikhils >> >>> >> >>> >> >>>> Regards; >> >>>> ---------- >> >>>> Koichi Suzuki >> >>>> >> >>>> >> >>>> 2012/7/4 Koichi Suzuki <koi...@gm...>: >> >>>>> The background of xc_watchdog is to provide quicker means to detect >> >>>>> node fault. I understand that it is not compatible with what >> >>>>> we're >> >>>>> doing in conventional PG applications, which are mostly based upon >> >>>>> psql -c 'select 1'. It takes at most 60sec to detect the error >> >>>>> (TCP >> >>>>> timeout value). Some applications will be satisfied with this and >> >>>>> some may not. 
This is raised at the clustering summit in Ottawa >> >>>>> and >> >>>>> the suggestion was to have this kind of means (watchdog). >> >>>>> >> >>>>> I don't know if PG people are interested in this now. Maybe we >> >>>>> should wait until such fault detection is more realistic issue. >> >>>>> Implementation is very straightforward. >> >>>>> >> >>>>> For datanode, I don't like to ask applications to connect to it >> >>>>> directly using psql because it is a kind of tricky use and it may >> >>>>> mean >> >>>>> that we allow applications to connect to datanodes directly. So I >> >>>>> think we should encapsulate this with dedicated command like >> >>>>> xc_monitor. Xc_ping sounds good too but "ping" reminds me >> >>>>> consecutive monitoring. Current practice needs only one >> >>>>> monitoring. >> >>>>> So I'd like xc_monitor (or node_monitor). >> >>>>> >> >>>>> Command like 'xc_monitor -Z nodetype -h host -p port' will not need >> >>>>> any modification to the core. Will be submitted soon as contrib >> >>>>> module. >> >>>>> >> >>>>> Regards; >> >>>>> ---------- >> >>>>> Koichi Suzuki >> >>>>> >> >>>>> >> >>>>> 2012/7/4 Nikhil Sontakke <ni...@st...>: >> >>>>>>> Are there people with a similar opinion to mine??? >> >>>>>>> >> >>>>>> >> >>>>>> +1 >> >>>>>> >> >>>>>> IMO too we should not be making any too invasive internal changes >> >>>>>> to >> >>>>>> support monitoring. What would be better would be to maybe allow >> >>>>>> commands which can be scripted and which can work against each of >> >>>>>> the >> >>>>>> components. >> >>>>>> >> >>>>>> For example, for the coordinator/datanode periodic "SELECT 1" >> >>>>>> commands >> >>>>>> should be good enough. Even doing an EXECUTE DIRECT via a >> >>>>>> coordinator >> >>>>>> to the datanodes will help. >> >>>>>> >> >>>>>> For GTM/GTM_Standy/GTM_Proxy components we should introduce >> >>>>>> "gtm_ctl >> >>>>>> ping" kinds of commands which will basically connect to them and >> >>>>>> see >> >>>>>> that they are responding ok. >> >>>>>> >> >>>>>> Such interfaces make it really easy for monitoring solutions like >> >>>>>> nagios, zabbix etc. to monitor them. These tools have been used for >> >>>>>> a >> >>>>>> while now to monitor Postgres and it should be a natural logical >> >>>>>> evolution for users to see them being used for PG XC. >> >>>>>> >> >>>>>> Regards, >> >>>>>> Nikhils >> >>>>>> -- >> >>>>>> StormDB - http://www.stormdb.com >> >>>>>> The Database Cloud >> >>> >> >>> >> >>> >> >>> -- >> >>> StormDB - http://www.stormdb.com >> >>> The Database Cloud > > > > > -- > Michael Paquier > http://michael.otacoo.com |
From: Koichi S. <koi...@gm...> - 2012-07-04 05:08:03
|
Yes, I agree to make it separate. Will be submitted soon. ---------- Koichi Suzuki 2012/7/4 Michael Paquier <mic...@gm...>: > > > On Wed, Jul 4, 2012 at 12:53 PM, Koichi Suzuki <koi...@gm...> > wrote: >> >> I'd like to post one more thing on this. >> >> Yes, pg_ctl status and gtm_ctl status (I've corrected this and will >> submit soon) check that postmaster and main thread is alive but they >> don't show if each node responds correctly. For this purpose, we can >> use psql -c 'select 1' only for coordinator. Yes, we can use the >> same thing to datanode but we should restrict psql use directly >> against the datanode generally. Also, we need gtm/gtm_proxy >> counterpart. >> >> So please let me implement pgxc_monitor command as in the previous >> post which really interacts with each node and reports if they're >> running. It may need some extension to gtm/gtm_proxy to add a >> message just to respond ack. No other core extension will be >> needed. > > This additional message for GTM/proxy makes sense, it can be considered as > an equivalent to the simple "SELECT 1" on nodes. > I imagine that it can also be used for other purposes. This should be > written as an independant patch. > -- > Michael Paquier > http://michael.otacoo.com |
From: Michael P. <mic...@gm...> - 2012-07-04 05:04:04
|
On Wed, Jul 4, 2012 at 12:53 PM, Koichi Suzuki <koi...@gm...> wrote: > I'd like to post one more thing on this. > > Yes, pg_ctl status and gtm_ctl status (I've corrected this and will > submit soon) check that postmaster and main thread is alive but they > don't show if each node responds correctly. For this purpose, we can > use psql -c 'select 1' only for coordinator. Yes, we can use the > same thing to datanode but we should restrict psql use directly > against the datanode generally. Also, we need gtm/gtm_proxy > counterpart. > > So please let me implement pgxc_monitor command as in the previous > post which really interacts with each node and reports if they're > running. It may need some extension to gtm/gtm_proxy to add a > message just to respond ack. No other core extension will be > needed. > This additional message for GTM/proxy makes sense; it can be considered an equivalent of the simple "SELECT 1" on nodes. I imagine that it can also be used for other purposes. This should be written as an independent patch. -- Michael Paquier http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2012-07-04 05:00:16
|
On Wed, Jul 4, 2012 at 12:57 PM, Koichi Suzuki <koi...@gm...>wrote: > This is a fix for gtm_ctl (only moving one line below). Now, gtm_ctl > status works correctly. It checks if the main thread is running. > If running gtl_ctl exits with the exit code zero, if not, exits with > the exit code one. > > This is very similar to pg_ctl status, although gtm_ctl needs -Z > option to specify if the target is gtm or gtm_proxy. > Had a look at that and committed a clean up on master, backpatched to 1.0 stable: http://github.com/postgres-xc/postgres-xc/commit/7c113772a540cc2282ecca031cea953f53f0fe88 > > Regards; > ---------- > Koichi Suzuki > > > 2012/7/4 Koichi Suzuki <koi...@gm...>: > > I'd like to post one more thing on this. > > > > Yes, pg_ctl status and gtm_ctl status (I've corrected this and will > > submit soon) check that postmaster and main thread is alive but they > > don't show if each node responds correctly. For this purpose, we can > > use psql -c 'select 1' only for coordinator. Yes, we can use the > > same thing to datanode but we should restrict psql use directly > > against the datanode generally. Also, we need gtm/gtm_proxy > > counterpart. > > > > So please let me implement pgxc_monitor command as in the previous > > post which really interacts with each node and reports if they're > > running. It may need some extension to gtm/gtm_proxy to add a > > message just to respond ack. No other core extension will be > > needed. > > > > Regards; > > ---------- > > Koichi Suzuki > > > > > > 2012/7/4 Koichi Suzuki <koi...@gm...>: > >> We don't have to do such a round robin. As Michael suggested, pg_ctl > >> status works well even with datanodes. It doesn't issue any query > >> but checks if the postmaster is running. I think it is sufficient. > >> Only one restriction is pg_stl status determines zombie process as > >> running. > >> > >> Regards; > >> ---------- > >> Koichi Suzuki > >> > >> > >> 2012/7/4 Nikhil Sontakke <ni...@st...>: > >>>> I also believe it's not a good idea to monitor a datanode through a > >>>> coordinator using EXECUTE DIRECT because the latter may be failed > >>>> while the whole cluster is in operation. > >>>> > >>> > >>> Well, if there are multiple failures we ought to know about them > >>> anyways. So if this particular coordinator fails the monitor tells us > >>> about it first. We fix it and then move on to the datanode failure > >>> detection. Since the datanodes have to be reachable via coordinators > >>> and we have multiple coordinators around to load balance anyways, I > >>> still think EXECUTE DIRECT via the coordinator node is a decent idea. > >>> If we can round robin the calls via all the coordinators that would be > >>> better too I think. > >>> > >>> Regards, > >>> Nikhils > >>> > >>> > >>>> Regards; > >>>> ---------- > >>>> Koichi Suzuki > >>>> > >>>> > >>>> 2012/7/4 Koichi Suzuki <koi...@gm...>: > >>>>> The background of xc_watchdog is to provide quicker means to detect > >>>>> node fault. I understand that it is not compatible with what we're > >>>>> doing in conventional PG applications, which are mostly based upon > >>>>> psql -c 'select 1'. It takes at most 60sec to detect the error (TCP > >>>>> timeout value). Some applications will be satisfied with this and > >>>>> some may not. This is raised at the clustering summit in Ottawa > and > >>>>> the suggestion was to have this kind of means (watchdog). > >>>>> > >>>>> I don't know if PG people are interested in this now. 
Maybe we > >>>>> should wait until such fault detection is more realistic issue. > >>>>> Implementation is very straightforward. > >>>>> > >>>>> For datanode, I don't like to ask applications to connect to it > >>>>> directly using psql because it is a kind of tricky use and it may > mean > >>>>> that we allow applications to connect to datanodes directly. So I > >>>>> think we should encapsulate this with dedicated command like > >>>>> xc_monitor. Xc_ping sounds good too but "ping" reminds me > >>>>> consecutive monitoring. Current practice needs only one monitoring. > >>>>> So I'd like xc_monitor (or node_monitor). > >>>>> > >>>>> Command like 'xc_monitor -Z nodetype -h host -p port' will not need > >>>>> any modification to the core. Will be submitted soon as contrib > >>>>> module. > >>>>> > >>>>> Regards; > >>>>> ---------- > >>>>> Koichi Suzuki > >>>>> > >>>>> > >>>>> 2012/7/4 Nikhil Sontakke <ni...@st...>: > >>>>>>> Are there people with a similar opinion to mine??? > >>>>>>> > >>>>>> > >>>>>> +1 > >>>>>> > >>>>>> IMO too we should not be making any too invasive internal changes to > >>>>>> support monitoring. What would be better would be to maybe allow > >>>>>> commands which can be scripted and which can work against each of > the > >>>>>> components. > >>>>>> > >>>>>> For example, for the coordinator/datanode periodic "SELECT 1" > commands > >>>>>> should be good enough. Even doing an EXECUTE DIRECT via a > coordinator > >>>>>> to the datanodes will help. > >>>>>> > >>>>>> For GTM/GTM_Standy/GTM_Proxy components we should introduce "gtm_ctl > >>>>>> ping" kinds of commands which will basically connect to them and see > >>>>>> that they are responding ok. > >>>>>> > >>>>>> Such interfaces make it really easy for monitoring solutions like > >>>>>> nagios, zabbix etc. to monitor them. These tools have been used for > a > >>>>>> while now to monitor Postgres and it should be a natural logical > >>>>>> evolution for users to see them being used for PG XC. > >>>>>> > >>>>>> Regards, > >>>>>> Nikhils > >>>>>> -- > >>>>>> StormDB - http://www.stormdb.com > >>>>>> The Database Cloud > >>> > >>> > >>> > >>> -- > >>> StormDB - http://www.stormdb.com > >>> The Database Cloud > -- Michael Paquier http://michael.otacoo.com |
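With this fix in place, gtm_ctl status can be driven from monitoring scripts in the same way as pg_ctl status; a short sketch, with /data/gtm and /data/gtm_proxy1 as placeholder data directories:

    # Exit status 0 means the main thread is running, 1 means it is not.
    if gtm_ctl status -Z gtm -D /data/gtm >/dev/null; then
        echo "gtm: running"
    else
        echo "gtm: NOT running"
    fi
    gtm_ctl status -Z gtm_proxy -D /data/gtm_proxy1 >/dev/null \
        || echo "gtm_proxy1: NOT running"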
From: Ashutosh B. <ash...@en...> - 2012-07-04 04:21:18
|
Amit, If a complete automated solution is not possible, you can as well think about reporting an error and expecting human intervention. On Tue, Jul 3, 2012 at 10:16 AM, Michael Paquier <mic...@gm...>wrote: > > On Fri, Jun 29, 2012 at 8:34 PM, Amit Khandekar < > ami...@en...> wrote: > >> For utility statements in general, the coordinator propagates SQL >> statements to all the required nodes, and most of these statements get run >> on the datanodes inside a transaction block. So, when the statement fails >> on at least one of the nodes, the statement gets rollbacked on all the >> nodes due to the two-phase commit taking place, and therefore the cluster >> rollbacks to a consistent state. But there are some statements which >> cannot be run inside a transaction block. Here are some important ones: >> CREATE/DROP DATABASE >> CREATE/DROP TABLESPACE >> ALTER DATABASE SET TABLESPACE >> ALTER TYPE ADD ... (for enum types) >> CREATE INDEX CONCURRENTLY >> REINDEX DATABASE >> DISCARD ALL >> >> So such statements run on datanodes in auto-commit mode, and so create >> problems if they succeed on some nodes and abort on other nodes. For e.g. >> : CREATE DATABASE. If a datanode d1 returns with error, and any other >> datanode d2 has already returned back to coordinator with success, the >> coordinator can't undo the commit of d2 because this is already committed. >> Or if the coordinator itself crashes after datanodes commit but before the >> coordinator commits, then again we have the same problem. The database >> cannot be recreated from coordinator, since it is already created on some >> of the other nodes. In such a cluster state, administrator needs to connect >> to datanodes and do the needed cleanup. >> >> The committed statements can be followed by statements that undo the >> operation, for e.g. DROP DATABASE for a CREATE DATABASE. But here again >> this statement can fail for some reason. Also, typically for such >> statements, their UNDO counterparts themselves cannot be run inside a >> transaction block as well. So this is not a guaranteed way to bring back >> the cluster to a consistent state. >> >> To find out how we can get around this issue, let's see why these >> statements require to be run outside a transaction block in the first >> place. There are two reasons why: >> >> 1. Typically such statements modify OS files and directories which cannot >> be rollbacked. >> >> For DMLs, the rollback does not have to be explicitly undone. MVCC takes >> care of it. But for OS file operations, there is no automatic way. So such >> operations cannot be rollbacked. So in a transaction block, if a >> create-database is followed by 10 other SQL statements before commit, and >> one of the statements throws an error, ultimately the database won't be >> created but there will be database files taking up disk space, and this has >> happened just because the user has written the script wrongly. >> >> So by restricting such statement to be run outside a transaction block, >> an unrelated error won't cause garbage files to be created. >> >> The statement itself does get committed eventually as usual. And it can >> also get rolled back in the end. But maximum care has been taken in the >> statement function (for e.g. createdb) such that the chances of an error >> occurring *after* the files are created is least. For this, such a code >> segment is inside PG_ENSURE_ERROR_CLEANUP() with some error_callback >> function (createdb_failure_callback) which tries to clean up the files >> created. 
>> >> So the end result is that this window between files-created and >> error-occurred is minimized, not that such statements will never create >> such cleanup issues if run outside transaction block. >> >> Possible solution: >> >> So regarding Postgres-XC, if we let such statements to be run inside >> transaction block but only on remote nodes, what are the consequences? This >> will of course prevent the issue of the statement committed on one node and >> not the other. Also, the end user will still be prevented from running the >> statement inside the transaction. Moreover, for such statement, say >> create-database, the database will be created on all nodes or none, even if >> one of the nodes return error. The only issue is, if the create-database is >> aborted, it will leave disk space wasted on nodes where it has succeeded. >> But this will be caused because of some configuration issues like disk >> space, network down etc. The issue of other unrelated operations in the >> same transaction causing rollback of create-database will not occur anyways >> because we still don't allow it in a transaction block for the end-user. >> >> So the end result is we have solved the inconsistent cluster issue, >> leaving some chances of disk cleanup issue, although not due to >> user-queries getting aborted. So may be when such statements error out, we >> display a notice that files need to be cleaned up. >> > Could it be possible to store somewhere in the PGDATA folder of the node > involved the files that need to be cleaned up? We could use for this > purpose some binary encoding or something. Ultimately this would finish > just by being a list of files inside PGDATA to be cleaned up. > We could then create a system function that unlinks all the files whose > name have been stored on local node. As such a system function does not > interact with other databases it could be immutable in order to allow a > clean up from coordinator with EXECUTE DIRECT. > > >> We can go further ahead to reduce this window. We split the >> create-database operation. We begin a transaction block, and then let >> datanodes create the non-file operations first, like inserting pg_database >> row, etc, by running them using a new function call. Don't commit it yet. >> Then fire the last part: file system operations, this too using another >> function call. And then finally commit. This file operation will be under >> PG_ENSURE_ERROR_CLEANUP(). Due to synchronizing these individual tasks, we >> reduce the window further. >> > We need to be careful here with the impact of our code on PostgreSQL code. > It would be a pain to have a complecated implementation here for future > merges. > > >> 2. Some statements do internal commits. >> >> For e.g. movedb() calls TransactionCommit() after copying the files, and >> then removes the original files, so that if it crashes while removing the >> files, the database with the new tablespace is already committed and >> intact, so we just leave some old files. >> >> Such statements doing internal commits cannot be rolled back if run >> inside transaction block, because they already do some commits. For such >> statements, the above solution does not work. We need to find a separate >> way for these specific statements. 
Few of such statements include: >> ALTER DATABASE SET TABLESPACE >> CLUSTER >> CREATE INDEX CONCURRENTLY >> >> One similar solution is to split the individual tasks that get internally >> committed using different functions for each task, and run the individual >> functions on all the nodes synchronously. So the 2nd task does not start >> until the first one gets committed on all the nodes. Whether it is feasible >> to split the task is a question, and it depends on the particular command. >> > We would need a locking system for each task and each task step like what > is done for barrier. > Or a new communication protocol, once again like barriers. Those are once > again just ideas on the top of my mind. > > >> >> As of now, I am not sure whether we can do some common changes in the way >> transactions are implemented to find a common solution which does not >> require changes for individual commands. But I will investigate more. >> > Thanks. > -- > Michael Paquier > http://michael.otacoo.com > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |
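To make the orphan-file idea above a bit more concrete, here is a rough sketch of how a cleanup could be triggered from a coordinator once such a facility existed. The function name pgxc_clean_orphan_files() is invented purely for illustration, and coord1, dnhost1, datanode1 and appdb are placeholders; today the only option remains the manual cleanup described in the mail.

    # Hypothetical: ask one datanode to unlink whatever file names it recorded
    # in PGDATA while a non-transactional utility statement was being aborted.
    psql -h coord1 -p 5432 -d postgres \
        -c "EXECUTE DIRECT ON (datanode1) 'SELECT pgxc_clean_orphan_files()'"
    # Manual cleanup as it stands today: connect to the node where the statement
    # did commit and undo it by hand.
    psql -h dnhost1 -p 15432 -d postgres -c "DROP DATABASE appdb"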
From: Michael P. <mic...@gm...> - 2012-07-04 04:20:11
|
On Wed, Jul 4, 2012 at 1:15 PM, Ashutosh Bapat < ash...@en...> wrote: > Hi Michael, > This change would cause enormous diffs with my current work, and slow it > down. We should do this after I have committed the planner changes, which > themselves might take some time. I don't see any urgency in doing it right > away. > OK, agreed, I won't do anything else for the time being, but I think it was a point to raise. As it has an impact on your work, perhaps you could take care of some reorganization and put it into a shape more like the one in my patch? -- Michael Paquier http://michael.otacoo.com
From: Ashutosh B. <ash...@en...> - 2012-07-04 04:15:32
|
Hi Michael, This change would cause enormous diffs with my current work, and slow it down. We should do this after I have committed the planner changes, which themselves might take some time. I don't see any urgency in doing it right away. On Tue, Jul 3, 2012 at 8:27 AM, Michael Paquier <mic...@gm...>wrote: > Hi all, > > Currently postgresql_fdw.c is located in src/backend/pgxc/pool, but it is > only used to evaluate the shippability of expressions. > Having that in pooler and knowing that it only evaluates foreign > expressions doesn't really make sense. > So why not moving it on the XC planner part? Please find attached a patch > that cleans up the file, moves it to XC planner, and renames it to > ship_eval.c. > This patch would be only applied on master. > > Comments? > -- > Michael Paquier > http://michael.otacoo.com > > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |
From: Koichi S. <koi...@gm...> - 2012-07-04 03:57:28
|
This is a fix for gtm_ctl (only moving one line below). Now, gtm_ctl status works correctly. It checks if the main thread is running. If running gtl_ctl exits with the exit code zero, if not, exits with the exit code one. This is very similar to pg_ctl status, although gtm_ctl needs -Z option to specify if the target is gtm or gtm_proxy. Regards; ---------- Koichi Suzuki 2012/7/4 Koichi Suzuki <koi...@gm...>: > I'd like to post one more thing on this. > > Yes, pg_ctl status and gtm_ctl status (I've corrected this and will > submit soon) check that postmaster and main thread is alive but they > don't show if each node responds correctly. For this purpose, we can > use psql -c 'select 1' only for coordinator. Yes, we can use the > same thing to datanode but we should restrict psql use directly > against the datanode generally. Also, we need gtm/gtm_proxy > counterpart. > > So please let me implement pgxc_monitor command as in the previous > post which really interacts with each node and reports if they're > running. It may need some extension to gtm/gtm_proxy to add a > message just to respond ack. No other core extension will be > needed. > > Regards; > ---------- > Koichi Suzuki > > > 2012/7/4 Koichi Suzuki <koi...@gm...>: >> We don't have to do such a round robin. As Michael suggested, pg_ctl >> status works well even with datanodes. It doesn't issue any query >> but checks if the postmaster is running. I think it is sufficient. >> Only one restriction is pg_stl status determines zombie process as >> running. >> >> Regards; >> ---------- >> Koichi Suzuki >> >> >> 2012/7/4 Nikhil Sontakke <ni...@st...>: >>>> I also believe it's not a good idea to monitor a datanode through a >>>> coordinator using EXECUTE DIRECT because the latter may be failed >>>> while the whole cluster is in operation. >>>> >>> >>> Well, if there are multiple failures we ought to know about them >>> anyways. So if this particular coordinator fails the monitor tells us >>> about it first. We fix it and then move on to the datanode failure >>> detection. Since the datanodes have to be reachable via coordinators >>> and we have multiple coordinators around to load balance anyways, I >>> still think EXECUTE DIRECT via the coordinator node is a decent idea. >>> If we can round robin the calls via all the coordinators that would be >>> better too I think. >>> >>> Regards, >>> Nikhils >>> >>> >>>> Regards; >>>> ---------- >>>> Koichi Suzuki >>>> >>>> >>>> 2012/7/4 Koichi Suzuki <koi...@gm...>: >>>>> The background of xc_watchdog is to provide quicker means to detect >>>>> node fault. I understand that it is not compatible with what we're >>>>> doing in conventional PG applications, which are mostly based upon >>>>> psql -c 'select 1'. It takes at most 60sec to detect the error (TCP >>>>> timeout value). Some applications will be satisfied with this and >>>>> some may not. This is raised at the clustering summit in Ottawa and >>>>> the suggestion was to have this kind of means (watchdog). >>>>> >>>>> I don't know if PG people are interested in this now. Maybe we >>>>> should wait until such fault detection is more realistic issue. >>>>> Implementation is very straightforward. >>>>> >>>>> For datanode, I don't like to ask applications to connect to it >>>>> directly using psql because it is a kind of tricky use and it may mean >>>>> that we allow applications to connect to datanodes directly. So I >>>>> think we should encapsulate this with dedicated command like >>>>> xc_monitor. 
Xc_ping sounds good too but "ping" reminds me >>>>> consecutive monitoring. Current practice needs only one monitoring. >>>>> So I'd like xc_monitor (or node_monitor). >>>>> >>>>> Command like 'xc_monitor -Z nodetype -h host -p port' will not need >>>>> any modification to the core. Will be submitted soon as contrib >>>>> module. >>>>> >>>>> Regards; >>>>> ---------- >>>>> Koichi Suzuki >>>>> >>>>> >>>>> 2012/7/4 Nikhil Sontakke <ni...@st...>: >>>>>>> Are there people with a similar opinion to mine??? >>>>>>> >>>>>> >>>>>> +1 >>>>>> >>>>>> IMO too we should not be making any too invasive internal changes to >>>>>> support monitoring. What would be better would be to maybe allow >>>>>> commands which can be scripted and which can work against each of the >>>>>> components. >>>>>> >>>>>> For example, for the coordinator/datanode periodic "SELECT 1" commands >>>>>> should be good enough. Even doing an EXECUTE DIRECT via a coordinator >>>>>> to the datanodes will help. >>>>>> >>>>>> For GTM/GTM_Standy/GTM_Proxy components we should introduce "gtm_ctl >>>>>> ping" kinds of commands which will basically connect to them and see >>>>>> that they are responding ok. >>>>>> >>>>>> Such interfaces make it really easy for monitoring solutions like >>>>>> nagios, zabbix etc. to monitor them. These tools have been used for a >>>>>> while now to monitor Postgres and it should be a natural logical >>>>>> evolution for users to see them being used for PG XC. >>>>>> >>>>>> Regards, >>>>>> Nikhils >>>>>> -- >>>>>> StormDB - http://www.stormdb.com >>>>>> The Database Cloud >>> >>> >>> >>> -- >>> StormDB - http://www.stormdb.com >>> The Database Cloud |
From: Koichi S. <koi...@gm...> - 2012-07-04 03:53:09
|
I'd like to post one more thing on this. Yes, pg_ctl status and gtm_ctl status (I've corrected this and will submit soon) check that postmaster and main thread is alive but they don't show if each node responds correctly. For this purpose, we can use psql -c 'select 1' only for coordinator. Yes, we can use the same thing to datanode but we should restrict psql use directly against the datanode generally. Also, we need gtm/gtm_proxy counterpart. So please let me implement pgxc_monitor command as in the previous post which really interacts with each node and reports if they're running. It may need some extension to gtm/gtm_proxy to add a message just to respond ack. No other core extension will be needed. Regards; ---------- Koichi Suzuki 2012/7/4 Koichi Suzuki <koi...@gm...>: > We don't have to do such a round robin. As Michael suggested, pg_ctl > status works well even with datanodes. It doesn't issue any query > but checks if the postmaster is running. I think it is sufficient. > Only one restriction is pg_stl status determines zombie process as > running. > > Regards; > ---------- > Koichi Suzuki > > > 2012/7/4 Nikhil Sontakke <ni...@st...>: >>> I also believe it's not a good idea to monitor a datanode through a >>> coordinator using EXECUTE DIRECT because the latter may be failed >>> while the whole cluster is in operation. >>> >> >> Well, if there are multiple failures we ought to know about them >> anyways. So if this particular coordinator fails the monitor tells us >> about it first. We fix it and then move on to the datanode failure >> detection. Since the datanodes have to be reachable via coordinators >> and we have multiple coordinators around to load balance anyways, I >> still think EXECUTE DIRECT via the coordinator node is a decent idea. >> If we can round robin the calls via all the coordinators that would be >> better too I think. >> >> Regards, >> Nikhils >> >> >>> Regards; >>> ---------- >>> Koichi Suzuki >>> >>> >>> 2012/7/4 Koichi Suzuki <koi...@gm...>: >>>> The background of xc_watchdog is to provide quicker means to detect >>>> node fault. I understand that it is not compatible with what we're >>>> doing in conventional PG applications, which are mostly based upon >>>> psql -c 'select 1'. It takes at most 60sec to detect the error (TCP >>>> timeout value). Some applications will be satisfied with this and >>>> some may not. This is raised at the clustering summit in Ottawa and >>>> the suggestion was to have this kind of means (watchdog). >>>> >>>> I don't know if PG people are interested in this now. Maybe we >>>> should wait until such fault detection is more realistic issue. >>>> Implementation is very straightforward. >>>> >>>> For datanode, I don't like to ask applications to connect to it >>>> directly using psql because it is a kind of tricky use and it may mean >>>> that we allow applications to connect to datanodes directly. So I >>>> think we should encapsulate this with dedicated command like >>>> xc_monitor. Xc_ping sounds good too but "ping" reminds me >>>> consecutive monitoring. Current practice needs only one monitoring. >>>> So I'd like xc_monitor (or node_monitor). >>>> >>>> Command like 'xc_monitor -Z nodetype -h host -p port' will not need >>>> any modification to the core. Will be submitted soon as contrib >>>> module. >>>> >>>> Regards; >>>> ---------- >>>> Koichi Suzuki >>>> >>>> >>>> 2012/7/4 Nikhil Sontakke <ni...@st...>: >>>>>> Are there people with a similar opinion to mine??? 
>>>>>> >>>>> >>>>> +1 >>>>> >>>>> IMO too we should not be making any too invasive internal changes to >>>>> support monitoring. What would be better would be to maybe allow >>>>> commands which can be scripted and which can work against each of the >>>>> components. >>>>> >>>>> For example, for the coordinator/datanode periodic "SELECT 1" commands >>>>> should be good enough. Even doing an EXECUTE DIRECT via a coordinator >>>>> to the datanodes will help. >>>>> >>>>> For GTM/GTM_Standy/GTM_Proxy components we should introduce "gtm_ctl >>>>> ping" kinds of commands which will basically connect to them and see >>>>> that they are responding ok. >>>>> >>>>> Such interfaces make it really easy for monitoring solutions like >>>>> nagios, zabbix etc. to monitor them. These tools have been used for a >>>>> while now to monitor Postgres and it should be a natural logical >>>>> evolution for users to see them being used for PG XC. >>>>> >>>>> Regards, >>>>> Nikhils >>>>> -- >>>>> StormDB - http://www.stormdb.com >>>>> The Database Cloud >> >> >> >> -- >> StormDB - http://www.stormdb.com >> The Database Cloud |
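For illustration, this is roughly how the proposed command could be wired into a Nagios- or Zabbix-style probe once the contrib module exists; the command name, the -Z spellings and the hosts/ports below are all placeholders derived from the 'xc_monitor -Z nodetype -h host -p port' form above, not an existing interface.

    # One probe per component; exit status 0 would mean the node acknowledged.
    pgxc_monitor -Z gtm  -h gtmhost -p 6666  || echo "gtm: no response"
    pgxc_monitor -Z node -h dnhost1 -p 15432 || echo "datanode1: no response"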
From: Abbas B. <abb...@en...> - 2012-07-04 03:35:20
|
While fixing the regression failures resulting from the changes done by the patch I was able to fix all except this test case set enforce_two_phase_commit = off; CREATE TEMP TABLE users ( id INT PRIMARY KEY, name VARCHAR NOT NULL ) DISTRIBUTE BY REPLICATION; INSERT INTO users VALUES (1, 'Jozko'); INSERT INTO users VALUES (2, 'Ferko'); INSERT INTO users VALUES (3, 'Samko'); CREATE TEMP TABLE tasks ( id INT PRIMARY KEY, owner INT REFERENCES users ON UPDATE CASCADE ON DELETE SET NULL, worker INT REFERENCES users ON UPDATE CASCADE ON DELETE SET NULL, checked_by INT REFERENCES users ON UPDATE CASCADE ON DELETE SET NULL ) DISTRIBUTE BY REPLICATION; INSERT INTO tasks VALUES (1,1,NULL,NULL); INSERT INTO tasks VALUES (2,2,2,NULL); INSERT INTO tasks VALUES (3,3,3,3); BEGIN; UPDATE tasks set id=id WHERE id=2; SELECT * FROM tasks; DELETE FROM users WHERE id = 2; SELECT * FROM tasks; COMMIT; The obtained output from the last select statement is id | owner | worker | checked_by ----+-------+--------+------------ 1 | 1 | | 3 | 3 | 3 | 3 2 | 2 | 2 | (3 rows) where as the expected output is id | owner | worker | checked_by ----+-------+--------+------------ 1 | 1 | | 3 | 3 | 3 | 3 2 | | | (3 rows) Note that the owner and worker have been set to null due to "ON DELETE SET NULL". Here is the reason why this does not work properly. Consider the last transaction BEGIN; UPDATE tasks set id=id WHERE id=2; SELECT * FROM tasks; DELETE FROM users WHERE id = 2; SELECT * FROM tasks; COMMIT; Here are the command id values the coordinator sends to the data node 0 for the first update that gets incremented to 1 because this is a DML and needs to consume a command id 1 for the first select that remains 1 since it is not required to be consumed. 1 for the delete statement that gets incremented to 2 because it is a DML and 2 for the last select. Now this is what happens on the data node When the data node receives the first update with command id 0, it increments it once due to the update itself and once due to the update run because of "ON UPDATE CASCADE". Hence the command id at the end of update on data node is 2. The first select comes to data node with command id 1, which is incorrect. The user's intention is to see data after update and its command id should be 2. Now delete comes with command id 1, and data node increments it once due to the delete itself and once due to the update run because of "ON DELETE SET NULL", hence the command id at the end of delete is 3. Coordinator now sends last select with command id 2, which is again incorrect since user's intention is to see data after delete and select should have been sent to data node with command id 3 or 4. Every time data node increments command id due to any statements run implicitly either because of the constraints or triggers, this scheme of sending command ids to data node from coordinator to solve fetch problems would fail. Datanode can have a trigger e.g. inserting rows thrice on every single insert and would increment command id on every insert. Therefore this design cannot work. Either we have to synchronize command ids between datanode and coordinator through GTM OR We will have to send the DECLARE CURSOR down to the datanode. In this case however we will not be able to send the cursor query as it is because the query might contain a join on two tables which exist on a disjoint set of data nodes. Comments or suggestions are welcome. On Tue, Jun 19, 2012 at 2:43 PM, Abbas Butt <abb...@en...>wrote: > Thanks for your comments. 
> > On Tue, Jun 19, 2012 at 1:54 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> Hi Abbas, >> I have few comments to make >> 1. With this patch there are two variables for having command Id, that is >> going to cause confusion and will be a maintenance burden, might be error >> prone. Is it possible to use a single variable instead of two? > > > Are you talking about receivedCommandId and currentCommandId? If yes, I > would prefer not having a packet received from coordinator overwrite the > currentCommandId at data node, because I am not 100% sure about the life > time of currentCommandId, I might overwrite it before time. It would be > safe to let currentCommandId as is unless we are compelled to get the next > command ID, and have the received command id take priority at that time. > > >> Right now there is some code which is specific to cursors in your patch. >> If you can plug the coordinator command id somehow into currentCommandId, >> you won't need that code and any other code which needs coordinator command >> ID will be automatically taken care of. >> > > That code is required to solve a problem. Consider this case when a > coordinator received this transaction > > > BEGIN; > insert into tt1 values(1); > declare c50 cursor for select * from tt1; > insert into tt1 values(2); > fetch all from c50; > COMMIT; > > While sending select to the data node in response to a fetch we need to > know what was the command ID of the declare cursor statement and we need to > send that command ID to the data node for this particular fetch. This is > the main idea behind this solution. > > The first insert goes to the data node with command id 0, the second > insert goes with 2. Command ID 1 is consumed by declare cursor. When > coordinator sees fetch it needs to send select to the data node with > command ID 1 rather than 3. > > > >> 2. A non-transaction on coordinator can spawn tranasactions on datanode >> or subtransactions (if there is already a transaction running). Does your >> patch handle that case? > > > No and it does not need to, because that case has no known problems that > we need to solve. I don't think my patch would impact any such case but I > will analyze any failures that I may get in regressions. > > >> Should we do more thorough research in the transaction management, esp. >> to see the impact of getting same command id for two commands on the >> datanode? >> > > If we issue two commands with the same command ID then we will definitely > have visibility issues according to the rules I have already explained. But > we will not have two commands sent to the data node with same command id. > > >> >> >> On Tue, Jun 19, 2012 at 1:56 PM, Abbas Butt <abb...@en...>wrote: >> >>> Hi Ashutosh, >>> Here are the results with the val column, Thanks. 
>>> >>> test=# drop table mvcc_demo; >>> DROP TABLE >>> test=# >>> test=# create table mvcc_demo (val int); >>> CREATE TABLE >>> test=# >>> test=# TRUNCATE mvcc_demo; >>> TRUNCATE TABLE >>> test=# >>> test=# BEGIN; >>> BEGIN >>> test=# DELETE FROM mvcc_demo; -- increment command id to show that combo >>> id would be different >>> DELETE 0 >>> test=# DELETE FROM mvcc_demo; >>> DELETE 0 >>> test=# DELETE FROM mvcc_demo; >>> DELETE 0 >>> test=# INSERT INTO mvcc_demo VALUES (1); >>> INSERT 0 1 >>> test=# INSERT INTO mvcc_demo VALUES (2); >>> INSERT 0 1 >>> test=# INSERT INTO mvcc_demo VALUES (3); >>> INSERT 0 1 >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+------+-----------+------------- >>> 80689 | 0 | 3 | f >>> 80689 | 0 | 4 | f >>> 80689 | 0 | 5 | f >>> (3 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+------+------+------+----- >>> 80689 | 0 | 3 | 3 | 1 >>> 80689 | 0 | 4 | 4 | 2 >>> 80689 | 0 | 5 | 5 | 3 >>> >>> (3 rows) >>> >>> test=# >>> test=# DELETE FROM mvcc_demo; >>> DELETE 3 >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+-------+-----------+------------- >>> 80689 | 80689 | 0 | t >>> 80689 | 80689 | 1 | t >>> 80689 | 80689 | 2 | t >>> (3 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> ------+------+------+------+----- >>> (0 rows) >>> >>> >>> test=# >>> test=# END; >>> COMMIT >>> test=# >>> test=# >>> test=# TRUNCATE mvcc_demo; >>> TRUNCATE TABLE >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> test=# BEGIN; >>> BEGIN >>> test=# INSERT INTO mvcc_demo VALUES (1); >>> INSERT 0 1 >>> test=# INSERT INTO mvcc_demo VALUES (2); >>> INSERT 0 1 >>> test=# INSERT INTO mvcc_demo VALUES (3); >>> INSERT 0 1 >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+------+-----------+------------- >>> 80693 | 0 | 0 | f >>> 80693 | 0 | 1 | f >>> 80693 | 0 | 2 | f >>> (3 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+------+------+------+----- >>> 80693 | 0 | 0 | 0 | 1 >>> 80693 | 0 | 1 | 1 | 2 >>> 80693 | 0 | 2 | 2 | 3 >>> (3 rows) >>> >>> test=# >>> test=# UPDATE mvcc_demo SET val = 10; >>> >>> UPDATE 3 >>> test=# >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> 
-------+-------+-----------+------------- >>> 80693 | 80693 | 0 | t >>> 80693 | 80693 | 1 | t >>> 80693 | 80693 | 2 | t >>> 80693 | 0 | 3 | f >>> 80693 | 0 | 3 | f >>> 80693 | 0 | 3 | f >>> (6 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+------+------+------+----- >>> 80693 | 0 | 3 | 3 | 10 >>> 80693 | 0 | 3 | 3 | 10 >>> 80693 | 0 | 3 | 3 | 10 >>> (3 rows) >>> >>> >>> test=# >>> test=# END; >>> COMMIT >>> test=# >>> test=# TRUNCATE mvcc_demo; >>> TRUNCATE TABLE >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> -- From one psql issue >>> test=# INSERT INTO mvcc_demo VALUES (1); >>> INSERT 0 1 >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+------+-----------+------------- >>> 80699 | 0 | 0 | f >>> (1 row) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+------+------+------+----- >>> 80699 | 0 | 0 | 0 | 1 >>> (1 row) >>> >>> >>> >>> >>> >>> test=# -- From another issue >>> test=# BEGIN; >>> BEGIN >>> test=# INSERT INTO mvcc_demo VALUES (2); >>> INSERT 0 1 >>> test=# INSERT INTO mvcc_demo VALUES (3); >>> INSERT 0 1 >>> test=# INSERT INTO mvcc_demo VALUES (4); >>> INSERT 0 1 >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+------+-----------+------------- >>> 80699 | 0 | 0 | f >>> 80700 | 0 | 0 | f >>> 80700 | 0 | 1 | f >>> 80700 | 0 | 2 | f >>> (4 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+------+------+------+----- >>> 80699 | 0 | 0 | 0 | 1 >>> 80700 | 0 | 0 | 0 | 2 >>> 80700 | 0 | 1 | 1 | 3 >>> 80700 | 0 | 2 | 2 | 4 >>> (4 rows) >>> >>> test=# >>> test=# UPDATE mvcc_demo SET val = 10; >>> >>> UPDATE 4 >>> test=# >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+-------+-----------+------------- >>> 80700 | 80700 | 0 | t >>> 80700 | 80700 | 1 | t >>> 80700 | 80700 | 2 | t >>> 80699 | 80700 | 3 | f >>> 80700 | 0 | 3 | f >>> 80700 | 0 | 3 | f >>> 80700 | 0 | 3 | f >>> 80700 | 0 | 3 | f >>> (8 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+------+------+------+----- >>> 80700 | 0 | 3 | 3 | 10 >>> 80700 | 0 | 3 | 3 | 10 >>> 80700 | 0 | 3 | 3 | 10 >>> 80700 | 0 | 3 | 3 | 10 >>> (4 rows) >>> >>> >>> >>> >>> test=# -- Before finishing this, issue these from the first psql >>> test=# SELECT t_xmin AS xmin, >>> test-# t_xmax::text::int8 AS xmax, >>> test-# t_field3::text::int8 AS cmin_cmax, >>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>> is_combocid >>> test-# FROM 
heap_page_items(get_raw_page('mvcc_demo', 0)) >>> test-# ORDER BY 2 DESC, 3; >>> xmin | xmax | cmin_cmax | is_combocid >>> -------+-------+-----------+------------- >>> 80700 | 80700 | 0 | t >>> 80700 | 80700 | 1 | t >>> 80700 | 80700 | 2 | t >>> 80699 | 80700 | 3 | f >>> 80700 | 0 | 3 | f >>> 80700 | 0 | 3 | f >>> 80700 | 0 | 3 | f >>> 80700 | 0 | 3 | f >>> (8 rows) >>> >>> test=# >>> test=# select xmin,xmax,cmin,cmax,* from mvcc_demo order by val; >>> xmin | xmax | cmin | cmax | val >>> -------+-------+------+------+----- >>> 80699 | 80700 | 3 | 3 | 1 >>> (1 row) >>> >>> test=# end; >>> COMMIT >>> >>> >>> On Tue, Jun 19, 2012 at 10:26 AM, Michael Paquier < >>> mic...@gm...> wrote: >>> >>>> Hi, >>>> >>>> I expect pgxc_node_send_cmd_id to have some impact on performance, so >>>> be sure to send it to remote Datanodes really only if necessary. >>>> You should put more severe conditions blocking this function cid can >>>> easily get incremented in Postgres. >>>> >>>> Regards, >>>> >>>> On Tue, Jun 19, 2012 at 5:31 AM, Abbas Butt < >>>> abb...@en...> wrote: >>>> >>>>> PFA a WIP patch implementing the design presented earlier. >>>>> The patch is WIP because it still has and FIXME and it shows some >>>>> regression failures that need to be fixed, but other than that it confirms >>>>> that the suggested design would work fine. The following test cases now >>>>> work fine >>>>> >>>>> drop table tt1; >>>>> create table tt1(f1 int) distribute by replication; >>>>> >>>>> >>>>> BEGIN; >>>>> insert into tt1 values(1); >>>>> declare c50 cursor for select * from tt1; >>>>> insert into tt1 values(2); >>>>> fetch all from c50; >>>>> COMMIT; >>>>> truncate table tt1; >>>>> >>>>> BEGIN; >>>>> >>>>> declare c50 cursor for select * from tt1; >>>>> insert into tt1 values(1); >>>>> >>>>> insert into tt1 values(2); >>>>> fetch all from c50; >>>>> COMMIT; >>>>> truncate table tt1; >>>>> >>>>> >>>>> BEGIN; >>>>> insert into tt1 values(1); >>>>> insert into tt1 values(2); >>>>> >>>>> declare c50 cursor for select * from tt1; >>>>> insert into tt1 values(3); >>>>> >>>>> fetch all from c50; >>>>> COMMIT; >>>>> truncate table tt1; >>>>> >>>>> >>>>> BEGIN; >>>>> insert into tt1 values(1); >>>>> declare c50 cursor for select * from tt1; >>>>> insert into tt1 values(2); >>>>> declare c51 cursor for select * from tt1; >>>>> insert into tt1 values(3); >>>>> fetch all from c50; >>>>> fetch all from c51; >>>>> COMMIT; >>>>> truncate table tt1; >>>>> >>>>> >>>>> BEGIN; >>>>> insert into tt1 values(1); >>>>> declare c50 cursor for select * from tt1; >>>>> declare c51 cursor for select * from tt1; >>>>> insert into tt1 values(2); >>>>> insert into tt1 values(3); >>>>> fetch all from c50; >>>>> fetch all from c51; >>>>> COMMIT; >>>>> truncate table tt1; >>>>> >>>>> >>>>> On Fri, Jun 15, 2012 at 8:07 AM, Abbas Butt < >>>>> abb...@en...> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> In a multi-statement transaction each statement is given a command >>>>>> identifier >>>>>> starting from zero and incrementing for each statement. >>>>>> These command indentifers are required for extra tracking because each >>>>>> statement has its own visibility rules with in the transaction. >>>>>> For example, a cursor’s contents must remain unchanged even if later >>>>>> statements in the >>>>>> same transaction modify rows. Such tracking is implemented using >>>>>> system command id >>>>>> columns cmin/cmax, which is internally actually is a single column. 
>>>>>> >>>>>> cmin/cmax come into play in case of multi-statement transactions >>>>>> only, >>>>>> they are both zero otherwise. >>>>>> >>>>>> cmin "The command identifier of the statement within the inserting >>>>>> transaction." >>>>>> cmax "The command identifier of the statement within the deleting >>>>>> transaction." >>>>>> >>>>>> Here are the visibility rules (taken from comments of tqual.c) >>>>>> >>>>>> ( // A heap tuple is valid >>>>>> "now" iff >>>>>> Xmin == my-transaction && // inserted by the current >>>>>> transaction >>>>>> Cmin < my-command && // before this command, and >>>>>> ( >>>>>> Xmax is null || // the row has not been >>>>>> deleted, or >>>>>> ( >>>>>> Xmax == my-transaction && // it was deleted by the >>>>>> current transaction >>>>>> Cmax >= my-command // but not before this >>>>>> command, >>>>>> ) >>>>>> ) >>>>>> ) >>>>>> || // or >>>>>> ( >>>>>> Xmin is committed && // the row was inserted by a >>>>>> committed transaction, and >>>>>> ( >>>>>> Xmax is null || // the row has not been >>>>>> deleted, or >>>>>> ( >>>>>> Xmax == my-transaction && // the row is being deleted >>>>>> by this transaction >>>>>> Cmax >= my-command) || // but it's not deleted >>>>>> "yet", or >>>>>> ( >>>>>> Xmax != my-transaction && // the row was deleted by >>>>>> another transaction >>>>>> Xmax is not committed // that has not been >>>>>> committed >>>>>> ) >>>>>> ) >>>>>> ) >>>>>> ) >>>>>> >>>>>> Because cmin and cmax are internally a single system column, >>>>>> it is therefore not possible to simply record the status of a row >>>>>> that is created and expired in the same multi-statement transaction. >>>>>> For that reason, a special combo command id is created that >>>>>> references >>>>>> a local memory hash that contains the actual cmin and cmax values. >>>>>> It means that if combo id is being used the number we are seeing >>>>>> would not be the cmin or cmax it will be an index into a local >>>>>> array that contains a structure with has the actual cmin and cmax >>>>>> values. >>>>>> >>>>>> The following queries (taken mostly from >>>>>> http://momjian.us/main/writings/pgsql/mvcc.pdf) >>>>>> use the contrib module pageinspect, which allows >>>>>> visibility of internal heap page structures and all stored rows, >>>>>> including those not visible in the current snapshot. >>>>>> (Bit 0x0020 is defined as HEAP_COMBOCID.) 
>>>>>> >>>>>> We are exploring 3 examples here: >>>>>> 1) INSERT & DELETE in a single transaction >>>>>> 2) INSERT & UPDATE in a single transaction >>>>>> 3) INSERT from two different transactions & UPDATE from one >>>>>> >>>>>> test=# drop table mvcc_demo; >>>>>> DROP TABLE >>>>>> test=# >>>>>> test=# create table mvcc_demo (val int); >>>>>> CREATE TABLE >>>>>> test=# >>>>>> test=# TRUNCATE mvcc_demo; >>>>>> TRUNCATE TABLE >>>>>> test=# >>>>>> test=# BEGIN; >>>>>> BEGIN >>>>>> test=# DELETE FROM mvcc_demo; -- increment command id to show that >>>>>> combo id would be different >>>>>> DELETE 0 >>>>>> test=# DELETE FROM mvcc_demo; >>>>>> DELETE 0 >>>>>> test=# DELETE FROM mvcc_demo; >>>>>> DELETE 0 >>>>>> test=# INSERT INTO mvcc_demo VALUES (1); >>>>>> INSERT 0 1 >>>>>> test=# INSERT INTO mvcc_demo VALUES (2); >>>>>> INSERT 0 1 >>>>>> test=# INSERT INTO mvcc_demo VALUES (3); >>>>>> INSERT 0 1 >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+------+-----------+------------- >>>>>> 80685 | 0 | 3 | f >>>>>> 80685 | 0 | 4 | f >>>>>> 80685 | 0 | 5 | f >>>>>> (3 rows) >>>>>> >>>>>> test=# >>>>>> test=# DELETE FROM mvcc_demo; >>>>>> DELETE 3 >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+-------+-----------+------------- >>>>>> 80685 | 80685 | 0 | t >>>>>> 80685 | 80685 | 1 | t >>>>>> 80685 | 80685 | 2 | t >>>>>> (3 rows) >>>>>> >>>>>> Note that since is_combocid is true the numbers are not cmin/cmax >>>>>> they are actually >>>>>> the indexes of the internal array already explained above. 
>>>>>> combo id index 0 would contain cmin 3, cmax 6 >>>>>> combo id index 1 would contain cmin 4, cmax 6 >>>>>> combo id index 2 would contain cmin 5, cmax 6 >>>>>> >>>>>> test=# >>>>>> test=# END; >>>>>> COMMIT >>>>>> test=# >>>>>> test=# >>>>>> test=# TRUNCATE mvcc_demo; >>>>>> TRUNCATE TABLE >>>>>> test=# >>>>>> test=# >>>>>> test=# >>>>>> test=# BEGIN; >>>>>> BEGIN >>>>>> test=# INSERT INTO mvcc_demo VALUES (1); >>>>>> INSERT 0 1 >>>>>> test=# INSERT INTO mvcc_demo VALUES (2); >>>>>> INSERT 0 1 >>>>>> test=# INSERT INTO mvcc_demo VALUES (3); >>>>>> INSERT 0 1 >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+------+-----------+------------- >>>>>> 80675 | 0 | 0 | f >>>>>> 80675 | 0 | 1 | f >>>>>> 80675 | 0 | 2 | f >>>>>> (3 rows) >>>>>> >>>>>> test=# >>>>>> test=# UPDATE mvcc_demo SET val = val * 10; >>>>>> UPDATE 3 >>>>>> test=# >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+-------+-----------+------------- >>>>>> 80675 | 80675 | 0 | t >>>>>> 80675 | 80675 | 1 | t >>>>>> 80675 | 80675 | 2 | t >>>>>> 80675 | 0 | 3 | f >>>>>> 80675 | 0 | 3 | f >>>>>> 80675 | 0 | 3 | f >>>>>> (6 rows) >>>>>> >>>>>> test=# >>>>>> test=# END; >>>>>> COMMIT >>>>>> test=# >>>>>> test=# >>>>>> test=# TRUNCATE mvcc_demo; >>>>>> TRUNCATE TABLE >>>>>> test=# >>>>>> >>>>>> -- From one psql issue >>>>>> test=# INSERT INTO mvcc_demo VALUES (1); >>>>>> INSERT 0 1 >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+------+-----------+------------- >>>>>> 80677 | 0 | 0 | f >>>>>> (1 row) >>>>>> >>>>>> >>>>>> test=# -- From another issue >>>>>> test=# BEGIN; >>>>>> BEGIN >>>>>> test=# INSERT INTO mvcc_demo VALUES (2); >>>>>> INSERT 0 1 >>>>>> test=# INSERT INTO mvcc_demo VALUES (3); >>>>>> INSERT 0 1 >>>>>> test=# INSERT INTO mvcc_demo VALUES (4); >>>>>> INSERT 0 1 >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+------+-----------+------------- >>>>>> 80677 | 0 | 0 | f >>>>>> 80678 | 0 | 0 | f >>>>>> 80678 | 0 | 1 | f >>>>>> 80678 | 0 | 2 | f >>>>>> (4 rows) >>>>>> >>>>>> test=# >>>>>> test=# UPDATE mvcc_demo SET val = val * 10; >>>>>> UPDATE 4 >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> 
test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+-------+-----------+------------- >>>>>> 80678 | 80678 | 0 | t >>>>>> 80678 | 80678 | 1 | t >>>>>> 80678 | 80678 | 2 | t >>>>>> 80677 | 80678 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> (8 rows) >>>>>> >>>>>> test=# >>>>>> >>>>>> test=# -- Before finishing this, issue these from the first psql >>>>>> test=# SELECT t_xmin AS xmin, >>>>>> test-# t_xmax::text::int8 AS xmax, >>>>>> test-# t_field3::text::int8 AS cmin_cmax, >>>>>> test-# (t_infomask::integer & X'0020'::integer)::bool AS >>>>>> is_combocid >>>>>> test-# FROM heap_page_items(get_raw_page('mvcc_demo', 0)) >>>>>> test-# ORDER BY 2 DESC, 3; >>>>>> xmin | xmax | cmin_cmax | is_combocid >>>>>> -------+-------+-----------+------------- >>>>>> 80678 | 80678 | 0 | t >>>>>> 80678 | 80678 | 1 | t >>>>>> 80678 | 80678 | 2 | t >>>>>> 80677 | 80678 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> 80678 | 0 | 3 | f >>>>>> (8 rows) >>>>>> >>>>>> test=# END; >>>>>> COMMIT >>>>>> >>>>>> >>>>>> Now consider the case we are trying to solve >>>>>> >>>>>> drop table tt1; >>>>>> create table tt1(f1 int); >>>>>> >>>>>> BEGIN; >>>>>> insert into tt1 values(1); >>>>>> declare c50 cursor for select * from tt1; -- should show one row only >>>>>> insert into tt1 values(2); >>>>>> fetch all from c50; >>>>>> COMMIT; >>>>>> >>>>>> >>>>>> Consider Data node 1 log >>>>>> >>>>>> (a) [exec_simple_query][1026][START TRANSACTION ISOLATION LEVEL read >>>>>> committed READ WRITE] >>>>>> (b) [exec_simple_query][1026][drop table tt1;] >>>>>> (c) [exec_simple_query][1026][PREPARE TRANSACTION 'T21075'] >>>>>> (d) [exec_simple_query][1026][COMMIT PREPARED 'T21075'] >>>>>> (e) [exec_simple_query][1026][START TRANSACTION ISOLATION LEVEL read >>>>>> committed READ WRITE] >>>>>> (f) [exec_simple_query][1026][create table tt1(f1 int);] >>>>>> (g) [exec_simple_query][1026][PREPARE TRANSACTION 'T21077'] >>>>>> (h) [exec_simple_query][1026][COMMIT PREPARED 'T21077'] >>>>>> (i) [exec_simple_query][1026][START TRANSACTION ISOLATION LEVEL read >>>>>> committed READ WRITE] >>>>>> (j) [exec_simple_query][1026][INSERT INTO tt1 (f1) VALUES (1)] >>>>>> (k) [exec_simple_query][1026][INSERT INTO tt1 (f1) VALUES (2)] >>>>>> (l) [PostgresMain][4155][SELECT tt1.f1, tt1.ctid, pgxc_node_str() >>>>>> FROM tt1] >>>>>> (m) [exec_simple_query][1026][COMMIT TRANSACTION] >>>>>> >>>>>> The cursor currently shows both inserted rows because command id at >>>>>> data node in >>>>>> step (j) is 0 >>>>>> step (k) is 1 & >>>>>> step (l) is 2 >>>>>> >>>>>> Where as we need command ids to be >>>>>> >>>>>> step (j) should be 0 >>>>>> step (k) should be 2 & >>>>>> step (l) should be 1 >>>>>> >>>>>> This will solve the cursor visibility problem. >>>>>> >>>>>> To implement this I suggest we send command IDs to data nodes from >>>>>> the coordinator >>>>>> like we send gxid. The only difference will be that we do not need to >>>>>> take command IDs >>>>>> from GTM since they are only valid with in the transaction. 
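The effect of the proposed coordinator-assigned command ids, with step (j) getting 0, the cursor in step (l) getting 1, and step (k) getting 2, can be sketched with a toy model in C: the cursor only sees rows whose cmin is below the command id it was declared with, so shipping the coordinator's numbering makes the second insert invisible to it. This is an illustration of the idea only, not Postgres-XC code, and every name in it is invented.

#include <stdio.h>
#include <stdint.h>

typedef uint32_t CommandId;

typedef struct Row
{
    int       val;
    CommandId cmin;     /* command id shipped by the coordinator */
} Row;

static Row table[16];
static int nrows = 0;

/* Model of an INSERT arriving at the Datanode with a coordinator cid. */
static void
remote_insert(int val, CommandId cid)
{
    table[nrows].val = val;
    table[nrows].cmin = cid;
    nrows++;
}

/* Same-transaction visibility: only rows created before the cursor. */
static void
fetch_cursor(CommandId cursor_cid)
{
    int i;

    for (i = 0; i < nrows; i++)
        if (table[i].cmin < cursor_cid)
            printf("cursor sees %d\n", table[i].val);
}

int main(void)
{
    remote_insert(1, 0);        /* step (j): first insert gets cid 0 */
    CommandId cursor_cid = 1;   /* step (l): cursor declared with cid 1 */
    remote_insert(2, 2);        /* step (k): second insert gets cid 2 */

    fetch_cursor(cursor_cid);   /* prints only 1, the required behaviour */
    return 0;
}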
>>>>>> >>>>>> See this example >>>>>> >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> ------+------+------+------+---- >>>>>> (0 rows) >>>>>> >>>>>> test=# begin; >>>>>> BEGIN >>>>>> test=# insert into tt1 values(1); >>>>>> INSERT 0 1 >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> (1 row) >>>>>> >>>>>> test=# insert into tt1 values(2); >>>>>> INSERT 0 1 >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> 80615 | 0 | 1 | 1 | 2 >>>>>> (2 rows) >>>>>> >>>>>> test=# insert into tt1 values(3); >>>>>> INSERT 0 1 >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> 80615 | 0 | 1 | 1 | 2 >>>>>> 80615 | 0 | 2 | 2 | 3 >>>>>> (3 rows) >>>>>> >>>>>> test=# insert into tt1 values(4); >>>>>> INSERT 0 1 >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> 80615 | 0 | 1 | 1 | 2 >>>>>> 80615 | 0 | 2 | 2 | 3 >>>>>> 80615 | 0 | 3 | 3 | 4 >>>>>> (4 rows) >>>>>> >>>>>> test=# end; >>>>>> COMMIT >>>>>> test=# >>>>>> test=# >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> 80615 | 0 | 1 | 1 | 2 >>>>>> 80615 | 0 | 2 | 2 | 3 >>>>>> 80615 | 0 | 3 | 3 | 4 >>>>>> (4 rows) >>>>>> >>>>>> test=# insert into tt1 values(5); >>>>>> INSERT 0 1 >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> 80615 | 0 | 1 | 1 | 2 >>>>>> 80615 | 0 | 2 | 2 | 3 >>>>>> 80615 | 0 | 3 | 3 | 4 >>>>>> 80616 | 0 | 0 | 0 | 5 >>>>>> (5 rows) >>>>>> >>>>>> test=# insert into tt1 values(6); >>>>>> INSERT 0 1 >>>>>> test=# >>>>>> test=# >>>>>> test=# select xmin,xmax,cmin,cmax,* from tt1; >>>>>> xmin | xmax | cmin | cmax | f1 >>>>>> -------+------+------+------+---- >>>>>> 80615 | 0 | 0 | 0 | 1 >>>>>> 80615 | 0 | 1 | 1 | 2 >>>>>> 80615 | 0 | 2 | 2 | 3 >>>>>> 80615 | 0 | 3 | 3 | 4 >>>>>> 80616 | 0 | 0 | 0 | 5 >>>>>> 80617 | 0 | 0 | 0 | 6 >>>>>> (6 rows) >>>>>> >>>>>> Note that at the end of the multi-statement transaction the command >>>>>> id gets reset to zero. >>>>>> >>>>>> -- >>>>>> Abbas >>>>>> Architect >>>>>> EnterpriseDB Corporation >>>>>> The Enterprise PostgreSQL Company >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> -- >>>>> Abbas >>>>> Architect >>>>> EnterpriseDB Corporation >>>>> The Enterprise PostgreSQL Company >>>>> >>>>> Phone: 92-334-5100153 >>>>> >>>>> Website: www.enterprisedb.com >>>>> EnterpriseDB Blog: http://blogs.enterprisedb.com/ >>>>> Follow us on Twitter: http://www.twitter.com/enterprisedb >>>>> >>>>> This e-mail message (and any attachment) is intended for the use of >>>>> the individual or entity to whom it is addressed. This message >>>>> contains information from EnterpriseDB Corporation that may be >>>>> privileged, confidential, or exempt from disclosure under applicable >>>>> law. 
If you are not the intended recipient or authorized to receive >>>>> this for the intended recipient, any use, dissemination, distribution, >>>>> retention, archiving, or copying of this communication is strictly >>>>> prohibited. If you have received this e-mail in error, please notify >>>>> the sender immediately by reply e-mail and delete this message. >>>>> >>>> >>>> -- >>>> Michael Paquier >>>> http://michael.otacoo.com >>>> >>> >>> -- >>> Abbas >>> Architect >>> EnterpriseDB Corporation >>> The Enterprise PostgreSQL Company >>> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Enterprise Postgres Company >> > > -- > Abbas > Architect > EnterpriseDB Corporation > The Enterprise PostgreSQL Company 
-- -- Abbas Architect EnterpriseDB Corporation The Enterprise PostgreSQL Company Phone: 92-334-5100153 Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb |
From: Koichi S. <koi...@gm...> - 2012-07-04 03:15:19
|
We don't have to do such a round robin. As Michael suggested, pg_ctl status works well even with datanodes. It doesn't issue any query but checks if the postmaster is running. I think it is sufficient. Only one restriction is pg_stl status determines zombie process as running. Regards; ---------- Koichi Suzuki 2012/7/4 Nikhil Sontakke <ni...@st...>: >> I also believe it's not a good idea to monitor a datanode through a >> coordinator using EXECUTE DIRECT because the latter may be failed >> while the whole cluster is in operation. >> > > Well, if there are multiple failures we ought to know about them > anyways. So if this particular coordinator fails the monitor tells us > about it first. We fix it and then move on to the datanode failure > detection. Since the datanodes have to be reachable via coordinators > and we have multiple coordinators around to load balance anyways, I > still think EXECUTE DIRECT via the coordinator node is a decent idea. > If we can round robin the calls via all the coordinators that would be > better too I think. > > Regards, > Nikhils > > >> Regards; >> ---------- >> Koichi Suzuki >> >> >> 2012/7/4 Koichi Suzuki <koi...@gm...>: >>> The background of xc_watchdog is to provide quicker means to detect >>> node fault. I understand that it is not compatible with what we're >>> doing in conventional PG applications, which are mostly based upon >>> psql -c 'select 1'. It takes at most 60sec to detect the error (TCP >>> timeout value). Some applications will be satisfied with this and >>> some may not. This is raised at the clustering summit in Ottawa and >>> the suggestion was to have this kind of means (watchdog). >>> >>> I don't know if PG people are interested in this now. Maybe we >>> should wait until such fault detection is more realistic issue. >>> Implementation is very straightforward. >>> >>> For datanode, I don't like to ask applications to connect to it >>> directly using psql because it is a kind of tricky use and it may mean >>> that we allow applications to connect to datanodes directly. So I >>> think we should encapsulate this with dedicated command like >>> xc_monitor. Xc_ping sounds good too but "ping" reminds me >>> consecutive monitoring. Current practice needs only one monitoring. >>> So I'd like xc_monitor (or node_monitor). >>> >>> Command like 'xc_monitor -Z nodetype -h host -p port' will not need >>> any modification to the core. Will be submitted soon as contrib >>> module. >>> >>> Regards; >>> ---------- >>> Koichi Suzuki >>> >>> >>> 2012/7/4 Nikhil Sontakke <ni...@st...>: >>>>> Are there people with a similar opinion to mine??? >>>>> >>>> >>>> +1 >>>> >>>> IMO too we should not be making any too invasive internal changes to >>>> support monitoring. What would be better would be to maybe allow >>>> commands which can be scripted and which can work against each of the >>>> components. >>>> >>>> For example, for the coordinator/datanode periodic "SELECT 1" commands >>>> should be good enough. Even doing an EXECUTE DIRECT via a coordinator >>>> to the datanodes will help. >>>> >>>> For GTM/GTM_Standy/GTM_Proxy components we should introduce "gtm_ctl >>>> ping" kinds of commands which will basically connect to them and see >>>> that they are responding ok. >>>> >>>> Such interfaces make it really easy for monitoring solutions like >>>> nagios, zabbix etc. to monitor them. These tools have been used for a >>>> while now to monitor Postgres and it should be a natural logical >>>> evolution for users to see them being used for PG XC. 
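For the external xc_monitor / node_monitor command being discussed, the simplest possible probe needs no core changes at all: open a TCP connection to the node's host and port and report whether anything accepted it. The sketch below is only that bare-bones check, not the contrib module itself (a real tool would also want pg_ctl/gtm_ctl status or a protocol-level handshake); the program name and structure are invented here.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

/* Usage: ./node_probe <host> <port>  -- exits 0 if the port accepts. */
int main(int argc, char **argv)
{
    struct addrinfo hints, *res, *ai;
    int ok = 0;

    if (argc != 3)
    {
        fprintf(stderr, "usage: %s host port\n", argv[0]);
        return 2;
    }

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(argv[1], argv[2], &hints, &res) != 0)
    {
        fprintf(stderr, "cannot resolve %s\n", argv[1]);
        return 2;
    }

    for (ai = res; ai != NULL && !ok; ai = ai->ai_next)
    {
        int fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);

        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            ok = 1;             /* something is listening on the port */
        close(fd);
    }
    freeaddrinfo(res);

    printf("%s:%s is %s\n", argv[1], argv[2], ok ? "up" : "down");
    return ok ? 0 : 1;
}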
>>>> >>>> Regards, >>>> Nikhils >>>> -- >>>> StormDB - http://www.stormdb.com >>>> The Database Cloud > > > > -- > StormDB - http://www.stormdb.com > The Database Cloud |