From: Mason S. <mas...@en...> - 2010-12-14 23:26:41
|
> Hi all,
>
> Here is the fix I propose based on the idea I proposed in a previous mail.
> If a prepared transaction, partially committed, is aborted, this patch
> gathers the handles to nodes where an error occurred and saves them on GTM.
>
> The prepared transaction partially committed is kept alive on GTM, so
> other transactions cannot see the partially committed results.
> To complete the commit of the prepared transaction partially committed,
> it is necessary to issue a COMMIT PREPARED 'gid'.
> Once this command is issued, transaction will finish its commit properly.
>
> Mason, this solves the problem you saw when you made your tests.
> It also respects the rule that a 2PC transaction partially committed
> has to be committed.

Just took a brief look so far. Seems better. I understand that recovery and HA
are in development, that things are being done to lay the groundwork and
improve, and that with this patch we are not yet trying to handle any and every
situation. What happens, though, if the coordinator fails before it can update
GTM?

Also, I did a test and got this:

WARNING:  unexpected EOF on datanode connection
WARNING:  Connection to Datanode 1 has unexpected state 1 and will be dropped
ERROR:  Could not commit prepared transaction implicitely
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

#0  0x907afe42 in kill$UNIX2003 ()
#1  0x9082223a in raise ()
#2  0x9082e679 in abort ()
#3  0x003917ce in ExceptionalCondition (conditionName=0x433f6c "!(((proc->xid) != ((TransactionId) 0)))", errorType=0x3ecfd4 "FailedAssertion", fileName=0x433f50 "procarray.c", lineNumber=283) at assert.c:57
#4  0x00280916 in ProcArrayEndTransaction (proc=0x41cca70, latestXid=1018) at procarray.c:283
#5  0x0005905c in AbortTransaction () at xact.c:2525
#6  0x00059a6e in AbortCurrentTransaction () at xact.c:3001
#7  0x00059b10 in AbortCurrentTransactionOnce () at xact.c:3094
#8  0x0029c8d6 in PostgresMain (argc=4, argv=0x1002ff8, username=0x1002fc8 "masonsharp") at postgres.c:3622
#9  0x0025851c in BackendRun (port=0x7016f0) at postmaster.c:3607
#10 0x00257883 in BackendStartup (port=0x7016f0) at postmaster.c:3216
#11 0x002542b5 in ServerLoop () at postmaster.c:1445
#12 0x002538c1 in PostmasterMain (argc=5, argv=0x7005a0) at postmaster.c:1098
#13 0x001cf2f1 in main (argc=5, argv=0x7005a0) at main.c:188

I did the same test as before: I killed a data node after it received a COMMIT
PREPARED message. I think we should be able to continue. The good news is that
I should not see partially committed data, and I do not. But if I try to
manually commit it from a new connection to the coordinator:

mds=# COMMIT PREPARED 'T1018';
ERROR:  Could not get GID data from GTM

Maybe GTM removed this info when the coordinator disconnected? (Or maybe
implicit transactions are only associated with a certain connection?) I can see
the transaction on one data node, but not the other.

Ideally we would come up with a scheme where, if the coordinator session does
not notify GTM, we can somehow recover. Maybe this is my fault; I believe I
advocated avoiding the extra work for implicit 2PC in the name of
performance. :-)

We can think about what to do in the short term, and how to handle it in the
long term. In the short term, your approach may be good enough once debugged,
since it is a relatively rare case.

Long term, we could think about a thread that runs on GTM and wakes up every 30
or 60 seconds or so (configurable), collects implicit transactions from the
nodes (extension to pg_prepared_xacts required?), and, if it sees that an XID
does not have an associated live connection, knows that something went awry. It
then checks whether the transaction committed on any of the nodes. If not, roll
it back on all of them; if it committed on at least one, commit it on all. If
one of the data nodes is down, it does nothing, perhaps logging a warning. This
would avoid user intervention, and would be pretty cool. Some of this code you
may already have been working on for recovery, and we could reuse it here.

Regards,

Mason

> Thanks,
>
> --
> Michael Paquier
> http://michaelpq.users.sourceforge.net

--
Mason Sharp
EnterpriseDB Corporation
The Enterprise Postgres Company
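For illustration, here is the kind of per-node check such an agent (or a DBA,
today) could run; the 60-second threshold and the 'T<gxid>' GID convention are
assumptions taken from the tests in this thread, not something the patch
provides:

    -- On each Coordinator and Datanode: list implicit 2PC transactions that
    -- have stayed prepared longer than the polling interval (illustrative).
    SELECT gid, transaction, prepared, database
    FROM   pg_prepared_xacts
    WHERE  gid LIKE 'T%'
      AND  prepared < now() - interval '60 seconds';

    -- If the same GID is already committed on at least one node, finish the
    -- commit everywhere it is still prepared; otherwise roll it back everywhere.
    COMMIT PREPARED 'T1018';
    -- ROLLBACK PREPARED 'T1018';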
From: Michael P. <mic...@gm...> - 2010-12-14 08:07:59
|
Hi all,

Here is the fix I propose based on the idea I proposed in a previous mail. If a
prepared transaction, partially committed, is aborted, this patch gathers the
handles to the nodes where an error occurred and saves them on GTM.

The partially committed prepared transaction is kept alive on GTM, so other
transactions cannot see the partially committed results. To complete its
commit, it is necessary to issue a COMMIT PREPARED 'gid'. Once this command is
issued, the transaction finishes its commit properly.

Mason, this solves the problem you saw when you ran your tests. It also
respects the rule that a 2PC transaction that has been partially committed has
to be committed.

Thanks,

--
Michael Paquier
http://michaelpq.users.sourceforge.net
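For reference, the manual completion step described above looks roughly like
this; the GID 'T10312' is the one observed on the failed Datanode in Mason's
test elsewhere in this thread, and looking it up via pg_prepared_xacts is
standard PostgreSQL rather than something added by the patch:

    -- On a node where the transaction is still only prepared:
    SELECT gid, prepared FROM pg_prepared_xacts;
    --   gid   |           prepared
    -- --------+-------------------------------
    --  T10312 | 2010-12-12 12:04:30.946287-05

    -- Then, from a Coordinator session, finish the partial commit:
    COMMIT PREPARED 'T10312';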
From: Koichi S. <koi...@gm...> - 2010-12-14 01:15:11
|
Hi, please see inline... ---------- Koichi Suzuki 2010/12/13 Mason Sharp <mas...@en...>: > On 12/12/10 9:28 PM, Michael Paquier wrote: >> >> I reviewed, and I thought it looked good, except for a possible issue with >> committing. >> >> I wanted to test what happened with implicit transactions when there was a >> failure. >> >> I executed this in one session: >> >> mds1=# begin; >> BEGIN >> mds1=# insert into mds1 values (1,1); >> INSERT 0 1 >> mds1=# insert into mds1 values (2,2); >> INSERT 0 1 >> mds1=# commit; >> >> Before committing, I fired up gdb for a coordinator session and a data >> node session. >> >> On one of the data nodes, when the COMMIT PREPARED was received, I killed >> the backend to see what would happen. On the Coordinator I saw this: >> >> >> WARNING: unexpected EOF on datanode connection >> WARNING: Connection to Datanode 1 has unexpected state 1 and will be >> dropped >> WARNING: Connection to Datanode 2 has unexpected state 1 and will be >> dropped >> >> ERROR: Could not commit prepared transaction implicitely >> PANIC: cannot abort transaction 10312, it was already committed >> server closed the connection unexpectedly >> This probably means the server terminated abnormally >> before or while processing the request. >> The connection to the server was lost. Attempting reset: Failed. >> >> I am not sure we should be aborting 10312, since it was committed on one >> of the nodes. It corresponds to the original prepared transaction. We also >> do not want a panic to happen. > > This has to be corrected. > If a PANIC happens on Coordinators each time a Datanode crashes, a simple > node crash would mess up the whole cluster. > It is a real problem I think. > > Yes. > > >> >> Next, I started a new coordinator session: >> >> mds1=# select * from mds1; >> col1 | col2 >> ------+------ >> 2 | 2 >> (1 row) >> >> >> I only see one of the rows. I thought, well, ok, we cannot undo a commit, >> and the other one must commit eventually. I was able to continue working >> normally: >> >> mds1=# insert into mds1 values (3,3); >> INSERT 0 1 >> mds1=# insert into mds1 values (4,4); >> INSERT 0 1 >> mds1=# insert into mds1 values (5,5); >> INSERT 0 1 >> mds1=# insert into mds1 values (6,6); >> INSERT 0 1 Are these statements run as a transaction block or did they run as "autocommit" statements? >> >> mds1=# select xmin,* from mds1; >> xmin | col1 | col2 >> -------+------+------ >> 10420 | 4 | 4 >> 10422 | 6 | 6 >> 10312 | 2 | 2 >> 10415 | 3 | 3 >> 10421 | 5 | 5 >> (5 rows) >> >> >> Note xmin keeps increasing because we closed the transaction on GTM at the >> "finish:" label. This may or may not be ok. > > This should be OK, no? If the above statements ran in "autocommit" mode, each statement ran as separate transaction. Xmin just indicates GXID which "created" the row. To determine if it is visible or not, we have to visit CLOG (if GXID is not "frozen") and the list of live transactions to see if it is running, committed or aborted. Then we can determine if a given row should be visible or not. Therefore, if the creator transaction is left just "PREPARED", the creator transaction information will remain in PgProc and is regarded "running", thus it should be regarded "invisible" from other transactions. Similar consideration should be made to see "xmac" value of the row, in the case of "update" or "delete" statement. Hope it helps. --- Koichi Suzuki > > Not necessarily. 
> > >> >> Meanwhile, on the failed data node: >> >> mds1=# select * from pg_prepared_xacts; >> WARNING: Do not have a GTM snapshot available >> WARNING: Do not have a GTM snapshot available >> transaction | gid | prepared | owner | >> database >> >> -------------+--------+-------------------------------+------------+---------- >> 10312 | T10312 | 2010-12-12 12:04:30.946287-05 | xxxxxx | mds1 >> (1 row) >> >> The transaction id is 10312. Normally this would still appear in >> snapshots, but we close it on GTM. >> >> What should we do? >> >> - We could leave as is. We may in the future have an XC monitoring process >> look for possible 2PC anomalies occasionally and send an alert so that they >> could be resolved by a DBA. > > I was thinking about an external utility that could clean up partially > committed or prepared transactions when a node crash happens. > This is a part of HA, so I think the only thing that should be corrected now > is the way errors are managed in the case of a partially committed prepared > transaction on nodes. > A PANIC is not acceptable for this case. > >> >> - We could instead choose not close out the transaction on GTM, so that >> the xid is still in snapshots. We could test if the rows are viewable or >> not. This could result in other side effects, but without further testing, I >> am guessing this may be similar to when an existing statement is running and >> cannot see a previously committed transaction that is open in its snapshot. >> So, I am thinking this is probably the preferable option (keeping it open on >> GTM until committed on all nodes), but we should test it. In any event, we >> should also fix the panic. > > If we let it open the transaction open on GTM, how do we know the GXID that > has been used for Commit (different from the one that has been used for > PREPARE as I recall)? > > We can test the behavior to see if it is ok to close this one out, > otherwise, we have more work to do... > > If we do a Commit prepare on the remaining node that crashed, we have to > commit the former PREPARE GXID, the former COMMIT PREPARED GXID and also the > GXID that is used to issue the new COMMIT PREPARED on the remaining node. > > It is easy to get the GXID used for former PREPARE and new COMMIT PREPARED. > But there is no real way yet to get back the GXID used for the former COMMIT > PREPARE. > I would see two ways to correct that: > 1) Save the former COMMIT PREPARED GXID in GTM, but this would really impact > performance. > 2) Save the COMMIT PREPARED GXID on Coordinator and let the GXACT open on > Coordinator (would be the best solution, but the transaction has already > been committed on Coordinator). > > I think we need to research the effects of this and see how the system > behaves if the partially failed commit prepared GXID is closed. I suppose it > could cause a problem with viewing pg_prepared_xacts. We don't want the > hint bits to get updated.... well, the first XID will be lower, so the lower > open xmin should keep this from having the tuple frozen. > > That's why I think the transaction should be to close the transaction on > GTM, and a monitoring agent would be in charge to commit on the remaining > nodes that crashed if a partial COMMIT has been done. > > From above, the node is still active and the query after the transaction is > returning partial results. It should be an all or nothing operation. If we > close the transaction on GTM, then it means that Postgres-XC is not atomic. > I think it is important to be ACID compliant. 
> > I think we should fix the panic, then test how the system behaves if, even > though the transaction is committed on one node, if we keep the transaction > open. The XID will appear in all the snapshots and the row should not be > viewable, and we can make sure that vacuum is also ok (should be). If it > works ok, then I think we should keep the transaction open on GTM until all > components have committed. > > > Btw, it is a complicated point, so other's opinion is completely welcome. > > Yes. > > Thanks, > > Mason > > Regards, > > -- > Michael Paquier > http://michaelpq.users.sourceforge.net > > > > -- > Mason Sharp > EnterpriseDB Corporation > The Enterprise Postgres Company > > > This e-mail message (and any attachment) is intended for the use of > the individual or entity to whom it is addressed. This message > contains information from EnterpriseDB Corporation that may be > privileged, confidential, or exempt from disclosure under applicable > law. If you are not the intended recipient or authorized to receive > this for the intended recipient, any use, dissemination, distribution, > retention, archiving, or copying of this communication is strictly > prohibited. If you have received this e-mail in error, please notify > the sender immediately by reply e-mail and delete this message. > > ------------------------------------------------------------------------------ > Oracle to DB2 Conversion Guide: Learn learn about native support for PL/SQL, > new data types, scalar functions, improved concurrency, built-in packages, > OCI, SQL*Plus, data movement tools, best practices and more. > http://p.sf.net/sfu/oracle-sfdev2dev > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > |
From: Michael P. <mic...@gm...> - 2010-12-14 00:59:53
|
>> mds1=# select xmin,* from mds1;
>>  xmin  | col1 | col2
>> -------+------+------
>>  10420 |    4 |    4
>>  10422 |    6 |    6
>>  10312 |    2 |    2
>>  10415 |    3 |    3
>>  10421 |    5 |    5
>> (5 rows)
>>
>> Note xmin keeps increasing because we closed the transaction on GTM at the
>> "finish:" label. This may or may not be ok.
>>
>> This should be OK, no?
>
> Not necessarily.

I see: the transaction has been only partially committed, so the xmin should
keep the value of the oldest GXID (in this case the one that has not been
completely committed).

>> If we let the transaction open on GTM, how do we know the GXID that has been
>> used for Commit (different from the one that has been used for PREPARE as I
>> recall)?
>
> We can test the behavior to see if it is ok to close this one out;
> otherwise, we have more work to do...

OK, I see, so do not commit the transaction on GTM...

With the current patch, we can know whether implicit 2PC is used thanks to the
CommitTransactionID I added in GlobalTransactionData for implicit 2PC. If this
value is set, it means that the transaction has been committed on the
Coordinator and that this Coordinator is using implicit 2PC. This value being
set also means that the nodes are partially committed or completely prepared.

Here is my proposal. When an ABORT happens and CommitTransactionID is set, we
do not commit the transaction ID used for PREPARE, but we commit
CommitTransactionID (no effect on visibility). On the other hand, we register
the transaction as still prepared on GTM when the abort happens. This could be
done with the API used for explicit 2PC. Then, if there is a conflict, the DBA
or a monitoring tool could use explicit 2PC to finish the commit of the
partially prepared transaction. This could do the trick. What do you think
about that?

> I think we should fix the panic, then test how the system behaves if, even
> though the transaction is committed on one node, we keep the transaction
> open. The XID will appear in all the snapshots and the row should not be
> viewable, and we can make sure that vacuum is also ok (should be). If it
> works ok, then I think we should keep the transaction open on GTM until all
> components have committed.

The PANIC can be easily fixed. Without testing, I would say that the system may
be OK, as the transaction ID is still kept alive in snapshots. With that, the
transaction is seen as alive in the cluster.

--
Michael Paquier
http://michaelpq.users.sourceforge.net
From: Mason S. <mas...@en...> - 2010-12-13 15:04:14
|
On 12/12/10 9:28 PM, Michael Paquier wrote: > > > I reviewed, and I thought it looked good, except for a possible > issue with committing. > > I wanted to test what happened with implicit transactions when > there was a failure. > > I executed this in one session: > > mds1=# begin; > BEGIN > mds1=# insert into mds1 values (1,1); > INSERT 0 1 > mds1=# insert into mds1 values (2,2); > INSERT 0 1 > mds1=# commit; > > Before committing, I fired up gdb for a coordinator session and a > data node session. > > On one of the data nodes, when the COMMIT PREPARED was received, I > killed the backend to see what would happen. On the Coordinator I > saw this: > > > WARNING: unexpected EOF on datanode connection > WARNING: Connection to Datanode 1 has unexpected state 1 and will > be dropped > WARNING: Connection to Datanode 2 has unexpected state 1 and will > be dropped > > ERROR: Could not commit prepared transaction implicitely > PANIC: cannot abort transaction 10312, it was already committed > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > > I am not sure we should be aborting 10312, since it was committed > on one of the nodes. It corresponds to the original prepared > transaction. We also do not want a panic to happen. > > This has to be corrected. > If a PANIC happens on Coordinators each time a Datanode crashes, a > simple node crash would mess up the whole cluster. > It is a real problem I think. Yes. > > > Next, I started a new coordinator session: > > mds1=# select * from mds1; > col1 | col2 > ------+------ > 2 | 2 > (1 row) > > > I only see one of the rows. I thought, well, ok, we cannot undo a > commit, and the other one must commit eventually. I was able to > continue working normally: > > mds1=# insert into mds1 values (3,3); > INSERT 0 1 > mds1=# insert into mds1 values (4,4); > INSERT 0 1 > mds1=# insert into mds1 values (5,5); > INSERT 0 1 > mds1=# insert into mds1 values (6,6); > INSERT 0 1 > > mds1=# select xmin,* from mds1; > xmin | col1 | col2 > -------+------+------ > 10420 | 4 | 4 > 10422 | 6 | 6 > 10312 | 2 | 2 > 10415 | 3 | 3 > 10421 | 5 | 5 > (5 rows) > > > Note xmin keeps increasing because we closed the transaction on > GTM at the "finish:" label. This may or may not be ok. > > This should be OK, no? Not necessarily. > > > Meanwhile, on the failed data node: > > mds1=# select * from pg_prepared_xacts; > WARNING: Do not have a GTM snapshot available > WARNING: Do not have a GTM snapshot available > transaction | gid | prepared | owner > | database > -------------+--------+-------------------------------+------------+---------- > 10312 | T10312 | 2010-12-12 12:04:30.946287-05 | xxxxxx | mds1 > (1 row) > > The transaction id is 10312. Normally this would still appear in > snapshots, but we close it on GTM. > > What should we do? > > - We could leave as is. We may in the future have an XC monitoring > process look for possible 2PC anomalies occasionally and send an > alert so that they could be resolved by a DBA. > > I was thinking about an external utility that could clean up partially > committed or prepared transactions when a node crash happens. > This is a part of HA, so I think the only thing that should be > corrected now is the way errors are managed in the case of a partially > committed prepared transaction on nodes. > A PANIC is not acceptable for this case. 
> > > - We could instead choose not close out the transaction on GTM, so > that the xid is still in snapshots. We could test if the rows are > viewable or not. This could result in other side effects, but > without further testing, I am guessing this may be similar to when > an existing statement is running and cannot see a previously > committed transaction that is open in its snapshot. So, I am > thinking this is probably the preferable option (keeping it open > on GTM until committed on all nodes), but we should test it. In > any event, we should also fix the panic. > > > If we let it open the transaction open on GTM, how do we know the GXID > that has been used for Commit (different from the one that has been > used for PREPARE as I recall)? We can test the behavior to see if it is ok to close this one out, otherwise, we have more work to do... > If we do a Commit prepare on the remaining node that crashed, we have > to commit the former PREPARE GXID, the former COMMIT PREPARED GXID and > also the GXID that is used to issue the new COMMIT PREPARED on the > remaining node. > It is easy to get the GXID used for former PREPARE and new COMMIT > PREPARED. But there is no real way yet to get back the GXID used for > the former COMMIT PREPARE. > I would see two ways to correct that: > 1) Save the former COMMIT PREPARED GXID in GTM, but this would really > impact performance. > 2) Save the COMMIT PREPARED GXID on Coordinator and let the GXACT open > on Coordinator (would be the best solution, but the transaction has > already been committed on Coordinator). > I think we need to research the effects of this and see how the system behaves if the partially failed commit prepared GXID is closed. I suppose it could cause a problem with viewing pg_prepared_xacts. We don't want the hint bits to get updated.... well, the first XID will be lower, so the lower open xmin should keep this from having the tuple frozen. > That's why I think the transaction should be to close the transaction > on GTM, and a monitoring agent would be in charge to commit on the > remaining nodes that crashed if a partial COMMIT has been done. From above, the node is still active and the query after the transaction is returning partial results. It should be an all or nothing operation. If we close the transaction on GTM, then it means that Postgres-XC is not atomic. I think it is important to be ACID compliant. I think we should fix the panic, then test how the system behaves if, even though the transaction is committed on one node, if we keep the transaction open. The XID will appear in all the snapshots and the row should not be viewable, and we can make sure that vacuum is also ok (should be). If it works ok, then I think we should keep the transaction open on GTM until all components have committed. > > Btw, it is a complicated point, so other's opinion is completely welcome. > Yes. Thanks, Mason > Regards, > > -- > Michael Paquier > http://michaelpq.users.sourceforge.net > -- Mason Sharp EnterpriseDB Corporation The Enterprise Postgres Company This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. 
If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |
From: xiong w. <wan...@gm...> - 2010-12-13 08:46:20
|
Dears,

The attachment is a patch for multi-row INSERT. It assigns values according to
the distribution method. If the table is distributed by hash, the values are
routed to the appropriate datanodes according to the partition key. If the
table is distributed by round robin, the values are spread evenly across the
datanodes according to the round-robin pointer. Otherwise, the statement is not
processed.

Your advice will be appreciated.

Regards,
Benny
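For illustration, the two cases the patch is meant to cover would look like the
following; the exact spelling of the DISTRIBUTE BY clauses here is an
assumption about the Postgres-XC CREATE TABLE syntax, not taken from the patch:

    -- Hash distribution: each row of the multi-row INSERT is routed to the
    -- Datanode owning the hash bucket of its partition-key value.
    CREATE TABLE t_hash (a int, b int) DISTRIBUTE BY HASH (a);
    INSERT INTO t_hash VALUES (1, 10), (2, 20), (3, 30);

    -- Round-robin distribution: the rows are spread evenly across Datanodes.
    CREATE TABLE t_rr (a int, b int) DISTRIBUTE BY ROUND ROBIN;
    INSERT INTO t_rr VALUES (1, 10), (2, 20), (3, 30);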
From: 黄秋华 <ra...@16...> - 2010-12-13 07:25:38
|
Hi,

There is still an error in INSERT ... SELECT.

Create 3 tables:

CREATE TABLE INT4_TBL(f1 int4);
CREATE TABLE FLOAT8_TBL(f1 float8);
CREATE TABLE TEMP_GROUP (f1 INT4, f2 INT4, f3 FLOAT8);

Insert records:

insert into INT4_TBL values(1);
insert into FLOAT8_TBL values(1.0);
INSERT INTO TEMP_GROUP SELECT 1, (- i.f1), (- f.f1) FROM INT4_TBL i, FLOAT8_TBL f;

Now:

select * from TEMP_GROUP;

The result is 0 rows (one row is expected).
From: xiong w. <wan...@gm...> - 2010-12-13 07:19:40
|
Hi Mason,

Take a look at the following cases:

postgres=# CREATE TABLE atest5 (one int, two int, three int);
CREATE TABLE
postgres=# INSERT INTO atest5 (two) VALUES (3);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
postgres=# INSERT INTO atest5(three) VALUES (3);
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Core was generated by `postgres: postgres postgres [local] INSERT '.
Program terminated with signal 11, Segmentation fault.
[New process 27754]
#0  0x0000000000530bbf in CopySendString (cstate=0x5c82238, str=0x0) at copy.c:450
450             appendBinaryStringInfo(cstate->fe_msgbuf, str, strlen(str));
(gdb) bt
#0  0x0000000000530bbf in CopySendString (cstate=0x5c82238, str=0x0) at copy.c:450
#1  0x000000000053407b in CopyOneRowTo (cstate=0x5c82238, tupleOid=0, values=0x5c82f68, nulls=0x5c82f98 "\001\001") at copy.c:1716
#2  0x0000000000538789 in DoInsertSelectCopy (estate=0x5c7da50, slot=0x5c81b88) at copy.c:4017
#3  0x000000000058b185 in ExecutePlan (estate=0x5c7da50, planstate=0x5c7f8e0, operation=CMD_INSERT, numberTuples=0, direction=ForwardScanDirection, dest=0x5c78860) at execMain.c:1698
#4  0x0000000000588f3c in standard_ExecutorRun (queryDesc=0x5c36330, direction=ForwardScanDirection, count=0) at execMain.c:312
#5  0x0000000000588e45 in ExecutorRun (queryDesc=0x5c36330, direction=ForwardScanDirection, count=0) at execMain.c:261
#6  0x000000000068f51a in ProcessQuery (plan=0x5c78780, sourceText=0x5c13ee0 "INSERT INTO atest5(three) VALUES (3);", params=0x0, dest=0x5c78860, completionTag=0x7fff6d345520 "") at pquery.c:205
#7  0x0000000000690c8b in PortalRunMulti (portal=0x5c79a30, isTopLevel=1 '\001', dest=0x5c78860, altdest=0x5c78860, completionTag=0x7fff6d345520 "") at pquery.c:1299
#8  0x000000000069036f in PortalRun (portal=0x5c79a30, count=9223372036854775807, isTopLevel=1 '\001', dest=0x5c78860, altdest=0x5c78860, completionTag=0x7fff6d345520 "") at pquery.c:843
#9  0x000000000068a64a in exec_simple_query (query_string=0x5c13ee0 "INSERT INTO atest5(three) VALUES (3);") at postgres.c:1054
#10 0x000000000068e5b8 in PostgresMain (argc=4, argv=0x5b69520, username=0x5b694e0 "postgres") at postgres.c:3767
#11 0x000000000065620a in BackendRun (port=0x5b8aba0) at postmaster.c:3607
#12 0x0000000000655767 in BackendStartup (port=0x5b8aba0) at postmaster.c:3216
#13 0x0000000000652b32 in ServerLoop () at postmaster.c:1445
#14 0x00000000006522d8 in PostmasterMain (argc=9, argv=0x5b668d0) at postmaster.c:1098
#15 0x00000000005d9c1f in main (argc=9, argv=0x5b668d0) at main.c:188

Regards,
Benny
From: Michael P. <mic...@gm...> - 2010-12-13 02:28:34
|
> > I reviewed, and I thought it looked good, except for a possible issue with > committing. > > I wanted to test what happened with implicit transactions when there was a > failure. > > I executed this in one session: > > mds1=# begin; > BEGIN > mds1=# insert into mds1 values (1,1); > INSERT 0 1 > mds1=# insert into mds1 values (2,2); > INSERT 0 1 > mds1=# commit; > > Before committing, I fired up gdb for a coordinator session and a data node > session. > > On one of the data nodes, when the COMMIT PREPARED was received, I killed > the backend to see what would happen. On the Coordinator I saw this: > > > WARNING: unexpected EOF on datanode connection > WARNING: Connection to Datanode 1 has unexpected state 1 and will be > dropped > WARNING: Connection to Datanode 2 has unexpected state 1 and will be > dropped > > ERROR: Could not commit prepared transaction implicitely > PANIC: cannot abort transaction 10312, it was already committed > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > > I am not sure we should be aborting 10312, since it was committed on one of > the nodes. It corresponds to the original prepared transaction. We also do > not want a panic to happen. > This has to be corrected. If a PANIC happens on Coordinators each time a Datanode crashes, a simple node crash would mess up the whole cluster. It is a real problem I think. > Next, I started a new coordinator session: > > mds1=# select * from mds1; > col1 | col2 > ------+------ > 2 | 2 > (1 row) > > > I only see one of the rows. I thought, well, ok, we cannot undo a commit, > and the other one must commit eventually. I was able to continue working > normally: > > mds1=# insert into mds1 values (3,3); > INSERT 0 1 > mds1=# insert into mds1 values (4,4); > INSERT 0 1 > mds1=# insert into mds1 values (5,5); > INSERT 0 1 > mds1=# insert into mds1 values (6,6); > INSERT 0 1 > > mds1=# select xmin,* from mds1; > xmin | col1 | col2 > -------+------+------ > 10420 | 4 | 4 > 10422 | 6 | 6 > 10312 | 2 | 2 > 10415 | 3 | 3 > 10421 | 5 | 5 > (5 rows) > > > Note xmin keeps increasing because we closed the transaction on GTM at the > "finish:" label. This may or may not be ok. > This should be OK, no? > > Meanwhile, on the failed data node: > > mds1=# select * from pg_prepared_xacts; > WARNING: Do not have a GTM snapshot available > WARNING: Do not have a GTM snapshot available > transaction | gid | prepared | owner | > database > > -------------+--------+-------------------------------+------------+---------- > 10312 | T10312 | 2010-12-12 12:04:30.946287-05 | xxxxxx | mds1 > (1 row) > > The transaction id is 10312. Normally this would still appear in snapshots, > but we close it on GTM. > > What should we do? > > - We could leave as is. We may in the future have an XC monitoring process > look for possible 2PC anomalies occasionally and send an alert so that they > could be resolved by a DBA. > I was thinking about an external utility that could clean up partially committed or prepared transactions when a node crash happens. This is a part of HA, so I think the only thing that should be corrected now is the way errors are managed in the case of a partially committed prepared transaction on nodes. A PANIC is not acceptable for this case. > - We could instead choose not close out the transaction on GTM, so that the > xid is still in snapshots. 
We could test if the rows are viewable or not. > This could result in other side effects, but without further testing, I am > guessing this may be similar to when an existing statement is running and > cannot see a previously committed transaction that is open in its snapshot. > So, I am thinking this is probably the preferable option (keeping it open on > GTM until committed on all nodes), but we should test it. In any event, we > should also fix the panic. > If we let it open the transaction open on GTM, how do we know the GXID that has been used for Commit (different from the one that has been used for PREPARE as I recall)? If we do a Commit prepare on the remaining node that crashed, we have to commit the former PREPARE GXID, the former COMMIT PREPARED GXID and also the GXID that is used to issue the new COMMIT PREPARED on the remaining node. It is easy to get the GXID used for former PREPARE and new COMMIT PREPARED. But there is no real way yet to get back the GXID used for the former COMMIT PREPARE. I would see two ways to correct that: 1) Save the former COMMIT PREPARED GXID in GTM, but this would really impact performance. 2) Save the COMMIT PREPARED GXID on Coordinator and let the GXACT open on Coordinator (would be the best solution, but the transaction has already been committed on Coordinator). That's why I think the transaction should be to close the transaction on GTM, and a monitoring agent would be in charge to commit on the remaining nodes that crashed if a partial COMMIT has been done. Btw, it is a complicated point, so other's opinion is completely welcome. Regards, -- Michael Paquier http://michaelpq.users.sourceforge.net |
From: Michael P. <mic...@gm...> - 2010-12-13 01:14:31
|
>> So here are the main lines I propose to fix that, with an implementation
>> inside portal.
>>
>> First, since DDL synchronizes commits, it is possible for Coordinators to
>> interact between themselves, so the query should be extended to:
>> -- EXECUTE DIRECT ON (COORDINATOR num | NODE num, ...) query
>> to be compared to what is in the current code:
>> -- EXECUTE DIRECT ON (COORDINATOR | NODE num, ...) query
>
> Sounds good. What about
>
> EXECUTE DIRECT ON ([COORDINATOR num[,num...]] [NODE num[,num...]]) query
>
> maybe it is useful to see all nodes at once with a single command.

EXECUTE DIRECT ON COORDINATOR * query; may also be possible. This way of
manipulating multiple node numbers at the same time, or even including all the
nodes at once, is already in gram.y. CLEAN CONNECTION also uses it.

> BTW, in GridSQL we optionally include the source node number in the tuples
> returned. We should add something similar at some point (don't need this now
> though). Similarly, something like a NODE() function would be nice, to even
> be able to do SELECT *,NODE().
>
> Are the coordinator numbers and node numbers separate? That is, can we have
> both coordinator 1 and data node 1?

We can have a Coordinator 1 and a Datanode 1. With the registration features
that will be added soon, nodes are differentiated by their types and their IDs.

>> Then, we have to modify query analysis in analyze.c.
>> There is an API in the code called transformExecDirectStmt that transforms
>> the query and changes its shape.
>> In the analyze part, you have to check if the query is launched locally or
>> not. If it is not local, change the node type to RemoteQuery to make it run
>> in ExecRemoteQuery when launching it.
>>
>> If it is local, you have to parse the query with parse_query and then
>> analyze it with parse_analyze. After parsing and analyzing, change its node
>> type to Query, to make it launch locally.
>>
>> The difficult part of this implementation does not seem to be the analyze
>> and parsing part; it is in the planner. The question is: should the query go
>> through pgxc_planner or the normal planner if it is local? Here is my
>> proposal: pgxc_planner looks to be better, but we have to put a flag (when
>> analyzing) in the query to be sure to launch the query on the correct nodes
>> when determining execution nodes in get_plan_nodes.
>
> Yeah, I think we could go either way, but we know that with EXECUTE DIRECT
> it will always be a single step, so I think it is OK to put it in
> pgxc_planner. It should be pretty straight-forward though, I think we just
> need to additionally set step->exec_nodes, which we know already from
> parsing. It may be that we need to extend this though to indicate to
> execute on specific Coordinators.

I agree. I had a look at the code and it should not be that complicated to fix
after all. The only difficulty, if it is one, is to set the execution node list
correctly when analyzing. It is also necessary to modify ExecRemoteQuery a
little so that it can execute on a single Coordinator or on multiple
Coordinators (not the case yet).

--
Michael Paquier
http://michaelpq.users.sourceforge.net
From: Mason S. <mas...@en...> - 2010-12-12 19:37:22
|
On 12/6/10 12:32 AM, Michael Paquier wrote:
> I changed deeply the algorithm to avoid code duplication for implicit 2PC.
> With the patch attached, the Coordinator is prepared only if at least 2
> Coordinators are involved in a transaction (DDL case).
> If only one Coordinator is involved in the transaction, or if the transaction
> does not contain any DDL, the transaction is prepared on the involved nodes
> only.
>
> To sum up:
> 1) for a DDL transaction (more than 1 Coordinator and more than 1 Datanode
>    involved in a transaction):
>    - prepare on Coordinator (2PC file written)
>    - prepare on Nodes (2PC file written)
>    - Commit prepared on Coordinator
>    - Commit prepared on Datanodes
> 2) if no Coordinator, or only one Coordinator, is involved in a transaction:
>    - prepare on nodes
>    - commit on Coordinator
>    - Commit on Datanodes
>
> Note: I didn't put calls to the implicit prepare functions in a separate
> function because the modifications of CommitTransaction() are really light.

I reviewed, and I thought it looked good, except for a possible issue with
committing.

I wanted to test what happened with implicit transactions when there was a
failure.

I executed this in one session:

mds1=# begin;
BEGIN
mds1=# insert into mds1 values (1,1);
INSERT 0 1
mds1=# insert into mds1 values (2,2);
INSERT 0 1
mds1=# commit;

Before committing, I fired up gdb for a coordinator session and a data node
session.

On one of the data nodes, when the COMMIT PREPARED was received, I killed the
backend to see what would happen. On the Coordinator I saw this:

WARNING:  unexpected EOF on datanode connection
WARNING:  Connection to Datanode 1 has unexpected state 1 and will be dropped
WARNING:  Connection to Datanode 2 has unexpected state 1 and will be dropped
ERROR:  Could not commit prepared transaction implicitely
PANIC:  cannot abort transaction 10312, it was already committed
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

I am not sure we should be aborting 10312, since it was committed on one of the
nodes. It corresponds to the original prepared transaction. We also do not want
a panic to happen.

Next, I started a new coordinator session:

mds1=# select * from mds1;
 col1 | col2
------+------
    2 |    2
(1 row)

I only see one of the rows. I thought, well, ok, we cannot undo a commit, and
the other one must commit eventually. I was able to continue working normally:

mds1=# insert into mds1 values (3,3);
INSERT 0 1
mds1=# insert into mds1 values (4,4);
INSERT 0 1
mds1=# insert into mds1 values (5,5);
INSERT 0 1
mds1=# insert into mds1 values (6,6);
INSERT 0 1

mds1=# select xmin,* from mds1;
 xmin  | col1 | col2
-------+------+------
 10420 |    4 |    4
 10422 |    6 |    6
 10312 |    2 |    2
 10415 |    3 |    3
 10421 |    5 |    5
(5 rows)

Note xmin keeps increasing because we closed the transaction on GTM at the
"finish:" label. This may or may not be ok.

Meanwhile, on the failed data node:

mds1=# select * from pg_prepared_xacts;
WARNING:  Do not have a GTM snapshot available
WARNING:  Do not have a GTM snapshot available
 transaction |  gid   |           prepared            |   owner    | database
-------------+--------+-------------------------------+------------+----------
       10312 | T10312 | 2010-12-12 12:04:30.946287-05 | xxxxxx     | mds1
(1 row)

The transaction id is 10312. Normally this would still appear in snapshots, but
we close it on GTM.

What should we do?

- We could leave it as is. We may in the future have an XC monitoring process
  look for possible 2PC anomalies occasionally and send an alert so that they
  could be resolved by a DBA.

- We could instead choose not to close out the transaction on GTM, so that the
  xid is still in snapshots. We could test whether the rows are viewable or
  not. This could result in other side effects, but without further testing, I
  am guessing this may be similar to when an existing statement is running and
  cannot see a previously committed transaction that is open in its snapshot.
  So I am thinking this is probably the preferable option (keeping it open on
  GTM until committed on all nodes), but we should test it. In any event, we
  should also fix the panic.

It may be that we had a similar problem in the existing code before this patch,
although I did some testing a few months back with Pavan's crash test patch and
things seemed stable.

Also, we might want to check that explicit 2PC also handles this OK.

Thanks,

Mason

> Regards,
>
> --
> Michael Paquier
> http://michaelpq.users.sourceforge.net

--
Mason Sharp
EnterpriseDB Corporation
The Enterprise Postgres Company

This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains information
from EnterpriseDB Corporation that may be privileged, confidential, or exempt
from disclosure under applicable law. If you are not the intended recipient or
authorized to receive this for the intended recipient, any use, dissemination,
distribution, retention, archiving, or copying of this communication is
strictly prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.
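To make the failure window concrete, here is a rough sketch of what the
implicit two-phase commit amounts to on each Datanode in this test, using the
GID 'T10312' seen above; the exact commands the Coordinator drives internally
are simplified here, not quoted from the code:

    -- Phase 1: sent to every Datanode touched by the transaction.
    PREPARE TRANSACTION 'T10312';

    -- Phase 2: sent to each Datanode once every prepare has succeeded.
    -- Killing a Datanode backend at this point is what produces the
    -- partially committed state discussed in this thread.
    COMMIT PREPARED 'T10312';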
From: Mason S. <mas...@en...> - 2010-12-10 15:35:05
|
On 12/9/10 7:02 PM, Michael Paquier wrote: > Hi all, > > I had a look at the code to try to understand how it could be able to > fix EXECUTE DIRECT. > It is going to be needed to implement some HA features, and I believe > users would find this functionality useful if fixed. > > I am seeing two ways to fix it now. > The first one could be to do the implementation out of Portal as we > did before moving XC query execution there. > This was the first implementation of EXECUTE DIRECT that was done. > It looks to be an easy solution, but if we have a look long-term, it > does not follow the will to move query execution inside portal. I agree- it may be ok to do it outside and grab some old code, but I think we should try and use the current code and make it work. > > So here are the main lines I propose to fix that, with an > implementation inside portal. > > First, since DDL synchronize commit, it is possible Coordinators to > interact between themselves, > so the query should be extended to: > -- EXECUTE DIRECT ON (COORDINATOR num | NODE num, ...) query > to be compared to what is in the current code: > -- EXECUTE DIRECT ON (COORDINATOR | NODE num, ...) query Sounds good. What about EXECUTE DIRECT ON ([COORDINATOR num[,num...]] [NODE num[,num...]]) query maybe it is useful to see on all nodes at once with a single command. BTW, in GridSQL we optionally include the source node number in the tuples returned. We should add something similar at some point (don't need this now though). Similarly, something like a NODE() function would be nice, to even be able to do SELECT *,NODE(). Are the coordinator numbers and node numbers are separate? That is, we can have both coordinator 1 and data node 1? > > Then, we have to modify query analyze in analyze.c. > There is an API in the code called transformExecDirectStmt that > transforms the query and changes its shape. > In the analyze part, you have to check if the query is launched > locally or not. > If it is not local, change the node type to Remote Query to make it > run in ExecRemoteQuery when launching it. > > If it is local, you have to parse the query with parse_query and then > to analyze it with parse_analyze. > After parsing and analyzing, change its node type to Query, to make it > launch locally. > > The difficult part of this implementation does not seem to be the > analyze and parsing part, it is in the planner. > The question is: > Should the query go through pgxc_planner or normal planner if it is local? > Here is my proposal: > pgxc_planner looks to be better but we have to put a flag (when > analyzing) in the query to be sure > to launch the query on the correct nodes when determining execution > nodes in get_plan_nodes. > Yeah, I think we could go either way, but we know that with EXECUTE DIRECT it will always be a single step, so I think it is OK to put it in pgxc_planner. It should be pretty straight-forward though, I think we just need to additionally set step->exec_nodes, which we know already from parsing. It may be that we need to extend this though to indicate to execute on specific Coordinators. Overall, I don't think that this should be difficult to get working again. Mason > Regards, > > ------ > Michael Paquier > http://michaelpq.users.sourceforge.net > > > ------------------------------------------------------------------------------ > > > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... 
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers -- Mason Sharp EnterpriseDB Corporation The Enterprise Postgres Company This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |
From: Michael P. <mic...@gm...> - 2010-12-10 00:02:53
|
Hi all,

I had a look at the code to try to understand how to fix EXECUTE DIRECT. It is
going to be needed to implement some HA features, and I believe users would
find this functionality useful if fixed.

I see two ways to fix it now.

The first one would be to do the implementation outside of the Portal, as we
did before moving XC query execution there. This was the first implementation
of EXECUTE DIRECT. It looks like an easy solution, but taking a long-term view,
it does not follow the intent to move query execution inside the Portal.

So here are the main lines I propose to fix it, with an implementation inside
the Portal.

First, since DDL synchronizes commits, it is possible for Coordinators to
interact between themselves, so the query should be extended to:

-- EXECUTE DIRECT ON (COORDINATOR num | NODE num, ...) query

to be compared to what is in the current code:

-- EXECUTE DIRECT ON (COORDINATOR | NODE num, ...) query

Then, we have to modify query analysis in analyze.c. There is an API in the
code called transformExecDirectStmt that transforms the query and changes its
shape. In the analysis part, you have to check whether the query is launched
locally or not. If it is not local, change the node type to RemoteQuery to make
it run in ExecRemoteQuery when launching it. If it is local, you have to parse
the query with parse_query and then analyze it with parse_analyze. After
parsing and analyzing, change its node type to Query, to make it launch
locally.

The difficult part of this implementation does not seem to be the analysis and
parsing; it is in the planner. The question is: should the query go through
pgxc_planner or the normal planner if it is local? Here is my proposal:
pgxc_planner looks to be better, but we have to put a flag (when analyzing) in
the query to be sure to launch the query on the correct nodes when determining
execution nodes in get_plan_nodes.

Regards,

------
Michael Paquier
http://michaelpq.users.sourceforge.net
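A few hypothetical invocations of the extended syntax being proposed, for
illustration only; whether the query is written as a quoted string, and the
node numbers used, are assumptions rather than part of the proposal:

    -- Run a query directly on one Datanode:
    EXECUTE DIRECT ON (NODE 1) 'SELECT * FROM pg_prepared_xacts';

    -- Run it on several Datanodes at once:
    EXECUTE DIRECT ON (NODE 1, NODE 2) 'SELECT count(*) FROM mds1';

    -- With the proposed extension, target a specific remote Coordinator:
    EXECUTE DIRECT ON (COORDINATOR 2) 'SELECT * FROM pg_prepared_xacts';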
From: Mason S. <mas...@en...> - 2010-12-09 13:39:24
|
On 12/8/10 8:11 PM, xiong wang wrote:
> Hi Koichi,
>
> Yes, I consider sequences should be created on datanodes, not only on
> coordinators. But all the sequence values should come from GTM.
>
> Regards,
> Benny
>
> 2010/12/9 Koichi Suzuki <ko...@in...>:
>> In the current implementation, the sequence value is supplied by GTM, as
>> you know. It is assumed that this value is supplied to the datanode through
>> the coordinator. In your case, the default value must be handled by the
>> datanode, and the datanode has to inquire GTM for the nextval of the
>> sequence.
>>
>> I'm afraid this is missing in the current code.
>> ---
>> Koichi

Benny,

In general we try to have the Coordinator manage everything and provide the
data nodes with everything they need.

I can think of a case that we should test though: when COPY is used. We would
have to make sure that we are providing values for the sequence column if it is
not included in an explicit column list.

Thanks,

Mason

>> (2010年12月08日 19:33), xiong wang wrote:
>>> Dears,
>>>
>>> steps:
>>> postgres=# create sequence seq start with 1;
>>> CREATE SEQUENCE
>>> postgres=# create table t(a int default nextval('seq'), b int);
>>> ERROR: Could not commit (or autocommit) data node connection
>>>
>>> datanode log as follows:
>>> LOG: statement: create table t(a int default nextval('seq'), b int);
>>> ERROR: relation "seq" does not exist
>>>
>>> When I checked the source code, I found sequences can't be created on
>>> datanodes. Could you explain why?
>>>
>>> Regards,
>>> Benny

--
Mason Sharp
EnterpriseDB Corporation
The Enterprise Postgres Company
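A minimal sketch of the COPY test case suggested above, reusing Benny's 'seq'
example; the data values are made up:

    CREATE SEQUENCE seq;
    CREATE TABLE t (a int DEFAULT nextval('seq'), b int);

    -- The column list omits "a", so the Coordinator must fill in the
    -- GTM-supplied sequence values before routing rows to the Datanodes.
    COPY t (b) FROM STDIN;
    10
    20
    \.

    SELECT a, b FROM t;   -- "a" should be 1 and 2, in some order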
From: Mason S. <mas...@en...> - 2010-12-09 13:30:16
|
Thanks for reporting this. This is an issue that is already in the tracker,
issue 3019765.

Thanks,

Mason

On 12/9/10 6:06 AM, xiong wang wrote:
> Dears,
>
> steps:
>
> postgres=# begin;
> BEGIN
> postgres=# create sequence seq2;
> CREATE SEQUENCE
> postgres=# rollback;
> ROLLBACK
> postgres=# create sequence seq2;
> ERROR: GTM error, could not create sequence
>
> GTM log:
> 2:1089829184:2010-12-09 19:02:41.941 CST -LOG: Sequence with the given key already exists
> LOCATION: seq_add_seqinfo, gtm_seq.c:169
> 3:1089829184:2010-12-09 19:02:41.941 CST -ERROR: Failed to open a new sequence
> LOCATION: ProcessSequenceInitCommand, gtm_seq.c:751
>
> When a transaction is rolled back, the sequence created in the transaction
> can't be removed from GTM.
>
> Regards,
> Benny

--
Mason Sharp
EnterpriseDB Corporation
The Enterprise Postgres Company
From: xiong w. <wan...@gm...> - 2010-12-09 11:06:42
|
Dears,

Steps:

postgres=# begin;
BEGIN
postgres=# create sequence seq2;
CREATE SEQUENCE
postgres=# rollback;
ROLLBACK
postgres=# create sequence seq2;
ERROR: GTM error, could not create sequence

GTM log:

2:1089829184:2010-12-09 19:02:41.941 CST -LOG: Sequence with the given key already exists
LOCATION: seq_add_seqinfo, gtm_seq.c:169
3:1089829184:2010-12-09 19:02:41.941 CST -ERROR: Failed to open a new sequence
LOCATION: ProcessSequenceInitCommand, gtm_seq.c:751

When a transaction is rolled back, the sequence created in the transaction
can't be removed from GTM.

Regards,
Benny
From: Koichi S. <ko...@in...> - 2010-12-09 01:15:12
|
This should be put into the tracker. I discussed it with Michael and found that
the issue is not that simple, because we have to consider the case of a
replicated table. In that case it is not correct for the Datanode to get the
sequence value directly from GTM; the Coordinator should handle it.

Regards;
---
Koichi

(2010年12月09日 10:11), xiong wang wrote:
> Hi Koichi,
>
> Yes, I consider sequences should be created on datanodes, not only on
> coordinators. But all the sequence values should come from GTM.
>
> Regards,
> Benny
>
> 2010/12/9 Koichi Suzuki<ko...@in...>:
>> In the current implementation, the sequence value is supplied by GTM, as
>> you know. It is assumed that this value is supplied to the datanode through
>> the coordinator. In your case, the default value must be handled by the
>> datanode, and the datanode has to inquire GTM for the nextval of the
>> sequence.
>>
>> I'm afraid this is missing in the current code.
>> ---
>> Koichi
>>
>> (2010年12月08日 19:33), xiong wang wrote:
>>> Dears,
>>>
>>> steps:
>>> postgres=# create sequence seq start with 1;
>>> CREATE SEQUENCE
>>> postgres=# create table t(a int default nextval('seq'), b int);
>>> ERROR: Could not commit (or autocommit) data node connection
>>>
>>> datanode log as follows:
>>> LOG: statement: create table t(a int default nextval('seq'), b int);
>>> ERROR: relation "seq" does not exist
>>>
>>> When I checked the source code, I found sequences can't be created on
>>> datanodes. Could you explain why?
>>>
>>> Regards,
>>> Benny
From: xiong w. <wan...@gm...> - 2010-12-09 01:11:12
|
Hi Koichi,

Yes, I think sequences should be created on the datanodes as well, not only
on the coordinators. But all sequence values should come from GTM.

Regards,
Benny

2010/12/9 Koichi Suzuki <ko...@in...>:
> In the current implementation, the sequence value is supplied by GTM, as you
> know. It is assumed that this value is supplied to the datanode through
> the coordinator. In your case, the default value must be handled by the
> datanode, and the datanode has to ask GTM for the nextval of the sequence.
>
> I'm afraid this is missing in the current code.
> ---
> Koichi
>
> (2010年12月08日 19:33), xiong wang wrote:
>> Dears,
>>
>> steps:
>> postgres=# create sequence seq start with 1;
>> CREATE SEQUENCE
>> postgres=# create table t(a int default nextval('seq'), b int);
>> ERROR: Could not commit (or autocommit) data node connection
>>
>> datanode log as follows:
>> LOG: statement: create table t(a int default nextval('seq'), b int);
>> ERROR: relation "seq" does not exist
>>
>> When I checked the source code, I found that sequences can't be created on
>> datanodes. Could you explain why?
>>
>> Regards,
>> Benny
|
From: Koichi S. <ko...@in...> - 2010-12-09 00:36:50
|
In the current implementation, the sequence value is supplied by GTM, as you
know. It is assumed that this value is supplied to the datanode through the
coordinator. In your case, the default value must be handled by the datanode,
and the datanode has to ask GTM for the nextval of the sequence.

I'm afraid this is missing in the current code.
---
Koichi

(2010年12月08日 19:33), xiong wang wrote:
> Dears,
>
> steps:
> postgres=# create sequence seq start with 1;
> CREATE SEQUENCE
> postgres=# create table t(a int default nextval('seq'), b int);
> ERROR: Could not commit (or autocommit) data node connection
>
> datanode log as follows:
> LOG: statement: create table t(a int default nextval('seq'), b int);
> ERROR: relation "seq" does not exist
>
> When I checked the source code, I found that sequences can't be created on
> datanodes. Could you explain why?
>
> Regards,
> Benny
|
From: xiong w. <wan...@gm...> - 2010-12-08 10:33:46
|
Dears,

steps:
postgres=# create sequence seq start with 1;
CREATE SEQUENCE
postgres=# create table t(a int default nextval('seq'), b int);
ERROR: Could not commit (or autocommit) data node connection

datanode log as follows:
LOG: statement: create table t(a int default nextval('seq'), b int);
ERROR: relation "seq" does not exist

When I checked the source code, I found that sequences can't be created on
datanodes. Could you explain why?

Regards,
Benny
|
From: Michael P. <mic...@gm...> - 2010-12-08 07:24:49
|
Continuing the modifications for 2PC, I have finished a patch extending the
2PC file and the 2PC transaction data protocol.

With the patch attached, the 2PC data contains the following information:
- whether the 2PC is implicit or explicit
- whether the prepared transaction contained DDL or not (if yes, it means that
  the transaction has also been prepared on Coordinators)
- the number of the Coordinator from which the 2PC was issued
- the list of nodes where the transaction has been prepared. In the case of a
  transaction prepared only on Coordinators, the list of nodes is set to "n"
  (case of sequence transactions)

This 2PC information is sent down to the nodes when an implicit or explicit
prepare is made, and only to the necessary nodes.

The patch also contains an extension of the view pg_prepared_xacts, so that
the extended 2PC information can be obtained from the catalog along with the
usual 2PC data.

If you want to test, first apply implicit2pc6.patch on HEAD, then apply
implicit2pc6_extend_pg_prepared_xacts.patch.

I forgot to say that, as pg_prepared_xacts is a catalog view, you have to
connect to a node directly to get the 2PC information. The information could
also be obtained with EXECUTE DIRECT, but currently this functionality is
broken.

In my tests I of course checked that the views were OK, but I also checked
that recovery of prepared transactions was done properly. If you want to try,
kill a postgres process with SIGQUIT (kill -3) and relaunch it. 2PC data will
be recovered from the 2PC files correctly. You can check by running
"select * from pg_prepared_xacts;" before and after stopping the postgres
instance.

Regards,

Michael
--
Michael Paquier
http://michaelpq.users.sourceforge.net
|
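A quick sketch of the recovery check described above. The gid 'check_2pc_recovery' and the table name are arbitrary; only the stock columns of pg_prepared_xacts are queried, since the exact names of the extended columns depend on the patch, and max_prepared_transactions is assumed to be non-zero on the node:

-- Prepare a transaction, then restart the node and confirm it is still listed:
BEGIN;
CREATE TABLE t_2pc_check (a int);
PREPARE TRANSACTION 'check_2pc_recovery';

-- On the node, before and after "kill -3" + restart:
SELECT gid, prepared, owner, database FROM pg_prepared_xacts;

-- The row for 'check_2pc_recovery' should survive the restart, showing that
-- it was rebuilt from the 2PC file. Clean up afterwards with:
COMMIT PREPARED 'check_2pc_recovery';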
From: Mason S. <mas...@en...> - 2010-12-07 08:01:09
|
How about deparsing into the same number of insert statements as there are
nodes? That way we just send one insert to each.

I think if the user/app uses this, they may be concerned about efficiency and
performance, so passing just a single statement to each data node is a good
idea.

Regards,

Mason

Sent from my iPhone

On Dec 7, 2010, at 8:56 AM, xiong wang <wan...@gm...> wrote:

> Dears,
>
> I have two solutions to resolve bug #3013562.
>
> 1. I will rewrite the insert statement if it is a multiple insert. I will
> deparse the multiple insert statement into single insert statements, one per
> entry in the values list, in the function pg_rewrite_query.
> 2. I would like to use a prepared statement to replace a multiple insert.
> In other words, while executing a multiple insert statement, I will replace
> the multiple insert statement node with a prepared statement node.
>
> The main idea of both solutions is to divide a multiple insert into single
> inserts. I don't know whether the two methods are feasible or which one
> would be better. Could you give me some suggestions?
>
> If you have another idea, please don't hesitate to tell me.
>
> Thanks.
>
> Regards,
> Benny
|
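A sketch of the per-node grouping suggested above, on a hypothetical table distributed by hash across two datanodes; the row-to-node assignment shown is made up for illustration:

-- Client sends one multi-row insert:
INSERT INTO mytab VALUES (1, 10), (2, 20), (3, 30), (4, 40);

-- Instead of one statement per row, the coordinator would deparse one
-- statement per target datanode:
-- sent to datanode 1 (rows whose hash lands there):
INSERT INTO mytab VALUES (1, 10), (3, 30);
-- sent to datanode 2:
INSERT INTO mytab VALUES (2, 20), (4, 40);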
From: xiong w. <wan...@gm...> - 2010-12-07 07:56:53
|
Dears,

I have two solutions to resolve bug #3013562.

1. I will rewrite the insert statement if it is a multiple insert. I will
deparse the multiple insert statement into single insert statements, one per
entry in the values list, in the function pg_rewrite_query.
2. I would like to use a prepared statement to replace a multiple insert.
In other words, while executing a multiple insert statement, I will replace
the multiple insert statement node with a prepared statement node.

The main idea of both solutions is to divide a multiple insert into single
inserts. I don't know whether the two methods are feasible or which one would
be better. Could you give me some suggestions?

If you have another idea, please don't hesitate to tell me.

Thanks.

Regards,
Benny
|
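For reference, a sketch of what solution 1 amounts to at the SQL level; the rewrite itself would happen inside pg_rewrite_query, and the statements below are purely illustrative:

-- The rewriter would expand a multi-row insert
INSERT INTO mytab VALUES (1, 10), (2, 20), (3, 30);
-- into one single-row statement per VALUES entry:
INSERT INTO mytab VALUES (1, 10);
INSERT INTO mytab VALUES (2, 20);
INSERT INTO mytab VALUES (3, 30);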
From: Michael P. <mic...@gm...> - 2010-12-06 05:32:25
|
> I think we should try to minimize changes to CommitTransaction. Why not use
> the PrepareTransaction() to prepare the transaction instead of duplicating
> that code inside CommitTransaction? Also, it would be nice if you could move
> the new code into a separate function and call that, something like
> AtEOXact_PGXC().

Hi all,

I have deeply changed the algorithm to avoid code duplication for implicit 2PC.

With the patch attached, the Coordinator is prepared only if at least 2
Coordinators are involved in a transaction (DDL case). If only one Coordinator
is involved in the transaction, or if the transaction does not contain any DDL,
the transaction is prepared on the involved nodes only.

To sum up:
1) For a DDL transaction (more than 1 Coordinator and more than 1 Datanode
involved in the transaction):
- prepare on Coordinator (2PC file written)
- prepare on Nodes (2PC file written)
- Commit prepared on Coordinator
- Commit prepared on Datanodes

2) If no Coordinator, or only one Coordinator, is involved in the transaction:
- prepare on nodes
- commit on Coordinator
- Commit on Datanodes

Note: I didn't put the calls to the implicit prepare functions in a separate
function because the modifications to CommitTransaction() are really light.

Regards,
--
Michael Paquier
http://michaelpq.users.sourceforge.net
|
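A rough SQL-level sketch of the commands case 1 above implies on each node; the GXID-based identifier 'T1234' is assumed here, and the internal message flow is of course driven by the Coordinator rather than typed by hand:

-- On every involved node (local Coordinator, remote Coordinators, Datanodes),
-- as part of the implicit 2PC:
PREPARE TRANSACTION 'T1234';   -- 2PC file written on each node

-- Once all prepares have succeeded:
COMMIT PREPARED 'T1234';       -- first on the Coordinator(s)
COMMIT PREPARED 'T1234';       -- then on each Datanode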
From: Mason S. <mas...@en...> - 2010-12-03 21:32:18
|
On 12/3/10 7:10 PM, Andrei.Martsinchyk wrote:
> Mason,
>
> 2010/12/3 Mason Sharp<mas...@en...>:
>> Sent from my iPhone
>>
>> On Dec 3, 2010, at 5:31 PM, "Andrei.Martsinchyk"<and...@en...> wrote:
>>> Mason,
>>>
>>> 2010/12/3 Mason Sharp<mas...@en...>:
>>>> On 12/1/10 1:53 PM, Andrei.Martsinchyk wrote:
>>>>> Hi Benny,
>>>>>
>>>>> Thanks for pointing this out. I tested with a program using the Postgres C
>>>>> library and the extended query protocol.
>>>>> For you and anyone else who wants to test, I am attaching the test
>>>>> program and a simple Makefile.
>>>>> I fixed the segmentation fault (updated patch is attached), but
>>>>> anyway, the PREPARE / EXECUTE commands do not work properly.
>>>>>
>>>> It looks like it is still not quite right:
>>>>
>>>> mds1=# create table mytab (col1 int, col2 int);
>>>> CREATE TABLE
>>>> mds1=# prepare p (int, int) AS INSERT INTO mytab VALUES ($1, $2);
>>>> PREPARE
>>>> mds1=# execute p (1,2);
>>>> INSERT 0 1
>>>> mds1=# select * from mytab;
>>>>  col1 | col2
>>>> ------+------
>>>> (0 rows)
>>>>
>>>> It does not find the row that should have been inserted.
>>>>
>>> Yes, I mentioned the PREPARE command does not work properly.
>>> In this particular case it is inserting the row into the Coordinator database,
>>> and it is not visible to a select.
>>>
>> Oh. So, only SELECT is currently handled?
>>
> Some SELECTs work properly, not all.
>
Only single-step ones? That is fine. Are any other SELECT statements
problematic?

How about UPDATE and DELETE? I just ran a simple UPDATE, and it failed, too.

>>>> Also, one other question: the session seems to retain the fact that there
>>>> are associated prepared statements. Does that mean that the pooler will not
>>>> put these back in the pool until all are deallocated?
>>>>
>>> The Coordinator does not prepare statements on datanodes, so the connection
>>> can be released at the transaction end.
>>>
>> It converts it into a simple statement? I think we need to support prepare
>> and execute on the data nodes for performance.
>>
> It does not seem straightforward. A prepared statement is a cached plan,
> and it is cached on the coordinator. If the plan contains multiple
> RemoteQuery nodes we should prepare each, and should not release these
> until they are all closed. We should be holding the data node
> connections all this time.
>
If it is just a matter of holding on to the connections, that is fine, we can
persist those for the duration of the session. Do you need the RemoteQuery
nodes to persist, too, or just the connections?

I understand that we may have to track which connections a statement has
already been prepared on, and which ones it has not yet. At EXECUTE time, if a
data node has not been prepared yet, we send down the prepare message first.

>>>> On a related note, for the WITH HOLD cursors you implemented, did you also
>>>> do something to hold on to the connections?
>>>>
>>> I did not implement WITH HOLD.
>>>
>> Let me retest this, and look at old emails later when back at my laptop.
>>
> I remember someone told me it works, but I never tested it, and never did
> anything to handle it.
>
I tried it out:

mds1=# begin;
BEGIN
mds1=# declare c cursor with hold for select * from mds1;
DECLARE CURSOR
mds1=# fetch c;
 col1 | col2
------+------
    1 |   10
(1 row)

mds1=# commit;
COMMIT
mds1=# fetch c;
 col1 | col2
------+------
    3 |   30
(1 row)

This worries me a bit that it got "fixed" unintentionally. We may have gotten
lucky in that we refetched the same connection(s) from the pooler.
Meanwhile, the Coordinator still knows about cursor c, so it did not object.

It may be that the only thing we need to do is, if we have any open hold
cursors, we do not return the connections to the pool but persist them. I also
expect we would add to this over time, like, if the user created any temp
tables (and has not dropped them), persist the connections (similar to GridSQL).

Thanks,

Mason

--
Mason Sharp
EnterpriseDB Corporation
The Enterprise Postgres Company
|
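A sketch of the lazy per-node prepare idea discussed above, using a hypothetical hash-distributed table mytab; which datanode each row maps to is made up for illustration:

-- Session-level prepared statement on the coordinator:
PREPARE p (int, int) AS INSERT INTO mytab VALUES ($1, $2);

EXECUTE p (1, 2);  -- suppose the row maps to datanode 2: the statement has not
                   -- been prepared there yet, so the coordinator sends the
                   -- datanode-side prepare first, then executes it
EXECUTE p (3, 4);  -- maps to datanode 2 again: already prepared, only an
                   -- execute message is sent
EXECUTE p (5, 6);  -- maps to datanode 1: a prepare is sent to datanode 1 now,
                   -- then the execute

-- The datanode connections involved would be held by the session (not
-- returned to the pool) until the statement is deallocated or the session ends:
DEALLOCATE p;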