From: Mason <ma...@us...> - 2011-10-31 13:03:07
On Sun, Oct 30, 2011 at 8:47 PM, Michael Paquier <mic...@gm...> wrote:
> On Mon, Oct 31, 2011 at 9:25 AM, Mason <ma...@us...> wrote:
>> On Sun, Oct 30, 2011 at 7:29 PM, Michael Paquier <mic...@gm...> wrote:
>> > On Mon, Oct 31, 2011 at 6:56 AM, Mason <ma...@us...> wrote:
>> >> I have some feedback and questions from looking at the code. Note, I
>> >> have not been able to get the latest code running yet.
>> >>
>> >> Instead of "TO" in CREATE TABLE ... TO NODE and TO GROUP, I think "ON"
>> >> sounds more correct.
>> >
>> > This was the first thought.
>> > However, using ON made bison complain about conflicting keywords.
>>
>> Ah, the ON CONFIG clause.
>
> There are no such clauses existing but... This is tricky here.

Sorry, I meant ON COMMIT; ONCONFIG is an Informix term (which I recently used).

>> That is what I meant asking if there were any dangers with this
>> scheme. I think you said though that these are only assigned at initdb
>> time, and it appears they can never be changed, so perhaps it is not a
>> danger. I was getting at if partitions should be named and node
>> instances also named. That way a given partition name will be on a
>> master data node as well as slaves, which may be named differently,
>> but still use the same partition name/id. I think you said we don't
>> need to worry about slaves, but I was just confused by the new DDL for
>> slaves. I guess the slaves need to be renamed to be the same as the
>> master if one fails over to them to ensure sort order?
>
> Yes, this would be necessary to ensure that OID sort order is not changed.
> However, if a slave becomes a master I think that having the slave node
> changing its name to the former master one is not a big matter.
> This is another discussion related to failover though.

I was not sure of the overall design, just trying to point out the theoretical danger, and again, because DDL was added for slaves, I was originally not sure if the intention was to allow them to retain their names and yet be able to be promoted. That is why I mentioned naming the partitions and using that order.

Alternatively, perhaps one could use an internal id/OID behind the scenes for the partitions, sort by that to determine hash & modulo buckets, and have a mapping of the partitions and node instances. Each master and standby should know what its partition id/oid is, perhaps returned at connection time when the connection comes from a coordinator. This might do away with the node renaming issue. Just something to mull over. Or, maybe some standby naming convention will help. We should just think ahead a little bit about possible HA management scenarios for flexibility.

>> >> How would (outside directed) failover occur if the standby has a
>> >> different name?
>> >>
>> >> Does GTM insist on the standby having the same name if the primary
>> >> disconnects and the standby connects?
>> >
>> > I am not sure about the internal mechanisms used by GTM-Standby when
>> > registering on GTM.
>> > Suzuki-san will be better placed than me to answer.
>>
>> I meant related to data node standbys. I was wondering if a standby
>> data node takes over for a data node master and the standby has a
>> different name, how GTM handles that.
>
> Just about that....
> The first purpose of registering nodes on GTM was to keep track of all the
> node information in the cluster.
> But now that node names are the same and have to remain constant in the
> cluster, is this really necessary?
> Removing that will also allow us to remove pgxc_node_name from the GUC params.
> Then identifying a node itself for Coordinator/Datanode could be done by
> initdb with a SELF/NOT SELF keyword as an extension of CREATE DDL.

Maybe it is a good idea.

How would a node rename itself later, then? ALTER NODE oldname RENAME newname? Then if it sees that its own name matches oldname, change itself to newname? Otherwise just update catalog info?

> --
> Michael Paquier
> http://michael.otacoo.com
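A minimal sketch of the rename flow being asked about here, assuming a hypothetical ALTER NODE ... RENAME form that does not exist in the code discussed in this thread:

    -- hypothetical syntax, for illustration only
    ALTER NODE datanode1 RENAME TO datanode1_new;
    -- the node whose own name matches the old name would update its identity;
    -- every other node would simply update its pgxc_node catalog entry.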
From: Michael P. <mic...@gm...> - 2011-10-31 00:47:49
On Mon, Oct 31, 2011 at 9:25 AM, Mason <ma...@us...> wrote:
> On Sun, Oct 30, 2011 at 7:29 PM, Michael Paquier <mic...@gm...> wrote:
> > On Mon, Oct 31, 2011 at 6:56 AM, Mason <ma...@us...> wrote:
> >> I have some feedback and questions from looking at the code. Note, I
> >> have not been able to get the latest code running yet.
> >>
> >> Instead of "TO" in CREATE TABLE ... TO NODE and TO GROUP, I think "ON"
> >> sounds more correct.
> >
> > This was the first thought.
> > However, using ON made bison complain about conflicting keywords.
>
> Ah, the ON CONFIG clause.

There are no such clauses existing but... This is tricky here.

> That is what I meant asking if there were any dangers with this
> scheme. I think you said though that these are only assigned at initdb
> time, and it appears they can never be changed, so perhaps it is not a
> danger. I was getting at if partitions should be named and node
> instances also named. That way a given partition name will be on a
> master data node as well as slaves, which may be named differently,
> but still use the same partition name/id. I think you said we don't
> need to worry about slaves, but I was just confused by the new DDL for
> slaves. I guess the slaves need to be renamed to be the same as the
> master if one fails over to them to ensure sort order?

Yes, this would be necessary to ensure that OID sort order is not changed. However, if a slave becomes a master I think that having the slave node changing its name to the former master one is not a big matter. This is another discussion related to failover though.

> >> How would (outside directed) failover occur if the standby has a
> >> different name?
> >>
> >> Does GTM insist on the standby having the same name if the primary
> >> disconnects and the standby connects?
> >
> > I am not sure about the internal mechanisms used by GTM-Standby when
> > registering on GTM.
> > Suzuki-san will be better placed than me to answer.
>
> I meant related to data node standbys. I was wondering if a standby
> data node takes over for a data node master and the standby has a
> different name, how GTM handles that.

Just about that.... The first purpose of registering nodes on GTM was to keep track of all the node information in the cluster. But now that node names are the same and have to remain constant in the cluster, is this really necessary? Removing that will also allow us to remove pgxc_node_name from the GUC params. Then identifying a node itself for Coordinator/Datanode could be done by initdb with a SELF/NOT SELF keyword as an extension of CREATE DDL.

--
Michael Paquier
http://michael.otacoo.com
From: Mason <ma...@us...> - 2011-10-31 00:25:08
On Sun, Oct 30, 2011 at 7:29 PM, Michael Paquier <mic...@gm...> wrote:
> On Mon, Oct 31, 2011 at 6:56 AM, Mason <ma...@us...> wrote:
>> I have some feedback and questions from looking at the code. Note, I
>> have not been able to get the latest code running yet.
>>
>> Instead of "TO" in CREATE TABLE ... TO NODE and TO GROUP, I think "ON"
>> sounds more correct.
>
> This was the first thought.
> However, using ON made bison complain about conflicting keywords.

Ah, the ON CONFIG clause.

>> We may want to consider just renaming HOSTIP to HOST. I don't feel
>> that strongly about it though. "HOSTIP" makes it sound like an ip
>> address is required. PORT might be enough instead of HOSTPORT.
>>
>> RELATED (TO) is somewhat awkward. How about defining slaves instead
>> as something like:
>>
>> CREATE NODE datanode1a WITH
>>   (NODE SLAVE FOR datanode1
>>    HOSTIP = 192.168.1.2,
>>    NODEPORT = 15243
>>   );
>>
>> I like the possibility to load balance reads, hence the PREFERRED
>> option for which data node in a related group to use, however I think
>> coordinators each should still be able to specify where they would
>> like to go for load balancing reads for replicated tables, so that
>> they can choose to override the default one for a data node, allowing
>> them to pick a local instance, for example. Something like:
>>
>> CREATE NODE COORDINATOR coord1 WITH
>>   (HOSTIP = localhost,
>>    NODEPORT = 15244,
>>    PREFERRED = datanode1)
>
> On each Coordinator is locally stored a version of the pgxc_node catalog.
> As a next step, I was planning to add an option with host name, preferred
> and primary in ALTER NODE to be able to modify the data of a node only on a
> local Coordinator.
> It would have the shape of an SQL like this:
> CREATE/ALTER LOCAL NODE nodename WITH (...)

I think allowing local-only changes may be dangerous. I suppose you could do something like you say but heavily restrict it, like only allow the PREFERRED option to be used. I think what I suggested above would work well and still allow the info to be global.

>> I see some comments about sorting the nodes so that modulo evaluates
>> to the right node consistently, etc. Is this still safe when, say, a master
>> goes down and we promote a standby who has an OID (or name) that changes
>> the sorted order?
>
> Yes it is. When a session goes up, it reads the information located in
> pgxc_node and stores in cache the information related to nodes, sorted by
> node name.
> This works as long as the information cached in pooler is not changed.
> That's why an additional command able to change the data cached in pooler
> once the information of the catalog has been modified is necessary.
> Launching this command also forces a refresh of all the other session caches
> though, and might not be launched if a session has an open connection to a
> node.

>> Also, can you please describe more about how OIDs are used for nodes,
>> and how/if they are consistent across the cluster in such a case? (The
>> primary coordinator assigns these I assume?)
>
> Oids are used to get the host/port/preferred/primary data.
> Also, when a session begins, this session reads information from pgxc_node,
> classifies the node information by sorting Oids with node names, and then
> uses the sorting order to specify a node index.
> Have a look at the end of pgxcnode.c with the APIs PGXCNodeGetNodeOid and
> PGXCNodeGetNodeId.
> Those ones are the important folks.

>> We could think about it in terms of a logical partition/segment having
>> an id/oid associated with it, which is the same whether it is on the
>> master or one of the slaves. I am not sure if OID in the current
>> code here refers to an actual node instance, instead of a logical
>> number.
>
> The node index is the same on all the nodes.
> However the node Oid is different.
> For example, let's say that you have 2 Coordinators with the following
> Datanode information:
> Coordinator 1:
> - dn1, OID = 100
> - dn3, OID = 101
> - dn2, OID = 102
> Coordinator 2:
> - dn3, OID = 100
> - dn1, OID = 101
> - dn2, OID = 102
> Each Coordinator will use dn1 with index 1, dn2 with index 2, and dn3 with
> index 3.
> When requesting connections from pooler, sessions use those indexes.

OK, so on all coordinators, we sort by name. Because the new DDL has stuff in there for slaves, I was unclear what you had in mind related to that. Say dn3's slave is named dn1a for some crazy reason. If someone were to stop the cluster and change the node info, we would then sort and have dn1 with index 1, dn1a with index 2, dn2 with index 3.

That is what I meant asking if there were any dangers with this scheme. I think you said though that these are only assigned at initdb time, and it appears they can never be changed, so perhaps it is not a danger. I was getting at if partitions should be named and node instances also named. That way a given partition name will be on a master data node as well as slaves, which may be named differently, but still use the same partition name/id. I think you said we don't need to worry about slaves, but I was just confused by the new DDL for slaves. I guess the slaves need to be renamed to be the same as the master if one fails over to them to ensure sort order?

>> A logical id would allow for making high availability more flexible.
>> It would be nice if the connection is only obtained via an id for a
>> logical partition, instead of a physical node id that was used during
>> planning, in case failover occurred.
>
> It could be possible to change pooler to request connections not from a node
> index, but from a node name, or from the locally consistent node Oid.
> I am more a fan of the 2nd option, as node names are unique in the cluster.

>> I see some stuff using nodestrings instead of arrays of nodeIds in
>> places. I understand wanting to give the DBA the ability to name these
>> nodes, but once things are happening internally, it might be good to
>> try and have these use some internal integer id instead for
>> efficiency. Admittedly, I don't see a lot of string comparisons going
>> on though.
>
> As explained below, the node names are used only to sort the node
> information in the cache of pgxcnode.c.

>> Does GTM need to worry about slaves?
>
> GTM does not need to care about slaves, I think. This is another discussion
> though.

OK, just confused by the new DDL for slaves, was not sure.

>> If not, why not just register a logical partition id instead of a node
>> name for data nodes?
>
> And why does Postgres not use a partition ID when identifying slaves for
> syncrep?
> Names look more flexible.

Has to do with the issue above, but I guess it is not an issue if node names are fixed.

>> How would (outside directed) failover occur if the standby has a different
>> name?
>>
>> Does GTM insist on the standby having the same name if the primary
>> disconnects and the standby connects?
>
> I am not sure about the internal mechanisms used by GTM-Standby when
> registering on GTM.
> Suzuki-san will be better placed than me to answer.

I meant related to data node standbys. I was wondering if a standby data node takes over for a data node master and the standby has a different name, how GTM handles that.

Thanks,

Mason

> Regards,
> --
> Michael Paquier
> http://michael.otacoo.com
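To make the ordering concern in this message concrete: the node index is just the rank of the node name in sorted order, so introducing a differently named standby shifts everything after it. A rough illustration, with catalog and column names assumed rather than taken from the patch:

    -- datanodes ranked by name; this rank is what hash/modulo buckets map to
    SELECT row_number() OVER (ORDER BY node_name) AS node_index, node_name
    FROM pgxc_node
    WHERE node_type = 'D';
    -- with dn1, dn2, dn3 the indexes are 1, 2, 3; if dn3 is replaced by a standby
    -- named dn1a, the same query yields dn1 -> 1, dn1a -> 2, dn2 -> 3, and rows
    -- hashed to a given bucket would now land on a different node.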
From: Michael P. <mic...@gm...> - 2011-10-30 23:29:34
On Mon, Oct 31, 2011 at 6:56 AM, Mason <ma...@us...> wrote:
> I have some feedback and questions from looking at the code. Note, I
> have not been able to get the latest code running yet.
>
> Instead of "TO" in CREATE TABLE ... TO NODE and TO GROUP, I think "ON"
> sounds more correct.

This was the first thought. However, using ON made bison complain about conflicting keywords.

> We may want to consider just renaming HOSTIP to HOST. I don't feel
> that strongly about it though. "HOSTIP" makes it sound like an ip
> address is required. PORT might be enough instead of HOSTPORT.
>
> RELATED (TO) is somewhat awkward. How about defining slaves instead
> as something like:
>
> CREATE NODE datanode1a WITH
>   (NODE SLAVE FOR datanode1
>    HOSTIP = 192.168.1.2,
>    NODEPORT = 15243
>   );
>
> I like the possibility to load balance reads, hence the PREFERRED
> option for which data node in a related group to use, however I think
> coordinators each should still be able to specify where they would
> like to go for load balancing reads for replicated tables, so that
> they can choose to override the default one for a data node, allowing
> them to pick a local instance, for example. Something like:
>
> CREATE NODE COORDINATOR coord1 WITH
>   (HOSTIP = localhost,
>    NODEPORT = 15244,
>    PREFERRED = datanode1)

On each Coordinator is locally stored a version of the pgxc_node catalog. As a next step, I was planning to add an option with host name, preferred and primary in ALTER NODE to be able to modify the data of a node only on a local Coordinator. It would have the shape of an SQL like this:

CREATE/ALTER LOCAL NODE nodename WITH (...)

> I see some comments about sorting the nodes so that modulo evaluates
> to the right node consistently, etc. Is this still safe when, say, a master
> goes down and we promote a standby who has an OID (or name) that changes
> the sorted order?

Yes it is. When a session goes up, it reads the information located in pgxc_node and stores in cache the information related to nodes, sorted by node name. This works as long as the information cached in pooler is not changed. That's why an additional command able to change the data cached in pooler once the information of the catalog has been modified is necessary. Launching this command also forces a refresh of all the other session caches though, and might not be launched if a session has an open connection to a node.

> Also, can you please describe more about how OIDs are used for nodes,
> and how/if they are consistent across the cluster in such a case? (The
> primary coordinator assigns these I assume?)

Oids are used to get the host/port/preferred/primary data. Also, when a session begins, this session reads information from pgxc_node, classifies the node information by sorting Oids with node names, and then uses the sorting order to specify a node index. Have a look at the end of pgxcnode.c with the APIs PGXCNodeGetNodeOid and PGXCNodeGetNodeId. Those ones are the important folks.

> We could think about it in terms of a logical partition/segment having
> an id/oid associated with it, which is the same whether it is on the
> master or one of the slaves. I am not sure if OID in the current
> code here refers to an actual node instance, instead of a logical
> number.

The node index is the same on all the nodes. However the node Oid is different. For example, let's say that you have 2 Coordinators with the following Datanode information:

Coordinator 1:
- dn1, OID = 100
- dn3, OID = 101
- dn2, OID = 102

Coordinator 2:
- dn3, OID = 100
- dn1, OID = 101
- dn2, OID = 102

Each Coordinator will use dn1 with index 1, dn2 with index 2, and dn3 with index 3. When requesting connections from pooler, sessions use those indexes.

> A logical id would allow for making high availability more flexible.
> It would be nice if the connection is only obtained via an id for a
> logical partition, instead of a physical node id that was used during
> planning, in case failover occurred.

It could be possible to change pooler to request connections not from a node index, but from a node name, or from the locally consistent node Oid. I am more a fan of the 2nd option, as node names are unique in the cluster.

> I see some stuff using nodestrings instead of arrays of nodeIds in
> places. I understand wanting to give the DBA the ability to name these
> nodes, but once things are happening internally, it might be good to
> try and have these use some internal integer id instead for
> efficiency. Admittedly, I don't see a lot of string comparisons going
> on though.

As explained below, the node names are used only to sort the node information in the cache of pgxcnode.c.

> Does GTM need to worry about slaves?

GTM does not need to care about slaves, I think. This is another discussion though.

> If not, why not just register a logical partition id instead of a node
> name for data nodes?

And why does Postgres not use a partition ID when identifying slaves for syncrep? Names look more flexible.

> How would (outside directed) failover occur if the standby has a different
> name?
>
> Does GTM insist on the standby having the same name if the primary
> disconnects and the standby connects?

I am not sure about the internal mechanisms used by GTM-Standby when registering on GTM. Suzuki-san will be better placed than me to answer.

Regards,
--
Michael Paquier
http://michael.otacoo.com
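The "additional command" described above, which pushes fresh pgxc_node contents into the pooler cache, later surfaced in Postgres-XC as a function along these lines; the name is given from memory and is shown only to illustrate the idea, not as part of the patch under review:

    -- refresh the pooler's cached node information after pgxc_node has changed
    SELECT pgxc_pool_reload();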
From: Mason S. <mas...@gm...> - 2011-10-30 22:45:44
On Sun, Oct 30, 2011 at 5:52 PM, Michael Paquier <mic...@gm...> wrote:
> On Mon, Oct 31, 2011 at 6:42 AM, Mason Sharp <mas...@gm...> wrote:
>> I am having problems getting the latest code to work since node DDL was
>> added.
>>
>> I ensure that I have set pgxc_node_name in all of the config files.
>>
>> I start up the 2 data nodes and 1 coordinator I have so far. I assume
>> that the next step would be to execute CREATE NODE statements to wire
>> things up. (Please let me know if the instructions have now changed.)
>> So, I tried to execute psql connecting to template1, and I get this
>> error:
>>
>> could not connect to database postgres: FATAL: Coordinator cannot
>> identify himself
>>
>> (We might want to change the message to be "itself" instead of "himself",
>> btw.)
>>
>> This occurs in InitMultinodeExecutor. I can see that PGXCNodeName is
>> set correctly, that NumCoords is 1, co_handles[0].nodeoid, but
>> get_pgxc_nodename() must not be returning anything. I stopped
>> debugging to see if there is an easier fix and if there is something else
>> I should configure first. Is there?
>>
>> Also, is that the sequence that needs to be done?: update config
>> files, connect to Coordinator via psql, execute CREATE NODE
>> statements, then the cluster is ready?
>
> For the time being, this functionality is still limited.
> What you need to do to set up your cluster is to fill in cluster_nodes.sql in
> the share directory where you installed your binaries. This file is read
> automatically by initdb.

That worked, thank you.
From: Mason <ma...@us...> - 2011-10-30 21:56:26
I have some feedback and questions from looking at the code. Note, I have not been able to get the latest code running yet.

Instead of "TO" in CREATE TABLE ... TO NODE and TO GROUP, I think "ON" sounds more correct.

We may want to consider just renaming HOSTIP to HOST. I don't feel that strongly about it though. "HOSTIP" makes it sound like an ip address is required. PORT might be enough instead of HOSTPORT.

RELATED (TO) is somewhat awkward. How about defining slaves instead as something like:

CREATE NODE datanode1a WITH
  (NODE SLAVE FOR datanode1
   HOSTIP = 192.168.1.2,
   NODEPORT = 15243
  );

I like the possibility to load balance reads, hence the PREFERRED option for which data node in a related group to use, however I think coordinators each should still be able to specify where they would like to go for load balancing reads for replicated tables, so that they can choose to override the default one for a data node, allowing them to pick a local instance, for example. Something like:

CREATE NODE COORDINATOR coord1 WITH
  (HOSTIP = localhost,
   NODEPORT = 15244,
   PREFERRED = datanode1)

I see some comments about sorting the nodes so that modulo evaluates to the right node consistently, etc. Is this still safe when, say, a master goes down and we promote a standby who has an OID (or name) that changes the sorted order?

Also, can you please describe more about how OIDs are used for nodes, and how/if they are consistent across the cluster in such a case? (The primary coordinator assigns these I assume?)

We could think about it in terms of a logical partition/segment having an id/oid associated with it, which is the same whether it is on the master or one of the slaves. I am not sure if OID in the current code here refers to an actual node instance, instead of a logical number.

A logical id would allow for making high availability more flexible. It would be nice if the connection is only obtained via an id for a logical partition, instead of a physical node id that was used during planning, in case failover occurred.

I see some stuff using nodestrings instead of arrays of nodeIds in places. I understand wanting to give the DBA the ability to name these nodes, but once things are happening internally, it might be good to try and have these use some internal integer id instead for efficiency. Admittedly, I don't see a lot of string comparisons going on though.

Does GTM need to worry about slaves?

If not, why not just register a logical partition id instead of a node name for data nodes?

How would (outside directed) failover occur if the standby has a different name?

Does GTM insist on the standby having the same name if the primary disconnects and the standby connects?

Thanks,

Mason
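To make the modulo point in this message concrete: with hash distribution, a row is routed by taking the hash of the distribution value modulo the number of datanodes, so the mapping is only stable while the node ordering is. A rough illustration using stock PostgreSQL hash functions rather than the actual XC routing code:

    -- 3 datanodes assumed; node_index is the bucket a row would map to
    SELECT id, abs(hashint4(id)) % 3 AS node_index
    FROM generate_series(1, 10) AS t(id);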
From: Michael P. <mic...@gm...> - 2011-10-30 21:52:07
On Mon, Oct 31, 2011 at 6:42 AM, Mason Sharp <mas...@gm...> wrote:
> I am having problems getting the latest code to work since node DDL was
> added.
>
> I ensure that I have set pgxc_node_name in all of the config files.
>
> I start up the 2 data nodes and 1 coordinator I have so far. I assume
> that the next step would be to execute CREATE NODE statements to wire
> things up. (Please let me know if the instructions have now changed.)
> So, I tried to execute psql connecting to template1, and I get this
> error:
>
> could not connect to database postgres: FATAL: Coordinator cannot
> identify himself
>
> (We might want to change the message to be "itself" instead of "himself",
> btw.)
>
> This occurs in InitMultinodeExecutor. I can see that PGXCNodeName is
> set correctly, that NumCoords is 1, co_handles[0].nodeoid, but
> get_pgxc_nodename() must not be returning anything. I stopped
> debugging to see if there is an easier fix and if there is something else
> I should configure first. Is there?
>
> Also, is that the sequence that needs to be done?: update config
> files, connect to Coordinator via psql, execute CREATE NODE
> statements, then the cluster is ready?

For the time being, this functionality is still limited. What you need to do to set up your cluster is to fill in cluster_nodes.sql in the share directory where you installed your binaries. This file is read automatically by initdb. The first cluster configuration is taken from there. As the default configuration in this file is 1Co/2Dn, what you saw is that.

By the next release, I should be able to add 2 additional features:
1) The possibility to choose a custom file instead of cluster_nodes.sql at initdb
2) The possibility to update the connection information in pooler with an additional SQL command based on the information in pgxc_node.

If you have additional ideas, they are of course welcome.

Regards,
--
Michael Paquier
http://michael.otacoo.com
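For reference, a cluster_nodes.sql describing the other nodes of a small cluster might look something like the lines below. This is only a sketch: the option names mirror the ones used elsewhere in this thread, and the node names, hosts and ports are invented.

    CREATE NODE coord2 WITH (HOSTIP = localhost, NODEPORT = 5433);
    CREATE NODE datanode1 WITH (HOSTIP = localhost, NODEPORT = 5434);
    CREATE NODE datanode2 WITH (HOSTIP = localhost, NODEPORT = 5435);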
From: Mason S. <mas...@gm...> - 2011-10-30 21:42:46
I am having problems getting the latest code to work since node DDL was added.

I ensure that I have set pgxc_node_name in all of the config files.

I start up the 2 data nodes and 1 coordinator I have so far. I assume that the next step would be to execute CREATE NODE statements to wire things up. (Please let me know if the instructions have now changed.) So, I tried to execute psql connecting to template1, and I get this error:

could not connect to database postgres: FATAL: Coordinator cannot identify himself

(We might want to change the message to be "itself" instead of "himself", btw.)

This occurs in InitMultinodeExecutor. I can see that PGXCNodeName is set correctly, that NumCoords is 1, co_handles[0].nodeoid, but get_pgxc_nodename() must not be returning anything. I stopped debugging to see if there is an easier fix and if there is something else I should configure first. Is there?

Also, is that the sequence that needs to be done?: update config files, connect to Coordinator via psql, execute CREATE NODE statements, then the cluster is ready?

Thanks,

Mason
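The per-node setting being checked above lives in each node's postgresql.conf; a minimal sketch with illustrative values (only the pgxc_node_name parameter itself is taken from the thread):

    # coordinator's postgresql.conf -- example values only
    pgxc_node_name = 'coord1'    # the name the rest of the cluster knows this node by
    port = 5432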
From: Michael P. <mic...@gm...> - 2011-10-27 04:40:22
Hi all,

pgxc_node_name is used in postgresql.conf for XC so that a node can identify itself. Now that node DDL is in place, I was wondering if it could be possible to remove that from the GUC list and add it in pgxc_node.

There is however some side-effect if this is done. Now pgxc_node_name is used at postmaster startup to register the node on GTM. However, node registration was added a year and a half ago (already) to keep track of nodes connecting to the cluster. But now that nodes are managed through catalogs, it doesn't look that useful.

Adding the node-self identification will require a new column in pgxc_node, let's say called node_isself, using a boolean value. And CREATE/ALTER NODE will be extended with a keyword SELF to identify a node self. This new keyword is applicable on all nodes. Ex:

CREATE NODE coord WITH ( [ SELF | NOT SELF ] )
ALTER NODE coord SET [ SELF | NOT SELF ];

pros:
- remaining connection parameters in the postgresql.conf file are GTM-related
- consistency in node management

cons:
- node registration has to be removed

Any opinions on that?

--
Michael Paquier
http://michael.otacoo.com
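A short sketch of how a node could resolve its own entry once such a column exists; the node_isself name follows the proposal above, everything else is assumed:

    -- hypothetical: self-identification through the catalog instead of pgxc_node_name
    SELECT oid, node_name
    FROM pgxc_node
    WHERE node_isself;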
From: Michael P. <mic...@gm...> - 2011-10-26 05:56:01
I saw that the regression test xc_having is failing with this patch, because of false explain reports. This is not a big matter and please see updated patch attached.

However, the keyword "__FOREIGN_QUERY__" is printed from time to time in tests. Everybody, especially Ashutosh, do you think it's OK like this?

--
Michael Paquier
http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2011-10-26 05:47:06
On Wed, Oct 26, 2011 at 2:42 PM, Michael Paquier <mic...@gm...> wrote:
> Hi,
>
> Could you be more precise with the SQLs you used in this test?
> I get the following result on a remote join for 2 tables.
>
> postgres=# create table aa (a int);
> CREATE TABLE
> postgres=# create table bb (a int);
> CREATE TABLE
> postgres=# explain select * from aa,bb;
>                                        QUERY PLAN
> ----------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.00..2.04 rows=1 width=8)
>    ->  Materialize  (cost=0.00..1.01 rows=1 width=4)
>          ->  Data Node Scan (Node Count [2]) on aa  (cost=0.00..1.01 rows=1000 width=4)
>    ->  Materialize  (cost=0.00..1.01 rows=1 width=4)
>          ->  Data Node Scan (Node Count [2])  (cost=0.00..1.01 rows=1000 width=4)

Oh, OK. The second table name does not appear. Here is the result with your patch:

                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2.04 rows=1 width=8)
   ->  Materialize  (cost=0.00..1.01 rows=1 width=4)
         ->  Data Node Scan (Node Count [2]) on aa  (cost=0.00..1.01 rows=1000 width=4)
   ->  Materialize  (cost=0.00..1.01 rows=1 width=4)
         ->  Data Node Scan (Node Count [2]) on bb  (cost=0.00..1.01 rows=1000 width=4)
(5 rows)

--
Michael Paquier
http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2011-10-26 05:42:14
Hi,

Could you be more precise with the SQLs you used in this test? I get the following result on a remote join for 2 tables.

postgres=# create table aa (a int);
CREATE TABLE
postgres=# create table bb (a int);
CREATE TABLE
postgres=# explain select * from aa,bb;
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..2.04 rows=1 width=8)
   ->  Materialize  (cost=0.00..1.01 rows=1 width=4)
         ->  Data Node Scan (Node Count [2]) on aa  (cost=0.00..1.01 rows=1000 width=4)
   ->  Materialize  (cost=0.00..1.01 rows=1 width=4)
         ->  Data Node Scan (Node Count [2])  (cost=0.00..1.01 rows=1000 width=4)
(5 rows)

On Wed, Oct 26, 2011 at 10:43 AM, xiong wang <wan...@gm...> wrote:
> Hi all,
>
> The bug is as follows:
>
> postgres=# explain select * from j1_tbl, j2_tbl ;
>                                   QUERY PLAN
> -----------------------------------------------------------------------------
>  Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
>    ->  Nested Loop  (cost=2.00..31103.90 rows=2482400 width=48)
>          ->  Broadcast Motion  (cost=2.00..49.40 rows=2140 width=8)
>                Hash Key: ANY_KEY
>                ->  Seq Scan  (cost=0.00..31.40 rows=2140 width=8)
>          ->  Materialize  (cost=0.00..27.40 rows=1160 width=40)
>                ->  Seq Scan on j1_tbl  (cost=0.00..21.60 rows=1160 width=40)
>
> The second rangetable j2_tbl doesn't appear in the plan. The patch
> fixes such a problem.

--
Michael Paquier
http://michael.otacoo.com
From: Koichi S. <ko...@in...> - 2011-10-26 05:40:59
I reviewed the patch. The original code looks strange and the proposed patch looks okay.

Further comments?
---
Koichi

On Wed, 26 Oct 2011 09:43:39 +0800 xiong wang <wan...@gm...> wrote:
> Hi all,
>
> The bug is as follows:
>
> postgres=# explain select * from j1_tbl, j2_tbl ;
>                                   QUERY PLAN
> -----------------------------------------------------------------------------
>  Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
>    ->  Nested Loop  (cost=2.00..31103.90 rows=2482400 width=48)
>          ->  Broadcast Motion  (cost=2.00..49.40 rows=2140 width=8)
>                Hash Key: ANY_KEY
>                ->  Seq Scan  (cost=0.00..31.40 rows=2140 width=8)
>          ->  Materialize  (cost=0.00..27.40 rows=1160 width=40)
>                ->  Seq Scan on j1_tbl  (cost=0.00..21.60 rows=1160 width=40)
>
> The second rangetable j2_tbl doesn't appear in the plan. The patch
> fixes such a problem.
From: Michael P. <mic...@gm...> - 2011-10-26 04:12:29
Hi,

Please find attached an update of the patch that is ready for commit.

Concerning the code changes, I corrected the following issues:
- error messages in EXECUTE DIRECT
- GTM-Proxy problems with node name, code is now stabilized
- removed all the code warnings

Then, I made tests with 2 cluster configurations:
- 2Co/2Dn
- 5Co/5Dn

Regressions, DBT-1 and pgbench ran without issues.

Regards,
--
Michael Paquier
http://michael.otacoo.com
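For readers unfamiliar with the feature named in the first bullet, EXECUTE DIRECT ships a query string to a named node; a rough sketch of its use is below. The exact syntax accepted by this development snapshot may differ, and the node name is invented.

    EXECUTE DIRECT ON (datanode1) 'SELECT count(*) FROM pg_class';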
From: xiong w. <wan...@gm...> - 2011-10-26 01:43:46
Hi all,

The bug is as follows:

postgres=# explain select * from j1_tbl, j2_tbl ;
                                  QUERY PLAN
-----------------------------------------------------------------------------
 Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
   ->  Nested Loop  (cost=2.00..31103.90 rows=2482400 width=48)
         ->  Broadcast Motion  (cost=2.00..49.40 rows=2140 width=8)
               Hash Key: ANY_KEY
               ->  Seq Scan  (cost=0.00..31.40 rows=2140 width=8)
         ->  Materialize  (cost=0.00..27.40 rows=1160 width=40)
               ->  Seq Scan on j1_tbl  (cost=0.00..21.60 rows=1160 width=40)

The second rangetable j2_tbl doesn't appear in the plan. The patch fixes such a problem.
From: Michael P. <mic...@gm...> - 2011-10-25 10:11:39
Hi all,

I am more than happy to announce that a battle is finishing with one of the new functionalities. Here is a stabilized version of the patch for node management in catalogs.

Compared to the NODE_NAME branch in github, the following simplifications are applied:
- No column node_indices in pgxc_node
- No structure ExpArray
- Node information is managed fully by system cache
- Consistency of node information is maintained globally by classifying the nodes by their names
- EXECUTE DIRECT works
- pgxc_prepared_xacts is changed in consequence and works
- CLEAN CONNECTION works
- regressions show no failures
- documentation is updated
- pgbench works correctly
- preferred and primary nodes are supported

Tomorrow I will do additional tests with dbt1 and clean up the code some more.

Have fun.

Regards,
--
Michael Paquier
http://michael.otacoo.com
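A quick way to see the preferred and primary flags mentioned in the list above is to query the node catalog directly; the column names in this sketch are assumed rather than taken from the patch:

    -- assumed column names, for illustration only
    SELECT node_name, nodeis_primary, nodeis_preferred
    FROM pgxc_node;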
From: Devrim G. <de...@gu...> - 2011-10-21 08:54:42
On Thu, 2011-10-20 at 11:37 +0900, Michael Paquier wrote:
> I got a patch for this that looks to make it work better.
> Please see attached.

I still fail to see why you are not selecting a free port, but using specific ports? What if someone is running something on 13013?

--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
From: Abbas B. <abb...@te...> - 2011-10-20 04:22:48
ERROR: Failed to get pooled connections

This normally happens when we have the wrong number of data nodes/coordinators in the configuration file. Please make sure that you did change the number of coordinators to TWO in both the configuration files of the coordinators. I am talking about the num_coordinators parameter in the configuration file. If that does not work, please send all four configuration files; it would help a lot to diagnose the problem.

2011/10/12 Devrim GÜNDÜZ <de...@gu...>
> On Wed, 2011-10-12 at 10:42 +0900, Michael Paquier wrote:
> > I just pushed in repository a new stable branch called REL0_9_6_STABLE on
> > which will be based release 0.9.6.
>
> make check fails:
>
> mkdir ./testtablespace
> ../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule
> ============== removing existing temp installation ==============
> ============== creating temporary installation ==============
> ============== initializing database system ==============
> ============== starting postmaster ==============
> ============== starting GTM process ==============
> running on port 57333 with PID 10638 for Coordinator 1
> running on port 5433 with PID 10647 for Coordinator 2
> running on port 5434 with PID 10656 for Datanode 1
> running on port 5435 with PID 10657 for Datanode 2
> ============== creating database "regression" ==============
> ERROR: Failed to get pooled connections
> command failed: "/home/devrim/Downloads/pgxc-0.9.6/src/test/regress/./tmp_check/install//usr/local/pgxc096/bin/psql" -X -c "CREATE DATABASE \"regression\" TEMPLATE=template0" "postgres"
> pg_ctl: PID file "/home/devrim/Downloads/pgxc-0.9.6/src/test/regress/./tmp_check/data_dn1/postmaster.pid" does not exist
> Is server running?
>
> pg_regress: could not stop postmaster: exit code was 256
> make[2]: *** [check] Error 2
> make[2]: Leaving directory `/home/devrim/Downloads/pgxc-0.9.6/src/test/regress'
> make[1]: *** [check] Error 2
> make[1]: Leaving directory `/home/devrim/Downloads/pgxc-0.9.6/src/test'
> make: *** [check] Error 2
>
> --
> Devrim GÜNDÜZ
> Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
> PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
> Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
> http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
From: Michael P. <mic...@gm...> - 2011-10-20 04:09:38
2011/10/20 Abbas Butt <abb...@te...>
> ERROR: Failed to get pooled connections
>
> This normally happens when we have the wrong number of data nodes/coordinators
> in the configuration file. Please make sure that you did change the number of
> coordinators to TWO in both the configuration files of the coordinators.
> I am talking about the num_coordinators parameter in the configuration file.
> If that does not work, please send all four configuration files; it would
> help a lot to diagnose the problem.

make check uses a 2Co/2Dn configuration. In this case, all the SQL queries are run from Coordinator 1, so Coordinator 2 does nothing except react as a remote node like the Datanodes. num_coordinators is set to 2, so this is OK.

This error is really environment dependent, and as far as I saw, the current implementation of make check does not choose the port numbers of each node very wisely, which may lead to a port overlap, and then nodes cannot start. It looks like there are multiple issues here.

For the error of Devrim, I would think of a port overlap. I am sure that by looking at the log files in src/test/regress/log/postmaster_X.log, you will find that some nodes have not started. The patch I sent before addresses this issue.

For the error of Suzuki-san, this is still mysterious to me, but I would expect this time a GTM or pooler port overlap. Once again, having a look at the log files for nodes and GTM will help find the other issue.

Btw, I created a bug ticket for this problem:
https://sourceforge.net/tracker/?func=detail&aid=3426140&group_id=311227&atid=1310232

Regards,
Michael
From: Koichi S. <koi...@gm...> - 2011-10-20 02:45:13
I tested this with my Ubuntu 10.4. Unfortunately, it failed and the result looks the same. The terminal output is as follows:

------8<------------------------8<--------------
../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule
============== creating temporary installation ==============
============== initializing database system ==============
============== starting postmaster ==============
============== starting GTM process ==============
running on port 57332 with pid 13128 for Coordinator 1
running on port 5433 with pid 13129 for Coordinator 2
running on port 5434 with pid 13130 for Datanode 1
running on port 5435 with pid 13131 for Datanode 2
============== creating database "regression" ==============
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.57332"?
command failed: "/home/common/PGXC/pgxc_head/postgres-xc/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/psql" -X -c "CREATE DATABASE \"regression\" TEMPLATE=template0" "postgres"
pg_ctl: PID file "/home/common/PGXC/pgxc_head/postgres-xc/src/test/regress/./tmp_check/data_co1/postmaster.pid" does not exist
Is server running?

pg_regress: could not stop postmaster: exit code was 256
make[2]: *** [check] Error 2
make[2]: Leaving directory `/home/common/PGXC/pgxc_head/postgres-xc/src/test/regress'
make[1]: *** [check] Error 2
make[1]: Leaving directory `/home/common/PGXC/pgxc_head/postgres-xc/src/test'
make: *** [check] Error 2
[koichi@willey:postgres-xc]$
------>8----------------------->8----------------

----------
Koichi Suzuki

2011/10/20 Michael Paquier <mic...@gm...>:
> Hi,
>
> I got a patch for this that looks to make it work better.
> Please see attached.
> Do you still have errors? I had it working on Fedora.
>
> Regards,
> --
> Michael Paquier
> http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2011-10-20 02:37:18
Hi,

I got a patch for this that looks to make it work better. Please see attached. Do you still have errors? I had it working on Fedora.

Regards,
--
Michael Paquier
http://michael.otacoo.com
From: Koichi S. <koi...@gm...> - 2011-10-19 06:36:33
I tried to use another port number by invoking pg_regress manually, say 10001, and encountered the same problem on my Ubuntu 10.4. What kind of port number selection problem do you suppose? The maximum value of the port number will be 2^16, and the port number used is within this limitation.

Regards;
----------
Koichi Suzuki

2011/10/19 Michael Paquier <mic...@gm...>:
> Hi,
>
> I am looking at this problem and have been finally able to reproduce the
> port problem in a Fedora 15 environment. The problem looks to be indeed in
> the port number selection, but I am expecting some additional issues that
> are environment dependent.
> I may need a couple of additional days though to write a patch for that, but
> the fix looks simple.
>
> Regards,
> --
> Michael Paquier
> http://michael.otacoo.com
From: Michael P. <mic...@gm...> - 2011-10-19 01:45:55
Hi,

I am looking at this problem and have been finally able to reproduce the port problem in a Fedora 15 environment. The problem looks to be indeed in the port number selection, but I am expecting some additional issues that are environment dependent. I may need a couple of additional days though to write a patch for that, but the fix looks simple.

Regards,
--
Michael Paquier
http://michael.otacoo.com
From: Koichi S. <ko...@in...> - 2011-10-13 01:14:02
Michael;

This is exactly what I encountered in my Ubuntu 10.4 (^ ^;)
---
Koichi

On Wed, 12 Oct 2011 11:58:45 +0300 Devrim GÜNDÜZ <de...@gu...> wrote:
> On Wed, 2011-10-12 at 17:29 +0900, Michael Paquier wrote:
> > What is the environment you are using?
> > I have it working on CentOS 5.6 and Ubuntu 10.04.
>
> Fedora 15, gcc 4.6.1.
>
> ..and just duplicated it on Scientific Linux 6.1 (See bottom for more)
>
> ============================================================
> ============== creating temporary installation ==============
> ============== initializing database system ==============
> ============== starting postmaster ==============
> ============== starting GTM process ==============
> running on port 57333 with PID 4883 for Coordinator 1
> running on port 5433 with PID 4884 for Coordinator 2
> running on port 5434 with PID 4886 for Datanode 1
> running on port 5435 with PID 4888 for Datanode 2
> ============== creating database "regression" ==============
> psql: could not connect to server: No such file or directory
>         Is the server running locally and accepting
>         connections on Unix domain socket "/tmp/.s.PGSQL.57333"?
> command failed: "/home/devrim/tmp/pgxc-0.9.6/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/psql" -X -c "CREATE DATABASE \"regression\" TEMPLATE=template0" "postgres"
> pg_ctl: PID file "/home/devrim/tmp/pgxc-0.9.6/src/test/regress/./tmp_check/data_co1/postmaster.pid" does not exist
> Is server running?
>
> pg_regress: could not stop postmaster: exit code was 256
>
> Well, looking at the logs, here is what I have found (symptoms are different on SL and Fedora, though):
>
> * AFAICS XC picks up ports starting with 5433. That is not good, since I
> have already an instance running on 5433 on my SL 6.1 box.
>
> * Apparently postmaster could not start for the Datanodes, because of
> limited shmmax availability on my SL box (even though the message above
> claims that postmaster is running)
>
> * cat postmaster_2.log
> FATAL: could not open lock file "/tmp/.s.PGSQL.5434.lock": Permission denied
>
> This is on my Fedora box, and apparently it is because of the port
> selection again:
>
> ls -al /tmp/.s.PGSQL.5434.lock
> -rw------- 1 postgres postgres 52 Oct 6 03:42 /tmp/.s.PGSQL.5434.lock
>
> Hope these help.
>
> --
> Devrim GÜNDÜZ
> Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
> PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
> Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
> http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz
From: Michael P. <mic...@gm...> - 2011-10-13 00:37:00
Thanks for your time. I think such information will be enough to dig into that.

--
Michael Paquier
http://michael.otacoo.com