From: Nikhil S. <ni...@st...> - 2012-07-13 05:07:15
|
Just a thought. If we have a utility which spews out all of these statements to redistribute a table across node modifications, then we can just wrap them inside a transaction block and run that? Won't it save all of the core changes?

Regards,
Nikhils

On Fri, Jul 13, 2012 at 12:29 AM, Michael Paquier <mic...@gm...> wrote:
> Hi all,
>
> Please find attached an updated patch adding redistribution optimizations
> for replicated tables.
> If the node subset of a replicated table is reduced, the necessary nodes are
> simply truncated.
> If it is increased, a COPY TO is done to fetch the data, and COPY FROM is
> done only on the necessary nodes.
> New regression tests have been added to test that.
>
> Regards,
>
> On Thu, Jul 12, 2012 at 5:30 PM, Michael Paquier <mic...@gm...>
> wrote:
>>
>> OK, here is the mammoth patch: 3000 lines including docs, implementation
>> and regressions.
>> The code has been realigned with current master.
>> This patch introduces the latest thing I am working on: the redistribution
>> command tree planning and execution.
>>
>> As I explained before, a redistribution consists of a series of commands
>> (TRUNCATE, REINDEX, DELETE, COPY FROM, COPY TO) that need to be determined
>> depending on the new and old locator information of the relation. Each
>> action can be done on a subset of nodes.
>> This patch introduces the basic infrastructure of the command tree build
>> and execution.
>> For the time being, redistribution uses only what is called the default
>> command tree consisting of:
>> 1) COPY TO
>> 2) TRUNCATE
>> 3) COPY FROM
>> 4) REINDEX
>> But this structure can be easily completed with more complicated
>> operations.
>> In this patch there is still a small thing missing which is the
>> possibility to launch a COPY FROM on a subset of nodes, particularly useful
>> when redistribution consists of a replicated table whose set of nodes is
>> increased.
>> Compared to the last versions, the impact of redistribution in tablecmds.c
>> is limited.
>>
>> Regards,
>>
>> --
>> Michael Paquier
>> http://michael.otacoo.com
>
> --
> Michael Paquier
> http://michael.otacoo.com
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>

--
StormDB - http://www.stormdb.com
The Database Cloud |
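The replicated-table rule quoted above (truncate the nodes that leave the subset; COPY TO once, then COPY FROM only on the nodes that join it) can be sketched as simple set arithmetic. This is only an illustration under stated assumptions, not the patch's code: the bitmask representation of a node subset and every `demo_*` name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node set: bit i set means datanode i holds a full copy. */
typedef uint32_t demo_nodeset;

/* Hypothetical plan: which nodes need which command. */
typedef struct
{
    demo_nodeset truncate_on;   /* nodes dropped from the subset: TRUNCATE only */
    demo_nodeset copy_from_on;  /* nodes added to the subset: COPY FROM only */
    int          need_copy_to;  /* 1 if the data must first be fetched with COPY TO */
} demo_repl_plan;

/* Plan a change of node subset for a replicated table. */
demo_repl_plan demo_plan_replicated(demo_nodeset old_nodes, demo_nodeset new_nodes)
{
    demo_repl_plan plan;

    /* A replicated table is assumed to live on at least one node already. */
    assert(old_nodes != 0);

    plan.truncate_on  = old_nodes & ~new_nodes;   /* in old, not in new */
    plan.copy_from_on = new_nodes & ~old_nodes;   /* in new, not in old */
    plan.need_copy_to = (plan.copy_from_on != 0); /* fetch only if someone needs it */
    return plan;
}
```

For example, shrinking from nodes {0, 1} to {0} yields a plan that truncates node 1 and copies nothing, while growing from {0} to {0, 1, 2} yields one COPY TO followed by COPY FROM on nodes 1 and 2 only.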
From: Michael P. <mic...@gm...> - 2012-07-13 04:51:07
|
On Fri, Jul 13, 2012 at 11:45 AM, nop <no...@in...> wrote:
> Hi,
>
> Currently HEAD won't build because of a merge conflict. The attached patch
> resolves that.
>
> Cheers,
> Andrew
>

Thanks, this has been fixed:
http://github.com/postgres-xc/postgres-xc/commit/5655bc5
I didn't reuse your patch, as you did not take into account the options "-undefined suppress -flat_namespace" in LINK.shared that are used when compiling code on MacOS.

Regards.
--
Michael Paquier
http://michael.otacoo.com |
From: Ahsan H. <ahs...@en...> - 2012-07-12 07:47:37
|
On Thu, Jul 12, 2012 at 11:08 AM, Koichi Suzuki <koi...@gm...> wrote:
> Hi,
>
> I found that we had syntax change in the commit
> 011b1d7cfec2ebdf4aeb32611e4a3f8ceedb2dc0. Unfortunately, they're not
> upward-compatible and may affect applications, if anybody has ever
> written.
>
> Affected statements are as follows:
>
> EXECUTE DIRECT
> CLEAN CONNECTION
> CREATE TABLE (subset of nodes)
> CREATE TABLE AS
> CREATE NODE GROUP
>
> and now node list has to be enclosed with parenthesis. This is to
> reduce bison shift/reduce conflict. Because the change is in utility
> and ddl, I don't think you have serious problem with this change.
>
> Pgxc_clean needed a change and Michael did it.
>
> Although I understand the background of the change, I'd like to
> suggest to discuss non upward-compatible change in the syntax in this
> mailing list before commit to make sure that there's no serious impact
> and to make the change understood by everyone.
>

Totally agree.

> Regards;
> ----------
> Koichi Suzuki

--
Ahsan Hadi
Snr Director Product Development
EnterpriseDB Corporation
The Enterprise Postgres Company
Phone: +92-51-8358874
Mobile: +92-333-5162114
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb |
From: Michael P. <mic...@gm...> - 2012-07-12 06:44:15
|
On Thu, Jul 12, 2012 at 3:33 PM, Ashutosh Bapat <ash...@en...> wrote:
> Ok. Yes, even I was wondering whether having only the * typedef
> externalised, would allow usage of internal members. Now it's clear it
> doesn't.
>
> Anyway, so we are left with following possibilities,
> 1. Use another structure which is subset of CopyStateData, and copy from
> CopyStateData to this new structure. Your patch takes this approach.
> 2. Use long function signature and be error prone while calling the
> function.
> 3. Write small functions where CopyStateData is defined, to pull the
> required member values. Use CopyState as input value. This increases the
> code again.
> 4. Don't make a remotecopy.c as a separate file and bracket the PGXC only
> functions using #if PGXC.
>
> I would have gone with 4 to save code and unnecessary copy. But I will
> leave it to you to decide finally.
>

Option 4 is not possible unfortunately. Query generation is externalized to use it for redistribution.
Let's go for the last version of the patch.
--
Michael Paquier
http://michael.otacoo.com |
From: Ashutosh B. <ash...@en...> - 2012-07-12 06:33:44
|
Ok. Yes, even I was wondering whether having only the pointer typedef externalised would allow usage of internal members. Now it's clear it doesn't.

Anyway, we are left with the following possibilities:
1. Use another structure which is a subset of CopyStateData, and copy from CopyStateData to this new structure. Your patch takes this approach.
2. Use a long function signature and be error-prone while calling the function.
3. Write small functions where CopyStateData is defined, to pull the required member values. Use CopyState as input value. This increases the code again.
4. Don't make remotecopy.c a separate file, and bracket the PGXC-only functions using #if PGXC.

I would have gone with 4 to save code and an unnecessary copy. But I will leave it to you to decide finally.

On Thu, Jul 12, 2012 at 11:12 AM, Michael Paquier <mic...@gm...> wrote:
>
>>> As I understand it, CopyStateData is private to copy.c not CopyState? So,
>>> one should be able to use CopyState outside copy.c? Am I right?
>>>
>> My mistake, I didn't notice that line in copy.h:
>> typedef struct CopyStateData *CopyState;
>> So yes, it is possible to use CopyState outside it. OK, let me eliminate
>> the unnecessary options.
>> Is there something else? If not, I might directly commit the patch
>> tomorrow.
>>
> Well, I am coming back on that again... CopyState is built as a shadow
> pointer of CopyStateData so the data of CopyStateData cannot be found
> directly in remotecopy.c even by passing a pointer CopyState. So I changed
> once again the patch. It is using the intermediate structure to pass values
> related to remote query generation.
> Suggestions before ultimate commit?
> --
> Michael Paquier
> http://michael.otacoo.com
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company |
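The point about the pointer typedef can be shown in a single file. This is a minimal sketch, not the PostgreSQL or Postgres-XC source: all `Demo*` / `demo_*` names and the two struct members are hypothetical. Code that sees only the typedef can pass the handle around but cannot dereference it, so it needs either accessors (option 3, shown here) or a copy of the values into another structure (option 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* --- What a header would expose (hypothetical names) --- */
struct DemoCopyStateData;                          /* incomplete type */
typedef struct DemoCopyStateData *DemoCopyState;   /* opaque handle */

/* Option 3: small accessors exported next to the private definition. */
const char *demo_copy_relname(DemoCopyState cs);
bool        demo_copy_is_from(DemoCopyState cs);

/* --- What a separate file could do with the handle alone --- */
/* Writing cs->relname here would not compile: the struct body is not
 * visible at this point, only its name. */
int demo_build_remote_copy(DemoCopyState cs, char *buf, size_t len)
{
    assert(cs != NULL && buf != NULL);
    return snprintf(buf, len, "COPY %s %s",
                    demo_copy_relname(cs),
                    demo_copy_is_from(cs) ? "FROM STDIN" : "TO STDOUT");
}

/* --- The private side: definition, constructor and accessors --- */
struct DemoCopyStateData
{
    const char *relname;
    bool        is_from;
};

DemoCopyState demo_copy_begin(const char *relname, bool is_from)
{
    DemoCopyState cs = malloc(sizeof(*cs));

    assert(cs != NULL);
    cs->relname = relname;
    cs->is_from = is_from;
    return cs;
}

const char *demo_copy_relname(DemoCopyState cs) { return cs->relname; }
bool        demo_copy_is_from(DemoCopyState cs) { return cs->is_from; }
```

With one accessor per member the caller never copies data, at the cost of extra functions; the intermediate-structure approach trades those functions for a copy step that must be kept in sync with the private struct.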
From: Ashutosh B. <ash...@en...> - 2012-07-12 06:14:26
|
+1.

On Thu, Jul 12, 2012 at 11:38 AM, Koichi Suzuki <koi...@gm...> wrote:
> Hi,
>
> I found that we had syntax change in the commit
> 011b1d7cfec2ebdf4aeb32611e4a3f8ceedb2dc0. Unfortunately, they're not
> upward-compatible and may affect applications, if anybody has ever
> written.
>
> Affected statements are as follows:
>
> EXECUTE DIRECT
> CLEAN CONNECTION
> CREATE TABLE (subset of nodes)
> CREATE TABLE AS
> CREATE NODE GROUP
>
> and now node list has to be enclosed with parenthesis. This is to
> reduce bison shift/reduce conflict. Because the change is in utility
> and ddl, I don't think you have serious problem with this change.
>
> Pgxc_clean needed a change and Michael did it.
>
> Although I understand the background of the change, I'd like to
> suggest to discuss non upward-compatible change in the syntax in this
> mailing list before commit to make sure that there's no serious impact
> and to make the change understood by everyone.
>
> Regards;
> ----------
> Koichi Suzuki
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company |
From: Koichi S. <koi...@gm...> - 2012-07-12 06:08:43
|
Hi,

I found that we had syntax changes in the commit 011b1d7cfec2ebdf4aeb32611e4a3f8ceedb2dc0. Unfortunately, they're not upward-compatible and may affect applications, if anybody has ever written any.

Affected statements are as follows:

EXECUTE DIRECT
CLEAN CONNECTION
CREATE TABLE (subset of nodes)
CREATE TABLE AS
CREATE NODE GROUP

The node list now has to be enclosed in parentheses. This is to reduce bison shift/reduce conflicts. Because the change is in utility and DDL statements, I don't think you will have a serious problem with it.

pgxc_clean needed a change and Michael did it.

Although I understand the background of the change, I'd like to suggest that we discuss non-upward-compatible syntax changes on this mailing list before commit, to make sure that there's no serious impact and that the change is understood by everyone.

Regards;
----------
Koichi Suzuki |
From: Michael P. <mic...@gm...> - 2012-07-12 04:48:57
|
On Wed, Jul 11, 2012 at 5:20 AM, Andrei Martsinchyk <and...@gm...> wrote:
> Hi,
>
> We have been thoroughly testing the GTM Standby, along with related modules
> like GTM and GTM proxy in order to use them to build a highly available
> database cluster.
> We found and fixed few bugs, implemented few useful features and now we
> are contributing this stuff to the community.
> Please find attached the series of patches. Most important goes first, just
> in case if a less important one is not accepted it would be easier to
> realign remainings.
> Each patch contains a description, brief overview comes below.
> 1-6 fix various bugs, they probably have not been noticed because of lack
> of testing.
> 7 - we found the reconnect procedure in GTM proxy unnecessarily complex.
> If GTM connection was lost it first was trying to reconnect to last known
> GTM for configured period of time, then it waited for a "reconnect" command
> for another configured period of time, after it elapsed the GTM proxy
> entered an invalid state, so attempt to issue "reconnect" command caused
> program crash. We changed this, and now GTM proxy is trying to reconnect to
> GTM infinitely. At any time host and port of the GTM to reconnect to may be
> modified by issuing "reconnect" command. The only remaining configuration
> parameter is the interval of time between reconnection attempt. Indeed, GTM
> proxy is trying to reconnect again immediately after receiving the
> "reconnect" command.
>

7 is now done.

> 8 - we changed format of the gtm.control file to text, so DBA do not need
> a hex editor if it is needed to repair it manually in emergency case; it is
> easier to notice corrupted gtm.control.
>

Hum... I am afraid that patch 8 is not aligned anymore (- -;) as I fixed trailing whitespaces in this area.
--
Michael Paquier
http://michael.otacoo.com |
From: Pavan D. <pav...@gm...> - 2012-07-12 04:40:44
|
On Thu, Jul 12, 2012 at 10:06 AM, Michael Paquier <mic...@gm...> wrote:
>
>> Is it really a material for back branches ? I think that should be
>> limited to just bug fixes.
>>
> Pushing that down to stable branches is not a big deal, and permits to
> avoid unnecessary merge conflicts if we touch the same area of code in the
> future.

I understand that the changes are trivial and wouldn't introduce any new issues. But IMHO, as a practice, we should only check in bug fixes on the stable branches. That would avoid unintentional changes which may change behavior or introduce regressions in the stable releases. IOW, only absolutely must-have changes should go into the stable branches.

Thanks,
Pavan |
From: Michael P. <mic...@gm...> - 2012-07-12 04:36:09
|
On Thu, Jul 12, 2012 at 1:33 PM, Pavan Deolasee <pav...@gm...> wrote:
>
> On Thu, Jul 12, 2012 at 5:50 AM, Michael Paquier <mic...@us...> wrote:
>
>> Project "Postgres-XC".
>>
>> The branch, REL1_0_STABLE has been updated
>> via 9b427c82371d036a9ffad22e535a104909e7b210 (commit)
>> from 18fdb2f0e27e651a55c634e921cb46130a1670b0 (commit)
>>
>> - Log -----------------------------------------------------------------
>>
>> http://postgres-xc.git.sourceforge.net/git/gitweb.cgi?p=postgres-xc/postgres-xc;a=commitdiff;h=9b427c82371d036a9ffad22e535a104909e7b210
>>
>> commit c29739c5e6e717e1352b413d3cf6a7fa7566be53
>> Author: Michael Paquier <mi...@ot...>
>> Date: Thu Jul 12 09:21:54 2012 +0900
>>
>> Remove all the trailing whitespaces in XC backend files
>>
>> This includes locator, XC planner, pool, barrier and node manager.
>> This clean-up should have been done long ago...
>>
> Is it really a material for back branches ? I think that should be limited
> to just bug fixes.
>

Pushing that down to stable branches is not a big deal, and permits to avoid unnecessary merge conflicts if we touch the same area of code in the future.
--
Michael Paquier
http://michael.otacoo.com |
From: Koichi S. <koi...@gm...> - 2012-07-12 04:32:35
|
2012/7/12 Nikhil Sontakke <ni...@st...>:
> Hi Suzuki-San,
>
>> Enclosed is a path of pgxc_monitor to check if gtm, gtm_proxy,
>> coordinator or datanode is running.
>
> AFAICS, gtm_proxy and datanode is not handled still?

Yes, they're handled. To be honest, pgxc_monitor does not distinguish gtm from gtm_proxy, or coordinator from datanode, because each pair uses the same protocol.

>> I first thought to add
>> dedicated protocol
>> to gtm and gtm_proxy for this, but found PQconnectGTM() does many
>> shakehands and it's sufficient to check if gtm/gtm_proxy is running.
>>
>> So there's no modification to the core.
>>
>
> That's nice. Can't we do this same for the GTM proxy too? Just do a
> PQconnectGTM()?

Yes.

> Datanodes will need some more thinking. We can always send in a
> "select 1" to them too just to see if they are up and about. But that
> will unnecessarily consume local xids.

In terms of local xids: because 'select 1' is a read-only transaction, the datanode will not consume local xids.

> OTOH, we can contact it via a coordinator. So in that case we can
> extend the code you added for the coordinator to additionally take in
> a node name. And if it's provided it contacts the datanode otherwise
> it runs a "select 1" on itself or something like that.

This means that we need to know which coordinator is alive. That may not fit an automatic failover system. XC can continue cluster operation even though some of the coordinators have crashed. In this situation, we need a means to detect whether a specific datanode is alive without a coordinator's help.

> Regards,
> Nikhils
>
>> Sorry the patch to filelist.sgmlin includes fix to restrict
>> pgxc_clean, pgxc_ddl and pgxc_monitor only to XC.
>>
>> Regards;
>> ----------
>> Koichi Suzuki
>>
>
> --
> StormDB - http://www.stormdb.com
> The Database Cloud |
From: Koichi S. <koi...@gm...> - 2012-07-12 04:23:16
|
2012/7/11 Michael Paquier <mic...@gm...>:
> 1-6 have been committed.
> I just modified a couple of tiny places. Those were obviously serious bugs,
> and they have been backported to 1.0 branch.
>
> So, I also had a look at the 2 last patches.
>
> On Wed, Jul 11, 2012 at 5:20 AM, Andrei Martsinchyk
> <and...@gm...> wrote:
>>
>> 7 - we found the reconnect procedure in GTM proxy unnecessarily complex.
>> If GTM connection was lost it first was trying to reconnect to last known
>> GTM for configured period of time, then it waited for a "reconnect" command
>> for another configured period of time, after it elapsed the GTM proxy
>> entered an invalid state, so attempt to issue "reconnect" command caused
>> program crash. We changed this, and now GTM proxy is trying to reconnect to
>> GTM infinitely. At any time host and port of the GTM to reconnect to may be
>> modified by issuing "reconnect" command. The only remaining configuration
>> parameter is the interval of time between reconnection attempt. Indeed, GTM
>> proxy is trying to reconnect again immediately after receiving the
>> "reconnect" command.

Hmm, this makes things much simpler while maintaining flexibility for various installations. Okay, we should go ahead.

> This looks simpler, indeed. And I like 2 things here:
> 1) Operation and setup gets better: elimination of 5 GUC params on Proxy
> side. The elimination of count-related parameters and retry really makes
> sense as GTM-proxy is only a connection bridge to GTM.
> 2) Operation gets simpler: GTM-Proxy operation becomes completely
> independent of the wait of a reconnect command, and won't put himself in an
> inconsistent state because of its settings. So you remove 1 degree of
> dependency here!
> I have however one comment regarding the patch:
> 1) In HandleGTMError, we try to reconnect with always the same
> gtm_connect_string set up outside the loop. If we receive a reconnect
> command while being in the loop gtm_connect_string will not get updated with
> the new values given by reconnect command for GTMServerHost and
> GTMServerPortNumber.
> So even while being inside the reconnection loop we will not get the latest
> connection parameters of GTM even if new things have been kicked to Proxy.
> Just to call, reconnect information has the following format (from docs):
> gtm_ctl reconnect -Z gtm_proxy -D datafolder_proxy -o '-s hostname -t
> port_number'
> Won't it make sense to generate gtm_connect_string inside the infinite for
> loop once longjump has been disabled?

I once had the idea of using SIGHUP to reload gtm.conf/gtm_proxy.conf. This could be used for reconnect, as well as for many other on-the-fly configuration changes, as is done in PG. It may end up as a big change to the internal structure, but it may make things much simpler.

>> 8 - we changed format of the gtm.control file to text, so DBA do not need
>> a hex editor if it is needed to repair it manually in emergency case; it is
>> easier to notice corrupted gtm.control.

+1. Nice!

>
> This patch is also fixing whitespaces...
> Could it be possible to fix the format related problems into a separate
> patch such as I could also backport the format related problems in 1.0
> stable?
> The modification of the format of gtm.control is something that can only be
> pushed on master, so we need to divide correctly each thing we are fixing
> here.
>
> On the content of the patch, I don't have any arguments against such a
> feature. It might even facilitate operator's work by allowing him to change
> sequence info manually.
> Some arguments here?
> --
> Michael Paquier
> http://michael.otacoo.com
> |
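The review comment about `gtm_connect_string` being built once outside the retry loop can be illustrated with a small sketch. It is not the GTM proxy code: the `demo_*` names, the `host=... port=...` string format and the fake connector are hypothetical, and the loop is bounded here only so the example terminates, whereas the message above describes an unbounded retry. The key line is the `snprintf` inside the loop, which picks up a target changed by a "reconnect" command between attempts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the proxy's current GTM target. A "reconnect"
 * command is assumed to be able to overwrite them at any time. */
char demo_gtm_host[64] = "gtm1";
int  demo_gtm_port = 6666;

/* Hypothetical connection callback: returns true on success. */
typedef bool (*demo_connect_fn)(const char *conninfo);

/* Retry until try_connect succeeds. Returns the number of attempts used,
 * or -1 if max_attempts attempts all failed. */
int demo_reconnect(demo_connect_fn try_connect, int max_attempts)
{
    char conninfo[128];
    int  attempt;

    assert(try_connect != NULL && max_attempts > 0);
    for (attempt = 1; attempt <= max_attempts; attempt++)
    {
        /* Rebuilt on every pass, so a target changed mid-loop is used. */
        snprintf(conninfo, sizeof(conninfo), "host=%s port=%d",
                 demo_gtm_host, demo_gtm_port);
        if (try_connect(conninfo))
            return attempt;
        /* A real proxy would sleep for the configured interval here. */
    }
    return -1;
}

/* Fake connector for the demonstration: gtm1 is down, and a "reconnect"
 * command pointing at gtm2 arrives during the second attempt. */
int demo_calls = 0;

bool demo_fake_connect(const char *conninfo)
{
    if (++demo_calls == 2)
    {
        strcpy(demo_gtm_host, "gtm2");
        demo_gtm_port = 6667;
    }
    return strcmp(conninfo, "host=gtm2 port=6667") == 0;
}
```

Calling `demo_reconnect(demo_fake_connect, 10)` fails twice against the old target and succeeds on the third attempt. Had the string been built once before the loop, every attempt would keep using `gtm1` and the call would return -1.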
From: Nikhil S. <ni...@st...> - 2012-07-11 21:28:10
|
Hi Suzuki-San,

> Enclosed is a path of pgxc_monitor to check if gtm, gtm_proxy,
> coordinator or datanode is running.

AFAICS, gtm_proxy and datanode are still not handled?

> I first thought to add
> dedicated protocol
> to gtm and gtm_proxy for this, but found PQconnectGTM() does many
> shakehands and it's sufficient to check if gtm/gtm_proxy is running.
>
> So there's no modification to the core.
>

That's nice. Can't we do the same for the GTM proxy too? Just do a PQconnectGTM()?

Datanodes will need some more thinking. We can always send a "select 1" to them too, just to see if they are up and about. But that will unnecessarily consume local xids.

OTOH, we can contact a datanode via a coordinator. In that case we can extend the code you added for the coordinator to additionally take in a node name. If a node name is provided, it contacts the datanode; otherwise it runs a "select 1" on itself or something like that.

Regards,
Nikhils

> Sorry the patch to filelist.sgmlin includes fix to restrict
> pgxc_clean, pgxc_ddl and pgxc_monitor only to XC.
>
> Regards;
> ----------
> Koichi Suzuki
>

--
StormDB - http://www.stormdb.com
The Database Cloud |
From: Michael P. <mic...@gm...> - 2012-07-11 12:35:46
|
OK thanks for your feedback.

On Wed, Jul 11, 2012 at 9:16 PM, Ashutosh Bapat <ash...@en...> wrote:
>
> On Wed, Jul 11, 2012 at 5:30 PM, Michael Paquier <mic...@gm...> wrote:
>
>> On Wed, Jul 11, 2012 at 8:46 PM, Ashutosh Bapat <ash...@en...> wrote:
>>
>>> On Wed, Jul 11, 2012 at 3:28 PM, Michael Paquier <mic...@gm...> wrote:
>>>
>>>> On 2012/07/11, at 18:43, Ashutosh Bapat <ash...@en...> wrote:
>>>>
>>>> On Wed, Jul 11, 2012 at 3:03 PM, Michael Paquier <mic...@gm...> wrote:
>>>>
>>>>> Ok, understood. In this case, I will recommend two things
>>>>>> 1. Whether an internal COPY generation can use CopyState itself?
>>>>>>
>>>>> Not a good idea... Another idea I have:
>>>>> Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's
>>>>> eliminate it and RemoteCopy_BuildStatement being called with all the
>>>>> arguments it has in. It makes the function a bit longer, but avoids to have
>>>>> to copy all the elements of CopyState into an intermediate structure. What
>>>>> would be tricky to maintain btw.
>>>>> I recall that CreateProcedure in postgres does smth similar.
>>>>>
>>>> That will make the function signature too long and prone to errors if
>>>> arguments get jumbled up in the call (there are 9 members btw). Why can't
>>>> we use CopyState structure with only the options initialised?
>>>>
>>>> CopyState is directly defined in copy.c, and we absolutely shouldn't
>>>> put it out of that file to limit our code effect on Postgres. I had a hard
>>>> time merging 9.1 copy code in xc and don't want to do that again for 9.2.
>>>> So options I see are partially copy that structure into smth else, or
>>>> use a longer function to build query.
>>>> The longer function will be easier to maintain, and is more portable,
>>>> even if it increases the chances of errors when called. Have a look at
>>>> CreateProcedure or CreateFunction and you will see what is really a long
>>>> function!
>>>>
>>> As I understand it, CopyStateData is private to copy.c not CopyState?
>>> So, one should be able to use CopyState outside copy.c? Am I right?
>>>
>> My mistake, I didn't notice that line in copy.h:
>> typedef struct CopyStateData *CopyState;
>> So yes, it is possible to use CopyState outside it. OK, let me eliminate
>> the unnecessary options.
>> Is there something else? If not, I might directly commit the patch
>> tomorrow.
>>
>
> Nothing else.
>
>> --
>> Michael Paquier
>> http://michael.otacoo.com
>>
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
>

--
Michael Paquier
http://michael.otacoo.com |
From: Ashutosh B. <ash...@en...> - 2012-07-11 12:16:37
|
On Wed, Jul 11, 2012 at 5:30 PM, Michael Paquier <mic...@gm...> wrote:
>
> On Wed, Jul 11, 2012 at 8:46 PM, Ashutosh Bapat <ash...@en...> wrote:
>
>> On Wed, Jul 11, 2012 at 3:28 PM, Michael Paquier <mic...@gm...> wrote:
>>
>>> On 2012/07/11, at 18:43, Ashutosh Bapat <ash...@en...> wrote:
>>>
>>> On Wed, Jul 11, 2012 at 3:03 PM, Michael Paquier <mic...@gm...> wrote:
>>>
>>>> Ok, understood. In this case, I will recommend two things
>>>>> 1. Whether an internal COPY generation can use CopyState itself?
>>>>>
>>>> Not a good idea... Another idea I have:
>>>> Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's
>>>> eliminate it and RemoteCopy_BuildStatement being called with all the
>>>> arguments it has in. It makes the function a bit longer, but avoids to have
>>>> to copy all the elements of CopyState into an intermediate structure. What
>>>> would be tricky to maintain btw.
>>>> I recall that CreateProcedure in postgres does smth similar.
>>>>
>>> That will make the function signature too long and prone to errors if
>>> arguments get jumbled up in the call (there are 9 members btw). Why can't
>>> we use CopyState structure with only the options initialised?
>>>
>>> CopyState is directly defined in copy.c, and we absolutely shouldn't put
>>> it out of that file to limit our code effect on Postgres. I had a hard time
>>> merging 9.1 copy code in xc and don't want to do that again for 9.2.
>>> So options I see are partially copy that structure into smth else, or
>>> use a longer function to build query.
>>> The longer function will be easier to maintain, and is more portable,
>>> even if it increases the chances of errors when called. Have a look at
>>> CreateProcedure or CreateFunction and you will see what is really a long
>>> function!
>>>
>> As I understand it, CopyStateData is private to copy.c not CopyState? So,
>> one should be able to use CopyState outside copy.c? Am I right?
>>
> My mistake, I didn't notice that line in copy.h:
> typedef struct CopyStateData *CopyState;
> So yes, it is possible to use CopyState outside it. OK, let me eliminate
> the unnecessary options.
> Is there something else? If not, I might directly commit the patch
> tomorrow.
>

Nothing else.

> --
> Michael Paquier
> http://michael.otacoo.com
>

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company |
From: Michael P. <mic...@gm...> - 2012-07-11 12:01:01
|
On Wed, Jul 11, 2012 at 8:46 PM, Ashutosh Bapat < ash...@en...> wrote: > > > On Wed, Jul 11, 2012 at 3:28 PM, Michael Paquier < > mic...@gm...> wrote: > >> >> On 2012/07/11, at 18:43, Ashutosh Bapat <ash...@en...> >> wrote: >> >> >> >> On Wed, Jul 11, 2012 at 3:03 PM, Michael Paquier < >> mic...@gm...> wrote: >> >>> >>> >>> Ok, understood. In this case, I will recommend two things >>>> 1. Whether an internal COPY generation can use CopyState itself? >>>> >>> Not a good idea... Another idea I have: >>> Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's >>> eliminate it and RemoteCopy_BuildStatement being called with all the >>> arguments it has in. It makes the function a bit longer, but avoids to have >>> to copy all the elements of CopyState into an intermediate structure. What >>> would be tricky to maintain btw. >>> I recall that CreateProcedure in postgres does smth similar. >>> >>> >> That will make the function signature too long and prone to errors if >> arguments get jumbled up in the call (there are 9 members btw). Why can't >> we use CopyState structure with only the options initialised? >> >> CopyState is directly defined in copy.c, and we absolutely shouldn't put >> it out of that file to limit our code effect on Postgres. I had a hard time >> merging 9.1 copy code in xc and don't want to do that again for 9.2. >> So options I see are partially copy that structure into smth else, or use >> a longer function to build query. >> The longer function will be easier to maintain, and is more portable, >> even if it increases the chances of errors when called. Have a look at >> CreateProcedure or CreateFunction and you will see what is really a long >> function! >> >> > As I understand it, CopyStateData is private to copy.c not CopyState? So, > one should be able to use CopyState outside copy.c? Am I right? 
>
My mistake, I didn't notice that line in copy.h:
typedef struct CopyStateData *CopyState;
So yes, it is possible to use CopyState outside it. OK, let me eliminate the unnecessary options.
Is there something else? If not, I might directly commit the patch tomorrow.
--
Michael Paquier
http://michael.otacoo.com |
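For readers following the CopyState point: a typedef of the form quoted from copy.h exposes only a pointer to an incomplete struct, so other files can hold and pass the handle without seeing the struct's members. Below is a minimal sketch of that opaque-pointer pattern in plain C. All names (`DemoStateData`, `DemoState`, `demo_begin`, `demo_delimiter`, `demo_end`) are invented for illustration and are not PostgreSQL or Postgres-XC APIs; `malloc` stands in for whatever allocator the real code uses.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- What a public header would expose (hypothetical names) ---- */

/* Pointer to an incomplete type: callers can store it, not look inside. */
typedef struct DemoStateData *DemoState;

DemoState demo_begin(char delimiter);
char      demo_delimiter(DemoState state);
void      demo_end(DemoState state);

/* ---- What stays private to the implementation file ---- */

struct DemoStateData
{
    char delimiter;     /* only the implementation touches this member */
};

DemoState
demo_begin(char delimiter)
{
    DemoState state = malloc(sizeof(struct DemoStateData));

    if (state != NULL)
        state->delimiter = delimiter;
    return state;
}

char
demo_delimiter(DemoState state)
{
    assert(state != NULL);
    return state->delimiter;
}

void
demo_end(DemoState state)
{
    free(state);
}
```

Code outside the implementation file would only ever call the three functions; adding or renaming members of the struct then does not ripple into callers.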
From: Ashutosh B. <ash...@en...> - 2012-07-11 11:46:32
|
On Wed, Jul 11, 2012 at 3:28 PM, Michael Paquier <mic...@gm...>wrote: > > On 2012/07/11, at 18:43, Ashutosh Bapat <ash...@en...> > wrote: > > > > On Wed, Jul 11, 2012 at 3:03 PM, Michael Paquier < > mic...@gm...> wrote: > >> >> >> Ok, understood. In this case, I will recommend two things >>> 1. Whether an internal COPY generation can use CopyState itself? >>> >> Not a good idea... Another idea I have: >> Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's >> eliminate it and RemoteCopy_BuildStatement being called with all the >> arguments it has in. It makes the function a bit longer, but avoids to have >> to copy all the elements of CopyState into an intermediate structure. What >> would be tricky to maintain btw. >> I recall that CreateProcedure in postgres does smth similar. >> >> > That will make the function signature too long and prone to errors if > arguments get jumbled up in the call (there are 9 members btw). Why can't > we use CopyState structure with only the options initialised? > > CopyState is directly defined in copy.c, and we absolutely shouldn't put > it out of that file to limit our code effect on Postgres. I had a hard time > merging 9.1 copy code in xc and don't want to do that again for 9.2. > So options I see are partially copy that structure into smth else, or use > a longer function to build query. > The longer function will be easier to maintain, and is more portable, even > if it increases the chances of errors when called. Have a look at > CreateProcedure or CreateFunction and you will see what is really a long > function! > > As I understand it, CopyStateData is private to copy.c not CopyState? So, one should be able to use CopyState outside copy.c? Am I right? > > >> 2. If this structure is essential, rename the member variables with some >>> prefix like "rco_", to distinguish those from their CopyState counterparts. 
>>> This helps a lot when one wants to navigate code looking for all usages of >>> a particular member using say cscope or tags. Unfortunately PostgreSQL >>> doesn't use this idea and hence very common names like larg are used in >>> many structures and makes it difficult to find out say all usages of larg >>> in say SetOperationStmt. >>> >> So switching to this method? > I am more a fan of the long function honestly. > > Not that much essential if we use my idea above >> -- >> Michael Paquier >> http://michael.otacoo.com >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Enterprise Postgres Company > > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |
From: Michael P. <mic...@gm...> - 2012-07-11 09:58:33
|
On 2012/07/11, at 18:43, Ashutosh Bapat <ash...@en...> wrote:

> On Wed, Jul 11, 2012 at 3:03 PM, Michael Paquier <mic...@gm...> wrote:
>
> Ok, understood. In this case, I will recommend two things
> 1. Whether an internal COPY generation can use CopyState itself?
> Not a good idea... Another idea I have:
> Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's eliminate it and RemoteCopy_BuildStatement being called with all the arguments it has in. It makes the function a bit longer, but avoids to have to copy all the elements of CopyState into an intermediate structure. What would be tricky to maintain btw.
> I recall that CreateProcedure in postgres does smth similar.
>
> That will make the function signature too long and prone to errors if arguments get jumbled up in the call (there are 9 members btw). Why can't we use CopyState structure with only the options initialised?

CopyState is directly defined in copy.c, and we absolutely shouldn't put it out of that file to limit our code effect on Postgres. I had a hard time merging 9.1 copy code in xc and don't want to do that again for 9.2.
So options I see are partially copy that structure into smth else, or use a longer function to build query.
The longer function will be easier to maintain, and is more portable, even if it increases the chances of errors when called. Have a look at CreateProcedure or CreateFunction and you will see what is really a long function!

> 2. If this structure is essential, rename the member variables with some prefix like "rco_", to distinguish those from their CopyState counterparts. This helps a lot when one wants to navigate code looking for all usages of a particular member using say cscope or tags. Unfortunately PostgreSQL doesn't use this idea and hence very common names like larg are used in many structures and makes it difficult to find out say all usages of larg in say SetOperationStmt.

So switching to this method?
I am more a fan of the long function honestly.

> Not that much essential if we use my idea above
> --
> Michael Paquier
> http://michael.otacoo.com
>
> --
> Best Wishes,
> Ashutosh Bapat
> EntepriseDB Corporation
> The Enterprise Postgres Company
> |
From: Ashutosh B. <ash...@en...> - 2012-07-11 09:43:16
|
On Wed, Jul 11, 2012 at 3:03 PM, Michael Paquier <mic...@gm...> wrote:

>> Ok, understood. In this case, I will recommend two things
>> 1. Whether an internal COPY generation can use CopyState itself?
>>
> Not a good idea... Another idea I have:
> Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's
> eliminate it and RemoteCopy_BuildStatement being called with all the
> arguments it has in. It makes the function a bit longer, but avoids to have
> to copy all the elements of CopyState into an intermediate structure. What
> would be tricky to maintain btw.
> I recall that CreateProcedure in postgres does smth similar.

That will make the function signature too long and prone to errors if arguments get jumbled up in the call (there are 9 members btw). Why can't we use CopyState structure with only the options initialised?

>> 2. If this structure is essential, rename the member variables with some
>> prefix like "rco_", to distinguish those from their CopyState counterparts.
>> This helps a lot when one wants to navigate code looking for all usages of
>> a particular member using say cscope or tags. Unfortunately PostgreSQL
>> doesn't use this idea and hence very common names like larg are used in
>> many structures and makes it difficult to find out say all usages of larg
>> in say SetOperationStmt.
>>
> Not that much essential if we use my idea above
> --
> Michael Paquier
> http://michael.otacoo.com

--
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company |
From: Michael P. <mic...@gm...> - 2012-07-11 09:33:17
|
> Ok, understood. In this case, I will recommend two things
> 1. Whether an internal COPY generation can use CopyState itself?
>
Not a good idea... Another idea I have:
Only RemoteCopy_BuildStatement is using RemoteCopyOptions. So let's eliminate it and RemoteCopy_BuildStatement being called with all the arguments it has in. It makes the function a bit longer, but avoids to have to copy all the elements of CopyState into an intermediate structure. What would be tricky to maintain btw.
I recall that CreateProcedure in postgres does smth similar.

> 2. If this structure is essential, rename the member variables with some
> prefix like "rco_", to distinguish those from their CopyState counterparts.
> This helps a lot when one wants to navigate code looking for all usages of
> a particular member using say cscope or tags. Unfortunately PostgreSQL
> doesn't use this idea and hence very common names like larg are used in
> many structures and makes it difficult to find out say all usages of larg
> in say SetOperationStmt.
>
Not that much essential if we use my idea above
--
Michael Paquier
http://michael.otacoo.com |
From: Ashutosh B. <ash...@en...> - 2012-07-11 09:24:29
|
On Wed, Jul 11, 2012 at 2:43 PM, Michael Paquier <mic...@gm...>wrote: > Please find attached an updated patch, not problems with regressions. > > On Wed, Jul 11, 2012 at 5:11 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> Hi Michael, >> Here are comments for the remote copy refactoring >> 1. The member variable remoteState may be renamed as remoteCopyState. In >> BeginCopy(), however, the renaming becomes important as remoteState can be >> misleading. >> > That's true. Done. > > >> 2. In RemoteCopy_GetRelationLoc use macro LOCATOR_TYPE_REPLICATED rather >> than using literal 'R'. >> > Done. > > >> 3. Not your change, but, GetAnyDataNode is a misleading name, the >> function actually looks for the preferred data nodes as well. So, we should >> use name something like GetPreferredReplicationNode or something like that. >> > Makes perfectly sense, this code portion is pretty old. Done. > > >> 4. In RemoteCopy_GetRelationLoc, we use distribution column name, and >> scan the attribute list for a matching column, to get its attribute number. >> Why don't we use the partAttrNum directly? >> > Done. > > >> 5. The macro APPENDSOFAR is being defined within the function twice. Why >> don't we move it outside of the function by making it parameterised. >> > Done. > > >> 6. What's the advantage of having RemoteCopyOptions as a separate >> structure. It looks like a subset of CopyState. Why can't we use CopyState >> itself instead? Adding this structure seems to have added quite a few lines >> of code and quite of few pallocs underneath strdup, list_copy etc.? > > The goal here is to make the remote query generation independant of COPY > code as it is not really related. Why making it external? There are 2 > reasons which are first to reduce the footprint of XC code on postgres. > Second this structure is necessary to have an easily pluggable COPY query > generation which is not directly plugged inside postgreSQL's copy.c. 
> CopyState is a structure inside copy.c, and I can imagine that postgres > guys are not going to externalize that as well it is not necessary for > their case. However, for our case, this looks more than essential if we > want to improve the data exchange between nodes at an upper level than WAL. > For example, ALTER TABLE builds itself RemoteCopyOptions and uses this > function to generate a COPY query on the chosen options. This query > generated is then used to fetch table data from remote nodes when > necessary. For the time being it is only used with ALTER TABLE, but I am > pretty sure that there will be other cases where those functions could be > used to fetch quickly huge amount of data with COPY protocol. > > Ok, understood. In this case, I will recommend two things 1. Whether an internal COPY generation can use CopyState itself? 2. If this structure is essential, rename the member variables with some prefix like "rco_", to distinguish those from their CopyState counterparts. This helps a lot when one wants to navigate code looking for all usages of a particular member using say cscope or tags. Unfortunately PostgreSQL doesn't use this idea and hence very common names like larg are used in many structures and makes it difficult to find out say all usages of larg in say SetOperationStmt. > Once this patch is committed I will need to realign redistribution core > code. > It shouldn't take long, though. > -- > Michael Paquier > http://michael.otacoo.com > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |
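To make the trade-off in this exchange concrete, here is a small sketch of the "options structure" approach with the suggested `rco_` member prefix: a caller fills a small struct and a single function turns it into COPY statement text. Everything in it is an assumption for illustration; `DemoCopyOptions` and `demo_build_copy_statement` are invented names, the struct has only two members rather than the nine mentioned in the thread, and it is not the actual RemoteCopyOptions or RemoteCopy_BuildStatement code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical options struct; the rco_ prefix makes members easy to grep. */
typedef struct DemoCopyOptions
{
    bool        rco_csv_mode;   /* emit the CSV keyword */
    const char *rco_delim;      /* one-character delimiter, or NULL for default */
} DemoCopyOptions;

/*
 * Write a COPY statement into buf. Returns the statement length, or -1 if
 * the options are invalid or the buffer is too small.
 * NOTE: no identifier or literal quoting is done here; real code would need it.
 */
int
demo_build_copy_statement(char *buf, size_t buflen, const char *relname,
                          bool is_from, const DemoCopyOptions *opts)
{
    int len;

    assert(buf != NULL && relname != NULL && opts != NULL);

    /* COPY accepts only a single one-byte delimiter character. */
    if (opts->rco_delim != NULL && strlen(opts->rco_delim) != 1)
        return -1;

    len = snprintf(buf, buflen, "COPY %s %s%s", relname,
                   is_from ? "FROM STDIN" : "TO STDOUT",
                   opts->rco_csv_mode ? " CSV" : "");
    if (len < 0 || (size_t) len >= buflen)
        return -1;

    if (opts->rco_delim != NULL)
    {
        int extra = snprintf(buf + len, buflen - len,
                             " DELIMITER '%s'", opts->rco_delim);

        if (extra < 0 || (size_t) extra >= buflen - len)
            return -1;
        len += extra;
    }
    return len;
}
```

With a struct, adding an option changes one type and one function; with the long-signature alternative, every call site has to be updated and argument order becomes the main source of mistakes, which is the concern raised in the replies.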
From: Ashutosh B. <ash...@en...> - 2012-07-11 08:11:17
|
Hi Michael,
Here are comments for the remote copy refactoring

1. The member variable remoteState may be renamed as remoteCopyState. In BeginCopy(), however, the renaming becomes important as remoteState can be misleading.
2. In RemoteCopy_GetRelationLoc use macro LOCATOR_TYPE_REPLICATED rather than using literal 'R'.
3. Not your change, but, GetAnyDataNode is a misleading name, the function actually looks for the preferred data nodes as well. So, we should use a name like GetPreferredReplicationNode or something like that.
4. In RemoteCopy_GetRelationLoc, we use distribution column name, and scan the attribute list for a matching column, to get its attribute number. Why don't we use the partAttrNum directly?
5. The macro APPENDSOFAR is being defined within the function twice. Why don't we move it outside of the function by making it parameterised?
6. What's the advantage of having RemoteCopyOptions as a separate structure? It looks like a subset of CopyState. Why can't we use CopyState itself instead? Adding this structure seems to have added quite a few lines of code and quite a few pallocs underneath strdup, list_copy etc.

On Wed, Jul 11, 2012 at 10:33 AM, Michael Paquier <mic...@gm...> wrote:

> Updated patch is attached.
> I attach once again the patch for remote copy, still pending for review.
>
> On Tue, Jul 10, 2012 at 6:41 PM, Ashutosh Bapat <ash...@en...> wrote:
>
>> 1. Please name the variables as local_hashalgorithm instead of
>> hashalgorithm_loc (loc is also used mean the location info e.g Get
>> RelationLocInfo()).
>>
> Done.
>
>> 2. Please rename IsHashDistributable as IsTypeHashDistributable(). Same
>> case with IsModuloDistributable().
>>
> Done.
>
>> 3. In function prologues, add information about input and output
>> variables, esp. in case of GetRelationDistributionItems().
>>
> Done.
>
>> 4. This is not change in this patch, but good if you can accommodate it.
>> At line 1102, there is a switch case, which has action only for a single
>> case, so, it better be replaced with an "if".
>>
> Done.
>
>> 5. SortRelationDistributionNodes, better be a macro, as it's not doing
>> anything but call qsort().
>>
> Don't agree on that. I feel it is clearer to let it as an external
> function as it is used afterwards in a more flexible way by redistribution.
>
>> 6. Following comment doesn't make much sense, please remove it. The
>> executor state at the time of table creation and querying can be completely
>> different. There is no connection
>> 1218 * We should use session data because Executor uses it as
>> well to run
>> 1219 * commands on nodes.
>>
> Done.
>
>> 7. In GetRelationDistributionNodes(), there are three places, node sorting
>> function is called. Instead, you should just nodeoid array at these three
>> places and call the sorting function at the end. In case we need to add
>> another if case in that function, to get array of nodes in some other way,
>> one has to remember to add the call to sort the nodes array, which can be
>> avoided if you add the call to sort function at the end.
>> BuildRelationDistributionNodes() sorts the nodeoids inside it, but you can
>> take that call out of this function.
>>
> Done. Simplifies code.
>
>> 8. Probably not your code but, Function BuildRelationDistributionNodes()
>> does a repalloc() for every new nodeoid it finds. Each repalloc is costly.
>> Instead we can allocate memory large enough to contain all members of the
>> list passed. If there are node repeated (which will be less likely), we
>> will waste a few bytes, but won't be as expensive as calling repalloc().
>>
>
>> 9. All the renamed functions are marked as "extern", do you really need
>> them so? Also, I don't understand why these functions are located in heap.c?
>>
> Yes and yes. Do you remember this patch is a base for redistribution?
> Our code has an essential dependency with a static structure inside heap.c
> classifying the typle attributes called SysAtt. This dependency is really
> important because thanks to that we can check if a chosen distribution
> column is a system column or not. In case it is a distribution column, we
> return an error.
> This makes sufficient reasons to keep this code inside heap.c and heap.h.
>
>> I hope regression is sane.
>
> They are, and testing regressions is one of the first things to do when
> reviewing a patch I believe.
>
> We should honestly move faster on those small reviews, and discuss about
> the core of redistribution which is the real purpose here.
> Thanks.
> --
> Michael Paquier
> http://michael.otacoo.com

--
Best Wishes,
Ashutosh Bapat
EntepriseDB Corporation
The Enterprise Postgres Company |
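Review items 7 and 8 quoted in this message come down to two habits: size the node array once from the input length instead of growing it per element, and sort once at the end rather than in every branch. The sketch below shows that shape under stated assumptions: `build_node_array` and `cmp_oid` are invented names, `Oid` is a local stand-in typedef, a plain C array replaces the PostgreSQL list, and `malloc` stands in for `palloc`. It is not the code of BuildRelationDistributionNodes().

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int Oid;   /* local stand-in for PostgreSQL's Oid type */

/* qsort comparator for Oid values, ascending. */
static int
cmp_oid(const void *a, const void *b)
{
    Oid x = *(const Oid *) a;
    Oid y = *(const Oid *) b;

    return (x > y) - (x < y);
}

/*
 * Copy the distinct values of input[0..count-1] into a new array.
 * The array is allocated once for the worst case (no duplicates), so a few
 * slots may go unused, and it is sorted a single time before returning.
 * Returns NULL (and *num_out = 0) if count <= 0 or allocation fails.
 */
Oid *
build_node_array(const Oid *input, int count, int *num_out)
{
    Oid *nodes;
    int  n = 0;
    int  i, j;

    assert(input != NULL && num_out != NULL);
    *num_out = 0;
    if (count <= 0)
        return NULL;

    nodes = malloc((size_t) count * sizeof(Oid));   /* one allocation only */
    if (nodes == NULL)
        return NULL;

    for (i = 0; i < count; i++)
    {
        /* Linear duplicate check; node lists are expected to be short. */
        for (j = 0; j < n; j++)
            if (nodes[j] == input[i])
                break;
        if (j == n)
            nodes[n++] = input[i];
    }

    qsort(nodes, (size_t) n, sizeof(Oid), cmp_oid); /* single sort at the end */
    *num_out = n;
    return nodes;
}
```

For the input `{30, 10, 20, 10}` this yields three entries in ascending order. The cost of over-allocating is at most one unused slot per duplicate, which is the trade the review proposes in place of repeated reallocation.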
From: Koichi S. <koi...@gm...> - 2012-07-11 07:52:10
|
Hi,

Enclosed is a patch for pgxc_monitor, which checks whether gtm, gtm_proxy, a coordinator or a datanode is running. I first thought of adding a dedicated protocol to gtm and gtm_proxy for this, but found that PQconnectGTM() already performs several handshakes, and that is sufficient to check whether gtm/gtm_proxy is running. So there's no modification to the core.

Sorry, the patch to filelist.sgmlin also includes a fix to restrict pgxc_clean, pgxc_ddl and pgxc_monitor to XC only.

Regards;
----------
Koichi Suzuki |
From: Ashutosh B. <ash...@en...> - 2012-07-11 06:08:10
|
The refactoring patch is good to be committed. On Wed, Jul 11, 2012 at 10:33 AM, Michael Paquier <mic...@gm... > wrote: > Updated patch is attached. > I attach once again the patch for remote copy, still pending for review. > > On Tue, Jul 10, 2012 at 6:41 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> 1. Please name the variables as local_hashalgorithm instead of >> hashalgorithm_loc (loc is also used mean the location info e.g Get >> RelationLocInfo()). >> > Done. > > 2. Please rename IsHashDistributable as IsTypeHashDistributable(). Same >> case with IsModuloDistributable(). >> > Done. > > 3. In function prologues, add information about input and output >> variables, esp. in case of GetRelationDistributionItems(). >> > Done. > > 4. This is not change in this patch, but good if you can accommodate it. >> At line 1102, there is a switch case, which has action only for a single >> case, so, it better be replaced with an "if". >> > Done. > > 5. SortRelationDistributionNodes, better be a macro, as it's not doing >> anything but call qsort(). >> > Don't agree on that. I feel it is clearer to let it as an external > function as it is used afterwards in a more flexible way by redistribution. > > >> 6. Following comment doesn't make much sense, please remove it. The >> executor state at the time of table creation and querying can be completely >> different. There is no connection >> 1218 * We should use session data because Executor uses it as >> well to run >> 1219 * commands on nodes. >> > Done. > > 7. In GetRelationDistributionNodes(), there are three places, node sorting >> function is called. Instead, you should just nodeoid array at these three >> places and call the sorting function at the end. In case we need to add >> another if case in that function, to get array of nodes in some other way, >> one has to remember to add the call to sort the nodes array, which can be >> avoided if you add the call to sort function at the end. 
>> BuildRelationDistributionNodes() sorts the nodeoids inside it, but you can >> take that call out of this function. >> > Done. Simplifies code. > > >> 8. Probably not your code but, Function BuildRelationDistributionNodes() >> does a repalloc() for every new nodeoid it finds. Each repalloc is costly. >> Instead we can allocate memory large enough to contain all members of the >> list passed. If there are node repeated (which will be less likely), we >> will waste a few bytes, but won't be as expensive as calling repalloc(). >> > > >> 9. All the renamed functions are marked as "extern", do you really need >> them so? Also, I don't understand why these functions are located in heap.c? >> > Yes and yes. Do you remember this patch is a base for redistribution? > Our code has an essential dependency with a static structure inside heap.c > classifying the typle attributes called SysAtt. This dependency is really > important because thanks to that we can check if a chosen distribution > column is a system column or not. In case it is a distribution column, we > return an error. > This makes sufficient reasons to keep this code inside heap.c and heap.h. > > I hope regression is sane. > > They are, and testing regressions is one of the first things to do when > reviewing a patch I believe. > > We should honestly move faster on those small reviews, and discuss about > the core of redistribution which is the real purpose here. > Thanks. > -- > Michael Paquier > http://michael.otacoo.com > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |
From: Michael P. <mic...@gm...> - 2012-07-11 01:50:58
|
On Tue, Jul 10, 2012 at 11:15 PM, Nikhil Sontakke <ni...@st...> wrote:

> Hi Shankar,
>
> It is probably fastest if you hack it up for your own use for now. I
> would just keep the list of my coordinators in an array. Then in the
> doConnect() function I would use a rand/srand call and then just mod
> it with the number of servers to get the index into this array. I will
> then use this to get the host/port info. The normal PG community would
> not be interested in such type of a functionality anyways.
>
We should think carefully about any implementation here. We need to do something that is minimally intrusive to Postgres code.
Also, how do we set up the connection information for multiple coordinators?
- External setting file? => seems heavy
- With arguments? => might become hard to read.

>
> Regards,
> Nikhils
>
> On Tue, Jul 10, 2012 at 9:49 AM, Shankar Hariharan
> <har...@ya...> wrote:
> > If no one else is looking at this I can definitely pick it up. Pls let me
> > know.
> >
> > thanks,
> > Shankar
> >
> > ________________________________
> > From: Ashutosh Bapat <ash...@en...>
> > To: Nikhil Sontakke <ni...@st...>
> > Cc: Koichi Suzuki <koi...@gm...>; Shankar Hariharan
> > <har...@ya...>;
> > "pos...@li..."
> > <pos...@li...>
> > Sent: Tuesday, July 10, 2012 7:24 AM
> >
> > Subject: Re: [Postgres-xc-developers] Question on gtm-proxy
> >
> >
> >
> > On Tue, Jul 10, 2012 at 5:46 PM, Nikhil Sontakke <ni...@st...>
> > wrote:
> >
> >> Yes. Although we don't have to care application partitioning based
> >> upon the distribution key, it's a good idea to make all the
> >> coordinator workload as even as possible.
> >>
> >> In the case of DBT-1, we ran several DBT-1 process, each produces
> >> random transaction but goes to specific coordinator.
> >>
> >> I think pgbench can do the similar.
> >>
> >
> > Well, a quick look at pgbench.c suggests that changing the doConnect()
> > function to pick up a random pghost and pgport set whenever it's
> > called should be enough to get this going.
> > > > > > That's good. May be we can pick those in round robin fashion to get > > deterministic results. > > > > > > > > Regards, > > Nikhils > > > >> Regards; > >> ---------- > >> Koichi Suzuki > >> > >> > >> 2012/7/10 Ashutosh Bapat <ash...@en...>: > >>> Hi Shankar, > >>> Will it be possible for you to change the pgbench code to dynamically > >>> fire > >>> on all available coordinators? > >>> > >>> Since we use modified DBT-1 for our benchmarking, we haven't got to the > >>> point where we can modify pg_bench to suite XC. But that's something, > we > >>> will welcome if anybody is interested. > >>> > >>> > >>> On Mon, Jul 9, 2012 at 9:41 PM, Shankar Hariharan > >>> <har...@ya...> wrote: > >>>> > >>>> Thanks Ashutosh. You are right, while running this test i just had > >>>> pgbench > >>>> running against one coordinator. Looks like pgbench by itself may not > be > >>>> an > >>>> apt tool for this kind of testing, I will instead run pgbench's > >>>> underlying > >>>> sql script from cmdline against either coordinators. Thanks for > that > >>>> tip. > >>>> > >>>> I got a lot of input on my problem from a lot of folks on the list, > the > >>>> feedback is much appreciated. Thanks everybody! > >>>> > >>>> On max_prepared_transactions, I will factor in the number of > >>>> coordinators > >>>> and the max_connections on each coordinator while arriving at a > figure. > >>>> Will also try out Koichi Suzuki's suggestion to have multiple NICs on > >>>> the > >>>> GTM. I will post my findings here for the same cluster configuration > as > >>>> before. > >>>> > >>>> thanks, > >>>> Shankar > >>>> > >>>> ________________________________ > >>>> From: Ashutosh Bapat <ash...@en...> > >>>> To: Shankar Hariharan <har...@ya...> > >>>> Cc: "pos...@li..." 
> >>>> <pos...@li...> > >>>> Sent: Sunday, July 8, 2012 11:02 PM > >>>> > >>>> Subject: Re: [Postgres-xc-developers] Question on gtm-proxy > >>>> > >>>> Hi Shankar, > >>>> You have got answers to the prepared transaction problem, I guess. I > >>>> have > >>>> something else below. > >>>> > >>>> On Sat, Jul 7, 2012 at 1:44 AM, Shankar Hariharan > >>>> <har...@ya...> wrote: > >>>> > >>>> As planned I ran some tests using PGBench on this setup : > >>>> > >>>> Node 1 - Coord1, Datanode1, gtm-proxy1 > >>>> Node 2- Coord2, Datanode2, gtm-proxy2 > >>>> Node 3- Datanode3, gtm > >>>> > >>>> I was connecting via Coord1 for these tests: > >>>> - scale factor of 30 used > >>>> - tests run using the following input parameters for pgbench: > >>>> > >>>> > >>>> Try connecting to both the coordinators, it should give you better > >>>> performance, esp, when you are using distributed tables. With > >>>> distributed > >>>> tables, coordinator gets involved in query execution more than that in > >>>> the > >>>> case of replicated tables. So, balancing load across two coordinators > >>>> would > >>>> help. > >>>> > >>>> > >>>> > >>>> Clients Threads Duration Transactions > >>>> 1 1 100 6204 > >>>> 2 2 100 9960 > >>>> 4 4 100 12880 > >>>> 6 6 100 1676 > >>>> > >>>> > >>>> > >>>> 8 > >>>> 8 8 100 19758 > >>>> 10 10 100 21944 > >>>> 12 12 100 20674 > >>>> > >>>> The run went well until the 8 clients. I started seeing errors on 10 > >>>> clients onwards and eventually the 14 client run has been hanging > around > >>>> for > >>>> over an hour now. 
The errors I have been seeing on console are the > >>>> following > >>>> : > >>>> > >>>> pgbench console : > >>>> Client 8 aborted in state 12: ERROR: GTM error, could not obtain > >>>> snapshot > >>>> Client 0 aborted in state 13: ERROR: maximum number of prepared > >>>> transactions reached > >>>> Client 7 aborted in state 13: ERROR: maximum number of prepared > >>>> transactions reached > >>>> Client 11 aborted in state 13: ERROR: maximum number of prepared > >>>> transactions reached > >>>> Client 9 aborted in state 13: ERROR: maximum number of prepared > >>>> transactions reached > >>>> > >>>> node console: > >>>> ERROR: GTM error, could not obtain snapshot > >>>> STATEMENT: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) > >>>> VALUES (253, 26, 1888413, -817, CURRENT_TIMESTAMP); > >>>> ERROR: maximum number of prepared transactions reached > >>>> HINT: Increase max_prepared_transactions (currently 10). > >>>> STATEMENT: PREPARE TRANSACTION 'T201428' > >>>> ERROR: maximum number of prepared transactions reached > >>>> STATEMENT: END; > >>>> ERROR: maximum number of prepared transactions reached > >>>> STATEMENT: END; > >>>> ERROR: maximum number of prepared transactions reached > >>>> STATEMENT: END; > >>>> ERROR: maximum number of prepared transactions reached > >>>> STATEMENT: END; > >>>> ERROR: GTM error, could not obtain snapshot > >>>> STATEMENT: INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) > >>>> VALUES (140, 29, 2416403, -4192, CURRENT_TIMESTAMP); > >>>> > >>>> I was also watching the processes on each node and see the following > for > >>>> the 14 client run: > >>>> > >>>> > >>>> Node1 : > >>>> postgres 25571 10511 0 04:41 ? 00:00:02 postgres: postgres > >>>> postgres ::1(33481) TRUNCATE TABLE waiting > >>>> postgres 25620 11694 0 04:46 ? 00:00:00 postgres: postgres > >>>> postgres pgbench-address (50388) TRUNCATE TABLE > >>>> > >>>> Node2: > >>>> postgres 10979 9631 0 Jul05 ? 
00:00:42 postgres: postgres > >>>> postgres coord1-address(57357) idle in transaction > >>>> > >>>> Node3: > >>>> postgres 20264 9911 0 08:35 ? 00:00:05 postgres: postgres > >>>> postgres coord1-address(51406) TRUNCATE TABLE waiting > >>>> > >>>> > >>>> I was going to restart the processes on all nodes and start over but > did > >>>> not want to lose this data as it could be useful information. > >>>> > >>>> Any explanation on the above issue is much appreciated. I will try the > >>>> next run with a higher value set for max_prepared_transactions. Any > >>>> recommendations for a good value on this front? > >>>> > >>>> thanks, > >>>> Shankar > >>>> > >>>> > >>>> ________________________________ > >>>> From: Shankar Hariharan <har...@ya...> > >>>> To: Ashutosh Bapat <ash...@en...> > >>>> Cc: "pos...@li..." > >>>> <pos...@li...> > >>>> Sent: Friday, July 6, 2012 8:22 AM > >>>> > >>>> Subject: Re: [Postgres-xc-developers] Question on gtm-proxy > >>>> > >>>> Hi Ashutosh, > >>>> I was trying to size the load on a server and was wondering if a GTM > >>>> could be shared w/o much performance overhead between a small number > of > >>>> datanodes and coordinators. I will post my findings here. > >>>> thanks, > >>>> Shankar > >>>> > >>>> ________________________________ > >>>> From: Ashutosh Bapat <ash...@en...> > >>>> To: Shankar Hariharan <har...@ya...> > >>>> Cc: "pos...@li..." > >>>> <pos...@li...> > >>>> Sent: Friday, July 6, 2012 12:25 AM > >>>> Subject: Re: [Postgres-xc-developers] Question on gtm-proxy > >>>> > >>>> Hi Shankar, > >>>> Running gtm-proxy has shown to improve the performance, because it > >>>> lessens > >>>> the load on GTM, by serving requests locally. Why do you want the > >>>> coordinators to connect directly to the GTM? Are you seeing any > >>>> performance > >>>> improvement from doing that? > >>>> > >>>> On Fri, Jul 6, 2012 at 10:08 AM, Shankar Hariharan > >>>> <har...@ya...> wrote: > >>>> > >>>> Follow up to earlier email. 
In the setup described below, can I avoid > >>>> using a gtm-proxy? That is, can I just simply point coordinators to > the > >>>> one > >>>> gtm running on node 3 ? > >>>> My initial plan was to just run the gtm on node 3 then I thought I > could > >>>> try a datanode without a local coordinator which was why I put these > two > >>>> together on node 3. > >>>> thanks, > >>>> Shankar > >>>> > >>>> ________________________________ > >>>> From: Shankar Hariharan <har...@ya...> > >>>> To: "pos...@li..." > >>>> <pos...@li...> > >>>> Sent: Thursday, July 5, 2012 11:35 PM > >>>> Subject: Question on multiple coordinators > >>>> > >>>> Hello, > >>>> > >>>> Am trying out XC 1.0 in the following configuraiton. > >>>> Node 1 - Coord1, Datanode1, gtm-proxy1 > >>>> Node 2- Coord2, Datanode2, gtm-proxy2 > >>>> Node 3- Datanode3, gtm > >>>> > >>>> I setup all nodes but forgot to add Coord1 to Coord2 and vice versa. > In > >>>> addition I missed the pg_hba edit as well. So the first table T1 that > I > >>>> created for distribution from Coord1 was not "visible| from Coord2 but > >>>> was > >>>> on all the data nodes. > >>>> I tried to get Coord2 backinto business in various ways but the first > >>>> table I created refused to show up on Coord2 : > >>>> - edit pg_hba and add node on both coord1 and 2. Then run select > >>>> pgxc_pool_reload(); > >>>> - restart coord 1 and 2 > >>>> - drop node c2 from c1 and c1 from c2 and add them back followed by > >>>> select > >>>> pgxc_pool_reload(); > >>>> > >>>> So I tried to create the same table T1 from Coord2 to observe behavior > >>>> and > >>>> it did not like it clearly as all nodes it "wrote" to reported that > the > >>>> table already existed which was good. At this point I could understand > >>>> that > >>>> Coord2 and Coord1 are not talking alright so I created a new table > from > >>>> coord1 with replication. This table was visible from both now. 
> >>>> > >>>> Question is: should I expect to see the first table, let me call it T1, from Coord2 also after a while? > >>>> > >>>> thanks, > >>>> Shankar > >>>> > >>>> ------------------------------------------------------------------------------ > >>>> Live Security Virtual Conference > >>>> Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > >>>> _______________________________________________ > >>>> Postgres-xc-developers mailing list > >>>> Pos...@li... > >>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > >>>> > >>>> -- > >>>> Best Wishes, > >>>> Ashutosh Bapat > >>>> EnterpriseDB Corporation > >>>> The Enterprise Postgres Company > > > > -- > > StormDB - http://www.stormdb.com > > The Database Cloud > -- Michael Paquier http://michael.otacoo.com |