From: Michael P. <mic...@gm...> - 2013-08-31 08:43:38

On Sat, Aug 31, 2013 at 5:03 AM, Leonard Boyce <le...@si...> wrote:
> Maybe this use-case will bump the feature priority a touch.

The more demand there is, the higher it moves on the developers' priority list. This is not the first request for savepoints I have seen since 1.1beta was released. Suzuki-san?
--
Michael
From: Leonard B. <le...@si...> - 2013-08-30 20:32:35

Hi all,

We've been following XC eagerly since its first announcement and want to start testing it. We had to wait for trigger support, but it seems our only remaining barrier is the lack of savepoint support. I've seen some recent activity on this question and wanted to chime in with our use-case.

We access Pg from Erlang and use savepoints extensively to implement update-insert / insert-update (with loop limits) logic within transactions that require safe modification of other data *prior* to the insert-update / update-insert. Yes, we could move all this into db functions, but that would take a fairly large effort on our part and it would make code management more complex for us.

Maybe this use-case will bump the feature priority a touch.

Kind regards,
Leonard
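[Editor's note: the savepoint-based upsert pattern described above can be sketched in plain SQL. Table and column names here are hypothetical, not from the poster's application.]

```sql
BEGIN;
-- ... other data is modified safely first, inside the same transaction ...

SAVEPOINT try_update;
UPDATE counters SET n = n + 1 WHERE id = 42;
-- If no row matched, attempt the insert. A concurrent insert can raise
-- unique_violation; the savepoint lets the client roll back just this
-- step and retry the UPDATE without aborting the whole transaction:
INSERT INTO counters (id, n) VALUES (42, 1);
-- on unique_violation: ROLLBACK TO SAVEPOINT try_update; retry the UPDATE
RELEASE SAVEPOINT try_update;

COMMIT;
```

The retry loop (with its limit) lives client-side, in Erlang in this case. Without savepoints, the first failed INSERT would abort the entire transaction, including the earlier modifications, which is why savepoint support is the blocking feature here.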
From: Michael P. <mic...@gm...> - 2013-08-30 12:19:39

On Fri, Aug 30, 2013 at 6:06 PM, Nikhil Sontakke <ni...@st...> wrote:
> A small patch to change the log level of one particular message from LOG to
> DEBUG1. In case a standby is configured, for each connection done to the
> GTM, a corresponding call log gets added in the log file.
>
> "Connection established with standby. - 0x2a1dd60"
>
> The log file can grow in size unnecessarily (we have seen 5GB+ sized log
> files!) with these not so useful messages, given that we already log if the
> connection to the standby fails.
>
> This patch applies on both head and REL1.1

+1 for applying that to 1.1 as well. This is not really a bug fix, but it has been an annoyance for ages, and parsing log files of many GBs is always a problem.
--
Michael
From: Michael P. <mic...@gm...> - 2013-08-30 12:17:57

On Fri, Aug 30, 2013 at 4:36 PM, Abbas Butt <abb...@en...> wrote:
> Can you share the query and table structure that you are using to perform
> the test? You can obscure the column/table names if they are part of some
> proprietary application.
> The comment you mentioned in execRemote.c is relevant for DMLs only, but
> you said that yours is a SELECT, isn't it?

Adding the output of EXPLAIN VERBOSE could also help to understand the plan your query is using.
--
Michael
From: Abbas B. <abb...@en...> - 2013-08-30 07:36:55

Can you share the query and table structure that you are using to perform the test? You can obscure the column/table names if they are part of some proprietary application. The comment you mentioned in execRemote.c is relevant for DMLs only, but you said that yours is a SELECT, isn't it?

On Fri, Aug 30, 2013 at 1:30 AM, Matt Warner <MW...@xi...> wrote:
> Short version: it looks like there's a problem with ExecNestLoop and
> ExecProcNode recursively calling each other and getting stuck in a loop
> that never completes.
>
> Details:
>
> I've been experimenting with XC using fairly large tables. In this case
> it's 4 tables, 2 replicated, 2 distributed by hash. The select statement
> is a mere 31 lines long and contains a group by on 2 columns of one of
> the tables. The query never completes, even days later.
>
> I'm using dtrace and a "git clone" version of XC from a few days ago
> compiled with the debug flag (-g). I see that ExecNestLoop and
> ExecProcNode appear to be calling each other heavily, as in thousands of
> times per second. That is, I am seeing stack traces where one calls the
> other, but also vice versa.
>
> In researching further, I see a note in execRemote.c that seems to
> indicate that recursively calling ExecProcNode is happening by design:
>
> /*
>  * The current implementation of DMLs with RETURNING when run on replicated
>  * tables returns row from one of the datanodes. In order to achieve this
>  * ExecProcNode is repeatedly called saving one tuple and rejecting the rest.
>  * Do we have a DML on replicated table with RETURNING?
>  */
>
> I don't know about the accuracy of the debugging and certainly I'm out of
> my element when poring through the XC source code, so my guess as to the
> source of the problem should be questioned.
>
> What additional debugging information can I provide to assist with the
> correct identification and debugging of this problem?
>
> Regards,
>
> Matt
>
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers

--
*Abbas*
Architect
Ph: 92.334.5100153
Skype ID: gabbasb
www.enterprisedb.com
Follow us on Twitter @EnterpriseDB
Visit EnterpriseDB for tutorials, webinars, whitepapers <http://www.enterprisedb.com/resources-community> and more
From: Matt W. <MW...@XI...> - 2013-08-29 20:31:31

Short version: it looks like there's a problem with ExecNestLoop and ExecProcNode recursively calling each other and getting stuck in a loop that never completes.

Details:

I've been experimenting with XC using fairly large tables. In this case it's 4 tables, 2 replicated, 2 distributed by hash. The select statement is a mere 31 lines long and contains a group by on 2 columns of one of the tables. The query never completes, even days later.

I'm using dtrace and a "git clone" version of XC from a few days ago compiled with the debug flag (-g). I see that ExecNestLoop and ExecProcNode appear to be calling each other heavily, as in thousands of times per second. That is, I am seeing stack traces where one calls the other, but also vice versa.

In researching further, I see a note in execRemote.c that seems to indicate that recursively calling ExecProcNode is happening by design:

/*
 * The current implementation of DMLs with RETURNING when run on replicated
 * tables returns row from one of the datanodes. In order to achieve this
 * ExecProcNode is repeatedly called saving one tuple and rejecting the rest.
 * Do we have a DML on replicated table with RETURNING?
 */

I don't know about the accuracy of the debugging and certainly I'm out of my element when poring through the XC source code, so my guess as to the source of the problem should be questioned.

What additional debugging information can I provide to assist with the correct identification and debugging of this problem?

Regards,

Matt

NOTICE OF CONFIDENTIALITY - This material is intended for the use of the individual or entity to which it is addressed, and may contain information that is privileged, confidential and exempt from disclosure under applicable laws. BE FURTHER ADVISED THAT THIS EMAIL MAY CONTAIN PROTECTED HEALTH INFORMATION (PHI). BY ACCEPTING THIS MESSAGE, YOU ACKNOWLEDGE THE FOREGOING, AND AGREE AS FOLLOWS: YOU AGREE TO NOT DISCLOSE TO ANY THIRD PARTY ANY PHI CONTAINED HEREIN, EXCEPT AS EXPRESSLY PERMITTED AND ONLY TO THE EXTENT NECESSARY TO PERFORM YOUR OBLIGATIONS RELATING TO THE RECEIPT OF THIS MESSAGE. If the reader of this email (and attachments) is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. Please notify the sender of the error and delete the e-mail you received. Thank you.
From: Michael P. <mic...@gm...> - 2013-08-29 01:45:48

On Wed, Aug 28, 2013 at 10:18 PM, Afonso Bione <aag...@gm...> wrote:
> Dear Postgres-XC members,
>
> I have this index
> CREATE UNIQUE INDEX ON mdl_backidstemp_baciteite_uix mdl_backup_ids_template
> USING btree (backupid, itemname, itemid)
> and I would like to convert it to the postgres-xc

OK, there are two things here:

1) This CREATE INDEX query is written incorrectly; the index name goes after the keyword INDEX, so it should be written like this:

CREATE UNIQUE INDEX mdl_backidstemp_baciteite_uix ON mdl_backup_ids_template USING btree (backupid, itemname, itemid);

2) If you want to create a unique index like that on table mdl_backup_ids_template, its distribution column needs to be either backupid, itemname or itemid.
--
Michael
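[Editor's note: combining Michael's two points, a definition satisfying the constraint could look like the following sketch. The column types are guesses, since the original table definition was not posted.]

```sql
-- The distribution column (here backupid) must be one of the indexed
-- columns, so Postgres-XC can enforce uniqueness per-datanode:
CREATE TABLE mdl_backup_ids_template (
    backupid bigint       NOT NULL,
    itemname varchar(160) NOT NULL,
    itemid   bigint       NOT NULL
) DISTRIBUTE BY HASH (backupid);

CREATE UNIQUE INDEX mdl_backidstemp_baciteite_uix
    ON mdl_backup_ids_template
    USING btree (backupid, itemname, itemid);
```

Distributing by hash on backupid means all rows sharing a backupid land on the same datanode, so the composite unique constraint can be checked locally on each node.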
From: Afonso B. <aag...@gm...> - 2013-08-28 13:18:51

Dear Postgres-XC members,

I have this index:

CREATE UNIQUE INDEX ON mdl_backidstemp_baciteite_uix mdl_backup_ids_template USING btree (backupid, itemname, itemid)

and I would like to convert it for Postgres-XC, which I want to use in a distributed way.

Best Regards,
Afonso Bione
From: Ashutosh B. <ash...@en...> - 2013-08-27 08:27:17

Sorry, wrong mailing list. It should have been pgsql-hackers. Ignore this mail.

On Tue, Aug 27, 2013 at 12:45 PM, Ashutosh Bapat <ash...@en...> wrote:
> Hi All,
> I want to create a materialized view as the output of a plpgsql function
> returning a set of rows. But that function creates temporary tables and
> thus cannot be used for creating a materialized view, as per the
> documentation at
> http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html:
> "This query will run within a security-restricted operation; in
> particular, calls to functions that themselves create temporary tables
> will fail."
>
> I tried to understand what a "security-restricted operation" is, and
> didn't find any definition of this term or any listing like "these are
> security-restricted operations ...". I am wondering what the other
> restrictions are on queries whose results can be used to create
> materialized views.
>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
From: Ashutosh B. <ash...@en...> - 2013-08-27 07:15:43

Hi All,

I want to create a materialized view as the output of a plpgsql function returning a set of rows. But that function creates temporary tables and thus cannot be used for creating a materialized view, as per the documentation at http://www.postgresql.org/docs/9.3/static/sql-creatematerializedview.html: "This query will run within a security-restricted operation; in particular, calls to functions that themselves create temporary tables will fail."

I tried to understand what a "security-restricted operation" is, and didn't find any definition of this term or any listing like "these are security-restricted operations ...". I am wondering what the other restrictions are on queries whose results can be used to create materialized views.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
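[Editor's note: a minimal reproduction of the restriction quoted above might look like the following sketch; the function and table names are made up for illustration.]

```sql
CREATE FUNCTION report_rows() RETURNS SETOF integer AS $$
BEGIN
    -- the temporary table created inside the function is what triggers
    -- the security-restricted-operation error
    CREATE TEMP TABLE scratch(val integer);
    INSERT INTO scratch VALUES (1), (2);
    RETURN QUERY SELECT val FROM scratch;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM report_rows();   -- works when called on its own

-- The defining query of a materialized view runs as a
-- security-restricted operation, so this is expected to fail:
CREATE MATERIALIZED VIEW report AS SELECT * FROM report_rows();
```

One common workaround is to have the function write into a regular (non-temporary) work table, or to populate an ordinary table and refresh it with INSERT ... SELECT instead of using a materialized view.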
From: Koichi S. <koi...@gm...> - 2013-08-26 01:44:15

I guess option 1 needs the least effort. On the other hand, option 3 will be useful in user functions when we allow more XC-specific functions. If option 1 or 2 can be integrated easily with option 3, I agree to take the option with the least effort.

Regards;
---
Koichi Suzuki

2013/8/23 Ashutosh Bapat <ash...@en...>
> Hi All,
> Consider the following query:
>
> select * from (select avg(val2), val from tab1) a join (select avg(val2),
> val from tab2) b using (val);
>
> where tab1 and tab2 are tables distributed on column val. The query gets
> shipped completely using FQS, but not through the standard planner. The
> standard planner, as of now, doesn't have the ability to ship subquery
> RTEs. I am trying a fix for the same.
>
> If the subquery is completely shippable, i.e. the corresponding rel has a
> subplan with RemoteQuery as the top plan which doesn't have any
> coordinator quals and does not require projection (doesn't have any
> unshippable expressions in the targetlist), it can be used in the same
> fashion as a table on the datanodes, except that fetching data requires
> firing a query, which should be the same as the query constructed in the
> RemoteQuery node. While constructing this query, if there are any
> aggregates in the query, those need to be finalised on the datanode
> (remember, we get transitioned results for aggregates from the datanodes
> by default). In order to specify that the datanode should finalise the
> aggregates, we add a finalisation function to the aggregate. This is done
> during the deparsing phase, using the flag finalise_aggs. This flag was
> earlier in Query, then moved to RemoteQuery to avoid changes to PostgreSQL
> structures. But having it in RemoteQuery implies that the whole query in
> that node should have aggregates finalised. That may not be true once we
> start reducing subquery relations, since in such cases you may not want to
> finalise aggregates in the top query but do want to do that for a
> subquery. Thus it fits to have this switch in the Query structure. Now
> that we have seen one back and forth of this flag, I would like to discuss
> some more options for getting finalised results from the datanode and poll
> which one is best.
>
> (This discussion assumes that readers know that we construct a Query
> structure for the query to be passed to the datanode, and then deparse it.)
>
> 1. Use the finalise_aggs flag in Query and, while deparsing the query, add
> the finalisation function. Less impact on PG code.
>
> 2. Add a final-function node on top of the aggregate nodes in the Query to
> be sent to the datanode, and deparse this Query structure. No change in PG
> code, but we need to add final-function nodes on each aggregate node by
> pulling those from everywhere in the query, so some coding is involved.
> Deparsing is expected to take care of properly constructing the query
> automatically.
>
> 3. Add an aggregate directive like "finalise" in line with other
> directives like "order by", etc. This requires a syntax change in PG,
> which can be invasive.
>
> Any other ideas?
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
From: Koichi S. <koi...@gm...> - 2013-08-26 01:40:38

I committed this to all the releases with an additional comment.

Regards;
---
Koichi Suzuki

2013/8/22 Nikhil Sontakke <ni...@st...>
> Hi,
>
> PFA, patch which applies against PGXC head, rel11, and rel10.
>
> The node_name which gets added via GTM_RegisterPGXCNode does not get
> freed up.
>
> Btw, Andrei's memleak patch should also be committed soon.
>
> Regards,
> Nikhils
> --
> StormDB - http://www.stormdb.com
> The Database Cloud
From: Ashutosh B. <ash...@en...> - 2013-08-23 06:38:52

Hi All,

Consider the following query:

select * from (select avg(val2), val from tab1) a join (select avg(val2), val from tab2) b using (val);

where tab1 and tab2 are tables distributed on column val. The query gets shipped completely using FQS, but not through the standard planner. The standard planner, as of now, doesn't have the ability to ship subquery RTEs. I am trying a fix for the same.

If the subquery is completely shippable, i.e. the corresponding rel has a subplan with RemoteQuery as the top plan which doesn't have any coordinator quals and does not require projection (doesn't have any unshippable expressions in the targetlist), it can be used in the same fashion as a table on the datanodes, except that fetching data requires firing a query, which should be the same as the query constructed in the RemoteQuery node. While constructing this query, if there are any aggregates in the query, those need to be finalised on the datanode (remember, we get transitioned results for aggregates from the datanodes by default). In order to specify that the datanode should finalise the aggregates, we add a finalisation function to the aggregate. This is done during the deparsing phase, using the flag finalise_aggs. This flag was earlier in Query, then moved to RemoteQuery to avoid changes to PostgreSQL structures. But having it in RemoteQuery implies that the whole query in that node should have aggregates finalised. That may not be true once we start reducing subquery relations, since in such cases you may not want to finalise aggregates in the top query but do want to do that for a subquery. Thus it fits to have this switch in the Query structure. Now that we have seen one back and forth of this flag, I would like to discuss some more options for getting finalised results from the datanode and poll which one is best.

(This discussion assumes that readers know that we construct a Query structure for the query to be passed to the datanode, and then deparse it.)

1. Use the finalise_aggs flag in Query and, while deparsing the query, add the finalisation function. Less impact on PG code.

2. Add a final-function node on top of the aggregate nodes in the Query to be sent to the datanode, and deparse this Query structure. No change in PG code, but we need to add final-function nodes on each aggregate node by pulling those from everywhere in the query, so some coding is involved. Deparsing is expected to take care of properly constructing the query automatically.

3. Add an aggregate directive like "finalise" in line with other directives like "order by", etc. This requires a syntax change in PG, which can be invasive.

Any other ideas?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
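[Editor's note: Ashutosh's example leaves the grouping implicit (avg(val2) next to a bare val). With the GROUP BY it implies spelled out, the query that could run entirely datanode-side, aggregates finalised locally, is sketched below. This illustrates the intent of the shipping, not actual XC deparser output.]

```sql
-- Both tables are distributed on val, so each datanode holds complete
-- groups for its share of val values; the whole join, including the
-- finalised avg(), can therefore execute on the datanodes:
SELECT *
FROM (SELECT avg(val2) AS a1, val FROM tab1 GROUP BY val) a
JOIN (SELECT avg(val2) AS a2, val FROM tab2 GROUP BY val) b
USING (val);
```

Without finalisation, each datanode would return only the aggregate's transition state, which the coordinator must then combine; the whole point of the flag under discussion is to tell the datanode to apply the final function itself.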
From: Nikhil S. <ni...@st...> - 2013-08-23 04:12:40

> Accumulating such fixes is very valuable to the project.

Yes indeed. Focusing on performance along with an emphasis on stability is important.

And I can see that you have already committed Andrei's memory leak patch. Sorry about that; for some reason my git tree was reporting up-to-date but did not contain any updates from August! I did a fresh clone from github and can see all the commits properly now.

Regards,
Nikhils

> Regards;
> ---
> Koichi Suzuki
>
> On 2013/08/22, at 20:31, Nikhil Sontakke <ni...@st...> wrote:
>> Hi,
>>
>> PFA, patch which applies against PGXC head, rel11, and rel10.
>>
>> The node_name which gets added via GTM_RegisterPGXCNode does not get
>> freed up.
>>
>> Btw, Andrei's memleak patch should also be committed soon.
>>
>> Regards,
>> Nikhils

--
StormDB - http://www.stormdb.com
The Database Cloud
From: Koichi Suzuki (鈴木 幸市) <ko...@in...> - 2013-08-23 00:56:51

Thanks Nikhil. Accumulating such fixes is very valuable to the project.

Regards;
---
Koichi Suzuki

On 2013/08/22, at 20:31, Nikhil Sontakke <ni...@st...> wrote:
> Hi,
>
> PFA, patch which applies against PGXC head, rel11, and rel10.
>
> The node_name which gets added via GTM_RegisterPGXCNode does not get
> freed up.
>
> Btw, Andrei's memleak patch should also be committed soon.
>
> Regards,
> Nikhils
> --
> StormDB - http://www.stormdb.com
> The Database Cloud
> <small_memleak_gtm.patch>
From: Ahsan H. <ahs...@en...> - 2013-08-22 08:00:49

Congratulations. This is a major achievement.

On Thu, Aug 22, 2013 at 10:13 AM, Koichi Suzuki <koi...@gm...> wrote:
> Please copy this announcement to your channel.
>
> Thank you;
> ---
> Koichi Suzuki
>
> 2013/8/22 Koichi Suzuki <koi...@gm...>
>> Postgres-XC development group is proud to announce the release of
>> Postgres-XC version 1.1. This is the second major release and comes with
>> many useful features.
>>
>> The source tarball is available at
>> http://sourceforge.net/projects/postgres-xc/files/Version_1.1/pgxc-v1.1.tar.gz/download
>> and comes with HTML documentation and man pages.
>>
>> Please visit the project page http://postgres-xc.sourceforge.net and the
>> development page https://sourceforge.net/projects/postgres-xc/ for more
>> materials.
>>
>> New features of Postgres-XC 1.1 include:
>>
>> * Node addition and removal while the Postgres-XC cluster is in operation.
>> * A --restoremode option for pg_ctl to import catalog information from
>>   another coordinator/datanode, used when adding a new node.
>> * A --include-nodes option for pg_dump and pg_dumpall to export node
>>   information as well, mainly for node addition.
>> * A pgxc_lock_for_backup() function to disable DDL while a new node is
>>   being added and the catalog is exported to the new node.
>> * Row TRIGGER support.
>> * RETURNING support.
>> * A pgxc_ctl tool for Postgres-XC cluster configuration and operation
>>   (contrib module).
>> * Backup of the GTM restart point with the CREATE BARRIER statement.
>> * Merge with PostgreSQL 9.2.4.
>> * ALTER TABLE statement to redistribute tables.
>>
>> We also have a number of planner improvements for better performance,
>> such as:
>>
>> * Push down the sorting operation to the datanodes by adding an ORDER BY
>>   clause to the queries sent to the datanodes.
>> * Push down the LIMIT clause to the datanodes.
>> * Push down outer joins to the datanodes.
>> * Improved fast query shipping to ship queries containing subqueries.
>> * Push the GROUP BY clause to the datanodes when there are ORDER BY,
>>   LIMIT and other clauses in the query.
>>
>> It also comes with a number of other improvements and fixes.
>>
>> The group appreciates all the members who provided valuable code and
>> fruitful discussions.
>>
>> Best Regards;
>> ---
>> Koichi Suzuki

--
Ahsan Hadi
Snr Director Product Development
EnterpriseDB Corporation
The Enterprise Postgres Company
Phone: +92-51-8358874
Mobile: +92-333-5162114
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
From: Koichi S. <koi...@gm...> - 2013-08-22 05:13:37

Please copy this announcement to your channel.

Thank you;
---
Koichi Suzuki

2013/8/22 Koichi Suzuki <koi...@gm...>
> Postgres-XC development group is proud to announce the release of
> Postgres-XC version 1.1. This is the second major release and comes with
> many useful features.
>
> The source tarball is available at
> http://sourceforge.net/projects/postgres-xc/files/Version_1.1/pgxc-v1.1.tar.gz/download
> and comes with HTML documentation and man pages.
>
> Please visit the project page http://postgres-xc.sourceforge.net and the
> development page https://sourceforge.net/projects/postgres-xc/ for more
> materials.
>
> New features of Postgres-XC 1.1 include:
>
> * Node addition and removal while the Postgres-XC cluster is in operation.
> * A --restoremode option for pg_ctl to import catalog information from
>   another coordinator/datanode, used when adding a new node.
> * A --include-nodes option for pg_dump and pg_dumpall to export node
>   information as well, mainly for node addition.
> * A pgxc_lock_for_backup() function to disable DDL while a new node is
>   being added and the catalog is exported to the new node.
> * Row TRIGGER support.
> * RETURNING support.
> * A pgxc_ctl tool for Postgres-XC cluster configuration and operation
>   (contrib module).
> * Backup of the GTM restart point with the CREATE BARRIER statement.
> * Merge with PostgreSQL 9.2.4.
> * ALTER TABLE statement to redistribute tables.
>
> We also have a number of planner improvements for better performance,
> such as:
>
> * Push down the sorting operation to the datanodes by adding an ORDER BY
>   clause to the queries sent to the datanodes.
> * Push down the LIMIT clause to the datanodes.
> * Push down outer joins to the datanodes.
> * Improved fast query shipping to ship queries containing subqueries.
> * Push the GROUP BY clause to the datanodes when there are ORDER BY,
>   LIMIT and other clauses in the query.
>
> It also comes with a number of other improvements and fixes.
>
> The group appreciates all the members who provided valuable code and
> fruitful discussions.
>
> Best Regards;
> ---
> Koichi Suzuki
From: Koichi S. <koi...@gm...> - 2013-08-22 04:52:42

Postgres-XC development group is proud to announce the release of Postgres-XC version 1.1. This is the second major release and comes with many useful features.

The source tarball is available at http://sourceforge.net/projects/postgres-xc/files/Version_1.1/pgxc-v1.1.tar.gz/download and comes with HTML documentation and man pages.

Please visit the project page http://postgres-xc.sourceforge.net and the development page https://sourceforge.net/projects/postgres-xc/ for more materials.

New features of Postgres-XC 1.1 include:

* Node addition and removal while the Postgres-XC cluster is in operation.
* A --restoremode option for pg_ctl to import catalog information from another coordinator/datanode, used when adding a new node.
* A --include-nodes option for pg_dump and pg_dumpall to export node information as well, mainly for node addition.
* A pgxc_lock_for_backup() function to disable DDL while a new node is being added and the catalog is exported to the new node.
* Row TRIGGER support.
* RETURNING support.
* A pgxc_ctl tool for Postgres-XC cluster configuration and operation (contrib module).
* Backup of the GTM restart point with the CREATE BARRIER statement.
* Merge with PostgreSQL 9.2.4.
* ALTER TABLE statement to redistribute tables.

We also have a number of planner improvements for better performance, such as:

* Push down the sorting operation to the datanodes by adding an ORDER BY clause to the queries sent to the datanodes.
* Push down the LIMIT clause to the datanodes.
* Push down outer joins to the datanodes.
* Improved fast query shipping to ship queries containing subqueries.
* Push the GROUP BY clause to the datanodes when there are ORDER BY, LIMIT and other clauses in the query.

It also comes with a number of other improvements and fixes.

The group appreciates all the members who provided valuable code and fruitful discussions.

Best Regards;
---
Koichi Suzuki
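[Editor's note: the ORDER BY / LIMIT pushdown items can be illustrated with a simple query; the table below is hypothetical.]

```sql
-- Assuming a distributed table such as:
--   CREATE TABLE measurements (device_id int, reading float8)
--       DISTRIBUTE BY HASH (device_id);
-- each datanode can sort its own rows and return at most 10 of them,
-- leaving the coordinator a cheap merge of pre-sorted streams instead
-- of sorting the whole table itself:
SELECT device_id, reading
FROM measurements
ORDER BY reading DESC
LIMIT 10;
```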
From: Ashutosh B. <ash...@en...> - 2013-08-22 04:10:11
|
On Wed, Aug 21, 2013 at 11:37 PM, Nikhil Sontakke <ni...@st...>wrote: > > > >> To me and to the reader of the code, it should be clear as to why or why >> not (or when or when not) to have this flag set/reset. The comments in the >> code do not say that. They just given one particular case where >> materialisation is not needed and again, we don't know why it's not needed. >> >> > I will try to add more comments. But looks like you are not reading this > mail thread at all :-) > Whether or not I have read this mail thread, the reader of the code is not going to read it. So it's important to have comments in the code. > > I have mentioned multiple times on this thread: > > 1) > Here's the snippet from RemoteQueryNext: > > if (tuplestorestate && !TupIsNull(scanslot)) > tuplestore_puttupleslot( > tuplestorestate, scanslot); > > I printed the current memory context inside this function, it is > ""ExecutorState". This means that the tuple will stay around till the query > is executing in its entirety! For large queries this is bound to cause > issues. > > 2) 1 above means that we need to decide if we need the tuplestore in ALL > cases or not. The code today always materializes. We DO NOT NEED to > materialize if we are not going to rescan this RemoteQuery executor node. > > 3) ISTM, that the tuplestore should only be used for inner nodes of joins. > There's no need to store outer nodes in tuplestores. Also if it's a single > SELECT query, then there's no need to use tuplestore at all as well.It's > possible that there are other cases where we might not need to materialize. > But for now I have tried to optimize the above two cases. Someone who has > dabbled with this code more should try to see if it's needed in other cases > as well. As unnecessarily materializing is a huge performance penalty in > terms of RAM resource usage. > > >> If you have seen the code, we reduce paths and not plans. 
So, when I am >> reducing two paths into one RemoteQuery path, what should happen to this >> flag? How should it be used if it's set in both the paths/only one of the >> path etc.? >> >> > That's why the need to do it early on in the pathing/planning process. If > a [potentially reduced] RemoteQuery node is topmost, then it's not going to > be rescanned, so no need to materialize. > > That's not always true. In case of cursors (esp. random access cursors) even topmost RemoteQuery node needs a rescan. > Additionally even if a [potentially reduced] RemoteQuery node is part of a > join depending on whether it's inner or outer should we materialize it. > > This patch will help in general improvement in single top RemoteQuery node > plans as well as some join cases which is a good start. > > Regards, > Nikhils > > > >> >> On Wed, Aug 21, 2013 at 4:00 PM, Ashutosh Bapat < >> ash...@en...> wrote: >> >>> >>> >>> >>> On Wed, Aug 21, 2013 at 3:54 PM, Nikhil Sontakke <ni...@st...>wrote: >>> >>>> >>>> The patch needs some more comments as to why and when we should >>>>> materialise or not. >>>>> >>>> >>>> The existing code *always* materializes in all cases which can be a >>>> huge performance penalty in some cases. This patch tries to address those >>>> cases. >>>> >>> >>> There are no comments in the code as to WHY we need to re/set this flag >>> wherever set/reset. >>> >>> >>>> >>>> >>>>> Also, I am not so in favour of adding another switch all the way in >>>>> RemoteQuery path. Well, I am always not in favour of adding new switch, >>>>> since it needs to be maintained at various places in code; and if somebody >>>>> forgets to set/reset it somewhere it shows up as bug in entirely different >>>>> areas. For example, if some code forgets to set/reset this switch while >>>>> creating path, it would end up as a bug in executor or a performance >>>>> regression in the executor and would be difficult to trace it back all the >>>>> way to the path creation. 
>>>>> >>>> >>>> This patch sets it to true when we initialize the remote query. It's >>>> not a conditional setting. So it's always set. That ways we avoid causing >>>> havoc in other parts until unless we know about other cases where it's not >>>> needed. >>>> >>> >>> I am talking about some code added in future (and since this is very >>> happening area, we will add code in near future), where the coder or >>> reviewer has to be congnizant that this flag needs to be set/reset at some >>> places. >>> >>>> >>>> >>>>> Is there a way, we can do this based on some other members of >>>>> RemoteQuery or RemoteQueryPath structure? >>>>> >>>>> >>>> I did not find any and this follows the standard percolating >>>> information down from higher nodes to lower nodes strategy. While execution >>>> we rarely have any logic to go to the parent to check for stuff, so this is >>>> pretty consistent with existing code I would say. >>>> >>>> >>> If possible, we do use member of Plan/Parse node/s in execution code. >>> It's not a common practice to percolate the information down through every >>> structure that you come across. >>> >>> >>>> Regards, >>>> Nikhils >>>> >>>> >>>> >>>>> >>>>> On Fri, Aug 16, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...>wrote: >>>>> >>>>>> Duh.. Patch attached :^) >>>>>> >>>>>> >>>>>> On Fri, Aug 16, 2013 at 5:50 PM, Nikhil Sontakke <ni...@st... >>>>>> > wrote: >>>>>> >>>>>>> >>>>>>> Additionally, ISTM, that the tuplestore should only be used for >>>>>>>> inner nodes. There's no need to store outer nodes in tuplestores. Also if >>>>>>>> it's a single SELECT query, then there's no need to use tuplestore at all >>>>>>>> as well. >>>>>>>> >>>>>>>> >>>>>>> PFA, a patch which tries to avoid using the tuplestore in the above >>>>>>> two cases. During planning we decide if a tuplestore should be used for the >>>>>>> RemoteQuery. The default is true, and we set it to false for the above two >>>>>>> cases for now. 
>>>>>>> >>>>>>> I ran regression test cases with and without the patch and got the >>>>>>> exact same set of failures (and more importantly same diffs). >>>>>>> >>>>>>> To be clear this patch is not specific to COPY TO, but it's a >>>>>>> generic change to avoid using tuplestore in certain simple scenarios >>>>>>> thereby reducing the memory footprint of the remote query execution. Note >>>>>>> that it also does not solve Hitoshi-san's COPY FROM issues. Will submit a >>>>>>> separate patch for that. >>>>>>> >>>>>>> Regards, >>>>>>> Nikhils >>>>>>> >>>>>>> >>>>>>> >>>>>>>> Looks like if we can pass hints during plan creation as to whether >>>>>>>> the remote scan is part of a join (and is inner node) or not, then >>>>>>>> accordingly decision can be taken to materialize into the tuplestore. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Nikhils >>>>>>>> >>>>>>>> On Thu, Aug 15, 2013 at 10:43 PM, Nikhil Sontakke < >>>>>>>> ni...@st...> wrote: >>>>>>>> >>>>>>>>> Looks like my theory was wrong, make installcheck is giving more >>>>>>>>> errors with this patch applied. Will have to look at a different solution.. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Nikhils >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Aug 15, 2013 at 2:11 PM, Nikhil Sontakke < >>>>>>>>> ni...@st...> wrote: >>>>>>>>> >>>>>>>>>> So, I looked at this code carefully and ISTM, that because of the >>>>>>>>>> way we fetch the data from the connections and return it immediately inside >>>>>>>>>> RemoteQueryNext, storing it in the tuplestore using tuplestore_puttupleslot >>>>>>>>>> is NOT required at all. >>>>>>>>>> >>>>>>>>>> So, I have removed the call to tuplestore_puttupleslot and things >>>>>>>>>> seem to be ok for me. I guess we should do a FULL test run with this patch >>>>>>>>>> just to ensure that it does not cause issues in any scenarios. >>>>>>>>>> >>>>>>>>>> A careful look by new set of eyes will help here. 
I think, if >>>>>>>>>> there are no issues, this plugs a major leak in the RemoteQuery code path >>>>>>>>>> which is almost always used in our case. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Nikhils >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Aug 14, 2013 at 7:05 PM, Nikhil Sontakke < >>>>>>>>>> ni...@st...> wrote: >>>>>>>>>> >>>>>>>>>>> Using a tuplestore for data coming from RemoteQuery is kinda >>>>>>>>>>> wrong and that's what has introduced this issue. Looks like just changing >>>>>>>>>>> the memory context will not work as it interferes with the other >>>>>>>>>>> functioning of the tuplestore :-| >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Nikhils >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Aug 14, 2013 at 3:07 PM, Ashutosh Bapat < >>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>> >>>>>>>>>>>> Yes, that's correct. >>>>>>>>>>>> >>>>>>>>>>>> My patch was not intended to fix this. This was added while >>>>>>>>>>>> fixing a bug for parameterised quals on RemoteQuery I think. Check commits >>>>>>>>>>>> by Amit in this area. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Aug 14, 2013 at 3:03 PM, Nikhil Sontakke < >>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Yeah, but AFAICS, even 1.1 (and head) *still* has a leak in >>>>>>>>>>>>> it. >>>>>>>>>>>>> >>>>>>>>>>>>> Here's the snippet from RemoteQueryNext: >>>>>>>>>>>>> >>>>>>>>>>>>> if (tuplestorestate && !TupIsNull(scanslot)) >>>>>>>>>>>>> tuplestore_puttupleslot(tuplestorestate, >>>>>>>>>>>>> scanslot); >>>>>>>>>>>>> >>>>>>>>>>>>> I printed the current memory context inside this function, it >>>>>>>>>>>>> is ""ExecutorState". This means that the tuple will stay around till the >>>>>>>>>>>>> query is executing in its entirety! For large COPY queries this is bound to >>>>>>>>>>>>> cause issues as is also reported by Hitoshi san on another thread. 
>>>>>>>>>>>>> >>>>>>>>>>>>> I propose that in RemoteQueryNext, before calling the >>>>>>>>>>>>> tuplestore_puttupleslot we switch into >>>>>>>>>>>>> scan_node->ps.ps_ExprContext's ecxt_per_tuple_memory context. >>>>>>>>>>>>> It will get reset, when the next tuple has to be returned to the caller and >>>>>>>>>>>>> the leak will be curtailed. Thoughts? >>>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> Nikhils >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Aug 14, 2013 at 11:33 AM, Ashutosh Bapat < >>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> There has been an overhaul in the planner (and corresponding >>>>>>>>>>>>>> parts of executor) in 1.1, so it would be better if they move to 1.1 after >>>>>>>>>>>>>> GA. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Aug 14, 2013 at 10:54 AM, Nikhil Sontakke < >>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ah, I see. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I was looking at REL_1_0 sources. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> There are people out there using REL_1_0 as well. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Aug 13, 2013 at 9:52 AM, Ashutosh Bapat < >>>>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> It should be part of 1.1 as well. It was done to support >>>>>>>>>>>>>>>> projection out of RemoteQuery node. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Aug 13, 2013 at 7:17 AM, Nikhil Sontakke < >>>>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi Ashutosh, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I guess you have changed it in pgxc head? I was looking at >>>>>>>>>>>>>>>>> 103 and 11 branches and saw this. In that even ExecRemoteQuery seems to >>>>>>>>>>>>>>>>> have an issue wherein it's not using the appropriate context. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sent from my iPhone >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Aug 12, 2013, at 9:54 AM, Ashutosh Bapat < >>>>>>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Welcome to the mess ;) and enjoy junk food. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Sometime back, I have changed ExecRemoteQuery to be called >>>>>>>>>>>>>>>>> in the same fashion as other Scan nodes. So, you will see ExecRemoteQuery >>>>>>>>>>>>>>>>> calling ExecScan with RemoteQueryNext as the iterator. So, I assume your >>>>>>>>>>>>>>>>> comment pertains to RemoteQueryNext and its minions and not ExecRemoteQuery >>>>>>>>>>>>>>>>> per say! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This code needs a lot of rework, removing duplications, >>>>>>>>>>>>>>>>> using proper way of materialisation, central response handler and error >>>>>>>>>>>>>>>>> handler etc. If we clean up this code, some improvements in planner (like >>>>>>>>>>>>>>>>> using MergeAppend plan) for Sort, will be possible. Regarding >>>>>>>>>>>>>>>>> materialisation, the code uses a linked list for materialising the rows >>>>>>>>>>>>>>>>> from datanodes (in case the same connection needs to be given to other >>>>>>>>>>>>>>>>> remote query node), which must be eating a lot of performance. Instead we >>>>>>>>>>>>>>>>> should be using some kind of tuplestore there. We actually use tuplestore >>>>>>>>>>>>>>>>> (as well) in the RemoteQuery node; the same method can be used. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Sat, Aug 10, 2013 at 10:48 PM, Nikhil Sontakke < >>>>>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Have a Query about ExecRemoteQuery. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The logic seems to have been modeled after ExecMaterial. 
>>>>>>>>>>>>>>>>>> ISTM, that it should have been modeled after ExecScan because we fetch >>>>>>>>>>>>>>>>>> tuples, and those which match the qual should be sent up. ExecMaterial is >>>>>>>>>>>>>>>>>> for materializing and collecting and storing tuples. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Can anyone explain? The reason for asking this is I am >>>>>>>>>>>>>>>>>> suspecting a big memory leak in this code path. We are not using any >>>>>>>>>>>>>>>>>> expression context nor we are freeing up tuples as we scan for the one >>>>>>>>>>>>>>>>>> which qualifies. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>>>>>>> Get 100% visibility into Java/.NET code with AppDynamics >>>>>>>>>>>>>>>>>> Lite! >>>>>>>>>>>>>>>>>> It's a free troubleshooting tool designed for production. >>>>>>>>>>>>>>>>>> Get down to code-level detail for bottlenecks, with <2% >>>>>>>>>>>>>>>>>> overhead. >>>>>>>>>>>>>>>>>> Download for free and get started troubleshooting in >>>>>>>>>>>>>>>>>> minutes. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk >>>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>>>>>>>>> Pos...@li... 
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Best Wishes, >>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>> The Database Cloud >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>> The Database Cloud >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>> The Database Cloud >>>>>>>>> >>>>>>>> >>>>>>>> 
>>>>>>>> >>>>>>>> -- >>>>>>>> StormDB - http://www.stormdb.com >>>>>>>> The Database Cloud >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> StormDB - http://www.stormdb.com >>>>>>> The Database Cloud >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> StormDB - http://www.stormdb.com >>>>>> The Database Cloud >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Best Wishes, >>>>> Ashutosh Bapat >>>>> EntepriseDB Corporation >>>>> The Postgres Database Company >>>>> >>>> >>>> >>>> >>>> -- >>>> StormDB - http://www.stormdb.com >>>> The Database Cloud >>>> >>> >>> >>> >>> -- >>> Best Wishes, >>> Ashutosh Bapat >>> EntepriseDB Corporation >>> The Postgres Database Company >>> >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Postgres Database Company >> > > > > -- > StormDB - http://www.stormdb.com > The Database Cloud > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Postgres Database Company |
From: West, W. <ww...@uc...> - 2013-08-21 21:31:04
|
All, I was successful in finally configuring my 2 node cluster properly and it is communicating on all nodes. Thanks for all the help especially from Michael. Please feel free to close this ticket. Regards, bw On 8/20/13 8:28 PM, "West, William" <ww...@uc...> wrote: >Will do in the future. > >Thanks, > >bw > >On 8/20/13 7:56 PM, "West, William" <ww...@uc...> wrote: > >>Michael, >> >>Here are the results of the two queries: >> >>postgres=# execute direct on data1 'select 1'; >>ERROR: Failed to get pooled connections >> >>postgres=# execute direct on data2 'select 1'; >>ERROR: Failed to get pooled connections >> >>I am pretty sure that I have configured something incorrectly but >>unfortunately I don't have the experience with this product to make it >>viable for our project. I can only guess that the problem might lie in >>the >>postgresql.conf files on the datanodes but my track record in assumptions >>is abysmal. Before I throw in the towel though I wonder if I might impose >>on you to do a quick scan of the attached datanode postgresql.conf files >>and see if anything jumps off the page to you. I have postfixed the >>datanode names for each file so you can distinguish them. >> >>Thanks for your patience, >> >>Bill West >> >> >> >> >> >>On 8/20/13 6:05 PM, "Michael Paquier" <mic...@gm...> wrote: >> >>>(Re-adding pgxc-hackers in cc...) >>> >>>On Wed, Aug 21, 2013 at 9:38 AM, West, William <ww...@uc...> wrote: >>>> This is the result from datanode data1: >>>> >>>> postgres'# execute direct on data2 'select clock_timestamp()'; >>> >>>Datanodes are not able to connect between each other, please connect >>>to a Coordinator and run the following: >>>execute direct on data2 'select 1'; >>>execute direct on data1 'select 1'; >>>> >>>> It returned no result. >>>> >>>> On datanode data2 this was the output: >>>> >>>> postgres=# execute direct on data1 'select clock_timestamp()'; >>>> ERROR: Failed to get pooled connections >>>This is the origin of your problem. 
You cannot connect to Datanode 1. >>>-- >>>Michael >> > |
From: Jure K. <j....@gm...> - 2013-08-21 18:47:16
|
On 19. 08. 2013 03:48, Michael Paquier wrote: > On Tue, Aug 13, 2013 at 5:28 AM, Jure Kobal <j....@gm...> wrote: >> Since Bison 3.0 they stopped support for YYPARSE_PARAM, which is used in >> contrib/cube and contrib/seg. More can be read at: >> http://www.postgresql.org/message-id/736...@ss... >> >> There is a patch for PostgreSQL which fixes this but doesn't apply cleanly on >> Postgres-XC 1.0.3 without some minor changes. >> >> Attached is the patch for 1.0.3. Tested with bison 3.0 and 2.7.1. Both compile >> without errors. > Yes, this is definitely something to be aware of, but -1 for the patch > as this fix has been committed in the Postgres code tree, and XC > should directly pick it up from there by merging all its active > branches with the latest commits of postgres. > Sorry for that. I didn't know about the merge process between postgresql and postgres-xc. Will keep it in mind in the future. -- Regards, Jure |
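For readers hitting the same build failure: Bison 3.0 removed the YYPARSE_PARAM macro, and the upstream fix migrates such grammars to Bison's %parse-param declaration, which passes the extra argument to yyparse() explicitly. A generic sketch of the migration follows — the result type and parameter name are illustrative, not the actual contrib/cube or contrib/seg declarations:

```yacc
/* Bison 2.x style: yyparse()'s extra argument came in through a macro.
   This no longer compiles under Bison 3.0. */
%{
#define YYPARSE_PARAM result
%}

/* Bison 3.0-compatible style: declare the parameter explicitly.
   yyparse() is then generated with the signature
   int yyparse(struct Node **result). */
%parse-param {struct Node **result}
```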
From: Nikhil S. <ni...@st...> - 2013-08-21 18:08:18
|
> To me and to the reader of the code, it should be clear as to why or why
> not (or when or when not) to have this flag set/reset. The comments in the
> code do not say that. They just give one particular case where
> materialisation is not needed and again, we don't know why it's not needed.

I will try to add more comments. But it looks like you are not reading this mail thread at all :-) I have mentioned multiple times on this thread:

1) Here's the snippet from RemoteQueryNext:

if (tuplestorestate && !TupIsNull(scanslot))
    tuplestore_puttupleslot(tuplestorestate, scanslot);

I printed the current memory context inside this function; it is "ExecutorState". This means that the tuple will stay around until the query has executed in its entirety! For large queries this is bound to cause issues.

2) 1 above means that we need to decide whether we need the tuplestore in ALL cases or not. The code today always materializes. We DO NOT NEED to materialize if we are not going to rescan this RemoteQuery executor node.

3) ISTM that the tuplestore should only be used for inner nodes of joins. There's no need to store outer nodes in tuplestores. Also, if it's a single SELECT query, then there's no need to use a tuplestore at all. It's possible that there are other cases where we might not need to materialize, but for now I have tried to optimize the above two cases. Someone who has dabbled with this code more should try to see if it's needed in other cases as well, as unnecessarily materializing is a huge performance penalty in terms of RAM resource usage.

> If you have seen the code, we reduce paths and not plans. So, when I am
> reducing two paths into one RemoteQuery path, what should happen to this
> flag? How should it be used if it's set in both the paths/only one of the
> path etc.?

That's why we need to do it early on in the pathing/planning process.
If a [potentially reduced] RemoteQuery node is topmost, then it's not going to be rescanned, so no need to materialize. Additionally even if a [potentially reduced] RemoteQuery node is part of a join depending on whether it's inner or outer should we materialize it. This patch will help in general improvement in single top RemoteQuery node plans as well as some join cases which is a good start. Regards, Nikhils > > On Wed, Aug 21, 2013 at 4:00 PM, Ashutosh Bapat < > ash...@en...> wrote: > >> >> >> >> On Wed, Aug 21, 2013 at 3:54 PM, Nikhil Sontakke <ni...@st...>wrote: >> >>> >>> The patch needs some more comments as to why and when we should >>>> materialise or not. >>>> >>> >>> The existing code *always* materializes in all cases which can be a huge >>> performance penalty in some cases. This patch tries to address those cases. >>> >> >> There are no comments in the code as to WHY we need to re/set this flag >> wherever set/reset. >> >> >>> >>> >>>> Also, I am not so in favour of adding another switch all the way in >>>> RemoteQuery path. Well, I am always not in favour of adding new switch, >>>> since it needs to be maintained at various places in code; and if somebody >>>> forgets to set/reset it somewhere it shows up as bug in entirely different >>>> areas. For example, if some code forgets to set/reset this switch while >>>> creating path, it would end up as a bug in executor or a performance >>>> regression in the executor and would be difficult to trace it back all the >>>> way to the path creation. >>>> >>> >>> This patch sets it to true when we initialize the remote query. It's not >>> a conditional setting. So it's always set. That ways we avoid causing havoc >>> in other parts until unless we know about other cases where it's not needed. 
>>> >> >> I am talking about some code added in future (and since this is very >> happening area, we will add code in near future), where the coder or >> reviewer has to be congnizant that this flag needs to be set/reset at some >> places. >> >>> >>> >>>> Is there a way, we can do this based on some other members of >>>> RemoteQuery or RemoteQueryPath structure? >>>> >>>> >>> I did not find any and this follows the standard percolating information >>> down from higher nodes to lower nodes strategy. While execution we rarely >>> have any logic to go to the parent to check for stuff, so this is pretty >>> consistent with existing code I would say. >>> >>> >> If possible, we do use member of Plan/Parse node/s in execution code. >> It's not a common practice to percolate the information down through every >> structure that you come across. >> >> >>> Regards, >>> Nikhils >>> >>> >>> >>>> >>>> On Fri, Aug 16, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...>wrote: >>>> >>>>> Duh.. Patch attached :^) >>>>> >>>>> >>>>> On Fri, Aug 16, 2013 at 5:50 PM, Nikhil Sontakke <ni...@st...>wrote: >>>>> >>>>>> >>>>>> Additionally, ISTM, that the tuplestore should only be used for inner >>>>>>> nodes. There's no need to store outer nodes in tuplestores. Also if it's a >>>>>>> single SELECT query, then there's no need to use tuplestore at all as well. >>>>>>> >>>>>>> >>>>>> PFA, a patch which tries to avoid using the tuplestore in the above >>>>>> two cases. During planning we decide if a tuplestore should be used for the >>>>>> RemoteQuery. The default is true, and we set it to false for the above two >>>>>> cases for now. >>>>>> >>>>>> I ran regression test cases with and without the patch and got the >>>>>> exact same set of failures (and more importantly same diffs). 
>>>>>> >>>>>> To be clear this patch is not specific to COPY TO, but it's a generic >>>>>> change to avoid using tuplestore in certain simple scenarios thereby >>>>>> reducing the memory footprint of the remote query execution. Note that it >>>>>> also does not solve Hitoshi-san's COPY FROM issues. Will submit a separate >>>>>> patch for that. >>>>>> >>>>>> Regards, >>>>>> Nikhils >>>>>> >>>>>> >>>>>> >>>>>>> Looks like if we can pass hints during plan creation as to whether >>>>>>> the remote scan is part of a join (and is inner node) or not, then >>>>>>> accordingly decision can be taken to materialize into the tuplestore. >>>>>>> >>>>>>> Regards, >>>>>>> Nikhils >>>>>>> >>>>>>> On Thu, Aug 15, 2013 at 10:43 PM, Nikhil Sontakke < >>>>>>> ni...@st...> wrote: >>>>>>> >>>>>>>> Looks like my theory was wrong, make installcheck is giving more >>>>>>>> errors with this patch applied. Will have to look at a different solution.. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Nikhils >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Aug 15, 2013 at 2:11 PM, Nikhil Sontakke < >>>>>>>> ni...@st...> wrote: >>>>>>>> >>>>>>>>> So, I looked at this code carefully and ISTM, that because of the >>>>>>>>> way we fetch the data from the connections and return it immediately inside >>>>>>>>> RemoteQueryNext, storing it in the tuplestore using tuplestore_puttupleslot >>>>>>>>> is NOT required at all. >>>>>>>>> >>>>>>>>> So, I have removed the call to tuplestore_puttupleslot and things >>>>>>>>> seem to be ok for me. I guess we should do a FULL test run with this patch >>>>>>>>> just to ensure that it does not cause issues in any scenarios. >>>>>>>>> >>>>>>>>> A careful look by new set of eyes will help here. I think, if >>>>>>>>> there are no issues, this plugs a major leak in the RemoteQuery code path >>>>>>>>> which is almost always used in our case. 
>>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Nikhils >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Aug 14, 2013 at 7:05 PM, Nikhil Sontakke < >>>>>>>>> ni...@st...> wrote: >>>>>>>>> >>>>>>>>>> Using a tuplestore for data coming from RemoteQuery is kinda >>>>>>>>>> wrong and that's what has introduced this issue. Looks like just changing >>>>>>>>>> the memory context will not work as it interferes with the other >>>>>>>>>> functioning of the tuplestore :-| >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Nikhils >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Aug 14, 2013 at 3:07 PM, Ashutosh Bapat < >>>>>>>>>> ash...@en...> wrote: >>>>>>>>>> >>>>>>>>>>> Yes, that's correct. >>>>>>>>>>> >>>>>>>>>>> My patch was not intended to fix this. This was added while >>>>>>>>>>> fixing a bug for parameterised quals on RemoteQuery I think. Check commits >>>>>>>>>>> by Amit in this area. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Aug 14, 2013 at 3:03 PM, Nikhil Sontakke < >>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>> >>>>>>>>>>>> Yeah, but AFAICS, even 1.1 (and head) *still* has a leak in it. >>>>>>>>>>>> >>>>>>>>>>>> Here's the snippet from RemoteQueryNext: >>>>>>>>>>>> >>>>>>>>>>>> if (tuplestorestate && !TupIsNull(scanslot)) >>>>>>>>>>>> tuplestore_puttupleslot(tuplestorestate, >>>>>>>>>>>> scanslot); >>>>>>>>>>>> >>>>>>>>>>>> I printed the current memory context inside this function, it >>>>>>>>>>>> is ""ExecutorState". This means that the tuple will stay around till the >>>>>>>>>>>> query is executing in its entirety! For large COPY queries this is bound to >>>>>>>>>>>> cause issues as is also reported by Hitoshi san on another thread. >>>>>>>>>>>> >>>>>>>>>>>> I propose that in RemoteQueryNext, before calling the >>>>>>>>>>>> tuplestore_puttupleslot we switch into >>>>>>>>>>>> scan_node->ps.ps_ExprContext's ecxt_per_tuple_memory context. >>>>>>>>>>>> It will get reset, when the next tuple has to be returned to the caller and >>>>>>>>>>>> the leak will be curtailed. Thoughts? 
>>>>>>>>>>>> >>>>>>>>>>>> Regards, >>>>>>>>>>>> Nikhils >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Aug 14, 2013 at 11:33 AM, Ashutosh Bapat < >>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> There has been an overhaul in the planner (and corresponding >>>>>>>>>>>>> parts of executor) in 1.1, so it would be better if they move to 1.1 after >>>>>>>>>>>>> GA. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Aug 14, 2013 at 10:54 AM, Nikhil Sontakke < >>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Ah, I see. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I was looking at REL_1_0 sources. >>>>>>>>>>>>>> >>>>>>>>>>>>>> There are people out there using REL_1_0 as well. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Aug 13, 2013 at 9:52 AM, Ashutosh Bapat < >>>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> It should be part of 1.1 as well. It was done to support >>>>>>>>>>>>>>> projection out of RemoteQuery node. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Tue, Aug 13, 2013 at 7:17 AM, Nikhil Sontakke < >>>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Ashutosh, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I guess you have changed it in pgxc head? I was looking at >>>>>>>>>>>>>>>> 103 and 11 branches and saw this. In that even ExecRemoteQuery seems to >>>>>>>>>>>>>>>> have an issue wherein it's not using the appropriate context. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Sent from my iPhone >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Aug 12, 2013, at 9:54 AM, Ashutosh Bapat < >>>>>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Welcome to the mess ;) and enjoy junk food. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Sometime back, I have changed ExecRemoteQuery to be called >>>>>>>>>>>>>>>> in the same fashion as other Scan nodes. 
So, you will see ExecRemoteQuery >>>>>>>>>>>>>>>> calling ExecScan with RemoteQueryNext as the iterator. So, I assume your >>>>>>>>>>>>>>>> comment pertains to RemoteQueryNext and its minions and not ExecRemoteQuery >>>>>>>>>>>>>>>> per say! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This code needs a lot of rework, removing duplications, >>>>>>>>>>>>>>>> using proper way of materialisation, central response handler and error >>>>>>>>>>>>>>>> handler etc. If we clean up this code, some improvements in planner (like >>>>>>>>>>>>>>>> using MergeAppend plan) for Sort, will be possible. Regarding >>>>>>>>>>>>>>>> materialisation, the code uses a linked list for materialising the rows >>>>>>>>>>>>>>>> from datanodes (in case the same connection needs to be given to other >>>>>>>>>>>>>>>> remote query node), which must be eating a lot of performance. Instead we >>>>>>>>>>>>>>>> should be using some kind of tuplestore there. We actually use tuplestore >>>>>>>>>>>>>>>> (as well) in the RemoteQuery node; the same method can be used. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Sat, Aug 10, 2013 at 10:48 PM, Nikhil Sontakke < >>>>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Have a Query about ExecRemoteQuery. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The logic seems to have been modeled after ExecMaterial. >>>>>>>>>>>>>>>>> ISTM, that it should have been modeled after ExecScan because we fetch >>>>>>>>>>>>>>>>> tuples, and those which match the qual should be sent up. ExecMaterial is >>>>>>>>>>>>>>>>> for materializing and collecting and storing tuples. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Can anyone explain? The reason for asking this is I am >>>>>>>>>>>>>>>>> suspecting a big memory leak in this code path. We are not using any >>>>>>>>>>>>>>>>> expression context nor we are freeing up tuples as we scan for the one >>>>>>>>>>>>>>>>> which qualifies. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>>>>>> Get 100% visibility into Java/.NET code with AppDynamics >>>>>>>>>>>>>>>>> Lite! >>>>>>>>>>>>>>>>> It's a free troubleshooting tool designed for production. >>>>>>>>>>>>>>>>> Get down to code-level detail for bottlenecks, with <2% >>>>>>>>>>>>>>>>> overhead. >>>>>>>>>>>>>>>>> Download for free and get started troubleshooting in >>>>>>>>>>>>>>>>> minutes. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk >>>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>>>>>>>> Pos...@li... 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>> The Database Cloud >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Best Wishes, >>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>> The Postgres Database Company >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>> The Database Cloud >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>> The Database Cloud >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> StormDB - http://www.stormdb.com >>>>>>>> The Database Cloud >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> StormDB - 
http://www.stormdb.com >>>>>>> The Database Cloud >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> StormDB - http://www.stormdb.com >>>>>> The Database Cloud >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> StormDB - http://www.stormdb.com >>>>> The Database Cloud >>>>> >>>> >>>> >>>> >>>> -- >>>> Best Wishes, >>>> Ashutosh Bapat >>>> EntepriseDB Corporation >>>> The Postgres Database Company >>>> >>> >>> >>> >>> -- >>> StormDB - http://www.stormdb.com >>> The Database Cloud >>> >> >> >> >> -- >> Best Wishes, >> Ashutosh Bapat >> EntepriseDB Corporation >> The Postgres Database Company >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Postgres Database Company > -- StormDB - http://www.stormdb.com The Database Cloud |
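The change Nikhil proposes in the message above amounts to wrapping the tuplestore_puttupleslot call in a memory-context switch so the stored tuple lives in per-tuple memory rather than ExecutorState. The following is only an illustrative sketch of that idea, not a committed patch: the variable names follow the snippet quoted in the thread, and — as the follow-up messages note — whether the tuplestore tolerates its tuples living in a context that is reset per tuple is exactly the problem that surfaced later.

```c
/*
 * Illustrative sketch of the proposal for RemoteQueryNext: store the
 * fetched tuple in the per-tuple expression context, which is reset
 * before the next tuple is returned to the caller, instead of letting
 * tuples accumulate in the query-lifetime ExecutorState context.
 */
if (tuplestorestate && !TupIsNull(scanslot))
{
    MemoryContext oldcontext;

    oldcontext = MemoryContextSwitchTo(
                    scan_node->ps.ps_ExprContext->ecxt_per_tuple_memory);
    tuplestore_puttupleslot(tuplestorestate, scanslot);
    MemoryContextSwitchTo(oldcontext);
}
```

As the later messages in this thread report, this interferes with the tuplestore's own lifetime assumptions, which is why the eventual patch instead avoided calling tuplestore_puttupleslot at all for the cases that do not need materialisation.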
From: Ashutosh B. <ash...@en...> - 2013-08-21 10:39:37
|
I think, I need to be more clear on this. To me and to the reader of the code, it should be clear as to why or why not (or when or when not) to have this flag set/reset. The comments in the code do not say that. They just given one particular case where materialisation is not needed and again, we don't know why it's not needed. If you have seen the code, we reduce paths and not plans. So, when I am reducing two paths into one RemoteQuery path, what should happen to this flag? How should it be used if it's set in both the paths/only one of the path etc.? On Wed, Aug 21, 2013 at 4:00 PM, Ashutosh Bapat < ash...@en...> wrote: > > > > On Wed, Aug 21, 2013 at 3:54 PM, Nikhil Sontakke <ni...@st...>wrote: > >> >> The patch needs some more comments as to why and when we should >>> materialise or not. >>> >> >> The existing code *always* materializes in all cases which can be a huge >> performance penalty in some cases. This patch tries to address those cases. >> > > There are no comments in the code as to WHY we need to re/set this flag > wherever set/reset. > > >> >> >>> Also, I am not so in favour of adding another switch all the way in >>> RemoteQuery path. Well, I am always not in favour of adding new switch, >>> since it needs to be maintained at various places in code; and if somebody >>> forgets to set/reset it somewhere it shows up as bug in entirely different >>> areas. For example, if some code forgets to set/reset this switch while >>> creating path, it would end up as a bug in executor or a performance >>> regression in the executor and would be difficult to trace it back all the >>> way to the path creation. >>> >> >> This patch sets it to true when we initialize the remote query. It's not >> a conditional setting. So it's always set. That ways we avoid causing havoc >> in other parts until unless we know about other cases where it's not needed. 
>> > > I am talking about some code added in future (and since this is very > happening area, we will add code in near future), where the coder or > reviewer has to be congnizant that this flag needs to be set/reset at some > places. > >> >> >>> Is there a way, we can do this based on some other members of >>> RemoteQuery or RemoteQueryPath structure? >>> >>> >> I did not find any and this follows the standard percolating information >> down from higher nodes to lower nodes strategy. While execution we rarely >> have any logic to go to the parent to check for stuff, so this is pretty >> consistent with existing code I would say. >> >> > If possible, we do use member of Plan/Parse node/s in execution code. It's > not a common practice to percolate the information down through every > structure that you come across. > > >> Regards, >> Nikhils >> >> >> >>> >>> On Fri, Aug 16, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...>wrote: >>> >>>> Duh.. Patch attached :^) >>>> >>>> >>>> On Fri, Aug 16, 2013 at 5:50 PM, Nikhil Sontakke <ni...@st...>wrote: >>>> >>>>> >>>>> Additionally, ISTM, that the tuplestore should only be used for inner >>>>>> nodes. There's no need to store outer nodes in tuplestores. Also if it's a >>>>>> single SELECT query, then there's no need to use tuplestore at all as well. >>>>>> >>>>>> >>>>> PFA, a patch which tries to avoid using the tuplestore in the above >>>>> two cases. During planning we decide if a tuplestore should be used for the >>>>> RemoteQuery. The default is true, and we set it to false for the above two >>>>> cases for now. >>>>> >>>>> I ran regression test cases with and without the patch and got the >>>>> exact same set of failures (and more importantly same diffs). >>>>> >>>>> To be clear this patch is not specific to COPY TO, but it's a generic >>>>> change to avoid using tuplestore in certain simple scenarios thereby >>>>> reducing the memory footprint of the remote query execution. 
Note that it >>>>> also does not solve Hitoshi-san's COPY FROM issues. Will submit a separate >>>>> patch for that. >>>>> >>>>> Regards, >>>>> Nikhils >>>>> >>>>> >>>>> >>>>>> Looks like if we can pass hints during plan creation as to whether >>>>>> the remote scan is part of a join (and is inner node) or not, then >>>>>> accordingly decision can be taken to materialize into the tuplestore. >>>>>> >>>>>> Regards, >>>>>> Nikhils >>>>>> >>>>>> On Thu, Aug 15, 2013 at 10:43 PM, Nikhil Sontakke < >>>>>> ni...@st...> wrote: >>>>>> >>>>>>> Looks like my theory was wrong, make installcheck is giving more >>>>>>> errors with this patch applied. Will have to look at a different solution.. >>>>>>> >>>>>>> Regards, >>>>>>> Nikhils >>>>>>> >>>>>>> >>>>>>> On Thu, Aug 15, 2013 at 2:11 PM, Nikhil Sontakke < >>>>>>> ni...@st...> wrote: >>>>>>> >>>>>>>> So, I looked at this code carefully and ISTM, that because of the >>>>>>>> way we fetch the data from the connections and return it immediately inside >>>>>>>> RemoteQueryNext, storing it in the tuplestore using tuplestore_puttupleslot >>>>>>>> is NOT required at all. >>>>>>>> >>>>>>>> So, I have removed the call to tuplestore_puttupleslot and things >>>>>>>> seem to be ok for me. I guess we should do a FULL test run with this patch >>>>>>>> just to ensure that it does not cause issues in any scenarios. >>>>>>>> >>>>>>>> A careful look by new set of eyes will help here. I think, if there >>>>>>>> are no issues, this plugs a major leak in the RemoteQuery code path which >>>>>>>> is almost always used in our case. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Nikhils >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Aug 14, 2013 at 7:05 PM, Nikhil Sontakke < >>>>>>>> ni...@st...> wrote: >>>>>>>> >>>>>>>>> Using a tuplestore for data coming from RemoteQuery is kinda wrong >>>>>>>>> and that's what has introduced this issue. 
Looks like just changing the >>>>>>>>> memory context will not work as it interferes with the other functioning of >>>>>>>>> the tuplestore :-| >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Nikhils >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Aug 14, 2013 at 3:07 PM, Ashutosh Bapat < >>>>>>>>> ash...@en...> wrote: >>>>>>>>> >>>>>>>>>> Yes, that's correct. >>>>>>>>>> >>>>>>>>>> My patch was not intended to fix this. This was added while >>>>>>>>>> fixing a bug for parameterised quals on RemoteQuery I think. Check commits >>>>>>>>>> by Amit in this area. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Aug 14, 2013 at 3:03 PM, Nikhil Sontakke < >>>>>>>>>> ni...@st...> wrote: >>>>>>>>>> >>>>>>>>>>> Yeah, but AFAICS, even 1.1 (and head) *still* has a leak in it. >>>>>>>>>>> >>>>>>>>>>> Here's the snippet from RemoteQueryNext: >>>>>>>>>>> >>>>>>>>>>> if (tuplestorestate && !TupIsNull(scanslot)) >>>>>>>>>>> tuplestore_puttupleslot(tuplestorestate, >>>>>>>>>>> scanslot); >>>>>>>>>>> >>>>>>>>>>> I printed the current memory context inside this function, it is >>>>>>>>>>> ""ExecutorState". This means that the tuple will stay around till the query >>>>>>>>>>> is executing in its entirety! For large COPY queries this is bound to cause >>>>>>>>>>> issues as is also reported by Hitoshi san on another thread. >>>>>>>>>>> >>>>>>>>>>> I propose that in RemoteQueryNext, before calling the >>>>>>>>>>> tuplestore_puttupleslot we switch into >>>>>>>>>>> scan_node->ps.ps_ExprContext's ecxt_per_tuple_memory context. It >>>>>>>>>>> will get reset, when the next tuple has to be returned to the caller and >>>>>>>>>>> the leak will be curtailed. Thoughts? 
>>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Nikhils >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Aug 14, 2013 at 11:33 AM, Ashutosh Bapat < >>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>> >>>>>>>>>>>> There has been an overhaul in the planner (and corresponding >>>>>>>>>>>> parts of executor) in 1.1, so it would be better if they move to 1.1 after >>>>>>>>>>>> GA. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Aug 14, 2013 at 10:54 AM, Nikhil Sontakke < >>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Ah, I see. >>>>>>>>>>>>> >>>>>>>>>>>>> I was looking at REL_1_0 sources. >>>>>>>>>>>>> >>>>>>>>>>>>> There are people out there using REL_1_0 as well. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> Nikhils >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Aug 13, 2013 at 9:52 AM, Ashutosh Bapat < >>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> It should be part of 1.1 as well. It was done to support >>>>>>>>>>>>>> projection out of RemoteQuery node. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Aug 13, 2013 at 7:17 AM, Nikhil Sontakke < >>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Ashutosh, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I guess you have changed it in pgxc head? I was looking at >>>>>>>>>>>>>>> 103 and 11 branches and saw this. In that even ExecRemoteQuery seems to >>>>>>>>>>>>>>> have an issue wherein it's not using the appropriate context. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sent from my iPhone >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Aug 12, 2013, at 9:54 AM, Ashutosh Bapat < >>>>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Welcome to the mess ;) and enjoy junk food. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sometime back, I have changed ExecRemoteQuery to be called >>>>>>>>>>>>>>> in the same fashion as other Scan nodes. 
So, you will see ExecRemoteQuery >>>>>>>>>>>>>>> calling ExecScan with RemoteQueryNext as the iterator. So, I assume your >>>>>>>>>>>>>>> comment pertains to RemoteQueryNext and its minions and not ExecRemoteQuery >>>>>>>>>>>>>>> per say! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This code needs a lot of rework, removing duplications, >>>>>>>>>>>>>>> using proper way of materialisation, central response handler and error >>>>>>>>>>>>>>> handler etc. If we clean up this code, some improvements in planner (like >>>>>>>>>>>>>>> using MergeAppend plan) for Sort, will be possible. Regarding >>>>>>>>>>>>>>> materialisation, the code uses a linked list for materialising the rows >>>>>>>>>>>>>>> from datanodes (in case the same connection needs to be given to other >>>>>>>>>>>>>>> remote query node), which must be eating a lot of performance. Instead we >>>>>>>>>>>>>>> should be using some kind of tuplestore there. We actually use tuplestore >>>>>>>>>>>>>>> (as well) in the RemoteQuery node; the same method can be used. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sat, Aug 10, 2013 at 10:48 PM, Nikhil Sontakke < >>>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Have a Query about ExecRemoteQuery. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> The logic seems to have been modeled after ExecMaterial. >>>>>>>>>>>>>>>> ISTM, that it should have been modeled after ExecScan because we fetch >>>>>>>>>>>>>>>> tuples, and those which match the qual should be sent up. ExecMaterial is >>>>>>>>>>>>>>>> for materializing and collecting and storing tuples. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Can anyone explain? The reason for asking this is I am >>>>>>>>>>>>>>>> suspecting a big memory leak in this code path. We are not using any >>>>>>>>>>>>>>>> expression context nor we are freeing up tuples as we scan for the one >>>>>>>>>>>>>>>> which qualifies. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>>>>> Get 100% visibility into Java/.NET code with AppDynamics >>>>>>>>>>>>>>>> Lite! >>>>>>>>>>>>>>>> It's a free troubleshooting tool designed for production. >>>>>>>>>>>>>>>> Get down to code-level detail for bottlenecks, with <2% >>>>>>>>>>>>>>>> overhead. >>>>>>>>>>>>>>>> Download for free and get started troubleshooting in >>>>>>>>>>>>>>>> minutes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk >>>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>>>>>>> Pos...@li... >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Best Wishes, >>>>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Best Wishes, >>>>>>>>>>>> Ashutosh Bapat >>>>>>>>>>>> EntepriseDB Corporation >>>>>>>>>>>> The Postgres Database Company >>>>>>>>>>>> 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>> The Database Cloud >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Best Wishes, >>>>>>>>>> Ashutosh Bapat >>>>>>>>>> EntepriseDB Corporation >>>>>>>>>> The Postgres Database Company >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>> The Database Cloud >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> StormDB - http://www.stormdb.com >>>>>>>> The Database Cloud >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> StormDB - http://www.stormdb.com >>>>>>> The Database Cloud >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> StormDB - http://www.stormdb.com >>>>>> The Database Cloud >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> StormDB - http://www.stormdb.com >>>>> The Database Cloud >>>>> >>>> >>>> >>>> >>>> -- >>>> StormDB - http://www.stormdb.com >>>> The Database Cloud >>>> >>> >>> >>> >>> -- >>> Best Wishes, >>> Ashutosh Bapat >>> EntepriseDB Corporation >>> The Postgres Database Company >>> >> >> >> >> -- >> StormDB - http://www.stormdb.com >> The Database Cloud >> > > > > -- > Best Wishes, > Ashutosh Bapat > EntepriseDB Corporation > The Postgres Database Company > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Postgres Database Company |
From: Ashutosh B. <ash...@en...> - 2013-08-21 10:30:45
|
On Wed, Aug 21, 2013 at 3:54 PM, Nikhil Sontakke <ni...@st...>wrote: > > The patch needs some more comments as to why and when we should >> materialise or not. >> > > The existing code *always* materializes in all cases which can be a huge > performance penalty in some cases. This patch tries to address those cases. > There are no comments in the code as to WHY we need to re/set this flag wherever set/reset. > > >> Also, I am not so in favour of adding another switch all the way in >> RemoteQuery path. Well, I am always not in favour of adding new switch, >> since it needs to be maintained at various places in code; and if somebody >> forgets to set/reset it somewhere it shows up as bug in entirely different >> areas. For example, if some code forgets to set/reset this switch while >> creating path, it would end up as a bug in executor or a performance >> regression in the executor and would be difficult to trace it back all the >> way to the path creation. >> > > This patch sets it to true when we initialize the remote query. It's not a > conditional setting. So it's always set. That ways we avoid causing havoc > in other parts until unless we know about other cases where it's not needed. > I am talking about some code added in future (and since this is very happening area, we will add code in near future), where the coder or reviewer has to be congnizant that this flag needs to be set/reset at some places. > > >> Is there a way, we can do this based on some other members of RemoteQuery >> or RemoteQueryPath structure? >> >> > I did not find any and this follows the standard percolating information > down from higher nodes to lower nodes strategy. While execution we rarely > have any logic to go to the parent to check for stuff, so this is pretty > consistent with existing code I would say. > > If possible, we do use member of Plan/Parse node/s in execution code. 
It's not a common practice to percolate the information down through every structure that you come across. > Regards, > Nikhils > > > >> >> On Fri, Aug 16, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...>wrote: >> >>> Duh.. Patch attached :^) >>> >>> >>> On Fri, Aug 16, 2013 at 5:50 PM, Nikhil Sontakke <ni...@st...>wrote: >>> >>>> >>>> Additionally, ISTM, that the tuplestore should only be used for inner >>>>> nodes. There's no need to store outer nodes in tuplestores. Also if it's a >>>>> single SELECT query, then there's no need to use tuplestore at all as well. >>>>> >>>>> >>>> PFA, a patch which tries to avoid using the tuplestore in the above two >>>> cases. During planning we decide if a tuplestore should be used for the >>>> RemoteQuery. The default is true, and we set it to false for the above two >>>> cases for now. >>>> >>>> I ran regression test cases with and without the patch and got the >>>> exact same set of failures (and more importantly same diffs). >>>> >>>> To be clear this patch is not specific to COPY TO, but it's a generic >>>> change to avoid using tuplestore in certain simple scenarios thereby >>>> reducing the memory footprint of the remote query execution. Note that it >>>> also does not solve Hitoshi-san's COPY FROM issues. Will submit a separate >>>> patch for that. >>>> >>>> Regards, >>>> Nikhils >>>> >>>> >>>> >>>>> Looks like if we can pass hints during plan creation as to whether the >>>>> remote scan is part of a join (and is inner node) or not, then accordingly >>>>> decision can be taken to materialize into the tuplestore. >>>>> >>>>> Regards, >>>>> Nikhils >>>>> >>>>> On Thu, Aug 15, 2013 at 10:43 PM, Nikhil Sontakke <ni...@st... >>>>> > wrote: >>>>> >>>>>> Looks like my theory was wrong, make installcheck is giving more >>>>>> errors with this patch applied. Will have to look at a different solution.. >>>>>> >>>>>> Regards, >>>>>> Nikhils >>>>>> >>>>>> >>>>>> On Thu, Aug 15, 2013 at 2:11 PM, Nikhil Sontakke <ni...@st... 
>>>>>> > wrote: >>>>>> >>>>>>> So, I looked at this code carefully and ISTM, that because of the >>>>>>> way we fetch the data from the connections and return it immediately inside >>>>>>> RemoteQueryNext, storing it in the tuplestore using tuplestore_puttupleslot >>>>>>> is NOT required at all. >>>>>>> >>>>>>> So, I have removed the call to tuplestore_puttupleslot and things >>>>>>> seem to be ok for me. I guess we should do a FULL test run with this patch >>>>>>> just to ensure that it does not cause issues in any scenarios. >>>>>>> >>>>>>> A careful look by new set of eyes will help here. I think, if there >>>>>>> are no issues, this plugs a major leak in the RemoteQuery code path which >>>>>>> is almost always used in our case. >>>>>>> >>>>>>> Regards, >>>>>>> Nikhils >>>>>>> >>>>>>> >>>>>>> On Wed, Aug 14, 2013 at 7:05 PM, Nikhil Sontakke < >>>>>>> ni...@st...> wrote: >>>>>>> >>>>>>>> Using a tuplestore for data coming from RemoteQuery is kinda wrong >>>>>>>> and that's what has introduced this issue. Looks like just changing the >>>>>>>> memory context will not work as it interferes with the other functioning of >>>>>>>> the tuplestore :-| >>>>>>>> >>>>>>>> Regards, >>>>>>>> Nikhils >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Aug 14, 2013 at 3:07 PM, Ashutosh Bapat < >>>>>>>> ash...@en...> wrote: >>>>>>>> >>>>>>>>> Yes, that's correct. >>>>>>>>> >>>>>>>>> My patch was not intended to fix this. This was added while fixing >>>>>>>>> a bug for parameterised quals on RemoteQuery I think. Check commits by Amit >>>>>>>>> in this area. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Aug 14, 2013 at 3:03 PM, Nikhil Sontakke < >>>>>>>>> ni...@st...> wrote: >>>>>>>>> >>>>>>>>>> Yeah, but AFAICS, even 1.1 (and head) *still* has a leak in it. 
>>>>>>>>>> >>>>>>>>>> Here's the snippet from RemoteQueryNext: >>>>>>>>>> >>>>>>>>>> if (tuplestorestate && !TupIsNull(scanslot)) >>>>>>>>>> tuplestore_puttupleslot(tuplestorestate, >>>>>>>>>> scanslot); >>>>>>>>>> >>>>>>>>>> I printed the current memory context inside this function, it is >>>>>>>>>> ""ExecutorState". This means that the tuple will stay around till the query >>>>>>>>>> is executing in its entirety! For large COPY queries this is bound to cause >>>>>>>>>> issues as is also reported by Hitoshi san on another thread. >>>>>>>>>> >>>>>>>>>> I propose that in RemoteQueryNext, before calling the >>>>>>>>>> tuplestore_puttupleslot we switch into >>>>>>>>>> scan_node->ps.ps_ExprContext's ecxt_per_tuple_memory context. It >>>>>>>>>> will get reset, when the next tuple has to be returned to the caller and >>>>>>>>>> the leak will be curtailed. Thoughts? >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Nikhils >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Aug 14, 2013 at 11:33 AM, Ashutosh Bapat < >>>>>>>>>> ash...@en...> wrote: >>>>>>>>>> >>>>>>>>>>> There has been an overhaul in the planner (and corresponding >>>>>>>>>>> parts of executor) in 1.1, so it would be better if they move to 1.1 after >>>>>>>>>>> GA. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Aug 14, 2013 at 10:54 AM, Nikhil Sontakke < >>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>> >>>>>>>>>>>> Ah, I see. >>>>>>>>>>>> >>>>>>>>>>>> I was looking at REL_1_0 sources. >>>>>>>>>>>> >>>>>>>>>>>> There are people out there using REL_1_0 as well. >>>>>>>>>>>> >>>>>>>>>>>> Regards, >>>>>>>>>>>> Nikhils >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Aug 13, 2013 at 9:52 AM, Ashutosh Bapat < >>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> It should be part of 1.1 as well. It was done to support >>>>>>>>>>>>> projection out of RemoteQuery node. 
>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Aug 13, 2013 at 7:17 AM, Nikhil Sontakke < >>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Ashutosh, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I guess you have changed it in pgxc head? I was looking at >>>>>>>>>>>>>> 103 and 11 branches and saw this. In that even ExecRemoteQuery seems to >>>>>>>>>>>>>> have an issue wherein it's not using the appropriate context. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>> >>>>>>>>>>>>>> Sent from my iPhone >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Aug 12, 2013, at 9:54 AM, Ashutosh Bapat < >>>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Welcome to the mess ;) and enjoy junk food. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Sometime back, I have changed ExecRemoteQuery to be called in >>>>>>>>>>>>>> the same fashion as other Scan nodes. So, you will see ExecRemoteQuery >>>>>>>>>>>>>> calling ExecScan with RemoteQueryNext as the iterator. So, I assume your >>>>>>>>>>>>>> comment pertains to RemoteQueryNext and its minions and not ExecRemoteQuery >>>>>>>>>>>>>> per say! >>>>>>>>>>>>>> >>>>>>>>>>>>>> This code needs a lot of rework, removing duplications, using >>>>>>>>>>>>>> proper way of materialisation, central response handler and error handler >>>>>>>>>>>>>> etc. If we clean up this code, some improvements in planner (like using >>>>>>>>>>>>>> MergeAppend plan) for Sort, will be possible. Regarding materialisation, >>>>>>>>>>>>>> the code uses a linked list for materialising the rows from datanodes (in >>>>>>>>>>>>>> case the same connection needs to be given to other remote query node), >>>>>>>>>>>>>> which must be eating a lot of performance. Instead we should be using some >>>>>>>>>>>>>> kind of tuplestore there. We actually use tuplestore (as well) in the >>>>>>>>>>>>>> RemoteQuery node; the same method can be used. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Aug 10, 2013 at 10:48 PM, Nikhil Sontakke < >>>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Have a Query about ExecRemoteQuery. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The logic seems to have been modeled after ExecMaterial. >>>>>>>>>>>>>>> ISTM, that it should have been modeled after ExecScan because we fetch >>>>>>>>>>>>>>> tuples, and those which match the qual should be sent up. ExecMaterial is >>>>>>>>>>>>>>> for materializing and collecting and storing tuples. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Can anyone explain? The reason for asking this is I am >>>>>>>>>>>>>>> suspecting a big memory leak in this code path. We are not using any >>>>>>>>>>>>>>> expression context nor we are freeing up tuples as we scan for the one >>>>>>>>>>>>>>> which qualifies. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>>> The Database Cloud >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>>>> Get 100% visibility into Java/.NET code with AppDynamics >>>>>>>>>>>>>>> Lite! >>>>>>>>>>>>>>> It's a free troubleshooting tool designed for production. >>>>>>>>>>>>>>> Get down to code-level detail for bottlenecks, with <2% >>>>>>>>>>>>>>> overhead. >>>>>>>>>>>>>>> Download for free and get started troubleshooting in minutes. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=48897031&iu=/4140/ostg.clktrk >>>>>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>>>>> Postgres-xc-developers mailing list >>>>>>>>>>>>>>> Pos...@li... 
-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company |
From: Nikhil S. <ni...@st...> - 2013-08-21 10:25:06
|
> The patch needs some more comments as to why and when we should > materialise or not. > The existing code *always* materializes in all cases, which can be a huge performance penalty in some cases. This patch tries to address those cases. > Also, I am not so in favour of adding another switch all the way in the > RemoteQuery path. Well, I am generally not in favour of adding a new switch, > since it needs to be maintained at various places in the code; and if somebody > forgets to set/reset it somewhere, it shows up as a bug in entirely different > areas. For example, if some code forgets to set/reset this switch while > creating a path, it would end up as a bug in the executor or a performance > regression in the executor and would be difficult to trace back all the > way to the path creation. > This patch sets it to true when we initialize the remote query. It's not a conditional setting, so it's always set. That way we avoid causing havoc in other parts until we know about other cases where it's not needed. > Is there a way we can do this based on some other members of the RemoteQuery > or RemoteQueryPath structure? > > I did not find any, and this follows the standard strategy of percolating information down from higher nodes to lower nodes. During execution we rarely have any logic that goes to the parent to check for stuff, so this is pretty consistent with the existing code, I would say. Regards, Nikhils > > On Fri, Aug 16, 2013 at 5:57 PM, Nikhil Sontakke <ni...@st...> wrote: > >> Duh.. Patch attached :^) >> >> >> On Fri, Aug 16, 2013 at 5:50 PM, Nikhil Sontakke <ni...@st...> wrote: >> >>> >>> Additionally, ISTM that the tuplestore should only be used for inner >>>> nodes. There's no need to store outer nodes in tuplestores. Also, if it's a >>>> single SELECT query, then there's no need to use a tuplestore at all. >>>> >>>> >>> PFA a patch which tries to avoid using the tuplestore in the above two >>> cases. 
During planning we decide if a tuplestore should be used for the >>> RemoteQuery. The default is true, and we set it to false for the above two >>> cases for now. >>> >>> I ran the regression test cases with and without the patch and got the exact >>> same set of failures (and, more importantly, the same diffs). >>> >>> To be clear, this patch is not specific to COPY TO; it's a generic >>> change to avoid using the tuplestore in certain simple scenarios, thereby >>> reducing the memory footprint of remote query execution. Note that it >>> also does not solve Hitoshi-san's COPY FROM issues. Will submit a separate >>> patch for that. >>> >>> Regards, >>> Nikhils >>> >>> >>> >>>> Looks like if we can pass hints during plan creation as to whether the >>>> remote scan is part of a join (and is an inner node) or not, then the >>>> decision to materialize into the tuplestore can be taken accordingly. >>>> >>>> Regards, >>>> Nikhils >>>> >>>> On Thu, Aug 15, 2013 at 10:43 PM, Nikhil Sontakke <ni...@st...> wrote: >>>> >>>>> Looks like my theory was wrong; make installcheck is giving more >>>>> errors with this patch applied. Will have to look at a different solution. >>>>> >>>>> Regards, >>>>> Nikhils >>>>> >>>>> >>>>> On Thu, Aug 15, 2013 at 2:11 PM, Nikhil Sontakke <ni...@st...> wrote: >>>>> >>>>>> So, I looked at this code carefully, and ISTM that because of the way >>>>>> we fetch the data from the connections and return it immediately inside >>>>>> RemoteQueryNext, storing it in the tuplestore using tuplestore_puttupleslot >>>>>> is NOT required at all. >>>>>> >>>>>> So, I have removed the call to tuplestore_puttupleslot and things >>>>>> seem to be OK for me. I guess we should do a FULL test run with this patch >>>>>> just to ensure that it does not cause issues in any scenario. >>>>>> >>>>>> A careful look by a new set of eyes will help here. I think, if there >>>>>> are no issues, this plugs a major leak in the RemoteQuery code path, which >>>>>> is almost always used in our case. 
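[Editor's note: for readers following along, the plan-time decision described in this thread — materialize into a tuplestore by default, but skip it for outer nodes and for plain single-SELECT queries — can be sketched in miniature. The struct and function names below are illustrative stand-ins, not the actual Postgres-XC identifiers.]

```c
#include <stdbool.h>

/* Hypothetical, simplified mirror of the patch's idea: the planner decides
 * once whether a RemoteQuery node must materialize its rows in a tuplestore,
 * and the executor simply reads the flag instead of re-deriving it. */
typedef struct RemoteQueryPlanSketch
{
    bool is_inner_of_join;   /* is this the inner side of a join? */
    bool is_simple_select;   /* a single SELECT, no rescans needed */
    bool use_tuplestore;     /* decided at plan time, read at exec time */
} RemoteQueryPlanSketch;

static void
decide_materialization(RemoteQueryPlanSketch *plan)
{
    /* Default to materializing: the safe, pre-patch behaviour. */
    plan->use_tuplestore = true;

    /* Skip the tuplestore only in the two cases the patch identifies:
     * outer (non-inner) nodes, and plain single-SELECT queries. */
    if (!plan->is_inner_of_join || plan->is_simple_select)
        plan->use_tuplestore = false;
}
```

The point of pushing the flag into the plan, as the mail argues, is that the executor never has to walk up to its parent node to learn the context it runs in.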
>>>>>> >>>>>> Regards, >>>>>> Nikhils >>>>>> >>>>>> >>>>>> On Wed, Aug 14, 2013 at 7:05 PM, Nikhil Sontakke <ni...@st...> wrote: >>>>>> >>>>>>> Using a tuplestore for data coming from RemoteQuery is kinda wrong, >>>>>>> and that's what has introduced this issue. Looks like just changing the >>>>>>> memory context will not work, as it interferes with the other functioning of >>>>>>> the tuplestore :-| >>>>>>> >>>>>>> Regards, >>>>>>> Nikhils >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Wed, Aug 14, 2013 at 3:07 PM, Ashutosh Bapat < >>>>>>> ash...@en...> wrote: >>>>>>> >>>>>>>> Yes, that's correct. >>>>>>>> >>>>>>>> My patch was not intended to fix this. This was added while fixing >>>>>>>> a bug for parameterised quals on RemoteQuery, I think. Check commits by Amit >>>>>>>> in this area. >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Aug 14, 2013 at 3:03 PM, Nikhil Sontakke < >>>>>>>> ni...@st...> wrote: >>>>>>>> >>>>>>>>> Yeah, but AFAICS, even 1.1 (and head) *still* has a leak in it. >>>>>>>>> >>>>>>>>> Here's the snippet from RemoteQueryNext: >>>>>>>>> >>>>>>>>> if (tuplestorestate && !TupIsNull(scanslot)) >>>>>>>>> tuplestore_puttupleslot(tuplestorestate, scanslot); >>>>>>>>> >>>>>>>>> I printed the current memory context inside this function; it is >>>>>>>>> "ExecutorState". This means that the tuple will stay around for as long as the query >>>>>>>>> is executing! For large COPY queries this is bound to cause >>>>>>>>> issues, as is also reported by Hitoshi-san on another thread. >>>>>>>>> >>>>>>>>> I propose that in RemoteQueryNext, before calling >>>>>>>>> tuplestore_puttupleslot, we switch into >>>>>>>>> scan_node->ps.ps_ExprContext's ecxt_per_tuple_memory context. It >>>>>>>>> will get reset when the next tuple has to be returned to the caller, and >>>>>>>>> the leak will be curtailed. Thoughts? 
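[Editor's note: for readers unfamiliar with the per-tuple memory context discipline being proposed above — allocate each row's short-lived copies in a context that is reset when the next row is requested, so a long scan never accumulates one copy of every tuple it returned. The toy arena below only illustrates that reset pattern; the real mechanism in PostgreSQL is MemoryContextSwitchTo plus the per-tuple ExprContext reset, and none of these names are the actual Postgres-XC code.]

```c
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for a resettable per-tuple memory context: every
 * allocation is tracked, and a reset frees all of them at once. */
typedef struct ToyContext
{
    void *chunks[128];
    int   nchunks;
} ToyContext;

static void *
ctx_alloc(ToyContext *ctx, size_t size)
{
    void *p = malloc(size);
    ctx->chunks[ctx->nchunks++] = p;
    return p;
}

static void
ctx_reset(ToyContext *ctx)
{
    for (int i = 0; i < ctx->nchunks; i++)
        free(ctx->chunks[i]);
    ctx->nchunks = 0;
}

/* Scan loop shaped like the proposal: each row's copy lives in the
 * short-lived context, which is reset when the next row is requested,
 * so memory use stays flat no matter how many rows flow through —
 * instead of growing for the whole life of the query. */
static int
scan_rows(ToyContext *per_tuple, int nrows)
{
    int returned = 0;

    for (int i = 0; i < nrows; i++)
    {
        ctx_reset(per_tuple);   /* previous row's copy goes away here */

        char *row = ctx_alloc(per_tuple, 64);   /* this row's copy */
        snprintf(row, 64, "row %d", i);
        returned++;             /* "return" the row to the caller */
    }
    return returned;
}
```

In the leaking code being discussed, the equivalent allocations land in "ExecutorState", whose lifetime is the whole query, which is exactly why COPY over many rows blows up.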
>>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Nikhils >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Wed, Aug 14, 2013 at 11:33 AM, Ashutosh Bapat < >>>>>>>>> ash...@en...> wrote: >>>>>>>>> >>>>>>>>>> There has been an overhaul in the planner (and corresponding >>>>>>>>>> parts of executor) in 1.1, so it would be better if they move to 1.1 after >>>>>>>>>> GA. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Aug 14, 2013 at 10:54 AM, Nikhil Sontakke < >>>>>>>>>> ni...@st...> wrote: >>>>>>>>>> >>>>>>>>>>> Ah, I see. >>>>>>>>>>> >>>>>>>>>>> I was looking at REL_1_0 sources. >>>>>>>>>>> >>>>>>>>>>> There are people out there using REL_1_0 as well. >>>>>>>>>>> >>>>>>>>>>> Regards, >>>>>>>>>>> Nikhils >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Tue, Aug 13, 2013 at 9:52 AM, Ashutosh Bapat < >>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>> >>>>>>>>>>>> It should be part of 1.1 as well. It was done to support >>>>>>>>>>>> projection out of RemoteQuery node. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Aug 13, 2013 at 7:17 AM, Nikhil Sontakke < >>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Hi Ashutosh, >>>>>>>>>>>>> >>>>>>>>>>>>> I guess you have changed it in pgxc head? I was looking at 103 >>>>>>>>>>>>> and 11 branches and saw this. In that even ExecRemoteQuery seems to have an >>>>>>>>>>>>> issue wherein it's not using the appropriate context. >>>>>>>>>>>>> >>>>>>>>>>>>> Regards, >>>>>>>>>>>>> Nikhils >>>>>>>>>>>>> >>>>>>>>>>>>> Sent from my iPhone >>>>>>>>>>>>> >>>>>>>>>>>>> On Aug 12, 2013, at 9:54 AM, Ashutosh Bapat < >>>>>>>>>>>>> ash...@en...> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Welcome to the mess ;) and enjoy junk food. >>>>>>>>>>>>> >>>>>>>>>>>>> Sometime back, I have changed ExecRemoteQuery to be called in >>>>>>>>>>>>> the same fashion as other Scan nodes. So, you will see ExecRemoteQuery >>>>>>>>>>>>> calling ExecScan with RemoteQueryNext as the iterator. 
So, I assume your >>>>>>>>>>>>> comment pertains to RemoteQueryNext and its minions and not ExecRemoteQuery >>>>>>>>>>>>> per se! >>>>>>>>>>>>> >>>>>>>>>>>>> This code needs a lot of rework: removing duplications, using a >>>>>>>>>>>>> proper way of materialisation, a central response handler and error handler, >>>>>>>>>>>>> etc. If we clean up this code, some improvements in the planner (like using a >>>>>>>>>>>>> MergeAppend plan for Sort) will be possible. Regarding materialisation, >>>>>>>>>>>>> the code uses a linked list for materialising the rows from datanodes (in >>>>>>>>>>>>> case the same connection needs to be given to another remote query node), >>>>>>>>>>>>> which must be costing a lot of performance. Instead we should be using some >>>>>>>>>>>>> kind of tuplestore there. We actually use a tuplestore (as well) in the >>>>>>>>>>>>> RemoteQuery node; the same method can be used. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Aug 10, 2013 at 10:48 PM, Nikhil Sontakke < >>>>>>>>>>>>> ni...@st...> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have a query about ExecRemoteQuery. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The logic seems to have been modeled after ExecMaterial. >>>>>>>>>>>>>> ISTM that it should have been modeled after ExecScan, because we fetch >>>>>>>>>>>>>> tuples, and those which match the qual should be sent up. ExecMaterial is >>>>>>>>>>>>>> for materializing, collecting and storing tuples. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Can anyone explain? The reason for asking is that I >>>>>>>>>>>>>> suspect a big memory leak in this code path. We are not using any >>>>>>>>>>>>>> expression context, nor are we freeing up tuples as we scan for the one >>>>>>>>>>>>>> which qualifies. 
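[Editor's note: the ExecScan-versus-ExecMaterial distinction raised above boils down to a pull-style iterator that filters rows as they arrive, versus a node that collects and stores everything first. Below is a minimal, self-contained sketch of the scan shape; all names here are hypothetical illustrations, not the PostgreSQL executor API.]

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal row type and access-method callback for the sketch. */
typedef struct Row { int id; } Row;
typedef Row *(*ScanNext)(void *state);

/* ExecScan-like loop: pull rows one at a time from the underlying
 * iterator and return the first one satisfying the qual. Nothing is
 * collected or stored, unlike a materializing node. */
static Row *
exec_scan(void *state, ScanNext next, bool (*qual)(const Row *))
{
    Row *row;

    while ((row = next(state)) != NULL)
    {
        if (qual == NULL || qual(row))
            return row;         /* hand the qualifying row up */
        /* non-qualifying rows are simply dropped, not retained */
    }
    return NULL;                /* end of scan */
}
```

Each call resumes the iterator where it left off, which is the shape the mail argues RemoteQueryNext should have: filter-and-forward, with no query-lifetime accumulation of tuples.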
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Nikhils >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> StormDB - http://www.stormdb.com >>>>>>>>>>>>>> The Database Cloud -- StormDB - http://www.stormdb.com The Database Cloud |