[Postgres-xc-developers] rowkarks problem and its suggested solution.

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Hi,

Problem Description
===================

Consider the following use case

create table junk (a int, b int);
insert into junk values(1,1),(1,2),(1,3),(2,4),(2,5),(3,6),(3,7),(3,8);
create table junk2(a int, b int);
insert into junk2
values(0,11),(0,22),(0,33),(1,44),(1,55),(10,66),(10,77),(20,88);

select * from junk, junk2 where junk.a=junk2.a;

 a | b | a | b
---+---+---+----
 1 | 1 | 1 | 44
 1 | 1 | 1 | 55
 1 | 2 | 1 | 44
 1 | 2 | 1 | 55
 1 | 3 | 1 | 44
 1 | 3 | 1 | 55
(6 rows)

BEGIN
 declare c1 cursor for select * from junk, junk2 where junk.a=junk2.a for
update;
 fetch 1 from c1;

At this point try this query from another psql prompt
 update junk set b=200 where a=2;

should not block since we should have locked only the rows where a=1
In our current implementation the above query blocks since we lock complete
tables.

Suggested Solution
==================

1. Select for update/share if run inside a transaction and all cursor
queries
   will contain nodeid and ctid as additional target columns.

2. Support for a new SQL command used to lock individual rows will be added

    LOCK ROWS FROM TABLE name IN lockmode MODE WHERE condition

    where lockmode is one of:

        ACCESS SHARE | ACCESS EXCLUSIVE

3. The LockRows node would work differently on coordinator, whereas it
would work
   as it it on data nodes. On the coordinator It will create a list of
nodeid,ctid
   pairs one for each table whose rows are supposed to be locked.
   Consider the example where we have two data nodes

    test=# explain verbose select * from junk, junk2 where junk.a=junk2.a
for update;
                                           QUERY
PLAN

-------------------------------------------------------------------------------------------------
     LockRows  (cost=0.00..0.02 rows=1 width=16)
       Output: junk.a, junk.b, junk2.a, junk2.b
       ->  Nested Loop  (cost=0.00..0.01 rows=1 width=16)
         Output: junk.a, junk.b, junk2.a, junk2.b
         Join Filter: (junk.a = junk2.a)
         ->  Data Node Scan on junk "_REMOTE_TABLE_QUERY_"
(cost=0.00..0.00 rows=1000 width=8)
               Output: junk.a, junk.b
               Node/s: data_node_1, data_node_2
               Remote query: SELECT a, b FROM ONLY junk WHERE true FOR
UPDATE OF junk
         ->  Data Node Scan on junk2 "_REMOTE_TABLE_QUERY_"
(cost=0.00..0.00 rows=1000 width=8)
               Output: junk2.a, junk2.b
               Node/s: data_node_1, data_node_2
               Remote query: SELECT a, b FROM ONLY junk2 WHERE true FOR
UPDATE OF junk2
    (13 rows)

    In each run the LockRows node would add one nodeid,ctid pair for each
table in the list.
    When the coordinator executes ExecEndLockRows it will prepare LOCK ROWS
query
    one for each table and for each data node. In case you have two data
nodes and
    the list of nodeid,ctid pairs contain nodeids of both the datanodes then
    we will have a total of four queries two for one data node and two for
the other.
    Each query would look like

    LOCK ROWS FROM TABLE junk IN ACCESS EXCLUSIVE MODE
    WHERE ctid in (.....) containing ctids of first data node

    LOCK ROWS FROM TABLE junk IN ACCESS EXCLUSIVE MODE
    WHERE ctid in (.....) containing ctids of second data node

    LOCK ROWS FROM TABLE junk2 IN ACCESS EXCLUSIVE MODE
    WHERE ctid in (.....) containing ctids of first data node

    LOCK ROWS FROM TABLE junk2 IN ACCESS EXCLUSIVE MODE
    WHERE ctid in (.....) containing ctids of second data node

4. Now consider the example of a cursor query

   BEGIN;
     declare c1 cursor for select * from abc for update;
     fetch 1 from c1;
     -- do some stuff here
     fetch 1 from c1;
     -- do some stuff here
   END;

   In this case when the first fetch finishes its task in PortalRunSelect
   We will then issue the LOCK ROWS statement to the appropriate datanode
   with appropriate ctid. In case of a 'fetch all' the LOCK ROWS would
contain
   multiple ctids and may span over multiple datanodes too.

   It is also possible that while PortalRunSelect accumulates rows in
tuplestore
   we create LOCK ROWS statements in parallel and when PortalRunSelect
finishes
   we run those LOCK ROWS queries.

5. There will be no mechanism to unlock the rows, rows will be unlocked
implicitly when
   the transaction ends.

--
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of
the individual or entity to whom it is addressed. This message
contains information from EnterpriseDB Corporation that may be
privileged, confidential, or exempt from disclosure under applicable
law. If you are not the intended recipient or authorized to receive
this for the intended recipient, any use, dissemination, distribution,
retention, archiving, or copying of this communication is strictly
prohibited. If you have received this e-mail in error, please notify
the sender immediately by reply e-mail and delete this message.