From: Abbas B. <abb...@en...> - 2012-07-31 04:37:57
|
Hi, Problem Description =================== Consider the following use case create table junk (a int, b int); insert into junk values(1,1),(1,2),(1,3),(2,4),(2,5),(3,6),(3,7),(3,8); create table junk2(a int, b int); insert into junk2 values(0,11),(0,22),(0,33),(1,44),(1,55),(10,66),(10,77),(20,88); select * from junk, junk2 where junk.a=junk2.a; a | b | a | b ---+---+---+---- 1 | 1 | 1 | 44 1 | 1 | 1 | 55 1 | 2 | 1 | 44 1 | 2 | 1 | 55 1 | 3 | 1 | 44 1 | 3 | 1 | 55 (6 rows) BEGIN declare c1 cursor for select * from junk, junk2 where junk.a=junk2.a for update; fetch 1 from c1; At this point try this query from another psql prompt update junk set b=200 where a=2; should not block since we should have locked only the rows where a=1 In our current implementation the above query blocks since we lock complete tables. Suggested Solution ================== 1. Select for update/share if run inside a transaction and all cursor queries will contain nodeid and ctid as additional target columns. 2. Support for a new SQL command used to lock individual rows will be added LOCK ROWS FROM TABLE name IN lockmode MODE WHERE condition where lockmode is one of: ACCESS SHARE | ACCESS EXCLUSIVE 3. The LockRows node would work differently on coordinator, whereas it would work as it it on data nodes. On the coordinator It will create a list of nodeid,ctid pairs one for each table whose rows are supposed to be locked. Consider the example where we have two data nodes test=# explain verbose select * from junk, junk2 where junk.a=junk2.a for update; QUERY PLAN ------------------------------------------------------------------------------------------------- LockRows (cost=0.00..0.02 rows=1 width=16) Output: junk.a, junk.b, junk2.a, junk2.b -> Nested Loop (cost=0.00..0.01 rows=1 width=16) Output: junk.a, junk.b, junk2.a, junk2.b Join Filter: (junk.a = junk2.a) -> Data Node Scan on junk "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=8) Output: junk.a, junk.b Node/s: data_node_1, data_node_2 Remote query: SELECT a, b FROM ONLY junk WHERE true FOR UPDATE OF junk -> Data Node Scan on junk2 "_REMOTE_TABLE_QUERY_" (cost=0.00..0.00 rows=1000 width=8) Output: junk2.a, junk2.b Node/s: data_node_1, data_node_2 Remote query: SELECT a, b FROM ONLY junk2 WHERE true FOR UPDATE OF junk2 (13 rows) In each run the LockRows node would add one nodeid,ctid pair for each table in the list. When the coordinator executes ExecEndLockRows it will prepare LOCK ROWS query one for each table and for each data node. In case you have two data nodes and the list of nodeid,ctid pairs contain nodeids of both the datanodes then we will have a total of four queries two for one data node and two for the other. Each query would look like LOCK ROWS FROM TABLE junk IN ACCESS EXCLUSIVE MODE WHERE ctid in (.....) containing ctids of first data node LOCK ROWS FROM TABLE junk IN ACCESS EXCLUSIVE MODE WHERE ctid in (.....) containing ctids of second data node LOCK ROWS FROM TABLE junk2 IN ACCESS EXCLUSIVE MODE WHERE ctid in (.....) containing ctids of first data node LOCK ROWS FROM TABLE junk2 IN ACCESS EXCLUSIVE MODE WHERE ctid in (.....) containing ctids of second data node 4. Now consider the example of a cursor query BEGIN; declare c1 cursor for select * from abc for update; fetch 1 from c1; -- do some stuff here fetch 1 from c1; -- do some stuff here END; In this case when the first fetch finishes its task in PortalRunSelect We will then issue the LOCK ROWS statement to the appropriate datanode with appropriate ctid. In case of a 'fetch all' the LOCK ROWS would contain multiple ctids and may span over multiple datanodes too. It is also possible that while PortalRunSelect accumulates rows in tuplestore we create LOCK ROWS statements in parallel and when PortalRunSelect finishes we run those LOCK ROWS queries. 5. There will be no mechanism to unlock the rows, rows will be unlocked implicitly when the transaction ends. -- Abbas Architect EnterpriseDB Corporation The Enterprise PostgreSQL Company Phone: 92-334-5100153 Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |