From: Amit K. <ami...@en...> - 2013-03-08 10:04:00
On 6 March 2013 15:20, Abbas Butt <abb...@en...> wrote:

> On Fri, Mar 1, 2013 at 5:48 PM, Amit Khandekar <ami...@en...> wrote:
>
>> On 19 February 2013 12:37, Abbas Butt <abb...@en...> wrote:
>> >
>> > Hi,
>> > Attached please find a patch that locks the cluster so that a dump can
>> > be taken to be restored on the new node to be added.
>> >
>> > To lock the cluster the patch adds a new GUC parameter called
>> > xc_lock_for_backup; however, its status is maintained by the pooler.
>> > The reason is that the default behavior of XC is to release connections
>> > as soon as a command is done, and it uses the PersistentConnections GUC
>> > to control that behavior. In this case, however, we need a status that
>> > is independent of the setting of PersistentConnections.
>> >
>> > Assume we have a two-coordinator cluster; the patch provides this behavior:
>> >
>> > Case 1: set and show
>> > ====================
>> > psql test -p 5432
>> > set xc_lock_for_backup=yes;
>> > show xc_lock_for_backup;
>> >  xc_lock_for_backup
>> > --------------------
>> >  yes
>> > (1 row)
>> >
>> > Case 2: set from one client, show from another
>> > ==============================================
>> > psql test -p 5432
>> > set xc_lock_for_backup=yes;
>> > (From another tab)
>> > psql test -p 5432
>> > show xc_lock_for_backup;
>> >  xc_lock_for_backup
>> > --------------------
>> >  yes
>> > (1 row)
>> >
>> > Case 3: set from one, quit it, run again and show
>> > =================================================
>> > psql test -p 5432
>> > set xc_lock_for_backup=yes;
>> > \q
>> > psql test -p 5432
>> > show xc_lock_for_backup;
>> >  xc_lock_for_backup
>> > --------------------
>> >  yes
>> > (1 row)
>> >
>> > Case 4: set on one coordinator, show from the other
>> > ====================================================
>> > psql test -p 5432
>> > set xc_lock_for_backup=yes;
>> > (From another tab)
>> > psql test -p 5433
>> > show xc_lock_for_backup;
>> >  xc_lock_for_backup
>> > --------------------
>> >  yes
>> > (1 row)
>> >
>> > pg_dump and pg_dumpall seem to work fine after locking the cluster for
>> > backup, but I will test these utilities in detail next.
>> >
>> > Also, I have yet to verify in detail that standard_ProcessUtility is the
>> > only place that updates the portion of the catalog that is dumped. There
>> > may be other places too that need to be blocked for catalog updates.
>> >
>> > The patch adds no extra warnings and regression shows no extra failures.
>> >
>> > Comments are welcome.
>>
>> Abbas wrote on another thread:
>> > Amit wrote on another thread:
>> >> I haven't given a thought to the earlier patch you sent for cluster
>> >> lock implementation; maybe we can discuss this on that thread, but just
>> >> a quick question:
>> >>
>> >> Does the cluster-lock command wait for the ongoing DDL commands to
>> >> finish? If not, we have problems. The subsequent pg_dump would not
>> >> contain objects created by these particular DDLs.
>> >
>> > Suppose you have a two-coordinator cluster. Assume one client connected
>> > to each. Suppose one client issues a lock cluster command and the other
>> > issues a DDL. Is this what you mean by an ongoing DDL? If so, then the
>> > answer to your question is yes.
>> >
>> > Suppose you have a prepared transaction that has a DDL in it. Again, if
>> > this can be considered an ongoing DDL, then the answer to your question
>> > is yes.
>> >
>> > Suppose you have a two-coordinator cluster. Assume one client connected
>> > to each. One client starts a transaction and issues a DDL, the second
>> > client issues a lock cluster command, and the first commits the
>> > transaction. If this is an ongoing DDL, then the answer to your question
>> > is no.
>>
>> Yes, this last scenario is what I meant: a DDL has been executed on nodes
>> but not committed when the cluster lock command is run, and then pg_dump
>> immediately starts its transaction before the DDL is committed. Here
>> pg_dump does not see the new objects that would be created.
>>
>> I myself am not sure how we would prevent this from happening. There are
>> two callback hooks that might be worth considering though:
>> 1. Transaction end callback (CallXactCallbacks)
>> 2. Object creation/drop hook (InvokeObjectAccessHook)
>>
>> Suppose we create an object creation/drop hook function that would:
>> 1. store the current transaction id in a global objects_created list
>>    if the cluster is not locked,
>> 2. or else, if the cluster is locked, ereport() saying
>>    "cannot create catalog objects in this mode".
>>
>> And then during transaction commit, a new transaction callback hook will:
>> 1. Check the above objects_created list to see if the current transaction
>>    has any objects created/dropped.
>> 2. If found, and if the cluster lock is on, again ereport() saying
>>    "cannot create catalog objects in this mode".
>>
>> Thinking more on the object creation hook, we can even consider it as a
>> substitute for checking the cluster-lock status in
>> standard_ProcessUtility(). But I am not sure whether this hook gets called
>> for each of the catalog objects. At least the code comments say it does.
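To make my suggestion above a little more concrete, here is a rough, untested
sketch of the hook pair I have in mind. Every name in it
(ClusterIsLockedForBackup(), register_cluster_lock_hooks(), the per-backend
flag) is made up for illustration and none of it is from your patch; I have
also simplified the objects_created list to a per-backend flag, since the
hooks run in the backend that creates the objects anyway:

/*
 * Untested sketch only: nothing here is from the posted patch.
 * ClusterIsLockedForBackup() is a stand-in for however the backend would
 * read the pooler-maintained xc_lock_for_backup status.
 */
#include "postgres.h"
#include "access/xact.h"
#include "catalog/objectaccess.h"

extern bool ClusterIsLockedForBackup(void);     /* hypothetical */

/* per-backend flag instead of a global objects_created list */
static bool xact_touched_catalog = false;
static object_access_hook_type prev_object_access_hook = NULL;

/* signature per the 9.2-era object_access_hook */
static void
lock_aware_object_access(ObjectAccessType access, Oid classId,
                         Oid objectId, int subId)
{
    if (prev_object_access_hook)
        prev_object_access_hook(access, classId, objectId, subId);

    if (ClusterIsLockedForBackup())
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot create catalog objects in this mode")));

    /* remember that this transaction created/dropped catalog objects */
    xact_touched_catalog = true;
}

static void
lock_aware_xact_callback(XactEvent event, void *arg)
{
    /*
     * Caveat: XACT_EVENT_COMMIT is fired from CallXactCallbacks() after the
     * transaction has effectively committed, so a real check would have to
     * run earlier; this is exactly the CommitTransaction comment discussed
     * further down in this mail.
     */
    if (event == XACT_EVENT_COMMIT &&
        xact_touched_catalog &&
        ClusterIsLockedForBackup())
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot create catalog objects in this mode")));

    if (event == XACT_EVENT_COMMIT || event == XACT_EVENT_ABORT)
        xact_touched_catalog = false;   /* reset for the next transaction */
}

/* would be wired in at backend startup, or from a module's _PG_init() */
void
register_cluster_lock_hooks(void)
{
    prev_object_access_hook = object_access_hook;
    object_access_hook = lock_aware_object_access;
    RegisterXactCallback(lock_aware_xact_callback, NULL);
}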
> Thanks for the ideas. Here is how I handled the problem of ongoing DDLs.
>
> 1. The online node addition feature requires that each transaction be
>    monitored for any activity that would be prohibited if the cluster is
>    locked before the transaction commits. This obviously adds some overhead
>    to each transaction. If the database administrator is sure that the
>    deployed cluster will never require online addition of nodes, OR decides
>    that node addition will be done by bringing the cluster down, then a
>    command line parameter "disable-online-node-addition" can be used to
>    disable transaction monitoring for online node addition. By default,
>    online addition of nodes will be available.

Is this overhead because you do pooler communication during commit? If so,
yes, that is an overhead. In another reply you said we have to keep the lock
across sessions; if we leave that session, the lock goes away, so we would
have the restriction that everything else should be run in the same session.
So if we acquire a session lock in pg_dump itself, would that solve the
problem?
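To spell out what I mean by "a session lock in pg_dump itself": something
along the lines of the sketch below, where pg_dump takes the lock on its own
connection before it starts dumping. This is only what I am imagining, not
code from the patch; it assumes a session-scoped variant of the lock (unlike
the pooler-maintained status in your patch), and lock_cluster_for_dump() is a
made-up name:

/* hypothetical sketch, libpq calls only */
#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"

static void
lock_cluster_for_dump(PGconn *conn)
{
    /* taken on pg_dump's own connection, before the snapshot is taken */
    PGresult *res = PQexec(conn, "SET xc_lock_for_backup = yes;");

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        fprintf(stderr, "could not lock cluster for backup: %s",
                PQerrorMessage(conn));
        PQclear(res);
        exit(1);
    }
    PQclear(res);

    /*
     * The rest of pg_dump then runs on this same connection; the lock
     * would go away automatically when the dump session ends.
     */
}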
> 2. Suppose we have a two-coordinator cluster, CO1 and CO2. Assume one
>    client connected to each coordinator. Further assume one client starts
>    a transaction and issues a DDL. This is an unfinished transaction. Now
>    assume the second client issues
>    SET xc_lock_for_backup=yes
>    The commit on the unfinished transaction should now fail. To handle
>    this situation we monitor each transaction for any activity that would
>    be prohibited if the cluster is locked before transaction commit. At
>    the time of commit we check whether the transaction had issued a
>    prohibited statement and the cluster has since been locked; if so, we
>    abort the commit. This is done only if online addition of nodes has not
>    been disabled explicitly and the server is not running in bootstrap
>    mode.
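Just to check my understanding of points 1 and 2: the enforcement would have
to sit near the top of CommitTransaction()/PrepareTransaction(), before
anything is made permanent, roughly as in the sketch below. The names here
(online_node_addition_enabled, xact_ran_prohibited_stmt,
ClusterIsLockedForBackup()) are placeholders of mine, not identifiers from
your patch:

/* placeholder names only; this just illustrates where such a check would
 * have to sit so that the commit can still be aborted */
#include "postgres.h"
#include "miscadmin.h"

extern bool online_node_addition_enabled;   /* not disabled on the command line */
extern bool xact_ran_prohibited_stmt;       /* set while executing the transaction */
extern bool ClusterIsLockedForBackup(void); /* pooler-maintained lock status */

/* called early in CommitTransaction()/PrepareTransaction(),
 * well before the commit record is written */
void
CheckClusterLockBeforeCommit(void)
{
    if (!online_node_addition_enabled || IsBootstrapProcessingMode())
        return;

    if (xact_ran_prohibited_stmt && ClusterIsLockedForBackup())
        ereport(ERROR,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("cannot commit: cluster is locked for backup")));
}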
Does the object access hook seem to be a feasible option for keeping track of
unfinished DDLs? If it is feasible, we don't have to prohibit according to
which DDL is being run.

> 3. I did not use CallXactCallbacks because the comment in
>    CommitTransaction reads:
>    * This is all post-commit cleanup.  Note that if an error is raised here,
>    * it's too late to abort the transaction.  This should be just
>    * noncritical resource releasing.

Yes, you are right. The transaction has already been committed when this
callback gets invoked.

> I have attached the revised patch with detailed comments.
>
>> > But it's a matter of deciding which camp we are going to put COMMIT in:
>> > the allow camp or the deny camp. I decided to put it in the allow camp,
>> > because I have not yet written any code to detect whether a transaction
>> > being committed has a DDL in it or not, and stopping all transactions
>> > from committing looks too restrictive to me.
>> >
>> > Do you have some other meaning of an ongoing DDL?
>
> --
> Abbas
> Architect
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: 92-334-5100153
>
> Website: www.enterprisedb.com
> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb