First of all, thank you guys for the great xc project, which in some ways
represents the future of relational databases.
However, I personally think xc at its current stage is more academic than
practical. It is hard to configure and manage for a production system. In
the eyes of an application developer/architect, as opposed to a database
expert or DBA, it is a complicated cluster system, and it gets even more
complicated with an HA configuration. I would like to suggest a few things,
as I have already discussed a little with Mr. Suzuki.
(1) Centralized Management
If you look at pacemaker/corosync, you will find the crm interesting. Every
configuration change made on any node is broadcast to the whole
cluster. In xc, the nodes in the cluster do not know anything about each
other. GTM may actually know about the whole cluster, yet I still have to
repeat each configuration on every gtm-proxy, coordinator, and datanode.
Do you think it is possible to centralize configuration, monitoring,
tuning, management, etc.? The first idea that comes to my mind is to improve
the gtm module, which already communicates with every other node.
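
To make the idea concrete, here is a minimal sketch in Python of what I
mean by centralized configuration. It is not how gtm works today, and
ClusterRegistry, Node, register, and set are names I made up purely for
illustration. One central registry keeps a versioned configuration, and a
change made once is pushed to every registered node.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical types for illustration only; none of this is existing xc/gtm API.
ConfigListener = Callable[[int, Dict[str, str]], None]


@dataclass
class ClusterRegistry:
    """Central place that owns the cluster configuration."""
    version: int = 0
    config: Dict[str, str] = field(default_factory=dict)
    listeners: Dict[str, ConfigListener] = field(default_factory=dict)

    def register(self, node_name: str, listener: ConfigListener) -> None:
        # A joining node immediately receives the current configuration.
        self.listeners[node_name] = listener
        listener(self.version, dict(self.config))

    def set(self, key: str, value: str) -> None:
        # One change, made once, is broadcast to every registered node.
        self.config[key] = value
        self.version += 1
        for listener in self.listeners.values():
            listener(self.version, dict(self.config))


class Node:
    """Stand-in for a gtm-proxy, coordinator, or datanode."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.version = -1
        self.config: Dict[str, str] = {}

    def apply(self, version: int, config: Dict[str, str]) -> None:
        # Ignore stale updates so that an old broadcast cannot undo a newer one.
        if version > self.version:
            self.version = version
            self.config = config


registry = ClusterRegistry()
coord1, dn1 = Node("coord1"), Node("datanode1")
registry.register(coord1.name, coord1.apply)
registry.register(dn1.name, dn1.apply)
registry.set("max_connections", "200")
```

In a real cluster the listener call would be a network message, and the
registry itself would need a standby so that it does not become a new
single point of failure.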
(2) High Availability
I know that xc-HA currently depends on the HA of each node, either
gtm-standby or streaming replication of coordinators and datanodes. However,
this solution is too complicated and scares me. I have tried to create an
xc-HA cluster by myself and it was painful. Not only are we missing a
heartbeat resource agent for streaming-replicated postgresql, but I also
think the system is too fragile to be a production system without reliable
fail-over and fail-back.
Do you think we can make the xc cluster itself fault-tolerant? For example,
create another type of node: 'shadow/backup' nodes. Each one is identical to
its corresponding coordinator/datanode and is kept up to date by SQL
statement-level replication. There would then be no single point of failure
in the cluster.
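
A rough sketch of the shadow-node idea, in Python with sqlite3 standing in
for a datanode. MirroredNode and its methods are hypothetical names, not
anything in xc. Every write statement is sent to both the node and its
shadow, and reads move to the shadow once the primary is marked as failed.

```python
import sqlite3


class MirroredNode:
    """Hypothetical pair of a node and its shadow (sqlite3 as a stand-in)."""

    def __init__(self) -> None:
        self.primary = sqlite3.connect(":memory:")
        self.shadow = sqlite3.connect(":memory:")
        self.primary_alive = True

    def execute(self, sql: str, params: tuple = ()) -> None:
        # Statement-level replication: the same statement goes to every live copy.
        targets = [self.shadow]
        if self.primary_alive:
            targets.insert(0, self.primary)
        for conn in targets:
            conn.execute(sql, params)
            conn.commit()

    def query(self, sql: str, params: tuple = ()) -> list:
        # Reads go to the primary while it is alive, otherwise to the shadow.
        conn = self.primary if self.primary_alive else self.shadow
        return conn.execute(sql, params).fetchall()

    def fail_primary(self) -> None:
        # Simulate losing the primary; the shadow keeps serving.
        self.primary_alive = False


node = MirroredNode()
node.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
node.execute("INSERT INTO t (id, v) VALUES (?, ?)", (1, "a"))
node.fail_primary()
node.execute("INSERT INTO t (id, v) VALUES (?, ?)", (2, "b"))
rows = node.query("SELECT id, v FROM t ORDER BY id")
```

This sketch leaves out fail-back (resynchronizing the old primary) and
statements whose result differs per node, such as ones calling random().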
(3) Disaster Backup
All data in this shared-nothing cluster has to be shipped to the backup
facility. If we just use streaming replication for each node, the backup
cluster will not function whenever replication of any node fails or the
replication streams are not all synchronized. So far I have had to use
pgpool as the front-end of the two clusters.
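
The condition I am worried about can be written down as a small check. This
is only a sketch in Python: StreamStatus and last_gxid (the last global
transaction applied on the backup side) are assumptions of mine, not fields
that xc exposes. The backup cluster is usable only if every stream is alive
and all streams have reached the same point.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StreamStatus:
    node: str
    streaming: bool   # is the replication stream for this node still alive?
    last_gxid: int    # hypothetical: last global transaction applied on the backup


def backup_cluster_usable(streams: List[StreamStatus]) -> bool:
    """Return True only if the backup cluster as a whole is consistent."""
    if not streams:
        return False
    # One broken stream is enough to make the whole backup cluster unusable.
    if not all(s.streaming for s in streams):
        return False
    # Every node must have applied changes up to the same global point.
    return len({s.last_gxid for s in streams}) == 1


healthy = [
    StreamStatus("coord1", True, 1200),
    StreamStatus("datanode1", True, 1200),
    StreamStatus("datanode2", True, 1200),
]
lagging = [
    StreamStatus("coord1", True, 1200),
    StreamStatus("datanode1", True, 1187),
]
```

With independent per-node streams nothing guarantees that this check ever
passes at the moment a disaster happens.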
I think it would be better to have a cluster-level replication solution
rather than per-node replication. Does every DML statement go through gtm?
If so, gtm would be the best module to replicate all data changes to the
backup cluster.
I am really looking forward to "dynamic node adding/removing" in the xc
roadmap. It will make xc not just a set of machines, but a real cluster.
Furthermore, it is the starting point for all of these features.
Thanks in advance.
Liu