Galera Cluster

TechEvent

Synchronous Multi-Master
Replication for MySQL HA

April 2013

Ludovico CALDARA
LS-IMS
27.04.2013

BASEL

BERN

LAUSANNE

ZÜRICH

DÜSSELDORF

FRANKFURT A.M.

FREIBURG I.BR.

2012 © Trivadis
Galera Cluster Synchronous Multi-Master Replication for MySQL HA
27.04.2013

HAMBURG

MÜNCHEN

STUTTGART

WIEN
MySQL forks: which one is better?

MySQL
• Oracle MySQL
• New forks: Percona Server, MariaDB, Drizzle

• Many new features
• Improved instrumentation
• New solutions for DEVs and DBAs
• Fast-paced competition between forks’ developers
• Recent evolutions in HA and scalability have made MySQL enterprise ready

There is no recipe that can satisfy all tastes

Feature                    Percona Server    MariaDB           MySQL
Multi-source replication   NO                YES (rel. 10)     NO
NoSQL integration          YES (Cassandra)   YES (Cassandra)   YES (memcached)
Virtual columns            NO                YES               NO
Improved diagnostics       YES               NO                NO
Online DDL                 NO                YES               YES
Galera Cluster             YES               YES               YES (Codership patch)
Many, many others          YES/NO            YES/NO            YES/NO

Your real requirements will let you choose… Need HA?

• How will your customer react if there is a major loss of service?

Old-school solutions have weaknesses
Native MySQL Replication
• Doesn’t scale writes
• Complex to promote slaves
MySQL Multi-Master Replication
• Complex and not reliable
• Concurrent writes lead to logical corruption
DRBD Replication
• Standby is offline, doesn’t scale at all
• Poor performance
MySQL Cluster
• Very complex
• It’s not InnoDB!

New school solutions: 3rd parties are playing a decisive role
Continuent Tungsten Replicator
• Similar to Oracle GoldenGate
• Heterogeneous databases
• Provides complex topologies
• Asynchronous
• Conflicts are complex to resolve
• Complex to maintain
• Not free

Galera Cluster Replication
• Transparent multi-master, easy to maintain
• (Virtually) Synchronous
• It’s InnoDB (only InnoDB)
• Great and easy scalability
• Optimistic locking (side effects)
• At least 3 nodes for good HA

Multi-Master and virtually synchronous: it’s transparent

[Diagram: five cluster nodes, each accepting reads and writes (R/W)]

Cluster implementation - Ingredients
• One or more standalone servers (either physical or virtual)
• Linux (other operating systems are not yet available)
• “Permissive” Firewall between nodes
• Codership’s Galera Library package
• A package of your choice:
  • Percona XtraDB Cluster
  • MariaDB Galera Cluster
  • MySQL with wsrep patch (patched by Codership)

Cluster implementation - Variables

• Each server’s my.cnf must contain:
  • wsrep_cluster_address=gcomm://192.168.1.100,…,192.168.1.10x
  • wsrep_provider=/usr/lib64/libgalera_smm.so
  • binlog_format=ROW
  • default_storage_engine=InnoDB
  • innodb_autoinc_lock_mode=2
  • innodb_locks_unsafe_for_binlog=1  # disables gap locking
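The settings listed above can be sanity-checked before a node is started. The following is a minimal sketch in Python, assuming the options live under a `[mysqld]` section; the helper name `missing_galera_settings` and the sample file content (including its two IP addresses) are illustrative and not part of Galera.

```python
import configparser

# Settings the slide lists as mandatory, with the values it prescribes.
# wsrep_cluster_address and wsrep_provider are site-specific, so only
# their presence is checked (expected value None).
REQUIRED = {
    "wsrep_cluster_address": None,
    "wsrep_provider": None,
    "binlog_format": "ROW",
    "default_storage_engine": "InnoDB",
    "innodb_autoinc_lock_mode": "2",
    "innodb_locks_unsafe_for_binlog": "1",
}


def missing_galera_settings(my_cnf_text, section="mysqld"):
    """Return a sorted list of required settings that are absent or wrong."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # my.cnf allows bare flags
        inline_comment_prefixes=("#",),  # strip a trailing "# comment"
        interpolation=None,              # do not treat "%" specially
        strict=False,                    # tolerate repeated options
    )
    parser.read_string(my_cnf_text)
    found = dict(parser[section]) if parser.has_section(section) else {}
    problems = []
    for key, expected in REQUIRED.items():
        value = found.get(key)
        if value is None:
            problems.append(key)
        elif expected is not None and value.lower() != expected.lower():
            problems.append(key)
    return sorted(problems)


# Hypothetical sample: wrong autoinc lock mode, gap-locking option missing.
SAMPLE = """
[mysqld]
wsrep_cluster_address=gcomm://192.168.1.100,192.168.1.101
wsrep_provider=/usr/lib64/libgalera_smm.so
binlog_format=ROW
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=1
"""

print(missing_galera_settings(SAMPLE))
```

The check only compares text values; it does not verify that the provider library exists or that the listed peers are reachable.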

Cluster implementation – Start the cluster

mysqld_safe --wsrep_cluster_address=gcomm:// &
[…]
130220 17:56:46 [Note] WSREP: Starting new group from scratch:
[…]

An empty gcomm:// address bootstraps a new cluster with this node as its first member.
NEVER USE IT TO JOIN AN EXISTING CLUSTER: the node would form a new, separate cluster instead of joining.

Cluster implementation – Adding nodes to the cluster

mysqld_safe --wsrep_cluster_address=gcomm://host1,host2… &
[…]
130220 18:01:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO:…)
130220 18:01:56 [Note] WSREP: State transfer required:
[…]

The address should already be present in my.cnf!
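The difference between bootstrapping and joining comes down to the value of `wsrep_cluster_address`. The sketch below shows one way a provisioning script could build that value and refuse an empty address for a joining node. It is written in Python under stated assumptions: the function name `cluster_address` and its `bootstrap` flag are hypothetical helpers, not part of Galera or MySQL.

```python
def cluster_address(peers, bootstrap=False):
    """Build a value for wsrep_cluster_address.

    peers: host names or IPs of nodes that already belong to the cluster.
    bootstrap: True only for the very first node of a brand-new cluster.
    """
    if bootstrap:
        # Empty address: start a new cluster with this node as first member.
        return "gcomm://"
    cleaned = [p.strip() for p in peers if p.strip()]
    if not cleaned:
        # An empty address here would start a second cluster, not join one.
        raise ValueError("a joining node needs at least one peer address")
    return "gcomm://" + ",".join(cleaned)


print(cluster_address([], bootstrap=True))
print(cluster_address(["host1", "host2"]))
```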

State Snapshot Transfer (SST)
• The joiner asks for an SST
• The cluster chooses a donor; the donor is taken offline
• The donor is backed up
• The donor comes online again and the backup is loaded on the joiner
• The joiner replays the missing transactions and joins the cluster
• The cluster can also do Incremental State Transfers (IST)

Split-Brain
• The majority of nodes wins
• Complete loss of network: all nodes go offline
• The offline nodes will respond:
  mysql> select * from emp;
  ERROR 1047 (08S01): Unknown command
• Galera arbitrator (garbd) can join the cluster and count as a member in split-brain resolution
• NEW: Galera 2.4 introduces weighted quorum
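The majority rule above can be illustrated with a few lines of Python. This is a simplified sketch, not Galera's implementation: it assumes a partition stays online only if it holds a strict majority of the votes of the previous membership, that an arbitrator counts as an ordinary voting member, and that weighted quorum simply gives some nodes more than one vote. The node names (`db1`, `db2`, `db3`, `garbd`) and the function `has_quorum` are hypothetical.

```python
def has_quorum(component, last_members, weights=None):
    """Return True if `component` may keep serving clients.

    component: nodes that can still see each other after the failure.
    last_members: nodes that formed the cluster before the failure.
    weights: optional per-node vote weight (default 1 for every node).
    """
    weights = weights or {}
    total = sum(weights.get(n, 1) for n in last_members)
    alive = sum(weights.get(n, 1) for n in component if n in last_members)
    # Strict majority: exactly half is not enough, which is why a
    # two-node cluster cannot survive a split on its own.
    return 2 * alive > total


three = {"db1", "db2", "db3"}
print(has_quorum({"db1", "db2"}, three))   # True: 2 votes out of 3
print(has_quorum({"db3"}, three))          # False: 1 vote out of 3

two = {"db1", "db2"}
print(has_quorum({"db1"}, two))            # False: 1 vote out of 2

with_arbitrator = {"db1", "db2", "garbd"}
print(has_quorum({"db1", "garbd"}, with_arbitrator))  # True: 2 votes out of 3

# Weighted quorum: db1 carries two votes, so it survives alone.
print(has_quorum({"db1"}, two, weights={"db1": 2}))   # True: 2 votes out of 3
```

The two-node case shows why the deck recommends at least three voting members: with an even split, neither side reaches a majority.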

Example 1: Arbitrator in Trivadis Switzerland
(… sorry for the German/Austrian attendees ☺)

[Diagram: the Basel, Zurich, Bern and Lausanne sites connected over the WAN, with an arbitrator]

• If the WAN connection is lost, Zurich survives
• If the Zurich site is lost, the cluster goes offline

Example 2: Arbitrator in Trivadis Switzerland
(… sorry for the German/Austrian attendees ☺)

[Diagram: the Basel, Zurich, Bern and Lausanne sites connected over the WAN, with an arbitrator]

• If the Zurich site is lost, the other sites survive
• If the WAN connection is lost, the cluster goes offline

What does “Virtually synchronous” mean? In brief:
• Writes are as fast as if they were local
• Commits take just the time of a network round trip: if that is acceptable, the cluster can be spread geographically

[Diagram: Write → Commit → write set (WS) broadcast to the other nodes → Commit OK]
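To judge whether a round trip per commit is acceptable, a rough upper bound helps: a single connection that commits one transaction after another cannot commit faster than once per round trip. The Python sketch below does this arithmetic. The RTT values are hypothetical examples, the function name is illustrative, and the bound ignores connections committing in parallel, which overlap their round trips.

```python
def max_serial_commits_per_second(rtt_ms, local_commit_ms=0.0):
    """Upper bound on commits/s for ONE connection committing serially.

    rtt_ms: network round-trip time between the nodes, in milliseconds.
    local_commit_ms: time the commit would take on a standalone server.
    """
    per_commit_ms = rtt_ms + local_commit_ms
    if rtt_ms < 0 or local_commit_ms < 0 or per_commit_ms == 0:
        raise ValueError("timings must be non-negative and not both zero")
    return 1000.0 / per_commit_ms


# Hypothetical round-trip times: same rack, same city, cross-country.
for rtt in (0.5, 5, 50):
    rate = max_serial_commits_per_second(rtt)
    print(f"RTT {rtt} ms: at most {rate:.0f} commits/s per connection")
```

With a 50 ms round trip, for example, one connection tops out at 20 serial commits per second, so geographically spread clusters favour fewer, larger transactions and many concurrent connections.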

Optimistic locking leads to side effects

Node 1:
mysql> update emp set salary='peanuts' where name='Caldara';
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0

Node 2:
mysql> update emp set salary='one billion' where name='Caldara';
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> commit;
Query OK, 0 rows affected (0.01 sec)

Node 1:
mysql> commit;
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
mysql> select salary from emp where name='Caldara';
+-------------+
| salary      |
+-------------+
| one billion |
+-------------+
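The outcome above (the first commit to be broadcast wins, the later one fails) can be reproduced with a toy model of certification. This Python sketch is a deliberate simplification and not Galera's actual algorithm: it assumes write sets are certified in the order they were broadcast, and that a write set fails if any row it touches was changed by a transaction committed after the snapshot it had seen. The class `WriteSet`, the function `certify` and the node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteSet:
    origin: str        # node that executed the transaction
    keys: frozenset    # rows (table, primary key) the transaction modified
    seen_seqno: int    # last committed sequence number the transaction saw


def certify(write_sets):
    """Certify write sets in broadcast order; return (origin, committed)."""
    last_change = {}   # row key -> seqno of the last committed change
    seqno = 0
    results = []
    for ws in write_sets:
        # Conflict: someone committed a change to one of our rows
        # after the snapshot this transaction was working from.
        conflict = any(last_change.get(k, 0) > ws.seen_seqno for k in ws.keys)
        if conflict:
            results.append((ws.origin, False))  # client sees ERROR 1213
            continue
        seqno += 1
        for k in ws.keys:
            last_change[k] = seqno
        results.append((ws.origin, True))
    return results


row = frozenset({("emp", "Caldara")})
# Node 2 broadcasts its commit first, node 1 second; both saw seqno 0.
print(certify([WriteSet("node2", row, 0), WriteSet("node1", row, 0)]))
# prints [('node2', True), ('node1', False)]
```

Because every node processes the same write sets in the same order, each node reaches the same verdict without asking the others, which is what lets the losing commit be rejected locally.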

Conclusions on optimistic locking…
• Locally, the first that acquires the lock wins (it’s InnoDB…)
• Cluster-wide, the first that broadcasts its commit wins (it’s Galera…)
• The application should not have hotspots...
• … or it should retry the transaction after the deadlock occurs…
• … or, for each database, you can elect one node as the master
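The "retry the transaction" option above can be sketched as a small wrapper. This Python example is a minimal illustration under stated assumptions: `DeadlockError` is a stand-in for whatever exception your MySQL driver raises with error code 1213, and `run_with_retry` is a hypothetical helper. The callable passed in must contain the whole transaction (begin, statements, commit), because the entire transaction has to be re-executed.

```python
import time

DEADLOCK_ERRNO = 1213  # MySQL error code shown on the slides


class DeadlockError(Exception):
    """Stand-in for the driver exception that carries errno 1213."""
    errno = DEADLOCK_ERRNO


def run_with_retry(transaction, max_attempts=3, backoff_s=0.0):
    """Run `transaction()` and re-run it if it fails with a deadlock."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except Exception as exc:
            is_deadlock = getattr(exc, "errno", None) == DEADLOCK_ERRNO
            if not is_deadlock or attempt == max_attempts:
                raise  # other errors, or out of attempts: give up
            time.sleep(backoff_s * attempt)  # simple linear backoff


# Demo: a fake transaction that loses certification twice, then succeeds.
attempts = {"count": 0}


def flaky_transaction():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise DeadlockError("Deadlock found when trying to get lock")
    return "committed"


print(run_with_retry(flaky_transaction), "after", attempts["count"], "attempts")
```

Retrying is only safe when the transaction has no side effects outside the database, since the work inside the callable may run more than once.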

About performance
• Commit performance loss is between 5% and 10%, plus the network RTT
• Write workloads scale up to 8 nodes
• >8 nodes: it scales reads, not writes
• Many benchmarks show that Galera outperforms NDB with few nodes
• NDB scales out more with many nodes thanks to data sharding
• Benchmarks on the internet are not always reliable… test the performance of YOUR application

How to migrate
• Convert all your tables to InnoDB
• Double-check that all tables have primary keys
• Think about potential problems caused by triggers (if you have any)
• Create a new empty Galera Cluster
• Set up MySQL native replication between the old database and the Galera cluster
• Once everything is in sync, point your clients to the new cluster
• Set up the old node to join the cluster
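The first two migration steps above lend themselves to a small pre-flight script. The Python sketch below is one possible approach, not a tool shipped with Galera: it holds a query against the standard `information_schema` views that lists tables without a primary key, and a hypothetical helper `innodb_conversion_statements` that turns rows of (schema, table, engine) into `ALTER TABLE` statements for review. Running the query requires a MySQL client or driver of your choice, which is left out here.

```python
# SQL to run with any MySQL client: base tables that lack a primary key.
TABLES_WITHOUT_PK_SQL = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema')
  AND c.constraint_name IS NULL
"""


def innodb_conversion_statements(tables):
    """Build ALTER TABLE statements for tables not yet stored in InnoDB.

    tables: iterable of (schema, table, engine) rows, for example the
    result of selecting those columns from information_schema.tables.
    Views report no engine (None) and are skipped.
    """
    statements = []
    for schema, table, engine in tables:
        if engine is not None and engine.lower() != "innodb":
            statements.append(f"ALTER TABLE `{schema}`.`{table}` ENGINE=InnoDB;")
    return statements


# Hypothetical inventory of an application schema.
inventory = [
    ("app", "emp", "MyISAM"),
    ("app", "dept", "InnoDB"),
    ("app", "emp_view", None),
]
for statement in innodb_conversion_statements(inventory):
    print(statement)
```

The generated statements are meant to be reviewed before execution, since converting a large table rebuilds it and features specific to the old engine may not carry over.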

Load balancing
• HAProxy is the most used solution so far
• Codership is actively developing its own load balancer: Galera Load Balancer (glbd)
  • Several balancing modes: round robin, custom, least connected, …
  • Automatically drains disconnected nodes
  • New nodes can be added with a single TCP call
  • Release 1.0 (now rc1) will support a watchdog and automatic discovery of the nodes composing the cluster
• Other methods are possible (e.g. Java connector properties, HW load balancer)
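Two of the balancing modes named above, round robin and least connected, are easy to show in miniature. The Python sketch below illustrates the general policies only; it is not how HAProxy or glbd implement them, and the function and node names are hypothetical.

```python
import itertools


def round_robin(nodes):
    """Return an endless iterator that hands out nodes in rotation."""
    return itertools.cycle(list(nodes))


def least_connected(open_connections, disconnected=()):
    """Pick the reachable node with the fewest open connections.

    open_connections: dict mapping node name to its connection count.
    disconnected: nodes to skip, e.g. those that left the cluster.
    Ties are broken by node name so the choice is deterministic.
    """
    candidates = {
        node: count
        for node, count in open_connections.items()
        if node not in disconnected
    }
    if not candidates:
        raise ValueError("no node available")
    return min(sorted(candidates), key=candidates.get)


rotation = round_robin(["node1", "node2", "node3"])
print([next(rotation) for _ in range(4)])

load = {"node1": 5, "node2": 2, "node3": 2}
print(least_connected(load))
print(least_connected(load, disconnected={"node2"}))
```

Because every Galera node accepts reads and writes, either policy can send any statement to any node; least connected tends to even out load better when some queries run much longer than others.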

Conclusions on Galera Cluster

Strengths:
• Multi-master
• Shared-nothing
• Great performance and scalability
• «Virtually» synchronous
• It uses InnoDB!!
• Conflict prevention
• Split-brain handling (no inconsistencies)
• Easy to add/remove nodes

Limitations:
• At least 3 nodes to have good HA
• Optimistic locking (side effects)
• Explicit locking doesn’t work
• Only InnoDB is replicated
• Primary keys are mandatory
• Not yet available for MySQL 5.6
• Linux only

Links
http://www.slideshare.net/skysql/galera-cluster-by-seppo-jaakola-codership-at-skysql-roadshow-instuttgart-2013
http://www.codership.com/files/presentations/Galera_Replication_PLL_2011.pdf
http://www.mysqlperformanceblog.com/2013/01/31/feature-in-details-incremental-state-transfer-after-anode-crash-in-percona-xtradb-cluster/
http://www.percona.tv/percona-webinars/migrating-to-percona-xtradb-cluster
http://www.codership.com/content/5-tips-migrating-your-mysql-server-galera-cluster
http://www.mysqlperformanceblog.com/2012/08/17/percona-xtradb-cluster-multi-node-writing-andunexpected-deadlocks/
http://www.mysqlperformanceblog.com/2012/11/20/understanding-multi-node-writing-conflict-metrics-inpercona-xtradb-cluster-and-galera/
http://www.mysqlperformanceblog.com/2011/10/13/benchmarking-galera-replication-overhead/
http://karlssonondatabases.blogspot.ch/2012/12/galera-features-beyond-just-ha.html
http://infoscience.epfl.ch/record/52305/files/IC_TECH_REPORT_199908.pdf
http://www.inf.usi.ch/faculty/pedone/Paper/2005/2005WDIDDR.pdf

Little demo?

?

Trivadis SA

THANK YOU.

Ludovico Caldara
Senior Consultant

Ludovico.caldara@trivadis.com
www.trivadis.com



Galera Cluster: Synchronous Multi-Master Replication for MySQL HA

  • 1. Galera Cluster TechEvent Synchronous Multi-Master Replication for MySQL HA April 2013 Ludovico CALDARA LS-IMS 27.04.2013 BASEL 1 BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 HAMBURG MÜNCHEN STUTTGART WIEN
  • 2. MySQL forks: which one is better? MySQL Oracle MySQL New forks Percona Server Many new features MariaDB Improved instrumentation Drizzle New solutions for DEVs and DBAs Fast-paced competition between forks’ developers Recent evolutions in HA and scalability have made MySQL enterprise ready 2 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 3. There is no recipe that can satisfy all tastes Percona Server MariaDB MySQL Multi source replication NO YES (rel. 10) NO NoSQL integration YES (cassandra) YES (cassandra) YES (memcached) Virtual Columns NO YES NO Improved diagnostics YES NO NO Online DDL NO YES YES Galera Cluster YES YES YES (codership patch) Many many others YES/NO YES/NO YES/NO 3 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 4. Your real requirements will let you choose… Need HA? • 4 How will react your customer if there is an important loss of service? 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 5. Old-school solutions have weaknesses Native MySQL Replication • Doesn’t scale writes • Complex to promote slaves MySQL Multi-Master Replication • Complex and not reliable • Concurrent writes lead to logical corruption DRBD Replication • Standby is offline, doesn’t scale at all • Poor performance MySQL Cluster • Very complex • It’s not InnoDB! NDB NDB NDB 5 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 6. New school solutions: 3rd parties are playing a decisive role Continuent Tungsten Replicator • Similar to Golden Gate • Heterogeneous databases • Provides complex topologies • Asynchronous • Conflicts are complex to resolve • Complex to maintain • Not free ORACLE MYSQL Galera Cluster Replication • Transparent Multi-Master easy to mantain • (Virtually) Synchronous • It’s InnoDB (only InnoDB) • Great and easy scalability • Optimistic locking (side effects) • At least 3 nodes for good HA 6 MYSQL ORACLE MYSQL 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 7. Multi-Master and virtually synchronous: it’s transparent R/W 7 R/W R/W R/W R/W 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 8. Cluster implementation - Ingredients • One or more standalone servers (either physical or virtual) • Linux (other operating systems are not yet available) • “Permissive” Firewall between nodes • Codership’s Galera Library package • A package of your choice: • Percona XtraDB Cluster • MariaDB Galera Cluster • MySQL with wsrep patch (patched by Codership) 8 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 9. Cluster implementation - Variables • Each server’s my.cnf must contain: • wsrep_cluster_address=gcomm://192.168.1.100,…,192.168.1.10x • wsrep_provider=/usr/lib64/libgalera_smm.so • binlog_format=ROW • default_storage_engine=InnoDB • innodb_autoinc_lock_mode=2 • innodb_locks_unsafe_for_binlog=1 #disables gap locking 9 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 10. Cluster implementation – Start the cluster mysqld_safe --wsrep_cluster_address=gcomm:// & […] 130220 17:56:46 [Note] WSREP: Starting new group from scratch: […] The empty gcomm:// address starts the node as the first of the cluster NEVER USE IT TO JOIN AN EXISTING CLUSTER 10 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 11. Cluster implementation – Adding nodes to the cluster mysqld_safe --wsrep_cluster_address=gcomm://host1,host2… & […] 130220 18:01:56 [Note] WSREP: Shifting OPEN -> PRIMARY (TO:…) 130220 18:01:56 [Note] WSREP: State transfer required: […] The address should be already present in the my.cnf! 11 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 12. Server State Transfer • The joiner asks for a SST R/W 12 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 R/W R/W
  • 13. Server State Transfer • The joiner asks for a SST • The cluster chooses a donor, the donor is taken offline R/W DONOR 13 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 R/W
  • 14. Server State Transfer • The joiner asks for a SST • The cluster chooses a donor, the donor is taken offline • The donor is backed up • The donor comes online again and the joiner is loaded R/W DONOR 14 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 R/W R/W
  • 15. Server State Transfer • The joiner asks for a SST • The cluster chooses a donor, the donor is taken offline • The donor is backed up • The donor comes online again and the joiner is loaded • The joiner replays the missing transactions and joins the cluster R/W R/W DONOR 15 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 R/W R/W
  • 16. Server State Transfer • The joiner asks for a SST • The cluster chooses a donor, the donor is taken offline • The donor is backed up • The donor comes online again and the joiner is loaded • The joiner replays the missing transactions and joins the cluster • The cluster can also do Incremental State Transfers (IST) 16 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 R/W R/W R/W R/W
  • 17. Split-Brain • The majority of nodes wins • Complete loss of network: all nodes go offline • The offline nodes will respond: mysql> select * from emp; ERROR 1047 (08S01): Unknown command • Galera arbitrator (garbd) can join the cluster and count as a member in split brain resolution. • NEW: Galera 2.4 intruduces weighted quorum 17 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013 gar arbitrator
  • 18. Example 1: Arbitrator in Trivadis Swiss BASEL … sorry for German/Austrian attenders ☺ ZURICH WAN arbitrator • If the WAN connection is lost, Zurich survives BERN • If the Zurich site is lost, the cluster will be off lined LAUSANNE 18 2012 © Trivadis Galera Cluster Synchronous Multi-Master Replication for MySQL HA 27.04.2013
  • 19. Example 2: Arbitrator in Trivadis Swiss locations (… sorry, German/Austrian attendees ☺) • If the Zurich site is lost, the other sites survive • If the WAN connection is lost, the cluster is taken offline • Diagram labels: BASEL, ZURICH, BERN, LAUSANNE, WAN, arbitrator
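The majority rule and the effect of arbitrator placement can be sketched with a few lines of Python. This is a simplified model under stated assumptions: a partition stays online only if it holds strictly more than half of the total weight, nodes that left gracefully are ignored, and the two-site topology (`a1`, `a2`, `b1`, `b2`, `garbd`) is hypothetical rather than the exact layout of the diagrams.

```python
def has_quorum(partition, weights):
    """True if `partition` holds strictly more than half of the total weight."""
    total = sum(weights.values())
    reachable = sum(weights[node] for node in partition)
    return 2 * reachable > total


# Hypothetical cluster: two nodes per site plus one arbitrator, weight 1 each.
weights = {"a1": 1, "a2": 1, "b1": 1, "b2": 1, "garbd": 1}

# Pattern of Example 1: the arbitrator sits with site A.
wan_lost_site_a = has_quorum({"a1", "a2", "garbd"}, weights)  # site A survives
wan_lost_site_b = has_quorum({"b1", "b2"}, weights)           # site B goes offline
site_a_lost = has_quorum({"b1", "b2"}, weights)               # whole cluster offline

# Pattern of Example 2: the arbitrator sits at a third location.
site_a_lost_third = has_quorum({"b1", "b2", "garbd"}, weights)  # survivors keep quorum
wan_lost_everywhere = any(
    has_quorum(part, weights)
    for part in ({"a1", "a2"}, {"b1", "b2"}, {"garbd"})
)                                                               # nobody has quorum

# Weighted quorum: giving one node weight 2 breaks the tie without an arbitrator.
weighted = {"a1": 2, "a2": 1, "b1": 1, "b2": 1}
site_a_wins = has_quorum({"a1", "a2"}, weighted)
```

The same function covers both arbitrator placements and weighted quorum, because the arbitrator is just one more member with a weight and no data.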
  • 20. What does “Virtually synchronous” mean? In brief: (diagram labels: Write)
  • 21. What does “Virtually synchronous” mean? In brief: (diagram labels: Write, Commit, WS)
  • 22. What does “Virtually synchronous” mean? In brief: (diagram labels: Write, Commit, WS sent to every node)
  • 23. What does “Virtually synchronous” mean? In brief: (diagram labels: Write, Commit, WS, Commit OK)
  • 24. What does “Virtually synchronous” mean? In brief: • Writes are as fast as if they were local • Commits take just the time of a network round trip: if that is acceptable, the cluster can be spread geographically (diagram labels: Write, Commit, WS, Commit OK)
  • 25. Optimistic locking leads to side effects
      mysql> update emp set salary='peanuts' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
  • 26. Optimistic locking leads to side effects
      mysql> update emp set salary='peanuts' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      mysql> update emp set salary='one billion' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
  • 27. Optimistic locking leads to side effects
      mysql> update emp set salary='peanuts' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      mysql> update emp set salary='one billion' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      mysql> commit;
      Query OK, 0 rows affected (0.01 sec)
  • 28. Optimistic locking leads to side effects (the two updates run in sessions on different nodes; the first commit below belongs to the second update)
      mysql> update emp set salary='peanuts' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      mysql> update emp set salary='one billion' where name='Caldara';
      Query OK, 1 row affected (0.03 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      mysql> commit;
      Query OK, 0 rows affected (0.01 sec)
      mysql> commit;
      ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
      mysql> select salary from emp where name='Caldara';
      +-------------+
      | salary      |
      +-------------+
      | one billion |
      +-------------+
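The outcome of this session can be reproduced with a toy model of first-committer-wins certification. It is only a sketch: real Galera certifies write-set keys against the write-sets committed since the transaction started, whereas this model keeps a simple version counter per row. `ToyCluster` and `DeadlockError` are hypothetical names.

```python
class DeadlockError(Exception):
    """Stand-in for MySQL error 1213 returned when certification fails."""


class ToyCluster:
    """First-committer-wins certification with one version counter per row."""

    def __init__(self):
        self.rows = {}  # key -> (value, version)

    def _version(self, key):
        return self.rows.get(key, (None, 0))[1]

    def begin(self):
        return {"seen": {}, "writes": {}}

    def update(self, txn, key, value):
        # Remember the row version at first touch; no cluster-wide lock is taken.
        txn["seen"].setdefault(key, self._version(key))
        txn["writes"][key] = value

    def commit(self, txn):
        # Certification: abort if another commit changed any row we wrote.
        for key, seen_version in txn["seen"].items():
            if self._version(key) != seen_version:
                raise DeadlockError("1213: Deadlock found when trying to get lock")
        for key, value in txn["writes"].items():
            self.rows[key] = (value, self._version(key) + 1)


cluster = ToyCluster()
node1_txn = cluster.begin()
node2_txn = cluster.begin()
cluster.update(node1_txn, "Caldara", "peanuts")      # both updates succeed locally
cluster.update(node2_txn, "Caldara", "one billion")
cluster.commit(node2_txn)                            # first to commit wins
try:
    cluster.commit(node1_txn)
    node1_result = "committed"
except DeadlockError:
    node1_result = "deadlock"
final_salary = cluster.rows["Caldara"][0]
```

Both updates report success because nothing is checked until commit time; the conflict surfaces only when the second commit is certified.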
  • 29. Conclusions on optimistic locking… • Locally, the first that acquires the lock wins (it’s InnoDB…) • Cluster-wide, the first that broadcasts its commit wins (it’s Galera…) • The application should not have hotspots… • … or it should retry the transaction after the deadlock occurs… • … or, for each database, you can elect one node as the master
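The "retry the transaction" option can be sketched as a small wrapper. This is a minimal illustration, assuming the application can re-run the whole transaction as a callable; `DeadlockError` is a hypothetical stand-in for whatever exception the database driver raises for error 1213, and `run_with_retry` is not part of any library.

```python
import time


class DeadlockError(Exception):
    """Stand-in for the driver exception that carries MySQL error 1213."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Run `transaction()` and re-run it from the start if it hits a deadlock."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller handle the failure
            # Wait a little longer after each failed attempt.
            time.sleep(backoff_seconds * attempt)
```

The whole transaction has to be replayed, not just the commit, because the statements of the aborted transaction were rolled back.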
  • 30. About performance • Commit performance loss is between 5% and 10%, plus the network RTT • Write workloads scale up to 8 nodes • >8 nodes: reads scale, writes do not • Many benchmarks show that Galera outperforms NDB with few nodes • NDB scales out further with many nodes thanks to data sharding • Benchmarks on the internet are not always reliable… test the performance of YOUR application
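Taking the slide's figures at face value, the commit cost can be estimated with simple arithmetic. The function below is a back-of-the-envelope sketch; the name, the default 10% overhead, and the sample latencies are assumptions for illustration, not measurements.

```python
def estimated_commit_ms(local_commit_ms, rtt_ms, overhead=0.10):
    """Local commit time plus a 5-10% overhead plus one network round trip."""
    return local_commit_ms * (1 + overhead) + rtt_ms


def max_commits_per_second(commit_ms):
    """Upper bound for a single connection that commits serially."""
    return 1000.0 / commit_ms


lan = estimated_commit_ms(local_commit_ms=2.0, rtt_ms=1.0)   # nodes in one data center
wan = estimated_commit_ms(local_commit_ms=2.0, rtt_ms=20.0)  # nodes spread over a WAN
```

With these sample numbers the round trip, not the replication overhead, dominates the commit time on a WAN, which is why the RTT decides whether a geographically spread cluster is acceptable.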
  • 31. How to migrate • Convert all your tables to InnoDB • Double-check that all tables have primary keys • Think about potential problems caused by triggers (if you have any) • Create a new, empty Galera Cluster • Set up MySQL native replication between the old database and the Galera cluster • Once everything is aligned, direct your clients to the new cluster • Set up the old node to join the cluster (diagram labels: NATIVE REPLICATION, JOIN)
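The first two checks in the list above can be automated before the migration starts. The sketch below assumes the table metadata has already been fetched, for example with the `information_schema` query kept in `PRECHECK_SQL`; the function name and the dictionary layout are hypothetical.

```python
# Query that lists user tables that are not InnoDB or lack a primary key.
# It is shown for reference only and is not executed by this sketch.
PRECHECK_SQL = """
SELECT t.table_schema, t.table_name, t.engine,
       c.constraint_name IS NOT NULL AS has_primary_key
FROM information_schema.tables t
LEFT JOIN information_schema.table_constraints c
       ON c.table_schema = t.table_schema
      AND c.table_name = t.table_name
      AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND t.table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
  AND (t.engine <> 'InnoDB' OR c.constraint_name IS NULL)
"""


def migration_blockers(tables):
    """Return one message per problem found in the table metadata."""
    problems = []
    for table in tables:
        if table["engine"].lower() != "innodb":
            problems.append(f"{table['name']}: engine {table['engine']} must be converted to InnoDB")
        if not table["has_primary_key"]:
            problems.append(f"{table['name']}: primary key is missing")
    return problems


sample = [
    {"name": "emp", "engine": "InnoDB", "has_primary_key": True},
    {"name": "audit_log", "engine": "MyISAM", "has_primary_key": False},
]
blockers = migration_blockers(sample)
```

An empty result means the schema passes these two checks; triggers still need a manual review.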
  • 32. Load balancing • HAProxy is the most used solution so far • Codership is actively developing its own load balancer: Galera Load Balancer (glbd) • Several balancing modes: round robin, custom, least connected, … • Automatically drains disconnected nodes • New nodes can be added with a single TCP call • Release 1.0 (now rc1) will support a watchdog and automatic discovery of the nodes composing the cluster • Other methods are possible (e.g. Java connector properties, hardware load balancer)
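The "least connected" mode and the draining of disconnected nodes can be illustrated with a small in-memory model. This is a sketch of the idea only, not how glbd or HAProxy implement it; the class and method names are hypothetical.

```python
class LeastConnectedBalancer:
    """Send each new connection to the online node with the fewest connections."""

    def __init__(self, nodes):
        self.connections = {node: 0 for node in nodes}
        self.online = set(nodes)

    def mark_down(self, node):
        # A disconnected node stops receiving new connections.
        self.online.discard(node)

    def mark_up(self, node):
        # Also used to add a brand-new node at runtime.
        self.connections.setdefault(node, 0)
        self.online.add(node)

    def acquire(self):
        candidates = sorted(self.online)  # sorted for a deterministic tie-break
        if not candidates:
            raise RuntimeError("no online node available")
        node = min(candidates, key=lambda n: self.connections[n])
        self.connections[node] += 1
        return node

    def release(self, node):
        self.connections[node] = max(0, self.connections[node] - 1)
```

Because every Galera node accepts writes, the balancer does not need to know which node is the master; it only needs to know which nodes are reachable.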
  • 33. Conclusions on Galera Cluster • Multi-master • Shared-nothing • Great performance and scalability • «Virtually» synchronous • It uses InnoDB!! • Conflict prevention • Split-brain (no inconsistencies) • Easy to add/remove nodes • At least 3 nodes to have good HA • Optimistic locking (side effects) • Explicit locking doesn’t work • Only InnoDB is replicated • Primary keys are mandatory • Not yet available for MySQL 5.6 • Linux only
  • 34. Links • http://www.slideshare.net/skysql/galera-cluster-by-seppo-jaakola-codership-at-skysql-roadshow-instuttgart-2013 • http://www.codership.com/files/presentations/Galera_Replication_PLL_2011.pdf • http://www.mysqlperformanceblog.com/2013/01/31/feature-in-details-incremental-state-transfer-after-anode-crash-in-percona-xtradb-cluster/ • http://www.percona.tv/percona-webinars/migrating-to-percona-xtradb-cluster • http://www.codership.com/content/5-tips-migrating-your-mysql-server-galera-cluster • http://www.mysqlperformanceblog.com/2012/08/17/percona-xtradb-cluster-multi-node-writing-andunexpected-deadlocks/ • http://www.mysqlperformanceblog.com/2012/11/20/understanding-multi-node-writing-conflict-metrics-inpercona-xtradb-cluster-and-galera/ • http://www.mysqlperformanceblog.com/2011/10/13/benchmarking-galera-replication-overhead/ • http://karlssonondatabases.blogspot.ch/2012/12/galera-features-beyond-just-ha.html • http://infoscience.epfl.ch/record/52305/files/IC_TECH_REPORT_199908.pdf • http://www.inf.usi.ch/faculty/pedone/Paper/2005/2005WDIDDR.pdf
  • 35. Little demo?
  • 36. ?
  • 37. THANK YOU. • Trivadis SA • Ludovico Caldara, Senior Consultant • [email protected] • www.trivadis.com